Customers prefer to spend as little as they need to on
infrastructure. One of the tricks is to understand what you really need
and where you may be able to cut. With the price of disk capacity
dropping each year, customers would love to buy the largest/cheapest
drives that they can to support their workload. This results in
near-term savings when the gear is purchased, and longer term savings
from reduced power, cooling, and maintenance bills. But how large can
the drives be before they cannot sustain the business workload?
The Problem
Storage tiering is not a new idea. Back in 2003, EMC started a big campaign to help customers save money by tiering their data. It was called ILM - Information Lifecycle Management. The idea was to help customers put the right data on the right storage at the right time, saving them money. There were only two problems: the tiering placements were manual (like this process example), and to place it correctly you needed to understand what the profile for the data was currently and what it would be in the future.
Many customers decided that the manpower was more expensive than just buying additional hardware 'to be safe.' Sure, you could buy a mix of drive types, but buying all the same drives and spreading the workload across all of them is much easier to plan for. And if you did it right, the workload would push the drives hard (to about 70% busy) about the same time you 'ran out' of space (at 75-90% full). Of course, if the workload turns out to have much different needs, you either waste space (the drives get too busy) or waste money (you bought more performance than you need).
With storage capacities growing at a 60% annual rate for many large
customers, they have to do something new. They are running out of data
center space. They are running out of power and cooling. They cannot
afford to just estimate and hope that what they buy is right. But they
also cannot afford to have their staff constantly moving data around to
deal with configuration challenges.
The Solution
In December of 2010, EMC introduced Symmetrix VMAX Fully Automated Storage Tiering for Virtual Pools (FAST VP). This new software does what ILM set out to do 7 years before - place data on the right tier of storage in a timely manner. The difference is that now all of that storage fits in the same array, and the user gets to manage the movement by setting simple policies. Details on how this works appear in Barry's blog here and here, and the Enterprise Strategy Group did a nice white paper covering their testing of FAST VP with Oracle. And I have included some additional notes in a section below.
In general, the plan is to replace an all high speed disk (10/15K drives) configuration with a tiered solution that combines Enterprise Flash, high speed, and bulk drives. The Flash drives are quite expensive for the capacity, so they can drive costs up. However, with FAST VP, the hottest data is moved to the small Flash capacity. By using about 3% Flash capacity, most workloads can move over 50% of their disk I/O to Flash (assuming 7.5 MB movements). Not only does this drive down the I/O requirements for the remaining drives, it also means that the top 50% of the disk I/O is happening at 1 ms, decreasing the response time for the most active space used by the applications.
So now that half of the workload is off of the remaining drives, using larger high speed drives presents less of a risk. After all, if the workload becomes hotter than was planned, additional Flash can be added to the array to take off more of the heavy I/O areas. Managing changes in I/O density is much easier with these automated policies.
And there is a lot of data in most systems that is just rarely accessed. In general, about 70% of the capacity does less than 10% of the overall I/O workload (assuming 7.5 MB movements). The bulk drives hold a lot of data and use less power, so the savings in purchase and operational costs per TB are both wonderful. And EMC prices all VMAX software 50% less for bulk capacity drives, providing even more savings. Many customer workloads will see improved performance, a smaller power and floor space configuration, and lower costs (both initial and ongoing) by moving to such a tiered design.
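To make skew figures like these concrete, here is a minimal Python sketch of the arithmetic behind a statement such as "3% of the capacity does 50% of the I/O." The function name and the sample workload are made up for illustration. This is not how Tier Advisor works internally, just the same kind of calculation on equal-sized units.

```python
def capacity_fraction_for_io_share(io_counts, target_share):
    """Smallest fraction of equal-sized units (hottest first) whose
    combined I/O reaches target_share of the total I/O."""
    if not io_counts:
        raise ValueError("io_counts must not be empty")
    total = sum(io_counts)
    if total == 0:
        return 0.0
    running = 0
    # Walk the units from hottest to coldest until the target is reached.
    for used, count in enumerate(sorted(io_counts, reverse=True), start=1):
        running += count
        if running >= target_share * total:
            return used / len(io_counts)
    return 1.0


# Hypothetical workload: 100 equal-sized units, a few of them very hot.
sample = [500] * 3 + [20] * 27 + [1] * 70
hot_fraction = capacity_fraction_for_io_share(sample, 0.5)
# Share of all I/O done by the 70 coldest units.
cold_io_share = sum(sorted(sample)[:70]) / sum(sample)
```

With this invented sample, the three hottest units out of 100 are enough to cover half of the I/O, and the 70 coldest units do well under 10% of it. Real workloads will differ, which is why measuring before sizing the tiers matters.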
To map this out, EMC has Tier Advisor, which can analyze current workloads and model them on various storage configurations. When the array arrives, Symmetrix Management Console (SMC) makes it easy to define tiers and policies, and to apply them to Virtually Provisioned devices. Once FAST VP is running, Symmetrix Performance Advisor (SPA) gathers performance details on the array, metrics on data moving between tiers, and capacity levels of each tier.
The Results
I got a note last week from one of our Technical Consultants who had helped his customer to implement a FAST VP solution. They were very pleased with the implementation and the results - Oracle was running faster, and they could see how this was going to make storage management easier AND make storage cheaper for them going forward.
The customer decided to build two policies, one for high priority production and the other for standard priority production. Since they were just getting started on growing into the array, they decided to start fairly conservatively. They set the high priority policy at 20% Flash and 100% high performance. With this policy, nothing gets demoted to bulk drives, and up to 20% can move up to Flash if the I/O density warrants it. They set the standard priority policy at 5%/65%/40%, which will force at least 30% of the space managed by this policy onto bulk storage. It also allows up to 5% to move up to Flash if it is justified. And note that these policies share the same 3 tiers of storage on the same pools and drives. The applications were running pretty well on the Fibre Channel space they were originally assigned to. Then FAST VP was activated, and response times for reads went down nicely. The blue line here is a day before FAST VP was live, and the green line is a week later. It was clear that the Flash drives were having the desired impact. (This discussion focuses on read time since the writes all go to cache, so the performance of the drives makes little difference for writes as long as they can destage in time.)
The layout that resulted in the storage pools also validated the percentages they had planned. The Flash tier was holding 2.7% of the allocated capacity and driving over 50% of the workload. The ATA tier was holding 32% of the allocated capacity and getting less than 1% of the workload. And the FC tier was taking the data in the middle. Over time, as they choose to increase the percentage of capacity in the array that is ATA, and allow the policies to move more data down, the effective costs will continue to improve.
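The policy arithmetic in this example can be sketched in a few lines of Python. Each policy is a set of upper limits per tier, and whatever the faster tiers are not allowed to hold has to land on the slowest one. The class and field names below are made up for illustration and are not SMC objects. The check that the limits add up to at least 100% is simply what is needed for all of the data to have somewhere to live.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierPolicy:
    """Upper limits, in percent of the managed capacity, for each tier."""
    flash_max: int
    fc_max: int
    bulk_max: int

    def __post_init__(self):
        limits = (self.flash_max, self.fc_max, self.bulk_max)
        if any(not 0 <= p <= 100 for p in limits):
            raise ValueError("each limit must be between 0 and 100")
        if sum(limits) < 100:
            raise ValueError("limits must add up to at least 100%")

    def min_on_bulk(self):
        """Share that cannot fit on flash plus FC and so must sit on bulk."""
        return max(0, 100 - self.flash_max - self.fc_max)


# The two policies from the customer example.
high_priority = TierPolicy(flash_max=20, fc_max=100, bulk_max=0)
standard = TierPolicy(flash_max=5, fc_max=65, bulk_max=40)

forced_high = high_priority.min_on_bulk()   # nothing is forced down
forced_standard = standard.min_on_bulk()    # 100 - 5 - 65
```

For the standard policy this gives the "at least 30%" on bulk mentioned above, and for the high priority policy it gives zero.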
Additional Information
There are several products in the market that are talking about automated tiering within or between arrays. And the fundamental concepts of Hierarchical Storage Management (HSM) are nothing new. But there are several things that make FAST VP different.
Engineered for Smart Efficiency
VMAX was designed with FAST VP in mind. EMC did extensive research, collecting I/O profiles from millions of workloads on thousands of arrays. Based on this research, the VP extents (allocation units) were set at 768 KB. Looking at how to isolate the hot data from the warm data from the cold data, we worked to use the smallest size without too much overhead. A general goal was to get 50% of the workload onto something near 3% of the overall capacity (which we would place on Enterprise Flash Drives - EFDs) for over 90% of the workloads we reviewed. Our research showed that by managing data at the sub-1 MB level, we could hit the 50% workload with about 2.5% of the capacity. With 7.5 MB (10 VMAX VP extents), the number was 3% - a small increase, and it cut the management overhead by a factor of 10. Managed at the 40 MB level, the number jumped to around 6%. And at 1 GB, the number jumped again to around 12%. So we went with 10 VP extents as the unit we would move with FAST VP (called an extent group).
Fast Reactions
FAST VP policies set the percentage of data that can live on each of 3 tiers. Based on the percentages in the policy and the historical activity patterns, FAST VP sets performance thresholds for promotion and demotion. These thresholds are adjusted for each device in the array every 10 minutes.
For each disk I/O that happens from the device, a counter is updated to note the activity. The counters decay over time, so while there is historical data, there is also a bias toward more recent activity. This activity level is continuously being checked against the thresholds, with promotions being even more heavily biased toward recent data (move hot things up quickly, move things that were hot for a while down slowly). A huge jump in activity on an area of a LUN can get that space promoted in a few minutes.
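The exact scoring FAST VP uses is not spelled out here, so the following is only a toy Python model of the idea: an activity counter that decays over time and is compared against promotion and demotion thresholds. The exponential decay, the half-life, the threshold values, and all of the names are invented for illustration.

```python
class DecayingCounter:
    """Toy activity score with exponential decay (illustrative only)."""

    def __init__(self, half_life_s):
        if half_life_s <= 0:
            raise ValueError("half_life_s must be positive")
        self.half_life_s = half_life_s
        self.score = 0.0
        self.last_t = 0.0

    def _decay_to(self, now_s):
        # Shrink the score according to the time since the last update.
        elapsed = now_s - self.last_t
        if elapsed < 0:
            raise ValueError("time must not go backwards")
        self.score *= 0.5 ** (elapsed / self.half_life_s)
        self.last_t = now_s

    def record_io(self, now_s, count=1):
        self._decay_to(now_s)
        self.score += count

    def value(self, now_s):
        self._decay_to(now_s)
        return self.score


def decide(score, promote_above, demote_below):
    """Compare a score with hypothetical promotion/demotion thresholds."""
    if score > promote_above:
        return "promote"
    if score < demote_below:
        return "demote"
    return "stay"


counter = DecayingCounter(half_life_s=3600)
counter.record_io(0, count=1000)                        # burst of activity
right_away = decide(counter.value(0), 500, 50)
five_hours_later = decide(counter.value(5 * 3600), 500, 50)
```

In this toy model, a burst pushes the score over the promotion threshold immediately, and the same area only drops under the demotion threshold after hours of silence. A model closer to the "up quickly, down slowly" behavior described above might keep separate short-term and long-term scores, but that is a guess at the design, not a description of it.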
And FAST VP is ready to make lots of moves. If the performance profile changes justify it, a VMAX may move over 10 TB of data each day even on the default relocation rate (priority of reacting to change). And since the change is ongoing, this presents a background load, not a huge burst to get in the way of normal operations. Since the research was done before VMAX was designed, the bandwidth to keep up with any needed moves without getting in the way of production workloads was part of the original specifications.
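To get a feel for what 10 TB a day means as a background load, here is a rough back-of-the-envelope calculation in Python. It assumes decimal terabytes and megabytes and moves spread evenly over 24 hours, neither of which is stated above. The extent group size uses the 768 KB extent and 10 extents per group mentioned earlier.

```python
def average_move_rate_mb_s(tb_per_day):
    """Average rate in MB/s (decimal units) for a daily relocation volume."""
    bytes_per_day = tb_per_day * 10**12
    seconds_per_day = 24 * 60 * 60
    return bytes_per_day / seconds_per_day / 10**6


def extent_groups_per_day(tb_per_day, extent_kb=768, extents_per_group=10):
    """Whole extent groups (10 x 768 KB = 7.5 MB each) moved per day."""
    group_bytes = extent_kb * 1024 * extents_per_group
    return (tb_per_day * 10**12) // group_bytes


rate = average_move_rate_mb_s(10)      # roughly 116 MB/s on average
groups = extent_groups_per_day(10)     # well over a million small moves
```

Under those assumptions, 10 TB a day averages out to a little over 100 MB/s, made up of more than a million 7.5 MB moves, which fits the description of a steady background load rather than a burst.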
Simple Tier Management
FAST VP supports 3 tiers of storage, and it does this without mixing the drive types into mixed pools. The virtual pools are constructed from protected space on a given technology (flash, high speed 10/15K drives, or bulk drives) and RAID type. FAST VP Tiers are built from up to 4 pools that are based on the same technology. FAST VP Policies are used to combine 2 or 3 tiers based on different technologies.
What makes this really different is the granularity of control this gives the storage manager over how a given device will be treated. Assume that a customer has an array with 3% of the capacity on flash, 27% on high speed drives, and 70% on bulk storage. Now they want to support 2 business needs: production needs all 3 tiers, but test does not need any flash, and does not need much high speed. With FAST VP, we can build policies of 5%/100%/100% for production and 0%/25%/100% for test. They can all share the same drives, so that when test is not running, production has the full use of every drive in the array. And no matter what changes the business may need to support changing I/O levels, a simple policy adjustment will reallocate the storage based on the best use given the historical performance data. If a few production devices need special attention, such as a desire to keep them on flash even when their performance may not warrant that, adding a new policy can make this happen rapidly.
Now consider what happens when tiering is built from RAID groups and combining drives into mixed pools. To allow test to have access to high speed capacity, I have to dedicate that capacity to the test pool in RAID group increments. To allow production to move data down to bulk storage, I again have to dedicate space for that in RAID group increments. And if I need to move space between different pools, that also has to be done in RAID group increments. And when customers are using 2+ TB drives, even a single RAID group can get pretty big. In the end, each of the mixed pools has to have the right physical drives in it to meet the current need, plus unused space to allow for things to move between tiers. This does not make for a pretty management picture.