To be fair, things are moving that way already. If you look at 3PAR, Compellent, EMC and some of the up-and-coming players such as Tintri or Nutanix, they're all adopting an SSD-first approach. The themes come from two angles. One is auto-tiering: shuffling blocks or chunks of data to their final resting place on disk, with varying degrees of granularity and regularity. The other is a tiered cache approach, from PCI-E in the server through to extending cache with SSD in the array. Some vendors are proposing caching appliances on the fabric. It's interesting, as the former approach lends itself to trending IO workload over a period of time to figure out where to move blocks, and the latter is more for bursty workloads. I think the approach taken will also depend in part on how application storage workloads evolve, as today some make good use of cache and others don't. Some workloads can be trended, others not so much (such as VDI in a non-persistent environment, as the LBAs used for desktops are trashed and new addresses are used when desktops are refreshed).
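To make the "trending" side of that concrete, here is a rough Python sketch of how a tiering engine might score chunks over time. It is purely illustrative and not any vendor's algorithm: the function names, the exponential moving average and the `alpha` weighting are all my own assumptions.

```python
def update_heat(heat, io_counts, alpha=0.3):
    """Blend the latest per-chunk IO counts into a running heat score.

    Assumption: an exponential moving average stands in for whatever
    trending a real array does. alpha is a hypothetical tuning knob.
    """
    chunks = set(heat) | set(io_counts)
    return {
        c: alpha * io_counts.get(c, 0) + (1 - alpha) * heat.get(c, 0.0)
        for c in chunks
    }


def pick_ssd_chunks(heat, ssd_slots):
    """Return the hottest chunks that fit in the SSD tier."""
    # Sort by heat descending, then by chunk id so ties are deterministic.
    ranked = sorted(heat, key=lambda c: (-heat[c], c))
    return set(ranked[:ssd_slots])


# Chunk "a" is steadily busy; chunk "c" has a single burst in interval 2.
intervals = [
    {"a": 100, "b": 10},
    {"a": 100, "b": 10, "c": 500},
    {"a": 100, "b": 10},
    {"a": 100, "b": 10},
]
heat = {}
for io_counts in intervals:
    heat = update_heat(heat, io_counts)

print(pick_ssd_chunks(heat, ssd_slots=1))
```

With one SSD slot, the steady chunk wins once the burst has decayed, which is why a trended approach suits predictable workloads and a cache suits bursty ones.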
9 posts • joined 12 Mar 2010
Re: How do you identify I/O hotspots and how do you address them?
Personally, a combination of three things comes into play here. Firstly, proper planning of LUN/RAID group layout, which considers the applications and workload types, not just a pool of IO and capacity given to a hypervisor (we're not there yet, unless you consider some upcoming vendors such as Tintri and Nutanix, who are built from the ground up for virtualisation). Proactive avoidance of hotspots will always alleviate the degree to which you need to be reactive.
Secondly, an array which has good API integration with your hypervisor. Being able to trace a suffering application to a specific LUN ID quickly is crucial to fast resolution for me. Many vendors give you good visibility from the hypervisor into the storage for mapping a VM back through the abstracted layers to RAID groups and LUN names, but being able to look at the storage GUI and identify which VMs reside on a given LUN is also massively helpful.
Thirdly, a good storage vendor will have the array intelligence to let you set threshold alerts for things such as queue depth, service time/disk latency and utilization. Being able to respond to that alert and quickly identify the application VM causing the contention (enter point number 2) is a major benefit, as it lets you decide intelligently how to deal with the issue when you don't have the luxury of just adding spindles.
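As a rough illustration of points two and three working together, the Python sketch below checks per-LUN samples against thresholds and lists the VMs on any LUN that breaches one. Everything in it is hypothetical: the threshold values, the `LunSample` fields and the `lun_to_vms` mapping are placeholders for whatever your array and hypervisor actually expose.

```python
from dataclasses import dataclass


@dataclass
class LunSample:
    lun_id: str
    queue_depth: float
    latency_ms: float
    utilization_pct: float


# Hypothetical limits; real values depend on your array and workloads.
THRESHOLDS = {"queue_depth": 32.0, "latency_ms": 20.0, "utilization_pct": 80.0}


def breaches(sample, thresholds=THRESHOLDS):
    """Return the names of the metrics that exceed their threshold."""
    return [name for name, limit in thresholds.items()
            if getattr(sample, name) > limit]


def hot_luns(samples, lun_to_vms):
    """Map each breaching LUN to its breached metrics and resident VMs."""
    report = {}
    for sample in samples:
        hit = breaches(sample)
        if hit:
            report[sample.lun_id] = {
                "breached": hit,
                "vms": lun_to_vms.get(sample.lun_id, []),
            }
    return report


samples = [
    LunSample("LUN_01", queue_depth=4, latency_ms=6.0, utilization_pct=35),
    LunSample("LUN_02", queue_depth=48, latency_ms=31.5, utilization_pct=70),
]
# Assumed to come from the hypervisor/array integration in point two.
lun_to_vms = {"LUN_01": ["web01"], "LUN_02": ["sql01", "vdi-pool-a"]}

print(hot_luns(samples, lun_to_vms))
```

The alert itself is the easy part. The `lun_to_vms` lookup is what turns "LUN_02 is busy" into something you can act on.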
There are also a number of third-party vendors with both hypervisor and array vendor awareness which can help correlate between the two (such as SolarWinds since their Tek-Tools acquisition), or even VMware's own Enterprise Operations Manager (providing VMware is your hypervisor of choice and your storage vendor has worked with VMware to allow themselves to be reported on in this way, which many have).
In terms of how you address these hotspots, there are a few methods. The simplest is migrating VM workloads to less stressed LUNs/datastores/RAID groups. Then there is storage-based QoS (this can get a little complicated unless you really know what your IO goals or thresholds are. Do you want to rob Peter to pay Paul?). There is better cache management (look for LUNs which are just filling cache and forcing it to flush with no benefit to response time, and disable caching on them so that LUNs that really benefit from cache can use it). Technologies which boost cache, with either server-side PCI-E based SSD or SSD in the array, can level out hotspots depending on the workload (there are also a number of fabric-based SSD caching appliances coming on). Then we have the marvel of auto-tiering, where chunks or blocks of data are dynamically moved at sub-LUN level between higher and lower performing tiers of disk (in reality, the larger arrays have this down and some of the mid-market arrays are catching up. Also, the profile of your data and how much of your capacity is driving the majority of IOPS is a massive factor in whether it will actually benefit you, along with the application workload profile itself).
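On that last point, a quick way to sanity-check whether auto-tiering is likely to pay off is to work out how much of your capacity drives most of your IO. A minimal Python sketch, assuming you can export per-chunk IOPS from the array (the function name and the 80% default are my own choices, not a vendor metric):

```python
def capacity_share_for_io(chunk_iops, io_fraction=0.8):
    """Fraction of equally sized chunks needed to cover io_fraction of total IOPS."""
    total = sum(chunk_iops)
    if not chunk_iops or total == 0:
        return 0.0
    running = 0.0
    # Walk the chunks from busiest to quietest until the IO target is covered.
    for count, iops in enumerate(sorted(chunk_iops, reverse=True), start=1):
        running += iops
        if running >= io_fraction * total:
            return count / len(chunk_iops)
    return 1.0


# Hypothetical per-chunk IOPS for ten equally sized chunks.
chunk_iops = [600, 300, 25, 25, 10, 10, 10, 10, 5, 5]
print(capacity_share_for_io(chunk_iops))  # 0.2
```

Here two of the ten chunks carry at least 80% of the IO, so a small fast tier could absorb most of the load. If the result comes out closer to 1.0, the IO is spread evenly and tiering has little to work with.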
Anyway, just a few thoughts from my perspective.
I find this highly amusing, and it couldn't be better timed after their recent slating of VMware cloud products upon the opening of VMworld. Back in ya box!
So it's basically an NS20 with 64-bit processors and a bit more cache. Did they hint at whether the easy management suite was a variant on the new Unisphere platform or something completely new, and whether it was running the new FLARE and DART to support some of the new plugins to support the VAAI portion of the vStorage API set from VMware?
So it has some block-level awareness of what's going on to avoid block-level commonalities across VMs, and implements its own version of changed block tracking (which we have on vSphere anyway). That's all well and good, but Pancetera doesn't mention application consistency once on their site that I can see. Any ideas on how they handle that?
I don't really think EMC have much to worry about. Proposing that Sepaton's grid architecture is better than DD's offering when EMC have Avamar in their portfolio already is a little weak. EMC partners will most likely position DD for point solutions, but also have Avamar in their armoury if having a grid architecture is really a deal maker/breaker. Bearing in mind that Avamar also does dedupe at source and is already on its 3rd generation, I'd say Sepaton are a little late to the party.