_How Many IOPS?: “A question I get asked occasionally is; ‘How many IOPS can my RAID group sustain?’ in relation to Enterprise class arrays.”
(Via The Storage Architect.)_
An excellent little reminder about how to estimate your storage bay’s performance capabilities. It also makes me wonder if we aren’t approaching the question of shared storage the wrong way. Instead of asking each vendor for the best performance they can deliver, how about asking them for the worst we can expect?
Everyone knows that you can tune a storage system and the IO test profile to pull out massive IOps or MBps (but not both at the same time). Wouldn’t a more realistic measure of the value of the storage system be based on how it performs when being bombarded with a completely random IO profile?
This approach (while somewhat brutal) seems to carry more value as an analytic tool for sizing storage used for virtualisation projects, where all of your different IO profiles turn into a smear. If you buy your storage based on the worst case, rather than on an optimistic evaluation of the maximum capacity, you know that in real-world use you’ll come out ahead.
We know that we can calculate with some precision the worst-case IOps per disk based on the measured latency; what we need to know is how much value is added by the array, with its various tiers of cache and the algorithms it uses.
On paper, for a standard RAID group of 5 disks (based on the FC disks noted in the Storage Architect article), we can count on a minimum of 925 IOps. Obviously we rarely see performance this bad, since the disks themselves carry 32 MB of local cache, then there is the cache on the controller, and so on up the chain (not to mention that real-world IO is never totally random, even with the smear factor taken into account). I think that the question we should be asking storage vendors is: how much value do you add to the equation? (1)
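As a sanity check, that floor falls straight out of the per-disk service time. A minimal sketch of the arithmetic (the ~5.4 ms random-IO service time is my assumption for a 15K RPM FC disk, matching the ~185 IOps per spindle implied by the 925 figure):

```python
# Worst-case (fully random) IOPS floor for a RAID group.
# The 5.4 ms service time (seek + rotational latency) is an assumed
# figure for a 15K RPM FC disk, giving ~185 IOps per spindle.
def worst_case_iops(disks, service_time_ms=5.4):
    iops_per_disk = 1000.0 / service_time_ms  # one random IO per service time
    return disks * iops_per_disk

print(int(worst_case_iops(5)))  # 5-disk group -> 925
```

With no cache in the picture, every random IO pays the full mechanical cost, which is why this number is a floor rather than an estimate.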
I think that instead of tuning our bays and tests to determine the maximum values, we should develop a multi-client test where the bay is bombarded by random IO of varying sizes over a representative surface, in order to see the delta achieved by the intelligence of the storage solution.
Ideally, I’d like to see a storage array’s performance rated as a factor over the baseline capacity of the physical disks. As Mr. Evans notes in the article, IOps are a function of speed, not capacity. The buying decision then comes down to two criteria:
Speed factor
A performance factor of two means that under maximum stress the storage array should be able to double the number of raw disk IO operations. For a 5-disk RAID group, the 925 IOps guaranteed by the disks are multiplied by 2, for a baseline of 1850 IOps.
Storage volume
Once I have my IOps requirements, I simply choose the size of disk required to cover my storage needs.
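Taken together, the two criteria reduce sizing to a short calculation. A minimal sketch, assuming the ~185 IOps-per-disk floor from above and ignoring RAID parity overhead; the capacity figure is a made-up example:

```python
# Sizing under the two-criteria model: required IOPS fixes the disk count
# (via the worst-case per-disk floor and the vendor's speed factor), then
# the capacity need picks the disk size. Parity overhead is ignored here.
import math

def disks_needed(required_iops, iops_per_disk=185, speed_factor=2.0):
    return math.ceil(required_iops / (iops_per_disk * speed_factor))

def disk_size_gb(required_capacity_gb, disks):
    return math.ceil(required_capacity_gb / disks)

n = disks_needed(1850)           # speed criterion first...
print(n, disk_size_gb(2000, n))  # ...then size 2 TB across those disks
```

Note that the speed criterion comes first: capacity only decides how big each spindle has to be, never how many you need.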
I’ve already done some basic benchmarking based on this approach, using multiple virtual machines (5-20), each running a different IOMeter profile with a mix of read/write operations and wildly variable block sizes. On a name-brand bay the results were pretty painful: the performance factor actually came out below one. The same tests on a storage virtualisation solution gave a performance factor of 2.7. (2)
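For clarity, the performance factor here is simply measured aggregate IOps divided by the raw-disk baseline. A minimal sketch; the measured numbers below are hypothetical, chosen only to mirror the two outcomes described:

```python
def performance_factor(measured_iops, raw_disk_iops):
    # Factor > 1: the array's caches and algorithms add value under
    # random load; factor < 1: the controller overhead costs you IOps.
    return measured_iops / raw_disk_iops

# Hypothetical measurements against the 925 IOps raw baseline:
print(round(performance_factor(700, 925), 2))   # below 1: value subtracted
print(round(performance_factor(2500, 925), 2))  # ~2.7: value added
```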
Now we need to learn what an averaged IO profile looks like for a virtual server implementation, so that we can add in a reasonable “real-world” usage factor. My current guesstimate is that this will be on the order of 3-5, but I’m being conservative here.
Other issues that come into play in order to build a useful benchmark structure are details like:

- standardizing a number of RAID groups in order to validate different possible optimizations;
- the use of multiple LUNs based on the same RAID group;
- IO going to one LUN or shared out over multiple LUNs;
- the number of active paths to the storage array;
- the number of concurrent IO sources, both physical servers and virtual machines, …

Obviously there are a lot of factors to consider, but if we can normalize a bunch of them, this could be a very useful metric.
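To make the normalization concrete, one could pin the benchmark down to an explicit parameter matrix; every value below is a placeholder I’ve invented for illustration, not a recommendation:

```python
# A sketch of the benchmark parameter matrix described above; all the
# values are placeholders, not recommendations.
import itertools

matrix = {
    "raid_groups": [1, 2, 4],
    "luns_per_raid_group": [1, 4],
    "active_paths": [1, 2, 4],
    "concurrent_io_sources": [5, 10, 20],  # physical servers and/or VMs
    "block_size_kb": [4, 8, 64],
}

# Every combination that a normalised benchmark run would have to cover.
runs = list(itertools.product(*matrix.values()))
print(len(runs))  # 162 combinations
```

Even this small matrix produces a lot of runs, which is exactly why agreeing on which factors to freeze matters before the metric can be comparable across vendors.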
Comments anyone?