HOW TO CHOOSE DISK ARRAY – CHAPTER 3
While the previous two chapters of our popular online series covered the technical parameters of possible solutions, today's chapter offers practical recommendations that should prevent disappointment with the chosen solution. How should the requirements be drawn up?
Let's say the CIO already knows what is required, or rather what is not required. Many suppliers are invited, and each presents their product as number one. How do you find your bearings and avoid swallowing the bait?
DEFINE CLEARLY QUANTIFIABLE PARAMETERS
These are the required capacities, performance, and software functionality such as Automated Tiering, Dynamic Provisioning, Snapshots, Clones, Replication...
Regarding architecture, it is good to define a set of acceptance tests verifying that the disk system is actually redundant and that the failure of a single component does not make data unavailable.
But let's get back to performance, because it is the most contentious parameter. A disk array shows entirely different performance figures depending on whether you measure reads or writes, random or sequential operations, small or large blocks, one or many concurrent streams, and so on. If you, as the customer, do not specify the kind of operation, the supplier will choose test conditions that work in their favour, and you will close a deal on something that is a huge compromise from the very beginning.
HOW TO DEFINE PERFORMANCE TESTS
It is necessary to specify exactly which program should be used for testing and which parameters should be set. There are quite a number of such programs, e.g. IOMeter or IOZone; whoever wants the backing of an authority (e.g. Microsoft) can choose the testing utility SQLIO.
It is important to test not on one but on many competing LUNs, so that all processors and controllers share in processing the load. Only this way can you prove how individual LUNs will influence each other in live production.
Tests should respect the I/O pattern generated by common applications: these usually work with block sizes of 4-16 kB for random operations and >64 kB for sequential operations.
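The advice above (many competing files, realistic block sizes, measured latency) can be sketched with a tiny load generator. This is a minimal Python illustration, not a replacement for SQLIO or IOMeter; the file sizes, file counts, and operation counts are made-up assumptions:

```python
# Minimal sketch: generate a concurrent random-read load against several
# test files (one per simulated LUN) and record per-operation latency.
import os
import random
import tempfile
import threading
import time

BLOCK = 8 * 1024              # 8 kB, a typical random-I/O block size
FILE_SIZE = 4 * 1024 * 1024   # 4 MB per test file (tiny, for illustration)
FILES_PER_RUN = 2             # number of competing files ("LUNs")


def make_test_file(directory, index):
    """Create one test file filled with random data."""
    path = os.path.join(directory, "testfile_%d.dat" % index)
    with open(path, "wb") as f:
        f.write(os.urandom(FILE_SIZE))
    return path


def random_read_worker(path, n_ops, latencies):
    """Issue n_ops random 8 kB reads and record each latency in seconds."""
    with open(path, "rb") as f:
        for _ in range(n_ops):
            offset = random.randrange(0, FILE_SIZE - BLOCK)
            t0 = time.perf_counter()
            f.seek(offset)
            f.read(BLOCK)
            latencies.append(time.perf_counter() - t0)


def run_load(n_ops_per_file=200):
    """Run one worker thread per test file and return all latencies."""
    tmp = tempfile.mkdtemp()
    paths = [make_test_file(tmp, i) for i in range(FILES_PER_RUN)]
    latencies = []  # list.append is thread-safe in CPython
    threads = [threading.Thread(target=random_read_worker,
                                args=(p, n_ops_per_file, latencies))
               for p in paths]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return latencies


if __name__ == "__main__":
    lat = run_load()
    print("ops: %d, avg latency: %.3f ms"
          % (len(lat), 1000 * sum(lat) / len(lat)))
```

Note that reads here go through the OS page cache, so real benchmarking tools additionally use unbuffered I/O; the sketch only shows the structure of a multi-stream test.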
A utility that can always reliably flood a disk array is the above-mentioned SQLIO. Because the test results below are based on this tool, let us introduce it. It generates a workload against a disk array; you can define the block size, the type of operation (random/sequential), the number of competing test files, and so on.
An example of syntax:
sqlio.exe -kR -s360 -frandom -o8 -b8 -LS -Fparam.txt
Where -kR: read operation, -s360: test length in seconds, -frandom: random access pattern, -o8: eight outstanding I/Os per thread, -b8: block size of 8 kB, -LS: capture latency statistics, -Fparam.txt: reference to a file listing the test files.
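For illustration, the param.txt file referenced by -F lists the test files one per line, each with a thread count, a CPU affinity mask, and a file size in MB (the line format as described in the SQLIO readme). A small helper that generates such a file could look like this; the drive letters and values are assumptions:

```python
# Hypothetical helper that writes a SQLIO parameter file.
# Each line: <file path> <threads> <affinity mask> <size in MB>.
def write_param_file(path, test_files, threads=2, mask="0x0", size_mb=100):
    lines = ["%s %d %s %d" % (f, threads, mask, size_mb) for f in test_files]
    with open(path, "w") as out:
        out.write("\n".join(lines) + "\n")
    return lines


# Example: two test files placed on different LUNs (paths are assumptions).
lines = write_param_file("param.txt",
                         [r"e:\testfile1.dat", r"f:\testfile2.dat"])
```

Pointing each file at a different LUN is what lets the test exercise all controllers at once, as recommended above.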
The list of parameters above is not complete; you can look up the details in the manual. Besides IOPS and MB/s, the result is primarily a latency histogram. This is rich information that says a lot about the future behaviour of the storage. The histogram shows not only the achieved performance but also its stability, i.e. whether latency stays close to the average value or chaotically departs from it. Such instability is caused by the architecture of the disk array.
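To make the stability point concrete, here is a small Python sketch that reduces a latency histogram (bucket in ms mapped to the percentage of operations) to a few summary figures. The histogram numbers are invented for illustration, not measured results:

```python
# Sketch: summarise a latency histogram of the kind SQLIO prints.
# Keys are the millisecond bucket (0 = under 1 ms, 1 = 1-2 ms, ...),
# values are the percentage of operations that landed in that bucket.
def latency_profile(histogram_pct):
    """Return (% of ops within 1 ms, % within 2 ms, worst bucket seen)."""
    within_1ms = sum(p for ms, p in histogram_pct.items() if ms < 1)
    within_2ms = sum(p for ms, p in histogram_pct.items() if ms < 2)
    worst = max(ms for ms, p in histogram_pct.items() if p > 0)
    return within_1ms, within_2ms, worst


stable = {0: 75, 1: 25}                  # everything within 2 ms
unstable = {0: 71, 1: 20, 3: 5, 4: 4}    # occasional 3-4 ms spikes

print(latency_profile(stable))    # (75, 100, 1)
print(latency_profile(unstable))  # (71, 91, 4)
```

Two arrays with the same average latency can thus look very different once you ask what share of operations stays under a threshold and how far the outliers reach.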
PERFORMANCE TESTS IN PRACTICE
Three disk systems in the Enterprise category made it to the final round of a tender that actually took place:
- Hitachi HUS-VM (Flash + disks)
- Competing system no. 1 (SSD + disks)
- Competing system no. 2 (Full Flash Array)
It should be said that all three systems achieved similar results in the number of operations served per second (IOPS). All three also managed to serve 100 % of operations within 5 milliseconds.
Nevertheless, the essential information is hidden in the details, namely in the stability of the results:
- The HUS-VM system achieved the best results; it served 75 % of operations within 1 millisecond and 100 % within 2 milliseconds.
- Competing system no. 1 did well too, serving 71 % of operations within 1 ms. But its results were not completely stable: due to the internal overhead of metadata handling, latencies occasionally jumped to 3-4 ms.
- Competing system no. 2 (Full Flash Array) was not able to serve any transaction within 1 ms.
WHAT’S THE CONCLUSION?
The Hitachi HUS-VM system confirmed its leading position, especially in the tests on real data. Compared with the synthetic tests, real operation showed how significant a role cache size plays; in the case of the HUS-VM it was 256 GB. The high percentage of operations served directly from cache was due not only to the relatively large cache but primarily to the advanced logic of its operation, in which only write operations are mirrored (reads need not be, since the data can be reloaded directly from disk if a cache module fails).
Whereas the synthetic tests made the Hitachi HUS-VM only a slight leader in the selection procedure, the practical tests with database workloads clearly confirmed its superiority.
Thanks to a clear and non-discriminatory specification of the required features and performance characteristics, the ordering party obtained high-quality hardware that covers the needs of its business, with enough performance headroom to handle growing performance and capacity requirements over the next five years.