Before I go into details, here is a quick introduction to performance and bottlenecks. System performance has four major areas: memory (speed or capacity), processing power, network bandwidth, and disk (IOPs and capacity). Every system is limited by one of these areas at any given time; you just hope your workload isn't hitting the current bottleneck! Performance tuning is simply the art of pushing each bottleneck back so it isn't in the way.
In the days of the Intel 5400 and ESX 3.5, my customers were almost always bound by memory capacity. If I had more memory in the system, I could put more virtual machines in the cluster. Also, many customers were "new" to virtualization and were getting their feet wet by virtualizing the "low-hanging fruit". These virtual machines often didn't consume a lot of resources in any area. Because memory was the smallest pool, it was typically exhausted first.
Fast forward to today: the memory limitation has been removed and we are now comfortable putting high-I/O, critical boxes on the virtualization infrastructure. What is the next bottleneck I am seeing? It is now disk IOPs. You may have enough capacity, but you don't have the disk performance you need. Since IOPs really aren't included in vCenter and we haven't had to consider them until today, many people forget all about them!
How many of you actually calculate the impact on disk IOPs before adding a virtual machine? Do you know how to do it? Why do you need to do it?
To start, every disk drive in a SAN array can deliver a certain number of IOPs. This is how many reads or writes the disk can do per second. I will list the common drive types and an average of the IOPs below. To estimate the total number of IOPs, you simply count the drives in the datastore and multiply by the per-drive IOPs average. Have you ever heard the phrase "You need more spindles for that"? What they mean is that you may have enough disk to support the capacity you need, but you need to add more drives (spindles) to get the performance you need.
Let me demonstrate with an example. This is oversimplified, but hopefully it will prove the point. You have four 1TB SATA drives in a RAID group. This means the pool has 4TB of capacity and generates a total average of 300 IOPs (75x4). You now decide you want to create fifty server virtual machines, each with a 20GB virtual hard disk. You just consumed 1TB of the 4TB of space, but are you really going to boot and run fifty virtual machines on four SATA disks? As one of my managers used to say to me, you won't get there from here.
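To put numbers on that example, here is a rough Python sketch of the same arithmetic. It uses only the figures from the paragraph above and treats 1TB as 1000GB for simplicity. Like the example itself, it ignores RAID overhead and caching, so read the result as a back-of-the-envelope figure, not a sizing tool.

```python
# Figures from the example above. 75 IOPs per SATA drive is the low-end
# average used in this post, not a measured value.
DRIVES = 4
DRIVE_CAPACITY_GB = 1000   # assumption: 1TB treated as 1000GB
IOPS_PER_DRIVE = 75
VM_COUNT = 50
VM_DISK_GB = 20

# Total capacity and total IOPs both scale with the number of drives.
pool_capacity_gb = DRIVES * DRIVE_CAPACITY_GB
pool_iops = DRIVES * IOPS_PER_DRIVE

# Capacity consumed by the virtual machines, and each VM's share of IOPs.
capacity_used_gb = VM_COUNT * VM_DISK_GB
capacity_used_pct = 100 * capacity_used_gb / pool_capacity_gb
iops_per_vm = pool_iops / VM_COUNT

print(f"Capacity used: {capacity_used_gb} GB of {pool_capacity_gb} GB "
      f"({capacity_used_pct:.0f}%)")
print(f"IOPs available per VM: {iops_per_vm:.1f}")
```

The pool is only a quarter full, yet each virtual machine is left with about 6 IOPs, a small fraction of what even a single SATA drive delivers.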
Something to think about...
Average IOPs per drive
- 7200 RPM SATA = 75-100
- 10k RPM SAS/FC = 100-130
- 15k RPM SAS/FC = 150-190
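If you want to turn the list above into a quick spindle count, something like the following Python sketch may help. The function names and dictionary keys are made up for illustration, and the 1500 IOPs target in the usage lines is a hypothetical workload figure, not a recommendation. The drive count is sized against the low end of each range to stay conservative.

```python
import math

# Per-drive IOPs ranges (low, high) copied from the list above.
DRIVE_IOPS = {
    "7200_sata": (75, 100),
    "10k_sas_fc": (100, 130),
    "15k_sas_fc": (150, 190),
}

def pool_iops_range(drive_type, drive_count):
    """Return the (low, high) IOPs estimate for a pool of identical drives."""
    low, high = DRIVE_IOPS[drive_type]
    return low * drive_count, high * drive_count

def drives_needed(drive_type, target_iops):
    """Return how many spindles it takes to reach target_iops,
    sized against the low-end per-drive figure."""
    low, _ = DRIVE_IOPS[drive_type]
    return math.ceil(target_iops / low)

# Usage: the four-drive SATA pool from the example, then a hypothetical
# 1500 IOPs target on two different drive types.
print(pool_iops_range("7200_sata", 4))
print(drives_needed("7200_sata", 1500))
print(drives_needed("15k_sas_fc", 1500))
```

For the same hypothetical target, the 7200 RPM SATA pool needs twice as many spindles as the 15k pool, which is exactly the "more spindles" conversation.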