The problem of VDI boot storms is a fairly straightforward one. Virtual desktop workloads are predictable; they’re based on the work hours of desktop users, which typically run from about 8 a.m. to 5 p.m. each workday. The overall storage I/O that an average virtual desktop generates is quite low compared with that of a server workload, and so the density of desktop virtual machines on a host is typically much greater than with server virtualization. The initial startup of a desktop, however, is very resource-intensive: the operating system and applications read a large amount of data from disk while loading and executing.
A boot storm occurs when many virtual desktops all boot up during a short window of time (for example, between 8 a.m. and 9 a.m.), which causes intense concentrated storage I/O that can easily overwhelm a storage subsystem. If the storage subsystem isn’t designed to handle the heavy I/O load, you can effectively end up with a denial-of-service attack on your storage subsystem.
In such a case, desktop users will experience extreme slowness on their virtual desktop to the point where it becomes almost unusable. If this situation occurs on a daily basis, you can be sure that your users will be constantly complaining, your VDI project will be perceived as a failure and users will be screaming for their physical desktops back. You should avoid this situation at all costs since it makes a good technology solution that has many advantages look bad as the result of poor design decisions.
Fixing a poorly designed storage subsystem for VDI after it has been implemented is possible, but it can cost much more than if you had made the right decisions upfront -- such as in cases where the system needs to be entirely replaced because it can’t be upgraded to support your needs.
Once users boot up, log in and load applications, the storage I/O typically settles down to a minimal level. The IOPS difference between a desktop VM that is booting and one that has finished booting is extreme, which can make architecting storage for VDI environments a challenge. A typical desktop VM running Windows 7 will generate 50 to 100 IOPS while it is booting; once it is running normal workloads, the average drops to about 5 to 10 IOPS. Therefore, to successfully meet the I/O demands caused by boot storms, your storage needs to be designed to handle the worst-case scenario.
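To put a number on that gap, the short Python sketch below works out how many times more IOPS a booting desktop needs than a running one, using only the ranges quoted above. The function name is made up for illustration.

```python
# Per-desktop IOPS ranges quoted in the article for a Windows 7 desktop VM.
BOOT_IOPS = (50, 100)    # while booting
STEADY_IOPS = (5, 10)    # under normal workloads


def boot_to_steady_ratio(boot=BOOT_IOPS, steady=STEADY_IOPS):
    """Return the (smallest, largest) boot-to-steady IOPS multiple."""
    smallest = boot[0] / steady[1]   # lightest boot vs. busiest steady state
    largest = boot[1] / steady[0]    # heaviest boot vs. quietest steady state
    return smallest, largest


low, high = boot_to_steady_ratio()
print(f"A booting desktop needs {low:.0f}x to {high:.0f}x its steady-state IOPS")
```

With these figures, a booting desktop needs somewhere between 5 and 20 times the I/O it needs the rest of the day.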
Architecting storage to be able to handle the IOPS requirements of boot storms can get very expensive. Typically, to add more IOPS to a storage array, you need to add more spindles so the load can be spread across more hard drives. This means you’ll have far more storage capacity than you need. Imagine designing a highway with eight lanes just to handle an hour or two of rush-hour traffic each day when the rest of the time two lanes are sufficient. The end result is an extremely expensive highway that must be maintained.
Using SSDs to solve the VDI boot storm problem
Instead of outfitting your whole storage array to handle the required IOPS to weather boot storms, there’s a better solution that takes a more surgical approach. Rather than building an eight-lane highway, you could add a pair of HOV lanes (in the form of SSDs) to handle the peak traffic periods.
SSDs are much faster than traditional mechanical drives, which are limited by their rotational speeds. Whereas a typical 15,000 rpm SAS drive might deliver a maximum of 180 IOPS, an average SSD delivers about 5,000 IOPS. This performance comes at a greatly increased cost, of course. While a storage device consisting entirely of SSDs would be great for your virtual desktops, for most organizations the cost is prohibitive.
But using a limited number of SSDs to shoulder the high I/O that occurs during boot storms is much more cost-effective. Doing this allows you to use mostly lower-cost SAS or SATA disks in your storage array to provide the capacity you need and a small number of SSDs to provide the performance you need during peak I/O loads.
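As a rough illustration of why the mixed approach is cheaper in drive count, the Python sketch below compares an all-SAS design sized for the boot peak with a hybrid design. The per-drive figures are the ones quoted above; the fleet of 500 desktops at 60 IOPS while booting and 10 IOPS once running is a hypothetical example. The sketch ignores RAID overhead, read/write mix and capacity, and it assumes the SSDs absorb all of the boot I/O.

```python
import math

SAS_IOPS = 180      # per 15,000 rpm SAS drive (article's figure)
SSD_IOPS = 5_000    # per SSD (article's figure)


def drives_needed(target_iops, iops_per_drive):
    """Smallest whole number of drives that meets the IOPS target."""
    return math.ceil(target_iops / iops_per_drive)


# Hypothetical fleet: 500 desktops, 60 IOPS each at boot, 10 each once running.
desktops = 500
peak_iops = desktops * 60      # 30,000 IOPS during a boot storm
steady_iops = desktops * 10    # 5,000 IOPS the rest of the day

all_sas = drives_needed(peak_iops, SAS_IOPS)        # SAS sized for the peak
hybrid_sas = drives_needed(steady_iops, SAS_IOPS)   # SAS sized for steady state
hybrid_ssd = drives_needed(peak_iops, SSD_IOPS)     # SSDs sized for the peak

print(f"All-SAS design: {all_sas} SAS drives")
print(f"Hybrid design:  {hybrid_sas} SAS drives + {hybrid_ssd} SSDs")
```

Under these assumptions the all-SAS design needs 167 drives to survive the boot storm, while the hybrid design needs 28 SAS drives plus 6 SSDs.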
Option A: Put certain files on SSD. There are several ways to implement this type of solution; the first is to use a pool of SSD storage to hold VM master base images and replicas. When using Linked Clones (in VMware View) or Machine Creation Services (in XenDesktop) with VDI, the base image is a central read-only disk that all desktop VMs share. It’s the master copy of the deployed desktop operating system. Individual writable snapshots are then kept for each VM to hold any changes that are made to the base disk.
When desktop VMs are going through the boot process, most of the disk activity comes from the base image, where most of the operating system and application files are stored. Therefore, storing the base images and their replicas on SSD storage can eliminate boot storms. All of the individual VM snapshot disks can be stored on lower-tier (SAS or SATA) storage.
Option B: Use SSD as a caching layer. Another option to handle boot storms is to use a caching layer of fast SSDs in front of a slower storage pool made up of SAS or SATA drives. FalconStor offers such a solution with its NSS SAN Accelerator for VMware View, which consists of an appliance that contains SSDs; the appliance is placed between a host and its storage device. The appliance acts as a caching layer, and all storage I/O funnels through it to get to the back-end storage device. The caching appliance can identify frequently used disk blocks and automatically cache them so they’re read from the fast SSDs instead of the slower back-end storage. It can dynamically adjust to any high I/O demands as needed and eliminate boot storms by caching common data such as the VM base image.
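The general idea behind a read cache of this kind can be sketched in a few lines of Python. This is a toy model, not FalconStor's implementation: it assumes a simple least-recently-used (LRU) policy, treats the base image as a run of numbered blocks, and boots desktops one after another rather than concurrently. All class and function names are made up for illustration.

```python
from collections import OrderedDict


class LRUReadCache:
    """Tiny LRU cache of block numbers, standing in for the SSD layer."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, block):
        if block in self.blocks:
            self.blocks.move_to_end(block)       # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1                     # served from back-end storage
            self.blocks[block] = True
            if len(self.blocks) > self.capacity:
                self.blocks.popitem(last=False)  # evict least recently used


def simulate_boot_storm(desktops, base_image_blocks, cache_blocks):
    """Every desktop reads the same base-image blocks in order."""
    cache = LRUReadCache(cache_blocks)
    for _ in range(desktops):
        for block in range(base_image_blocks):
            cache.read(block)
    return cache.hits, cache.misses


hits, misses = simulate_boot_storm(
    desktops=100, base_image_blocks=1_000, cache_blocks=1_000
)
print(f"back-end reads: {misses}, served from cache: {hits}")
```

In this model, when the cache is large enough to hold the whole base image, only the first desktop's 1,000 reads reach the back-end disks and the other 99,000 are served from cache. If the cache is even one block too small, a plain LRU policy thrashes on this sequential pattern and every read misses, which is one reason sizing matters.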
Obviously, the best time to implement these kinds of solutions is during the design phase of your VDI project. Many storage devices available today offer storage tiering to support pools of drives with different performance characteristics. But SSDs can also be used to fix existing storage I/O bottlenecks that occur as the result of boot storms. By adding a small pool of SSD storage, you can relocate VM base images from existing slower storage tiers to the SSD storage to handle the high I/O caused by boot storms. Adding a FalconStor appliance to an existing infrastructure can also be an easy fix, as it can be dropped in between existing hosts and storage devices with minimal effort and modifications.
Sizing the SSD
When implementing SSDs as a tier of storage, it’s important to size them correctly to handle the peak I/O that will occur during your boot storms. To determine how much SSD to buy, calculate the maximum amount of I/O that your virtual desktops will generate. While you can use an estimated number based on a typical environment, every environment is different, so it is best to measure actual I/O on your existing physical desktops with a performance analysis tool such as Lakeside Software’s SysTrack VDI assessment tool. Multiply the number of virtual desktops on your hosts (say, 500) by the typical IOPS that a desktop generates when booting (say, 60) to determine the total IOPS that will be generated if every desktop boots simultaneously (500 x 60 = 30,000). It’s unlikely that every desktop will boot at the same time, so you can probably scale that down a bit. But it’s better to architect for too many IOPS than too few.
Once you have your IOPS requirements, you need to size your SSD storage tier accordingly. If a single SSD can handle 5,000 IOPS, then six of them will provide 30,000 IOPS. (Note that these are general numbers. To implement the right-sized solution for your environment, you should do a proper assessment of your requirements and work with a storage vendor to implement an SSD solution that will meet those requirements.)
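The sizing arithmetic above can be captured in a small Python helper. This is only a sketch built on the article's general numbers: the function name and the concurrency parameter (the fraction of desktops assumed to boot at the same moment) are hypothetical, and a real design should use measured IOPS and vendor guidance.

```python
import math


def size_ssd_tier(desktops, boot_iops_per_desktop, ssd_iops, concurrency=1.0):
    """Return (peak IOPS to design for, number of SSDs needed).

    concurrency is a hypothetical knob: the fraction of desktops assumed
    to boot at the same moment (1.0 = every desktop at once).
    """
    if not 0 < concurrency <= 1:
        raise ValueError("concurrency must be greater than 0 and at most 1")
    peak_iops = desktops * boot_iops_per_desktop * concurrency
    # Round up: a fraction of a drive still means buying a whole one.
    return peak_iops, math.ceil(peak_iops / ssd_iops)


# Worst case from the article: 500 desktops x 60 IOPS, 5,000 IOPS per SSD.
peak, ssds = size_ssd_tier(500, 60, 5_000)
print(f"Worst case: {peak:.0f} IOPS -> {ssds} SSDs")

# Scaled down, assuming only 80% of desktops boot at once.
_, ssds_scaled = size_ssd_tier(500, 60, 5_000, concurrency=0.8)
print(f"At 80% concurrency: {ssds_scaled} SSDs")
```

The worst case reproduces the article's figures of 30,000 IOPS and six SSDs; assuming 80% of desktops boot at once brings the count down to five. Rounding up keeps the design on the side of too many IOPS rather than too few.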
Preventing and solving boot storms doesn’t have to be overly expensive or complicated, and SSDs provide a good solution for handling one of the big problems that occur in virtual desktop environments. VDI projects can be costly to implement, and getting funding for VDI can be difficult in many companies because the ROI with VDI is not the same as with server virtualization. Mixing SSDs with lower-cost storage allows you to keep the cost of your project down and still deliver the performance necessary to handle boot storms. Once you have a properly architected storage system, you can enjoy the benefits that VDI provides without having to worry about your storage system becoming a bottleneck for your users.
This was first published in April 2011