Setting up a virtual desktop infrastructure (VDI) environment can be complicated; administrators must deal with many critical considerations, and storage is the most vital of them. The success of a VDI environment depends on user experience, and storage may be the area with the biggest impact on it. If you fail to design, implement and manage your virtual desktop storage properly, you're likely to run into performance problems that your users will notice.
How VDI environments impact storage
The largest management issue for storage in VDI environments is keeping up with the periods of peak usage during the highest storage I/O times. “Boot storms” are the biggest cause of I/O spikes; they occur when a big group of users start up and load applications at the same time. Starting up desktops takes up a lot of resources—the operating system and applications both perform a lot of reading from disk. And if you multiply that activity by hundreds of desktops that users are booting up, the amount of storage I/O generated can quickly bring down a storage array. Boot storms aren’t to be taken lightly and often can have a big impact on performance—they can last anywhere from 30 minutes to two hours.
Storage I/O settles down after the initial login and application loads, but other things can cause high storage I/O throughout the day, such as patching desktops, antivirus updates/scans and users signing off at the end of the day. It’s crucial for environments to have a data storage infrastructure that can handle these intense peak periods.
Cost is another pain point. VDI has a different ROI than server virtualization, and it’s not as easy to sell management on VDI environments. Plus, it can cost quite a bit to get a proper storage infrastructure for VDI. Many storage administrators must buy more data storage capacity than they need in order to have the required I/O operations per second (IOPS).
Be prepared to either have a bigger administrative team or spend more on administration. Hundreds or thousands of virtual disks for the virtual desktops will have to be created and maintained; this can be a burdensome and difficult task.
Determining storage requirements
To properly design a VDI infrastructure you need to understand the resource requirements needed by virtual desktop users. Don't make assumptions; to properly calculate resource requirements you need actual statistics from the users whose desktops will be virtualized. Profiling the users and measuring their resource usage is the key to determining storage requirements. Products from vendors like Lakeside Software Inc. and Liquidware Labs Inc. can collect data from users' desktops so you can perform an assessment of your environment and determine your needs. The longer you collect data, the less likely it will be affected by unusual or periodic activities.
The key measurement for storage is IOPS. A number of factors can affect IOPS (caching, block size), but the base calculation is derived from hard drive mechanics: rotational speed (rpm), latency and seek time. A typical 7,200 rpm drive might be capable of 75 IOPS, a 10K drive 125 IOPS, a 15K drive 175 IOPS and a solid-state drive 5,000 IOPS. For a RAID group, multiply the number of drives in the group by the IOPS of a single drive to get the total IOPS the group is capable of (e.g., six 15K drives x 175 IOPS = 1,050 IOPS). Caching can raise that figure, while RAID overhead and latency in network storage protocols can lower it.
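The RAID group arithmetic above can be sketched in a few lines of Python. This is only an illustration: the per-drive figures are the rough values quoted in this article, and the function name and dictionary keys are made up for the example. The result is a raw number that ignores caching, RAID overhead and protocol latency.

```python
# Rough per-drive IOPS figures quoted in the article (rules of thumb, not measurements).
DRIVE_IOPS = {"7.2K": 75, "10K": 125, "15K": 175, "SSD": 5000}


def raid_group_raw_iops(drive_type: str, drive_count: int) -> int:
    """Raw IOPS of a RAID group: drive count times per-drive IOPS.

    Ignores cache, RAID write overhead and storage protocol latency.
    """
    return DRIVE_IOPS[drive_type] * drive_count


# The article's example: six 15K drives.
print(raid_group_raw_iops("15K", 6))  # 1050
```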
You should always measure actual user resource usage, but there are accepted averages you can use as a starting point. The averages are based on the characteristics for certain types of users.
The accepted user resource averages were published as a PDF accompanying the original article.
Don't design your VDI storage to handle just average I/O loads; it has to accommodate peak I/O loads to provide a good user experience. Having enough storage capacity is obviously important, but how it performs is more important. Because the number of spindles plays a big part in a storage array's performance, you may end up with more capacity than you need just to get the required IOPS.
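A minimal sketch of why sizing for peak IOPS can leave you with surplus capacity. The desktop count, peak IOPS per desktop and drive capacity below are hypothetical placeholders, not figures from this article; substitute the numbers from your own assessment. RAID overhead and cache are deliberately left out to keep the comparison simple.

```python
import math


def drives_needed(desktops, peak_iops_per_desktop, gb_per_desktop,
                  drive_iops, drive_gb):
    """Return (drives for peak IOPS, drives for capacity, drives to buy)."""
    for_iops = math.ceil(desktops * peak_iops_per_desktop / drive_iops)
    for_capacity = math.ceil(desktops * gb_per_desktop / drive_gb)
    return for_iops, for_capacity, max(for_iops, for_capacity)


# Hypothetical inputs: 500 desktops, 20 peak IOPS and 20 GB each,
# on 15K drives rated at 175 IOPS and 300 GB apiece.
for_iops, for_capacity, to_buy = drives_needed(500, 20, 20, 175, 300)
print(for_iops, for_capacity, to_buy)  # 58 34 58
```

With these assumed inputs the IOPS requirement, not the capacity requirement, sets the drive count, so the array ends up with more space than the desktops strictly need.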
Fibre Channel vs. iSCSI vs. NAS
The type of storage is often dictated by budgets and available existing storage infrastructure. A Fibre Channel (FC) SAN would provide ample performance, but acquiring it may make VDI too expensive to implement. iSCSI and NAS (NFS) are attractive alternatives, but you need to ensure they can meet I/O requirements. Using 10 Gb Ethernet (10 GbE) can dramatically increase the throughput to iSCSI and NAS devices, but if you haven't implemented 10 GbE yet it could be just as expensive as implementing FC.
Peak IOPS loads may exceed the number of IOPS an iSCSI or NAS (NFS) device can handle. But adding cache or an accelerator in front of the storage device may improve performance sufficiently. Both iSCSI and NFS add CPU overhead to the host server; for iSCSI this can be offset with hardware initiators. Accelerator solutions typically won't work with NAS, but there are other caching solutions available for NAS (NFS).
LUN sizes and RAID
When sizing LUNs/volumes for VDI, focus on performance rather than capacity to ensure that your LUNs can provide the required IOPS. There truly is no magic number for LUN sizes as many factors come into play. Generally, the more spindles you have in the RAID group that makes up your LUN the better. You also shouldn't size your LUNs too small for the number of virtual desktops you'll have on them. Whether you're using full virtual disks or linked clones will also influence sizing as the latter requires much less disk space.
You have a range of RAID options to achieve either better protection or better performance. A key factor that will influence your RAID choice is the read/write ratio of your virtual desktops. When reading data from a RAID group there's no I/O penalty associated with the RAID overhead, but there's an I/O penalty when writing. The more protection you want, the more it will cost you in I/O penalties. For example, RAID 1 has an I/O penalty of two as writes have to be written to both drives; with RAID 5 it increases to four and for RAID 6 it's six. If your I/O workloads will involve more writing than reading, you want to use a RAID level that has less of a penalty when writing. Having a larger write cache in your array controller or using a custom RAID level like NetApp's RAID-DP can also help.
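One common way to apply these write penalties is to divide a RAID group's raw IOPS by the weighted cost of each operation: reads cost one back-end I/O and writes cost the penalty. The sketch below uses the commonly cited penalties of two, four and six for RAID 1, RAID 5 and RAID 6; the 50% write ratio is a hypothetical workload, and the function name is made up for the example. It ignores controller cache, which can soften the penalty in practice.

```python
# Back-end I/Os per front-end write for each RAID level.
WRITE_PENALTY = {"RAID 1": 2, "RAID 5": 4, "RAID 6": 6}


def usable_iops(raw_iops: float, write_fraction: float, raid_level: str) -> float:
    """Front-end IOPS a RAID group can sustain for a given read/write mix."""
    penalty = WRITE_PENALTY[raid_level]
    read_fraction = 1.0 - write_fraction
    # Each read costs 1 back-end I/O; each write costs `penalty` back-end I/Os.
    return raw_iops / (read_fraction + write_fraction * penalty)


# Six 15K drives (1,050 raw IOPS) with a hypothetical 50% write workload.
for level in WRITE_PENALTY:
    print(level, round(usable_iops(1050, 0.5, level)))
# RAID 1 700
# RAID 5 420
# RAID 6 300
```

The same six drives deliver less than half the front-end IOPS under RAID 6 that they do under RAID 1 for this assumed mix, which is why the read/write ratio matters when you choose a RAID level.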
SAS drives offer better performance but SATA drives can lower storage costs. Fast 15K drives can speed things up but at an increased cost compared to 10K drives. Solid-state drives (SSDs) offer blazing performance but have a hefty price tag. Choosing drives to handle virtual desktop infrastructure workloads usually comes down to buying the best drives you can afford. Slower performing SATA drives typically aren't desirable for most VDI workloads, so SAS drives are a better choice.
The platters of a 15K drive spin faster, so data is read and written faster and rotational latency is reduced, but the head actuator that moves across the platters to access data is no faster. So even though a 15K drive spins 50% faster than a 10K drive, overall performance increases by only approximately 30%, which still results in higher IOPS.
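A simplified model shows why the gain is smaller than the increase in spindle speed. It estimates random IOPS as one operation per average seek plus average rotational latency (half a revolution). The 4 ms seek time is a hypothetical value held constant for both drives, and the model ignores transfer time and caching, so the percentage it produces depends on those assumptions and will not necessarily match the approximate figure given above.

```python
def rotational_latency_ms(rpm: int) -> float:
    """Average rotational latency: time for half a revolution, in milliseconds."""
    return 0.5 * 60_000 / rpm


def estimated_iops(rpm: int, avg_seek_ms: float) -> float:
    """Rough random-I/O estimate: one operation per (seek + rotational latency)."""
    return 1000 / (avg_seek_ms + rotational_latency_ms(rpm))


SEEK_MS = 4.0  # hypothetical average seek time, the same for both drives

iops_10k = estimated_iops(10_000, SEEK_MS)
iops_15k = estimated_iops(15_000, SEEK_MS)
gain = iops_15k / iops_10k - 1

print(f"10K: {iops_10k:.0f} IOPS, 15K: {iops_15k:.0f} IOPS, gain: {gain:.0%}")
```

Because the seek time does not shrink with rpm, only part of each operation gets faster, and the estimated gain stays well below 50%.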
You can mix and match drive types to provide faster storage where needed and use cheaper, slower storage for less demanding workloads. You might store the master disks for linked clones on fast SSD storage and the delta disks on SAS storage. You could take this a step further and use an automated tiering application to automatically balance workloads based on demand.
Caching and SAN accelerators
Using a caching device or a SAN accelerator can make up for slower performing storage devices and provide more IOPS to deal with boot storms and other periodic I/O peaks. It can also save money because you may be able to use less-expensive storage devices but still be able to handle your VDI I/O workloads. A caching device like NetApp's Flash Cache can make a huge difference and can greatly increase the number of IOPS your storage is capable of. Configure your caching for the appropriate areas; events like boot storms generally are very read intensive so a larger read cache will make a big difference.
Other helpful storage features
Storage arrays come bundled with many features that can help offload methods and processes that might normally be done elsewhere. Allowing the storage array to handle the things it does best can increase efficiency and performance. Here are some storage array features that can be beneficial in a VDI environment.
Data protection. Features, like Microsoft Volume Shadow Copy Service (VSS), that save previous versions of changed files, can make it easier for users to restore their own files. But implementing this feature on all user desktops can cause undesirable overhead and increase storage array I/O. With FalconStor's NSS SAN Accelerator, you can load an agent into the VDI gold master desktop template that allows the virtual desktop to communicate with the NSS SAN Accelerator appliance so any file changes that occur inside the guest OS are backed up by the appliance. Files can be recovered by users who can browse through previous versions and restore files to their desktop without involving the back-end storage device.
Data deduplication. Data deduplication can greatly reduce the amount of storage you'll need for your virtual desktops, especially if you're using full image virtual machines (VMs) instead of linked clones. If you have 100 desktops, each with a 20 GB disk, you'd need approximately 2 TB of desktop space. But VDI users typically run the same OS and use many of the same applications, so there's lots of duplicate data. Data deduplication can reduce the amount of disk space needed when using full image virtual desktops by as much as 90% and reduce the 2 TB to 200 GB. With linked clones, a single master disk is shared with all writes saved to a delta file, which may be only 2 GB to 5 GB. But if you plan to use full images, data deduplication is a must.
Thin provisioning. Linked clones are already space efficient, so thin provisioning won't provide much of a benefit. But when using full image virtual desktops, thin provisioning can be a huge space saver, allowing you to overallocate storage. Thin provisioning coupled with data dedupe can provide tremendous space savings when using full images. Thin provisioning can be done at the storage array layer or virtualization layer. While you can implement it at both layers, in a dense VDI environment it might make more sense to offload it to the storage array so there's less overhead on the virtualization layer. Doing so also simplifies management, because you only have to monitor and manage thin disks in one place.
VMware vStorage APIs for Array Integration. VMware Inc.'s vStorage APIs for Array Integration (VAAI) allow storage-related tasks normally performed by the virtualization layer to be offloaded to the storage system, including data copy operations (cloning, Storage vMotion), disk block zeroing and vmdk file locking. Leveraging VAAI in a VDI environment can provide benefits, as disk operations can be completed quicker and more efficiently than can normally be done by the hypervisor. VAAI is still rather new, and adoption and integration by storage vendors is still a work in progress, but storage arrays that support VAAI can provide some good benefits today and probably even more as the technology matures.
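To put numbers on the space savings described under data deduplication, the sketch below reproduces the article's example of 100 desktops with 20 GB disks and a 90% deduplication saving, then compares it with linked clones. The 20 GB master disk size for the linked-clone case is an assumption made for the comparison, and the function names are made up for the example.

```python
def full_image_capacity_gb(desktops: int, disk_gb: int, dedupe_savings_pct: int = 0) -> float:
    """Space for full-image desktops after an assumed deduplication saving."""
    return desktops * disk_gb * (100 - dedupe_savings_pct) / 100


def linked_clone_capacity_gb(desktops: int, master_gb: int, delta_gb: int) -> int:
    """Space for linked clones: one shared master disk plus a delta file per desktop."""
    return master_gb + desktops * delta_gb


print(full_image_capacity_gb(100, 20))       # 2000.0 GB, roughly 2 TB
print(full_image_capacity_gb(100, 20, 90))   # 200.0 GB with 90% savings
print(linked_clone_capacity_gb(100, 20, 2))  # 220 GB with 2 GB delta files
print(linked_clone_capacity_gb(100, 20, 5))  # 520 GB with 5 GB delta files
```

Actual deduplication ratios and delta file growth vary by environment, so treat these as planning estimates rather than guarantees.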
Know your needs
There are many things to consider when designing storage to support a virtual desktop infrastructure environment. While budgets may limit some of your options, there are a number of creative solutions available that can help you get the performance your virtual desktops will require. But the first step is to know your requirements; a proper assessment will help you define storage requirements that will, in turn, help you implement a properly sized storage solution. With a right-sized storage system in place, you can enjoy the benefits of VDI without worrying that your storage system will become a bottleneck for your users.
Use linked clones to save storage
Linked clones can be an invaluable feature to use in a virtual desktop infrastructure (VDI) environment. Linked clones work by having a single master virtual machine (VM) that holds an image of the base operating system the desktops will use. All virtual desktops read from this image, with any writes captured in a separate delta file created for each VM. Delta files are typically small; they could grow as large as the master disk if every disk block were written to, but that's unlikely to happen. Linked clones can be periodically refreshed to include patches and operating system and application updates. Linked clones offer clear advantages, but they can be more complicated to maintain than full disk images.
Virtual machine RAM and paging
The amount of RAM assigned to a virtual machine can have a big impact on its performance. If you don't assign enough RAM, the operating system will start paging to disk, which can greatly increase the amount of disk I/O -- a situation you want to avoid as the needless storage I/O can degrade performance. Assigning too much RAM can cause swapping at the virtualization layer if a host has overcommitted memory, which can also degrade storage performance. It's OK to overcommit host memory and it's commonly done with virtual desktop infrastructure (VDI); just make sure you don't completely exhaust your host memory.
SAN accelerators are a great way to add a high-performance caching layer in front of your existing storage device. FalconStor's Network Storage Server (NSS) SAN Accelerator for VMware View is an easy-to-deploy appliance that can improve a storage system's performance. It may even let you use low-cost SATA drives for your VDI storage and still get adequate performance.
Avoiding I/O spikes
Events that cause I/O spikes like boot storms can't be avoided, but other operations that cause I/O spikes can be. Use staggered schedules when performing antivirus scans/updates, as well as patching and updating operating systems and applications. By spreading the load across a longer period of time, you can avoid concentrated I/O on your storage system. You can also offload antivirus processing from the guest OS layer to the virtualization layer, where it can run more efficiently. VMware Inc.'s vShield Endpoint can offload antivirus scanning to a dedicated virtual appliance, eliminating the need to run A/V software inside the guest OS. This greatly reduces the number of antivirus instances you have to run on your hosts and, because it's centralized, it's easier to manage and uses far fewer resources.
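A staggered schedule can be as simple as splitting desktops into batches and spacing the batches evenly across a maintenance window. The sketch below is a generic illustration, not tied to any particular antivirus or patching product; the desktop names, window start and batch size are hypothetical.

```python
from datetime import datetime, timedelta


def staggered_schedule(desktops, window_start, window_minutes, batch_size):
    """Map each desktop name to a start time, with batch_size desktops per slot."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batches = [desktops[i:i + batch_size]
               for i in range(0, len(desktops), batch_size)]
    if not batches:
        return {}
    # Space the batches evenly across the window.
    step = timedelta(minutes=window_minutes / len(batches))
    return {
        name: window_start + index * step
        for index, batch in enumerate(batches)
        for name in batch
    }


desktops = [f"vdi-{n:03d}" for n in range(10)]  # hypothetical desktop names
start = datetime(2011, 5, 1, 1, 0)              # hypothetical 1:00 a.m. window
schedule = staggered_schedule(desktops, start, window_minutes=120, batch_size=5)

for name, when in schedule.items():
    print(name, when.strftime("%H:%M"))
```

Smaller batches spread the I/O more thinly but lengthen the time until every desktop has been scanned or patched, so the batch size is a trade-off to tune against your storage system's headroom.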
This story was previously published in Storage magazine.
This was first published in May 2011