What you'll learn: How file virtualization can alleviate rapid file storage growth in NAS systems. We discuss the advantages and disadvantages of shared-path and split-path file virtualization products, and how each product type can be used to address NAS sprawl.
File virtualization can help alleviate the rapid file storage growth occurring with many of today's NAS systems. File virtualization systems separate the physical location of a file from the representation of that file. File virtualization systems essentially eliminate the requirement for a user or application to know exactly where their files are stored as they see only a single global namespace (GNS). Depending on how it's implemented, file virtualization allows transparent file access, load balancing, data storage tiering, file migration, and even snapshots and replication for multiple homogeneous or heterogeneous NAS systems.
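The separation between a file's representation and its physical location can be pictured as a lookup table. The following is a minimal sketch, not any vendor's implementation; the `GlobalNamespace` class, the filer names and the paths are all hypothetical, and the actual data copy during a migration is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class GlobalNamespace:
    """Maps logical paths (what clients see) to physical locations (where data lives)."""

    # Hypothetical in-memory map: logical path -> (filer name, path on that filer)
    _locations: Dict[str, Tuple[str, str]] = field(default_factory=dict)

    def publish(self, logical_path: str, filer: str, physical_path: str) -> None:
        """Make a file visible in the namespace at the given logical path."""
        self._locations[logical_path] = (filer, physical_path)

    def resolve(self, logical_path: str) -> Tuple[str, str]:
        """Return the current physical location for a logical path."""
        return self._locations[logical_path]

    def migrate(self, logical_path: str, new_filer: str, new_physical_path: str) -> None:
        """Point the logical path at a new location.

        A real system would copy the data first; only the map update is modelled here.
        """
        if logical_path not in self._locations:
            raise KeyError(logical_path)
        self._locations[logical_path] = (new_filer, new_physical_path)


gns = GlobalNamespace()
gns.publish("/corp/finance/q1.xlsx", "filer-a", "/vol/vol1/finance/q1.xlsx")
before = gns.resolve("/corp/finance/q1.xlsx")

# The file moves to another filer, but the logical path clients use is unchanged.
gns.migrate("/corp/finance/q1.xlsx", "filer-b", "/export/archive/finance/q1.xlsx")
after = gns.resolve("/corp/finance/q1.xlsx")
```

The client keeps asking for `/corp/finance/q1.xlsx` before and after the move; only the entry in the map changes.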
File virtualization implementations can usually leverage Microsoft's Windows Distributed File System (DFS) and/or Linux/Unix automounters by acting as a management layer. This allows them to automatically update the DFS Namespace to include NAS filers and file servers, while also providing common management for multiple dissimilar NAS systems. F5 Networks Inc.'s ARX file virtualization appliance also monitors available disk space, while others (Avere Systems Inc.'s FXT Series and EMC Corp.'s Celerra NS with FAST) provide storage tiering.
No additional software is required to leverage DFS Namespace and Linux/Unix automounters. If the file virtualization technology fails, the file maps for Windows and mounts for Linux/Unix remain intact, allowing users and applications access to their files. Not all the file virtualization systems work with DFS or automounters, and some that do don't necessarily require them.
Types of file virtualization products
There are two types of file virtualization products: shared path and split path.
Shared-path file virtualization systems share the control and data path, which means that all connections to the NAS and all data to/from the NAS flow through the virtualization system. Shared-path file virtualization systems are full proxies that touch every file and every packet in the path before it's written or read.
Shared-path file virtualization advantages
- Allows files to be migrated in real-time even when in use; the file virtualization system updates the global namespace with the new physical location of the file
- Easy to operate
- Protects current investment
- Transparent retirement of older NAS or file systems
- Individual file-level granularity
- Heterogeneous NAS and/or file server support; eliminates NAS system lock-in
- Definable policies using file metadata such as file type, creation date or when last accessed
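A policy driven by file metadata can be thought of as an ordered list of rules, each pairing a condition with a destination. The sketch below is illustrative only: the `FileMeta` and `Policy` types, the 180-day threshold and the target names are assumptions, not the policy language of any product mentioned here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, List, Optional


@dataclass
class FileMeta:
    path: str
    file_type: str            # extension without the dot
    created: datetime
    last_accessed: datetime


@dataclass
class Policy:
    name: str
    matches: Callable[[FileMeta, datetime], bool]
    target: str               # hypothetical name of the destination NAS


def first_matching_target(
    meta: FileMeta, policies: List[Policy], now: datetime
) -> Optional[str]:
    """Return the target of the first matching policy, or None to leave the file in place."""
    for policy in policies:
        if policy.matches(meta, now):
            return policy.target
    return None


# Rules are evaluated in order; the thresholds and names are made up for illustration.
POLICIES = [
    Policy("media-to-capacity", lambda m, now: m.file_type in {"mp4", "iso"}, "capacity-nas"),
    Policy(
        "stale-to-archive",
        lambda m, now: now - m.last_accessed > timedelta(days=180),
        "archive-nas",
    ),
]

now = datetime(2010, 4, 1)
video = FileMeta("/corp/media/demo.mp4", "mp4", datetime(2010, 3, 1), datetime(2010, 3, 30))
old_doc = FileMeta("/corp/hr/policy.doc", "doc", datetime(2008, 1, 1), datetime(2009, 1, 15))
hot_doc = FileMeta("/corp/hr/new.doc", "doc", datetime(2010, 3, 20), datetime(2010, 3, 31))
```

Because the decision is made per file, each file can land on a different back-end system, which is what file-level granularity means in practice.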
Shared-path file virtualization disadvantages
- The added latency of passing through the file virtualization system can become a bottleneck, hurting response times and IOPS
- Single point of failure; a dead-box failure cuts off all access to the NAS and/or file systems
- Scalability is limited by the throughput of the shared-path file virtualization system
Split-path file virtualization systems separate the control and data paths, so the NAS connections and all data to/from the NAS don't pass through the file virtualization system. A split-path system is typically deployed as an x86 appliance connected to the LAN switch. It manages the namespace to direct files to the appropriate NAS or file system without intercepting any packets.
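The difference between the two designs comes down to which component carries the file data. The following sketch contrasts them under simplifying assumptions: the `Filer`, `SharedPathProxy` and `SplitPathDirector` classes are hypothetical, files live in memory, and a byte counter stands in for load on the virtualization layer.

```python
from typing import Dict, Tuple

Namespace = Dict[str, Tuple[str, str]]  # logical path -> (filer name, physical path)


class Filer:
    """Stand-in for a NAS system holding file contents in memory."""

    def __init__(self, files: Dict[str, bytes]) -> None:
        self.files = files

    def read(self, path: str) -> bytes:
        return self.files[path]


class SharedPathProxy:
    """Control and data share one path: every byte is relayed by the proxy."""

    def __init__(self, namespace: Namespace, filers: Dict[str, Filer]) -> None:
        self.namespace = namespace
        self.filers = filers
        self.bytes_relayed = 0

    def read(self, logical_path: str) -> bytes:
        filer_name, physical = self.namespace[logical_path]
        data = self.filers[filer_name].read(physical)
        self.bytes_relayed += len(data)  # the proxy sits in the data path
        return data


class SplitPathDirector:
    """Control path only: answers 'where is it?' and never touches file data."""

    def __init__(self, namespace: Namespace) -> None:
        self.namespace = namespace
        self.bytes_relayed = 0  # never incremented; no data flows through here

    def refer(self, logical_path: str) -> Tuple[str, str]:
        return self.namespace[logical_path]


filers = {"filer-a": Filer({"/vol1/report.doc": b"x" * 4096})}
namespace = {"/corp/report.doc": ("filer-a", "/vol1/report.doc")}

# Shared path: the client reads through the proxy.
proxy = SharedPathProxy(namespace, filers)
via_proxy = proxy.read("/corp/report.doc")

# Split path: the client asks for a referral, then reads from the filer directly.
director = SplitPathDirector(namespace)
filer_name, physical = director.refer("/corp/report.doc")
direct = filers[filer_name].read(physical)
```

Both clients get the same bytes, but only the shared-path proxy's counter grows with traffic, which is why its throughput caps scalability while a split-path system's does not.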
Split-path file virtualization advantages
- Nondisruptive implementation for applications/users
- Highly scalable
- File virtualization system failure won't cut off access to data
- Protects current investment in NAS and file systems
- Relatively easy file migration
- If it uses Microsoft DFS for the namespace, DFS will always have the most recent namespace configuration allowing users and applications to access their files
- Heterogeneous NAS support
- Easy to operate
Split-path file virtualization disadvantages
- Usually requires agents on application servers and workstations for transparent file migration; agents must be managed and maintained
- Tends to be Windows (CIFS) focused with limited NFS support
Shared-path and split-path systems are typically mutually exclusive. EMC's Rainfinity is an exception: it's primarily a split-path system, but it's configured as shared path when moving files. That eliminates the need for split-path agents for file migrations, and it avoids the shared-path scalability, performance and single-point-of-failure issues the rest of the time.
Shared-path systems include Avere Systems' FXT Series, F5 Networks' ARX series and EMC's Rainfinity when performing data migration. Split-path options include AutoVirt Inc.'s AutoVirt 3.0 and EMC's Rainfinity.
File virtualization systems have continued to evolve, solving more NAS sprawl issues. Avere Systems' FXT automates NAS storage tiering by hosting the most active files requiring the highest performance on its own solid-state drives and 15K rpm SAS drives. Using policies, it automatically moves files to heterogeneous back-end NAS systems based on access frequency, performance, age, etc.
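Tiering by access frequency can be sketched as a placement problem: keep the busiest files on the fast tier until it's full and send the rest to back-end NAS. The greedy approach below is an assumption chosen for illustration, not Avere's algorithm; the `plan_tiers` function, the access counts, the sizes and the capacity figure are all made up.

```python
from typing import Dict, List, Tuple


def plan_tiers(
    access_counts: Dict[str, int], sizes: Dict[str, int], fast_capacity: int
) -> Tuple[List[str], List[str]]:
    """Greedily keep the most-accessed files on the fast tier until it is full.

    Returns (files for the fast tier, files for the back-end NAS).
    """
    fast: List[str] = []
    backend: List[str] = []
    used = 0
    # Most active first; ties are broken by path so the result is deterministic.
    for path in sorted(access_counts, key=lambda p: (-access_counts[p], p)):
        if used + sizes[path] <= fast_capacity:
            fast.append(path)
            used += sizes[path]
        else:
            backend.append(path)
    return fast, backend


# Hypothetical access counts and sizes (arbitrary units).
counts = {"/db/index": 900, "/home/a.doc": 3, "/render/scene": 450, "/logs/old.log": 1}
sizes = {"/db/index": 40, "/home/a.doc": 5, "/render/scene": 50, "/logs/old.log": 30}

fast, backend = plan_tiers(counts, sizes, fast_capacity=100)
```

A production policy would also weigh age and performance requirements, as described above; this sketch considers access frequency and capacity only.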
EMC's Rainfinity provides similar functionality within its Celerra NS NAS systems. FAST (fully automated storage tiering) on Celerra NS uses the Rainfinity engine for transparent file movement (it currently doesn't support heterogeneous systems). F5 Networks' ARX uniquely addresses the data protection side of NAS sprawl by managing snapshots and replication for distributed heterogeneous NAS systems.
BIO: Marc Staimer is president of Dragon Slayer Consulting.
This was first published in April 2010