Data replication is great for protecting critical data and ensuring quick recoveries. Find out where you should deploy replication: in your array, network or servers.
By Jacob Gsoedl
Data replication as a means of data protection has seen steadily increasing adoption since it first emerged in storage systems after the first World Trade Center bombing in 1993. Over time, it has evolved into an indispensable component of disaster recovery (DR), as well as of operational backup for applications that require shorter recovery point objectives (RPOs) and recovery time objectives (RTOs) than traditional tape backups can offer. Firms are also adopting data replication for remote- and branch-office data protection; in a hub-and-spoke architecture, branch-office data can be replicated back to central data centers, thus eliminating unwieldy tape-based backup procedures at the branch sites.
The growing adoption of replication services has been driven by a wide array of data replication products, more low-cost replication offerings, faster and less-expensive networks, and an overall maturing of the technology itself. "Replication-based data protection is among the top three priorities of 60% of our clients, which is very different from only a few years ago," said Tim Bowers, global product manager, storage services at EDS, a Hewlett-Packard (HP) company.
Not all replication is equal
At a macro level, data replication copies data from one storage location to one or more other local or remote storage systems. But venture beyond that basic task and you'll find that data replication products vary in several key aspects:
Location: One of the main differentiators among products is where replication occurs. The replication service or software can reside on the storage array, in the network or on the host (server). Array-based replication has been dominating the replication market up to now.
"We did a recent study that shows that in 2007, 83.7% of worldwide revenue for storage-based replication was done using array-to-array replication, followed by host-based replication with 11.5% and network-based replication with 4.8%," said James Baker, research manager, storage software at Framingham, Mass.-based IDC. But according to the same study, both host- and network-based replication are catching up. Host-based replication is expected to grow at a compound annual growth rate (CAGR) of 18.2% until 2012, while a CAGR of 15.4% is anticipated for network-based replication. Both are expected to expand significantly faster than the 10% forecasted annual growth for array-based replication.
Mode: Replication can occur synchronously, where data is written to the primary and secondary storage systems simultaneously; or it can be performed asynchronously, where data is replicated to replication targets with a delay. In synchronous replication, the primary storage system commits I/O writes only after the replication target acknowledges that data has been written successfully. Synchronous replication depends on sufficient bandwidth and low latency, and supported replication distances range from 50 km to 300 km. It's typically used in applications where zero RPOs and RTOs are required, such as high-availability clusters and mission-critical applications that demand 100% synchronicity between the primary and target systems. Conversely, asynchronous replication writes data to the primary array first and, depending on the implementation approach, commits data to be replicated to memory or a disk-based journal. It then copies the data in real time or at scheduled intervals to replication targets. Unlike synchronous replication, it's designed to work over long distances and greatly reduces bandwidth requirements. While the majority of array- and network-based replication products support both synchronous and asynchronous replication, host-based replication offerings usually offer only asynchronous replication.
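The difference between the two modes comes down to when the host receives its acknowledgment. The following is a minimal sketch of that ordering, not any vendor's implementation; the class names (`Volume`, `SyncReplicator`, `AsyncReplicator`) and the in-memory journal are assumptions made purely for illustration.

```python
from collections import deque


class Volume:
    """Toy storage volume (hypothetical): a dict of block number -> data."""

    def __init__(self):
        self.blocks = {}

    def write(self, block, data):
        self.blocks[block] = data
        return True  # acknowledgment that the write landed


class SyncReplicator:
    """Acknowledges a host write only after the target has acknowledged it."""

    def __init__(self, primary, target):
        self.primary, self.target = primary, target

    def write(self, block, data):
        # The host waits here for the remote acknowledgment, which is why
        # latency and distance matter so much in synchronous mode.
        if not self.target.write(block, data):
            raise IOError("replication target did not acknowledge the write")
        return self.primary.write(block, data)


class AsyncReplicator:
    """Acknowledges after the primary write; replicates from a journal later."""

    def __init__(self, primary, target):
        self.primary, self.target = primary, target
        self.journal = deque()  # in memory here; a real journal may live on disk

    def write(self, block, data):
        ack = self.primary.write(block, data)
        self.journal.append((block, data))  # queued, not yet on the target
        return ack

    def drain(self):
        """Ship queued writes in order; returns how many were sent."""
        sent = 0
        while self.journal:
            self.target.write(*self.journal.popleft())
            sent += 1
        return sent
```

In this sketch, whatever is still sitting in the asynchronous journal when the primary site fails is the data that would be lost, which is why asynchronous replication implies a non-zero RPO.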
Type: Replication products can replicate blocks of data on volumes or logical unit numbers (LUNs), or replication can be performed at the file level. With the exception of network-attached storage (NAS), which can support both block- and file-based replication, array-based replication products usually operate at the block level. The same is true for network-based replication products. In contrast, most host-based replication offerings operate at the file-system level. Block-based replication is platform-agnostic and will work seamlessly across various OSes. File-based replication products are very much operating system-specific and the majority of available host-based replication products are written for Windows. Unlike file-based replication, block-based replication products have no knowledge of the attached platform, file system or apps, and depend on auxiliary services like snapshots for any type of application integration. As a result, most storage arrays with replication support also provide snapshot capabilities that are more or less integrated with the file system and key apps like Exchange and SQL Server databases.
|Data replication trends|
These data replication-related trends are gradually changing data protection and disaster recovery.
In array-based replication, the replication software runs on one or more storage controllers. It's most prevalent in medium- and large-sized companies, mostly because larger firms have deployed higher-end storage arrays that come with data replication features.
With more than 15 years of history, array-based replication is the most mature and proven replication approach, and its scalability is only constrained by the processing power of the array's storage controllers. "Customers scale replication performance in both our Clariion and Symmetrix arrays by distributing data replication across a larger number of storage processors," explained Rick Walsworth, director product marketing replication solutions at EMC Corp.
Because the replication software runs on the array, array-based replication is well suited for environments with a large number of servers, for several reasons: it's operating system-agnostic, supporting Windows and Unix-based open systems as well as mainframes (on high-end arrays); licensing fees are typically based on the amount of storage rather than the number of servers attached; and it doesn't require any administrative work on attached servers. Because replication is offloaded to storage controllers, processing overhead on servers is eliminated, which makes array-based replication attractive for mission-critical and high-end transactional applications.
The biggest disadvantage of array-based replication is its lack of support for heterogeneous storage systems. Unless the array provides a storage virtualization option -- as Hitachi Data Systems does for its Universal Storage Platform (USP) -- array-based replication usually only works between similar array models. Besides a high degree of vendor lock-in, the entry cost for array-based replication is relatively high, and it can be particularly expensive for companies that have to support a large number of locations. In general, array-based replication works best for companies that have standardized on a single storage array vendor.
Almost all vendors of midsized to high-end arrays provide a replication feature. The replication products of these leading array vendors have made significant inroads and gained market share:
- EMC Symmetrix Remote Data Facility (SRDF) for both synchronous and asynchronous replication, and EMC MirrorView for synchronous and asynchronous replication of Clariion systems.
- Hitachi Data Systems TrueCopy for synchronous replication and Hitachi Data Systems Universal Replicator software for asynchronous replication.
- HP StorageWorks XP Continuous Access and Continuous Access EVA for both synchronous and asynchronous replication for HP XP and EVA arrays.
- IBM Corp. Metro Mirror for synchronous replication and IBM Global Mirror for asynchronous replication.
- NetApp SnapMirror for synchronous and asynchronous block-based replication, and NetApp SnapVault for file-based replication.
Even though these replication products are similar in many aspects, a close technical analysis reveals subtle differences. For instance, the efficiency of the handshake between primary and target storage systems used during synchronous replication greatly impacts the distance a replication product can support. "Metro Mirror is able to write data to the target system with a single handshake, enabling it to support distances of up to 300 km," said Vic Pelz, consulting IT architect at IBM. That distance goes well beyond the 50 km to 200 km cited by other storage vendors.
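The link between handshake count and supported distance can be shown with a rough propagation-delay estimate. This is a back-of-the-envelope sketch, assuming light covers about 200 km per millisecond in optical fiber and ignoring switch, protocol-conversion and array processing delays; the two-round-trip protocol is a hypothetical comparison point, not a description of any specific product.

```python
FIBER_KM_PER_MS = 200.0  # approximate speed of light in optical fiber


def added_write_latency_ms(distance_km, round_trips):
    """Propagation delay added to each synchronous write.

    Every round trip covers the distance twice (request out, reply back).
    """
    return round_trips * 2 * distance_km / FIBER_KM_PER_MS


if __name__ == "__main__":
    for km, trips in [(300, 1), (300, 2), (150, 2)]:
        ms = added_write_latency_ms(km, trips)
        print(f"{km} km, {trips} round trip(s): +{ms:.1f} ms per write")
```

Under these assumptions, a single-handshake write at 300 km incurs the same propagation delay (about 3 ms) as a two-handshake write at 150 km, which illustrates why a more efficient handshake translates into longer supported distances.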
Differences can also be found among asynchronous replication implementations. While EMC buffers data to be replicated in memory, IBM Global Mirror tracks changes with so-called bitmaps, continuously transmitting changes and periodically re-synchronizing the source and target to ensure they stay in sync. Hitachi Data Systems, on the other hand, uses change journals stored on disk in its Universal Replicator software.
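The practical difference between bitmap tracking and journaling is what each one remembers about a change. The sketch below is illustrative only and does not reflect any vendor's internals; `DirtyBitmap`, `ChangeJournal` and their methods are hypothetical names. A bitmap records only which regions changed, so a resync ships the current contents of each dirty region once, while a journal keeps every write in order so the target can replay them from the last sequence number it applied.

```python
class DirtyBitmap:
    """Remembers WHICH regions changed, not the order or intermediate values."""

    def __init__(self, num_regions):
        self.bits = [False] * num_regions

    def mark(self, region):
        self.bits[region] = True

    def resync(self, source, target):
        """Copy the current contents of every dirty region, then clear its bit."""
        sent = 0
        for region, dirty in enumerate(self.bits):
            if dirty:
                target[region] = source[region]
                self.bits[region] = False
                sent += 1
        return sent


class ChangeJournal:
    """Remembers EVERY write, in order, until the target has read it."""

    def __init__(self):
        self.entries = []  # list of (sequence, region, data)

    def append(self, region, data):
        self.entries.append((len(self.entries), region, data))

    def read_after(self, last_applied_seq, limit):
        """Return up to `limit` entries newer than the target's last applied one."""
        start = last_applied_seq + 1
        return self.entries[start:start + limit]


if __name__ == "__main__":
    source, target = [b""] * 8, [b""] * 8
    bitmap, journal = DirtyBitmap(8), ChangeJournal()
    for data in (b"v1", b"v2", b"v3"):  # three writes to the same region
        source[3] = data
        bitmap.mark(3)
        journal.append(3, data)
    print("regions sent by bitmap resync:", bitmap.resync(source, target))
    print("entries held in journal:", len(journal.entries))
```

The trade-off in this toy model is that the bitmap needs little space and sends less data after repeated writes to the same region, while the journal preserves write order and history at the cost of storing every change.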
"The combination of disk-based change journals that are pulled by the replication targets instead of pushed by the source, makes it extremely resilient, capable of automatically recovering from elongated disruptions," said Christophe Bertrand, senior director, solutions and product marketing business continuity at Hitachi Data Systems. "Because changes are pulled by replication target arrays, valuable processing cycles are offloaded from primary arrays to secondary target arrays."
|Comparing data replication methods|
In host-based replication products, the replication software runs on servers so, unlike array- and network-based replication, it doesn't depend on additional hardware components. That makes host-based replication the least-expensive and easiest replication method to deploy.
"Deploying host-based replication only requires installing the replication software on source and target servers and you are ready to go," noted Bob Roudebush, director of solutions engineering at Double-Take Software Inc. It's well suited to work in heterogeneous environments, supporting the widest range of storage options that include both network- and direct-attached storage. While most products support Windows, Linux and Unix support is more tenuous and, therefore, platform support is clearly one of the critical evaluation criteria when selecting a host-based replication product.
On the downside, host-based replication adds processing overhead to servers and the installed replication software carries the risk of introducing unknown behavior. "For critical and high-end application servers, IT managers tend to favor array-based replication over host-based replication because it keeps server resources dedicated to the app and doesn't expose it to potential bugs or flaws in the replication software," said Lauren Whitehouse, an analyst at Milford, Mass.-based Enterprise Strategy Group. Furthermore, licensing costs and system administration duties increase proportionally with the number of servers, giving both array- and network-based replication an advantage in environments with a large number of servers. In addition, visibility in host-based replication is typically limited to source and target servers. This is very different from the centralized architectures of array- and network-based replication offerings that enable a more holistic view into the replication infrastructure.
Host-based replication products typically target small- to medium-sized businesses (SMBs) that can't afford more expensive replication alternatives, enabling them to deploy data protection and disaster recovery architectures that, until a few years ago, were only seen in larger firms. CA, Double-Take, InMage Systems Inc., Neverfail Inc. and SteelEye Technology Inc. are some of the vendors that have enabled smaller companies to deploy replication-based DR and data protection at a fraction of the cost of array- and network-based replication. Although each of these products replicates data from one location to another, they differ in features such as efficiency, bandwidth throttling, management, high-availability failover capabilities, platform support and application integration. Only a thorough product evaluation will reveal which product offers the best fit for a given environment.
In addition to these standalone offerings, backup software vendors are integrating host-based replication into their backup suites with the hope of expanding their reach into the lucrative remote- and branch-office data protection business.
"We see a convergence of DR and data protection, and consider replication to be a feature and not a standalone product," said Marty Ward, senior director, product marketing for the Data Protection Group at Symantec Corp. Most backup software vendors are already offering host-based data replication options for their backup suites; some examples include BakBone Software Inc.'s NetVault: Real-Time Data Protector; CommVault Continuous Data Replicator (CDR); EMC RepliStor to complement EMC NetWorker; Symantec Backup Exec Continuous Protection Server (CPS); and Symantec NetBackup PureDisk with a deduplication option, as both a standalone product and a NetBackup option.
The main advantage of combining traditional backups and replication is the ability to manage replicas and backups within a single tool. Aside from their host-based replication options, backup software vendors have been working on integrating their backup suites with leading storage arrays and network-based replication products to enable customers to manage all replicas and backups with the same tool.
"Just like with Continuous Data Replicator, array-based replicas of supported arrays are integrated into the backup application index and catalog, allowing users to restore an array-based snapshot by simply right-clicking it within our application," said Brian Brockway, vice president of product management at CommVault. Similarly, Symantec's Veritas NetBackup is integrated with more than 40 arrays and virtual tape libraries (VTLs), and EMC NetWorker offers tight integration for EMC's RecoverPoint network-based replication product.
|Choosing a data replication solution|
1. Selection of a data replication method should start with a business impact analysis to determine required recovery time objectives (RTOs) and recovery point objectives (RPOs).
In network-based replication, the replication occurs in the network between storage arrays and servers. I/Os are split in an inline appliance or in a Fibre Channel (FC) fabric; the I/O splitter looks at the destination address of an incoming write I/O and, if it's part of a replication volume, forwards a copy of the I/O to the replication target. Network-based replication combines the benefits of array-based and host-based replication. By offloading replication from servers and arrays, it can work across a large number of server platforms and storage arrays, making it ideal for highly heterogeneous environments. Most network-based replication products also offer storage virtualization as an option or as part of the core product.
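The I/O splitter's decision described above can be sketched in a few lines. This is a simplified model under stated assumptions, not the logic of any particular appliance or switch: the `IO` record, the `Splitter` class and the use of plain lists to stand in for the primary and replica data paths are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IO:
    """A simplified I/O request (hypothetical fields)."""
    op: str          # "read" or "write"
    volume: str      # destination volume the I/O is addressed to
    block: int
    data: bytes = b""


class Splitter:
    """Forwards every I/O to primary storage and copies replicated writes."""

    def __init__(self, replicated_volumes, primary_path, replica_path):
        self.replicated = set(replicated_volumes)
        self.primary_path = primary_path    # stands in for the primary array
        self.replica_path = replica_path    # stands in for the replication target

    def handle(self, io):
        self.primary_path.append(io)  # the original I/O always continues
        # Only writes addressed to a replication volume are duplicated.
        if io.op == "write" and io.volume in self.replicated:
            self.replica_path.append(io)


if __name__ == "__main__":
    primary, replica = [], []
    splitter = Splitter({"vol_a"}, primary, replica)
    splitter.handle(IO("write", "vol_a", 10, b"x"))  # copied to the replica
    splitter.handle(IO("read", "vol_a", 10))         # reads are never copied
    splitter.handle(IO("write", "vol_b", 7, b"y"))   # volume not replicated
    print(len(primary), "I/Os to primary;", len(replica), "to replica")
```

Because the check depends only on the operation type and destination address, the splitter needs no knowledge of the server operating system or the array model, which is what lets this approach span heterogeneous environments.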
Contemporary network-based replication offerings are either inline appliances or fabric based. With inline appliances, all I/Os need to pass through the replication device. Technically, the appliances terminate all incoming I/Os and initiate new I/Os that are forwarded to the primary and, in the case of write I/Os, to replicated storage targets. The inline approach has been plagued by performance and scalability issues. The poster child for inline appliances is IBM's SAN Volume Controller (SVC). A scalable architecture and plenty of cache have not only enabled it to overcome those performance and scalability limitations; aided by the simplicity of the inline appliance approach compared to the more complex fabric-based implementations, it has also become one of the successes in the network-based replication and virtualization market.
In fabric-based replication products, the splitting and forwarding of I/Os is performed within an FC fabric. By taking advantage of FC switching and separating the data and control paths, it's the best performing and most scalable approach. The majority of fabric-based replication products run on intelligent switches from Brocade Communications Systems Inc. and Cisco Systems Inc. Even though both Brocade and Cisco offer Data Mobility Manager (DMM) for local data center replication, third-party vendors like EMC and FalconStor Software Inc. offer more advanced fabric-based replication products that run on Brocade and Cisco intelligent switches. A case in point is EMC RecoverPoint, which provides fabric-based, asynchronous continuous data protection (CDP) with application integration that's on par with comparable host-based CDP products. Despite its obvious benefits, fabric-based replication has seen lackluster adoption. "Switch-based replication and virtualization have been over-hyped, but there are people who are working on it and over time it will become more common," said Greg Schulz, founder and senior analyst at Stillwater, Minn.-based StorageIO Group.
LSI Corp.'s StoreAge Storage Virtualization Manager (SVM) straddles the line between inline appliances and fabric-based products that depend on expensive intelligent switches. The combination of SVM and LSI's Data Path Module, which plugs into existing Fibre Channel switches to perform switch-based forwarding and eliminates the need for intelligent switches, combines the simplicity of IBM SVC with the performance and scalability benefits of a split-path architecture. HP seems to concur, and is offering the LSI product as HP StorageWorks SAN Virtualization Services Platform (SVSP) to complement its host- and array-based replication offerings with a network-based replication and virtualization product.
Even though the market share for array-, host- and network-based replication will shift over time, there will be appropriate places for all three approaches. While each has its own set of advantages and shortcomings, specific environments and situations will best determine where replication should occur.
BIO: Jacob Gsoedl is a freelance writer and a corporate director for business systems.
This was first published in March 2009