SAN multipathing: Implementation and key considerations
Date: Oct 17, 2012
In storage networking, the physical path between a server and the storage device that supports it can sometimes fail. That's a problem if there's only one path between the two. But a technique called SAN multipathing prevents the problem by establishing multiple routes between the server and the storage; if one path fails, I/O is rerouted through another. Multipathing can also aid in load balancing. In the following video, storage expert Howard Marks discusses some best practices for implementing SAN multipathing. Watch the video or read the transcript below to learn more.
Anybody who's worked with storage area networks [SANs] for any period of time has experienced a path failure. People break cables, unplug the wrong cable [or] turn the power to a switch off. And while the server and the storage may still be working fine, there may be no valid path between them. So it's important to have some mechanism to support multipathing.
Like Microsoft, VMware has built primitive multipathing into the core product with a facility for array vendors or other third parties to provide plug-ins to make the multipathing more efficient. The default multipath method for VMware is failover. If you have a two-port Fibre Channel adapter in your host, those two connections go to two separate switches, and your disk array is connected to both of those switches -- all the traffic is going to take one path unless that path fails. You can switch to round robin, which will alternate requests across the paths, but because responses don't always come on the path you expected them to, [having] two paths with round robin doesn't give you twice the effective bandwidth. It gives you more like 1.6 times the effective bandwidth.
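The difference between the two policies described above can be sketched in a few lines of Python. This is a toy model, not VMware's actual NMP code: the path names and selector classes are hypothetical, and real path selection happens inside the hypervisor's path selection plug-ins.

```python
class FailoverSelector:
    """Failover/fixed policy: all I/O uses the first live path until it fails."""
    def __init__(self, paths):
        self.paths = list(paths)

    def next_path(self):
        for p in self.paths:
            if p["up"]:
                return p
        raise RuntimeError("no valid path between host and array")


class RoundRobinSelector:
    """Round-robin policy: alternate requests across all live paths."""
    def __init__(self, paths):
        self.paths = list(paths)
        self._i = 0

    def next_path(self):
        live = [p for p in self.paths if p["up"]]
        if not live:
            raise RuntimeError("no valid path between host and array")
        p = live[self._i % len(live)]
        self._i += 1
        return p


# Two paths through two separate switches (names are illustrative).
paths = [{"name": "vmhba1:C0:T0:L0", "up": True},
         {"name": "vmhba2:C0:T0:L0", "up": True}]

fo = FailoverSelector(paths)
rr = RoundRobinSelector(paths)

# Failover sends every request down the same path...
assert {fo.next_path()["name"] for _ in range(4)} == {"vmhba1:C0:T0:L0"}

# ...while round robin alternates requests across both paths.
assert {rr.next_path()["name"] for _ in range(4)} == \
       {"vmhba1:C0:T0:L0", "vmhba2:C0:T0:L0"}

# Fail the first path: both policies fall back to the survivor.
paths[0]["up"] = False
assert fo.next_path()["name"] == "vmhba2:C0:T0:L0"
```

Note that even in the round-robin case, alternating requests is not the same as doubling throughput; as the transcript says, two paths tend to yield closer to 1.6 times the effective bandwidth, because responses don't always return on the path the request went out on.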
Other vendors have come up with more sophisticated [SAN multipathing] mechanisms. Some vendors, like EMC, charge for it, so you have to pay per host for PowerPath/VE. Others, like Dell EqualLogic, provide more sophisticated multipathing free as part of their vCenter plug-in. However, be sure that all the hosts in the cluster are doing multipathing the same way. If you have a host failure or a path failure and different hosts are using different provider-specific plug-ins, you may have paths that aren't being recognized as still being available.
Also ... if you have a traditional dual-controller modular storage array, like an EMC Clariion, NetApp or an HP EVA, make sure that you talk to your vendor specifically about how to do multipathing. In some of those arrays, sending requests to the controller that actually owns the LUN you're talking to will put substantially less load on the array controller than sending requests to the other controller and having it forward them across the backplane. There are active-active controller systems, where both controllers talk to all the disks all the time. There are active-passive systems, where one controller is running all the time; the other one is sitting there, ready to take over when it fails. And there are what we call dual-active systems that are asymmetrical -- some LUNs are owned by one controller and some LUNs are owned by the other. How you do multipathing depends on which [category] your product fits in. This is one of those times there's going to be a white paper from your vendor that says "best practices for doing vCenter or Microsoft Windows multipath with your product" -- and reading it is really worthwhile.
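The dual-active (asymmetric) case above can be illustrated with a short sketch: paths to the controller that owns a LUN are preferred, and paths to the partner controller are kept only as fallbacks, since I/O sent there must be forwarded across the backplane. The function and data layout below are hypothetical, purely for illustration, and not any vendor's API.

```python
def rank_paths(paths, owner):
    """Sort paths so those to the LUN's owning controller come first.

    Paths to the non-owning controller still work, but cost an extra
    hop across the array backplane, so they rank last.
    """
    # False sorts before True, so owner-side paths lead the list.
    return sorted(paths, key=lambda p: p["controller"] != owner)


# One path to each controller of a dual-active array (names illustrative).
paths = [
    {"name": "to-ctrl-B", "controller": "B"},
    {"name": "to-ctrl-A", "controller": "A"},
]

# If this LUN is owned by controller A, I/O should prefer the A-side path.
preferred = rank_paths(paths, owner="A")[0]
assert preferred["controller"] == "A"
```

This preference logic is exactly what vendor-specific plug-ins and best-practices white papers encode for a given array, which is why the transcript's advice to read them is worth taking.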