VPLEX / vSphere Pathing Recommendation Updates

Hot Off The Presses!

I just received word that our current pathing-policy recommendation for VPLEX in VMware environments (Fixed) is changing. Here are the details until the updated TechBook comes out.

  • The recommended multipathing setting is Round Robin for VPLEX Local, VPLEX Metro (non-cross-connect), and VPLEX Geo. The Round Robin I/O operation limit should be left at its default setting of 1000. (A scripted example of rolling this out across hosts follows this list.)
  • For VPLEX Metro cross-connect with VMware, PowerPath/VE is highly recommended.
    • PowerPath/VE 5.8 includes the auto-standby feature, which lets each ESXi host automatically prefer its local VPLEX cluster over the remote cluster for I/O. The host paths connected to the local VPLEX cluster become the active paths, while those connected to the remote VPLEX cluster become the standby paths. [ed. yet ANOTHER reason to use PowerPath/VE!]
    • For more information on PowerPath/VE and the auto-standby feature, see the support page: https://support.emc.com/products/1800_PowerPath-VE-for-VMware
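
For shops with more than a handful of hosts, making the Round Robin change by hand through the vSphere Client gets tedious, so here's a minimal sketch of how it could be scripted with pyVmomi. The vCenter address, the credentials, and the 'Invista' model match used to spot VPLEX volumes are all assumptions for illustration, not anything from the TechBook; verify them against your own environment before running anything like this.

    # Minimal pyVmomi sketch: set Round Robin (VMW_PSP_RR) on every VPLEX
    # device seen by every ESXi host. The Round Robin I/O operation limit is
    # deliberately left at its default of 1000, per the recommendation above.
    # Hostname, credentials, and the 'Invista' model match are assumptions.
    import ssl
    from pyVim.connect import SmartConnect, Disconnect
    from pyVmomi import vim

    ctx = ssl._create_unverified_context()  # lab-only: skips certificate checks
    si = SmartConnect(host='vcenter.example.com',
                      user='administrator@vsphere.local',
                      pwd='changeme', sslContext=ctx)
    try:
        content = si.RetrieveContent()
        hosts = content.viewManager.CreateContainerView(
            content.rootFolder, [vim.HostSystem], True).view

        for host in hosts:
            storage = host.configManager.storageSystem
            # Map ScsiLun keys to devices so we can check the model string.
            luns_by_key = {l.key: l for l in storage.storageDeviceInfo.scsiLun}
            for mp_lun in storage.storageDeviceInfo.multipathInfo.lun:
                device = luns_by_key.get(mp_lun.lun)
                if device is None or 'Invista' not in (device.model or ''):
                    continue  # not a VPLEX volume
                policy = vim.host.MultipathInfo.LogicalUnitPolicy(policy='VMW_PSP_RR')
                storage.SetMultipathLunPolicy(lunId=mp_lun.id, policy=policy)
                print('%s: %s -> VMW_PSP_RR' % (host.name, device.canonicalName))
    finally:
        Disconnect(si)

Depending on the claim rules in play, devices presented after this runs may still come up with the array type's default policy, so it's worth re-checking new volumes as they're added.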

Why did we make this change? There are three problems with using NMP in VPLEX Metro cross-connect environments:

  • With the Round Robin path policy, a host connected to both VPLEX clusters sends roughly half of its I/O to the remote cluster. That remote half incurs extra read and write latency, consumes WAN bandwidth for front-end host traffic, and generates additional VPLEX inter-cluster cache-coherency traffic between clusters.
  • The Fixed path policy requires a lot of manual administrative work to make every ESXi host prefer its local cluster for every volume presented from both clusters. For a handful of hosts and only a few volumes this might be acceptable, but for hundreds of hosts and thousands of volumes it is too onerous (the sketch after this list illustrates the scale of the task).
  • In addition, should the single preferred path fail for whatever reason, the new path a host chooses might land on the remote cluster, and it's entirely possible that multiple hosts could, by luck of the draw, all pick the same remote director and overload it. Paths would then have to be re-balanced manually at the new cluster, and once the old cluster is back online, the exercise has to be repeated all over again.
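
To put some shape on that second point, here's a hypothetical pyVmomi sketch of what the Fixed-policy bookkeeping implies: for every host and every VPLEX volume, something (or someone) has to pick a preferred path that lands on the local cluster's front-end ports. The local_target_wwns set and the pin_local_paths helper are illustrative assumptions, not part of any EMC or VMware tooling.

    # Hypothetical sketch of the per-host, per-volume work the Fixed policy
    # implies. 'local_target_wwns' is an assumed set of the local VPLEX
    # cluster's front-end port WWNs (16 hex characters each).
    from pyVmomi import vim

    def pin_local_paths(host, local_target_wwns):
        """Give every VPLEX LUN on 'host' a Fixed policy preferring a local path."""
        storage = host.configManager.storageSystem
        luns_by_key = {l.key: l for l in storage.storageDeviceInfo.scsiLun}
        for mp_lun in storage.storageDeviceInfo.multipathInfo.lun:
            device = luns_by_key.get(mp_lun.lun)
            if device is None or 'Invista' not in (device.model or ''):
                continue  # not a VPLEX volume
            # Look for a path whose target port belongs to the local VPLEX cluster.
            local_path = None
            for path in mp_lun.path:
                wwn = getattr(path.transport, 'portWorldWideName', None)
                if wwn is not None and ('%016x' % wwn) in local_target_wwns:
                    local_path = path
                    break
            if local_path is None:
                continue  # host sees no local path for this LUN -- manual follow-up
            policy = vim.host.MultipathInfo.FixedLogicalUnitPolicy(
                policy='VMW_PSP_FIXED', prefer=local_path.name)
            storage.SetMultipathLunPolicy(lunId=mp_lun.id, policy=policy)

And that only covers the initial setup; per the last bullet, the exercise has to be repeated after a failover and again once the original cluster returns, which is exactly why Round Robin (or PowerPath/VE for cross-connect) is the simpler answer.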