VPLEX Metro & Cluster Witness – Pivotal to Business Continuity

In previous posts, I have talked about how robust EMC VPLEX Metro implementations have been over the past 3 years and how rapidly they continue to grow. VPLEX Metro can provide absolute business continuity with proper placement of VPLEX Witness, strong adherence to the required SAN, array and server connectivity, and ease of mobility throughout the core environment. In the VPLEX install base, the majority of implementations are VPLEX Metro, yet some do not have the ‘Witness’ installed to provide the continuous availability that the product is built to deliver and that customers expect from it.

Remind me: what is VPLEX Witness?

EMC VPLEX Metro uses a cluster guidance mechanism known as VPLEX Witness to provide continuous availability in the event of an entire site failure. Using VPLEX Witness ensures that Continuous Availability can be delivered by VPLEX Metro. Continuous Availability means that, regardless of site or link/WAN failures, data will automatically remain online in at least one of the locations.

When setting up a single distributed volume or a group of distributed volumes, preference rules are configured. It is the preference rule that determines the outcome after failure conditions such as a site failure or a dual WAN link partition. The preference rule can be set to Site A preferred, Site B preferred or no automatic winner for each distributed volume and/or group of distributed volumes. Overall, the following effects on a single distributed volume or a group of distributed volumes can be observed under each of the failure conditions listed in Table 1:

Let’s review what failure scenarios look like on a VPLEX Metro WITHOUT VPLEX Witness:

Table 1.

Table 1 shows that with just preference rules (without VPLEX Witness), some scenarios require manual intervention to bring the VPLEX volume online at a given VPLEX cluster. For example, if Site A is the preferred site and Site A fails, Site B would also suspend.
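To make the preference-rule behaviour concrete, here is a minimal Python sketch of the outcomes described above. It is a simplified model for illustration only, not a VPLEX API: the names `Rule`, `Failure` and `outcome_without_witness` are hypothetical, and the state labels ("online", "suspended", "failed") are assumptions based on the description in this post.

```python
from enum import Enum


class Rule(Enum):
    """Preference rule on a distributed volume (hypothetical names)."""
    SITE_A = "A"
    SITE_B = "B"
    NO_AUTOMATIC_WINNER = "none"


class Failure(Enum):
    """Failure conditions discussed in the post (hypothetical names)."""
    SITE_A_DOWN = "A"
    SITE_B_DOWN = "B"
    WAN_PARTITION = "partition"


def outcome_without_witness(rule: Rule, failure: Failure) -> dict:
    """Return the volume state at each site when only the static rule applies."""
    state = {}
    for site in ("A", "B"):
        if failure.value == site:
            state[site] = "failed"      # this site is the one that went down
        elif rule.value == site:
            state[site] = "online"      # the preferred site keeps serving I/O
        else:
            state[site] = "suspended"   # manual intervention needed to resume
    return state


if __name__ == "__main__":
    for rule in Rule:
        for failure in Failure:
            print(rule.name, failure.name, outcome_without_witness(rule, failure))
```

Note how the case from the example above falls out of the model: with Site A preferred and Site A down, Site B is left suspended, because a static rule cannot tell a failed peer from a broken link.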

This is where VPLEX Witness DRAMATICALLY improves the situation. It can better diagnose failures because independent fault domain isolation and network triangulation ensure that Witness can provide guidance to each of the clusters. In addition, the distributed volumes are placed into consistency groups, which drive the Witness communication to the clusters and their policy adherence. This allows VPLEX Metro to provide an active path to the data in both the dual WAN partition and full site loss scenarios, as shown in Table 2:

Now, let us review failure scenarios with VPLEX Witness in a Metro configuration (psst…will get you to a 7-9’s availability model in your environment)

Table 2.

Table 2 shows the results when VPLEX Witness is deployed: failure scenarios become self-managing (i.e. fully automatic).

If you are a VPLEX user and it sits in the CORE of your data center, manages your virtual storage and complements your most flexible of clustered applications, then WHY WOULDN’T YOU MEET the requirements of installing VPLEX Witness for continuous availability?
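As a rough illustration of what "self-managing" means here, the following Python sketch models Witness-guided outcomes for a consistency group with a preferred site. It is a simplified assumption-based model, not a VPLEX interface: the function name `outcome_with_witness` and the failure labels are hypothetical, and it covers only single failures (a site loss, a WAN partition, or loss of the Witness itself), not the double-failure case.

```python
FAILURES = ("site_a_down", "site_b_down", "wan_partition", "witness_down")


def outcome_with_witness(preferred: str, failure: str) -> dict:
    """Return the volume state at each site when Witness guidance is available.

    preferred: "A" or "B" (the site named by the preference rule).
    failure:   one of FAILURES (hypothetical labels for this sketch).
    """
    if preferred not in ("A", "B"):
        raise ValueError("preferred must be 'A' or 'B'")
    if failure not in FAILURES:
        raise ValueError(f"unknown failure: {failure}")

    failed_site = {"site_a_down": "A", "site_b_down": "B"}.get(failure)
    state = {}
    for site in ("A", "B"):
        if site == failed_site:
            state[site] = "failed"
        elif failure == "wan_partition":
            # Both clusters are alive but cannot see each other:
            # the preference rule picks the winner.
            state[site] = "online" if site == preferred else "suspended"
        else:
            # Either the peer site is confirmed down by the Witness, or only
            # the Witness is unreachable and the clusters still see each other.
            state[site] = "online"
    return state


if __name__ == "__main__":
    for failure in FAILURES:
        print(failure, outcome_with_witness("A", failure))
```

In this model every single-failure scenario leaves at least one site online without anyone having to step in, which is the difference from relying on preference rules alone.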

What seems to be the hindrance to implementing Witness? Let’s go through what I “think” and somewhat “know” to be the holdup in what should be a “no-brainer” scenario.

  1. The customer doesn’t have a 3rd fault domain in which to install the Witness on a VM.

VPLEX Witness requires a 3rd fault domain; this is true. When VPLEX clusters are installed, they are placed in separate fault domains. A 3-way VPN secures the communication between the Witness and both clusters at the management server level. This 3rd domain has to be off-prem and must not be exposed to catastrophic events tied to either cluster domain. It is perfectly fine if the 3rd failure domain with the Witness goes away, because the clusters will continue to operate according to their policies without issue until it is recovered. The challenge in this scenario is a double failure across (2) domains, (1) of them including the Witness, but such failures are highly unlikely.

So, without a 3rd failure domain, what are a customer’s options? There is always the option of installing the Witness at a service provider or cloud service, in an environment owned by the customer and provided as a location for the install. There are customers doing this today without any documented guidance from EMC other than the required properties of the environment to install the Witness onto. Go for it! We hope and plan to be more specific in the 2015 timeframe about which cloud offerings to choose, but in the meantime, DO EVERYTHING YOU CAN to provide a 3rd failure domain for a Witness installation. If you as an END USER of VPLEX want the benefits it provides, then what are you waiting for?

[Image: 3-legged chair]

Take a look at this picture of the 3-legged chair. Now take a look at the picture of the 3-legged chair without a 3rd leg. Wobbly at best, isn’t it? Something to think about…pictures do sometimes say a thousand words.

  2. Failure handling and Witness are complicated, or maybe you as an END USER are just not aware of how important Witness is…maybe you were “undersold” or are underselling based on FEAR? UNCERTAINTY? DOUBT? Well, it is time to work through it and start to discover the benefits. Don’t mistake VPLEX Witness for an OPTION. It is a requirement for Continuous Availability.

If you are traditionally used to the Active / Passive replication model, ask yourself this one question. This applies to sellers and customers alike. The last time you had a site outage, did you invoke a failover to the second site? I would bet a majority of you (and I polled a live audience on this question, so I have some vague notion of the majority rule) said it never happened. The outage was endured, and the decision to invoke a failover was never made. This is what is known as DTO: Decision Time Objective. How long will it take for someone to make a decision to invoke failover in an Active / Passive environment?

It is time. Take the next step. Put the witness out there and never have to consider DTO or RTO.

Special thanks to Don Kirouac who provided the guidance tables; thank you to Olly Shorey for introducing DTO. (More on DTO in weeks to come)

[Blatant plug: the new VPLEX Witness tech book is under construction and will also be blogged about in the upcoming weeks, stay tuned, but in the meantime, here is the older version: VPLEX Witness Tech Book]

About the Author: Jennifer Aspesi

Jen Aspesi is Sr. Consultant Solutions Marketing for Dell Technologies Data Protection Division, primary focus on Cloud Solutions, Kubernetes and High Value Workloads. Prior to this role, Jen was Director of Advanced Customer Engineering for several areas of Dell EMC product and field enablement including storage replication, storage virtualization, backup and recovery. Jen has a Masters of Innovative Technology from Worcester Polytechnic Institute, MA.