April 3rd, 2014 07:00

SPA vs SPB paths and failover

I have a relatively basic newbie question, but I'm not clear on how paths work with the two SP inputs on a CX-4.

I apologize now if this isn't clear, it's hard to explain.

Currently we have two Brocade 5100 switches in one fabric; we need to split them into separate fabrics, one switch per fabric, with different zone sets.

Our server team doesn't always use multipathing, so it's necessary to have each switch standalone, especially as we're still straddling this new SAN with the old due to a recent migration. Not the best practice, I know.. they know too. But EMC had wired our old SAN that way and everything worked out fine.

Now, as it stands, each switch has just one connection to the CX-4: switch "A" is connected to the SPA input, and switch "B" is connected to the SPB input. I would figure we should have each switch connected to both SPs, but whoever did the hardware install (our data center guy or Dell, I'm not sure which) didn't do that. I believe we're out of cables now too, and the state's ordering process will ensure headaches and delays getting them.

Now that I've painted the picture, the question is this: if a server has LUNs currently owned by SPA, but is connected only to the switch that will be going to the SPB input, then after the separation, will that LUN get cut off unless we manually trespass it or run failover mode 4 (ALUA, active/active)? Does the SPA input automatically talk to SPB internally without special settings and protocols?

The reason I ask is that all the LUNs are set up right now as failover mode 1 (PNR, passive/active).

Another possible solution to avoid this issue is to connect each switch to both SPs by running 2 more cables (when I can get them), *but* would that cause the switches to see each other (through the CX-4) and join in a single fabric again? Because that would break everything.

I'm hoping this is as simple as running the two extra cables to ensure no server gets cut off from its LUN, no matter which SP owns it or controls it, while allowing the switches to remain separate fabrics.

Thanks!

Paul

2 Intern

 • 

20.4K Posts

April 3rd, 2014 07:00

Paul,

Save yourself headaches and future outages: buy some LC-LC cables and connect additional CX ports to each switch.

Switch 1:

SPA0

SPB1

Switch 2:

SPB0

SPA1

At least something like that. The fabrics will NOT merge; the CX is not an FC router. If your system admins do not buy PowerPath they can still use native multipathing. Your zones would look something like this:

Switch 1:

host1-hba1-SPA0

host1-hba1-SPB1

Switch 2:

host1-hba2-SPB0

host1-hba2-SPA1
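As a rough illustration of what this zoning layout buys you, the following Python sketch records the four zones above and checks which SPs each HBA can reach. The names `ZONES`, `sp_of`, and `sps_reachable` are hypothetical and made up for this example; nothing here talks to a real switch or array.

```python
# Zones from the layout above, keyed by switch (one switch per fabric).
# Each zone is modeled as a (host HBA, array port) pair.
ZONES = {
    "switch1": [("host1-hba1", "SPA0"), ("host1-hba1", "SPB1")],
    "switch2": [("host1-hba2", "SPB0"), ("host1-hba2", "SPA1")],
}


def sp_of(port):
    """Map an array port name such as 'SPA0' to its storage processor, 'SPA'."""
    return port[:3]


def sps_reachable(zones, hba):
    """Return the set of SPs a given HBA is zoned to, across all switches."""
    return {
        sp_of(port)
        for members in zones.values()
        for zoned_hba, port in members
        if zoned_hba == hba
    }


all_hbas = sorted({hba for members in ZONES.values() for hba, _ in members})
coverage = {hba: sps_reachable(ZONES, hba) for hba in all_hbas}

for hba, sps in coverage.items():
    print(hba, "->", sorted(sps))
```

With this layout each HBA is zoned to a port on both SPs, so losing one switch or one SP still leaves a zoned path to the other SP.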

1 Rookie

 • 

51 Posts

April 3rd, 2014 07:00

That is the best case scenario, definitely!

But in the event we can't get the cables in the short term, would that cut off some LUNs?

If so, I can use that information to pressure the powers that be to expedite the cable order.

Some of their servers do multipath; once the migration is complete and the Scalar tape library is brought over to the new SAN, I expect more of their servers will go multipath, but there are always a few test servers they set up without it.

2 Intern

 • 

20.4K Posts

April 3rd, 2014 09:00

If servers have multiple HBAs, there is no reason not to use multipathing. Win2k8, Win2k12, Linux, HP-UX and AIX all have native multipathing software available; it just needs to be configured.

Today, since you are not using multipathing software (and failover mode is 1 or 0), even if you are zoned to multiple SPs, the minute your "primary" path becomes unavailable the server will lose access to storage. On systems where you are using multipathing and failover mode is set to 4, you "should" be able to survive with one SP going away, as your server requests should be serviced through the CMI bus from another SP. Honestly, I have never tested this; I always have at least two paths, one going to each SP.
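The difference between the failover modes can be pictured with a toy Python model. This is only a sketch of the behavior as described in this thread, not of the array's actual logic; the function name `io_result` and its return strings are invented for illustration.

```python
def io_result(lun_owner, path_sp, failover_mode):
    """Toy model: what happens to host I/O sent down a path to `path_sp`.

    lun_owner     -- SP that currently owns the LUN ("SPA" or "SPB")
    path_sp       -- SP the host's path lands on
    failover_mode -- 1 (PNR) or 4 (ALUA), as set on the initiator record
    """
    if path_sp == lun_owner:
        return "served by owning SP"
    if failover_mode == 4:
        # ALUA: the non-owning SP accepts the request and passes it
        # to the owner over the CMI bus (a non-optimized path).
        return "forwarded to owning SP over CMI"
    # PNR and other non-ALUA modes: the non-owning SP does not service
    # the I/O; the LUN has to be trespassed first.
    return "no access until trespass"


# The case from the original question: LUN owned by SPA, host cabled only to SPB.
print(io_result("SPA", "SPB", failover_mode=1))
print(io_result("SPA", "SPB", failover_mode=4))
```

Under these assumptions, a single-path host that lands on the non-owning SP in mode 1 is the one that gets cut off, which is the scenario the extra cables are meant to avoid.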

1 Rookie

 • 

51 Posts

April 3rd, 2014 10:00

Maybe after everything is squared away, and hosts aren't straddling two SANs, we can go all multipath.

However, they seem to prefer MPIO to PowerPath, but from the documentation I've read, ALUA doesn't play that nicely with Linux MPIO: it can load balance and autorestore but won't detect a non-responding or hung SP.

(h2890-emc-clarion-asymm-active-wp.pdf  page 17)

To make it worse, I think they ran into issues trying to get PowerPath to even run on the OracleOS one of their servers is running on.

Also, just last week I added a new host/initiator record for them and accidentally set it to ALUA, whereas the other servers in the storage group were using PNR. They said this server couldn't see one of the LUNs in the storage group that the other servers could see; I edited the problem host to use PNR, and it immediately saw the LUN.

So ALUA actually appears to break things?

The two path option is our saving grace.

2 Intern

 • 

20.4K Posts

April 3rd, 2014 11:00

ALUA works just fine as long as you meet host-side requirements. My physical Windows environment is dwindling, so the few physical hosts that are left are using Microsoft MPIO and ALUA on VNX. Everything is working just fine; they survive Cisco FI reboots and VNX FLARE upgrades. On the Linux side (RHEL 5, 6) we are using PowerPath; system admins like emcpower device persistence versus the "flakiness" of DM-MPIO.

1 Rookie

 • 

51 Posts

April 4th, 2014 07:00

It just occurred to me.. if I do this, as you recommended:  (I made a diagram too)

Switch 1:

SPA0

SPB1

Switch 2:

SPB0

SPA1

SAN layout2.bmp

Without any multipath software, I still shouldn't have any "ghost" duplicates of the LUN showing up on the host, right? Because only one SP controls it..? (We would definitely have that issue with our old Symmetrix, but that was a different setup.)

Now, what if, say, SPA fails: will the CX-4 at least be smart enough to trespass all its LUNs to SPB regardless of host failover or software settings? Is that what the connections to SPA1 and SPB1 would do for us?

The way our old Symmetrix used to be hooked up was very straightforward, and it's the way I had originally planned to do our CX-4, but with the different SP model, it's become apparent that won't work, or at least not unless I check every LUN to see which SP controls it and manually trespass as necessary.

SAN layout.bmp

1 Rookie

 • 

51 Posts

April 4th, 2014 08:00

without multipathing you will see "ghost" entries if your HBA is zoned to multiple ports, it could be the same SP ..doesn't have to be another SP.

Ah, I see what you mean.  So, SPA0 and SPA1, going to both HBAs ultimately, would create ghosts.  Darn. Good point.

LUN trespass is initiated by host multipathing software, unless SP becomes unavailable (panic or NDU upgrade)

Is the keyword here, "unless"?  Meaning, if it's not the HBA or switch at fault, but one of the SPs, will the CX-4 automatically trespass everything over to the other SP?

It should be simple for me to read the docs and glean this kind of info, but they throw so much at you it gets confusing. I had no training on this stuff.

You still need to multipath if you are zoned/masked to multiple FAs.

Got 'cha.   Understood.

Thank you for all your help.

2 Intern

 • 

20.4K Posts

April 4th, 2014 08:00

Without multipathing you will see "ghost" entries if your HBA is zoned to multiple ports; it could be the same SP, it doesn't have to be another SP. LUN trespass is initiated by host multipathing software, unless the SP becomes unavailable (panic or NDU upgrade). Just because your host lost connectivity on one HBA will not trigger a LUN trespass (again, without multipathing software). You don't have this issue with a Symm: every path is active, so there is no such thing as trespass. You still need to multipath if you are zoned/masked to multiple FAs.
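A small Python sketch of why the "ghost" entries appear: without multipathing software the host lists one device entry per path, not per LUN. The function `devices_seen` and the naming of the entries are assumptions made for this example; real device names depend on the operating system.

```python
def devices_seen(luns, zoned_ports, multipath):
    """Return the device entries a host would list for its LUNs.

    luns        -- LUN names masked to the host
    zoned_ports -- array ports the host's HBAs are zoned to
    multipath   -- True if multipathing software coalesces the paths
    """
    if multipath:
        # One entry per LUN; the individual paths sit underneath it.
        return list(luns)
    # No multipathing: every LUN shows up once per zoned port.
    return [f"{lun} via {port}" for lun in luns for port in zoned_ports]


print(devices_seen(["LUN0"], ["SPA0", "SPA1"], multipath=False))
print(devices_seen(["LUN0"], ["SPA0", "SPA1"], multipath=True))
```

Note that the two ports in the example belong to the same SP, matching the point above that the duplicates do not require a second SP.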

2 Intern

 • 

20.4K Posts

April 4th, 2014 11:00

NJSPhigg wrote:

LUN trespass is initiated by host multipathing software, unless SP becomes unavailable (panic or NDU upgrade)

Is the keyword here, "unless"?  Meaning, if it's not the HBA or switch at fault, but one of the SPs, will the CX-4 automatically trespass everything over to the other SP?

Yes. For example, if SPB encounters hardware issues (say a memory module goes bad), it will panic and may or may not come back online. As soon as it panics, all LUNs get trespassed to SPA. If the host uses multipathing software, it simply marks paths to SPB as dead and continues to use paths to SPA without any disruption to storage access. That's why you need to re-emphasize to your sysadmins how important it is to use and properly configure multipathing. If they are having configuration issues you can always engage EMC support for assistance (provided you have an active support contract for the EMC array).
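The SP panic scenario can be walked through with a short Python sketch. It is a simplified model of the sequence described above, with made-up names (`peer`, `trespass_on_panic`, `surviving_paths`) and example LUN and port names; it does not reflect how FLARE implements trespass internally.

```python
def peer(sp):
    """Return the other storage processor in a two-SP array."""
    return "SPB" if sp == "SPA" else "SPA"


def trespass_on_panic(ownership, failed_sp):
    """Move every LUN owned by the failed SP to its peer."""
    return {
        lun: (peer(failed_sp) if owner == failed_sp else owner)
        for lun, owner in ownership.items()
    }


def surviving_paths(paths, failed_sp):
    """Drop the host paths that land on ports of the failed SP."""
    return [port for port in paths if not port.startswith(failed_sp)]


ownership = {"LUN0": "SPA", "LUN1": "SPB"}
after = trespass_on_panic(ownership, "SPB")
print(after)

# A host zoned to both SPs keeps a path; a host cabled only to SPB does not.
print(surviving_paths(["SPA0", "SPB1"], "SPB"))
print(surviving_paths(["SPB0"], "SPB"))
```

The array moving the LUNs only helps a host that still has a live path to the surviving SP, which is why the second cable per switch and working multipathing go together.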
