
February 28th, 2013 08:00

Unknown cause of dead paths in PowerPath 5.1 on W2K3 cluster

Has anybody seen anything like this in the past? I was not involved in the incident, but I'm looking at the RCA.

We have a Windows 2003 cluster attached to two CLARiiON CX4 arrays. It reported dead paths to all LUNs on one of the arrays, on both switch fabrics, and failed.

It then had problems with reservations, but I believe this was a symptom of the first problem, as it had lost access to the disks.

I've checked the switch logs and messages and found no indication of links failing, and no messages on either the host switch ports or the array front-end ports.

The CLARiiON events were checked and none were reported at the time of the failure.

No other hosts (and there are quite a few) reported any errors, and they attach to the same front-end SP ports.

PowerPath has been updated to a supported level since the incident, but I'm struggling to find what caused the dead paths in the first place.


February 28th, 2013 13:00

What is the FLARE version on them? Did you collect the host report and run it through E-Lab? Any red spots other than the PowerPath one you mentioned?

March 1st, 2013 02:00

LEOG,

Dead paths can occur for various reasons. The driver levels (Storport, HBA and PowerPath) need to work very closely with the FLARE code on the array.

There are many enhancements between version 5.1 and the latest version, too many to compare. Even if you listed them all, many of them are described in terms of protocol, buffer or HBA terminology.

This terminology may be very hard, even for a seasoned storage expert, to relate back to the actual symptoms you experienced. It would probably be a very time-consuming exercise to work out which one, or which ones, of those enhancements fixed your symptoms, if you find the culprit at all. And as the issue is now fixed, there is no longer an opportunity to debug the actual symptom.

As regards supported and current versions, a general recommendation would be to check at least once a year for the latest drivers, both HBA and PowerPath. I realize that in some configurations, high-availability ones in particular, downtime is at a premium or not possible at all.

Having said that, the fact that you run a cluster can work to your advantage, as you can "upgrade the passive" node, then move the resources over, and rotate to upgrade the remaining nodes.
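
To make that rotation concrete, here is a minimal sketch of the ordering only. It is not a PowerPath or Windows cluster tool: the function name `rolling_upgrade_plan` and the node names are made up for illustration, and the steps are plain text for an operator to follow by hand.

```python
def rolling_upgrade_plan(nodes, active):
    """Return an ordered list of steps for upgrading a failover cluster
    one node at a time, so that an upgraded node is always available
    to take over before the active node is touched.

    Hypothetical helper: it only computes the order, it does not talk
    to any cluster or driver software.
    """
    if active not in nodes:
        raise ValueError("active node must be one of the cluster nodes")

    passive = [n for n in nodes if n != active]
    if not passive:
        raise ValueError("need at least one passive node to rotate onto")

    # 1. Upgrade every passive node while the active node keeps serving.
    steps = [f"upgrade {n}" for n in passive]
    # 2. Move the resources to a node that has already been upgraded.
    steps.append(f"move resources {active} -> {passive[0]}")
    # 3. The former active node is now passive and can be upgraded.
    steps.append(f"upgrade {active}")
    return steps


if __name__ == "__main__":
    for step in rolling_upgrade_plan(["node1", "node2"], active="node1"):
        print(step)
```

The point of the ordering is that the resources only ever move onto a node that is already at the new driver level.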

Again, I realize that keeping abreast of all versions of software and drivers is a daunting task, and that in some situations it cannot be done easily.

HTH,

Edwin.
