Unsolved

June 18th, 2015 13:00

Connection Failure errors on AIX P7, VPLEX


We have a persistent issue where randomly, but often enough, AIX P7 hosts connected to VPLEX throw errors such as:

Connection Failure hdisk12

CONNECTION FAILURE hdisk100

We see these quite often across multiple hosts, yet the hosts never experience any performance hit or data loss. We have engaged EMC and sent grabs, and every time they review them they do not see an issue. Our SAs have also engaged IBM, and IBM sees no issue from the client side either.

Internal SAN review also does not yield any smoking gun or errors of significance on the SAN.

Our setup is not complicated - AIX P7 host <<< ciscoMDS >>> VPLEX <<< ciscoMDS >>> VMAX

These P7s are using a Nexus module similar to Cisco UCS Fabric Interconnect (excuse my misuse of nomenclature) to manage SAN connections and virtual WWNs. We suspect our issue may lie here, but we still have no definite resolution.

I would appreciate any feedback from anyone utilizing AIX P7 with VPLEX and VMAX.

Thanks

100 Posts

April 12th, 2016 10:00

Wondering if you ever figured this out? Have a customer with a similar issue.

1 Rookie

63 Posts

April 12th, 2016 11:00

Tim. Yes, finally!

We use a SAN monitoring product called VI (Virtual Instruments VirtualWisdom), which taps into SAN port connections. We noticed many "target busy" messages appearing randomly but repeatedly on connections between hosts and VPLEX. VI also showed us which hosts this was happening to most often. Again, everyone was perplexed because no host (or SA) was complaining about this, but both EMC and VI assured us that this is not normal. So we performed a deeper review of all our zones and VPLEX-connected hosts, and again using VI statistics we could see that many VPLEX ports were over-burdened while others were under-utilized. Mind you, no host performance issues were being experienced, though VI was picking up these target busy messages fairly regularly.

We concluded that this could eventually lead to issues, so we decided to rework our zoning to achieve a greater spread of IO across all VPLEX ports (in a nutshell).
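As an illustration only, the idea of spreading IO more evenly across ports can be sketched as a simple greedy assignment in Python. Everything here is hypothetical: the host names, the load figures, the port names and the `spread_hosts` function are made up for the example, and the thread does not describe the tool or procedure that was actually used. A real zoning plan would also have to respect director and fabric redundancy, which this sketch ignores.

```python
def spread_hosts(host_load, ports, paths_per_host=2):
    """Greedy sketch: place the heaviest hosts first, each on the
    currently least-loaded ports. Returns (zoning, port_load)."""
    if paths_per_host > len(ports):
        raise ValueError("not enough ports for the requested number of paths")

    port_load = {p: 0.0 for p in ports}   # estimated IO per port
    zoning = {}                           # host -> list of ports it is zoned to

    # Heaviest hosts first, so the large loads are balanced before the small ones.
    for host, load in sorted(host_load.items(), key=lambda kv: kv[1], reverse=True):
        # Pick the least-loaded ports; the port name breaks ties deterministically.
        chosen = sorted(ports, key=lambda p: (port_load[p], p))[:paths_per_host]
        share = load / paths_per_host     # assume IO splits evenly across paths
        for p in chosen:
            port_load[p] += share
        zoning[host] = chosen
    return zoning, port_load


# Hypothetical per-host IO figures (e.g. averaged from monitoring statistics).
host_load = {"aix01": 400, "aix02": 300, "aix03": 200, "aix04": 100}
ports = ["fe_port_0", "fe_port_1", "fe_port_2", "fe_port_3"]

zoning, port_load = spread_hosts(host_load, ports)
for host, assigned in zoning.items():
    print(host, "->", assigned)
print(port_load)
```

Comparing the highest and lowest values in `port_load` before and after a proposed change gives a quick, rough sense of whether the new layout is actually more even.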

Well, it was a few months of heavy lifting, but that actually worked! Lo and behold, those nagging "connection failure" errors on AIX all subsided. Evidently, AIX hosts were more sensitive to this config issue than others, but again, in spite of the error messages, no host performance ever degraded with the former zoning in place. Hope this helps.

100 Posts

April 12th, 2016 11:00

Thank you very much for the reply. In this case the two AIX hosts dropped their connections while the customer was provisioning VPLEX LUNs. It could be performance related, but the NAR files checked out okay. Ongoing...

Thanks again, this helps.
