Any issues with XtremIO upgrades from 4.0.15-20 to 4.0.15-24?
Has anyone else experienced a disruptive upgrade, specifically from 4.0.15-20 to 4.0.15-24?
I am trying to figure out whether I should check with the community before proceeding with further upgrades.
I have a support case open and I am sure Dell/EMC will soon have answers on what caused the issue, but I would like to avoid any further impact on production.
tim.koopman
73 Posts
0
June 27th, 2017 07:00
The XtremIO upgrade went well last week. All AIX servers involved stayed up. Successful NDU! The fix is below.
EMC added the line below to the Customer Upgrade Preparation Guide - XtremIO:
* If using AIX, please review KB491002 prior to NDU.
KB491002 references the IBM fix:
* Apply the IBM Authorized Program Analysis Report (APAR) mentioned in IBM IV84862 - Improve Handling of Aborted Commands on the host side.
Note that the fix has been rolled into a service pack. My AIX systems that made it through the successful NDU were running the AIX OS level below.
# oslevel -s
7100-04-03-1642
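For anyone who wants to compare hosts before an NDU, here is a rough Python sketch that parses an `oslevel -s` string and compares it against the level reported above (7100-04-03-1642). The function names are made up for illustration, and treating that level as a reference point is only an assumption taken from this thread, not an EMC or IBM requirement. KB491002 and the APAR remain the real reference.

```python
def parse_oslevel(level: str) -> tuple[int, int, int, int]:
    """Split an `oslevel -s` string (release-TL-SP-build) into integers."""
    parts = level.strip().split("-")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f"unexpected oslevel format: {level!r}")
    release, tl, sp, build = (int(p) for p in parts)
    return release, tl, sp, build


# Assumption: the level reported in this thread, used as a reference point only.
REFERENCE_LEVEL = "7100-04-03-1642"


def at_or_above(level: str, reference: str = REFERENCE_LEVEL) -> bool:
    """True if `level` is the same release and not older than `reference`."""
    cur, ref = parse_oslevel(level), parse_oslevel(reference)
    # Levels from different releases (e.g. 7100 vs 7200) are not compared here.
    return cur[0] == ref[0] and cur[1:] >= ref[1:]


if __name__ == "__main__":
    # Example inputs only; feed in the real `oslevel -s` output from each host.
    for host_level in ("7100-04-03-1642", "7100-04-02-1614"):
        print(host_level, at_or_above(host_level))
```

A host at or above the reference level is not proof that the fix is present, so it is still worth confirming the APAR itself on each host.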
Kumar_A
727 Posts
0
May 8th, 2017 08:00
4.0.15-20 to 4.0.15-24 is an NDU (non-disruptive upgrade) process. Dell EMC Support will run pre-upgrade checks in your environment so that we are aware of any potential issues before the NDU process starts.
Swapnil_Pandey
86 Posts
1
May 8th, 2017 09:00
Hi,
We have had this upgrade done (same versions), as there was an advisory recommending the upgrade. We didn't face any issues. Maybe EMC can suggest more on this.
tim.koopman
73 Posts
0
May 8th, 2017 09:00
Thank you, the above is great news, glad to hear your upgrade went smoothly.
tim.koopman
73 Posts
0
May 8th, 2017 09:00
Avi, thank you for your comment, and yes, my earlier updates were NDU, but the 4.0.15-20 to 4.0.15-24 upgrade was NOT an NDU.
EMC did all the pre-checking and everything passed, but a production database crashed during the update.
I will update this post when I get the RCA results, because many things can cause issues during an update.
In my case the previous update was done less than 90 days earlier without issues.
I believe my hosts were all configured correctly, which is why I was asking the community if anyone else has experienced issues with the specific version jump from 4.0.15-20 to 4.0.15-24.
I was very surprised the upgrade had issues because the previous upgrade had no issues at all.
scotthoward
64 Posts
1
May 12th, 2017 00:00
4.0.15-20 to 4.0.15-24 is about as simple an upgrade as they come. There are no "firmware" changes in this version, so there's no need to reboot any of the storage controllers - just a quick blip as we reload the new XIOS code, and that's it.
There is one additional step that the person carrying out the upgrade will do because there wasn't a reboot, but that's completely transparent.
There's certainly no expectation of any problems for this (or any other) upgrade. Most of the time, when we see issues during upgrades, it's down to things like multipathing or timeouts not being set correctly on the host, but I'm sure support will be working with you to work out exactly what went wrong in this case and get it fixed. We're actually working on a set of scripts that will validate the host-side configuration before an upgrade (or at any other time) to help avoid such issues - the one for VMware is in final testing, and (physical) Windows and Linux will follow shortly.
tim.koopman
73 Posts
0
May 12th, 2017 09:00
Hi Mdeitrick and Scotthoward, thank you for your input.
EMC Support suggested I open a support case with the switch vendor, which I have done, and that investigation is taking place. No root cause identified yet, but we are still digging.
tim.koopman
73 Posts
0
May 17th, 2017 07:00
Just an update: the switch vendor did not find any FC switch issues shortly before, during, or after the upgrade.
tim.koopman
73 Posts
0
May 17th, 2017 12:00
mdeitrick,
EMC support just requested that I have the vendor check for flapping on the port. No flapping was detected.
Regarding SR #:
Service Request Number: 07017682
Former Service Request Number: 85818360
tim.koopman
73 Posts
0
May 24th, 2017 06:00
I am still waiting for the official RCA. The preliminary info that was provided referenced two possible reasons for the outage.
The first is the AIX fix, which is public knowledge; I just did not know about it.
Make sure your AIX systems have IBM IV87492: IMPROVE HANDLING OF ABORTED COMMANDS APPLIES TO AIX 7100-04 installed.
The other possibility that EMC referenced was an XtremIO bug, but since the preliminary info has "EMC confidential" all over it, I will let someone from EMC disclose the bug and its details.
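To check whether the APAR is present on a host, AIX has the `instfix -ik <APAR>` command. Below is a hedged Python sketch around it. The helper names are hypothetical, and the success phrase it looks for ("All filesets for ... were found") is an assumption about the instfix wording, so compare it against the output on your own host before relying on it.

```python
import subprocess


def apar_reported_installed(instfix_output: str, apar: str) -> bool:
    """Interpret the text printed by `instfix -ik <apar>`.

    Assumption: a fully installed APAR is reported with a line containing
    "All filesets for <apar> were found". The match is case-sensitive on
    purpose, so a line starting "Not all filesets ..." does not count.
    """
    expected = f"All filesets for {apar} were found"
    return any(expected in line for line in instfix_output.splitlines())


def check_apar(apar: str) -> bool:
    """Run instfix on the local host (only meaningful on AIX)."""
    result = subprocess.run(
        ["instfix", "-ik", apar],
        capture_output=True,
        text=True,
        check=False,  # a missing APAR should not raise an exception
    )
    return apar_reported_installed(result.stdout + result.stderr, apar)
```

On an AIX host this would be called as `check_apar("IV87492")`. Since the fix was later rolled into a service pack, the APAR number that applies may differ by technology level, so check the IBM APAR page for your level.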
tim.koopman
73 Posts
2
June 20th, 2017 07:00
EMC's final RCA indicates the issue was the missing AIX patch. I have an XtremIO upgrade scheduled this week, and it is a bigger jump in code: 4.0.2-80 to 4.0.15-24. I have applied the AIX patch to the AIX servers.