14 Posts
EqualLogic PS6100E network ports not working after battery failure
Hi there,
I have an EqualLogic PS6100E storage array that raised a battery failure alert on one of its controller modules. While we were waiting to replace that battery, the battery in the other module also failed, and then the worst happened: a power failure. Since then the network ports on both modules have not worked. After about three weeks we managed to replace the batteries in both controller modules, but when we power the array on there is still no success: the network ports on both modules are offline. I connected the console cable to a module to see whether I could access it, and got the messages below.
CLI> show
The storage array is still initializing. Limited commands will be available until the initialization is complete. Please try again later.
CLI> support exec raidtool
You are running a support command, which is normally restricted to PS Series Technical Support personnel. Do not use a support command without instruction from
Technical Support.
Driver Status: *Admin Intervention Requested*
RAID LUN 0 Ok.
raid status unrecoverable.
12 Drives (0,2,4,6,8,1,3,5,7,9,10,11)
RAID 6 (64KB sectPerSU)
Capacity 9,601,932,984,320 bytes
RAID LUN 1 Ok.
raid status unrecoverable.
11 Drives (12,13,14,15,16,17,18,23,20,21,22)
RAID 6 (64KB sectPerSU)
Capacity 8,641,739,685,888 bytes
Available Drives List: 19
CLI>
768:3:eqllogger: 1-Nov-2014 10:04:33.350003:EQLLogFile.cc:210:WARNING::16.3.0:Logger daemon is losing messages because offline disks are generating more events than the daemon can handle.
CLI> support exec uname -a
You are running a support command, which is normally restricted to PS Series Technical Support personnel. Do not use a support command without instruction from
Technical Support.
NetBSD 5.0_STABLE NetBSD 5.0_STABLE (EQL.PSS) #0: Fri Oct 31 09:36:54 EDT 2014 build@m64:/buildarea/V7.1.2__Fri_Oct_31_2014_09_29_23_EDT/bin/destdir.sbmips.64.release/EQL.PSS.64 sbmips
CLI> support exec diskview -j
You are running a support command, which is normally restricted to PS Series Technical Support personnel. Do not use a support command without instruction from
Technical Support.
Enc/Drive  State   Write   Read    Power   Drive     Bad     ForceWrite  Reset  Read     Scan    Max       Max
                   Retrys  Retrys  Cycles  Timeouts  Blocks  Retrys      Fail   Timeout  Errors  Cominits  HrstMsecs
____________________________________________________________________________________________________________________
0/ 0       Online  0       0       0       0         0       0           0      0        0       0         0
0/ 1       Online  0       0       0       0         0       0           0      0        0       0         0
0/ 2       Online  0       0       0       0         0       0           0      0        0       0         0
0/ 3       Online  0       0       0       0         0       0           0      0        0       0         0
0/ 4       Online  0       0       0       0         0       0           0      0        0       0         0
0/ 5       Online  0       0       0       0         0       0           0      0        0       0         0
0/ 6       Online  0       0       0       0         0       0           0      0        0       0         0
0/ 7       Online  0       0       0       0         0       0           0      0        0       0         0
0/ 8       Online  0       0       0       0         1       0           0      0        0       0         0
0/ 9       Online  0       0       0       0         0       0           0      0        0       0         0
0/10       Online  0       1       0       0         0       0           0      0        0       0         0
0/11       Online  0       0       0       0         0       0           0      0        0       0         0
0/12       Online  0       0       0       0         0       0           0      0        0       0         0
0/13       Online  0       0       0       0         0       0           0      0        0       0         0
0/14       Online  0       0       0       0         0       0           0      0        0       0         0
0/15       Online  0       0       0       0         0       0           0      0        0       0         0
0/16       Online  0       0       0       0         0       0           0      0        0       0         0
0/17       Online  0       0       0       0         0       0           0      0        0       0         0
0/18       Online  0       1       0       0         0       0           0      0        0       0         0
0/19       Online  4       0       0       11        12      0           0      0        0       0         0
0/20       Online  0       0       0       0         0       0           0      0        0       0         0
0/21       Online  0       0       0       0         0       0           0      0        0       0         0
0/22       Online  0       0       0       0         0       0           0      0        0       0         0
0/23       Online  1       0       0       0         1       0           0      0        0       0         0
CLI> support exec raidtool -Z
You are running a support command, which is normally restricted to PS Series Technical Support personnel. Do not use a support command without instruction from
Technical Support.
Active RAID LUNs: 0
Driver Status = **ADMIN INTERVENTION REQUIRED**.
Malloc Bytes = 0KB
Outstanding Active I/O's = 0
Pending I/O's = 0
Pending Resource Reqs = 0
Outstanding StripeLocks = 0
Allocated Sectors = 0
Device = 000
status = 008
outio = 00000000
drives = 12
disk luns: 11 10 9 7 5 3 1 8 6 4 2 0
Device = 001
status = 008
outio = 00000000
drives = 11
disk luns: 22 21 20 23 18 17 16 15 14 13 12
disk count:24
disk lun= 0 status=0x00000400 drive active device=0
disk lun= 1 status=0x00000400 drive active device=0
disk lun= 2 status=0x00000400 drive active device=0
disk lun= 3 status=0x00000400 drive active device=0
disk lun= 4 status=0x00000400 drive active device=0
disk lun= 5 status=0x00000400 drive active device=0
disk lun= 6 status=0x00000400 drive active device=0
disk lun= 7 status=0x00000400 drive active device=0
disk lun= 8 status=0x00000400 drive active device=0
disk lun= 9 status=0x00000400 drive active device=0
disk lun=10 status=0x00000400 drive active device=0
disk lun=11 status=0x00000400 drive active device=0
disk lun=12 status=0x00000400 drive active device=1
disk lun=13 status=0x00000400 drive active device=1
disk lun=14 status=0x00000400 drive active device=1
disk lun=15 status=0x00000400 drive active device=1
disk lun=16 status=0x00000400 drive active device=1
disk lun=17 status=0x00000400 drive active device=1
disk lun=18 status=0x00000400 drive active device=1
disk lun=19 status=0x00001000 hot spare no-device
disk lun=20 status=0x00000400 drive active device=1
disk lun=21 status=0x00000400 drive active device=1
disk lun=22 status=0x00000400 drive active device=1
disk lun=23 status=0x00000400 drive active device=1
CLI>
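For anyone reading a similar dump, the diskview -j table above can be scanned for drives with non-zero counters using a short script. This is only a rough sketch: the counter names are an assumption made by pairing the two header lines column by column, the function names are hypothetical, and the sample rows are copied from the output above.

```python
# Counter names assumed by pairing the two header lines of `diskview -j`.
COUNTERS = [
    "Write Retrys", "Read Retrys", "Power Cycles", "Drive Timeouts",
    "Bad Blocks", "ForceWrite Retrys", "Reset Fail", "Read Timeout",
    "Scan Errors", "Max Cominits", "Max HrstMsecs",
]


def parse_diskview(text):
    """Return {drive_number: {counter_name: value}} for every data row."""
    drives = {}
    for line in text.splitlines():
        # Rows look like "0/ 8 Online 0 0 0 0 1 ..."; join "0/ 8" into "0/8".
        parts = line.replace("/ ", "/").split()
        if len(parts) != 2 + len(COUNTERS) or "/" not in parts[0]:
            continue  # separator, prompt, or unrelated line
        try:
            drive = int(parts[0].split("/")[1])
            values = [int(p) for p in parts[2:]]
        except ValueError:
            continue  # header line or other non-numeric row
        drives[drive] = dict(zip(COUNTERS, values))
    return drives


def nonzero_counters(drives):
    """Keep only drives that report at least one non-zero counter."""
    flagged = {}
    for drive, counters in sorted(drives.items()):
        hits = {name: value for name, value in counters.items() if value}
        if hits:
            flagged[drive] = hits
    return flagged


# A few rows copied from the console output above.
SAMPLE = """\
0/ 8 Online 0 0 0 0 1 0 0 0 0 0 0
0/ 9 Online 0 0 0 0 0 0 0 0 0 0 0
0/10 Online 0 1 0 0 0 0 0 0 0 0 0
0/19 Online 4 0 0 11 12 0 0 0 0 0 0
"""

if __name__ == "__main__":
    for drive, hits in nonzero_counters(parse_diskview(SAMPLE)).items():
        print(drive, hits)
```

Run against the full table, this would single out drive 19 (the one raidtool lists as available and as the hot spare), which shows the largest counts.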
Regards,
Buzotz.
dwilliam62
3 Apprentice
1.5K Posts
March 15th, 2022 07:00
Hello,
You are very welcome. If this is your DR array, then the risk is pretty low. The clearlostdata command will only discard what was lost in cache, which isn't going to be a lot of data. If a log transaction is lost and the array doesn't boot up, there are two options. If you are in a region that supports one-time support cases for a fee, you can do that and get the LV log rebuilt. Alternatively, reset the array, configure it again, and replicate your data from the working 6210.
The risk of running the clearlostdata command would appear to be pretty low.
Regards,
Don
Dell- Maria J
Moderator
278 Posts
March 10th, 2022 07:00
Hello buzotz,
I am sorry you are facing this issue.
Could you please share which firmware version is installed?
Are there any LED indications?
Regarding the message:
CLI> show
The storage array is still initializing. Limited commands will be available until the initialization is complete. Please try again later.
This can be caused when the passive controller is not able and/or allowed to read the RAIDset so it will always report itself as initializing. The command prompt is "CLI" rather than the group name, which is also a result of the passive controller not being allowed to read any configuration information from the RAIDset.
If possible, I would also like to check the logs from the device. Could you please gather them and send them to me in a private message? How to gather logs from Dell EqualLogic:
https://dell.to/3636mKK
Please, let me know if you have any questions.
Thank you.
buzotz
14 Posts
March 10th, 2022 21:00
Hi,
Thanks so much for your response.
1. The firmware version is V7.1.2 (R402088), as I documented a long time ago (is there a way I could get this through the console to make sure?).
2. At the back of the controllers there are no LED indications except on the management port (Eth0, 1, 2, and 3 are not working).
3. It is only possible to check logs from the device through the GUI, and I only have access through the console, so this link does not work for me ( https://dell.to/3636mKK )
Thanks for the assist.
Regards
Buzotz.
Dell- Maria J
Moderator
278 Posts
March 11th, 2022 04:00
Hello Buzotz,
Thank you for your reply. I've checked the technical documentation regarding your issue and would like to suggest the following steps:
1. What type of controllers are installed? Is it Type 11? Type 11 controllers do not use batteries; they use a set of capacitors on a small daughter board attached to the controller. This Dell Community forum post has information on replacing the battery. Have you replaced one like the one shown in the picture?
https://dell.to/3MEegem
2. Did you try to restart and reseat the controllers?
3. Could you please check the health status of the controllers, following this example:
Active CM0:
ecli> Health Status (0x0000080000000000): RED Conditions:
HARDWARE_COMPONENT_FAILURE_CRITICAL
Passive CM1:
ecli> Health Status (0x0000080000000000): RED Conditions:
HARDWARE_COMPONENT_FAILURE_CRITICAL
4. Did you check the connection from the switch side?
Please let me know if you have any questions.
Thanks
buzotz
14 Posts
March 11th, 2022 04:00
https://dell.to/3MEegem
Ans: Type 11.
And yes, I have replaced the batteries on both controllers this week.
2. Did you try to restart and reseat controllers?
Ans: I only restarted the controllers through the console.
3. Could you please check Health status of controllers
Ans: The device is quite far away, but I will post the result soon.
4. Did you check connection from switch side?
Ans: Yes, I checked the connection from the switch side and it is fine. I have also tested the network cables and they are OK.
dwilliam62
3 Apprentice
1.5K Posts
March 13th, 2022 05:00
Hello,
The array is offline due to the "Admin intervention" message. When that condition occurs, the array will stop running the start up sequence. Reseating the controllers will not help.
At this juncture, there is lost cache. The only option that will likely help is running the command
CLI>clearlostdata
Unfortunately, especially with old firmware, you could lose an IO transaction when the cache is discarded.
Do you have a backup of your data?
If that resolves it, the array will complete the boot-up sequence and enable the Ethernet ports.
Regards,
Don
buzotz
14 Posts
March 14th, 2022 00:00
Hi Don,
Thanks for the reply. This device is our offsite data recovery array. We use it to keep our old data, so we do not have a backup of any of it. Is this the only option we can use?
Regards
dwilliam62
3 Apprentice
1.5K Posts
March 14th, 2022 08:00
Hello,
Yes, other than resetting it back to factory defaults and restoring. There's no way to force it to complete the boot-up sequence in this state. The "admin intervention" error indicates that.
Regards,
Don
buzotz
14 Posts
March 15th, 2022 06:00
Hi Don,
Thanks for the assist so far. I have to consult management on whether we should go ahead with the reset to factory defaults and give up our old data. I am now sure that a restore cannot be done, as we do not have any backup of the data on this device. We only do backups on our other EqualLogic 6210, which is still working and is where all our ongoing information is kept online. This EqualLogic 6100 (our DR site device), which needs to be reset to bring it back online, only held our old data. Fingers crossed that this information won't be needed one day.
Regards.
buzotz
14 Posts
March 17th, 2022 02:00
Hi Don,
Thank you so much for the assist. We managed to bring the EqualLogic back with all of its data. I did not do anything else after running the CLI>clearlostdata command; I just followed the instructions to the end, and at the end everything was back online.
Thanks Again,
Buzotz.
dwilliam62
3 Apprentice
1.5K Posts
March 18th, 2022 03:00
Hello Buzotz,
That's great!
I would suggest two things. Run a filesystem check on all volumes, and investigate adding a backup for that array. Even a couple of external drives (or better, external mirrored drives) would be enough.
If you are going to keep this unit, you might also want to investigate getting a parts-only support contract. That would also give you access to the updated firmware for the array, and potentially for the disk drives as well, depending on the model and current firmware on the drives.
Regards,
Don
buzotz
14 Posts
March 18th, 2022 03:00
Thanks Again Don,
I will do the follow up on your suggestions.
But I also hope it is possible to manually download and install the latest firmware, upgrading from the 7.1.2 that we have.
Regards,
Buzotz.
dwilliam62
3 Apprentice
1.5K Posts
March 27th, 2022 16:00
Hello Buzotz,
You are very welcome! I am very glad I could assist you.
Re: Firmware. Once you have a support contract in place you can download the firmware. However, without that contract you cannot access the firmware.
You will have to go through quite a few steps to get to 10.0.3.
It will be something like: 7.1.x -> 8.0.x -> 8.1.x -> 9.1.x -> 10.0.3
However, the firmware guide will cover the exact steps.
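As a planning aid only, the hop sequence quoted above can be written down and queried for the steps still ahead of a given version. This is a minimal sketch that assumes the path exactly as stated in this post ("something like"); the function names are hypothetical, and the firmware guide remains the authority on the real steps.

```python
# Hop order as quoted in the post above; treat it as approximate.
UPGRADE_PATH = ["7.1.x", "8.0.x", "8.1.x", "9.1.x", "10.0.3"]


def family(version):
    """Reduce a version such as '7.1.2' to its 'major.minor' family ('7.1')."""
    return ".".join(version.split(".")[:2])


def remaining_hops(current, path=UPGRADE_PATH):
    """Return the entries of the quoted path that come after `current`."""
    families = [family(step) for step in path]
    try:
        index = families.index(family(current))
    except ValueError:
        raise ValueError(f"{current} is not on the quoted path") from None
    return path[index + 1:]


if __name__ == "__main__":
    # The array in this thread reports V7.1.2.
    print(" -> ".join(remaining_hops("7.1.2")))
```

Starting from the 7.1.2 release mentioned in this thread, that leaves four intermediate or final targets on the quoted path.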
Regards,
Don