Unsolved

Closed

2 Posts

July 14th, 2023 01:00

MD3820f Cache Backup Device Failed

Hi all

We have an MD3820f that shows the following error:

Cache Backup Device Failed.
Component reporting problem: Cache Backup Device (USB 2). Status: Failed.
Location: RAID Controller Module 1, enclosure 0, device slot USB 2.

In the storage system profile, I can see that one of the two devices has the status Failed.

Recovery Guru suggests replacing the Cache Backup Device.
But Dell does not supply a Cache Backup Device (it's a 16 GB M2 flash card).

I was advised to reseat the Cache Backup Device, so I did.
The controller was taken offline, the Cache Backup Devices were reseated, and the controller was brought online again.
The problem persisted.
I was then advised that the controller needed replacement, so we replaced it.
Everything went fine; the new controller adopted the firmware, NVSRAM, and config.

The system now shows that both controllers have both Cache Backup Devices in the Optimal state.

But Recovery Guru still reports: Cache Backup Device Failed. Unable to retrieve detailed information.
And I am not able to trigger "Recheck": nothing happens, and the "turning gears" animation does not appear as it normally does during a recheck.

I cleared the event log, but it made no difference.
The firmware is the latest, 08.25.14.60.

Any suggestions on how to proceed?

Best Regards
Claus

Moderator

2.3K Posts

July 14th, 2023 06:00

Hi, this error may happen if the array is not shut down properly and the backup meta-database gets corrupted, or if the cache stick (a small SD card) comes loose.

To try to resolve the error, remove the controller, allow the residual power to dissipate, reseat the cache card securely, and then reinstall the controller. This is not guaranteed to work. Cache cards are typically not replaced in these controllers; as far as I know, the only internal component that is typically changed is the battery. Sometimes a system reboot also helps with issues like cache backup device failures.

2 Posts

July 18th, 2023 00:00

Hi

Thanks for replying.

We did in fact replace the complete controller, as Dell does not supply the Cache Backup Device as a spare part.

To try to resolve the issue, we also set the new controller offline and pulled it out for a couple of minutes until the battery was drained, then reinserted it and put it back online, with no success.

What baffles me is that the hardware is now optimal, with no issues.
But Recovery Guru still has the entry, and is now reporting this:

Cache Backup Device Failed. Unable to retrieve detailed information.

To begin with, it was reporting the location of the failed device:
Component reporting problem: Cache Backup Device (USB 2). Status: Failed.
Location: RAID Controller Module 1, enclosure 0, device slot USB 2.

To me, the problem seems to be that Recovery Guru isn't being cleared, as the hardware now reports optimal.

Any suggestions?

Best Regards
Claus

 

Moderator

2.3K Posts

July 18th, 2023 01:00

Hi Claus, I came across a similar thread (https://dell.to/3K4DEua). It is an old thread, but the same action might still be required.

Please take another look at the steps here: https://dell.to/3pWRFDh

Moderator

2.3K Posts

July 26th, 2023 23:00

Hi, sorry to hear that, and I agree with you. You could try Monitor > Report > Event Log, clear all, and then click Recheck in Recovery Guru. You can also try a few commands via SMcli. I researched again and found the command below on NetApp's website; it could be useful.

SMcli -n MD3820f -c "clear controllerFault [a|b];"

 

MD Series: SMCLI Commands for the Controller https://dell.to/3Ozzb5u

 

Dell PowerVault MD 34XX/38XX Series Storage Arrays CLI Guide https://dell.to/3OwQXpK
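
If it helps to see what the array itself reports, independent of the Recovery Guru window, the SMcli call above can be wrapped in a small script. This is only a rough sketch under some assumptions: that `SMcli` is on the PATH of the management station, that the array is registered under the name `MD3820f` as in the command above, and that `show storageArray healthStatus` is available in your CLI version (please verify it against the CLI guide linked above). The helper names `smcli_argv` and `run` are made up for illustration. By default it only prints the command it would run.

```python
from __future__ import annotations

import shlex
import subprocess

SMCLI = "SMcli"  # assumption: SMcli is installed and on PATH


def smcli_argv(array_name: str, script: str) -> list[str]:
    """Build the argument list for one SMcli script command.

    Follows the form used in the post: SMcli -n <array> -c "<command>;"
    """
    if not script.endswith(";"):
        script += ";"  # script commands are terminated with a semicolon
    return [SMCLI, "-n", array_name, "-c", script]


def run(array_name: str, script: str, dry_run: bool = True) -> str:
    """Return the command line (dry run) or the combined SMcli output."""
    argv = smcli_argv(array_name, script)
    if dry_run:
        # Only show what would be executed; nothing is sent to the array.
        return shlex.join(argv)
    result = subprocess.run(argv, capture_output=True, text=True, check=False)
    return result.stdout + result.stderr


if __name__ == "__main__":
    # Read-only query; the array name is taken from the command in the post.
    print(run("MD3820f", "show storageArray healthStatus"))
```

Running it with `dry_run=False` before and after clearing the event log would show whether the fault is still reported by the array or only by the management GUI.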

2 Posts

July 26th, 2023 23:00

Hi, thanks for the reply.

Sorry, no: we have already replaced the controller, and the module reports its state as Optimal.

Afterwards we took the controller offline and pulled it out to power-drain it (we also pulled the battery for a couple of minutes).


But there is still a message in the Recovery Guru section, as I explained earlier, with the text "Unable to retrieve detailed information".

 

It seems like "Recheck" is not doing its thing.

Please read the initial post to understand.

Is there any way to clear the message?

 

Best regards 

Claus
