Unsolved

1 Rookie

 • 

7 Posts

July 18th, 2024 16:26

PowerEdge r750 won't boot, drops to Driver Health Manager

I have an R750 that won't boot. It has a PERC H840 attached to an MD1400. The PERC is configured with 12 RAID0 virtual disks. One of the disks failed. I replaced the disk, and on reboot I was expecting to be able to get into the PERC's configuration so I could set up the replacement disk.

However, pressing F2 does not get me into the system BIOS; I only get to the Driver Health Manager. I see no option that will allow me to configure the new disk.

Is there any way to get past this Driver Health Manager?

1 Rookie

 • 

7 Posts

July 18th, 2024 18:44

Wanted to add this to my previous post.

I am considering this:

Once I get to the Driver Health Manager and select the PERC H840, I want to power off the MD1400.

Then clear the configuration.

Reboot and get into the system BIOS.

Hopefully it will show foreign configurations.

Import the foreign config.

Then create a new virtual disk as RAID0 and add the unconfigured disk to that virtual disk.

Any thoughts?

Moderator

 • 

3.7K Posts

July 18th, 2024 20:45

Hello,

 

If you haven't already, try a flea power drain and check the results:

 

Drain flea power: shut down, disconnect the power cables and network cables, and hold in the power button for 20 seconds with the cords removed. After the flea power drain, the system has to sit for 3 minutes without any power plugged in for the DRAC to reset. Then plug in the NIC and power, but wait 2 minutes before powering on to give the DRAC time to initialize.

 

At the Driver Health Manager you need to type a character, such as Y, in the input box and press Enter to proceed.

 

Then you should be able to get in to System Setup > Devices and check in the PERC controller for issues.

 

Also check in your BIOS settings that the boot mode is correct: UEFI or BIOS boot mode.

 

If you press F11 in POST it should bring up a boot menu. See if you can select your boot drive and boot.

 

In the DRAC, check the System Event Log (SEL) and the Lifecycle Controller log for any errors related to the issue.

 

 

I wouldn't recommend clearing the configuration, at least not yet.

 

If you are replacing a RAID0 drive and the replacement shows up as Foreign, you should clear the foreign configuration; then you can create a RAID0 on the replacement drive.
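For anyone who would rather do this from the running OS with perccli than from the controller BIOS, the sequence above can be sketched roughly as follows. This is only an illustration, not an official procedure: the helper name `replacement_commands` is made up for this example, and the controller, enclosure, and slot numbers are placeholders that you would need to read from your own `show` output. The script only prints the commands; it does not run them.

```python
import shlex


def replacement_commands(controller: int, enclosure: int, slot: int,
                         cli: str = "perccli") -> list[list[str]]:
    """Build (but do not run) the perccli commands for a single-disk RAID0 swap.

    All three numbers are assumptions supplied by the caller; check them
    against the controller's own 'show' output before running anything.
    """
    ctrl = f"/c{controller}"
    return [
        # 1. Confirm the state of the replacement drive.
        [cli, f"{ctrl}/e{enclosure}/s{slot}", "show"],
        # 2. List any foreign configuration on the controller.
        [cli, f"{ctrl}/fall", "show"],
        # 3. Clear the foreign configuration (destructive for that foreign config).
        [cli, f"{ctrl}/fall", "delete"],
        # 4. Create a one-drive RAID0 virtual disk on the replacement.
        [cli, ctrl, "add", "vd", "r0", f"drives={enclosure}:{slot}"],
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    for argv in replacement_commands(controller=1, enclosure=32, slot=5):
        print(shlex.join(argv))
```

Step 3 is only needed if step 2 actually reports a foreign configuration. On many Linux installs the binary is named `perccli64`, so pass `cli="perccli64"` if that is what you have.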

 

1 Rookie

 • 

7 Posts

July 19th, 2024 00:47

@DELL-Charles R I guess I should have been a little clearer. We set up RAID0 virtual disks with a single physical disk each to mimic a JBOD device. We have written in-house what we call the Distributed Data Manager. It is basically a software RAID, but it distributes the "slices" across multiple servers. So on each disk, each file is not really a file but a slice stored as a file. When a real file is stored, the system splits it up into slices and stores them on multiple nodes. When a file is requested, it reassembles the correct slices into a file.

Now on reboot the server doesn't respond to any of the function keys; it just dumps me into the Driver Health Manager.
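To make the slicing idea concrete for other readers, here is a minimal Python sketch. It is not the in-house Distributed Data Manager: the function names, the fixed slice size, and the round-robin placement are all assumptions chosen for illustration, and it leaves out any redundancy or networking.

```python
from itertools import cycle


def split_into_slices(data: bytes, slice_size: int) -> list[bytes]:
    """Cut a file's bytes into fixed-size slices (the last one may be shorter)."""
    if slice_size <= 0:
        raise ValueError("slice_size must be positive")
    return [data[i:i + slice_size] for i in range(0, len(data), slice_size)]


def place_slices(slices: list[bytes], nodes: list[str]) -> dict[str, list[tuple[int, bytes]]]:
    """Assign slices to nodes round-robin, remembering each slice's index."""
    placement: dict[str, list[tuple[int, bytes]]] = {node: [] for node in nodes}
    for index, (chunk, node) in enumerate(zip(slices, cycle(nodes))):
        placement[node].append((index, chunk))
    return placement


def reassemble(placement: dict[str, list[tuple[int, bytes]]]) -> bytes:
    """Collect slices from every node and join them in index order."""
    indexed = [pair for stored in placement.values() for pair in stored]
    indexed.sort(key=lambda pair: pair[0])
    return b"".join(chunk for _, chunk in indexed)


if __name__ == "__main__":
    original = b"abcdefghij"
    layout = place_slices(split_into_slices(original, 4), ["node-a", "node-b"])
    print(layout)
    print(reassemble(layout) == original)
```

In a layout like this, each single-disk RAID0 virtual disk just holds slice files, which is why losing one disk only means re-creating that one virtual disk rather than rebuilding an array.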

Have not tried the flea power drain yet, but I will give that a try.

Now to prevent this from happening in the future how do I replace a disk in this configuration?

When I replaced the disk while the system was up and running, perccli recognized the new disk and showed it as Unconfigured Good.

Should I have done a perccli start rebuild?

BTW, we are running Ubuntu 20.04 LTS.

1 Rookie

 • 

7 Posts

July 19th, 2024 00:51

@DELL-Charles R And in the Driver Health Manager I can see the physical disks. All of the original disks show Online; the replacement shows Ready. The virtual drive section only shows 11 of the original 12 VDs.

Moderator

 • 

3.4K Posts

July 19th, 2024 02:44

Hi,

 

Do you have iDRAC configured? You can do configuration checks on the PERC from there.

 

Just to clarify, the MD1400 VD is not your boot drive, right? It is only for mass storage? Since you are using RAID0 for the 12 disks, there is no redundancy, and nothing will rebuild after you replace the failed disk. The UEFI Driver Health Manager is the UEFI equivalent of the BIOS boot mode messages that are typically seen during POST. I don't have a unit with the exact same configuration to check, but you mentioned you are now able to access the PERC configuration and view the disk properties. Are you able to see any virtual disk management options?

 

In my opinion, it is not possible to rebuild a RAID0, so even if you get into the PERC utility, there isn't much you can do to rescue the RAID0. My personal thought is that you could force the failed drive online to bring the RAID0 back online and take a backup.

1 Rookie

 • 

7 Posts

July 19th, 2024 12:31

@DELL-Joey C The MD1400 is just for mass storage. The boot drive is an internal disk (RAID1) connected to an H755.

We do not have iDRAC configured. I don't need to rescue that one RAID0 disk; I just want the PERC to recognize it so I can create a new VD for it.

On some of our older servers I have been able to do this, but the controllers there are H730s, not an H840.

(edited)

Moderator

 • 

3.7K Posts

July 19th, 2024 13:30

Hello,

 

If the replacement drive is showing as Ready then you should be able to get in the H840 controller BIOS and configure it as a RAID0.

 

Check in the controller if it is showing as non-RAID or RAID capable. Convert to RAID capable if it is not already.

 

See "Convert to RAID capable" on page 71:

https://dell.to/4cKtNWM

 

1 Rookie

 • 

7 Posts

July 19th, 2024 15:11

@DELL-Charles R​ That is the problem!

I cannot get into the H840 controller BIOS.

I just keep getting thrown into the Driver Health Manager.

When I am in the Driver Health Manager I can see that the physical disk is in a Ready state.

The other 11 disks are shown as online like I expect.

(edited)

1 Rookie

 • 

7 Posts

July 19th, 2024 15:26

@DELL-Charles R Actually, I think clearing the configuration may be the quickest and easiest solution. Our data management system can handle a rebuild of all of the data.

Moderator

 • 

3.7K Posts

July 19th, 2024 15:44

Hello Craig, Sounds good. Let us know how it goes.

 

You might consider contacting Support directly at 1-800-945-3355, and a technician can do a remote session with you to take a look.

The forum is not capable of that type of engagement.
