
Unsolved


1 Rookie

 • 

3 Posts


July 12th, 2022 08:00

PowerEdge R720 fans losing communication

Has anyone else seen this?

We have 3 PowerEdge R720 servers that have started showing loss of communication errors in OMSA on all System Board Fans over the past couple of weeks:

  1. All fans lose communication at the same time
  2. All fans regain connectivity less than a minute later and then read fine
  3. There is no common pattern that I can see (e.g., it does not happen right after a reboot).

The error messages look like this:

Severity: Warning, Category: System Health, MessageID: FAN0023, Message: Unable to read the System Board Fan3 sensor value.
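For anyone collecting these alerts from several servers, here is a minimal Python sketch that splits a line in the format shown above into fields. It assumes every alert follows the same comma-separated "Key: value" layout as the sample; `parse_alert` and the field names are hypothetical helpers for illustration, not part of OMSA.

```python
import re

# Matches the "Key: value, Key: value, ..." layout of the alert quoted above.
# Assumption: other alerts exported from OMSA use the same layout.
ALERT_RE = re.compile(
    r"Severity:\s*(?P<severity>[^,]+),\s*"
    r"Category:\s*(?P<category>[^,]+),\s*"
    r"MessageID:\s*(?P<message_id>[^,]+),\s*"
    r"Message:\s*(?P<message>.+)"
)
# Pulls the fan name (e.g. "Fan3") out of the message text.
FAN_RE = re.compile(r"System Board (Fan\d+)")


def parse_alert(line):
    """Return a dict of alert fields, or None if the line does not match."""
    match = ALERT_RE.search(line)
    if match is None:
        return None
    fields = {key: value.strip() for key, value in match.groupdict().items()}
    fan = FAN_RE.search(fields["message"])
    fields["fan"] = fan.group(1) if fan else None
    return fields


sample = (
    "Severity: Warning, Category: System Health, MessageID: FAN0023, "
    "Message: Unable to read the System Board Fan3 sensor value."
)
alert = parse_alert(sample)
print(alert["message_id"], alert["fan"])
```

Grouping the parsed alerts by `message_id` and `fan` makes it easier to confirm that every fan reports the same event within the same minute.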

All 3 servers have iDRAC7 and have been on the latest firmware (2.65.65.65) since 2020.

OMSA was updated on all 3 servers at the end of May this year (v 10.2.0.0).

I ran a system diagnostic on one of the servers and it didn't report any problems.

Could it be a glitch in the latest version of OMSA?

It seems like it has to be... if not, it's a pretty big coincidence that all 3 of our R720s have the issue at the same time.

 

Moderator

 • 

3.7K Posts

July 12th, 2022 13:00

Hello Mark,

 

I would say that that would be a pretty big coincidence.

 

You say all fans drop out at the same time. Does it happen on all three servers at the same time?

 

Are there other R720 in the environment that do not have the issue?

 

You mention it has no pattern. Does that mean it does not happen every day or certain time of day?

 

Did it start when OMSA was updated?

Did the previous version report the fan error?

What was the previous OMSA version you were on?

 

Is the BIOS 2.9.0 up to date also?

 

When OMSA reports the fan issue, is there also an entry in the iDRAC System Event Log (SEL)?

 

Have you taken a direct look at the fans to see if they are stopped?

Try reseating all the fans; they are hot-pluggable. Confirm they seat properly and that you don't see any damage to the connectors or the fans. You could try this on one server first and monitor it, or do it on all three.

 

Moderator

 • 

3.7K Posts

July 13th, 2022 07:00

Hello Mark,

 

If it is not reported in the SEL and only shows up in OMSA, it may be a software issue.

 

You might monitor it to see if it becomes more frequent. If it does, you might try the previous version.

OMSA  v.10.1

https://dell.to/3O5A02m

 

1 Rookie

 • 

3 Posts

July 13th, 2022 07:00

Thanks Charles!

 

You say all fans drop out at the same time. Does it happen on all three servers at the same time?

No, it's happened twice on one server (6/21/22 and 6/30/22) and once each on the other 2 servers (7/3/22 and 7/7/22).

Are there other R720 in the environment that do not have the issue?

No. These are the only R720s we have. We have an R710 and a couple of R730s that have not had the problem.

You mention it has no pattern. Does that mean it does not happen every day or certain time of day?

It's only happened the 4 times. The days of the week are all different, but all 4 occurrences have been shortly after midnight (12:42 am, 12:26 am, 1:33 am, 12:00 am).

Did it start when OMSA was updated?

OMSA was updated on 5/21/22 and the first fan issue was on 6/21/22, so it wasn't immediately after the update, but that's the only related thing that was changed.

Did the previous version report the fan error?

No

What was the previous OMSA version you were on?

9.5.0

Is the BIOS 2.9.0 up to date also?

Yes

When OMSA reports the fan issue, is there also an entry in the iDRAC System Event Log (SEL)?

No, this only shows in OMSA. Nothing in the iDRAC System log or Lifecycle log.

Have you taken a direct look at the fans to see if they are stopped?

No. Since it's random and only lasts for 30 seconds, checking while it occurs would be tough (plus the servers are not on site). If the fans were actually stopped for any length of time, I would expect to see spikes in temp (or a fried server).

Try reseating all the fans; they are hot-pluggable. Confirm they seat properly and that you don't see any damage to the connectors or the fans. You could try this on one server first and monitor it, or do it on all three.

Not really feasible at the moment since the servers are offsite, plus we're back in the "pretty big coincidence" area if the fans on all 3 servers came unseated within a few weeks of each other.
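The occurrence dates and times listed above can be checked for a pattern with a few lines of Python. This is only a sketch: it assumes the dates are month/day/year, and it does not pair each time with a date because the thread does not say which time belongs to which day. The function names are made up for illustration.

```python
from datetime import date, time

# Dates from the thread, read as month/day/year (an assumption about the format).
occurrence_dates = [
    date(2022, 6, 21),
    date(2022, 6, 30),
    date(2022, 7, 3),
    date(2022, 7, 7),
]

# Times from the thread; "12:xx am" is hour 0 in 24-hour notation.
occurrence_times = [time(0, 42), time(0, 26), time(1, 33), time(0, 0)]

# date.weekday() returns 0 for Monday through 6 for Sunday.
DAY_NAMES = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def weekday_names(dates):
    """Return the weekday name for each date."""
    return [DAY_NAMES[d.weekday()] for d in dates]


def minutes_past_midnight(times):
    """Return how many minutes after 00:00 each time falls."""
    return [t.hour * 60 + t.minute for t in times]


print(weekday_names(occurrence_dates))
print(minutes_past_midnight(occurrence_times))
print(max(minutes_past_midnight(occurrence_times)))
```

For these inputs the dates come out as Tuesday, Thursday, Sunday, and Thursday, and the latest alert is 93 minutes past midnight. A window that tight might be worth comparing against anything scheduled around midnight on the servers, though that is only a guess.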

1 Rookie

 • 

3 Posts

July 13th, 2022 11:00

Thanks Charles.

 

I'll monitor it to see what happens.

 
