Dell PowerEdge 13G - Possible Reboot After "Correctable Memory Errors"

Summary: How to resolve a reboot that follows "Correctable memory error rate exceeded for DIMM_xx" on certain PowerEdge 13G servers.


Symptoms

iDRAC logs the following event: MEM0702 Correctable memory error rate exceeded for DIMM (Bank/Slot)

 

Cause

Table of Contents

1. Description
2. Solution
3. Further Information
 

 


Description

A correctable memory error is a single-bit error: a bit erroneously changes from 1 to 0, or from 0 to 1, during a read or write operation. Once the specific bit in error is identified, the error is corrected by complementing (flipping) that bit. Dell-certified DIMMs perform this correction automatically.
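The "identify the bit, then complement it" step can be illustrated with a textbook Hamming(7,4) code. This is only a sketch for illustration: the function names are hypothetical, and the ECC scheme actually used by server memory is wider and is not described in this article.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit Hamming codeword."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    # Codeword layout, positions 1..7: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (corrected codeword, 1-based position of the bad bit, or 0 if none)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3  # the syndrome names the bit in error
    if position:
        c[position - 1] ^= 1  # complement the erroneous bit
    return c, position


stored = hamming74_encode([1, 0, 1, 1])
corrupted = list(stored)
corrupted[4] ^= 1  # simulate a single-bit flip at position 5
repaired, bad_position = hamming74_correct(corrupted)
```

In this sketch a single flipped bit is located by the syndrome and flipped back, so `repaired` equals `stored`. Two simultaneous flips would not be corrected by this code.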
In rare instances, a server may reboot after a correctable memory error is recorded in the System Event Log (SEL). This has only been seen with BIOS version 2.3.x.

Example:

MEM0701 Warning Correctable memory error rate exceeded for DIMM_xx.
MEM0702 Critical Correctable memory error rate exceeded for DIMM_xx.


LC Log example:

2017-03-07 23:08:02 SYS1003 System CPU Resetting.
2017-03-07 23:08:02 SYS1001 System is turning off.
2017-03-07 23:08:02 MEM0702 Correctable memory error rate exceeded for DIMM_xx.
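
To check an exported log for this pattern, the entries can be scanned for a reset event logged close in time to a MEM0702 event. The following is a minimal sketch that assumes each line has the "date time event-ID message" layout of the example above; real Lifecycle Controller exports may use a different layout, and the function names and the 5-second window are hypothetical choices.

```python
import re
from datetime import datetime, timedelta

# Assumed line layout, taken from the example: "YYYY-MM-DD HH:MM:SS ID message"
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+([A-Z]+\d+)\s+(.*)$")
RESET_IDS = {"SYS1003", "SYS1001"}  # reset / power-off events from the example

SAMPLE = [
    "2017-03-07 23:08:02 SYS1003 System CPU Resetting.",
    "2017-03-07 23:08:02 SYS1001 System is turning off.",
    "2017-03-07 23:08:02 MEM0702 Correctable memory error rate exceeded for DIMM_xx.",
]


def parse_lc_line(line):
    """Return (timestamp, event_id, message), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if not match:
        return None
    timestamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    return timestamp, match.group(2), match.group(3)


def resets_near_mem_errors(lines, window_seconds=5):
    """List reset events logged within window_seconds of any MEM0702 event."""
    events = [e for e in (parse_lc_line(line) for line in lines) if e]
    window = timedelta(seconds=window_seconds)
    mem_times = [ts for ts, event_id, _ in events if event_id == "MEM0702"]
    return [
        (ts, event_id, message)
        for ts, event_id, message in events
        if event_id in RESET_IDS and any(abs(ts - m) <= window for m in mem_times)
    ]


hits = resets_near_mem_errors(SAMPLE)
```

A non-empty result only suggests the pattern described in this article; it does not by itself confirm the cause of the reboot.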

 

 

Resolution


Solution

To resolve the reboot issue, update the BIOS to the latest available version. If that is not possible for operational reasons, update the BIOS to at least the minimum version listed below for the model:

 
Model                 Minimum BIOS version
R430                  2.4.2
T430                  2.4.2
R530                  2.4.2
T630                  2.4.2
R630                  2.4.3
R730                  2.4.3
R830                  1.4.2
C4130                 2.4.2
C6320                 2.4.2
All modular blades    2.4.2
Table 1: Relevant BIOS versions and models
 
Note: The T130, R230, T330, R330, and R930 are not affected by this issue.
Note: If correctable memory errors still occur after the BIOS update, follow the standard memory troubleshooting process.
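
Comparing an installed BIOS version against Table 1 is a component-by-component numeric comparison, not a string comparison. The sketch below assumes versions are always dotted integers such as 2.4.3; the dictionary and function names are hypothetical, and the "All modular blades" row is left out because the table does not enumerate the blade models.

```python
# Minimum BIOS versions copied from Table 1 (modular blades omitted).
MIN_BIOS = {
    "R430": "2.4.2",
    "T430": "2.4.2",
    "R530": "2.4.2",
    "T630": "2.4.2",
    "R630": "2.4.3",
    "R730": "2.4.3",
    "R830": "1.4.2",
    "C4130": "2.4.2",
    "C6320": "2.4.2",
}


def parse_version(version):
    """Turn '2.4.3' into (2, 4, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def needs_update(model, installed_version):
    """Return True if below the Table 1 minimum, False if not, None if the model is not listed."""
    minimum = MIN_BIOS.get(model.upper())
    if minimum is None:
        return None
    return parse_version(installed_version) < parse_version(minimum)
```

For example, `needs_update("R630", "2.3.4")` returns `True`, while `needs_update("R630", "2.4.3")` returns `False`. Comparing tuples avoids the string-comparison trap where "2.10.0" would sort before "2.4.2".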

 


Further Information

This issue has primarily been reported on the PowerEdge R630 and R730; however, the potential exists on all 13G servers running BIOS version 2.3.x. BIOS version 2.3.x added enhanced logging of Serial Presence Detect (SPD) data, and that change introduced this issue:

"A NULL pointer dereferencing in BIOS enhanced SPD logging after memory correctable error critical threshold exceeded, would cause system to machine check or lock up."

The minimum BIOS versions listed above for the affected platforms fix the server reboot that accompanies the "Correctable memory error rate exceeded" message.

Affected Products

PowerEdge C6320, PowerEdge FC430, PowerEdge FC630, PowerEdge FC830, PowerEdge M630, PowerEdge M630 (for PE VRTX), PowerEdge M830, PowerEdge M830 (for PE VRTX), PowerEdge R430, PowerEdge R530, PowerEdge R530xd, PowerEdge R730, PowerEdge R730xd, PowerEdge R830, PowerEdge R930, PowerEdge T630 ...
Article Properties
Article Number: 000141221
Article Type: Solution
Last Modified: 18 Jul 2023
Version:  5