Table of Contents
- Description
- Identifying a CPU IERR in the System Event Log
- Resolving a CPU IERR
- Operating System Issues
Description
The CPU Internal Error (CPU IERR), or CPU Machine Check error, is usually not an error of the CPU itself, but an indication that the CPU has detected an error in the system or received an erroneous instruction from a system component. It is typically caused by a non-CPU event, such as a firmware mismatch, a system bus interruption, or a memory read/write interruption. In theory, any system component, software or hardware, can cause the error.
This article describes best practices for dealing with these errors and is valid for all PowerEdge servers.
Warning: Do not remove the CPU! CPU IERR errors are rarely caused by a CPU malfunction, and the reference to the CPU reflects only which module reported the error. Despite what you may read on some troubleshooting websites or forums, it is imperative that you do not remove the CPU unless you are trained and equipped to do so.
Identifying a CPU IERR in the System Event Log
A CPU Internal Error shows in the System Event Log as "CPU 1 has an internal error (IERR)" or "CPU 2 has an internal error (IERR)."
Figure 1: DSET showing CPU IERR
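When the System Event Log has been exported as text, the entries can also be located with a simple pattern match. The following is a minimal sketch only: the message wording comes from this article, while the function name, the timestamp layout, and the sample lines are made up for illustration and do not reflect the exact export format of any particular tool.

```python
import re

# Message wording is taken from this article; everything else on the line
# (timestamp, severity) is an assumed layout used only for illustration.
IERR_PATTERN = re.compile(r"CPU (\d+) has an internal error \(IERR\)")


def find_ierr_entries(lines):
    """Return (line_number, cpu_number) for each line that reports a CPU IERR."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = IERR_PATTERN.search(line)
        if match:
            hits.append((number, int(match.group(1))))
    return hits


# Hypothetical sample lines, not real log output.
sample_log = [
    "2021-03-04 10:15:02 Critical CPU 1 has an internal error (IERR).",
    "2021-03-04 10:15:03 Info Example unrelated event.",
    "2021-03-04 10:20:41 Critical CPU 2 has an internal error (IERR).",
]

print(find_ierr_entries(sample_log))  # prints [(1, 1), (3, 2)]
```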
Resolving a CPU IERR
To resolve this error, follow a structured troubleshooting plan to determine which component caused the error and how to resolve it.
1. Check the System Event Log for any other errors that occurred around the same time as the CPU IERR.
2. If any other errors are identified, resolve these errors first. The resolution depends on the error identified.
3. Update the BIOS and iDRAC firmware to the latest version.
- Updating the BIOS or the iDRAC through the iDRAC interface is explained here - Updating Firmware and Drivers on Dell EMC PowerEdge Server.
- If the iDRAC is not available, other update methods are listed in the following tutorial - How to update firmware remotely using the Integrated Dell Remote Access Controller (iDRAC) web interface.
4. Clear the System Event Log, for example in OpenManage Server Administrator or iDRAC (in either, open the event log, scroll to the bottom, and press Clear Log). Old CPU IERR entries continue to raise an alert after the error has been resolved unless they are cleared from the System Event Log.
5. If no errors are found, or the CPU IERR returns, shut down the system, remove the power cable, and hold the server power button for 20 seconds before plugging the power cable back in and turning the system on again. This process is known as a Flea Power Drain.
6. If the error persists, contact technical support for further assistance. Contact options are provided below.
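Step 1 above, looking for other errors logged around the same time as the CPU IERR, can be sketched in a few lines once the log entries have been parsed. This is an illustrative sketch under stated assumptions: entries are assumed to be (timestamp, message) pairs, the five-minute window is an arbitrary default rather than a recommended value, and the function name is hypothetical.

```python
from datetime import datetime, timedelta


def events_near_ierr(entries, window=timedelta(minutes=5)):
    """Return non-IERR entries logged within `window` of any CPU IERR entry.

    `entries` is assumed to be a list of (datetime, message) tuples that
    were already parsed from the System Event Log.
    """
    ierr_times = [ts for ts, msg in entries if "(IERR)" in msg]
    return [
        (ts, msg)
        for ts, msg in entries
        if "(IERR)" not in msg
        and any(abs(ts - ierr_ts) <= window for ierr_ts in ierr_times)
    ]


# Hypothetical entries for illustration only.
entries = [
    (datetime(2021, 3, 4, 9, 0, 0), "Example event long before the IERR"),
    (datetime(2021, 3, 4, 10, 13, 30), "Example event shortly before the IERR"),
    (datetime(2021, 3, 4, 10, 15, 2), "CPU 1 has an internal error (IERR)"),
]

for ts, msg in events_near_ierr(entries):
    print(ts.isoformat(), msg)
```

Entries returned by this kind of filter are the ones to resolve first, as described in step 2.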
Operating System Issues
Some operating system events can cause a CPU IERR to be recorded in the System Event Log. These include the following:
- Fatal kernel errors,
- Third-party program interactions,
- Runtime critical stops, or
- Resource overcommitment.
In these cases, the CPU identifies the process as unrecognized and asserts the CPU IERR in response.
If the CPU IERR has been caused by an operating system event, check the Operating System Event Log and cross-reference it with the Server System Event Log to identify the operating system event that caused the CPU IERR. Once this event has been identified, contact the operating system provider for assistance with resolution.
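The cross-referencing described above amounts to listing the operating system events that were logged shortly before the CPU IERR timestamp. The sketch below is one possible way to do that and is not a vendor procedure: it assumes both logs have already been parsed into timestamps on the same clock and time zone, and the function name, the ten-minute lookback, and the sample events are all hypothetical.

```python
from datetime import datetime, timedelta


def os_events_before_ierr(ierr_time, os_events, lookback=timedelta(minutes=10)):
    """Return OS events logged within `lookback` before `ierr_time`, closest first.

    `os_events` is assumed to be a list of (datetime, message) tuples whose
    timestamps use the same clock and time zone as the System Event Log.
    """
    candidates = [
        (ts, msg)
        for ts, msg in os_events
        if timedelta(0) <= ierr_time - ts <= lookback
    ]
    # The smallest gap to the IERR timestamp sorts first.
    return sorted(candidates, key=lambda event: ierr_time - event[0])


# Hypothetical data for illustration only.
ierr_time = datetime(2021, 3, 4, 10, 15, 2)
os_events = [
    (datetime(2021, 3, 4, 10, 7, 0), "Example OS warning"),
    (datetime(2021, 3, 4, 10, 14, 50), "Example OS critical stop"),
    (datetime(2021, 3, 4, 10, 30, 0), "Example OS event after the IERR"),
]

for ts, msg in os_events_before_ierr(ierr_time, os_events):
    print(ts.isoformat(), msg)
```

If the server and the operating system keep time differently, the timestamps need to be normalized before a comparison like this is meaningful.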