- Fan failure, fan missing, fan damaged
- Outdated firmware
- Disrupted communication with the integrated Dell Remote Access Controller (iDRAC), Baseboard Management Controller (BMC), or Chassis Management Controller (CMC, or OME-M for MX chassis)
- Installed unsupported hardware
- An incomplete second CPU upgrade (system type dependent) or a general upgrade to the machine that requires a different type of fan to be installed
- Temperature exceeding what normal fan speeds can cover (heavy workload leading to high CPU usage and temperature, or poor airflow)
- The system cover is off or incorrectly installed. The intrusion switch might be triggered or not working.
- Configuration settings
- Inlet temperature sensor failure or false readout
In this scenario at least one fan or fan assembly (contains two fans) is either damaged (connector, fan blade, fan blade frame), missing or failed.
In order to identify the fan assembly or fan that is causing the issue, follow these steps in order:
- Check the front LCD or system event log to see which fan has been reported.
- Once we know which fan was reported as faulty, check the fan number positioning on the lid (or consult your server user guide) and see if the fan is running or not.
Caution: Be careful when opening the lid of the server without turning it off to check the fans. Elements of the inside might be hot or sharp or both.
- If the fan is turning slowly, not turning at all, or making irregular noises (scraping, scuffing), turn the machine off and remove the fan assembly for inspection.
- Scuffing and scraping of fans should leave visible scratches.
- Debris or dust can sometimes cause the fan to run irregularly. Cleaning the fan might help in this case.
- Check the connector on the motherboard or fan control board and the connector on the fan to see if there is any damage to either.
- If there is no fan damage or connection issue, reinstall the fan, shroud (if any), the chassis cover and turn the machine back on.
Note: PSUs have fans attached to them or integrated and should be checked for damage in addition to all assemblies.
Note: The modular chassis M1000E and VRTX have all the fans available for inspection on the outside. For more details, consult your user guide.
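As a rough illustration of the first step above (finding which fan was reported), the following Python sketch scans exported log text for fan identifiers. The sample lines and the regular expression are assumptions made for this example. Real system event log wording differs between systems and firmware versions, so treat this as a starting point only.

```python
import re

# Hypothetical sample lines for illustration only. Real event log wording
# and layout differ between systems and firmware versions.
SAMPLE_LOG = """\
2024-01-10 08:15:02 Warning Fan 3 RPM is below the warning threshold
2024-01-10 08:15:09 Info The system cover was closed
2024-01-10 08:16:40 Critical Fan 3 failed
2024-01-10 08:20:11 Critical Fan 5A is removed
"""

# Assumed pattern: the word "fan" followed by a number and an optional letter.
FAN_ID = re.compile(r"\bfan\s*(\d+[A-Za-z]?)\b", re.IGNORECASE)


def reported_fans(log_text):
    """Return the fan identifiers mentioned in the log, in first-seen order."""
    seen = []
    for line in log_text.splitlines():
        for match in FAN_ID.finditer(line):
            fan = match.group(1).upper()
            if fan not in seen:
                seen.append(fan)
    return seen


print(reported_fans(SAMPLE_LOG))
```

The identifiers found this way still need to be matched against the fan numbering printed on the lid or in the server user guide.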
If the fan is still reported as faulty, check the next possibility on this list.
Outdated firmware can cause fans to spin at high speed (and make noise) even when nothing else is wrong. This is common when only parts of the firmware were updated and some element in the chain of sensor data collection was left out.
The following is a list of firmware versions that should be checked for updates as the next step of investigation:
- iDRAC, CPLD, BIOS
- PERC, BOSS, backplane, NVMe drives, SAS/SATA drives
- NIC, any other PCIe card
- Power supplies (PSU)
- Any other hardware
Note: The first set of updates (iDRAC, BIOS, CPLD) must be done as single updates and should not be combined with any other updates.
When you use the iDRAC to update firmware, the updates above are listed in order of importance, left to right and top to bottom.
Each list item serves as a guide to which updates can be applied together (except for the first item, whose updates must be applied one at a time).
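The ordering rule above can be sketched in Python as follows. The group contents are copied from the list in this section. The function name `plan_update_batches` and the idea of representing each update run as a list are assumptions made for illustration, not part of any Dell tool.

```python
# Groups in the order given in the list above. The names are plain labels.
UPDATE_GROUPS = [
    ["iDRAC", "CPLD", "BIOS"],
    ["PERC", "BOSS", "backplane", "NVMe drives", "SAS/SATA drives"],
    ["NIC", "other PCIe cards"],
    ["PSU"],
    ["other hardware"],
]


def plan_update_batches(groups):
    """Split the first group into single updates; keep later groups as batches."""
    if not groups:
        return []
    # Each component of the first group gets its own update run.
    batches = [[component] for component in groups[0]]
    # Every later group can be applied as one combined run.
    batches.extend(list(group) for group in groups[1:])
    return batches


for step, batch in enumerate(plan_update_batches(UPDATE_GROUPS), start=1):
    print(f"Step {step}: {', '.join(batch)}")
```

With the groups shown, this yields seven runs: three single updates followed by four combined ones.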
Once the firmware is up to date, move on to the next item on the list.
When the iDRAC, BMC or CMC/OME-M loses connection to the sensor suite, the fans return to the unmanaged speed (full) to protect the system from overheating.
This is why you can hear the fans spin up and then come back down when the system is first turned on. It takes a few minutes for the iDRAC, BMC, or CMC/OME-M to boot up and start regulating fan speed.
Note: When the iDRAC or BMC is not ready, a timeout message should appear during POST. The LCD (if present) stays blank. A modular system might not power on in the chassis because it cannot communicate with the CMC. In this case, contact our support team.
In order to troubleshoot this matter, do the following:
- For all iDRAC systems, press and hold the i-button for 16 s.
- For a system with a BMC, or if the previous step did not work:
  - Power down the server.
  - Remove the power cables from it.
  - Press and hold the power button for 10 s.
  - Reconnect the power cables.
  - Wait for about 2 minutes.
  - Turn the server back on.
- For systems with a CMC or OME-M:
  - If two CMCs or OME-Ms are installed, follow the failover procedure to fail over to the other unit.
  - If only a single CMC or OME-M is installed, remove the module from the chassis, wait for 2 minutes, reinsert the module, and wait for 20 minutes.
  - If reseating the module or the failover did not work, the chassis must be restarted for a complete reinitialization:
    - Schedule downtime for all servers and attached devices that rely on the chassis being up.
    - Power the servers off, then power the chassis off.
    - Remove the power cables.
    - Wait at least 10 minutes or press and hold the power button (if any).
    - Reconnect the power cables.
    - Turn the chassis back on and wait for 20-30 minutes.
    - Turn the servers back on.
    - Reconnect to the chassis externally once everything is back up and running without any errors or fan noise.
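Several of the steps above involve waiting for the management controller to come back. One way to avoid guessing is to poll until its web interface answers again. The following Python sketch assumes that the controller serves its interface on TCP port 443 and that a successful TCP connection is a reasonable sign of readiness. Both are assumptions for this example, and the helper names are hypothetical.

```python
import socket
import time


def port_open(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_until_reachable(probe, max_wait_s, interval_s=10.0, sleep=time.sleep):
    """Call probe() until it returns True or max_wait_s has elapsed."""
    waited = 0.0
    while True:
        if probe():
            return True
        if waited >= max_wait_s:
            return False
        sleep(interval_s)
        waited += interval_s


# Example call (not executed here): wait up to 20 minutes for a module at a
# placeholder address taken from the documentation range.
# wait_until_reachable(lambda: port_open("192.0.2.10"), max_wait_s=20 * 60)
```

A reachable port only shows that the controller has booted. Fan regulation can still take a little longer to settle.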
If you still experience the same fan noise, continue exploring the list.
Unsupported hardware or third-party hardware that has not (or not yet) been certified might cause the system to run the fans higher than normal or even at maximum speed.
To troubleshoot this, do the following:
- Check that the device is working.
- Check that the device is correctly installed (in the right type of slot, if applicable).
- The iDRAC might raise the fan speed for specific devices, or by default if the device is unknown to it.
- To proceed, remove the third-party device and see if the fan noise returns to normal.
- If it does, consult your third-party vendor to see if they know of any mitigation or have any recommendations regarding usage of the device in a Dell PowerEdge server.
Note: Dell cannot support your third-party device and cannot guarantee its function within the system.
If you have followed the list up to this point and still need more support, continue to follow it further down.
If you have upgraded the system or are upgrading the system, some upgrades require additional parts (fan, memory DIMMs) or different fan types (upgrade from standard to silver or even gold fans).
These upgrades are (non-exhaustive list, consult your Sales Representative):
- Second CPU upgrade for systems which can be bought with a single CPU and can house 2 CPUs (system type dependent)
- This likely requires the removal of blanks, the additional CPU with the identical stepping, additional memory, and often one additional fan
- Some systems might even need all the fans to be upgraded from standard to silver or to gold fans (system and upgrade specific requirements)
- GPU or GPGPU upgrades for systems that support them
- This likely requires additional risers and supporting cabling but also additional cooling depending on the original layout and fans already installed.
- Additional PCIe cards or NVMe drives
- This likely requires a check that cooling still meets requirements after the new parts are installed, as those requirements might dictate additional fans or different, more powerful fan types.
If you have followed these and are sure that the issue is not listed so far, continue to follow the list.
When systems come under heavy load, CPUs (and other parts as well) use more power, which results in a higher-than-normal cooling requirement.
It is also possible that the fan speed has increased over time because airflow is restricted, either by a poorly ventilated space or by an obstruction, typically dust buildup.
Check the following steps to see which issue is present here and what steps can be taken to mitigate or eliminate the issue:
- Check if the CPU usage is under constant high load (90-100%)
- If so, check why that is and whether this is expected behavior. Is a normal workload causing it, or something unknown, for instance if it started after a recent update or upgrade of the operating system (OS)?
- If the behavior is not considered normal, investigate the load further by understanding which application or service is causing the high load.
- If the behavior occurs during seemingly normal operations and there have been no recent updates (or reboots, intended or unintended) to the software of the machine, your workload might have outgrown the hardware it is running on. This is especially likely if you have multiple systems running similar workload types under a similar load and showing the same problem. In that case, talk to a sales representative about options for scaling or upgrading.
- Check if the inlet vents are obstructed or restricted, or if the fans themselves are obstructed or restricted in any way
- Dust buildup over time is normal. A completely dust-free environment is hard to maintain and, depending on the circumstances, can be unrealistic. Regular maintenance that physically removes dust from the machine and keeps the air flowing is therefore a must. Include it in all maintenance schedules at least once a year, and more often the more the machine is exposed to dust.
- If you find your vents or fans are obstructed, schedule maintenance for the machine and clear them of all dust and obstructions. You can find some details in Guidance for Keeping Your Dell Technologies Equipment Clean.
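To make the "constant high load (90-100%)" check above more concrete, the following Python sketch classifies a series of CPU utilization samples. How the samples are collected is left open. The 95% share of samples required to call the load "constant" is an assumption chosen for this example and is not a Dell threshold.

```python
def is_sustained_high_load(samples, threshold=90.0, min_fraction=0.95):
    """Return True if nearly all utilization samples (percent) reach threshold.

    threshold follows the 90-100% range named in the text above;
    min_fraction is an assumed cut-off for what counts as "constant".
    """
    if not samples:
        return False
    high = sum(1 for value in samples if value >= threshold)
    return high / len(samples) >= min_fraction


# Hypothetical samples, for example one reading per minute.
busy = [97, 93, 99, 95, 91, 98]
bursty = [20, 95, 30, 99, 15]

print(is_sustained_high_load(busy))
print(is_sustained_high_load(bursty))
```

A bursty pattern such as the second series points more toward individual jobs than toward a machine that has reached its limit.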
If you have the same issue after following this, explore the list further.
Some systems require the system cover to be closed and the intrusion switch to be in the closed state (pressed). If the cover is not installed and the intrusion switch is triggered as a result, the fan speed increases to maximum as a precaution.
On those systems, this can also happen because of a faulty intrusion switch: a broken switch is always open, so it is always triggered and reports an open system cover.
Check the following:
- Remove the system cover and reinstall it, making sure that it fits correctly.
- It is useful to test this on a test bench or workbench with power available outside of the rack to ensure a safe environment.
- This will also allow for better visibility regarding fit of the system cover and any damage to the holder of the intrusion switch or the switch itself.
- Check that the switch is seated correctly, that it triggers when released, and that it clears when pressed.
- Triggering the intrusion switch generates an entry in the system event log (found in the iDRAC of the system).
- Close the system cover correctly, inspect the fit, and ensure that all parts fit correctly together.
If you still have need of further help after this, consult the list for another topic.
The iDRAC controls the thermal settings of the machine, making sure that all parts are cooled correctly. These settings can be changed manually to increase or decrease the fan speed offset or to change the default thermal profile. Changing the profile from the default can also increase the fan speeds.
If you are uncertain of the settings used, you can use the following steps to reset the settings:
- During the POST, press F2
- Select System Services
- Find Defaults in the bottom right corner and select it
- Select Exit
- When prompted, select Save and Reboot
- Once rebooted, press F2 during POST again
- Select iDRAC settings > Thermal
- Make sure that there are no settings set or selected and the profile is showing the default Thermal Profile Settings (Max Performance).
- Finish and reboot.
If you have been through this part and have not found a solution yet, consider checking the list above. If you have exhausted this list, collect the support log file, the Technical Support Report (TSR), and contact our support team.
You might come across a warning message in the System Event Log (SEL) of the iDRAC advising that the inlet temperature sensor failed or that the readout is higher than expected (the measured environmental temperature does not closely match the sensor output). The sensor measures the temperature at the front of the machine, and the iDRAC uses that data to calculate the cooling needs. As a result, a faulty or incorrectly measuring sensor results in higher or maximized fan speeds.
Note: For lower fan speeds at default settings and normal workloads, typical inlet temperatures range from 21 °C to 26 °C (70 °F to 79 °F). The server can operate at higher temperatures but must increase fan speeds to compensate.
In order to troubleshoot this matter, do the following:
- Check the SEL for the warning or error message
- If you have not carried out the actions outlined in the firmware section, follow that section to rule out firmware as the cause of the mismatched readings.
- Check the SEL again after all firmware updates are complete.
- Check the inlet temperature in the iDRAC web interface and see if it is still higher than expected or not reading at all.
- If the matter persists, collect a new TSR and contact our support team.
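The comparison described above, between the sensor readout and a temperature measured by hand at the front of the machine, can be sketched in Python as follows. The 21 °C to 26 °C range comes from the note in this section. The 3 °C tolerance and the result labels are assumptions made for this example only.

```python
TYPICAL_INLET_C = (21.0, 26.0)  # typical range quoted in the note above


def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0


def assess_inlet(sensor_c, ambient_c, tolerance_c=3.0):
    """Classify an inlet reading against a hand-measured ambient temperature.

    tolerance_c is an assumed allowance for measurement differences.
    """
    if sensor_c is None:
        return "no reading"
    if abs(sensor_c - ambient_c) > tolerance_c:
        return "sensor mismatch"
    if sensor_c > TYPICAL_INLET_C[1]:
        return "warm inlet"
    return "ok"


print(assess_inlet(24.0, 23.0))
print(assess_inlet(40.0, 23.0))
print(assess_inlet(30.0, 29.0))
```

A "sensor mismatch" result corresponds to the case where the readout does not closely match the environment, while "warm inlet" means that the sensor agrees with the room and the room itself is warmer than the typical range.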
Here you can go back to the list.