Dell PowerEdge servers equipped with add-in PCIe network adapters may report an unknown health status for those adapters in the iDRAC9 management interfaces when running iDRAC9 5.xx.xx.xx firmware versions. When this condition occurs, the Lifecycle Log records an HWC8607 error alert.
Impacted iDRAC9 Firmware Versions:
Within the iDRAC9 user interface (UI), the status of the impacted network adapter is shown as a question mark '?' when its health status is unknown.
Example:
iDRAC9 UI > System > Network Devices
Lifecycle Log Error Example:
2022-01-11 23:25:16 3223 HWC8607 The data communication with the device NIC in Slot 2 running on the port 4 is lost.
2022-01-11 23:25:04 3222 HWC8607 The data communication with the device NIC in Slot 2 running on the port 2 is lost.
2022-01-11 23:24:46 3219 HWC8607 The data communication with the device NIC in Slot 2 running on the port 3 is lost.
2022-01-11 23:20:06 3216 HWC8607 The data communication with the device NIC in Slot 2 running on the port 1 is lost.
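When many servers need to be checked, exported Lifecycle Log text can be scanned for these alerts. The following is a minimal Python sketch, assuming the log has been exported as plain text and that each entry follows the layout shown in the example above (timestamp, sequence number, message ID, message). The function name `lost_ports_by_slot` is hypothetical and is not part of any Dell tool.

```python
import re
from collections import defaultdict

# Pattern for the entry layout shown in the example above (an assumption:
# other export formats may order or label the fields differently).
ENTRY = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<sequence>\d+) HWC8607 "
    r"The data communication with the device NIC in Slot (?P<slot>\d+) "
    r"running on the port (?P<port>\d+) is lost\."
)


def lost_ports_by_slot(log_text):
    """Return {slot: sorted list of ports} for every HWC8607 entry found."""
    ports = defaultdict(set)
    for match in ENTRY.finditer(log_text):
        ports[int(match["slot"])].add(int(match["port"]))
    return {slot: sorted(found) for slot, found in ports.items()}


sample = (
    "2022-01-11 23:25:16 3223 HWC8607 The data communication with the device NIC in Slot 2 running on the port 4 is lost.\n"
    "2022-01-11 23:25:04 3222 HWC8607 The data communication with the device NIC in Slot 2 running on the port 2 is lost.\n"
    "2022-01-11 23:24:46 3219 HWC8607 The data communication with the device NIC in Slot 2 running on the port 3 is lost.\n"
    "2022-01-11 23:20:06 3216 HWC8607 The data communication with the device NIC in Slot 2 running on the port 1 is lost.\n"
)

print(lost_ports_by_slot(sample))  # {2: [1, 2, 3, 4]}
```

A result that lists every port of an adapter within a few minutes, as in this sample, matches the pattern described in this article.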
iDRAC9 firmware 5.00.00.00 introduced PCIeVDM side-band management support for Dell PCIe network adapters. Under certain conditions, the PCIeVDM queues on the iDRAC9 may fill and prevent iDRAC from sending any additional commands to the adapter.
iDRAC9 firmware version 5.10.30.00 (June 2022) corrects the conditions that lead to this issue.
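For inventory checks, a firmware version string can be compared against the two versions named in this article. The sketch below is in Python and rests on an assumption: it treats every version from 5.00.00.00 (where PCIeVDM support was introduced) up to, but not including, 5.10.30.00 (where the fix ships) as potentially affected. The authoritative list of impacted versions may be narrower. The function names are hypothetical.

```python
def parse_version(text):
    """Turn '5.10.30.00' into (5, 10, 30, 0) so tuples compare numerically."""
    parts = tuple(int(part) for part in text.strip().split("."))
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated fields, got {text!r}")
    return parts


# Both versions are taken from this article; the range between them is an
# assumption, not a published list of impacted releases.
PCIEVDM_INTRODUCED = parse_version("5.00.00.00")
FIXED_IN = parse_version("5.10.30.00")


def is_potentially_affected(version):
    """True if the version supports PCIeVDM but predates the fix."""
    return PCIEVDM_INTRODUCED <= parse_version(version) < FIXED_IN


for candidate in ("4.40.40.00", "5.00.00.00", "5.10.10.00", "5.10.30.00"):
    print(candidate, is_potentially_affected(candidate))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes when fields have different digit counts.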
Workarounds:
Disabling PCIeVDM on the iDRAC9 returns management of the installed PCIe network adapters to the SMBus interface without any impact to network device management. Disabling PCIeVDM and rebooting iDRAC recovers from this condition and prevents additional occurrences.
To disable PCIeVDM on the iDRAC9, run the following RACADM commands:
racadm>>racadm set iDRAC.PCIeVDM.Enable Disabled
[Key=iDRAC.Embedded.1#PCIeVDM.1]
Object value modified successfully
racadm>>racadm racreset
RAC reset operation initiated successfully. It may take a few minutes for the RAC to come online again.
iDRAC9 firmware updates do not modify user-defined attribute settings. After iDRAC9 5.10.30.00 is applied to an impacted server, PCIeVDM must be re-enabled manually:
racadm>>racadm set iDRAC.PCIeVDM.Enable Enabled
[Key=iDRAC.Embedded.1#PCIeVDM.1]
Object value modified successfully
racadm>>racadm racreset
RAC reset operation initiated successfully. It may take a few minutes for the RAC to come online again.
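Where the same two commands must be issued against several servers, the command lines can be generated rather than typed. The Python sketch below only builds argument lists and does not execute anything. It assumes the commands are sent through remote RACADM using its `-r`, `-u`, and `-p` options. The function name `pcievdm_commands` and the host and credentials in the usage lines are hypothetical placeholders.

```python
def pcievdm_commands(state, host=None, user=None, password=None):
    """Build the set and racreset command lines used in this article.

    state must be "Enabled" or "Disabled". When host is given, the remote
    RACADM options (-r, -u, -p) are prepended (an assumption about how the
    commands are delivered; local or SSH RACADM needs no prefix).
    """
    if state not in ("Enabled", "Disabled"):
        raise ValueError(f"state must be 'Enabled' or 'Disabled', got {state!r}")
    prefix = ["racadm"]
    if host is not None:
        prefix += ["-r", host, "-u", user, "-p", password]
    return [
        prefix + ["set", "iDRAC.PCIeVDM.Enable", state],
        prefix + ["racreset"],  # iDRAC reboot, as in the sessions above
    ]


# Workaround on affected firmware (placeholder host and credentials).
for command in pcievdm_commands("Disabled", "192.0.2.10", "admin", "example-password"):
    print(" ".join(command))

# After updating to 5.10.30.00, turn the protocol back on.
for command in pcievdm_commands("Enabled"):
    print(" ".join(command))
```

Keeping the commands as lists makes it straightforward to pass each one to a process runner later without shell quoting issues.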