The Management Controller (MC) and Storage Controller (SC) are separate subsystems on ME series arrays. The array continues to serve I/O; however, all management interfaces (UI, SSH, serial, SNMP, and REST API) are unresponsive.
Typical symptoms:
When the event logs are reviewed afterwards (see the resolution steps below), administrators may see the following entry in the event history log even though no firmware upgrade is in progress.
... B849 2023-08-08 01:08:16 152 WARNING The Storage Controller is not receiving data from the Management Controller. (This is normal during firmware update.) ...
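As an illustration only, an entry like the one above could be picked out of exported event history text with a short script. The field layout (event ID, date, time, event code, severity, message) is inferred from the single sample entry shown here, and the names `parse_event` and `is_mc_silent_event` are hypothetical; they are not part of any Dell tool.

```python
import re
from typing import NamedTuple, Optional

# Layout assumed from the one sample entry in this article:
#   <event id> <date> <time> <event code> <severity> <message>
EVENT_RE = re.compile(
    r"(?P<event_id>[A-Z]\d+)\s+"
    r"(?P<date>\d{4}-\d{2}-\d{2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<code>\d+)\s+"
    r"(?P<severity>[A-Z]+)\s+"
    r"(?P<message>.+)"
)


class Event(NamedTuple):
    event_id: str
    timestamp: str
    code: int
    severity: str
    message: str


def parse_event(line: str) -> Optional[Event]:
    """Return the first event found in the line, or None if nothing matches."""
    match = EVENT_RE.search(line)
    if match is None:
        return None
    return Event(
        event_id=match["event_id"],
        timestamp=f"{match['date']} {match['time']}",
        code=int(match["code"]),
        severity=match["severity"],
        message=match["message"].strip(),
    )


def is_mc_silent_event(event: Event) -> bool:
    """True for event code 152 reporting that the SC receives no data from the MC."""
    return (
        event.code == 152
        and "not receiving data from the Management Controller" in event.message
    )
```

A match on its own is not conclusive, because the same event is expected during a firmware update.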
An out-of-memory condition on the Management Controller causes the management application processes to be terminated. Incidents may be more frequent in environments where external management applications poll the management interfaces using SNMP or the REST API.
ME5 release notes:
FMW-65056 Resolves a condition that may result in an unresponsive CLI and user interface.
Which Systems May Be Affected?
Product (and version) | Dell PowerVault ME5 Series Storage Systems
Running this Core Software (Operating System or Operating Environment) | PowerVault ME5 controller firmware earlier than version ME5.1.2.0.1
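The version test above can be sketched in a few lines. This is a minimal example that assumes firmware versions are strings of the form `ME5.` followed by dot-separated integers and that they compare field by field; the function names are hypothetical.

```python
from typing import Tuple

PREFIX = "ME5."


def parse_me5_version(version: str) -> Tuple[int, ...]:
    """Turn 'ME5.1.2.0.1' into (1, 2, 0, 1); raise ValueError on other input."""
    text = version.strip()
    if not text.upper().startswith(PREFIX):
        raise ValueError(f"not an ME5 firmware version: {version!r}")
    return tuple(int(part) for part in text[len(PREFIX):].split("."))


# First version stated in this article as not affected.
FIXED_IN = parse_me5_version("ME5.1.2.0.1")


def is_affected(version: str) -> bool:
    """True when the firmware is earlier than ME5.1.2.0.1."""
    return parse_me5_version(version) < FIXED_IN
```

For example, `is_affected("ME5.1.2.1.0")` returns `False`, because that version is later than ME5.1.2.0.1.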
Open an SSH session to each controller's management interface, and log in as a managing or administrator level user. Alternatively, an administrator can try a USB serial connection to each controller. Where it is not possible to log in using SSH or a serial connection, go to Step 2: Physically reseat one controller module or power off array.
If the login is successful, restart the management controller on each controller using the following command:
restart mc full

# restart mc full
During the restart process you will briefly lose communication with the specified Management Controller(s).
Do you want to continue? (y/n) y
Info: Restarting the local MC (A)...
Success: Command completed successfully. (2023-08-24 05:34:01)
# Killed
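After `restart mc full`, the management interfaces are briefly unavailable. The sketch below waits until a management port accepts TCP connections again. It assumes the default ports (22 for SSH, 443 for the web interface); the host address, timeouts, and function names are hypothetical. An accepted connection shows only that the port is listening, not that the management application is healthy, so a normal login should follow.

```python
import socket
import time


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_port(host: str, port: int,
                  max_wait: float = 180.0, interval: float = 5.0) -> bool:
    """Poll host:port until it accepts a connection or max_wait seconds pass."""
    deadline = time.monotonic() + max_wait
    while True:
        if port_open(host, port):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Hypothetical management IP address; replace with the controller's own.
    mc_address = "192.0.2.10"
    for name, port in (("SSH", 22), ("HTTPS", 443)):
        state = "reachable" if wait_for_port(mc_address, port) else "not reachable"
        print(f"{name} (port {port}): {state}")
```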
Step 2: Physically reseat one controller module or power off array
Scenario 1: Dual controller with redundant path host configuration
These steps can be implemented without the need for a maintenance window.
The following conditions must be true:
For guidance, see the Module removal and replacement > customer replaceable units section in the Dell PowerVault ME5 Series Storage System Owner’s Manual.
Physically pull controller module B forward in its slot by approximately five centimeters or approximately two inches, then reseat the controller module after 30 seconds.
Allow approximately two or three minutes for controller B to complete boot and firmware load.
Open an SSH session to the controller B management IP address and log in as a managing or administrator level user.
Restart peer storage controller A. Type the command:
restart sc a

# restart sc a
While a Storage Controller is restarting, communication will temporarily be lost with the corresponding Management Controller, and also may cause a temporary loss of data availability.
Do you want to continue? (y/n) y
Success: Command completed successfully. - The command to restart SC A completed successfully. The controller will restart in approximately 30 seconds. (2023-08-24 07:08:39)

When the peer controller is online, log in to the PowerVault Manager and go to Step 3: Upgrade controller module firmware to ME5.1.2.1.0 or later.
Scenario 2: Single controller module or nonredundant host path configuration
A maintenance window is required. Unexpectedly removing the single path to data causes the host to lose access to its data and stop responding.
For guidance, see the Module removal and replacement > customer replaceable units section in the Dell PowerVault ME5 Series Storage System Owner’s Manual.
Notify users of the outage and follow the host operating system user guide to put the connected hosts in maintenance mode or shut them down.
At the array rear, switch off both power supplies for approximately 60 seconds before turning them on again.
Allow approximately three minutes for the controllers to complete boot and load firmware.
Log in to the PowerVault Manager, and go to Step 3: Upgrade controller module firmware to ME5.1.2.1.0 or later.
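The choice between the two scenarios can be summarized as a small decision function. This sketch is based only on the scenario titles above; the function and parameter names are hypothetical, and it does not replace checking the conditions for Scenario 1 on the actual system.

```python
def choose_recovery_scenario(controller_count: int,
                             redundant_host_paths: bool) -> int:
    """Return 1 for the controller reseat procedure, 2 for the power cycle.

    Scenario 1 applies only to dual-controller arrays whose hosts have
    redundant paths; every other configuration falls under Scenario 2.
    """
    if controller_count not in (1, 2):
        raise ValueError("controller_count must be 1 or 2")
    if controller_count == 2 and redundant_host_paths:
        return 1
    return 2


def maintenance_window_required(scenario: int) -> bool:
    """Scenario 2 interrupts host access to data, so it needs a window."""
    return scenario == 2
```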
Step 3: Upgrade controller module firmware to ME5.1.2.1.0 or later
See the Updating system firmware section in the Dell PowerVault ME5 Series Administrator's Guide.
With ME5 controller firmware version ME5.1.2.0.1 or above, administrators may occasionally receive the following information alert.
Figure 1: Information alert
The Management Controller entered a memory exhaustion state and will reboot to recover. Data access will not be interrupted.
The Management Controller (MC) provides the management UI and CLI used to monitor and configure the system. Restarting management services does not reboot the controllers or disrupt I/O. The only effect of the restart is that the management interface cannot be accessed for two minutes. If you receive this information alert frequently, more investigation may be required to establish the cause.
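This article does not define how often counts as "frequently". One way to put a number on it is to count how many alerts fall inside a sliding time window, as in the sketch below. The window length and any threshold applied to the result are assumptions to be chosen by the administrator, and the function name is hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterable


def max_alerts_in_window(timestamps: Iterable[datetime],
                         window: timedelta) -> int:
    """Return the largest number of alerts seen within any span of `window`."""
    ordered = sorted(timestamps)
    best = 0
    start = 0
    for end, current in enumerate(ordered):
        # Drop alerts from the front that are older than the window.
        while current - ordered[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best
```

For example, if the alert timestamps are collected from the event history, `max_alerts_in_window(stamps, timedelta(days=7))` gives the busiest seven-day count, which can then be compared with a site-specific threshold.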