Dell ObjectScale 1.3 Administration Guide

Automatic disk replacement service procedure

About this task

ObjectScale implements an automatic disk replacement service procedure that handles disk failures. This task describes how to locate a failed Persistent Volume (PV) and then replace it.

Steps

From the ObjectScale Portal user interface, click Administration > ObjectScale.
The list of Object Stores in the selected namespace that the user is authorized to view is displayed.
Select the appropriate namespace from the namespace drop-down on the upper right of the ObjectScale Portal user interface.
Optional: Locate the object store containing the failed disk on the object stores details page.
When a PV fails, the state and health of the object store(s) containing that PV will go into ReplacingPV and the disk replacement service procedure begins.
Monitor the status of the process on the Monitoring > Logs tab, which shows the system-generated events for the disk replacement process.
- If the system contains an available spare drive, the service procedure will progress to completion. Once the process has completed, the object store(s) will return to the Started State and Available Health.
- If there are no available spare drives, or capacity is otherwise insufficient, the disk replacement service procedure generates warning events for Not enough capacity as it attempts to recreate the PVCs on the failed PV. If this occurs, complete step 5 to finish the replacement.
After you receive an event with Reason: DriveReadyForRemoval and have a new disk available, initiate the physical replacement in OpenShift by placing a replacement=ready annotation on the failed/suspect disk.
1. Confirm the disk is in Released status.
```
kubectl get drives | grep <DRIVE_SERIAL_NUMBER>
```
2. Place the replacement=ready annotation on the failed/suspect disk.
```
kubectl annotate drives.csi-baremetal.dell.com <DRIVE_RESOURCE_ID> replacement=ready
```
3. Confirm that the disk is now in Removed status.
```
kubectl get drives | grep <DRIVE_SERIAL_NUMBER>
```
4. Confirm that the ISSUE has been updated with Reason: DriveReadyForPhysicalRemoval.
  
  CAUTION: Do not physically replace the disk until the above WARNING event is displayed under the respective ISSUE.
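
Sub-steps 1 through 3 above can be sketched as a small shell helper. This is only an illustration, not a Dell-provided tool: the function names `drive_has_status` and `mark_drive_ready` and the `KUBECTL` override variable are hypothetical, and the sketch assumes that the status word (Released, Removed) appears on the same `kubectl get drives` row as the drive serial number, as the grep commands above imply.

```shell
#!/usr/bin/env bash
# Sketch only. Function names and the KUBECTL variable are illustrative
# assumptions; they are not part of ObjectScale or the CSI driver.

KUBECTL="${KUBECTL:-kubectl}"   # override if kubectl is installed under another name

# Succeeds if the `kubectl get drives` row containing the serial number
# also contains the given status word (case-insensitive, whole word).
drive_has_status() {
  local serial="$1" status="$2" row
  row="$("$KUBECTL" get drives | grep -F -- "$serial")" || return 1
  printf '%s\n' "$row" | grep -qiw -- "$status"
}

# Annotates the drive only if it is currently Released, then re-checks
# whether it has moved to Removed.
mark_drive_ready() {
  local serial="$1" resource_id="$2"
  if ! drive_has_status "$serial" "Released"; then
    echo "Drive $serial is not in Released status; not annotating." >&2
    return 1
  fi
  "$KUBECTL" annotate drives.csi-baremetal.dell.com "$resource_id" replacement=ready || return 1
  if drive_has_status "$serial" "Removed"; then
    echo "Drive $serial is in Removed status."
  else
    echo "Drive $serial has not reached Removed status yet; check again shortly." >&2
    return 2
  fi
}
```

A call would look like `mark_drive_ready <DRIVE_SERIAL_NUMBER> <DRIVE_RESOURCE_ID>`. The helper refuses to annotate a drive that is not Released. It does not cover sub-step 4, so confirm the ISSUE event before touching the hardware.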
The disk LED is blinking. If you cannot identify the disk to replace by its LED, identify it manually or visually by using the additional information in the associated ISSUE events.
Remove the failed drive and replace it with the new, clean drive. Afterwards, the ISSUE in the ObjectScale Portal UI is cleared automatically by being set to Normal severity. Once the event Reason: "DriveSuccessfullyRemoved" occurs and you have inserted a new drive into the node, the disk replacement service procedure has completed successfully and no further action is required.
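
The event reasons named in this procedure (DriveReadyForRemoval, DriveReadyForPhysicalRemoval, DriveSuccessfullyRemoved) can also be polled from the command line. The following is a minimal sketch under an assumption this guide does not confirm: that the events shown under the ISSUE in the ObjectScale Portal are also emitted as Kubernetes events with the same reason strings. The function names `event_seen` and `wait_for_event` and the `KUBECTL` variable are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only. Assumes the disk replacement events are visible as Kubernetes
# events with matching `reason` values; verify this on your cluster first.

KUBECTL="${KUBECTL:-kubectl}"   # override if kubectl is installed under another name

# Succeeds if at least one event in any namespace carries the given reason.
event_seen() {
  local reason="$1" out
  out="$("$KUBECTL" get events --all-namespaces \
        --field-selector "reason=$reason" --no-headers 2>/dev/null)" || return 1
  [ -n "$out" ]
}

# Polls for the reason up to <attempts> times, sleeping <delay> seconds
# between checks. Returns 0 when the event is seen, 1 on timeout.
wait_for_event() {
  local reason="$1" attempts="${2:-30}" delay="${3:-10}" i=0
  while :; do
    if event_seen "$reason"; then
      return 0
    fi
    i=$((i + 1))
    if [ "$i" -ge "$attempts" ]; then
      return 1
    fi
    sleep "$delay"
  done
}
```

For example, `wait_for_event DriveSuccessfullyRemoved 60 10` checks every 10 seconds for up to 60 attempts. Kubernetes events are retained only for a limited time, so an old event may no longer be listed; the Monitoring > Logs tab remains the reference view.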
