Place a failed node into permanent maintenance mode (ObjectScale on OpenShift)
Use kubectl to manually place a failed node (that is powered off, dead, or otherwise inaccessible) into permanent maintenance mode. Use this process for ObjectScale instances on a Red Hat OpenShift cluster.
About this task
NOTE: The removal process for a failed node is largely manual and does not involve the ObjectScale Operator in the same way that standard PMM does. However, during node removal, the ObjectScale Operator starts recovery procedures for certain non-SS stateful pods, such as bookie, influxdb, and zookeeper. Recovery of these non-SS stateful pods occurs automatically and does not affect the procedure workflow.
Place the failed node to be removed into permanent maintenance mode:
Steps
Mark the failed node as unschedulable so that it is no longer available to run pods.
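A node can be marked unschedulable with kubectl cordon; a minimal sketch (substitute your node name for <NODE_NAME>):

```shell
# Mark the node unschedulable so the scheduler places no new pods on it.
kubectl cordon <NODE_NAME>

# Confirm the node now reports SchedulingDisabled in its STATUS column.
kubectl get node <NODE_NAME>
```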
Collect the UUID for the node to be removed from the cluster:
kubectl get csibmnodes
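If you want to capture the UUID in a variable for use in the later steps, a sketch such as the following may help. It assumes the UUID appears in the second column of the csibmnodes output; verify the column order against your cluster's actual output and adjust the awk field number if needed:

```shell
# Assumption: UUID is the second column of `kubectl get csibmnodes` output.
# Check your cluster's output and adjust '$2' if the layout differs.
NODE_UUID=$(kubectl get csibmnodes --no-headers | grep <NODE_NAME> | awk '{print $2}')
echo "$NODE_UUID"
```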
Remove the node from the OpenShift cluster.
kubectl delete node <NODE_NAME>
NOTE: After you delete the node, it is no longer listed in the kubectl get nodes output.
Manually delete the PVCs bound to the failed node:
Get the names of the PVCs:
kubectl get pvc
Get the details for each of the PVCs:
kubectl describe pvc <PVC_NAME>
Get the node for each of the PVCs:
for i in `kubectl get pvc --no-headers -o jsonpath="{.items[*].metadata.name}"`; do echo "=== $i"; kubectl get pvc $i -o json | grep selected-node | grep -v "{}"; done
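After identifying which PVCs were bound to the failed node, delete them. A minimal sketch, deleting one PVC at a time so that each removal can be verified before moving on:

```shell
# Delete a PVC that the previous loop reported as bound to the failed node.
# Repeat for each affected PVC.
kubectl delete pvc <PVC_NAME>
```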
Delete the AvailableCapacity (AC) custom resources for the failed node:
kubectl get ac | grep <NODE_UUID> | awk '{print $1}' | xargs kubectl delete ac
Remove the pending pods for all namespaces that are associated with ObjectScale and object stores:
Identify the pods to be deleted across all namespaces:
kubectl get pods -A | grep Pending
Delete each returned pod that is associated with the removed node, specifying its namespace:
kubectl delete pods <PODS> -n <NAMESPACE>
Finally, verify that all the resources have been successfully removed:
Check for Bare-Metal nodes:
kubectl get csibmnode | grep <NODE_UUID>
Check for available capacity:
kubectl get ac | grep <NODE_UUID>
Check for drive CRs:
kubectl get drive | grep <NODE_UUID>
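The three verification checks above can be consolidated into one loop. A sketch, assuming <NODE_UUID> is the UUID collected earlier; each check should return nothing if cleanup succeeded:

```shell
# Warn if any csibmnode, ac, or drive CRs still reference the removed node.
# Substitute the UUID collected earlier for <NODE_UUID>.
for r in csibmnode ac drive; do
  if kubectl get "$r" --no-headers 2>/dev/null | grep -q "<NODE_UUID>"; then
    echo "WARNING: $r resources still reference <NODE_UUID>"
  fi
done
```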
Optional: Monitor the automatic recovery of non-SS pods:
To ensure data protection, certain non-SS pods, such as the bookie, influxdb, and zookeeper pods, require recovery after they are relocated. The ObjectScale Operator initiates recovery for these pods automatically once they are removed from the PMM node and started on another available node in the cluster.
kubectl get serviceprocedures -A -o custom-columns=Name:metadata.name,Node:spec.nodeInfo.name,Type:spec.type,Time:metadata.managedFields[0].time,Reason:status.reason,Message:status.message
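To follow recovery progress as the Operator works through the procedures, the same listing can be streamed with the --watch flag:

```shell
# Stream serviceprocedure updates across all namespaces until interrupted (Ctrl+C).
kubectl get serviceprocedures -A --watch
```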