Dell ObjectScale 1.3 Administration Guide

Prepare a failed node for node hardware or software maintenance (ObjectScale Software Bundle)

Follow this node repair procedure for a failed node.

About this task

Stateful data on the node to be repaired is not deleted, and stateful ObjectScale pods are not rescheduled to other nodes in the cluster.

Steps

  1. Obtain a Keycloak token. The ObjectScale Software Bundle CMO Platform Manager APIs require a Keycloak token to authenticate requests for cluster management tasks.

    The ObjectScale Software Bundle contains a CMO Platform Manager running on Kubernetes within the cluster that is used to request cluster management tasks, like service procedures.

    1. Collect the keycloak account information from the secret:
      export KEYCLOAK_USER=$(kubectl get secret keycloak-pm-auth-info -n cmo -o json | jq -r '.data["keycloak-username"]' | base64 --decode)
      export KEYCLOAK_PASSWORD=$(kubectl get secret keycloak-pm-auth-info -n cmo -o json | jq -r '.data["keycloak-password"]' | base64 --decode)
      export KEYCLOAK_REALM=$(kubectl get secret keycloak-pm-auth-info -n cmo -o json | jq -r '.data["keycloak-realm"]' | base64 --decode)
      export KEYCLOAK_CLIENT=$(kubectl get secret keycloak-pm-auth-info -n cmo -o json | jq -r '.data["keycloak-client"]' | base64 --decode)
      export KEYCLOAK_CLIENT_SECRET=$(kubectl get secret keycloak-pm-auth-info -n cmo -o json | jq -r '.data["keycloak-credentials-secret"]' | base64 --decode)
    2. Set an environment variable for the access token:
      export TOKEN=$(curl -L -X POST https://keycloak-http.atlantic/auth/realms/$KEYCLOAK_REALM/protocol/openid-connect/token -H 'Content-Type: application/x-www-form-urlencoded' --data-urlencode client_id=$KEYCLOAK_CLIENT --data-urlencode 'grant_type=password' --data-urlencode client_secret=$KEYCLOAK_CLIENT_SECRET --data-urlencode 'scope=openid' --data-urlencode username=$KEYCLOAK_USER --data-urlencode password=$KEYCLOAK_PASSWORD | jq -r '.access_token')
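Because `jq -r` prints the literal string `null` when the response has no `access_token` field (for example, when authentication fails), it may be worth checking the variable before continuing. The following is a minimal sketch; `token_looks_valid` is a hypothetical helper name and is not part of the ObjectScale tooling.

```shell
# Hypothetical helper (not part of ObjectScale): succeeds only when the
# argument is non-empty and is not the literal "null" that jq -r prints
# for a missing field.
token_looks_valid() {
  case "$1" in
    "" | null) return 1 ;;
    *) return 0 ;;
  esac
}

# Example use after exporting TOKEN:
if token_looks_valid "${TOKEN:-}"; then
  echo "TOKEN is set"
else
  echo "TOKEN is empty or null; re-check the Keycloak variables" >&2
fi
```

This check only confirms that a value was returned. It does not verify that the Platform Manager accepts the token.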
  2. Collect the IP address of the CMO Platform Manager.
    kubectl get services -n cmo platform-manager -o jsonpath='{.spec.clusterIP}'
  3. Safely evict all your pods from the node:
    kubectl drain <NODE_NAME> --ignore-daemonsets --delete-emptydir-data --force

    For example:

    # kubectl drain hostname6 --ignore-daemonsets --delete-emptydir-data --force
    WARNING: ignoring DaemonSet-managed Pods: cmo/metallb-speaker-ggrkq, cmo/whereabout-whereabouts-58dss, calico-system/calico-node-8rmcp, default/csi-baremetal-node-9wg5w, kube-system/rke2-ingress-nginx-controller-zvbnw, kube-system/rke2-multus-ds-5bv2x
    evicting pod cmo/decks-support-store-0
    pod/decks-support-store-0 evicted
    node/hostname6 drained
  4. Create the scaledown.json file with the details of the node that you are removing from the ObjectScale Software Bundle.
    Place this JSON payload on one of the controlplane nodes, where you will perform the scale down of the node.
    {
      "hosts":  [{
        "hostname": "<NODE_HOSTNAME>"
      }],
      "remove_os_packages": "false"
    }
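One way to avoid hand-editing the file is to generate it from the hostname. The sketch below assumes a single node is being removed; `write_scaledown_payload` is a hypothetical helper name, and the field names and values are copied from the template above.

```shell
# Hypothetical helper: prints the scale down payload for one hostname.
# The field names and values match the template in this step.
write_scaledown_payload() {
  cat <<EOF
{
  "hosts": [{
    "hostname": "$1"
  }],
  "remove_os_packages": "false"
}
EOF
}

# Example: hostname6 is the node name used in the drain example.
write_scaledown_payload "hostname6" > scaledown.json
cat scaledown.json
```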
  5. Scale down the node using the CMO Platform Manager scale down API.
    NOTE: If the node is unreachable (the logs read "Unreachable=1"), the scale down operation reports a failure even though the scale down completes successfully.
    curl --header "Content-Type: application/json" --header "Authorization: Bearer $TOKEN" --request DELETE --data @scaledown.json https://<CMO_PLATFORM_MANAGER_IP>/v3/clusters/nodes -v -k | json_pp
    For example:
    ......
    {
       "created_at" : "2023-04-15T11:35:35Z",
       "completed_tasks" : 0,
       "total_tasks" : 273,
       "recap" : {
          "hosts" : {}
       },
       "id" : "ac2324c5-0112-45f3-83e9-4f018d24ca57",
       "link" : {
          "href" : "https://0.0.0.0:8080/v1/status/ac2324c5-0112-45f3-83e9-4f018d24ca57",
          "rel" : "self"
       },
       "logs" : "",
       "state" : "created",
       "updated_at" : "2023-04-15T11:35:36Z",
       "playbook_id" : "remove-node"
    }
  6. Collect the "id" value from the returned output. You will use this value in the next step.
    For the previous example, the "id" value is ac2324c5-0112-45f3-83e9-4f018d24ca57.
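If the response is saved or piped, the "id" value can be pulled out without copying it by hand. The following is a sketch only; `extract_task_id` is a hypothetical helper name, and it uses `sed` so that it works on the pretty-printed output as well as on compact JSON.

```shell
# Hypothetical helper: reads a Platform Manager response on stdin and prints
# the value of the "id" key. The pattern requires a double quote directly
# before id, so the "playbook_id" key does not match.
extract_task_id() {
  sed -n 's/.*"id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Example with a shortened copy of the response shown in the previous step:
printf '%s\n' \
  '{' \
  '   "id" : "ac2324c5-0112-45f3-83e9-4f018d24ca57",' \
  '   "state" : "created",' \
  '   "playbook_id" : "remove-node"' \
  '}' | extract_task_id
```

Where `jq` is available, as in the token commands earlier in this procedure, `jq -r '.id'` on the raw response gives the same value.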
  7. After calling the scale down API, check the status of the operation using the following API:
    NOTE: The CMO Platform Manager TOKEN may expire. If it does, refresh it by running:
    export TOKEN=$(curl -L -X POST https://keycloak-http.atlantic/auth/realms/$KEYCLOAK_REALM/protocol/openid-connect/token -H 'Content-Type: application/x-www-form-urlencoded' --data-urlencode client_id=$KEYCLOAK_CLIENT --data-urlencode 'grant_type=password' --data-urlencode client_secret=$KEYCLOAK_CLIENT_SECRET --data-urlencode 'scope=openid' --data-urlencode username=$KEYCLOAK_USER --data-urlencode password=$KEYCLOAK_PASSWORD | jq -r '.access_token')
    curl --header "Content-Type: application/json" --header "Authorization: Bearer $TOKEN" --request GET https://<CMO_PLATFORM_MANAGER_IP>/v1/status/<ID> -k | jq
    When the operation is finished, the operation "state" is marked as "complete".
    NOTE: In certain situations, the status may show as Failed even though the failed node was removed successfully. Check the node status.
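Rather than rerunning the status command by hand, the check can be wrapped in a polling loop. This is a minimal sketch under several assumptions: `wait_for_task`, `get_task_state`, and the `PM_ADDR` variable (holding the CMO Platform Manager address) are hypothetical names, the 30-second interval and 60 attempts are arbitrary defaults, and treating any state that begins with "fail" as a failure is a guess based on the note above.

```shell
# Prints the "state" field of a task. This mirrors the status command in this
# step; PM_ADDR is an assumed variable holding <CMO_PLATFORM_MANAGER_IP>.
get_task_state() {
  curl -s -k --header "Authorization: Bearer $TOKEN" \
    "https://$PM_ADDR/v1/status/$1" | jq -r '.state'
}

# Polls until the state is "complete" (returns 0), starts with "fail" in
# either case (returns 1), or the attempts run out (returns 2).
# Usage: wait_for_task <ID> [interval_seconds] [max_tries]
wait_for_task() {
  task_id=$1
  interval=${2:-30}
  max_tries=${3:-60}
  tries=0
  while [ "$tries" -lt "$max_tries" ]; do
    state=$(get_task_state "$task_id")
    case "$state" in
      complete) echo "complete"; return 0 ;;
      [Ff]ail*) echo "$state"; return 1 ;;
    esac
    tries=$((tries + 1))
    sleep "$interval"
  done
  echo "timeout"
  return 2
}
```

A failure result from the loop is not conclusive on its own, because the operation can report a failure even when the node was removed. Confirm with the node list in the next step.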
  8. Confirm that the node has been removed from the node list.
    kubectl get node
  9. Verify that the statefulset pods have moved to the Pending state after node removal:
    kubectl get pods -o wide | grep -v Running
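The `grep -v Running` filter shows every pod that is not Running, not only Pending ones. To narrow the listing, a small filter such as the following could be used. It is a sketch; `only_pending` is a hypothetical name, and it assumes the column layout of `kubectl get pods -o wide` without `-A`, where STATUS is the third column.

```shell
# Hypothetical filter: keeps the header row plus rows whose STATUS column
# (third column when -A is not used) is exactly "Pending".
only_pending() {
  awk 'NR == 1 || $3 == "Pending"'
}

# Example use:
#   kubectl get pods -o wide | only_pending
```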
  10. Fix the node while it is offline, and then go to the next step.
  11. On the node, create the scaleup.json file with the necessary details for the node.
    NOTE: When a node is added to a cluster, the /etc/hosts file on the added node may not be updated correctly, which causes issues when the cluster is upgraded. To avoid failures during the upgrade process, perform the following steps after adding a node:
    1. Retrieve the helmrepo service IP address.
      kubectl -n cmo get svc helmrepo
      For example:
      kubectl -n cmo get svc helmrepo
      NAME       TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)     AGE
      helmrepo   ClusterIP   172.43.174.187   <none>        30036/TCP   12d
    2. Add an entry for the service to the /etc/hosts file of the added node:
      <CLUSTER_IP> helmrepo
      For example:
      172.43.174.187 helmrepo
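If this step might be run more than once, appending the entry only when it is missing avoids duplicate lines. The sketch below is one way to do that; `add_helmrepo_entry` is a hypothetical helper name, and the optional second argument exists only so the function can be tried on a copy of the file first.

```shell
# Hypothetical helper: appends "<ip> helmrepo" to a hosts file unless a line
# already lists helmrepo as a whitespace-separated name. The file path is a
# parameter so the function can be tried on a copy before touching /etc/hosts.
add_helmrepo_entry() {
  ip=$1
  hosts_file=${2:-/etc/hosts}
  if grep -Eq '[[:space:]]helmrepo([[:space:]]|$)' "$hosts_file"; then
    echo "helmrepo entry already present in $hosts_file"
  else
    printf '%s helmrepo\n' "$ip" >> "$hosts_file"
  fi
}

# Example use on the added node (requires root for /etc/hosts):
#   add_helmrepo_entry 172.43.174.187
```

The function does not update an existing entry that points at a different IP address; in that case, edit the file manually.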
    Place this JSON payload on the node where you will perform the scale up of the node.
    {
      "credentials": [{
        "name": "<HOSTNAME>",
        "type": "password",
        "password": "<PASSWORD>"
      }],
      "hosts": [{
        "hostname": "<NODE_HOSTNAME>",
        "managementhost": "<HOST_IP>",
        "kuberneteshost": "<HOST_IP>",
        "hostCredentials": "<HOST_CREDS>",
        "topology": {
            "role": "<controlplane|worker>"
        }
      }]
    }
    For example:
    {
      "credentials": [{
        "name": "mykey1",
        "type": "password",
        "password": "ChangeMe"
      }],
      "hosts": [{
        "hostname": "hostname6",
        "managementhost": "10.236.227.213",
        "kuberneteshost": "10.236.227.213",
        "hostCredentials": "mykey1",
        "topology": {
            "role": "controlplane" 
        }
      }]
    }
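Before sending the payload, it can help to confirm that no template placeholders were left in the file. The following sketch flags any remaining angle-bracket token; `check_no_placeholders` is a hypothetical helper name, and the check assumes that real values in scaleup.json never contain `<` or `>`.

```shell
# Hypothetical helper: succeeds when the file contains no <...> tokens;
# otherwise prints the matching lines with line numbers and fails.
check_no_placeholders() {
  if grep -En '<[^<>"]+>' "$1"; then
    echo "Placeholders remain in $1" >&2
    return 1
  fi
  return 0
}

# Example use before calling the scale up API:
#   check_no_placeholders scaleup.json
```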
  12. Call the CMO Platform Manager API to initiate the scaling-up operation.

    Run this command from the directory where the scaleup.json file exists.

    curl --header "Content-Type: application/json" --header "Authorization: Bearer $TOKEN"  --request POST --data @scaleup.json https://<CMO_PLATFORM_MANAGER_IP>/v3/clusters/nodes -v -k | json_pp
    For example:
    curl --header "Content-Type: application/json" --header "Authorization: Bearer $TOKEN"  --request POST --data @scaleup.json https://10.43.78.77:7070/v3/clusters/nodes -v -k | json_pp
    ......
    {
       "created_at" : "2023-04-15T11:35:35Z",
       "completed_tasks" : 0,
       "total_tasks" : 273,
       "recap" : {
          "hosts" : {}
       },
       "id" : "286bdb32-ff07-4e46-947e-e4c9e9b98338",
       "link" : {
          "href" : "https://0.0.0.0:8080/v1/status/286bdb32-ff07-4e46-947e-e4c9e9b98338",
          "rel" : "self"
       },
       "logs" : "",
       "state" : "created",
       "updated_at" : "2023-04-15T11:35:36Z",
       "playbook_id" : "scale"
    }
  13. Collect the "id" value from the returned output. You will use this value in the next step.
    For the previous example, the "id" value is 286bdb32-ff07-4e46-947e-e4c9e9b98338.
  14. After calling the scale up API, check the status of the operation:
    NOTE: The CMO Platform Manager TOKEN may expire. If it does, refresh it by running:
    export TOKEN=$(curl -L -X POST https://keycloak-http.atlantic/auth/realms/$KEYCLOAK_REALM/protocol/openid-connect/token -H 'Content-Type: application/x-www-form-urlencoded' --data-urlencode client_id=$KEYCLOAK_CLIENT --data-urlencode 'grant_type=password' --data-urlencode client_secret=$KEYCLOAK_CLIENT_SECRET --data-urlencode 'scope=openid' --data-urlencode username=$KEYCLOAK_USER --data-urlencode password=$KEYCLOAK_PASSWORD | jq -r '.access_token')
    curl --header "Content-Type: application/json" --header "Authorization: Bearer $TOKEN" --request GET https://<CMO_PLATFORM_MANAGER_IP>/v1/status/<ID> -k | jq 

    When the operation is finished, the operation "state" is marked as "complete".

  15. Confirm that the new node appears in the node list.
    kubectl get node
  16. Verify that pods can be rescheduled to this node:
    kubectl get pod -A -o wide | grep <NODE_NAME>
