Dell ObjectScale 1.3 Administration Guide

Prepare a healthy node for node hardware or software maintenance (ObjectScale Software Bundle)

Follow this Node Preparation procedure to prepare a healthy node for maintenance: to fix a system disk, resolve issues with node hardware or software, or upgrade the node operating system.

Steps

  1. Obtain a Keycloak token. The ObjectScale Software Bundle CMO Platform Manager APIs require a Keycloak token to authenticate requests for cluster management tasks.

    The ObjectScale Software Bundle contains a CMO Platform Manager running on Kubernetes within the cluster that is used to request cluster management tasks, like service procedures.

    1. Collect the keycloak account information from the secret:
      export KEYCLOAK_USER=$(kubectl get secret keycloak-pm-auth-info -n cmo -o json | jq -r '.data["keycloak-username"]' | base64 --decode)
      export KEYCLOAK_PASSWORD=$(kubectl get secret keycloak-pm-auth-info -n cmo -o json | jq -r '.data["keycloak-password"]' | base64 --decode)
      export KEYCLOAK_REALM=$(kubectl get secret keycloak-pm-auth-info -n cmo -o json | jq -r '.data["keycloak-realm"]' | base64 --decode)
      export KEYCLOAK_CLIENT=$(kubectl get secret keycloak-pm-auth-info -n cmo -o json | jq -r '.data["keycloak-client"]' | base64 --decode)
      export KEYCLOAK_CLIENT_SECRET=$(kubectl get secret keycloak-pm-auth-info -n cmo -o json | jq -r '.data["keycloak-credentials-secret"]' | base64 --decode)
    2. Set an environment variable for the access token:
      export TOKEN=$(curl -L -X POST https://keycloak-http.atlantic/auth/realms/$KEYCLOAK_REALM/protocol/openid-connect/token -H 'Content-Type: application/x-www-form-urlencoded' --data-urlencode client_id=$KEYCLOAK_CLIENT --data-urlencode 'grant_type=password' --data-urlencode client_secret=$KEYCLOAK_CLIENT_SECRET --data-urlencode 'scope=openid' --data-urlencode username=$KEYCLOAK_USER --data-urlencode password=$KEYCLOAK_PASSWORD | jq -r '.access_token')
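Because the token command pipes through jq -r '.access_token', a failed authentication leaves TOKEN set to the literal string null instead of raising an error. The following minimal sketch checks for that case before you continue. The function name token_is_usable is hypothetical and is not part of the ObjectScale tooling.

```shell
# Hypothetical helper: returns 0 when the argument looks like a usable token.
# jq -r prints the literal string "null" when .access_token is absent, which
# is what happens when Keycloak answers with an error document instead.
token_is_usable() {
  case "$1" in
    ""|null) return 1 ;;
    *) return 0 ;;
  esac
}

# Example:
#   token_is_usable "$TOKEN" || echo "Token request failed; recheck the KEYCLOAK_* variables" >&2
```

The check only confirms that a value was returned. It does not validate the token signature or its expiry.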
  2. Collect the IP address of the CMO Platform Manager.
    kubectl get services -n cmo platform-manager -o jsonpath='{.spec.clusterIP}'
  3. Apply a taint to the node to be placed into ObjectScale temporary maintenance mode:
    kubectl taint node <NODE_NAME> node.dell.com/drain=planned-downtime:NoSchedule
  4. Verify that the PHASE of the cluster now displays Maintenance.
    kubectl -n <OBJECTSCALE_NAMESPACE> get ecs-cluster
    NAME          PHASE         READY COMPONENTS   S3 ENDPOINT         MGMT API
    ecs-cluster   Maintenance   22/23              10.236.228.53:443   10.236.228.52:4443

    The ObjectScale Portal UI shows the object store status as Maintenance.
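If you prefer to wait for the phase change instead of rerunning the command by hand, the check in step 4 can be sketched as a polling loop. This sketch assumes that the namespace contains a single ecs-cluster resource and that PHASE is the second column of the output, as in the sample above. The function names and the POLL_INTERVAL variable are hypothetical.

```shell
# Hypothetical helper: prints the PHASE column (second column of the table
# shown in step 4) for the ecs-cluster resource in namespace $1.
cluster_phase() {
  kubectl -n "$1" get ecs-cluster --no-headers | awk '{ print $2 }'
}

# Hypothetical helper: polls until the phase is Maintenance.
# $1 = namespace, $2 = maximum number of attempts (default 60).
# Returns 0 once Maintenance is seen, 1 if the attempts run out.
wait_for_maintenance() {
  wfm_ns=$1
  wfm_tries=${2:-60}
  wfm_i=0
  while [ "$wfm_i" -lt "$wfm_tries" ]; do
    if [ "$(cluster_phase "$wfm_ns")" = "Maintenance" ]; then
      return 0
    fi
    wfm_i=$((wfm_i + 1))
    sleep "${POLL_INTERVAL:-5}"
  done
  return 1
}

# Example:
#   wait_for_maintenance <OBJECTSCALE_NAMESPACE> || echo "Cluster did not enter Maintenance" >&2
```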

  5. Once the taint has been applied to a node, the ObjectScale Operator creates the ObjectScale TMM service procedure. Retrieve the list of service procedures and locate the TMM service procedure with tmm- prefixed to the service procedure name:
    kubectl -n <OBJECTSCALE_NAMESPACE> get serviceprocedures
    NOTE: To obtain details about a service procedure, including its status, use:
    kubectl -n <OBJECTSCALE_NAMESPACE> describe serviceprocedures <SP_NAME>
    NOTE: Do not delete the service procedure while it is running.
  6. Monitor the status of the service procedure with the following command:
    while true; do kubectl -n <OBJECTSCALE_NAMESPACE> get serviceprocedures -o custom-columns=Name:metadata.name,Node:spec.nodeInfo.name,Type:spec.type,Time:metadata.managedFields[0].time,Reason:status.reason,Message:status.message; echo; sleep 5; done

    The service procedure transitions through several phases as it progresses. The Reason value for the TMM service procedure should progress from NotStarted through In Progress, PostCheck, and Waiting, and finally to Success. A Reason of Waiting or Success indicates that the service procedure has completed without error and that the node is now in TMM.
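When you script around the monitoring loop, it can help to reduce the Reason value to a simple decision. The sketch below encodes only the values named in this step. The function name tmm_reason_state and its output labels are hypothetical. The spelling InProgress is accepted as an assumption alongside In Progress, and any other value is reported for manual investigation.

```shell
# Hypothetical helper: maps a TMM service procedure Reason to a decision.
#   in-tmm      -> the procedure completed without error (Waiting or Success)
#   running     -> the procedure is still progressing
#   investigate -> a value not described in this guide
tmm_reason_state() {
  case "$1" in
    Success|Waiting) echo "in-tmm" ;;
    NotStarted|"In Progress"|InProgress|PostCheck) echo "running" ;;
    *) echo "investigate" ;;
  esac
}

# Example:
#   tmm_reason_state "PostCheck"
```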

  7. Next, cordon the node to place it into maintenance mode in the CMO Platform of the ObjectScale Software Bundle.
    kubectl cordon <NODE_NAME>
  8. Safely evict all your pods from the node:
    kubectl drain <NODE_NAME> --ignore-daemonsets --delete-emptydir-data --force

    For example:

    # kubectl drain hostname6 --ignore-daemonsets --delete-emptydir-data --force
    WARNING: ignoring DaemonSet-managed Pods: cmo/metallb-speaker-ggrkq, cmo/whereabout-whereabouts-58dss, calico-system/calico-node-8rmcp, default/csi-baremetal-node-9wg5w, kube-system/rke2-ingress-nginx-controller-zvbnw, kube-system/rke2-multus-ds-5bv2x
    evicting pod cmo/decks-support-store-0
    pod/decks-support-store-0 evicted
    node/hostname6 drained
  9. Verify the status of the drained node:
    kubectl get node <NODE_NAME>

    For example:

    # kubectl get node hostname6
    NAME         STATUS                     ROLES    AGE     VERSION
    hostname6    Ready,SchedulingDisabled   <none>   6d19h   v1.24.7+rke2r1
  10. Verify that all CMO component pods have been rescheduled to the other nodes.
    kubectl get pod -n cmo | grep Pending
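The grep in step 10 prints nothing when no pod is pending, which can be hard to tell apart from a command that did not run. The following sketch counts the pending pods explicitly. It assumes the default kubectl get pod column layout, in which STATUS is the third column. The function name pending_pod_count is hypothetical.

```shell
# Hypothetical helper: prints how many pods in namespace $1 (default: cmo)
# show Pending in the STATUS column of "kubectl get pod".
pending_pod_count() {
  kubectl get pod -n "${1:-cmo}" --no-headers 2>/dev/null |
    awk '$3 == "Pending" { n++ } END { print n + 0 }'
}

# Example:
#   [ "$(pending_pod_count cmo)" -eq 0 ] && echo "No CMO pods are pending"
```

A count of 0 is consistent with all CMO component pods having been rescheduled. A nonzero count that does not clear suggests that a pod has no remaining node to run on.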
  11. Verify the ObjectScale Portal UI shows that the node has entered TMM by reviewing the Monitoring > Issues tab.
  12. Create the scaledown.json file with the details of the node that you are removing from the ObjectScale Software Bundle cluster.
    Place this JSON payload on the controlplane node from which you are going to perform the scale down of the node.
    {
      "hosts":  [{
        "hostname": "<NODE_HOSTNAME>"
      }],
      "remove_os_packages": "false"
    }
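To avoid hand-editing the payload, the file in step 12 can be generated from the hostname. This is a minimal sketch: the function name write_scaledown_json is hypothetical, and it assumes that the hostname contains no characters that need JSON escaping, such as double quotes or backslashes.

```shell
# Hypothetical helper: writes the scale down payload from step 12.
# $1 = node hostname, $2 = output file (default: scaledown.json).
write_scaledown_json() {
  wsj_host=$1
  wsj_out=${2:-scaledown.json}
  if [ -z "$wsj_host" ]; then
    echo "usage: write_scaledown_json <NODE_HOSTNAME> [output-file]" >&2
    return 1
  fi
  cat > "$wsj_out" <<EOF
{
  "hosts": [{
    "hostname": "$wsj_host"
  }],
  "remove_os_packages": "false"
}
EOF
}

# Example:
#   write_scaledown_json hostname6
```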
  13. Scale down the node using the CMO Platform Manager scale down API.
    NOTE: If the node is unreachable (the logs read "Unreachable=1"), the scale down operation reports a failure even though the scale down completes successfully.
    curl --header "Content-Type: application/json" --header "Authorization: Bearer $TOKEN" --request DELETE --data @scaledown.json https://<CMO_PLATFORM_MANAGER_IP>/v3/clusters/nodes -v -k | json_pp
    For example:
    ......
    {
       "created_at" : "2023-04-15T11:35:35Z",
       "completed_tasks" : 0,
       "total_tasks" : 273,
       "recap" : {
          "hosts" : {}
       },
       "id" : "ac2324c5-0112-45f3-83e9-4f018d24ca57",
       "link" : {
          "href" : "https://0.0.0.0:8080/v1/status/ac2324c5-0112-45f3-83e9-4f018d24ca57",
          "rel" : "self"
       },
       "logs" : "",
       "state" : "created",
       "updated_at" : "2023-04-15T11:35:36Z",
       "playbook_id" : "remove-node"
    }
  14. Collect the "id" value from the returned output. You will use this value in the next step.
    For the previous example, the "id" value is ac2324c5-0112-45f3-83e9-4f018d24ca57.
  15. After calling the scale down API, check the status of the operation through the API below:
    NOTE: The CMO Platform Manager TOKEN may expire. If it does, refresh it by running:
    export TOKEN=$(curl -L -X POST https://keycloak-http.atlantic/auth/realms/$KEYCLOAK_REALM/protocol/openid-connect/token -H 'Content-Type: application/x-www-form-urlencoded' --data-urlencode client_id=$KEYCLOAK_CLIENT --data-urlencode 'grant_type=password' --data-urlencode client_secret=$KEYCLOAK_CLIENT_SECRET --data-urlencode 'scope=openid' --data-urlencode username=$KEYCLOAK_USER --data-urlencode password=$KEYCLOAK_PASSWORD | jq -r '.access_token')
    curl --header "Content-Type: application/json" --header "Authorization: Bearer $TOKEN" --request GET https://<CMO_PLATFORM_MANAGER_IP>/v1/status/<ID> -k | jq
    When the operation is finished, the operation "state" is marked as "complete".
    NOTE: In certain situations, the status may show as Failed even though the node was removed successfully. Check the node status to confirm.
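The status check in step 15 usually has to be repeated until the operation finishes. The sketch below wraps it in a polling loop. It assumes that TOKEN is set as in step 1 and that a hypothetical variable CMO_PM_IP holds the CMO Platform Manager address, including a port if your environment requires one. The function names are hypothetical, and the exact spelling of the failure state is an assumption, so both failed and Failed are matched.

```shell
# Hypothetical helper: prints the "state" field for operation ID $1.
# Uses the same status endpoint as step 15. Assumes TOKEN and CMO_PM_IP are set.
fetch_state() {
  curl -s -k --header "Authorization: Bearer $TOKEN" \
    --request GET "https://$CMO_PM_IP/v1/status/$1" | jq -r '.state'
}

# Hypothetical helper: polls an operation until it finishes.
# $1 = operation ID, $2 = maximum number of attempts (default 120).
# Returns 0 on "complete", 2 on a failed state, 1 if the attempts run out.
wait_for_operation() {
  wfo_id=$1
  wfo_tries=${2:-120}
  wfo_i=0
  while [ "$wfo_i" -lt "$wfo_tries" ]; do
    wfo_state=$(fetch_state "$wfo_id")
    case "$wfo_state" in
      complete) return 0 ;;
      [Ff]ailed) return 2 ;;
    esac
    wfo_i=$((wfo_i + 1))
    sleep "${POLL_INTERVAL:-10}"
  done
  return 1
}

# Example:
#   wait_for_operation <ID> || echo "Operation did not complete; check the node status" >&2
```

A return value of 2 does not necessarily mean that the node is still in the cluster, so confirm the result against the node list. If the token expires during a long wait, fetch_state prints null and the loop eventually times out. Refresh TOKEN and run the function again.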
  16. Confirm that the node has been removed from the node list.
    kubectl get node
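For a scripted confirmation, you can rely on kubectl returning a nonzero exit status when the named node does not exist. The function name node_present in this sketch is hypothetical. A nonzero status can also result from an unreachable API server, so treat an absent result as a prompt to review the full node list.

```shell
# Hypothetical helper: returns 0 if the node named $1 is in the node list.
# kubectl exits with a nonzero status when the node is not found.
node_present() {
  kubectl get node "$1" >/dev/null 2>&1
}

# Example:
#   node_present <NODE_NAME> || echo "Node is no longer in the node list"
```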
  17. Perform any necessary maintenance on the node.
  18. On the node, create the scaleup.json file with the necessary details for the node.
    NOTE: When a node is added to a cluster, the /etc/hosts file for the added node may not be updated correctly, which causes issues when the cluster is upgraded. To avoid failures during the upgrade process, perform the following steps after adding a node:
    1. Retrieve the helmrepo service IP address.
      kubectl -n cmo get svc helmrepo
      For example:
      kubectl -n cmo get svc helmrepo
      NAME       TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)     AGE
      helmrepo   ClusterIP   172.43.174.187   <none>        30036/TCP   12d
    2. Add an entry for the service to the /etc/hosts file of the added node:

      <CLUSTER_IP> helmrepo
      For example:
      172.43.174.187 helmrepo
    Place this JSON payload on the node from which you are going to perform the scale up of the node.
    {
      "credentials": [{
        "name": "<HOSTNAME>",
        "type": "password",
        "password": "<PASSWORD>"
      }],
      "hosts": [{
        "hostname": "<NODE_HOSTNAME>",
        "managementhost": "<HOST_IP>",
        "kuberneteshost": "<HOST_IP>",
        "hostCredentials": "<HOST_CREDS>",
        "topology": {
            "role": "<controlplane or worker>"
        }
      }]
    }
    For example:
    {
      "credentials": [{
        "name": "mykey1",
        "type": "password",
        "password": "ChangeMe"
      }],
      "hosts": [{
        "hostname": "hostname6",
        "managementhost": "10.236.227.213",
        "kuberneteshost": "10.236.227.213",
        "hostCredentials": "mykey1",
        "topology": {
            "role": "controlplane" 
        }
      }]
    }
  19. Call the CMO Platform Manager API to initiate the scaling-up operation.

    Run this command from the directory where the scaleup.json file exists.

    curl --header "Content-Type: application/json" --header "Authorization: Bearer $TOKEN"  --request POST --data @scaleup.json https://<CMO_PLATFORM_MANAGER_IP>/v3/clusters/nodes -v -k | json_pp
    For example:
    curl --header "Content-Type: application/json" --header "Authorization: Bearer $TOKEN"  --request POST --data @scaleup.json https://10.43.78.77:7070/v3/clusters/nodes -v -k | json_pp
    ......
    {
       "created_at" : "2023-04-15T11:35:35Z",
       "completed_tasks" : 0,
       "total_tasks" : 273,
       "recap" : {
          "hosts" : {}
       },
       "id" : "286bdb32-ff07-4e46-947e-e4c9e9b98338",
       "link" : {
          "href" : "https://0.0.0.0:8080/v1/status/286bdb32-ff07-4e46-947e-e4c9e9b98338",
          "rel" : "self"
       },
       "logs" : "",
       "state" : "created",
       "updated_at" : "2023-04-15T11:35:36Z",
       "playbook_id" : "scale"
    }
  20. Collect the "id" value from the returned output. You will use this value in the next step.
    For the previous example, the "id" value is 286bdb32-ff07-4e46-947e-e4c9e9b98338.
  21. After calling the scale up API, check the status of the operation:
    NOTE: The CMO Platform Manager TOKEN may expire. If it does, refresh it by running:
    export TOKEN=$(curl -L -X POST https://keycloak-http.atlantic/auth/realms/$KEYCLOAK_REALM/protocol/openid-connect/token -H 'Content-Type: application/x-www-form-urlencoded' --data-urlencode client_id=$KEYCLOAK_CLIENT --data-urlencode 'grant_type=password' --data-urlencode client_secret=$KEYCLOAK_CLIENT_SECRET --data-urlencode 'scope=openid' --data-urlencode username=$KEYCLOAK_USER --data-urlencode password=$KEYCLOAK_PASSWORD | jq -r '.access_token')
    curl --header "Content-Type: application/json" --header "Authorization: Bearer $TOKEN" --request GET https://<CMO_PLATFORM_MANAGER_IP>/v1/status/<ID> -k | jq 

    When the operation is finished, the operation "state" is marked as "complete".

  22. Confirm that the new node appears in the node list.
    kubectl get node
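The note in step 18 asks you to add a helmrepo entry to /etc/hosts on the node after it has been added. The following sketch makes that edit idempotent, so running it twice does not create a duplicate line. The function name ensure_hosts_entry is hypothetical. If an entry for the name already exists with a different address, the sketch leaves it unchanged, so review the file in that case.

```shell
# Hypothetical helper: appends "<ip> <name>" to a hosts file unless a
# non-comment line already lists <name> as a hostname.
# $1 = IP address, $2 = hostname, $3 = hosts file (default: /etc/hosts).
ensure_hosts_entry() {
  ehe_ip=$1
  ehe_name=$2
  ehe_file=${3:-/etc/hosts}
  # Look for the name as a whole field after the address on non-comment lines.
  if awk -v n="$ehe_name" '
       !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
       END { exit !found }' "$ehe_file" 2>/dev/null; then
    return 0
  fi
  printf '%s %s\n' "$ehe_ip" "$ehe_name" >> "$ehe_file"
}

# Example (run as a user who can write to /etc/hosts on the added node):
#   ensure_hosts_entry <CLUSTER_IP> helmrepo
```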
