The alert can be seen in the ObjectScale UI, on the Object Store Health dashboard, in the 'Issues' tab.
vSphere
Example below from vSphere: under Objectstore --> Dashboard --> Issues
OpenShift
Example below from OpenShift: under Objectstore --> Health
Telegraf or Fluxd at the object store level is not working, or the InfluxDB instances at the object store level do not accept writes or are unable to process read requests.
The alert is triggered by the object store event service. The service sends a Flux query to Fluxd to check whether Telegraf is sending new measurements to InfluxDB.
If you have the above alert in your environment, run the steps below from a jumpbox or service node with kubectl installed, and provide the details to Dell Technologies support. In all the commands below, replace objectstore with the actual object store name.
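As a convenience, the name substitution described above can be sketched as a small shell helper that prints the per-object-store kubectl commands from this article for a given name. The function name print_check_commands is hypothetical and is not part of ObjectScale; it only prints the commands and does not run them. The operator check is not included because it does not depend on the object store name.

```shell
#!/bin/sh
# Hypothetical helper (not part of ObjectScale): prints the kubectl commands
# from this article with the actual object store name substituted in.
print_check_commands() {
    os="$1"   # actual object store name, for example demo-corkboy
    if [ -z "$os" ]; then
        echo "usage: print_check_commands <objectstore>" >&2
        return 1
    fi
    for component in telegraf influxdb fluxd; do
        case "$component" in
            influxdb)
                # InfluxDB is managed as a custom resource plus a statefulset
                echo "kubectl get influxdb ${os}-influxdb"
                echo "kubectl get statefulset ${os}-influxdb"
                ;;
            *)
                # Telegraf and Fluxd run as deployments
                echo "kubectl get deployment ${os}-${component}"
                ;;
        esac
        echo "kubectl get pod -l app.kubernetes.io/name=${os}-${component}"
    done
    echo "kubectl get pvc -l app.kubernetes.io/name=${os}-influxdb"
}
```

For example, print_check_commands demo-corkboy prints the commands with demo-corkboy-telegraf, demo-corkboy-influxdb, and demo-corkboy-fluxd filled in, ready to copy and run.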
- Validate that <objectstore>-telegraf pods are running. Pods should be in ready state and not frequently restarting.
Commands:
# kubectl get deployment objectstore-telegraf
# kubectl get pod -l app.kubernetes.io/name=objectstore-telegraf
(where objectstore is the name of the object store that was created)
Example:
# kubectl get deployment demo-corkboy-telegraf
NAME READY UP-TO-DATE AVAILABLE AGE
demo-corkboy-telegraf 3/3 3 3 4h15m
# kubectl get pod -l app.kubernetes.io/name=demo-corkboy-telegraf
NAME READY STATUS RESTARTS AGE
demo-corkboy-telegraf-6fbb6d7bbc-4l4wd 3/3 Running 0 4h17m
demo-corkboy-telegraf-6fbb6d7bbc-bvqw2 3/3 Running 0 4h17m
demo-corkboy-telegraf-6fbb6d7bbc-qkk9z 3/3 Running 0 4h17m
- Validate that <objectstore>-influxdb resources exist and all pods are running. Pods should be in ready state and not frequently restarting. If a resource is missing, go to step 3 (the InfluxDB operator check). If pods are pending, go to step 4 (the PVC check).
Commands:
# kubectl get influxdb objectstore-influxdb
# kubectl get statefulset objectstore-influxdb
# kubectl get pod -l app.kubernetes.io/name=objectstore-influxdb
(where objectstore is the name of the object store that was created)
Example:
# kubectl get influxdb demo-corkboy-influxdb
NAME AGE
demo-corkboy-influxdb 4h38m
# kubectl get statefulset demo-corkboy-influxdb
NAME READY AGE
demo-corkboy-influxdb 3/3 4h39m
# kubectl get pod -l app.kubernetes.io/name=demo-corkboy-influxdb
NAME READY STATUS RESTARTS AGE
demo-corkboy-influxdb-0 5/5 Running 0 4h41m
demo-corkboy-influxdb-1 5/5 Running 0 4h41m
demo-corkboy-influxdb-2 5/5 Running 0 4h41m
- If the <objectstore>-influxdb 'influxdb' resource or statefulset is missing in the output of the previous step, validate that objectscale-manager-influxdb-operator is running.
Commands:
# kubectl get deployment objectscale-manager-influxdb-operator
# kubectl get pod -l app.kubernetes.io/name=objectscale-manager-influxdb-operator
Example:
# kubectl get deployment objectscale-manager-influxdb-operator
NAME READY UP-TO-DATE AVAILABLE AGE
objectscale-manager-influxdb-operator 1/1 1 1 5h4m
# kubectl get pod -l app.kubernetes.io/name=objectscale-manager-influxdb-operator
NAME READY STATUS RESTARTS AGE
objectscale-manager-influxdb-operator-56f65b6c54-n28w6 2/2 Running 0 5h4m
- If <objectstore>-influxdb pods are pending per step 2, validate that the PVCs for the InfluxDB pods are bound.
Command:
# kubectl get pvc -l app.kubernetes.io/name=objectstore-influxdb
(where objectstore is the name of the object store that was created)
Example:
# kubectl get pvc -l app.kubernetes.io/name=demo-corkboy-influxdb
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
demo-corkboy-influxdb-data-demo-corkboy-influxdb-0 Bound pvc-11d40155-3346-4a83-bff3-503a49a9f9fc 20Gi RWO objectscale-highly-available 4h54m
demo-corkboy-influxdb-data-demo-corkboy-influxdb-1 Bound pvc-045a869e-76a6-417e-ac7d-df2132b64a38 20Gi RWO objectscale-highly-available 4h54m
demo-corkboy-influxdb-data-demo-corkboy-influxdb-2 Bound pvc-39a772bd-cfce-40f2-9c88-eb1e3d4c9156 20Gi RWO objectscale-highly-available 4h54m
- Validate that <objectstore>-fluxd pods are running. Pods should be in ready state and not frequently restarting.
Commands:
# kubectl get deployment objectstore-fluxd
# kubectl get pod -l app.kubernetes.io/name=objectstore-fluxd
Example:
# kubectl get deployment demo-corkboy-fluxd
NAME READY UP-TO-DATE AVAILABLE AGE
demo-corkboy-fluxd 1/1 1 1 5h7m
# kubectl get pod -l app.kubernetes.io/name=demo-corkboy-fluxd
NAME READY STATUS RESTARTS AGE
demo-corkboy-fluxd-668cb6799f-dhmw5 3/3 Running 0 5h7m
- Open a service request with Dell Technologies support and include the results of the above commands.
- Note: When this issue is resolved, the system sends a clear alert OBJST-MON-4016.
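The pod checks in the steps above ask that pods be in ready state and not frequently restarting. A minimal sketch of how the `kubectl get pod` table output could be screened for this is shown below. The function name flag_unhealthy_pods is hypothetical, and the rule it applies is an assumption: any pod whose READY counts differ, whose STATUS is not Running, or whose RESTARTS count is nonzero is flagged for review, which is stricter than "frequently restarting".

```shell
#!/bin/sh
# Hypothetical helper (not part of ObjectScale): reads `kubectl get pod`
# table output on stdin and prints pods that are not fully ready, not
# Running, or have a nonzero restart count (assumed threshold).
# Exit status is 1 if any such pod is found, 0 otherwise.
flag_unhealthy_pods() {
    awk '
        BEGIN { bad = 0 }
        NR == 1 { next }                  # skip the header row
        NF < 4  { next }                  # skip blank or malformed rows
        {
            split($2, ready, "/")         # READY column, for example 3/3
            restarts = $4 + 0             # RESTARTS column as a number
            if (ready[1] != ready[2] || $3 != "Running" || restarts > 0) {
                print $1 ": READY=" $2 " STATUS=" $3 " RESTARTS=" $4
                bad = 1
            }
        }
        END { exit bad }
    '
}
```

A possible usage is to pipe one of the commands above into it, for example `kubectl get pod -l app.kubernetes.io/name=demo-corkboy-influxdb | flag_unhealthy_pods`. The full, unfiltered command output should still be provided to support.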