GSAN degraded during maintenance window
Hi,
I ran into some log messages concerning the gsan status. This came up because 3 AVEs reported gsan as degraded in our monitoring solution. On closer inspection, I saw entries in the dpnctl log that report gsan as degraded. This seems to correlate with either a checkpoint or an hfscheck running, but I am not sure why, since not every AVE reports the issue (40+ AVEs in use).
cplist
cp.20220511190341 Wed May 11 21:03:41 2022 valid rol --- nodes 1/1 stripes 9357
cp.20220511213024 Wed May 11 23:30:24 2022 valid --- --- nodes 1/1 stripes 9357
status.dpn
Last checkpoint: cp.20220511213024 finished Wed May 11 23:47:48 2022 after 17m 24s (OK)
Last GC: finished Wed May 11 20:03:41 2022 after 03m 17s >> recovered 442.66 MB (OK)
Last hfscheck: finished Wed May 11 23:30:00 2022 after 02h 15m >> checked 3853 of 3853 stripes (OK)
Maintenance windows scheduler capacity profile is active.
The backup window is currently running.
Next backup window start time: Fri May 13 06:00:00 2022 CEST
Next maintenance window start time: Thu May 12 20:00:00 2022 CEST
parts from dpnctl.log
2022/05/11-21:33:30 server degraded: checkpoint accessmode(00pu+00pu+00pu)
2022/05/11-21:33:31 dpnctl: INFO: opstatus.dpn result: degraded: checkpoint accessmode(00pu+00pu+00pu)
2022/05/11-21:33:31 dpnctl: INFO: gsan status: degraded
2022/05/11-21:37:00 server degraded: checkpoint accessmode(00pu+00pu+00pu)
2022/05/11-21:37:00 dpnctl: INFO: opstatus.dpn result: degraded: checkpoint accessmode(00pu+00pu+00pu)
2022/05/11-21:37:00 dpnctl: INFO: gsan status: degraded
2022/05/11-21:39:46 server degraded: checkpoint accessmode(00pu+00pu+00pu)
2022/05/11-21:40:35 server degraded: checkpoint accessmode(00pu+00pu+00pu)
2022/05/11-21:40:35 dpnctl: INFO: opstatus.dpn result: degraded: checkpoint accessmode(00pu+00pu+00pu)
2022/05/11-21:40:35 dpnctl: INFO: gsan status: degraded
Has anyone had similar issues?
Thayneforbes
June 10th, 2022 09:00
This is expected behavior. The status 'degraded' is somewhat misleading; on a single node or AVE, 'read-only' would be more accurate. In your case, the gsan is set to read-only while a checkpoint is being generated and goes back to fullaccess when the checkpoint is complete. For reference, the gsan is also 'degraded' for a brief period at the beginning of hfscheck.
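The numbers in the original post can be used to sanity-check this. Below is a minimal Python sketch that tests whether the quoted "gsan status: degraded" entries fall inside the checkpoint window. The start time is taken from the checkpoint tag (cp.20220511213024) and the duration from the status.dpn output (17m 24s). It assumes that the dpnctl.log timestamps use the same clock as the checkpoint tag, which the thread does not confirm; the helper names are made up for illustration.

```python
from datetime import datetime, timedelta

# Values copied from the cplist / status.dpn output in the original post.
# Assumption: the cp tag and the dpnctl.log timestamps share one clock
# (the tag is two hours behind the CEST time shown by cplist).
CP_START = datetime.strptime("20220511213024", "%Y%m%d%H%M%S")
CP_DURATION = timedelta(minutes=17, seconds=24)

# The three "gsan status: degraded" lines quoted from dpnctl.log.
DEGRADED_LINES = [
    "2022/05/11-21:33:31 dpnctl: INFO: gsan status: degraded",
    "2022/05/11-21:37:00 dpnctl: INFO: gsan status: degraded",
    "2022/05/11-21:40:35 dpnctl: INFO: gsan status: degraded",
]


def log_time(line):
    """Parse the leading 'YYYY/MM/DD-HH:MM:SS' timestamp of a log line."""
    return datetime.strptime(line.split(" ", 1)[0], "%Y/%m/%d-%H:%M:%S")


def in_checkpoint_window(line, start=CP_START, duration=CP_DURATION):
    """Return True if the log line was written while the checkpoint ran."""
    return start <= log_time(line) <= start + duration


if __name__ == "__main__":
    for line in DEGRADED_LINES:
        print(log_time(line), in_checkpoint_window(line))
```

If the clock assumption holds, all three entries land between the checkpoint start and its end, which matches the read-only explanation.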
Ilavarasan IC
May 20th, 2022 21:00
Could you check the gsan status with "dpnctl status"? This could be a momentary message that appears while a checkpoint is being taken.
Andy4223
May 23rd, 2022 02:00
Hi,
so the monitoring only raises the gsan messages during checkpoint creation. Normally the gsan status is up, so there is nothing to worry about. I just wanted to know whether the degradation is normal behavior during checkpoint creation. If so, I need to come up with a plan to keep it from triggering alarms in monitoring, since this happens daily with 40+ AVEs in use.
After the maintenance is done, gsan is healthy again, so I guess this is normal behavior.
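One possible way to keep monitoring quiet is to look at the reason that dpnctl logs next to the degraded status and only alert when it is not checkpoint related. The following is a rough Python sketch, not a tested integration. The regular expression is based solely on the dpnctl.log lines quoted in this thread, and the function name, the return labels, and the list of expected reasons are hypothetical. The reason string logged at the start of hfscheck is not shown in the thread, so it is not included.

```python
import re

# Pattern derived only from the dpnctl.log lines quoted in this thread.
DEGRADED_RE = re.compile(r"opstatus\.dpn result: degraded: (?P<reason>.*)$")

# Hypothetical allow-list: reasons treated as normal maintenance activity.
# Only the checkpoint reason appears in the thread; extend it after
# checking what your own logs contain.
EXPECTED_REASON_PREFIXES = ("checkpoint",)


def classify(line):
    """Classify one dpnctl.log line as 'ignore', 'expected', or 'alert'."""
    match = DEGRADED_RE.search(line)
    if match is None:
        # Not an opstatus.dpn degraded result, nothing to decide here.
        return "ignore"
    reason = match.group("reason").strip()
    if reason.startswith(EXPECTED_REASON_PREFIXES):
        # Degraded because of a checkpoint, i.e. temporarily read-only.
        return "expected"
    return "alert"


if __name__ == "__main__":
    sample = ("2022/05/11-21:33:31 dpnctl: INFO: opstatus.dpn result: "
              "degraded: checkpoint accessmode(00pu+00pu+00pu)")
    print(classify(sample))
```

A filter like this still lets a degraded status with any other reason through, which is safer than muting all gsan alarms for the whole maintenance window.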