ViPR SRM - Alerting - Average CPU over 5 minute period
I have a special request to write an alert that will trigger when the utilization (or CPU) averages over 90% for a 5 minute period.
As things are configured today the system spikes over 90% and generates an e-mail to the team, but the e-mails are really meaningless as the spikes quickly drop off. But, if the system remains over 90% for a period of time we could have a real issue.
Has anyone else written an alert like this one? (Any help or pointers on the correct documentation to review would be appreciated!)
Thanks,
E
isakats
July 31st, 2018 07:00
Hi enwillsonII,
You can do this with the Time Window operation; see the Sustained Average Comparator in the examples.
Regards,
Isaka
enwillsonII
August 1st, 2018 07:00
Isaka, Thank you for the quick response!
So, I just want to be sure that I understand -
The flow you've documented says that
1. Take the average over the last 15 minutes.
2. Get the sustained average over the last 5 minutes.
3. Compare the sustained average to a benchmark (in this case 90%)
4. Log an alert if over 90 (Send e-mail)
alt 4. Clear log alert
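As a rough sketch of the flow listed above (not SRM's actual implementation), the same logic can be written in a few lines of Python. The class name `WindowAverage`, the function `evaluate`, the 60-second sample interval, and the choice to wait until a full window of data has been seen are all assumptions made for illustration; the window length is a parameter.

```python
from collections import deque


class WindowAverage:
    """Hypothetical helper: mean of the samples seen in the last `window_s` seconds."""

    def __init__(self, window_s):
        self.window_s = window_s
        self.samples = deque()   # (timestamp, value) pairs, oldest first
        self.first_ts = None     # timestamp of the very first sample

    def add(self, ts, value):
        """Record a sample; return the window average, or None until a
        full window of data has been observed."""
        if self.first_ts is None:
            self.first_ts = ts
        self.samples.append((ts, value))
        # Drop samples that have fallen out of the window (ts - window_s, ts].
        while self.samples[0][0] <= ts - self.window_s:
            self.samples.popleft()
        if ts - self.first_ts < self.window_s:
            return None
        return sum(v for _, v in self.samples) / len(self.samples)


def evaluate(avg, threshold=90.0):
    """Compare the averaged value to the benchmark: 'set' raises, 'clear' clears."""
    return "set" if avg > threshold else "clear"


def run(values, window_s=300, step_s=60):
    """Feed evenly spaced samples and return the decision for the last one."""
    w = WindowAverage(window_s)
    avg = None
    for i, v in enumerate(values):
        avg = w.add(i * step_s, v)
    return None if avg is None else evaluate(avg)
```

With this sketch, a single spike such as `run([50, 50, 50, 50, 50, 99])` stays in the "clear" state because the window average remains well under 90, while `run([95] * 6)` produces "set".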
(I'm trying to use this to monitor CPU on VPLEX, which is a little different as well, especially when gathering the initial metric.)
If I would like this alert to send an e-mail I would just substitute the "Log Alert Set" with an e-mail and I could leave the "Log alert clear" as the alternative.
BTW - is there a manual that specifically documents all of the parameters for the components in the "flowchart" method to write these custom alerts?
isakats
August 3rd, 2018 07:00
Hi enwillsonII,
You are correct, except on #2 it takes the sustained average of the last 15 minutes.
For VPLEX CPU, you can refine the filter in the reporting UI by using the advanced search to narrow down your results, or by looking at the filters used in a report in browse mode.
Something like source=='VPLEX-Collector' & name=='CurrentUtilization' & parttype=='Processor' should be a good start.
Indeed, you can replace the log with an email action. Note that the clear will only trigger if the alert has been triggered and the value goes below the comparator.
Also, pay attention to the stateless setting in the comparator, which affects how often the alert is triggered. If stateless is enabled, the alert will only trigger when the value crosses the threshold; if it is not, every value above the threshold will trigger an alert, which can make for a lot of notifications in some cases.
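To make the difference in notification volume concrete, here is a minimal Python sketch of the two behaviours as described above. It is not SRM code: the function name `notifications` and the `crossing_only` flag are hypothetical stand-ins for the comparator and its setting, and the clear is modelled as firing only after a set, when the value drops back to or below the threshold.

```python
def notifications(values, threshold=90.0, crossing_only=True):
    """Return the (action, value) events a comparator would emit.

    crossing_only=True  -> emit 'set' only when the value crosses above
                           the threshold.
    crossing_only=False -> emit 'set' for every value above the threshold.
    A 'clear' is emitted only if the alert was previously set and the
    value is no longer above the threshold.
    """
    events = []
    active = False  # True while the alert is raised
    for v in values:
        above = v > threshold
        if above and (not active or not crossing_only):
            events.append(("set", v))
        elif not above and active:
            events.append(("clear", v))
        active = above
    return events


samples = [85, 92, 95, 93, 88, 80]
print(notifications(samples, crossing_only=True))
print(notifications(samples, crossing_only=False))
```

For the sample series, the crossing-only mode yields one set and one clear, while the other mode yields three sets before the clear, which is where the extra e-mails come from.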
Yes, there is a manual available in the Documentation.
Regards,
Isaka
enwillsonII
August 6th, 2018 08:00
Thank you again Isaka!
I am trying to create a custom grouping, but for some reason I don't have an icon to add another grouping, and when I modify one of the examples and then save the new grouping, I receive an error. Here are a few screenshots:
(I've attempted the add as both myself and as the admin user to ensure that this isn't a permissions issue.)
No add button:
My alert design:
Setting up to save the report:
Error:
Any thoughts? Maybe I am doing something wrong in my report? I can't figure it out.
Thank you again for your help!
isakats
August 7th, 2018 08:00
Hi enwillsonII,
I haven't been able to replicate the error you got. You can add a new grouped box when you are in the "Grouped Box" by clicking on the green "+".
Regards,
Isaka