-
Hello, I am trying to build an alerting system based on Prometheus/Alertmanager for a small team that handles a few cloud environments. As a cost-saving measure, we have implemented schedules that shut down our non-production VMs on weekends and during specific hours of the work week (the timings vary by environment). Right now Alertmanager fires missing-job-target alerts during those times, and after some research I haven't found an adequate way of handling this. I essentially need silences that repeat on given days, during given hours, indefinitely, like a repeating event in a calendar app. I looked into mute_time_intervals, but that doesn't seem to let me specify different timings for different environments (differentiated by labels). We are currently using Grafana Alertmanager, which does allow multiple notification policies for different label matchers, and that does the job, but I'm not seeing a way to achieve that in Prometheus Alertmanager.
-
Use a recording rule to define a new metric and use that for an inhibition rule. See https://medium.com/@tom.fawcett/time-of-day-based-notifications-with-prometheus-and-alertmanager-1bf7a23b7695 for an example. You could also use an exporter to provide such a metric; see https://github.com/allangood/holiday_exporter for an example. Also consider having a look at https://groups.google.com/g/prometheus-users, where topics like this one have very likely been discussed already.
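A minimal sketch of the inhibition idea, under some assumptions: instead of a recording rule it uses a helper alert that fires during the shutdown window, since Alertmanager inhibits on alerts rather than on raw metrics. The names QuietHours and TargetDown, the env label, and the hours are all hypothetical, and the PromQL time functions work in UTC, so the hours need shifting for your time zone. On the Prometheus side, roughly:

```yaml
# Prometheus rule file (hypothetical names and hours; times are UTC)
groups:
  - name: quiet-hours
    rules:
      # Fires all weekend, and on weekdays before 06:00 or from 18:00 onward.
      - alert: QuietHours
        expr: day_of_week() == 0 or day_of_week() == 6 or hour() < 6 or hour() >= 18
        labels:
          env: staging   # add one such rule per environment, each with its own hours
```

And on the Alertmanager side, assuming the target-down alert also carries an env label:

```yaml
# Alertmanager config fragment (hypothetical alert and receiver names)
route:
  receiver: team-default
  routes:
    # Keep the helper alert itself from notifying anyone.
    - matchers:
        - alertname="QuietHours"
      receiver: "null"
inhibit_rules:
  # While QuietHours fires for an env, suppress TargetDown for the same env.
  - source_matchers:
      - alertname="QuietHours"
    target_matchers:
      - alertname="TargetDown"
    equal: ['env']
receivers:
  - name: team-default
  - name: "null"
```

Because the inhibit rule only applies when env matches on both alerts, each environment can get its own schedule by adding another QuietHours-style rule with a different expression and env value.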
-
Well, for simple cases there is also mute_time_intervals, see https://prometheus.io/docs/alerting/latest/configuration/#route
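For what it's worth, mute_time_intervals is set per route, so different environments can get different schedules by giving each its own child route. A rough sketch, assuming an env label on the alerts and a reasonably recent Alertmanager (as far as I know, the top-level time_intervals key and the location field are only available in newer releases); the interval names, hours, and time zone are made up for illustration:

```yaml
# Alertmanager config fragment (hypothetical names, hours and time zone)
time_intervals:
  - name: staging-off-hours
    time_intervals:
      - weekdays: ['saturday', 'sunday']
        location: 'Europe/Berlin'
      - weekdays: ['monday:friday']
        times:
          - start_time: '00:00'
            end_time: '07:00'
          - start_time: '19:00'
            end_time: '24:00'
        location: 'Europe/Berlin'
  - name: dev-off-hours
    time_intervals:
      - weekdays: ['friday:sunday']
        location: 'Europe/Berlin'

route:
  receiver: team-default
  routes:
    # Each environment is matched by label and muted on its own schedule.
    - matchers:
        - env="staging"
      mute_time_intervals: ['staging-off-hours']
    - matchers:
        - env="dev"
      mute_time_intervals: ['dev-off-hours']

receivers:
  - name: team-default
```

Note that the root route cannot carry mute_time_intervals itself, so the schedules have to live on child routes like the ones above.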