Alerting rule configuration for monitoring AlertManager
1: AlertManager job alert
Prometheus AlertManager job missing

  - alert: PrometheusAlertmanagerJobMissing
    expr: absent(up{job="alertmanager"})
    for: 0m
    labels:
      severity: warning
    annotations:
      summary: Prometheus AlertManager job missing (instance {{ $labels.instance }})
      description: "A Prometheus AlertManager job has disappeared\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"

expr: absent(up{job="alertmanager"})
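Explanation:
absent() tests for non-existence: it returns a single sample with value 1 when the selector inside it matches no series, and returns nothing when at least one series matches.
up{job="alertmanager"} selects the up series of the scrape job labeled alertmanager. Prometheus records up for every target it scrapes, so if no such series exists, no target of that job is being scraped at all.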
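The behavior of absent() can be illustrated with a small Python model. This is only a sketch of the semantics, not Prometheus code: the select and absent helpers, the label dictionaries, and the target addresses are all hypothetical, and only equality matchers are modeled.

```python
from typing import Dict, List

Labels = Dict[str, str]


def select(series: List[Labels], name: str, matchers: Labels) -> List[Labels]:
    """Keep series whose metric name and labels equal the given matchers."""
    return [
        s for s in series
        if s.get("__name__") == name
        and all(s.get(k) == v for k, v in matchers.items())
    ]


def absent(selected: List[Labels], matchers: Labels) -> List[dict]:
    """Model absent(): nothing if any series matched, else one sample of value 1."""
    if selected:
        return []
    # The single output sample carries the labels of the equality matchers.
    return [{"labels": dict(matchers), "value": 1}]


# Hypothetical scrape state: only a node exporter target exists.
targets = [{"__name__": "up", "job": "node", "instance": "node1:9100"}]
want = {"job": "alertmanager"}

missing = absent(select(targets, "up", want), want)
print(missing)  # one sample, so the alert condition holds

targets.append({"__name__": "up", "job": "alertmanager", "instance": "am1:9093"})
present = absent(select(targets, "up", want), want)
print(present)  # empty, so the alert condition does not hold
```

Note that the sample produced by absent() only carries the labels named in the selector (here job), so {{ $labels.instance }} in the summary annotation is likely to render as an empty string for this rule.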
2: AlertManager configuration reload monitoring
Prometheus AlertManager configuration reload failure

  - alert: PrometheusAlertmanagerConfigurationReloadFailure
    expr: alertmanager_config_last_reload_successful != 1
    for: 0m
    labels:
      severity: warning
    annotations:
      summary: Prometheus AlertManager configuration reload failure (instance {{ $labels.instance }})
      description: "AlertManager configuration reload error\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"

expr: alertmanager_config_last_reload_successful != 1
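This checks whether the most recent configuration reload succeeded: the metric is 1 after a successful reload, so any other value triggers the alert.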
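The comparison in this rule acts as a filter: only samples whose value is not 1 remain, and each remaining sample becomes an alert. A minimal Python sketch of that filtering follows; the not_equal helper and the two instances with their values are made up for illustration.

```python
from typing import Dict, List, Tuple

Sample = Tuple[Dict[str, str], float]


def not_equal(samples: List[Sample], scalar: float) -> List[Sample]:
    """Model `vector != scalar`: keep only samples whose value differs."""
    return [(labels, value) for labels, value in samples if value != scalar]


# Hypothetical gauge values from two AlertManager instances.
reload_ok = [
    ({"instance": "am1:9093", "job": "alertmanager"}, 1.0),
    ({"instance": "am2:9093", "job": "alertmanager"}, 0.0),
]

failing = not_equal(reload_ok, 1)
for labels, value in failing:
    # Only the instance whose last reload failed is left.
    print(labels["instance"], value)
```

Because the rule uses for: 0m, an instance that shows up in this filtered result starts firing at the first rule evaluation that sees it, with no waiting period.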