Prometheus: Monitoring Target Presence

韵味老鸟 2024-08-12 18:06:34

1: Missing targets

Prometheus target empty

  - alert: PrometheusTargetEmpty
    expr: prometheus_sd_discovered_targets == 0
    for: 0m
    labels:
      severity: critical
    annotations:
      summary: Prometheus target empty (instance {{ $labels.instance }})
      description: "Prometheus has no target in service discovery\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"

expr: prometheus_sd_discovered_targets == 0

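Explanation:

prometheus_sd_discovered_targets is a metric that reports the number of targets Prometheus has discovered through its service discovery mechanisms. When it equals 0, no targets were found, and because the rule uses for: 0m the alert fires immediately.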
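To make the behaviour of `== 0` concrete, here is a minimal Python sketch of how a PromQL comparison filters series: only the series whose value equals 0 survive, and each surviving series produces its own alert. The label names (`config`, `name`) and the sample values are assumptions made up for illustration, not data from a real server.

```python
# Hypothetical snapshot of prometheus_sd_discovered_targets.
# Keys are (config, name) label pairs; values are discovered-target counts.
samples = {
    ("node", "scrape"): 12.0,
    ("blackbox", "scrape"): 0.0,
}


def filter_equal(series, threshold):
    """Mimic a PromQL `== threshold` filter: keep only the matching series."""
    return {labels: value for labels, value in series.items() if value == threshold}


# Series that would trigger PrometheusTargetEmpty in this snapshot.
firing = filter_equal(samples, 0.0)

for (config, name), value in firing.items():
    print(f"alert for config={config} name={name} value={value}")
```

In this sketch only the series with no discovered targets triggers the alert, while the healthy series is filtered out.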

2: Target scraping slows down

Prometheus target scraping slow

  - alert: PrometheusTargetScrapingSlow
    expr: prometheus_target_interval_length_seconds{quantile="0.9"} / on (interval, instance, job) prometheus_target_interval_length_seconds{quantile="0.5"} > 1.05
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: Prometheus target scraping slow (instance {{ $labels.instance }})
      description: "Prometheus is scraping exporters slowly since it exceeded the requested interval time. Your Prometheus server is under-provisioned.\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"

expr: prometheus_target_interval_length_seconds{quantile="0.9"} / on (interval, instance, job) prometheus_target_interval_length_seconds{quantile="0.5"} > 1.05

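Explanation:

This expression is the condition of a Prometheus alerting rule that detects when the interval between scrapes of a target becomes irregular.

prometheus_target_interval_length_seconds is a metric that reports the actual interval between consecutive scrapes of a target.

{quantile="0.9"} and {quantile="0.5"} select the 90th percentile and the 50th percentile (the median) of that interval.

/ on (interval, instance, job) restricts the division to pairs of series whose interval, instance, and job labels are identical. This is required because the two operands carry different quantile labels and would otherwise never match.

> 1.05 keeps only the series whose 90th percentile is more than 5% above the median; with for: 5m, the alert fires once that condition has held for five minutes.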
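The matching and threshold logic can be sketched in Python as follows. This is only an illustration of one-to-one vector matching with on (...), not how Prometheus implements it; the helper names (`match_key`, `divide_on`) and all sample values are hypothetical.

```python
ON_LABELS = ("interval", "instance", "job")


def match_key(labels, on=ON_LABELS):
    """Build the matching key from the labels listed in `on` only."""
    return tuple(labels[name] for name in on)


def divide_on(left, right, on=ON_LABELS):
    """Divide each left sample by the right sample that shares the `on` labels.

    Samples without a partner are dropped, and the result keeps only the
    `on` labels, so the differing `quantile` label disappears.
    """
    right_by_key = {match_key(labels, on): value for labels, value in right}
    result = []
    for labels, value in left:
        key = match_key(labels, on)
        if key in right_by_key:
            result.append((dict(zip(on, key)), value / right_by_key[key]))
    return result


# Hypothetical samples: (labels, seconds between scrapes).
base = {"instance": "localhost:9090", "job": "prometheus"}
p90 = [
    ({**base, "interval": "15s", "quantile": "0.9"}, 15.9),
    ({**base, "interval": "30s", "quantile": "0.9"}, 30.01),
]
p50 = [
    ({**base, "interval": "15s", "quantile": "0.5"}, 15.0),
    ({**base, "interval": "30s", "quantile": "0.5"}, 30.0),
]

ratios = divide_on(p90, p50)
# Same filter as `> 1.05` in the rule: keep only the slow series.
slow = [(labels, ratio) for labels, ratio in ratios if ratio > 1.05]

for labels, ratio in slow:
    print(labels, round(ratio, 3))
```

With these made-up numbers, only the 15s interval exceeds the 5% threshold; the 30s interval stays close to its median and is filtered out.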
