|
/etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-prometheus-k8s-rules.yaml > alertmanager.rules
|
alert: AlertmanagerConfigInconsistent
expr: count_values by(service) ("config_hash", alertmanager_config_hash{job="alertmanager-main",namespace="monitoring"}) / on(service) group_left() label_replace(max by(name, job, namespace, controller) (prometheus_operator_spec_replicas{controller="alertmanager",job="prometheus-operator",namespace="monitoring"}), "service", "alertmanager-$1", "name", "(.*)") != 1
for: 5m
labels:
severity: critical
annotations:
message: The configurations of the instances of the Alertmanager cluster `{{$labels.service}}` are out of sync.
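Should this ever fire, the disagreeing replica can be found by looking at the raw config hashes the rule aggregates; a minimal PromQL sketch using the same selectors as the expression above:

  # One series per Alertmanager replica; the hash values should all be identical.
  alertmanager_config_hash{job="alertmanager-main", namespace="monitoring"}

  # Distinct config hashes grouped per service; more than one resulting series means the replicas disagree.
  count_values by (service) ("config_hash", alertmanager_config_hash{job="alertmanager-main", namespace="monitoring"})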
|
/etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-prometheus-k8s-rules.yaml > general.rules
|
Labels:      alertname="TargetDown" job="node-exporter" namespace="monitoring" service="node-exporter" severity="warning"
State:       firing (active since 2026-05-16 13:37:45.022073385 +0000 UTC)
Value:       100
message:     100% of the node-exporter/node-exporter targets in monitoring namespace are down.

Labels:      alertname="TargetDown" job="kube-scheduler" namespace="kube-system" service="kube-scheduler-prometheus-discovery" severity="warning"
State:       firing (active since 2026-05-16 12:27:45 +0000 UTC)
Value:       100
message:     100% of the kube-scheduler/kube-scheduler-prometheus-discovery targets in kube-system namespace are down.

Labels:      alertname="TargetDown" job="kube-controller-manager" namespace="kube-system" service="kube-controller-manager-prometheus-discovery" severity="warning"
State:       firing (active since 2026-03-29 19:46:15 +0000 UTC)
Value:       100
message:     100% of the kube-controller-manager/kube-controller-manager-prometheus-discovery targets in kube-system namespace are down.
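All three entries report 100% of their targets down. The TargetDown rule body is collapsed in this dump; the upstream kubernetes-mixin expression is roughly the sketch below (this cluster's copy may differ slightly), and querying up directly lists the concrete targets that are failing:

  # Percentage of down targets per job/namespace/service (sketch of the upstream TargetDown expression).
  100 * (count(up == 0) by (job, namespace, service) / count(up) by (job, namespace, service)) > 10

  # The individual scrape targets behind one of the affected services.
  up{job="kube-scheduler", namespace="kube-system"} == 0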
|
alert: Watchdog
expr: vector(1)
labels:
severity: none
annotations:
message: |
This is an alert meant to ensure that the entire alerting pipeline is functional.
This alert is always firing, therefore it should always be firing in Alertmanager
and always fire against a receiver. There are integrations with various notification
mechanisms that send a notification when this alert is not firing. For example the
"DeadMansSnitch" integration in PagerDuty.
Labels:      alertname="Watchdog" severity="none"
State:       firing (active since 2026-05-16 13:37:15.022073385 +0000 UTC)
Value:       1
message:     This is an alert meant to ensure that the entire alerting pipeline is functional.
             This alert is always firing, therefore it should always be firing in Alertmanager
             and always fire against a receiver. There are integrations with various notification
             mechanisms that send a notification when this alert is not firing. For example the
             "DeadMansSnitch" integration in PagerDuty.
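Watchdog is a dead man's switch: its value only matters end to end, so the interesting check is whether the alert keeps arriving at Alertmanager and the receiver. That Prometheus itself is still evaluating it can be confirmed from the built-in ALERTS series; a minimal sketch:

  # Should always return exactly one firing series; an empty result means rule evaluation itself is broken.
  ALERTS{alertname="Watchdog", alertstate="firing"}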
|
/etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-prometheus-k8s-rules.yaml > kube-apiserver-slos
|
Labels:      alertname="KubeAPIErrorBudgetBurn" long="1d" severity="warning" short="2h"
State:       firing (active since 2026-05-16 12:21:32 +0000 UTC)
Value:       0.25605684302617177
message:     The API server is burning too much error budget
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapierrorbudgetburn

Labels:      alertname="KubeAPIErrorBudgetBurn" long="3d" severity="warning" short="6h"
State:       pending (active since 2026-05-16 12:22:12.100998003 +0000 UTC)
Value:       0.2560568430261718
message:     The API server is burning too much error budget
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapierrorbudgetburn
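Both entries come from the multiwindow, multi-burn-rate SLO rules: the alert only fires when the error budget is being burned too fast over a long window and its paired short window at the same time, which filters out short blips. The rule bodies are collapsed here; for the 1d/2h warning pair the upstream kubernetes-mixin expression looks roughly like this, assuming the standard apiserver_request:burnrate* recording rules:

  # Burn rate must exceed 3x the 1% error budget over both windows
  # (the Value shown above is the burn rate of the long window).
  sum(apiserver_request:burnrate1d) > (3.00 * 0.01000)
  and
  sum(apiserver_request:burnrate2h) > (3.00 * 0.01000)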
|
/etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-prometheus-k8s-rules.yaml > kube-state-metrics
|
Labels:      alertname="KubeStateMetricsListErrors" severity="critical"
State:       firing (active since 2026-05-16 12:23:00 +0000 UTC)
Value:       0.983173076923077
message:     kube-state-metrics is experiencing errors at an elevated rate in list operations. This is likely causing it to not be able to expose metrics about Kubernetes objects correctly or at all.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubestatemetricslisterrors

Labels:      alertname="KubeStateMetricsWatchErrors" severity="critical"
State:       pending (active since 2026-05-16 14:03:30.417432371 +0000 UTC)
Value:       0.8622754491017964
message:     kube-state-metrics is experiencing errors at an elevated rate in watch operations. This is likely causing it to not be able to expose metrics about Kubernetes objects correctly or at all.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubestatemetricswatcherrors
|
/etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-prometheus-k8s-rules.yaml > kubernetes-apps
|
Labels:      alertname="KubePodCrashLooping" container="kube-rbac-proxy" instance="10.42.0.188:8443" job="kube-state-metrics" namespace="monitoring" pod="node-exporter-kgllz" severity="warning"
State:       firing (active since 2026-05-16 13:38:05.095800493 +0000 UTC)
Value:       4.444444444444445
message:     Pod monitoring/node-exporter-kgllz (kube-rbac-proxy) is restarting 4.44 times / 5 minutes.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodcrashlooping

Labels:      alertname="KubePodCrashLooping" container="cert-manager-cainjector" instance="10.42.0.188:8443" job="kube-state-metrics" namespace="cert-manager" pod="cert-manager-cainjector-5c9c699bdd-tljtb" severity="warning"
State:       pending (active since 2026-05-16 14:05:05.095800493 +0000 UTC)
Value:       1.1111111111111112
message:     Pod cert-manager/cert-manager-cainjector-5c9c699bdd-tljtb (cert-manager-cainjector) is restarting 1.11 times / 5 minutes.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodcrashlooping

Labels:      alertname="KubePodCrashLooping" container="system-upgrade-controller" instance="10.42.0.188:8443" job="kube-state-metrics" namespace="system-upgrade" pod="system-upgrade-controller-6c7dc6d998-ft75d" severity="warning"
State:       pending (active since 2026-05-16 14:05:05.095800493 +0000 UTC)
Value:       1.1111111111111112
message:     Pod system-upgrade/system-upgrade-controller-6c7dc6d998-ft75d (system-upgrade-controller) is restarting 1.11 times / 5 minutes.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodcrashlooping

Labels:      alertname="KubePodCrashLooping" container="kube-rbac-proxy" instance="10.42.0.188:8443" job="kube-state-metrics" namespace="monitoring" pod="node-exporter-t242n" severity="warning"
State:       firing (active since 2026-05-16 13:51:05.095800493 +0000 UTC)
Value:       2.2222222222222223
message:     Pod monitoring/node-exporter-t242n (kube-rbac-proxy) is restarting 2.22 times / 5 minutes.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodcrashlooping

Labels:      alertname="KubePodCrashLooping" container="node-exporter" instance="10.42.0.188:8443" job="kube-state-metrics" namespace="monitoring" pod="node-exporter-t242n" severity="warning"
State:       firing (active since 2026-05-16 13:51:05.095800493 +0000 UTC)
Value:       2.2222222222222223
message:     Pod monitoring/node-exporter-t242n (node-exporter) is restarting 2.22 times / 5 minutes.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodcrashlooping

Labels:      alertname="KubePodCrashLooping" container="cert-manager-controller" instance="10.42.0.188:8443" job="kube-state-metrics" namespace="cert-manager" pod="cert-manager-66cd67f646-72nq6" severity="warning"
State:       pending (active since 2026-05-16 14:05:05.095800493 +0000 UTC)
Value:       1.1111111111111112
message:     Pod cert-manager/cert-manager-66cd67f646-72nq6 (cert-manager-controller) is restarting 1.11 times / 5 minutes.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodcrashlooping

Labels:      alertname="KubePodCrashLooping" container="node-exporter" instance="10.42.0.188:8443" job="kube-state-metrics" namespace="monitoring" pod="node-exporter-kgllz" severity="warning"
State:       firing (active since 2026-05-16 13:38:05.095800493 +0000 UTC)
Value:       4.444444444444445
message:     Pod monitoring/node-exporter-kgllz (node-exporter) is restarting 4.44 times / 5 minutes.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodcrashlooping

Labels:      alertname="KubePodCrashLooping" container="node-exporter" instance="10.42.0.188:8443" job="kube-state-metrics" namespace="monitoring" pod="node-exporter-4xk97" severity="warning"
State:       firing (active since 2026-05-16 13:42:05.095800493 +0000 UTC)
Value:       3.333333333333334
message:     Pod monitoring/node-exporter-4xk97 (node-exporter) is restarting 3.33 times / 5 minutes.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodcrashlooping

Labels:      alertname="KubePodCrashLooping" container="kube-rbac-proxy" instance="10.42.0.188:8443" job="kube-state-metrics" namespace="monitoring" pod="node-exporter-4xk97" severity="warning"
State:       firing (active since 2026-05-16 13:42:05.095800493 +0000 UTC)
Value:       3.333333333333334
message:     Pod monitoring/node-exporter-4xk97 (kube-rbac-proxy) is restarting 3.33 times / 5 minutes.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodcrashlooping
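The Value column is restarts per five minutes, matching the wording of the messages (4.44 for node-exporter-kgllz, 1.11 for the pending cert-manager and system-upgrade pods). The rule body is collapsed here; in kubernetes-mixin it is essentially a restart rate taken from kube-state-metrics, roughly:

  # Restart rate over 15 minutes, scaled to restarts per 5 minutes (sketch of the upstream expression).
  rate(kube_pod_container_status_restarts_total{job="kube-state-metrics"}[15m]) * 60 * 5 > 0

  # The same series narrowed to one of the pods listed above.
  rate(kube_pod_container_status_restarts_total{namespace="monitoring", pod="node-exporter-kgllz"}[15m]) * 60 * 5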
|
Labels:      alertname="KubeContainerWaiting" container="kube-rbac-proxy" namespace="monitoring" pod="node-exporter-kgllz" severity="warning"
State:       pending (active since 2026-05-16 14:07:05.095800493 +0000 UTC)
Value:       1
message:     Pod monitoring/node-exporter-kgllz container kube-rbac-proxy has been in waiting state for longer than 1 hour.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecontainerwaiting

Labels:      alertname="KubeContainerWaiting" container="node-exporter" namespace="monitoring" pod="node-exporter-4xk97" severity="warning"
State:       pending (active since 2026-05-16 14:07:05.095800493 +0000 UTC)
Value:       1
message:     Pod monitoring/node-exporter-4xk97 container node-exporter has been in waiting state for longer than 1 hour.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecontainerwaiting

Labels:      alertname="KubeContainerWaiting" container="node-exporter" namespace="monitoring" pod="node-exporter-t242n" severity="warning"
State:       pending (active since 2026-05-16 14:06:35.095800493 +0000 UTC)
Value:       1
message:     Pod monitoring/node-exporter-t242n container node-exporter has been in waiting state for longer than 1 hour.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecontainerwaiting

Labels:      alertname="KubeContainerWaiting" container="kube-rbac-proxy" namespace="monitoring" pod="node-exporter-t242n" severity="warning"
State:       pending (active since 2026-05-16 14:06:35.095800493 +0000 UTC)
Value:       1
message:     Pod monitoring/node-exporter-t242n container kube-rbac-proxy has been in waiting state for longer than 1 hour.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecontainerwaiting

Labels:      alertname="KubeContainerWaiting" container="kube-rbac-proxy" namespace="monitoring" pod="node-exporter-4xk97" severity="warning"
State:       pending (active since 2026-05-16 14:07:05.095800493 +0000 UTC)
Value:       1
message:     Pod monitoring/node-exporter-4xk97 container kube-rbac-proxy has been in waiting state for longer than 1 hour.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecontainerwaiting

Labels:      alertname="KubeContainerWaiting" container="node-exporter" namespace="monitoring" pod="node-exporter-kgllz" severity="warning"
State:       pending (active since 2026-05-16 14:06:05.095800493 +0000 UTC)
Value:       1
message:     Pod monitoring/node-exporter-kgllz container node-exporter has been in waiting state for longer than 1 hour.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecontainerwaiting
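All six waiting containers belong to the node-exporter DaemonSet pods that are also crash-looping above. The rule body is collapsed; upstream it simply checks the kube-state-metrics waiting-reason gauge (with a one-hour for: clause), roughly:

  # Containers currently reporting any waiting reason (sketch of the upstream expression).
  sum by (namespace, pod, container) (kube_pod_container_status_waiting_reason{job="kube-state-metrics"}) > 0

  # The concrete reason (e.g. CrashLoopBackOff) for one of the pods above.
  kube_pod_container_status_waiting_reason{namespace="monitoring", pod="node-exporter-kgllz"} > 0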
|
Labels:      alertname="KubeDaemonSetRolloutStuck" daemonset="node-exporter" instance="10.42.0.188:8443" job="kube-state-metrics" namespace="monitoring" severity="warning"
State:       pending (active since 2026-05-16 13:55:05.095800493 +0000 UTC)
Value:       0
message:     Only 0% of the desired Pods of DaemonSet monitoring/node-exporter are scheduled and ready.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedaemonsetrolloutstuck
|
Labels:      alertname="KubeDeploymentReplicasMismatch" deployment="podsync" instance="10.42.0.188:8443" job="kube-state-metrics" namespace="applications" severity="warning"
State:       pending (active since 2026-05-16 13:55:05.095800493 +0000 UTC)
Value:       1
message:     Deployment applications/podsync has not matched the expected number of replicas for longer than 15 minutes.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedeploymentreplicasmismatch

Labels:      alertname="KubeDeploymentReplicasMismatch" deployment="root-nfs-subdir-external-provisioner" instance="10.42.0.188:8443" job="kube-state-metrics" namespace="default" severity="warning"
State:       pending (active since 2026-05-16 13:55:05.095800493 +0000 UTC)
Value:       1
message:     Deployment default/root-nfs-subdir-external-provisioner has not matched the expected number of replicas for longer than 15 minutes.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedeploymentreplicasmismatch
|
Labels:      alertname="KubeJobCompletion" instance="10.42.0.188:8443" job="kube-state-metrics" job_name="apply-agent-plan-on-node1-with-ab098c4a499dd6cc4d68700e54-8a12b" namespace="system-upgrade" severity="warning"
State:       pending (active since 2026-05-16 14:00:35.095800493 +0000 UTC)
Value:       1
message:     Job system-upgrade/apply-agent-plan-on-node1-with-ab098c4a499dd6cc4d68700e54-8a12b is taking more than one hour to complete.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubejobcompletion
|
Labels:      alertname="KubeStatefulSetReplicasMismatch" instance="10.42.0.188:8443" job="kube-state-metrics" namespace="monitoring" severity="warning" statefulset="uptime-kuma"
State:       pending (active since 2026-05-16 13:55:05.095800493 +0000 UTC)
Value:       0
message:     StatefulSet monitoring/uptime-kuma has not matched the expected number of replicas for longer than 15 minutes.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubestatefulsetreplicasmismatch
|
alert: KubeJobFailed
expr: kube_job_failed{job="kube-state-metrics"} > 0
for: 15m
labels:
severity: warning
annotations:
message: Job {{ $labels.namespace }}/{{ $labels.job_name }} failed to complete.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubejobfailed
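When KubeJobFailed fires, the namespace and job_name labels already identify the object; the same series can be queried directly to list every failed Job. A minimal sketch (depending on the kube-state-metrics version, the underlying metric may be kube_job_status_failed rather than the kube_job_failed recording rule used above):

  # One series per failed Job; namespace and job_name point at the object to inspect.
  kube_job_failed{job="kube-state-metrics"} > 0

  # Raw kube-state-metrics series, if the recording rule is not present.
  kube_job_status_failed{job="kube-state-metrics"} > 0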
|
/etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-prometheus-k8s-rules.yaml > kubernetes-resources
|
Labels:      alertname="CPUThrottlingHigh" container="grafana" namespace="monitoring" pod="grafana-594fc7f587-v7wlr" severity="warning"
State:       firing (active since 2026-05-16 12:22:51 +0000 UTC)
Value:       0.323943661971831
message:     32.39% throttling of CPU in namespace monitoring for container grafana in pod grafana-594fc7f587-v7wlr.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-cputhrottlinghigh

Labels:      alertname="CPUThrottlingHigh" container="kube-rbac-proxy" namespace="monitoring" pod="arm-exporter-jzhqp" severity="warning"
State:       pending (active since 2026-05-16 13:55:17.55260138 +0000 UTC)
Value:       0.2876712328767123
message:     28.77% throttling of CPU in namespace monitoring for container kube-rbac-proxy in pod arm-exporter-jzhqp.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-cputhrottlinghigh

Labels:      alertname="CPUThrottlingHigh" container="kube-rbac-proxy" namespace="monitoring" pod="arm-exporter-gjg9w" severity="warning"
State:       firing (active since 2026-05-16 13:32:47.574415178 +0000 UTC)
Value:       0.4814814814814815
message:     48.15% throttling of CPU in namespace monitoring for container kube-rbac-proxy in pod arm-exporter-gjg9w.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-cputhrottlinghigh
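The Value column is the fraction of CFS scheduling periods in which the container was throttled (0.48 means 48.15% for arm-exporter-gjg9w). The rule body is collapsed in this dump; upstream it compares throttled periods to total periods from cAdvisor, roughly:

  # Share of throttled CFS periods per container over 5 minutes (sketch of the upstream expression).
  sum(increase(container_cpu_cfs_throttled_periods_total{container!=""}[5m])) by (container, pod, namespace)
    /
  sum(increase(container_cpu_cfs_periods_total[5m])) by (container, pod, namespace)
  > (25 / 100)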
|
/etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-prometheus-k8s-rules.yaml > kubernetes-storage
|
/etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-prometheus-k8s-rules.yaml > kubernetes-system
|
Labels:      alertname="KubeClientErrors" instance="192.168.178.27:10250" job="kubelet" severity="warning"
State:       pending (active since 2026-05-16 14:04:47.588420806 +0000 UTC)
Value:       0.08247610064076825
message:     Kubernetes API server client 'kubelet/192.168.178.27:10250' is experiencing 8.248% errors.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeclienterrors

Labels:      alertname="KubeClientErrors" instance="192.168.178.94:10250" job="kubelet" severity="warning"
State:       pending (active since 2026-05-16 14:04:47.588420806 +0000 UTC)
Value:       0.0936880905196102
message:     Kubernetes API server client 'kubelet/192.168.178.94:10250' is experiencing 9.369% errors.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeclienterrors
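Both kubelets are failing 8-9% of their requests against the API server, which fits the apiserver-side alerts further down. The rule body is collapsed; upstream it is a ratio over the client-go request counters, roughly:

  # Share of failing API requests per client instance over 5 minutes (sketch of the upstream expression).
  sum(rate(rest_client_requests_total{code=~"5.."}[5m])) by (instance, job)
    /
  sum(rate(rest_client_requests_total[5m])) by (instance, job)
  > 0.01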
|
/etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-prometheus-k8s-rules.yaml > kubernetes-system-apiserver
|
Labels:      alertname="AggregatedAPIDown" name="v1beta1.metrics.k8s.io" namespace="kube-system" severity="warning"
State:       pending (active since 2026-05-16 14:05:04.774722027 +0000 UTC)
Value:       1
message:     An aggregated API v1beta1.metrics.k8s.io/kube-system is down. It has not been available at least for the past five minutes.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-aggregatedapidown

Labels:      alertname="KubeAPIErrorsHigh" severity="warning" subresource="/readyz" verb="GET"
State:       pending (active since 2026-05-16 14:05:04.774722027 +0000 UTC)
Value:       0.6666666666666666
message:     API server is returning errors for 66.67% of requests for GET /readyz.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapierrorshigh
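The v1beta1.metrics.k8s.io entry is the aggregated metrics API (typically served by metrics-server). Whether the aggregation layer currently marks it unavailable can be read from the apiserver's own aggregator gauge; a minimal sketch:

  # 1 means the APIService is currently considered unavailable, 0 means available (per apiserver instance).
  aggregator_unavailable_apiservice{name="v1beta1.metrics.k8s.io"}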
|
alert: AggregatedAPIErrors
expr: sum by(name, namespace) (increase(aggregator_unavailable_apiservice_count[5m])) > 2
labels:
severity: warning
annotations:
message: An aggregated API {{ $labels.name }}/{{ $labels.namespace }} has reported errors. The number of errors has increased for it in the past five minutes. High values indicate that the availability of the service changes too often.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-aggregatedapierrors
|
alert: KubeAPIDown
expr: absent(up{job="apiserver"} == 1)
for: 15m
labels:
severity: critical
annotations:
message: KubeAPI has disappeared from Prometheus target discovery.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapidown
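absent() is what makes this rule work: it returns an empty result as long as at least one series matching up{job="apiserver"} == 1 exists, and a single 1-valued series once none do. A minimal sketch for reading it:

  # Empty while at least one apiserver target is up; becomes a single series with value 1 once none are.
  absent(up{job="apiserver"} == 1)

  # The underlying targets, for comparison.
  up{job="apiserver"}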
|
/etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-prometheus-k8s-rules.yaml > kubernetes-system-controller-manager
|
Labels:      alertname="KubeControllerManagerDown" severity="critical"
State:       firing (active since 2026-03-29 19:46:14 +0000 UTC)
Value:       1
message:     KubeControllerManager has disappeared from Prometheus target discovery.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecontrollermanagerdown
|
/etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-prometheus-k8s-rules.yaml > kubernetes-system-kubelet
|
/etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-prometheus-k8s-rules.yaml > kubernetes-system-scheduler
|
alert: KubeSchedulerDown
expr: absent(up{job="kube-scheduler"} == 1)
for: 15m
labels:
severity: critical
annotations:
message: KubeScheduler has disappeared from Prometheus target discovery.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeschedulerdown
Labels:      alertname="KubeSchedulerDown" severity="critical"
State:       firing (active since 2026-05-16 12:27:32 +0000 UTC)
Value:       1
message:     KubeScheduler has disappeared from Prometheus target discovery.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeschedulerdown
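This pairs with the TargetDown alert for kube-scheduler-prometheus-discovery in general.rules: there is no healthy kube-scheduler target at all, so absent() fires. Querying up directly distinguishes "no targets discovered" from "targets discovered but failing"; a minimal sketch:

  # Empty result: no kube-scheduler targets were discovered at all.
  # Series with value 0: targets exist but their scrapes are failing.
  up{job="kube-scheduler"}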
|
/etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-prometheus-k8s-rules.yaml > node-exporter
|
alert: NodeClockNotSynchronising
expr: min_over_time(node_timex_sync_status[5m]) == 0
for: 10m
labels:
severity: warning
annotations:
message: Clock on {{ $labels.instance }} is not synchronising. Ensure NTP is configured on this host.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-nodeclocknotsynchronising
summary: Clock not synchronising.
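node_timex_sync_status comes from node_exporter's timex collector (1 means the kernel reports the clock as synchronised, 0 means it does not). When this fires, the actual offset can be checked next to it; a minimal sketch:

  # 0 for the whole five-minute window means the clock was never reported as synchronised.
  min_over_time(node_timex_sync_status[5m])

  # Current clock offset in seconds, per instance.
  node_timex_offset_seconds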
|
alert: NodeNetworkReceiveErrs
expr: increase(node_network_receive_errs_total[2m]) > 10
for: 1h
labels:
severity: warning
annotations:
description: '{{ $labels.instance }} interface {{ $labels.device }} has encountered {{ printf "%.0f" $value }} receive errors in the last two minutes.'
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-nodenetworkreceiveerrs
summary: Network interface is reporting many receive errors.
|
alert: NodeNetworkTransmitErrs
expr: increase(node_network_transmit_errs_total[2m]) > 10
for: 1h
labels:
severity: warning
annotations:
description: '{{ $labels.instance }} interface {{ $labels.device }} has encountered {{ printf "%.0f" $value }} transmit errors in the last two minutes.'
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-nodenetworktransmiterrs
summary: Network interface is reporting many transmit errors.
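Both error rules only fire after a full hour of sustained errors. Which node and interface are affected, and in which direction, can be read from the underlying counters; a sketch covering both cases in one query:

  # Interfaces with more than ten errors in the last two minutes, in either direction.
  (increase(node_network_receive_errs_total[2m]) > 10)
    or
  (increase(node_network_transmit_errs_total[2m]) > 10)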
|
/etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-prometheus-k8s-rules.yaml > node-network
|
/etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-prometheus-k8s-rules.yaml > prometheus
|
/etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-prometheus-k8s-rules.yaml > prometheus-operator
|