KubernetesNodeDiskPressure

Meaning

This alert fires when a Kubernetes node has reported the DiskPressure condition for more than 2 minutes. DiskPressure means the kubelet has detected that available disk space or inodes on the node's root or image filesystem have dropped below its eviction thresholds, and it may evict pods to reclaim space.

Impact

Disk pressure on a node can cause:

  • Pod evictions or restarts
  • Application failures due to insufficient storage
  • Node instability
  • Scheduling failures for new pods

This alert is critical, as sustained disk pressure can affect cluster stability and production workloads.

Diagnosis

Check node status:

kubectl get nodes
kubectl describe node <NODE_NAME>
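
To see at a glance which nodes currently report the condition, the status can be pulled from every node with a JSONPath query. This is a minimal sketch; the function names `node_disk_pressure_status` and `filter_pressured_nodes` are illustrative helpers and are not part of kubectl.

```shell
# Print "<node><TAB><DiskPressure status>" for every node (requires cluster access).
node_disk_pressure_status() {
  kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="DiskPressure")].status}{"\n"}{end}'
}

# Keep only the nodes whose DiskPressure status is "True".
# Reads the tab-separated listing produced above on stdin.
filter_pressured_nodes() {
  awk -F'\t' '$2 == "True" { print $1 }'
}

# Usage:
#   node_disk_pressure_status | filter_pressured_nodes
```

A status of Unknown usually means the kubelet has stopped reporting, which is a different problem from disk pressure and is filtered out here.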

Check disk usage:

df -h
du -sh /var/lib/kubelet/*
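
On a node with many mounts it can help to list only the filesystems that are nearly full. The sketch below parses `df -P` output; the helper name `fs_over_threshold` and the 85% default are assumptions chosen for illustration, and they are not the kubelet's own eviction thresholds.

```shell
# Print "<mount point> <usage>" for filesystems at or above a threshold.
# Reads `df -P` output on stdin; the first argument is the threshold in percent
# (default 85, an illustrative value). Mount points containing spaces are not handled.
fs_over_threshold() {
  local threshold="${1:-85}"
  awk -v t="$threshold" 'NR > 1 {
    use = $5              # fifth column is the usage percentage, e.g. "90%"
    sub(/%/, "", use)
    if (use + 0 >= t) print $6, $5
  }'
}

# Usage on the affected node:
#   df -P | fs_over_threshold 85
```

Because DiskPressure can also be raised by inode exhaustion, it is worth checking inode usage with `df -i` as well.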

Check pods consuming disk space:

kubectl get pvc --all-namespaces
kubectl describe pod <POD_NAME> -n <NAMESPACE>

Check recent events:

kubectl get events --all-namespaces --sort-by=.lastTimestamp
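
When the kubelet evicts a pod it records an event with the reason Evicted, so filtering on that reason shows which workloads have already been affected. A rough sketch follows; `eviction_events` and `count_by_namespace` are hypothetical helper names.

```shell
# List eviction events across all namespaces, oldest first (requires cluster access).
eviction_events() {
  kubectl get events --all-namespaces --no-headers \
    --field-selector reason=Evicted --sort-by=.lastTimestamp
}

# Count events per namespace. Reads the header-less listing above on stdin,
# where the first column is the namespace.
count_by_namespace() {
  awk '{ count[$1]++ } END { for (ns in count) print ns, count[ns] }' | sort
}

# Usage:
#   eviction_events | count_by_namespace
```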

Possible Causes

  • Full disks due to logs, images, or temporary files
  • Large persistent volumes backed by node-local storage filling up
  • Containers writing excessive data
  • Old or unused container images not cleaned up
  • Disk size too small for workload requirements

Mitigation

  1. Clean up unused images and temporary files
  2. Rotate and compress logs
  3. Move non-critical data to other storage
  4. Increase node disk capacity if possible
  5. Evict non-critical pods or scale workloads to other nodes
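
The first two steps could be sketched as below, run on the affected node itself. This assumes a containerd or CRI-O node with `crictl` installed and logs held in the systemd journal; the 500M journal limit and the `run` and `cleanup_node` helper names are illustrative. By default the commands are only printed, so they can be reviewed before anything is deleted.

```shell
# Print the command (DRY_RUN=1, the default) or execute it (DRY_RUN=0).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

cleanup_node() {
  # Remove images that are not used by any container.
  run crictl rmi --prune
  # Shrink the systemd journal; 500M is an illustrative target size.
  run journalctl --vacuum-size=500M
}

# Usage:
#   cleanup_node              # show what would be executed
#   DRY_RUN=0 cleanup_node    # actually clean up (needs root on the node)
```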

Drain the node if immediate relief is needed:

kubectl drain <NODE_NAME> --ignore-daemonsets --delete-emptydir-data

After mitigation:

kubectl uncordon <NODE_NAME>

Escalation

  • Escalate if DiskPressure persists beyond 10 minutes
  • Page the on-call engineer if production workloads are impacted
  • Treat multiple affected nodes as a cluster-level incident

Related Alerts

  • HighDiskUsage
  • HighDiskIOWait
  • KubernetesNodeNotReady
  • PodCrashLoopBackOff

Related Dashboards

  • Grafana → Kubernetes / Node Disk
  • Grafana → Node Exporter Disk Overview