KubernetesNodeDiskPressure

Meaning

This alert fires when a Kubernetes node reports the DiskPressure condition for more than 2 minutes. DiskPressure means the kubelet has detected that available disk space or inodes on the node's filesystems have fallen below its eviction thresholds, and it may evict pods to reclaim space.

Impact

Disk pressure on a node can cause:

Pod evictions or restarts
Application failures due to insufficient storage
Node instability
Scheduling failures for new pods

This alert is critical, as sustained disk pressure can affect cluster stability and production workloads.

Diagnosis

Check node status:

kubectl get nodes
kubectl describe node <NODE_NAME>
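
To scan every node at once instead of describing them one by one, the DiskPressure condition can be read with a JSONPath query. The following is a minimal sketch, assuming kubectl is already configured for the affected cluster; the helper names list_disk_pressure and filter_pressured are illustrative, not part of any tool.

```shell
# Print "<node>\t<DiskPressure status>" for every node.
# Assumes kubectl is already pointed at the affected cluster.
list_disk_pressure() {
  kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="DiskPressure")].status}{"\n"}{end}'
}

# Keep only the node names whose status column is "True".
# Reads the tab-separated output of list_disk_pressure on stdin.
filter_pressured() {
  awk -F'\t' '$2 == "True" { print $1 }'
}

# Usage (hypothetical helper names):
#   list_disk_pressure | filter_pressured
```

More than one node in the output is worth noting early, since it changes how the incident should be escalated.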

Check disk and inode usage on the affected node:

df -h
df -i
du -sh /var/lib/kubelet/*
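
When a node has many mounts, it can help to filter the df output down to the filesystems that are nearly full. The sketch below parses POSIX-format df output; the function name fs_over_threshold and the 85% default are assumptions for illustration and do not reflect the kubelet's configured eviction thresholds.

```shell
# Print mount points whose usage is at or above a threshold (default 85%).
# Reads `df -P` output on stdin. The 85% default is an illustrative value,
# not the kubelet's eviction threshold. Mount points containing spaces
# are not handled.
fs_over_threshold() {
  threshold="${1:-85}"
  awk -v limit="$threshold" 'NR > 1 {
    use = $5
    sub(/%/, "", use)
    if (use + 0 >= limit + 0) print $6 " " use "%"
  }'
}

# Usage on the node:
#   df -P | fs_over_threshold 85
#   df -Pi | fs_over_threshold 85   # inode usage, assuming GNU df
```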

Check volume claims and the pods running on the node:

kubectl get pvc --all-namespaces
kubectl get pods --all-namespaces --field-selector spec.nodeName=<NODE_NAME>
kubectl describe pod <POD_NAME> -n <NAMESPACE>

Check recent events:

kubectl get events --all-namespaces --sort-by=.lastTimestamp
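
The event list is usually long, so narrowing it to disk-related reasons saves time. The sketch below filters on event reasons the kubelet emits around eviction and image garbage collection; the function name disk_pressure_events is illustrative, and the pattern is a starting point rather than a complete list.

```shell
# Keep only event lines whose reason relates to disk pressure or eviction.
# Reads `kubectl get events` output on stdin.
disk_pressure_events() {
  grep -E 'Evicted|EvictionThresholdMet|NodeHasDiskPressure|FreeDiskSpaceFailed|ImageGCFailed'
}

# Usage:
#   kubectl get events --all-namespaces --sort-by=.lastTimestamp | disk_pressure_events
```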

Possible Causes

Full disks due to logs, images, or temporary files
Large persistent volumes filling up
Containers writing excessive data
Old or unused container images not cleaned up
Disk size too small for workload requirements

Mitigation

Clean up unused images and temporary files
Rotate and compress logs
Move non-critical data to other storage
Increase node disk capacity if possible
Evict non-critical pods or scale workloads to other nodes
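
Before deleting anything, it helps to see which files are taking the space. The sketch below lists large files under a directory; the function name find_large_files and the 100 MiB default are assumptions for illustration. The image cleanup commands in the comments depend on the node's container runtime and should be checked against it first.

```shell
# List regular files under a directory that are larger than a size in KiB,
# staying on one filesystem (-xdev). Directory and size are caller-supplied.
find_large_files() {
  dir="$1"
  min_kib="${2:-102400}"   # illustrative default: 100 MiB
  find "$dir" -xdev -type f -size +"${min_kib}k" 2>/dev/null
}

# Usage on the node:
#   find_large_files /var/log
#   find_large_files /var/lib/kubelet 512000
#
# Removing unused images depends on the runtime, for example:
#   crictl rmi --prune        # CRI runtimes such as containerd or CRI-O
#   docker image prune -a     # Docker
```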

Drain the node if immediate relief is needed:

kubectl drain <NODE_NAME> --ignore-daemonsets --delete-emptydir-data

After mitigation, return the node to service:

kubectl uncordon <NODE_NAME>

Escalation

Escalate if DiskPressure persists beyond 10 minutes
Page on-call engineer if production workloads are impacted
Treat multiple affected nodes as a cluster-level incident
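
The escalation rules above can be expressed as a small decision helper. This is a sketch only: the function name escalation_action, the returned labels, and the order in which the rules are checked are assumptions, not part of the alert definition.

```shell
# Map the escalation rules to a single suggested action.
# Args: minutes under DiskPressure, number of affected nodes,
#       "yes" or "no" for production impact.
# The precedence (cluster incident, then paging, then escalation)
# is an assumption made for this sketch.
escalation_action() {
  minutes="$1"
  nodes="$2"
  prod_impact="$3"
  if [ "$nodes" -gt 1 ]; then
    echo "cluster-incident"
  elif [ "$prod_impact" = "yes" ]; then
    echo "page-on-call"
  elif [ "$minutes" -gt 10 ]; then
    echo "escalate"
  else
    echo "monitor"
  fi
}

# Usage:
#   escalation_action 15 1 no
```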

Related Alerts

HighDiskUsage
HighDiskIOWait
KubernetesNodeNotReady
PodCrashLoopBackOff

Related Dashboards

Grafana → Kubernetes / Node Disk
Grafana → Node Exporter Disk Overview
