KubernetesNodeMemoryPressure
Meaning
This alert is triggered when a Kubernetes node reports the MemoryPressure condition for more than 2 minutes. MemoryPressure indicates that the node is running low on available memory and may start evicting pods.
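To see the MemoryPressure condition each node currently reports, one option is a jsonpath query over the node conditions:
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="MemoryPressure")].status}{"\n"}{end}'
A value of True for a node confirms the condition that fired this alert.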
Impact
Memory pressure on a node can lead to:
- Pod evictions and restarts
- OOMKilled containers
- Degraded application performance
- Scheduling failures for new pods
This alert is critical because sustained memory pressure directly affects workload stability.
Diagnosis
Check node memory status:
kubectl get nodes
kubectl describe node <NODE_NAME>
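The Conditions and Allocated resources sections of the describe output are usually the most relevant; for example, to compare total memory requests on the node against its allocatable memory (the grep range is approximate):
kubectl describe node <NODE_NAME> | grep -A 8 'Allocated resources'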
Check node memory usage:
kubectl top node <NODE_NAME>
free -m
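Note that free -m runs on the node itself. Without SSH access, a node debug pod is one option (the busybox image is just an example):
kubectl debug node/<NODE_NAME> -it --image=busybox
Inside the resulting shell, free -m should reflect the node's memory, since /proc/meminfo is host-wide by default.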
List pods consuming high memory:
kubectl top pod --all-namespaces --sort-by=memory
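To narrow the view to pods scheduled on the affected node (spec.nodeName is a supported field selector for pods):
kubectl get pods --all-namespaces --field-selector spec.nodeName=<NODE_NAME> -o wide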
Check recent pod evictions:
kubectl get events --sort-by=.lastTimestamp
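Eviction events can also be filtered directly (Evicted is the standard event reason used by the kubelet):
kubectl get events --all-namespaces --field-selector reason=Evicted --sort-by=.lastTimestamp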
Possible Causes
- Memory leaks in applications
- Insufficient memory requests/limits (see the check after this list)
- Sudden traffic spikes
- Misconfigured workloads or batch jobs
- Too many pods scheduled on the node
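For the requests/limits cause above, one way to spot containers with no memory limit is to print the limit per pod; an empty second column means no limit is set (the jsonpath below is only one way to do this):
kubectl get pods --all-namespaces -o jsonpath='{range .items[*]}{.metadata.namespace}{"/"}{.metadata.name}{"\t"}{.spec.containers[*].resources.limits.memory}{"\n"}{end}'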
Mitigation
- Identify and restart or scale memory-heavy pods (example commands follow this list)
- Set proper resource requests and limits
- Scale out workloads or add more nodes
- Increase node memory capacity if required
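A sketch of the first three mitigations, assuming a Deployment named <DEPLOYMENT> in namespace <NAMESPACE> (names and values are placeholders to adjust per workload):
kubectl -n <NAMESPACE> rollout restart deployment <DEPLOYMENT>
kubectl -n <NAMESPACE> set resources deployment <DEPLOYMENT> --requests=memory=256Mi --limits=memory=512Mi
kubectl -n <NAMESPACE> scale deployment <DEPLOYMENT> --replicas=3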
If immediate relief is needed, drain the node:
kubectl drain <NODE_NAME> --ignore-daemonsets --delete-emptydir-data
After mitigation and stabilization:
kubectl uncordon <NODE_NAME>
Escalation
- Escalate if memory pressure persists longer than 10 minutes
- Page on-call engineer if pod evictions impact production
- If multiple nodes show memory pressure, treat as cluster capacity issue
Related Alerts
- HighMemoryUsage
- PodCrashLoopBackOff
- KubernetesNodeNotReady
- HighCPUUsage
Related Dashboards
- Grafana → Kubernetes / Node Memory
- Grafana → Node Exporter Full
