runbooks:coustom_alerts:HighMemoryUsage
This alert is triggered when memory usage on a node exceeds 90% for more than 5 minutes. Memory usage is calculated based on total memory and available memory reported by node-exporter.
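As a rough illustration of that calculation, the same percentage can be derived by hand from a node's /proc/meminfo. This is only a sketch, not the alert's actual rule: it assumes "usage" means (MemTotal - MemAvailable) / MemTotal, and mem_used_pct is a hypothetical helper name.

```shell
# Hypothetical helper: reads /proc/meminfo-style text on stdin and prints
# used memory as a percentage, assuming used = MemTotal - MemAvailable.
mem_used_pct() {
  awk '
    /^MemTotal:/     { total = $2 }
    /^MemAvailable:/ { avail = $2 }
    END {
      # Skip the division if MemTotal was not found in the input.
      if (total > 0) printf "%.1f\n", (total - avail) / total * 100
    }
  '
}
```

On a node with shell access, this could be run as: mem_used_pct < /proc/meminfo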
High memory usage can significantly affect node and application stability.
Possible impacts include pods being OOM-killed and the node reporting memory pressure.
This alert is a warning, but may escalate to a critical issue if not addressed.
Check memory usage across nodes:
kubectl top nodes
Identify top memory-consuming pods:
kubectl top pods -A --sort-by=memory
Check node conditions for memory pressure:
kubectl describe node <NODE_NAME>
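The describe output is long, so it may be quicker to read the MemoryPressure condition directly. The sketch below assumes the condition is present under the node's status.conditions; memory_pressure_status is a hypothetical helper name.

```shell
# Hypothetical helper: prints the status (True, False, or Unknown) of the
# MemoryPressure condition for the node named in $1.
memory_pressure_status() {
  kubectl get node "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="MemoryPressure")].status}'
}
```

Usage: memory_pressure_status <NODE_NAME>
A result of True means the kubelet considers the node to be under memory pressure.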
Look for recent memory-related events:
kubectl get events --field-selector involvedObject.kind=Node
If SSH access is available, inspect memory usage directly:
free -h
top
vmstat 1
Check for pods being OOM-killed:
kubectl get pods -A | grep OOMKilled
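The grep above only matches pods whose current STATUS column reads OOMKilled. A container that was OOM-killed and then restarted typically shows Running or CrashLoopBackOff instead. As a complement, the following sketch looks at each container's last termination reason. The helper names are hypothetical, and it assumes the reason is recorded under status.containerStatuses[*].lastState.terminated.reason.

```shell
# Hypothetical helper: prints one line per pod as
# "<namespace><TAB><pod><TAB><last termination reasons>".
list_pod_last_reasons() {
  kubectl get pods -A -o jsonpath='{range .items[*]}{.metadata.namespace}{"\t"}{.metadata.name}{"\t"}{.status.containerStatuses[*].lastState.terminated.reason}{"\n"}{end}'
}

# Hypothetical helper: keeps only pods whose last termination reasons
# include OOMKilled, printed as "<namespace>/<pod>".
oomkilled_pods() {
  awk -F '\t' '$3 ~ /OOMKilled/ { print $1 "/" $2 }'
}
```

Usage: list_pod_last_reasons | oomkilled_pods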
If the node is under sustained pressure, drain it temporarily (draining also cordons the node, so no new pods are scheduled onto it):
kubectl drain <NODE_NAME> --ignore-daemonsets
After the node recovers, make it schedulable again:
kubectl uncordon <NODE_NAME>