====== KubernetesNodeMemoryPressure ======

===== Meaning =====

This alert is triggered when a Kubernetes node reports the **MemoryPressure** condition for more than 2 minutes. MemoryPressure indicates that the node is running low on available memory and may start evicting pods.

===== Impact =====

Memory pressure on a node can lead to:

  * Pod evictions and restarts
  * OOMKilled containers
  * Degraded application performance
  * Scheduling failures for new pods

This alert is **critical** because sustained memory pressure directly affects workload stability.

===== Diagnosis =====

Check node status and the MemoryPressure condition:

<code bash>
kubectl get nodes
kubectl describe node <node-name>
</code>

Check node memory usage:

<code bash>
kubectl top node
free -m
</code>

List the pods consuming the most memory:

<code bash>
kubectl top pod --all-namespaces --sort-by=memory
</code>

Check recent pod evictions:

<code bash>
kubectl get events --sort-by=.lastTimestamp
</code>

More targeted condition and eviction checks are sketched under //Example Snippets// at the end of this page.

===== Possible Causes =====

  * Memory leaks in applications
  * Insufficient memory requests/limits
  * Sudden traffic spikes
  * Misconfigured workloads or batch jobs
  * Too many pods scheduled on the node

===== Mitigation =====

  - Identify and restart or scale memory-heavy pods
  - Set proper resource **requests and limits** (see //Example Snippets// at the end of this page)
  - Scale out workloads or add more nodes
  - Increase node memory capacity if required

If immediate relief is needed, drain the node:

<code bash>
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
</code>

After mitigation and stabilization, return the node to service:

<code bash>
kubectl uncordon <node-name>
</code>

===== Escalation =====

  * Escalate if memory pressure persists for longer than 10 minutes
  * Page the on-call engineer if pod evictions impact production
  * If multiple nodes show memory pressure, treat it as a cluster capacity issue

===== Related Alerts =====

  * HighMemoryUsage
  * PodCrashLoopBackOff
  * KubernetesNodeNotReady
  * HighCPUUsage

===== Related Dashboards =====

  * Grafana → Kubernetes / Node Memory
  * Grafana → Node Exporter Full
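===== Example Snippets =====

The commands below extend the diagnosis steps above. They are a sketch, not part of the alert definition; ''<node-name>'' and similar values are placeholders. The first command lists the MemoryPressure condition for every node, the second filters recent events down to evictions only.

<code bash>
# Show the MemoryPressure condition status (True/False/Unknown) per node
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="MemoryPressure")].status}{"\n"}{end}'

# Show only eviction events, most recent last
kubectl get events --all-namespaces --field-selector reason=Evicted --sort-by=.lastTimestamp
</code>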
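For the "insufficient memory requests/limits" cause, the rough filter below lists pods where no container declares a memory limit. It assumes cluster-wide read access and only flags pods with no memory limit at all, not partially limited ones.

<code bash>
# List namespace/pod for pods whose containers declare no memory limit
kubectl get pods --all-namespaces \
  -o jsonpath='{range .items[*]}{.metadata.namespace}{"/"}{.metadata.name}{"\t"}{.spec.containers[*].resources.limits.memory}{"\n"}{end}' \
  | awk -F'\t' '$2 == "" {print $1}'
</code>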
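For mitigation step 2, one way to apply memory requests and limits without editing manifests is ''kubectl set resources''. The deployment name, namespace, and the 256Mi/512Mi values below are illustrative placeholders; choose values based on the workload's actual usage.

<code bash>
# Apply a memory request and limit to every container in the deployment
kubectl set resources deployment <deployment-name> -n <namespace> \
  --requests=memory=256Mi --limits=memory=512Mi
</code>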