HostOutOfMemory

Meaning

This alert fires when a host node has had less than 10% of its memory available for more than 2 minutes. It indicates that the node is at risk of running out of memory, which may lead to OOMKilled processes and system instability.
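To reproduce the check by hand on the affected host, the following is a minimal sketch. It assumes the alert compares MemAvailable to MemTotal, which this runbook does not state explicitly; the function names `mem_available_pct` and `check_memory` are hypothetical helpers, not part of any existing tooling.

```shell
#!/bin/sh
# Sketch only: assumes the alert is based on MemAvailable / MemTotal.
# MemAvailable is reported by Linux kernels 3.14 and later.

# Print available memory as a percentage of total memory.
# Reads a meminfo-style file; defaults to /proc/meminfo.
mem_available_pct() {
    awk '
        /^MemTotal:/     { total = $2 }
        /^MemAvailable:/ { avail = $2 }
        END {
            if (total > 0) printf "%.1f\n", avail * 100 / total
            else exit 1
        }
    ' "${1:-/proc/meminfo}"
}

# Report LOW (exit status 1) when less than 10% is available,
# mirroring the threshold described in this runbook.
check_memory() {
    pct=$(mem_available_pct "$@") || return 2
    if awk -v p="$pct" 'BEGIN { exit !(p < 10) }'; then
        echo "LOW: ${pct}% available"
        return 1
    fi
    echo "OK: ${pct}% available"
}
```

Running `check_memory` with no arguments on the node reads /proc/meminfo. A single reading is only a snapshot, whereas the alert requires the condition to hold for more than 2 minutes.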

Impact

Low memory on a host node can cause processes to be OOMKilled and the system to become unstable.

This alert has severity warning because the situation can escalate quickly if memory continues to deplete.

Diagnosis

Check node memory usage:

kubectl top node {{ $labels.instance }}
free -m

Check top memory-consuming processes:

top
htop
ps aux --sort=-%mem | head -n 20
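The `ps` listing above ranks individual processes, so an application that spreads its memory across many worker processes can be easy to miss. As a rough sketch, resident memory can be summed per command name instead; `sum_rss_by_command` is a hypothetical helper, and it assumes command names contain no spaces.

```shell
#!/bin/sh
# Sketch only: sum resident memory (RSS, in KiB) per command name,
# largest first. Reads "rss command" pairs on stdin, as produced by:
#   ps -eo rss=,comm=
# The optional first argument limits the number of rows (default 10).
sum_rss_by_command() {
    awk '{ rss[$2] += $1 }
         END { for (c in rss) printf "%d %s\n", rss[c], c }' |
        sort -rn | head -n "${1:-10}"
}
```

For example, `ps -eo rss=,comm= | sum_rss_by_command 20` lists the 20 command names with the largest combined RSS. Shared pages are counted once per process, so the totals can overstate actual usage.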

Check pod resource usage on the node:

kubectl top pod --all-namespaces --field-selector spec.nodeName={{ $labels.instance }}

Possible Causes

Mitigation

  1. Identify and restart memory-heavy pods or processes
  2. Scale workloads to other nodes
  3. Adjust resource requests/limits for pods
  4. Free up system memory (e.g., clear caches, restart unnecessary processes)
  5. Add more memory to the node if possible

Escalation