runbooks:coustom_alerts:KubeNodeNotReady
This alert is triggered when a Kubernetes node reports a `NotReady` status for more than 2 minutes. A node in `NotReady` state cannot reliably run or manage pods.
This alert indicates a node-level availability issue.
Possible impacts include:
This alert is a warning, but may become critical if the condition persists or affects multiple nodes.
Check node status:
kubectl get nodes
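On a large cluster it can help to list only the nodes that are not ready. The following is a minimal sketch, not part of the original runbook: the helper name `not_ready_nodes` is hypothetical, and it assumes the default column layout of `kubectl get nodes --no-headers` (name first, status second).

```shell
# Print the names of nodes whose STATUS column does not start with "Ready".
# Expects `kubectl get nodes --no-headers` output on stdin.
# Assumption: column 1 is the node name and column 2 is the status, e.g.
# "Ready", "NotReady", or "Ready,SchedulingDisabled".
not_ready_nodes() {
  awk 'NF >= 2 && $2 !~ /^Ready/ { print $1 }'
}

# Usage (requires cluster access):
#   kubectl get nodes --no-headers | not_ready_nodes
```

Cordoned but healthy nodes (`Ready,SchedulingDisabled`) are deliberately not reported, since they still satisfy the Ready condition.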
Describe the affected node to inspect conditions and events:
kubectl describe node {{ $labels.node }}
Check recent node-related events:
kubectl get events --field-selector involvedObject.kind=Node
Verify kubelet health on the node (if SSH access is available):
systemctl status kubelet
journalctl -u kubelet --since "15 min ago"
Check node resource pressure:
kubectl describe node {{ $labels.node }} | grep -i pressure
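As an alternative to grepping the `describe` output, the node's conditions can be read directly from its status. This is a sketch under stated assumptions: the helper names `node_conditions` and `abnormal_conditions` are hypothetical, and the filter assumes that `Ready` is healthy when `True` while every other condition (such as `MemoryPressure`, `DiskPressure`, `PIDPressure`) is healthy when `False`.

```shell
# Print each condition of a node as Type=Status, one per line.
# $1 is the node name (the value of {{ $labels.node }} from the alert).
node_conditions() {
  kubectl get node "$1" \
    -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'
}

# Read Type=Status lines on stdin and print only the unhealthy ones.
# Assumption: Ready should be True; all other conditions should be False.
abnormal_conditions() {
  awk -F= '
    NF != 2                        { next }
    $1 == "Ready" && $2 != "True"  { print; next }
    $1 != "Ready" && $2 != "False" { print }
  '
}

# Usage (requires cluster access):
#   node_conditions my-node | abnormal_conditions
```

An empty result suggests that no condition is reporting a problem; a line such as `Ready=Unknown` typically means the control plane has stopped receiving status updates from the kubelet.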
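If the node does not recover, drain it so that its pods are evicted and can be rescheduled onto other nodes:
kubectl drain {{ $labels.node }} --ignore-daemonsets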
After the node becomes healthy:
kubectl uncordon {{ $labels.node }}
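The wait-then-uncordon step can be scripted so that the node is only made schedulable once it actually reports `Ready`. This is a minimal sketch, assuming `kubectl wait` is available in your kubectl version; the helper name `uncordon_when_ready` and the default 600-second timeout are illustrative choices, not values from this runbook.

```shell
# Wait for the node's Ready condition, then make it schedulable again.
# $1: node name (the value of {{ $labels.node }} from the alert)
# $2: optional timeout for kubectl wait (assumed default: 600s)
uncordon_when_ready() {
  if kubectl wait --for=condition=Ready "node/$1" --timeout="${2:-600s}"; then
    kubectl uncordon "$1"
  else
    echo "node $1 did not become Ready within ${2:-600s}" >&2
    return 1
  fi
}

# Usage (requires cluster access):
#   uncordon_when_ready my-node 300s
```

If the wait times out, the node stays cordoned and the function returns a non-zero status, so it can be chained safely in a larger script.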