====== NodeRebootedRecently ======
===== Meaning =====
This alert is triggered when a node has rebooted within the last 5 minutes.
The reboot is detected by comparing the current time with the node's boot time as reported by node-exporter.
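The comparison described above can be sketched in shell as follows. This is only an illustration: the function name `rebooted_recently` and its arguments are hypothetical, and the actual alert rule is not shown on this page. It is presumably a PromQL expression along the lines of `time() - node_boot_time_seconds < 300`, which is an assumption.

```shell
# Hypothetical helper, not part of the alert rule itself.
# Succeeds (exit 0) if BOOT_EPOCH lies within the last WINDOW_SECONDS.
# Usage: rebooted_recently BOOT_EPOCH [NOW_EPOCH] [WINDOW_SECONDS]
rebooted_recently() {
  boot=${1:-0}
  now=${2:-$(date +%s)}   # defaults to the current time
  window=${3:-300}        # defaults to 5 minutes, as in this alert
  age=$(( now - boot ))
  # A negative age means the boot time is in the future; treat it as no match.
  [ "$age" -ge 0 ] && [ "$age" -lt "$window" ]
}

# Example on a Linux node, reading the boot time from /proc/stat:
#   rebooted_recently "$(awk '/^btime/ {print $2}' /proc/stat)" \
#     && echo "node booted less than 5 minutes ago"
```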
===== Impact =====
This alert indicates a **recent node restart**, which may affect workloads running on the node.
Possible impacts include:
* Temporary disruption of pods scheduled on the node
* Pod restarts or rescheduling to other nodes
* Short-lived service degradation
* Loss of in-memory application state
This alert is typically **informational or warning-level**, but it requires attention if reboots are frequent or unexpected.
===== Diagnosis =====
Verify node status and readiness:
kubectl get nodes
Check detailed node information and recent events:
kubectl describe node <node-name>
Check events related to node reboot or pressure conditions:
kubectl get events --field-selector involvedObject.kind=Node
Check system uptime from node-exporter metrics (Grafana) or via SSH:
uptime
If SSH access is available, list the recorded boots and inspect the logs of the previous boot (`-b -1`) for the cause of the reboot:
journalctl --list-boots
journalctl -b -1
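The cluster-side checks above can be collected for a single node with a small wrapper. This is a minimal sketch: `node_diag` is a hypothetical name, and it assumes `kubectl` is already configured for the affected cluster. In addition to the commands listed above, it lists the pods currently scheduled on the node.

```shell
# Hypothetical wrapper around the diagnosis commands for one node.
# Usage: node_diag NODE_NAME
node_diag() {
  node=${1:-}
  if [ -z "$node" ]; then
    echo "usage: node_diag NODE_NAME" >&2
    return 2
  fi

  echo "== status =="
  kubectl get node "$node" -o wide

  echo "== events for this node =="
  kubectl get events --all-namespaces \
    --field-selector "involvedObject.kind=Node,involvedObject.name=$node"

  echo "== pods currently on this node =="
  kubectl get pods --all-namespaces -o wide \
    --field-selector "spec.nodeName=$node"
}
```

For example, `node_diag worker-1` prints the three sections for a node named `worker-1` (a placeholder name).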
===== Possible Causes =====
* Planned maintenance or OS patching
* Kernel panic or hardware issue
* Cloud provider host restart
* Manual reboot by an operator
* Power loss or severe resource pressure
===== Mitigation =====
- Confirm whether the reboot was planned or expected
- Ensure the node is in `Ready` state
- Verify that all critical pods have been rescheduled successfully
- Check workloads for crash loops or degraded performance
- If reboots are frequent, investigate system and kernel logs
If needed, temporarily cordon the node so that no new pods are scheduled on it during the investigation:
kubectl cordon <node-name>
Uncordon the node once it is verified healthy:
kubectl uncordon <node-name>
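To avoid uncordoning a node that is not yet healthy, the `Ready` check and the uncordon step can be combined. The sketch below uses hypothetical helper names (`node_is_ready`, `uncordon_if_ready`) and only checks the node's `Ready` condition; any further health criteria are left to the operator.

```shell
# Hypothetical helper: succeeds if the node's Ready condition is "True".
node_is_ready() {
  status=$(kubectl get node "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}')
  [ "$status" = "True" ]
}

# Hypothetical helper: uncordons the node only if it reports Ready.
# Usage: uncordon_if_ready NODE_NAME
uncordon_if_ready() {
  node=${1:-}
  if node_is_ready "$node"; then
    kubectl uncordon "$node"
  else
    echo "node $node is not Ready; leaving it cordoned" >&2
    return 1
  fi
}
```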
===== Escalation =====
* If the reboot was unplanned, notify the platform or infrastructure team
* If the same node reboots multiple times within 24 hours, escalate immediately
* If production services are impacted, page the on-call engineer
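The 24-hour criterion above can be checked with a small counting helper. This is a sketch under stated assumptions: `count_recent_boots` is a hypothetical name, and it expects boot times as Unix epoch seconds, one per line, on standard input. How those timestamps are collected is left open; in Prometheus, an expression such as `changes(node_boot_time_seconds[24h])` may serve a similar purpose.

```shell
# Hypothetical helper: counts boot timestamps within the last WINDOW_SECONDS.
# Reads Unix epoch seconds from stdin, one per line; ignores other lines.
# Usage: count_recent_boots NOW_EPOCH [WINDOW_SECONDS]
count_recent_boots() {
  now=$1
  window=${2:-86400}   # defaults to 24 hours
  awk -v now="$now" -v window="$window" '
    $1 ~ /^[0-9]+$/ && now - $1 >= 0 && now - $1 < window { n++ }
    END { print n + 0 }
  '
}

# Example: escalate if two or more boots fall inside the window.
#   boots=$(count_recent_boots "$(date +%s)" < boot_times.txt)
#   [ "$boots" -ge 2 ] && echo "multiple reboots in 24h: escalate"
```

Here `boot_times.txt` is a placeholder file name for wherever the timestamps are stored.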
===== Related Alerts =====
* NodeDown
* NodeNotReady
* KubeletDown
===== Related Dashboards =====
* Grafana → Node Overview
* Grafana → Node Exporter