NodeRebootedRecently

Meaning

This alert fires when a node has rebooted within the last 5 minutes. The reboot is detected by comparing the current time with the node's boot time as reported by node-exporter.
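The comparison described above can be sketched in shell. This is only an illustration under stated assumptions, not the alert's actual rule: the function name `rebooted_recently` and its argument order are hypothetical, the 300-second default mirrors the 5-minute window, and the PromQL shown in the comment assumes the standard node-exporter metric `node_boot_time_seconds`.

```shell
#!/usr/bin/env bash
# Sketch: decide whether a boot timestamp falls inside the alert window.
# A rule of this kind is often written in PromQL roughly as
#   time() - node_boot_time_seconds < 300
# (assumed form; check the real rule in your Prometheus configuration).

# rebooted_recently BOOT_EPOCH [NOW_EPOCH] [WINDOW_SECONDS]   (hypothetical helper)
# Prints "yes" if the node booted less than WINDOW_SECONDS ago, otherwise "no".
rebooted_recently() {
  local boot_epoch="$1"
  local now_epoch="${2:-$(date +%s)}"   # default: current time
  local window="${3:-300}"              # default: 5 minutes
  local uptime=$(( now_epoch - boot_epoch ))
  if [ "$uptime" -ge 0 ] && [ "$uptime" -lt "$window" ]; then
    echo "yes"
  else
    echo "no"
  fi
}
```

On a Linux node, the boot time can be read from the `btime` field of `/proc/stat`, for example `rebooted_recently "$(awk '/^btime/ {print $2}' /proc/stat)"`.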

Impact

This alert indicates a recent node restart, which may have affected workloads running on the node.

Possible impacts include:

Temporary disruption of pods scheduled on the node
Pod restarts or rescheduling to other nodes
Short-lived service degradation
Loss of in-memory application state

This alert is typically informational or warning-level, but it may require attention if reboots are frequent or unexpected.

Diagnosis

Verify node status and readiness:

kubectl get nodes

Check detailed node information and recent events:

kubectl describe node <NODE_NAME>

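Check events related to node reboots or pressure conditions. Node events are usually recorded in the `default` namespace, so query all namespaces if your current context points elsewhere:

kubectl get events --all-namespaces --field-selector involvedObject.kind=Node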

Check system uptime from node-exporter metrics (Grafana) or via SSH:

uptime

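If SSH access is available, check the system logs for the cause of the reboot. The first command lists the recorded boots; the second shows the logs from the boot before the current one: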

journalctl --list-boots
journalctl -b -1
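The previous boot's log can be long, so a small filter may help surface likely causes. The sketch below is a starting point only: the helper name `scan_reboot_hints` is hypothetical, and the pattern list is illustrative and far from exhaustive. It also assumes the journal is persisted across reboots; without persistent storage, `journalctl -b -1` has nothing to show.

```shell
#!/usr/bin/env bash
# Sketch: filter log lines for common reboot-related messages.
# The patterns are illustrative assumptions; extend them for your environment.

# scan_reboot_hints   (hypothetical helper)
# Reads log lines on stdin and prints those matching any pattern.
# Returns success even when nothing matches, so it is safe under `set -e`.
scan_reboot_hints() {
  grep -iE 'kernel panic|out of memory|oom-killer|hardware error|soft lockup|shutting down|reboot' || true
}

# Example usage on the node (not executed here):
#   journalctl -b -1 --no-pager | scan_reboot_hints | tail -n 20
#   journalctl -b -1 -k -p err --no-pager   # kernel messages at priority "err" or higher
```

An empty result does not rule out a crash: an abrupt power loss or host failure often leaves no final log lines at all.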

Possible Causes

Planned maintenance or OS patching
Kernel panic or hardware issue
Cloud provider host restart
Manual reboot by an operator
Power or resource pressure issues

Mitigation

Confirm whether the reboot was planned or expected
Ensure the node is in the `Ready` state
Verify that all critical pods have been rescheduled successfully
Check workloads for crash loops or degraded performance
If reboots are frequent, investigate system and kernel logs
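To check the pods on the rebooted node for crash loops or unready containers, the output of `kubectl get pods` can be filtered. The following is a minimal sketch: the helper name `filter_unhealthy_pods` is hypothetical, and it assumes the default column layout of `kubectl get pods --all-namespaces --no-headers` (NAMESPACE, NAME, READY, STATUS, ...).

```shell
#!/usr/bin/env bash
# Sketch: list pods that are not healthy, based on kubectl's default table output.
# Assumed input columns: NAMESPACE NAME READY STATUS RESTARTS AGE

# filter_unhealthy_pods   (hypothetical helper)
# Reads `kubectl get pods --all-namespaces --no-headers` output on stdin and
# prints pods whose STATUS is neither Running nor Completed, plus Running
# pods where not all containers are ready (READY is x/y with x != y).
filter_unhealthy_pods() {
  awk '{
    split($3, ready, "/")
    not_ok_status = ($4 != "Running" && $4 != "Completed")
    not_all_ready = ($4 == "Running" && ready[1] != ready[2])
    if (not_ok_status || not_all_ready)
      print $1 "/" $2, $4, "ready=" $3
  }'
}

# Example usage (not executed here; replace <NODE_NAME> with the node's name):
#   kubectl get pods --all-namespaces --no-headers \
#     --field-selector spec.nodeName=<NODE_NAME> | filter_unhealthy_pods
```

Dropping the `--field-selector` argument applies the same check cluster-wide, which also covers pods that were rescheduled to other nodes.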

If needed, temporarily cordon the node for investigation:

kubectl cordon <NODE_NAME>

Uncordon once verified healthy:

kubectl uncordon <NODE_NAME>

Escalation

If the reboot was unplanned, notify the platform or infrastructure team
If the same node reboots multiple times within 24 hours, escalate immediately
If production services are impacted, page the on-call engineer
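The "multiple reboots within 24 hours" criterion comes down to counting boot timestamps inside a time window. The sketch below illustrates that arithmetic only; the helper name `count_recent_boots` is hypothetical, and how the boot timestamps are collected (for example from monitoring history) is left open. In PromQL, a similar count could probably be obtained with `changes(node_boot_time_seconds[24h])`, assuming the standard node-exporter metric.

```shell
#!/usr/bin/env bash
# Sketch: count how many boots happened inside a time window.

# count_recent_boots [NOW_EPOCH] [WINDOW_SECONDS]   (hypothetical helper)
# Reads one boot time per line (Unix epoch seconds) on stdin and prints how
# many of them fall within WINDOW_SECONDS before NOW_EPOCH.
count_recent_boots() {
  local now_epoch="${1:-$(date +%s)}"   # default: current time
  local window="${2:-86400}"            # default: 24 hours
  awk -v now="$now_epoch" -v win="$window" '
    $1 ~ /^[0-9]+$/ && now - $1 >= 0 && now - $1 < win { n++ }
    END { print n + 0 }'
}

# Example usage (boots.txt is a hypothetical file of boot epochs for one node):
#   if [ "$(count_recent_boots < boots.txt)" -ge 2 ]; then
#     echo "node rebooted multiple times in 24h: escalate"
#   fi
```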

Related Alerts

NodeDown
NodeNotReady
KubeletDown

Related Dashboards

Grafana → Node Overview
Grafana → Node Exporter
