runbooks:coustom_alerts:KubernetesPodCrashLooping
This alert is triggered when a Kubernetes pod has restarted more than 10 times in the last 6 hours. It indicates that the pod is crash looping and unable to run stably.
Crash looping pods can cause:
This alert has warning severity, but it can become critical if it affects production workloads or multiple pods.
Check pod status:
kubectl get pod {{ $labels.pod }} -n {{ $labels.namespace }}
kubectl describe pod {{ $labels.pod }} -n {{ $labels.namespace }}
Check container restart count:
kubectl get pod {{ $labels.pod }} -n {{ $labels.namespace }} -o jsonpath='{.status.containerStatuses[*].restartCount}'
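The jsonpath query above prints one restart count per container, separated by spaces. As a rough sketch, a small helper can reduce that output to the highest count so it can be compared against the alert threshold of 10. The function name `max_restart_count` and the `POD` and `NAMESPACE` shell variables are illustrative assumptions standing in for the alert's template values; they are not part of the alert definition.

```shell
# Hypothetical helper: print the largest value in a space-separated
# list of restart counts (the format produced by the jsonpath query).
max_restart_count() {
  max=0
  for n in $1; do
    if [ "$n" -gt "$max" ]; then
      max=$n
    fi
  done
  echo "$max"
}

# Example with literal input; prints 12.
max_restart_count "3 12 0"

# Against a live cluster (POD and NAMESPACE are placeholders you set yourself):
# counts=$(kubectl get pod "$POD" -n "$NAMESPACE" -o jsonpath='{.status.containerStatuses[*].restartCount}')
# max_restart_count "$counts"
```

Taking the maximum rather than the sum keeps the focus on the single container that is actually looping, which is usually the one to inspect first.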
Inspect pod logs to identify the root cause:
kubectl logs {{ $labels.pod }} -n {{ $labels.namespace }} --previous
kubectl logs {{ $labels.pod }} -n {{ $labels.namespace }} --all-containers
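When the logs are empty or inconclusive, the exit code of the last terminated container often narrows down the cause. The sketch below maps a few well-known exit codes to likely causes. `explain_exit_code` is a hypothetical helper, and the mapping only covers a handful of common codes (signal-based codes follow the 128 + signal number convention), so treat its output as a hint rather than a diagnosis.

```shell
# Hypothetical helper: translate a container exit code into a likely cause.
explain_exit_code() {
  case "$1" in
    0)   echo "exit 0: process ended normally; it may not be a long-running process" ;;
    1)   echo "exit 1: generic application error; check the logs" ;;
    137) echo "exit 137: SIGKILL (128+9); often OOMKilled or a forced kill" ;;
    139) echo "exit 139: SIGSEGV (128+11); segmentation fault" ;;
    143) echo "exit 143: SIGTERM (128+15); termination was requested" ;;
    *)   echo "exit $1: not in this table; check the application's documentation" ;;
  esac
}

# Example with a literal exit code.
explain_exit_code 137

# Against a live cluster (POD and NAMESPACE are placeholders you set yourself):
# code=$(kubectl get pod "$POD" -n "$NAMESPACE" -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}')
# explain_exit_code "$code"
```

The commented query reads only the first container; for multi-container pods, adjust the index or iterate over the statuses.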
Check events for errors:
kubectl get events -n {{ $labels.namespace }} --sort-by=.lastTimestamp
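In a busy namespace the full event list can be noisy. One way to narrow it down, sketched here, is to use the `involvedObject.name` and `type` field selectors that events support, so that only warnings for the affected pod are shown. The helper name `pod_warning_selector` and the `POD` and `NAMESPACE` variables are assumptions for illustration.

```shell
# Hypothetical helper: build a field selector that matches only
# Warning events whose involved object has the given name.
pod_warning_selector() {
  echo "involvedObject.name=$1,type=Warning"
}

# Example with a made-up pod name.
pod_warning_selector "my-app-7d9f"

# Against a live cluster (POD and NAMESPACE are placeholders you set yourself):
# kubectl get events -n "$NAMESPACE" --field-selector "$(pod_warning_selector "$POD")" --sort-by=.lastTimestamp
```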
Restart the pod if needed (a controller such as a Deployment will recreate it after deletion):
kubectl delete pod {{ $labels.pod }} -n {{ $labels.namespace }}
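Deleting a pod only results in a restart if a controller recreates it; a bare pod is simply gone. A minimal sketch of a pre-check, assuming the owner kinds are read from the pod's `metadata.ownerReferences` with jsonpath. `is_managed` is a hypothetical helper, and `POD` and `NAMESPACE` are placeholder variables.

```shell
# Hypothetical helper: succeed when the owner-kind string is non-empty,
# meaning some controller (ReplicaSet, StatefulSet, DaemonSet, Job, ...) owns the pod.
is_managed() {
  [ -n "$1" ]
}

# Example with a literal owner kind.
if is_managed "ReplicaSet"; then
  echo "managed: the controller should recreate the pod after deletion"
fi

# Against a live cluster (POD and NAMESPACE are placeholders you set yourself):
# owners=$(kubectl get pod "$POD" -n "$NAMESPACE" -o jsonpath='{.metadata.ownerReferences[*].kind}')
# if is_managed "$owners"; then
#   kubectl delete pod "$POD" -n "$NAMESPACE"
# else
#   echo "bare pod: deleting it will not bring it back"
# fi
```

Note that a recreated pod will crash again if the underlying cause (configuration, image, resources) has not been addressed, so this step is best taken after the checks above.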