runbooks:coustom_alerts:HighDiskIOWait
This alert is triggered when the CPU spends an unusually high amount of time waiting for disk I/O operations to complete. High I/O wait typically indicates disk performance bottlenecks.
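The exact rule lives in the monitoring configuration; a representative expression, assuming node_exporter metrics and an illustrative 30% threshold, would be:
avg by (instance) (rate(node_cpu_seconds_total{mode="iowait"}[5m])) > 0.3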
Sustained high disk I/O wait can significantly degrade system and application performance.
Possible impacts include:
- Increased application latency and request timeouts
- Slow pod startup, image pulls, and volume operations
- Failing liveness and readiness probes, leading to container restarts
- Node instability if the kubelet or container runtime is starved of I/O
This alert fires at warning severity, but the impact can escalate if the condition persists.
Check overall CPU and memory usage across nodes (kubectl top reports aggregate CPU and memory only; I/O wait itself must be inspected on the node, as in the next step):
kubectl top nodes
If SSH access is available, inspect disk I/O metrics directly:
iostat -xz 1   # watch %iowait, await, and %util for saturated devices
vmstat 1       # the wa column shows CPU time spent waiting on I/O
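If SSH is not available, a node debug pod can provide similar access, assuming the cluster permits kubectl debug with a privileged pod (the image is illustrative):
kubectl debug node/<NODE_NAME> -it --image=ubuntu
# the node filesystem is mounted at /host inside the debug pod; use chroot /host to run node tooling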
Identify processes causing high disk I/O:
iotop
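If iotop output is noisy or the tool is missing, these variants can help (flags assume the standard iotop and sysstat packages):
iotop -oPa      # only processes actually doing I/O, accumulated per process
pidstat -d 1    # per-process disk read/write rates from sysstat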
Check disk usage and pressure conditions:
kubectl describe node <NODE_NAME>
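The Conditions section of that output shows whether the node reports DiskPressure; a convenience one-liner to check the condition directly, assuming a standard node object, is:
kubectl get node <NODE_NAME> -o jsonpath='{.status.conditions[?(@.type=="DiskPressure")].status}'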
Check whether disk-related node events (such as DiskPressure or evictions) are being reported:
kubectl get events --field-selector involvedObject.kind=Node
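If many node events are present, the query can be scoped to the affected node:
kubectl get events --field-selector involvedObject.kind=Node,involvedObject.name=<NODE_NAME>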
If the node is severely impacted, drain it temporarily:
kubectl drain <NODE_NAME> --ignore-daemonsets
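Depending on the workloads on the node, drain may need extra flags; a fuller sequence, assuming some pods use emptyDir volumes (older kubectl versions call the last flag --delete-local-data), might be:
kubectl cordon <NODE_NAME>
kubectl drain <NODE_NAME> --ignore-daemonsets --delete-emptydir-data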
After mitigation, once the underlying disk issue is resolved, return the node to service:
kubectl uncordon <NODE_NAME>
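To confirm the node is schedulable again and the alert clears, re-check its status and resource usage:
kubectl get node <NODE_NAME>
kubectl top node <NODE_NAME>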