HighDiskIOWait
Meaning
This alert is triggered when the CPU spends an unusually high amount of time waiting for disk I/O operations to complete. High I/O wait typically indicates disk performance bottlenecks.
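For reference, on Linux the I/O wait figure comes from the iowait counter on the aggregate cpu line of /proc/stat. The following is a minimal sketch of how that percentage can be derived from two samples. It is not the alert's actual expression, which this runbook does not show, and the function names iowait_pct and sample_iowait are made up for illustration.

```shell
# Compute the iowait share of CPU time between two samples of the
# aggregate "cpu" line from /proc/stat.
# Fields after the "cpu" label: user nice system idle iowait irq softirq steal ...
iowait_pct() {
    # $1 = earlier "cpu ..." line, $2 = later "cpu ..." line
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { for (i = 2; i <= 9; i++) t1 += $i; w1 = $6 }
        NR == 2 { for (i = 2; i <= 9; i++) t2 += $i; w2 = $6 }
        END {
            dt = t2 - t1
            if (dt <= 0) { print "0.0"; exit }
            printf "%.1f\n", 100 * (w2 - w1) / dt
        }'
}

# Sample the live counters some seconds apart (default 1). Linux only.
sample_iowait() {
    a=$(grep '^cpu ' /proc/stat)
    sleep "${1:-1}"
    b=$(grep '^cpu ' /proc/stat)
    iowait_pct "$a" "$b"
}
```

Running sample_iowait 5 on the affected node prints the average I/O wait percentage across all CPUs over a five-second window.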
Impact
Sustained high disk I/O wait can significantly degrade system and application performance.
Possible impacts include:
- Increased application latency
- Slow database queries and file operations
- Pod startup delays
- Reduced overall node throughput
This alert is a warning, but may escalate if the condition persists.
Diagnosis
Check overall CPU and memory usage per node (kubectl top does not report I/O wait itself, so use it only to spot heavily loaded nodes):
kubectl top nodes
If SSH access is available, inspect disk I/O metrics directly:
iostat -xz 1
vmstat 1
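To turn the vmstat output into a single number, the wa column (percentage of CPU time spent waiting for I/O) can be averaged over a short window. This is a minimal sketch; the helper name avg_wa is hypothetical. It finds the column by its header name because column positions differ between vmstat versions, and it skips the first data row, which reports averages since boot.

```shell
# Average the "wa" column of vmstat output read from stdin.
avg_wa() {
    awk '
        # Header row: locate the "wa" column by name.
        $1 == "r" && $2 == "b" {
            for (i = 1; i <= NF; i++) if ($i == "wa") col = i
            next
        }
        # Data rows start with a number once the header has been seen.
        col && $1 ~ /^[0-9]+$/ {
            rows++
            if (rows == 1) next      # first row is the average since boot
            sum += $col; n++
        }
        END { if (n) printf "%.1f\n", sum / n; else print "n/a" }'
}
```

For example, vmstat 1 6 | avg_wa averages five one-second samples. In the iostat -xz output, a %util value that stays close to 100 for a device points at that device as the bottleneck.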
Identify processes causing high disk I/O (iotop normally needs root; -o shows only processes that are actively doing I/O):
sudo iotop -o
Check the node's conditions (for example DiskPressure), allocated resources, and recent events:
kubectl describe node <NODE_NAME>
Check for disk-related node events:
kubectl get events --field-selector involvedObject.kind=Node
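The event list can be long, so it may help to keep only the lines that mention disk-related kubelet event reasons. The sketch below is one way to do that; the helper name disk_events is hypothetical, and the reason list is a starting point rather than an exhaustive set.

```shell
# Keep only event lines that mention common disk-related kubelet reasons.
# Reads "kubectl get events" output from stdin.
disk_events() {
    grep -E 'NodeHasDiskPressure|EvictionThresholdMet|FreeDiskSpaceFailed|ImageGCFailed'
}
```

A possible invocation is kubectl get events -A --field-selector involvedObject.kind=Node | disk_events. No output means none of these reasons appeared in the retained events.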
Possible Causes
- Disk saturation due to heavy read/write operations
- Slow or degraded storage (network-attached or cloud disks)
- Log flooding or excessive file writes
- Database or batch jobs performing intensive I/O
- Disk nearing full capacity
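To rule the last cause in or out quickly, filesystem usage on the node can be checked against a threshold. This is a minimal sketch that filters df -P output; the helper name disks_over and the default of 85 percent are illustrative assumptions, and mount points containing spaces are not handled.

```shell
# Print mount points whose usage is at or above a threshold.
# Reads "df -P" output from stdin; $1 = threshold in percent (default 85).
disks_over() {
    awk -v limit="${1:-85}" '
        NR > 1 {                       # skip the header line
            use = $5; sub(/%$/, "", use)
            if (use + 0 >= limit + 0) print $6, use "%"
        }'
}
```

For example, df -P | disks_over 90 lists every filesystem that is at least 90 percent full.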
Mitigation
- Identify and throttle or stop I/O-heavy workloads
- Move high I/O workloads to faster storage
- Enable or tune log rotation
- Scale out workloads to reduce per-node I/O pressure
- Increase disk performance (IOPS / throughput) if supported
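If the node is severely impacted, drain it temporarily. This cordons the node and evicts its pods; add --delete-emptydir-data if the drain is refused because pods use emptyDir volumes: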
kubectl drain <NODE_NAME> --ignore-daemonsets
After mitigation, make the node schedulable again:
kubectl uncordon <NODE_NAME>
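The drain and uncordon steps can be wrapped so that the exact commands are previewed before they run. This is a sketch under the assumption that a preview mode is useful in your environment; the function names drain_node and restore_node and the KUBECTL variable are hypothetical.

```shell
# KUBECTL can be overridden, e.g. KUBECTL="echo kubectl" to print the
# commands instead of running them.
KUBECTL="${KUBECTL:-kubectl}"

# Cordon the node and evict its pods, leaving DaemonSet pods in place.
drain_node() {
    [ -n "$1" ] || { echo "usage: drain_node <NODE_NAME>" >&2; return 2; }
    # $KUBECTL is intentionally unquoted so "echo kubectl" splits into words.
    $KUBECTL drain "$1" --ignore-daemonsets
}

# Make the node schedulable again once I/O wait is back to normal.
restore_node() {
    [ -n "$1" ] || { echo "usage: restore_node <NODE_NAME>" >&2; return 2; }
    $KUBECTL uncordon "$1"
}
```

Setting KUBECTL="echo kubectl" before calling drain_node with a node name prints the command without touching the cluster.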
Escalation
- If high I/O wait persists beyond 10 minutes, escalate to the platform team
- If multiple nodes are affected, treat as a storage-level incident
- If production services are impacted, page the on-call engineer
Related Alerts
- HighDiskUsage
- NodeNotReady
- HighCPUUsage
Related Dashboards
- Grafana → Node Exporter / Disk I/O
- Grafana → Storage Performance Overview
runbooks/coustom_alerts/highdiskiowait.txt · Last modified: by admin
