runbooks:coustom_alerts:hostunusualdiskreadrate
HostUnusualDiskReadRate
Meaning
This alert is triggered when a host node experiences high disk read activity, with IO wait greater than 80% over a 5-minute window. It indicates that the disk may be a bottleneck or under heavy load.
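As a rough illustration of what an 80% threshold over a window means, the sketch below computes the share of time a device was busy between two samples. It assumes the figure is derived from the kernel's "time spent doing I/Os" counter (field 13 of /proc/diskstats on Linux). The actual alert expression is not shown in this runbook, and the helper names busy_percent, io_ticks_ms, and sample_busy are hypothetical.

```shell
# busy_percent START_MS END_MS INTERVAL_S
# START_MS and END_MS are two readings of the "ms spent doing I/Os" counter,
# taken INTERVAL_S seconds apart. Prints the busy share as a percentage.
busy_percent() {
  awk -v s="$1" -v e="$2" -v t="$3" \
    'BEGIN { printf "%.1f\n", (e - s) / (t * 1000) * 100 }'
}

# io_ticks_ms DEVICE
# Reads field 13 of /proc/diskstats (ms spent doing I/Os) for DEVICE, e.g. sda.
io_ticks_ms() {
  awk -v d="$1" '$3 == d { print $13 }' /proc/diskstats
}

# sample_busy DEVICE [INTERVAL_S]
# Takes two readings INTERVAL_S seconds apart (default 5) and prints busy %.
sample_busy() {
  sb_dev="$1"
  sb_interval="${2:-5}"
  sb_start=$(io_ticks_ms "$sb_dev")
  sleep "$sb_interval"
  sb_end=$(io_ticks_ms "$sb_dev")
  busy_percent "$sb_start" "$sb_end" "$sb_interval"
}
```

For example, `sample_busy sda 5` samples sda for five seconds. A counter that advances by 4000 ms over a 5-second interval corresponds to 80% busy.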
Impact
High disk read rates can lead to:
- Application slowdowns or latency
- Increased pod response times
- Potential cascading failures if services rely on disk-intensive operations
- Node-level resource contention
This alert has warning severity, because prolonged high IO can degrade performance or trigger other alerts.
Diagnosis
Check disk IO statistics:
iostat -x 1 5
iotop -o
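To pick out the busy devices from extended iostat output, a small filter such as the following may help. It is a sketch that assumes the output has a header row starting with "Device" and a %util column, as recent sysstat versions print. The function name busy_devices and the default threshold of 80 are illustrative choices, not part of this runbook.

```shell
# busy_devices [THRESHOLD]
# Reads `iostat -x` output on stdin and prints "device %util" for every
# row whose %util value is above THRESHOLD (default 80).
busy_devices() {
  awk -v limit="${1:-80}" '
    # Header row: remember which column holds %util.
    $1 ~ /^Device/ {
      col = 0
      for (i = 1; i <= NF; i++) if ($i == "%util") col = i
      next
    }
    # Data rows: compare the numeric %util value against the threshold.
    col && NF >= col && $col ~ /^[0-9.]+$/ && $col + 0 > limit {
      print $1, $col
    }
  '
}

# Usage (LC_ALL=C keeps the decimal separator a dot):
#   LC_ALL=C iostat -x 1 5 | busy_devices 80
```

The first iostat report covers the time since boot, so a device can be listed once for that report and again for each later interval.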
Check system-wide IO wait:
top
vmstat 1 5
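The wa column of vmstat can be averaged to compare against the 80% figure. The sketch below assumes the usual vmstat layout, with a header row that starts with "r" and contains a "wa" column. The function name avg_iowait is hypothetical.

```shell
# avg_iowait
# Reads `vmstat` output on stdin and prints the mean of the "wa" column,
# skipping the first sample because it is an average since boot.
avg_iowait() {
  awk '
    # Column header row: locate the "wa" column by name.
    $1 == "r" {
      col = 0
      for (i = 1; i <= NF; i++) if ($i == "wa") col = i
      next
    }
    # Data rows start with a number (the run-queue length).
    col && $1 ~ /^[0-9]+$/ {
      rows++
      if (rows == 1) next
      sum += $col
      n++
    }
    END {
      if (n) printf "%.1f\n", sum / n
      else print "no samples"
    }
  '
}

# Usage:
#   vmstat 1 5 | avg_iowait
```

With `vmstat 1 5` this averages the four interval samples that follow the since-boot line.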
Check disk usage and filesystem health:
df -h
lsblk
smartctl -a /dev/sdX
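Check resource usage of pods on the node (kubectl top reports CPU and memory, not disk IO, so treat it as a hint and confirm disk activity with iotop):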
kubectl top pod --all-namespaces --field-selector spec.nodeName={{ $labels.instance }}
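The instance label often has the form host:port, which does not match a node name directly. The sketch below strips the port and lists the pods scheduled on that node. It assumes the host part of the label equals the Kubernetes node name, which is not true in every setup (the label may be an IP address). The function names instance_to_host and pods_on_node are hypothetical.

```shell
# instance_to_host INSTANCE
# Strips a trailing ":port" from a Prometheus instance label, if present.
instance_to_host() {
  printf '%s\n' "${1%:*}"
}

# pods_on_node INSTANCE
# Lists all pods scheduled on the node named by the instance label.
# Assumption: the host part of the label is the Kubernetes node name.
pods_on_node() {
  pn_node=$(instance_to_host "$1")
  kubectl get pods --all-namespaces \
    --field-selector "spec.nodeName=${pn_node}" -o wide
}

# Usage:
#   pods_on_node worker-1:9100
```

If the label holds an IP address, `kubectl get nodes -o wide` shows which node name it maps to.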
Possible Causes
- Disk-intensive workloads or batch jobs
- Logging or database writes causing high IO
- Slow or failing disks
- Misconfigured storage (e.g., small volumes)
- Backup jobs or heavy monitoring metrics writes
Mitigation
- Identify and reduce disk-intensive workloads
- Move high IO workloads to other nodes or storage
- Monitor disk health and replace failing disks
- Tune filesystem or storage configuration if needed
- Scale out storage for critical workloads
Escalation
- Escalate if high IO persists for extended periods
- Page on-call engineer if production services are impacted
- Investigate related alerts (DiskPressure, HighDiskUsage)
Related Alerts
- HighDiskUsage
- HighDiskIOWait
- KubernetesNodeDiskPressure
- HostUnusualDiskWriteRate
Related Dashboards
- Grafana → Node Disk IO
- Grafana → Node Exporter Disk Metrics
runbooks/coustom_alerts/hostunusualdiskreadrate.txt · Last modified: by admin
