These alerts monitor the Kubernetes Job created by the Zammad backup CronJob.
Alerts:
* Success → The backup completed and the Zammad database dump is stored in MinIO.
* Failure → The Zammad database may not have been backed up, which could affect disaster recovery if a restore is needed.
Diagnosis:
1. Check Kubernetes Job status:
kubectl get job zammad-backup-job -n <NAMESPACE>
kubectl describe job zammad-backup-job -n <NAMESPACE>
2. Check logs of the Job pod:
kubectl logs job/zammad-backup-job -n <NAMESPACE>
3. Verify backup in MinIO:
mc ls <MINIO_ALIAS>/zammad-backups/
mc stat <MINIO_ALIAS>/zammad-backups/<backup_file>
4. Check PVC mounts:
kubectl get pvc -n <NAMESPACE>
kubectl describe pvc <PVC_NAME> -n <NAMESPACE>
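The Job status check in step 1 can be scripted. The following is a minimal sketch, not the team's official tooling. It assumes that Jobs spawned by the CronJob carry a generated suffix after the `zammad-backup-job` prefix, which is the usual Kubernetes behavior but is not confirmed by this runbook. The function names `latest_job` and `job_state` are hypothetical helpers introduced for illustration.

```shell
#!/bin/sh
# Sketch: locate the newest backup Job and report whether it completed or failed.
# Assumes kubectl is already configured for the target cluster.

# Print the most recently created Job whose name starts with the given prefix.
# Output has the form "job.batch/<name>".
latest_job() {
  local ns="$1" prefix="$2"
  kubectl get jobs -n "$ns" --sort-by=.metadata.creationTimestamp -o name \
    | grep "^job.batch/${prefix}" \
    | tail -n 1
}

# Print "Complete", "Failed", or "Running/Unknown" based on the Job's conditions.
job_state() {
  local ns="$1" job="$2" complete failed
  complete=$(kubectl get "$job" -n "$ns" \
    -o jsonpath='{.status.conditions[?(@.type=="Complete")].status}')
  failed=$(kubectl get "$job" -n "$ns" \
    -o jsonpath='{.status.conditions[?(@.type=="Failed")].status}')
  if [ "$complete" = "True" ]; then
    echo "Complete"
  elif [ "$failed" = "True" ]; then
    echo "Failed"
  else
    echo "Running/Unknown"
  fi
}
```

A possible invocation is `job_state <NAMESPACE> "$(latest_job <NAMESPACE> zammad-backup-job)"`. If the state is `Failed`, continue with the log, MinIO, and PVC checks in steps 2 to 4.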
Possible causes:
* Pod in CrashLoopBackOff, OOMKilled, or Failed state
* PVC mount unavailable or out of space
* MinIO credentials missing or misconfigured
* Network issues preventing upload to MinIO
* Disk space or permission issues on the node
* CronJob manifest misconfiguration
* PostgreSQL credentials invalid or database unreachable
Resolution:
1. Inspect Job pod logs to identify errors.
2. Verify MinIO credentials and connectivity.
3. Check PVC status and node disk availability.
4. Verify PostgreSQL credentials.
5. Retry backup manually if needed:
kubectl create job --from=cronjob/zammad-backup-job zammad-backup-job-manual -n <NAMESPACE>
6. Correct any misconfigurations in the CronJob YAML, PVC, database, or MinIO bucket policy.
7. Escalate to the SRE or admin team if failures repeat.
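The manual retry in step 5 can be wrapped so that it waits for the result instead of returning immediately. This is a sketch under assumptions: the timestamped Job name, the 30-minute default timeout, and the helper names `manual_job_name` and `run_manual_backup` are illustrative choices, not part of the existing setup. Note that `kubectl wait --for=condition=complete` does not return early when a Job fails, so a failed run shows up only once the timeout expires.

```shell
#!/bin/sh
# Sketch: trigger a one-off backup Job from the CronJob and wait for completion.

# Build a unique Job name so repeated manual runs do not collide,
# e.g. zammad-backup-job-manual-20240101-120000 (UTC).
manual_job_name() {
  local cronjob="$1"
  echo "${cronjob}-manual-$(date -u +%Y%m%d-%H%M%S)"
}

# Create the Job, wait for the Complete condition, and print recent logs
# if the Job did not complete within the timeout.
run_manual_backup() {
  local ns="$1" cronjob="$2" timeout="${3:-30m}" name
  name=$(manual_job_name "$cronjob")
  kubectl create job "$name" --from="cronjob/${cronjob}" -n "$ns" || return 1
  if kubectl wait --for=condition=complete "job/${name}" -n "$ns" --timeout="$timeout"; then
    echo "succeeded: ${name}"
  else
    echo "not complete within ${timeout}: ${name}"
    kubectl logs "job/${name}" -n "$ns" --tail=50
    return 1
  fi
}
```

A possible invocation is `run_manual_backup <NAMESPACE> zammad-backup-job 45m`. After a successful run, confirm the new object in MinIO with `mc ls` as in the diagnosis steps.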
Escalation:
* Escalate if the backup fails on two or more consecutive runs.
* Notify the on-call engineer if the Zammad database may not be recoverable.
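The consecutive-failure rule can be checked from Job history. The sketch below counts how many of the most recent finished backup Jobs failed in a row. It assumes the namespace still retains enough Job history (controlled by the CronJob's history limits) and that Job names start with `zammad-backup-job`. The helper names are hypothetical.

```shell
#!/bin/sh
# Sketch: count the current streak of failed backup Jobs, oldest Job first.

# Read lines of the form "<job-name> <condition types...>" from stdin and
# print the number of trailing Failed Jobs. Jobs that are still running
# (neither Complete nor Failed) leave the streak unchanged.
consecutive_failures() {
  awk '{
    for (i = 2; i <= NF; i++) {
      if ($i == "Failed") n++
      else if ($i == "Complete") n = 0
    }
  } END { print n + 0 }'
}

# List Jobs sorted by creation time with the condition types whose status is True,
# keep only the backup Jobs, and compute the failure streak.
backup_failure_streak() {
  local ns="$1" prefix="$2"
  kubectl get jobs -n "$ns" --sort-by=.metadata.creationTimestamp \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.conditions[?(@.status=="True")].type}{"\n"}{end}' \
    | grep "^${prefix}" \
    | consecutive_failures
}

# Succeed (exit 0) when the backup has failed on more than one consecutive run.
should_escalate() {
  [ "$(backup_failure_streak "$1" "$2")" -gt 1 ]
}
```

A possible use is `should_escalate <NAMESPACE> zammad-backup-job && echo "escalate to SRE"`.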
Related alerts:
* ZammadBackupSucceeded
* ZammadBackupFailed
* HostOutOfDiskSpace (node running backup Job)
* KubernetesPodCrashLooping
Dashboards:
* Kubernetes → Jobs & CronJobs (namespace: <NAMESPACE>)
* Grafana → Backup Job status metrics
* MinIO → Backup object listings