OpenKM Backup CronJob
Meaning
These alerts monitor the Kubernetes Job created by the OpenKM backup CronJob.
Alerts:
- `OpenKMBackupSucceeded` → Backup Job succeeded; MariaDB database and OpenKM repository files were archived and uploaded to MinIO.
- `OpenKMBackupFailed` → Backup Job failed; the database dump or repository file backup did not complete, or the upload to MinIO failed.
Impact
* Success → Backup completed successfully. OpenKM data is safely stored in MinIO.
* Failure → The OpenKM database or repository files may not be backed up. Disaster recovery is at risk if a restore is needed.
Diagnosis
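1. Check Kubernetes Job status. Jobs spawned by a CronJob normally carry a generated suffix, so if the exact name is not found, list all Jobs with `kubectl get jobs -n <NAMESPACE>` first:
kubectl get job openkm-backup-job -n <NAMESPACE>
kubectl describe job openkm-backup-job -n <NAMESPACE>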
2. Check logs of the Job pod:
kubectl logs job/openkm-backup-job -n <NAMESPACE>
3. Verify backup in MinIO:
mc ls <MINIO_ALIAS>/openkm-backups/
mc stat <MINIO_ALIAS>/openkm-backups/<backup_file>
4. Check PVC mounts:
kubectl get pvc -n <NAMESPACE>
kubectl describe pvc <PVC_NAME> -n <NAMESPACE>
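The first two diagnosis steps can be combined into a small helper. This is a minimal sketch, not part of the original runbook: the function names `latest_backup_job` and `diagnose_backup` are hypothetical, and it assumes that Jobs spawned by the CronJob share the `openkm-backup-job` name prefix.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the "openkm-backup-job" prefix is an assumption
# based on the CronJob name used elsewhere in this runbook.

# latest_backup_job NAMESPACE [PREFIX]
# Prints the name of the most recently created Job whose name starts with PREFIX.
latest_backup_job() {
  local ns="$1" prefix="${2:-openkm-backup-job}"
  kubectl get jobs -n "$ns" \
    --sort-by=.metadata.creationTimestamp -o name |
    sed 's|^job\.batch/||' |   # "-o name" prints "job.batch/<name>"
    grep "^${prefix}" |
    tail -n 1                  # oldest first, so the last line is the newest
}

# diagnose_backup NAMESPACE
# Describes the newest backup Job and prints the tail of its logs.
diagnose_backup() {
  local ns="$1" job
  job="$(latest_backup_job "$ns")"
  if [ -z "$job" ]; then
    echo "no backup Job found in namespace $ns" >&2
    return 1
  fi
  echo "latest Job: $job"
  kubectl describe job "$job" -n "$ns"
  kubectl logs "job/$job" -n "$ns" --tail=50
}
```

After sourcing the snippet, `diagnose_backup <NAMESPACE>` prints the description and the last 50 log lines of the newest matching Job.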
Possible Causes of Failure
* Pod in CrashLoopBackOff, OOMKilled, or Failed
* PVC mount unavailable or insufficient space
* MinIO credentials missing or misconfigured
* Network issues preventing upload to MinIO
* Disk space or permissions issues on the node
* CronJob manifest misconfiguration
* MariaDB credentials invalid or inaccessible
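To narrow down the pod-level causes above, the state of the pods belonging to a Job can be listed through the `job-name` label that the Job controller sets on its pods. The sketch below is illustrative only: `job_pod_reasons` and `flag_failed_pods` are hypothetical names, and the reason strings matched are the common ones rather than an exhaustive list.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the original runbook.

# job_pod_reasons NAMESPACE JOB_NAME
# Prints one tab-separated line per pod:
#   name, phase, terminated reason(s), waiting reason(s)
job_pod_reasons() {
  local ns="$1" job="$2"
  kubectl get pods -n "$ns" -l "job-name=$job" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\t"}{.status.containerStatuses[*].state.terminated.reason}{"\t"}{.status.containerStatuses[*].state.waiting.reason}{"\n"}{end}'
}

# flag_failed_pods
# Reads job_pod_reasons output on stdin and keeps only pods that look unhealthy.
flag_failed_pods() {
  awk -F '\t' '$2 == "Failed" || $3 ~ /OOMKilled|Error/ || $4 ~ /CrashLoopBackOff/'
}
```

A typical call would be `job_pod_reasons <NAMESPACE> <JOB_NAME> | flag_failed_pods`; an empty result suggests looking at the other causes (PVC, MinIO, MariaDB) instead.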
Mitigation
1. Inspect Job pod logs to identify errors.
2. Verify MinIO credentials and connectivity.
3. Check PVC status and node disk availability.
4. Verify MariaDB credentials and connectivity.
5. Retry backup manually if needed:
kubectl create job --from=cronjob/openkm-backup-job openkm-backup-job-manual -n <NAMESPACE>
6. Correct any misconfigurations in CronJob YAML, PVC, or MinIO bucket policy.
7. Escalate to SRE or admin team if repeated failures occur.
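The manual retry in step 5 fails with an "already exists" error if a Job with the same name is still present from an earlier attempt. One way around this, sketched below under assumptions, is to append a timestamp to the Job name and wait for completion. The function name `run_manual_backup` and the 30-minute default timeout are hypothetical choices, not values from this runbook.

```shell
#!/usr/bin/env bash
# Hypothetical helper; the timeout default is an assumption.

# run_manual_backup NAMESPACE [TIMEOUT]
# Creates a uniquely named Job from the CronJob and waits for it to complete.
run_manual_backup() {
  local ns="$1" timeout="${2:-30m}" job
  job="openkm-backup-job-manual-$(date +%s)"   # unique per second
  kubectl create job --from=cronjob/openkm-backup-job "$job" -n "$ns" || return 1
  if kubectl wait --for=condition=complete "job/$job" -n "$ns" --timeout="$timeout"; then
    echo "manual backup $job completed"
  else
    # kubectl wait also ends up here when the Job fails outright,
    # because the "complete" condition is then never reached.
    echo "manual backup $job did not complete within $timeout" >&2
    kubectl logs "job/$job" -n "$ns" --tail=50
    return 1
  fi
}
```

For example, `run_manual_backup <NAMESPACE> 45m` starts one run and returns non-zero if it has not completed after 45 minutes.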
Escalation
* Escalate if two or more consecutive backup runs fail.
* Notify the on-call engineer if OpenKM data may not be recoverable.
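Because a CronJob retains only a limited history of failed Jobs, counting failures from the Job list can undercount. A rough alternative is to check whether any recent backup object exists in MinIO. The sketch below assumes a daily schedule, so a 2-day window approximates "more than one missed run"; the function names and the window are hypothetical, and `mc find --newer-than` behaviour is worth confirming against the installed `mc` version.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the 2d default assumes the backup runs daily.

# recent_backup_count ALIAS [WINDOW]
# Prints the number of objects in openkm-backups newer than WINDOW (mc units, e.g. 2d).
recent_backup_count() {
  mc find "$1/openkm-backups" --newer-than "${2:-2d}" | grep -c .
}

# should_escalate ALIAS [WINDOW]
# Returns 0 (escalate) when no recent backup object is found, 1 otherwise.
# An unreachable MinIO also yields zero objects and therefore escalates.
should_escalate() {
  local count
  count="$(recent_backup_count "$@")"
  if [ "${count:-0}" -eq 0 ]; then
    echo "ESCALATE: no backup object newer than ${2:-2d} in $1/openkm-backups"
    return 0
  fi
  echo "OK: $count backup object(s) newer than ${2:-2d}"
  return 1
}
```

A call such as `should_escalate <MINIO_ALIAS> 2d` could be run by hand during triage; the window would need adjusting to the actual CronJob schedule.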
Related Alerts
* OpenKMBackupSucceeded
* OpenKMBackupFailed
* HostOutOfDiskSpace (node running backup Job)
* KubernetesPodCrashLooping
Related Dashboards
* Kubernetes → Jobs & CronJobs (namespace: <NAMESPACE>)
* Grafana → Backup Job status metrics
* MinIO → Backup object listings
