====== Flectra Backup CronJob ======

===== Meaning =====

These alerts monitor the Kubernetes Job created by the Flectra backup CronJob.

Alerts:

  * `FlectraBackupSucceeded` → Backup Job succeeded; the PostgreSQL database, filestore, and extra-addons have been archived and uploaded to MinIO.
  * `FlectraBackupFailed` → Backup Job failed; the backup was not created or the upload to MinIO failed.

===== Impact =====

  * Success → Backup completed successfully. Data is safely stored in MinIO.
  * Failure → The Flectra database or files may not be backed up. This could affect disaster recovery if a restore is needed.

===== Diagnosis =====

1. Check the Kubernetes Job status:

  kubectl get job flectra-backup-job -n <namespace>
  kubectl describe job flectra-backup-job -n <namespace>

2. Check the logs of the Job pod:

  kubectl logs job/flectra-backup-job -n <namespace>

3. Verify the backup in MinIO:

  mc ls <minio-alias>/flectra-backups/
  mc stat <minio-alias>/flectra-backups/<backup-file>

4. Check PVC mounts, if used:

  kubectl get pvc -n <namespace>
  kubectl describe pvc <pvc-name> -n <namespace>

===== Possible Causes of Failure =====

  * Pod in CrashLoopBackOff, OOMKilled, or Failed state
  * PVC mount unavailable or out of space
  * MinIO credentials missing or misconfigured
  * Network issues preventing upload to MinIO
  * Disk space or permission issues on the node
  * CronJob manifest misconfiguration

===== Mitigation =====

1. Inspect the Job pod logs to identify errors.

2. Verify MinIO credentials and connectivity.

3. Check PVC status and node disk availability.

4. Retry the backup manually if needed:

  kubectl create job --from=cronjob/flectra-backup-job flectra-backup-job-manual -n <namespace>

5. Correct any misconfiguration in the CronJob YAML or the MinIO bucket policy.

6. Escalate to the SRE or admin team if failures repeat.

===== Escalation =====

  * Escalate if backups fail on two or more consecutive runs.
  * Notify the on-call engineer if production Flectra data may not be recoverable.

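The Job status check from the Diagnosis steps can be scripted so that it returns a single word instead of the full `kubectl describe` output. The following is a minimal sketch, not part of the original runbook: the function names `classify_job` and `check_backup_job` are hypothetical, and it assumes the Job reports the standard `Complete` and `Failed` conditions in its status.

```shell
#!/usr/bin/env bash
# Sketch only: function names and the argument order are assumptions,
# not taken from the Flectra backup manifests.

# Map the Job's "Complete" and "Failed" condition statuses to one word.
classify_job() {
  local complete="$1" failed="$2"
  if [ "$complete" = "True" ]; then
    echo "succeeded"
  elif [ "$failed" = "True" ]; then
    echo "failed"
  else
    # Neither condition is set yet: the Job is still running or retrying.
    echo "in-progress"
  fi
}

# Read both conditions through kubectl's JSONPath output and classify them.
# $1 = Job name, $2 = namespace.
check_backup_job() {
  local job="$1" ns="$2" complete failed
  complete="$(kubectl get job "$job" -n "$ns" \
    -o jsonpath='{.status.conditions[?(@.type=="Complete")].status}')"
  failed="$(kubectl get job "$job" -n "$ns" \
    -o jsonpath='{.status.conditions[?(@.type=="Failed")].status}')"
  classify_job "$complete" "$failed"
}
```

After sourcing the file, call `check_backup_job` with the Job name and the namespace used in the Diagnosis commands. An `in-progress` result means neither condition is set, so the pod logs are still the place to look.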
===== Related Alerts =====

  * FlectraBackupSucceeded
  * FlectraBackupFailed
  * HostOutOfDiskSpace (node running the backup Job)
  * KubernetesPodCrashLooping

===== Related Dashboards =====

  * Kubernetes → Jobs & CronJobs (namespace: <namespace>)
  * Grafana → Backup Job status metrics
  * MinIO → Backup object listings