# Disaster Recovery & Backup Resources

This directory contains all Kubernetes resources related to disaster recovery and backup operations for Gravl.

## Files

### `postgres-backup-cronjob.yaml`

Defines automated daily backup CronJob for PostgreSQL database.

**Components:**
- PostgreSQL Backup ServiceAccount
- RBAC ClusterRole and ClusterRoleBinding
- Daily Backup CronJob (runs at 02:00 UTC)
- Weekly Backup Test CronJob (runs at 03:00 UTC on Sundays)

**Key Features:**
- Automated daily full backups of gravl database
- Gzip compression (level 6)
- Upload to S3 with encryption (AES256)
- Backup manifest generation with checksums
- Automatic retry on failure (up to 3 attempts)
- 1-hour timeout for backup operations

**Deployment:**
```bash
kubectl apply -f postgres-backup-cronjob.yaml
```

## Manual Backup Scripts

All scripts are in `/workspace/gravl/scripts/`:

- **backup.sh** - Perform manual full database backup to S3
- **restore.sh** - Restore database from S3 backup
- **test-restore.sh** - Automated backup restore testing
- **failover.sh** - Initiate failover to secondary region
- **failback.sh** - Failback to primary region

## Monitoring & Alerts

- **Prometheus Rules:** ../monitoring/prometheus-rules-dr.yaml
- **Grafana Dashboard:** ../monitoring/dashboards/gravl-disaster-recovery.json

## Documentation

See `/workspace/gravl/docs/DISASTER_RECOVERY.md` for comprehensive documentation including:
- RTO/RPO strategy
- Backup architecture
- Restore procedures
- Multi-region failover design
- Runbooks for disaster scenarios
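
For orientation, the "Gzip compression (level 6)" and "Backup manifest generation with checksums" features listed under Key Features could look roughly like the sketch below. This is an illustration, not the contents of the CronJob or of `backup.sh`: the function names `make_manifest` and `verify_manifest`, the `.manifest` suffix, and the `key=value` manifest layout are all assumptions, and it assumes GNU coreutils (`sha256sum`) is available.

```bash
#!/usr/bin/env bash

# Hypothetical helper: compress a dump file at gzip level 6 and write a
# manifest next to it recording the archive name, SHA-256 checksum and size.
# Prints the manifest path on stdout.
make_manifest() {
  local dump_file="$1"
  local archive="${dump_file}.gz"
  local checksum size

  gzip -6 -c "$dump_file" > "$archive" || return 1
  checksum="$(sha256sum "$archive" | awk '{print $1}')" || return 1
  size="$(wc -c < "$archive" | tr -d ' ')" || return 1

  printf 'file=%s\nsha256=%s\nsize_bytes=%s\n' \
    "$(basename "$archive")" "$checksum" "$size" > "${archive}.manifest" || return 1
  echo "${archive}.manifest"
}

# Hypothetical helper: recompute the archive checksum and compare it with the
# value stored in the manifest. Returns 0 on match, non-zero otherwise.
verify_manifest() {
  local manifest="$1"
  local dir file expected actual

  dir="$(dirname "$manifest")"
  file="$(grep '^file=' "$manifest" | cut -d= -f2-)" || return 1
  expected="$(grep '^sha256=' "$manifest" | cut -d= -f2-)" || return 1
  actual="$(sha256sum "$dir/$file" | awk '{print $1}')" || return 1

  [ "$expected" = "$actual" ]
}
```

Under these assumptions, a backup step would call `make_manifest` on the dump file before upload, and a restore test would call `verify_manifest` on the downloaded manifest before decompressing the archive, so that a corrupted download is caught before it reaches the database.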