# Gravl Deployment Guide

This guide covers how to deploy Gravl's backend and frontend services using automated scripts, verify deployment status, and handle troubleshooting and recovery scenarios.

---

## Overview

Gravl uses Docker and Docker Compose for containerization. Two automated scripts manage the deployment lifecycle:

- **`scripts/deploy.sh`**: Pulls the latest code, builds fresh images (with `--no-cache` to prevent stale assets), and starts containers with health checks
- **`scripts/build-check.sh`**: Verifies that running containers match the current git HEAD (detects stale deployments)

---

## Prerequisites

Before deploying, ensure you have:

1. **Docker & Docker Compose** installed and running
   ```bash
   docker --version
   docker compose version
   ```

2. **Git** configured with push/pull access to the repository
   ```bash
   git remote -v
   ```

3. **Network access** to required ports:
   - Backend: `localhost:3001` (health check at `http://localhost:3001/api/health`)
   - Frontend: `localhost:3000` (or as configured in `docker-compose.yml`)

4. **Sufficient disk space** for Docker images and volumes
   ```bash
   docker system df
   ```

5. **No conflicting services** using ports 3000-3001
   ```bash
   lsof -i :3000 -i :3001  # (macOS/Linux only)
   ```

---

## How to Run `deploy.sh`

### Basic Usage

```bash
cd /workspace/gravl
scripts/deploy.sh
```

### What It Does

1. **Git Pull**: Fetches and merges the latest code from the remote
   - Exits if merge conflicts occur (manual resolution required)

2. **Captures Metadata**:
   - Current git commit hash
   - Build timestamp
   - Both are stored as Docker image labels for later verification

3. **Builds Docker Images** (`--no-cache`):
   - Rebuilds all layers (no caching) to prevent stale assets
   - Applies the git commit and build timestamp as labels

4. **Starts Containers**:
   - Uses `docker compose up -d --force-recreate` to ensure a clean start
   - Starts both the backend and frontend containers

5. **Health Check**:
   - Waits up to 60 seconds for the backend to respond on `/api/health`
   - Retries every 5 seconds (12 attempts max)
   - Fails with exit code 1 if the health check times out

### Exit Codes

| Code | Meaning | Next Steps |
|------|---------|------------|
| 0 | Success | Deployment complete; containers healthy |
| 1 | Failure | See troubleshooting below |

### Logs

All deploy activity is logged to `logs/deploy.log`:

```bash
tail -50 logs/deploy.log          # Last 50 lines
grep ERROR logs/deploy.log        # Find errors
```

### Environment Variables

Optional environment variables can be set before running `deploy.sh`:

| Variable | Default | Purpose |
|----------|---------|---------|
| `GIT_COMMIT` | auto-detected | Override git commit label (not recommended) |
| `BUILD_DATE` | auto-detected | Override build timestamp (not recommended) |

---

## How to Check Build Status (`build-check.sh`)

Run this command at any time to verify that the deployed containers match your local code:

```bash
scripts/build-check.sh
```

### Output Example

**Healthy deployment:**
```
Local HEAD: abc1234 (abc1234567890abcdef1234567890abcdef123456)
[gravl-backend] Built: abc1234 on 2026-03-03T18:21:00Z
[gravl-backend] OK: up to date
[gravl-frontend] Built: abc1234 on 2026-03-03T18:21:00Z
[gravl-frontend] OK: up to date
```

**Stale containers (code updated, not redeployed):**
```
Local HEAD: xyz5678 (xyz5678...)
[gravl-backend] Built: abc1234 on 2026-03-03T18:21:00Z
[gravl-backend] STALE: container is behind local code — run scripts/deploy.sh
[gravl-frontend] Built: abc1234 on 2026-03-03T18:21:00Z
[gravl-frontend] STALE: container is behind local code — run scripts/deploy.sh
```

**Missing labels (container built manually, not via deploy.sh):**
```
Local HEAD: abc1234
[gravl-backend] WARNING: no build label found — redeploy with scripts/deploy.sh to add tracking
[gravl-frontend] Not running
```

### Exit Codes

| Code | Meaning |
|------|---------|
| 0 | All checks completed (warnings don't fail; see output for status) |

Missing containers are noted in the output but do not cause a non-zero exit, so check the output text rather than the exit code.

---

## Troubleshooting

### Health Check Failures

**Symptom:** `ERROR: Health check failed after 60s`

**Causes & Solutions:**

1. **Backend service didn't start**
   ```bash
   docker logs gravl-backend 2>&1 | tail -20
   # Look for:
   # - Port conflicts (EADDRINUSE)
   # - Missing dependencies (module not found)
   # - Database connection errors
   ```

2. **Port 3001 is already in use**
   ```bash
   lsof -i :3001                 # Find what's using it
   docker port gravl-backend     # Check exposed port
   kill -9 <PID>                 # Kill conflicting process (if safe)
   scripts/deploy.sh             # Retry
   ```

3. **Network issue between host and container**
   ```bash
   docker inspect gravl-backend --format '{{.NetworkSettings.IPAddress}}'
   curl -sf http://<container-ip>:3001/api/health  # Test directly
   ```

4. **Backend code has a syntax error**
   ```bash
   docker logs gravl-backend 2>&1 | grep -i "syntax\|error\|exception"
   # Check backend/src/index.js for obvious errors
   # Revert recent changes: git log --oneline -5 && git checkout <commit>
   ```

**Quick recovery:**
```bash
# 1. Stop everything
docker compose down

# 2. Check backend logs
docker compose up -d gravl-backend
sleep 5
docker logs gravl-backend 2>&1 | tail -50

# 3. If logs show errors, fix code and retry
git diff HEAD~1..HEAD backend/src/
# ... fix issues ...
scripts/deploy.sh
```

---

### Stale Containers

**Symptom:** `build-check.sh` shows `STALE: container is behind local code`

**Causes:**
- Code was updated (`git pull`) but `deploy.sh` hasn't been run
- Deployment failed partway through
- Manual restart without redeploy

**Solution:**
```bash
scripts/deploy.sh
scripts/build-check.sh  # Verify update
```

---

### Missing Build Labels

**Symptom:** `WARNING: no build label found — redeploy with scripts/deploy.sh`

**Causes:**
- Container was built with `docker compose build` directly (not via `deploy.sh`)
- Container predates the labeling system

**Solution:**
```bash
# Re-deploy to add labels
scripts/deploy.sh
```

---

### Container Won't Start (CrashLoopBackOff / Exited)

**Symptom:** `docker compose ps` shows the container in the "Exited" state

**Steps:**

1. **Check container logs**
   ```bash
   docker logs gravl-backend --tail 50
   docker logs gravl-frontend --tail 50
   ```

2. **Check docker-compose.yml for typos**
   ```bash
   docker compose config  # Validates syntax
   ```

3. **Inspect health check endpoint**
   ```bash
   curl -v http://localhost:3001/api/health
   # Should see HTTP 200, not 404 or 500
   ```

4. **If all else fails, clean rebuild**
   ```bash
   docker compose down
   docker rmi gravl-backend gravl-frontend
   docker system prune -f
   scripts/deploy.sh
   ```

---

### Database Connection Issues

**Symptom:** Backend logs show `Connection refused` or `ECONNREFUSED`

**Causes:**
- Database service not running
- Wrong host/port in `.env` or backend code
- Network issue between containers

**Solutions:**

1. **Check database service status** (if applicable)
   ```bash
   docker compose ps    # All services running?
   docker network ls    # Check gravl network exists
   ```

2. **Verify connection string in `.env`**
   ```bash
   grep -i database .env
   # Should match docker-compose.yml service name (e.g., gravl-db:5432)
   ```

3. **Test connection from backend container**
   ```bash
   docker exec gravl-backend ping gravl-db
   docker exec gravl-backend curl http://gravl-db:5432  # If HTTP, adjust port
   ```

---

### Disk Space Issues

**Symptom:** `no space left on device` during build

**Solution:**
```bash
# Check disk usage
docker system df

# Clean up unused images/containers
docker system prune -a --volumes

# Then retry deploy
scripts/deploy.sh
```

**Safety Check:** The `--volumes` flag also deletes unused Docker volumes. Drop it if any volume holds data you need.

---

## Recovery Procedures

### Manual Rollback to Previous Commit

Use this when the deployed code is broken and you need to revert quickly.

```bash
# 1. Find the last good commit
git log --oneline -10  # Review recent commits

# 2. Check out the known-good commit
git checkout <commit-hash>

# 3. Redeploy
scripts/deploy.sh

# 4. Verify
scripts/build-check.sh
curl -sf http://localhost:3001/api/health

# 5. Document the incident
echo "Rolled back to <commit-hash> due to <reason>" >> logs/rollback.log
```

### Emergency Container Cleanup

Use this when containers are hung, corrupted, or in an unknown state.

```bash
# 1. Stop all services
docker compose down

# 2. Remove images (forces fresh rebuild)
docker rmi gravl-backend gravl-frontend

# 3. Clear unused volumes (optional; use with caution!)
# docker volume prune

# 4. Rebuild from scratch
scripts/deploy.sh

# 5. Verify all containers running and healthy
docker compose ps
scripts/build-check.sh
curl -sf http://localhost:3001/api/health
```

**Safety Check:** If your data is in Docker volumes, `docker volume prune` will destroy it. Skip this step unless you're sure you don't need the data.

### Staged Rollback (Zero-Downtime)

If you're running a blue-green deployment setup:

```bash
# 1. Deploy to green environment
cd /path/to/green
git pull && docker compose build --no-cache && docker compose up -d

# 2. Test green (health check, smoke tests)
curl -sf http://green-backend:3001/api/health

# 3. Switch traffic to green (via load balancer or DNS)
# (Implementation depends on your infrastructure)

# 4. If green has issues, revert traffic to blue immediately
# (Blue kept serving; no downtime)

# 5. Debug green offline
docker logs gravl-backend
```

---

## Monitoring After Deployment

### Immediate Checks (after `deploy.sh` completes)

```bash
# Containers are running
docker compose ps

# Backend is healthy
curl -sf http://localhost:3001/api/health | jq .

# Containers match local code
scripts/build-check.sh

# Logs have no errors
docker logs gravl-backend 2>&1 | grep -i error | head -5
```

### Ongoing Checks (periodically)

```bash
# Run build-check regularly (cron every 30 min, or manual)
scripts/build-check.sh

# Monitor resource usage
docker stats gravl-backend gravl-frontend

# Audit logs for issues
docker logs gravl-backend --since 1h 2>&1 | grep ERROR
```

### Example Monitoring Script

```bash
#!/bin/bash
# Save as scripts/health-monitor.sh
set -euo pipefail

HEALTHY=true

# Check containers running.
# Plain grep (not grep -q) reads all input, so the producer is never
# killed by SIGPIPE, which pipefail would report as a failed pipeline.
docker compose ps | grep "Up" >/dev/null || HEALTHY=false

# Check health endpoint
curl -sf http://localhost:3001/api/health || HEALTHY=false

# Check for stale containers
if scripts/build-check.sh | grep "STALE" >/dev/null; then
  HEALTHY=false
fi

if [ "$HEALTHY" = "true" ]; then
  echo "[$(date)] Gravl is healthy ✓"
else
  echo "[$(date)] Gravl has issues! See above." >&2
  exit 1
fi
```

---

## Best Practices

1. **Always run `build-check.sh` before deploying changes**
   - Ensures you know the current state
   - Catches stale containers early

2. **Review changes before deploying**
   ```bash
   git log --oneline -5           # Recent commits
   git diff origin/main..HEAD     # What will be deployed
   ```

3. **Test in staging first**
   - Keep a separate staging environment for pre-production testing
   - Deploy to staging, verify, then deploy to production

4. **Keep logs rotated**
   - `logs/deploy.log` can grow large
   - Use `logrotate` or manual cleanup: `tail -n 1000 logs/deploy.log > logs/deploy.log.1 && : > logs/deploy.log`

5. **Automate regular checks**
   - Cron job to run `build-check.sh` every 30 minutes
   - Send alerts if "STALE" or "WARNING" is found

6. **Document rollbacks**
   - Always log why you rolled back
   - Review patterns (e.g., "rolled back 3 times this week" suggests the code review process is failing)

---

## See Also

- **Testing**: [DEPLOYMENT_TEST_PLAN.md](./DEPLOYMENT_TEST_PLAN.md) (comprehensive test scenarios)
- **Code style**: [CODING-CONVENTIONS.md](./CODING-CONVENTIONS.md)
- **Architecture**: Backend README or architecture docs (if available)

---

*Last updated: 2026-03-03 | Maintained by: Gravl Development Team*
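
---

## Appendix: Health Check Retry Sketch

The health check described under "How to Run `deploy.sh`" (up to 12 attempts, 5 seconds apart, exit code 1 on timeout) can be sketched as a small retry helper. This is a minimal illustration under stated assumptions, not the actual contents of `scripts/deploy.sh`: the function name `wait_for_health` and its argument order are hypothetical, and the probe command is passed in by the caller.

```bash
# Hypothetical helper; scripts/deploy.sh may implement its check differently.
# Usage: wait_for_health MAX_ATTEMPTS INTERVAL_SECONDS PROBE_COMMAND [ARGS...]
# Example (values taken from this guide):
#   wait_for_health 12 5 curl -sf http://localhost:3001/api/health
wait_for_health() {
  local max_attempts="$1" interval="$2"
  shift 2
  local attempt
  for ((attempt = 1; attempt <= max_attempts; attempt++)); do
    # Run the probe command; its output is discarded, only the exit status matters.
    if "$@" >/dev/null 2>&1; then
      echo "healthy after ${attempt} attempt(s)"
      return 0
    fi
    # Wait before the next attempt, so total wait is at most max_attempts * interval.
    sleep "$interval"
  done
  echo "ERROR: Health check failed after $((max_attempts * interval))s" >&2
  return 1
}
```

Passing the probe as a command keeps the loop independent of `curl`, so it can be exercised with `true` or `false` without a running backend.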