# Gravl Deployment Testing Plan

## Overview

This document outlines unit, integration, and rollback testing procedures for the Gravl deployment automation scripts:

- `scripts/deploy.sh`: Pulls code, builds fresh images (`--no-cache`), starts containers
- `scripts/build-check.sh`: Verifies deployed containers match local git HEAD

---

## Part A: Unit Tests

### Unit Test Suite for `deploy.sh`

#### UT-D1: Git Pull Functionality

**Objective:** Verify that `git pull` successfully fetches and merges the latest code.

**Setup:**
- Create a test branch with at least one commit ahead of current HEAD
- Have a clean working tree

**Test Steps:**
1. Note current git HEAD: `GIT_BEFORE=$(git rev-parse HEAD)`
2. Manually push a new commit to remote
3. Run `scripts/deploy.sh`
4. Verify commit was pulled: `git rev-parse HEAD` should differ from `GIT_BEFORE`

**Success Criteria:**
- `git pull` completes without merge conflicts
- Script continues to build step
- New commit is reflected in logs: `git log --oneline -1`

**Failure Handling:**
- If a merge conflict occurs, the script exits non-zero (it runs under `set -e`)
- Manual resolution required before retry

---

#### UT-D2: Docker Build with --no-cache

**Objective:** Verify that `docker compose build --no-cache` forces fresh image builds.

**Setup:**
- Clear Docker build cache: `docker builder prune -af`
- Have a recent layer in backend/Dockerfile that changes behavior

**Test Steps:**
1. Build images normally: `docker compose build`
2. Note build output time
3. Immediately run `scripts/deploy.sh`
4. Capture build output: `docker compose build --no-cache 2>&1 | tee /tmp/build-output.txt`

**Success Criteria:**
- No layers are reused from cache (no build step in the output is marked `CACHED` or `Using cache`)
- Build completes successfully
- Final images have new `org.opencontainers.image.revision` label set to current `GIT_COMMIT`

**Failure Handling:**
- If a layer fails to rebuild, check Dockerfile syntax and dependencies
- Clear `node_modules` and rebuild if necessary

---

#### UT-D3: Health Check Success Path

**Objective:** Verify backend service responds to health endpoint within timeout.

**Setup:**
- Backend service responds quickly on `/api/health`
- Network connectivity is stable

**Test Steps:**
1. Run `scripts/deploy.sh`
2. Observe health check loop in logs
3. Verify backend responds: `curl -sf http://localhost:3001/api/health`

**Success Criteria:**
- Health check completes on first or second attempt (within 10s)
- Log shows: `[...] Backend healthy`
- Script exits with code 0

**Failure Handling:**
- See health check timeout scenario (UT-D4)

---

#### UT-D4: Health Check Timeout (Negative Test)

**Objective:** Verify script fails gracefully when backend doesn't respond.

**Setup:**
- Stop backend service before health check loop
- Health endpoint returns 500 or times out

**Test Steps:**
1. Run `scripts/deploy.sh`
2. Observe health check loop iterate 12 times (60 seconds total)
3. Verify script exits with error code 1

**Success Criteria:**
- Loop runs all 12 iterations (5-second intervals)
- Final log shows: `ERROR: Health check failed after 60s`
- Process exits non-zero
- Containers remain running (so you can debug manually)

**Failure Handling:**
- Check backend logs: `docker logs gravl-backend`
- Verify port 3001 is exposed: `docker port gravl-backend`
- Test endpoint manually: `curl -v http://localhost:3001/api/health`

---

#### UT-D5: Metadata Labeling

**Objective:** Verify build metadata is correctly stored in container labels.


**Setup:**
- After a successful deploy, query container labels

**Test Steps:**
1. Run `scripts/deploy.sh`
2. Inspect backend container: `docker inspect gravl-backend --format '{{json .Config.Labels}}'`
3. Verify labels contain:
   - `org.opencontainers.image.revision`: matches `git rev-parse HEAD`
   - `org.opencontainers.image.created`: matches build timestamp

**Success Criteria:**
- Both labels are present and non-empty
- Revision matches current HEAD
- Created timestamp is recent (within 1 minute of deploy time)

**Failure Handling:**
- Check docker-compose.yml build args are being passed
- Verify Dockerfile includes label copy from build args

---

### Unit Test Suite for `build-check.sh`

#### UT-B1: Label Detection - Matching Commit

**Objective:** Verify build-check correctly identifies up-to-date containers.

**Setup:**
- Deploy using `scripts/deploy.sh` (creates proper labels)
- Run build-check immediately after deploy

**Test Steps:**
1. Execute: `scripts/build-check.sh`
2. Observe output for gravl-backend and gravl-frontend

**Success Criteria:**
- Output shows: `[gravl-backend] OK: up to date`
- Output shows: `[gravl-frontend] OK: up to date`
- No STALE or WARNING messages

---

#### UT-B2: Label Detection - Missing Labels (Negative)

**Objective:** Verify build-check warns when containers lack revision labels.

**Setup:**
- Manually build and run container without deploy.sh
- Container has no `org.opencontainers.image.revision` label

**Test Steps:**
1. Build without labels: `docker build -t gravl-backend:test .`
2. Run container manually
3. Execute: `scripts/build-check.sh`

**Success Criteria:**
- Output shows: `WARNING: no build label found — redeploy with scripts/deploy.sh to add tracking`
- No crash or error exit code
- Script provides remediation guidance

---

#### UT-B3: Stale Detection - Behind HEAD

**Objective:** Verify build-check detects containers built from old commits.

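
The comparison this test exercises can be sketched as a small function. This is an illustrative sketch, not the actual contents of `scripts/build-check.sh`: the function name `classify_revision` and its one-word results are hypothetical, chosen to mirror the OK, STALE, and WARNING messages quoted in UT-B1 through UT-B3.

```bash
#!/usr/bin/env bash
# Hypothetical helper: classify a container's build revision against local HEAD.
# $1 = value of the org.opencontainers.image.revision label (may be empty)
# $2 = local commit hash, e.g. from `git rev-parse HEAD`
classify_revision() {
  local built="$1"
  local head="$2"
  if [ -z "$built" ]; then
    echo "WARNING"   # the container carries no build label
  elif [ "$built" = "$head" ]; then
    echo "OK"        # the container was built from the current commit
  else
    echo "STALE"     # the container was built from a different commit
  fi
}
```

In practice the first argument would come from the `docker inspect` label query used elsewhere in this plan, and the second from `git rev-parse HEAD`. The sketch uses plain string equality, so it assumes both values are full hashes of the same length.
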

**Setup:**
- Deploy at commit A
- Push new commit B to remote
- `git pull` locally (so local HEAD = B, but container is at A)
- Don't redeploy

**Test Steps:**
1. Note current HEAD: `BEFORE=$(git rev-parse HEAD)`
2. Create a dummy commit and push: `echo "test" >> test.txt && git add test.txt && git commit -m "test" && git push`
3. In test environment, pull but don't deploy: `git pull`
4. Run: `scripts/build-check.sh`

**Success Criteria:**
- Output shows: `[gravl-backend] STALE: container is behind local code — run scripts/deploy.sh`
- Commit hash differs between "Built:" and "Local HEAD:"
- Exit code is 0 (warning only, not error)

---

#### UT-B4: Container Not Running

**Objective:** Verify build-check handles missing containers gracefully.

**Setup:**
- Stop one of the containers (e.g., frontend)
- Run build-check

**Test Steps:**
1. Stop frontend: `docker stop gravl-frontend`
2. Run: `scripts/build-check.sh`

**Success Criteria:**
- Output shows: `[gravl-frontend] Not running`
- Output for backend is normal
- No error; script completes with exit code 0

---

#### UT-B5: Commit Comparison Logic

**Objective:** Verify build-check correctly compares local HEAD against container labels.

**Setup:**
- Deploy at commit with known hash (e.g., abc1234)
- Verify container label has exact match
- Then create new commit without redeploying

**Test Steps:**
1. Get deployed commit: `docker inspect gravl-backend --format '{{index .Config.Labels "org.opencontainers.image.revision"}}'`
2. Verify it matches current HEAD: `git rev-parse HEAD`
3. Change a tracked file, then commit it: `git commit -am "test"`
4. Run build-check again

**Success Criteria:**
- Before new commit: "OK: up to date"
- After new commit: "STALE: container is behind local code"
- Commit hashes are extracted and compared correctly

---

## Part B: Integration Tests

### Integration Test Suite

#### IT-1: Full Deploy Cycle in Staging

**Objective:** Verify entire deployment workflow from code to running containers.

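
The health-check wait that this cycle depends on could look roughly like the sketch below. It is not taken from `scripts/deploy.sh`; the function name `wait_for_health`, its argument order, and its log wording are assumptions for illustration. Called with 12 attempts and a 5-second interval, it mirrors the loop timing described in UT-D4.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: retry a probe command until it succeeds or attempts run out.
# Usage: wait_for_health ATTEMPTS INTERVAL_SECONDS COMMAND [ARGS...]
wait_for_health() {
  local attempts="$1"
  local interval="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # Run the probe; its own output is discarded, only the exit status matters.
    if "$@" >/dev/null 2>&1; then
      echo "Backend healthy (attempt $i)"
      return 0
    fi
    sleep "$interval"
    i=$((i + 1))
  done
  echo "ERROR: Health check failed after $attempts attempts" >&2
  return 1
}
```

A call such as `wait_for_health 12 5 curl -sf http://localhost:3001/api/health` would probe the endpoint used throughout this plan. Passing the probe as arguments keeps the retry logic testable without a running backend.
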

**Preconditions:**
- Staging environment isolated from production
- Docker daemon running
- Git remotes configured
- Backend health endpoint functional

**Test Steps:**

1. **Baseline:** Document initial state
   ```bash
   git rev-parse HEAD > /tmp/baseline-commit.txt
   scripts/build-check.sh | tee /tmp/baseline-check.txt
   ```

2. **Commit code:** Push a non-breaking change
   ```bash
   git checkout -b test/it-1-$$
   echo "// test change" >> backend/src/index.js
   git add backend/src/index.js
   git commit -m "test: IT-1 change"
   # -u sets the upstream so a plain `git pull` works on this branch
   git push -u origin test/it-1-$$
   ```

3. **Deploy:** Run the full deployment
   ```bash
   scripts/deploy.sh | tee /tmp/deploy-log.txt
   echo "deploy.sh exit code: ${PIPESTATUS[0]}"  # tee would otherwise mask it (bash)
   ```

4. **Verify:** Check health and container state
   ```bash
   scripts/build-check.sh | tee /tmp/postdeploy-check.txt
   docker compose ps
   curl -sf http://localhost:3001/api/health
   ```

5. **Cleanup:** Revert test branch
   ```bash
   git checkout -
   git branch -D test/it-1-$$
   ```

**Success Criteria:**
- `scripts/deploy.sh` completes with exit code 0
- Health check passes within 60s
- `build-check.sh` shows "OK: up to date" for both containers
- Containers remain running after deploy completes
- Logs show proper git pull, build, and health check steps

**Rollback Path (if failure occurs during IT-1):**
- See rollback procedures below

---

#### IT-2: Deploy with Health Check Failure Recovery

**Objective:** Verify deployment handles intermittent health check failures and recovers.

**Preconditions:**
- Backend can be temporarily paused/resumed
- System has `docker pause`/`docker unpause` available

**Test Steps:**

1. **Pre-deploy:** Baseline state
   ```bash
   scripts/build-check.sh > /tmp/it2-baseline.txt
   ```

2. **Deploy start:** Trigger deployment (background)
   ```bash
   scripts/deploy.sh > /tmp/it2-deploy.log 2>&1 &
   DEPLOY_PID=$!
   ```

3. **Introduce pause:** After 3 seconds, pause backend (simulates slow startup)
   ```bash
   sleep 3
   docker pause gravl-backend
   ```

4. **Allow recovery:** Unpause before timeout
   ```bash
   sleep 15
   docker unpause gravl-backend
   ```

5. **Verify completion:**
   ```bash
   wait "$DEPLOY_PID"
   RESULT=$?
   echo "deploy.sh exit code: $RESULT"
   ```

**Success Criteria:**
- Deploy script retries health check multiple times
- When backend recovers, health check passes
- Script completes with exit code 0
- Containers transition to healthy state

---

#### IT-3: Multi-Service Coordination

**Objective:** Verify frontend and backend both restart and sync properly.

**Preconditions:**
- Both services configured in docker-compose.yml
- Frontend depends on backend being healthy

**Test Steps:**

1. **Deploy:**
   ```bash
   scripts/deploy.sh
   ```

2. **Check startup order:**
   - Grep logs for `[gravl-backend]` and `[gravl-frontend]` timestamps
   - Verify backend logs appear before frontend health check

3. **Verify networking:**
   ```bash
   docker exec gravl-frontend curl -sf http://gravl-backend:3001/api/health
   docker exec gravl-backend curl -sf http://localhost:3001/api/health
   ```

4. **Verify labels on both:**
   ```bash
   docker inspect gravl-backend gravl-frontend --format '{{.Name}} => {{index .Config.Labels "org.opencontainers.image.revision"}}'
   ```

**Success Criteria:**
- Both containers start successfully
- Both containers have matching revision labels (same commit)
- Frontend can reach backend via container hostname
- Build-check shows "OK: up to date" for both

---

## Part C: Rollback Procedures & Safety Checks

### RB-1: Manual Rollback to Previous Commit

**When to use:** The deployed code is broken and is affecting production.

**Prerequisites:**
- Know the last good commit hash
- Database migrations (if any) are reversible
- A user-facing impact of under 5 minutes is acceptable

**Steps:**

```bash
# 1. Document current state
git rev-parse HEAD > /tmp/rollback-from.txt

# 2. Check out previous good commit
GOOD_COMMIT="<good-commit-hash>"  # placeholder: replace with the last known-good hash
git checkout "$GOOD_COMMIT"

# 3. Redeploy (pulls and rebuilds)
# Note: checking out a bare hash leaves a detached HEAD, on which a plain `git pull` fails;
# if deploy.sh pulls unconditionally, check out a branch that points at the good commit instead.
scripts/deploy.sh

# 4. Verify recovery
scripts/build-check.sh
curl -sf http://localhost:3001/api/health

# 5. Log the incident
echo "Rolled back from $(cat /tmp/rollback-from.txt) to $GOOD_COMMIT" >> logs/rollback.log
```

**Safety Checks:**
- ✅ Always verify health endpoint responds after rollback
- ✅ Check logs for errors: `docker logs gravl-backend | tail -50`
- ✅ Check database state if applicable (query active sessions, etc.)
- ✅ Notify team of rollback and reason

---

### RB-2: Emergency Container Cleanup & Restart

**When to use:** Containers are hung, corrupted, or in an unknown state.

**Prerequisites:**
- OK to restart services temporarily
- Data is persistent in volumes

**Steps:**

```bash
# 1. Stop all containers
docker compose down

# 2. Remove images (to force fresh rebuild on next deploy)
docker rmi gravl-backend gravl-frontend

# 3. Redeploy fresh
scripts/deploy.sh

# 4. Verify
docker compose ps
scripts/build-check.sh
```

**Safety Checks:**
- ✅ Confirm volumes are not removed: `docker volume ls | grep gravl`
- ✅ Verify all containers start: `docker compose ps` shows all "Up"
- ✅ Health check passes within 60s
- ✅ No data loss from persistent stores

---

### RB-3: Staged Rollback (Blue-Green Alternative)

**When to use:** No downtime can be tolerated.

**Prerequisites:**
- Two separate services running (blue = prod, green = staging)
- Load balancer or router can switch traffic
- Synchronized database

**Steps:**

```bash
# 1. Deploy to green environment
cd /path/to/green/environment
git pull
docker compose build --no-cache
docker compose up -d

# 2. Health check green
curl -sf http://green-backend:3001/api/health

# 3. Route traffic to green (via load balancer/DNS)
# (This step is environment-specific)

# 4. If issues, revert traffic to blue immediately
# (No containers to roll back on blue; it kept serving)

# 5. Debug green offline
# (No downtime for users)
```

---

## Safety Checks Summary

| Check | When | Command | Pass Criteria |
|-------|------|---------|---------------|
| Health | After deploy | `curl -sf http://localhost:3001/api/health` | HTTP 200 within 60s |
| Labels | After deploy | `docker inspect gravl-backend --format '{{index .Config.Labels "org.opencontainers.image.revision"}}'` | Non-empty, matches `git rev-parse HEAD` |
| Build status | Before deploy | `scripts/build-check.sh` | No STALE warnings |
| Container state | After deploy | `docker compose ps` | All containers "Up" |
| Logs | After deploy | `docker logs gravl-backend \| tail -20` | No ERROR or CRITICAL lines |

---

## Running Tests Locally

### Quick Test (5 min)

```bash
cd /workspace/gravl

# UT-D1: Git pull
git pull

# UT-D2: Build with no-cache
docker compose build --no-cache

# UT-D3: Health check
curl -sf http://localhost:3001/api/health

# UT-B1: Build-check
scripts/build-check.sh
```

### Full Suite (30 min)

```bash
# Clone test repo in /tmp
mkdir -p /tmp/gravl-test
cd /tmp/gravl-test
git clone /workspace/gravl .
git remote set-url origin /workspace/gravl

# Run all UTs and IT-1
# (See individual test steps above)
```

---

## Metrics to Monitor

After each test, log these metrics to `logs/test-results.json`:

- Deploy time (seconds)
- Health check time (seconds)
- Build cache hit rate (% of layers reused)
- Container restart count
- Error count in logs

Example:

```json
{
  "timestamp": "2026-03-03T18:21:00Z",
  "test_name": "IT-1",
  "deploy_time_sec": 45,
  "health_check_time_sec": 8,
  "result": "pass"
}
```

---

*Last updated: 2026-03-03 | Next review: After phase 07-04 completion*
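
For the metrics described under "Metrics to Monitor", one way to produce a record in the shape of the example JSON is sketched below. The function name `format_test_result` and its argument order are hypothetical; the sketch only formats a record and leaves open how records are stored in `logs/test-results.json`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print one test-result record as single-line JSON.
# Args: TIMESTAMP TEST_NAME DEPLOY_TIME_SEC HEALTH_CHECK_TIME_SEC RESULT
# Assumes the string arguments contain no characters that need JSON escaping.
format_test_result() {
  printf '{"timestamp":"%s","test_name":"%s","deploy_time_sec":%d,"health_check_time_sec":%d,"result":"%s"}\n' \
    "$1" "$2" "$3" "$4" "$5"
}

# Example call using the values from the JSON example in this document.
format_test_result "2026-03-03T18:21:00Z" "IT-1" 45 8 "pass"
```

Appending each record with `>>` would turn the file into a sequence of JSON lines rather than a single JSON document; whether that suits `logs/test-results.json` is a choice this plan does not specify.
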