diff --git a/backend/README.md b/backend/README.md new file mode 100644 index 0000000..21197b7 --- /dev/null +++ b/backend/README.md @@ -0,0 +1,218 @@ +# Gravl Backend + +Node.js / Express API server for the Gravl application. + +## Development + +### Prerequisites +- Node.js 18+ +- PostgreSQL 14+ (or use Docker Compose) +- Docker & Docker Compose (for containerized development) + +### Getting Started + +```bash +# Install dependencies +npm install + +# Create .env file (copy from .env.example) +cp .env.example .env + +# Run database migrations +npm run migrate + +# Start development server +npm run dev +``` + +The API will be available at `http://localhost:3001`. + +### Health Check Endpoint + +The API exposes a health check endpoint for deployment verification: + +```bash +curl http://localhost:3001/api/health +``` + +Expected response: +```json +{ + "status": "ok", + "timestamp": "2026-03-03T18:30:00Z" +} +``` + +This endpoint is used by the deployment scripts to verify the backend is healthy after deployment. + +--- + +## Deployment + +### Quick Start + +See `/docs/DEPLOYMENT.md` for comprehensive deployment documentation. + +```bash +# Deploy the application +scripts/deploy.sh + +# Check deployment status +scripts/build-check.sh +``` + +### How It Works + +1. **Automatic build:** `scripts/deploy.sh` builds fresh Docker images +2. **Container replacement:** Old containers are stopped and recreated with `--force-recreate`, so expect a brief interruption while they restart +3. **Health verification:** API health endpoint is polled before deployment completes +4. **Rollback:** Use git to revert and redeploy if issues arise + +### Prerequisites for Deployment + +- Docker and Docker Compose installed +- Git remote configured and accessible +- Backend listening on port 3001 +- Health endpoint (`/api/health`) responding with 200 OK + +### Example Deployment Workflow + +```bash +# 1. Make code changes and commit +git add . && git commit -m "feat: new API endpoint" + +# 2. 
Deploy from project root +cd /workspace/gravl +scripts/deploy.sh + +# 3. Verify deployment +scripts/build-check.sh + +# 4. Check logs if needed +docker compose logs gravl-backend +``` + +### Container Labels + +All deployed containers include build metadata labels for tracking: +- `org.opencontainers.image.revision` — Git commit SHA +- `org.opencontainers.image.created` — Build timestamp + +These are used by `scripts/build-check.sh` to detect stale deployments. + +--- + +## Testing + +```bash +# Run unit tests +npm test + +# Run integration tests +npm run test:integration + +# Run with coverage +npm run test:coverage +``` + +--- + +## Database + +### Migrations + +```bash +# Run pending migrations +npm run migrate + +# Rollback last migration +npm run migrate:rollback + +# Create new migration +npm run migrate:create -- my_migration_name +``` + +### Connection + +Configure via `.env`: +``` +DATABASE_URL=postgresql://user:password@localhost:5432/gravl +``` + +--- + +## Environment Variables + +See `.env.example` for all available variables. 
+ +Key variables: +- `NODE_ENV` — Development/production mode +- `PORT` — Server port (default: 3001) +- `DATABASE_URL` — PostgreSQL connection string +- `JWT_SECRET` — Token signing secret + +--- + +## Project Structure + +``` +backend/ +├── src/ +│ ├── api/ # Express route handlers +│ ├── middleware/ # Express middleware +│ ├── models/ # Database models +│ ├── services/ # Business logic +│ └── index.js # App entry point +├── tests/ # Unit and integration tests +├── migrations/ # Database migrations +├── docker/ # Dockerfile +├── .env.example # Environment template +└── README.md # This file +``` + +--- + +## Troubleshooting + +### API Won't Start + +Check the logs: +```bash +docker compose logs gravl-backend +``` + +Common issues: +- Port 3001 already in use: Kill the process or change the port +- Database connection failed: Verify `.env` DATABASE_URL +- Node modules missing: Run `npm install` + +### Health Check Fails + +Ensure the `/api/health` endpoint is implemented: + +```javascript +// backend/src/api/health.js +app.get('/api/health', (req, res) => { + res.json({ status: 'ok', timestamp: new Date().toISOString() }); +}); +``` + +### Database Issues + +Check Docker container status: +```bash +docker compose ps +docker compose logs gravl-db +``` + +--- + +## Contributing + +See `CODING-CONVENTIONS.md` in the project root for code style and standards. + +--- + +**Last Updated:** 2026-03-03 +**Phase:** 07-03 +**Related:** `/docs/DEPLOYMENT.md` diff --git a/docs/DEPLOYMENT.md b/docs/DEPLOYMENT.md new file mode 100644 index 0000000..e322529 --- /dev/null +++ b/docs/DEPLOYMENT.md @@ -0,0 +1,292 @@ +# Gravl Deployment Guide + +This guide covers how to deploy the Gravl application, verify deployments, and troubleshoot common issues. 
+ +## Prerequisites + +- Docker and Docker Compose installed +- Git repository with remote configured +- Access to `/workspace/gravl` directory +- Backend API listening on `http://localhost:3001/api/health` + +## Deployment Script + +### Running a Deployment + +```bash +cd /workspace/gravl +scripts/deploy.sh +``` + +### What It Does + +1. **Pulls latest code:** `git pull` +2. **Captures build metadata:** + - Git commit SHA + - Build timestamp +3. **Builds fresh images:** `docker compose build --no-cache` + - `--no-cache` ensures all layers are rebuilt (prevents stale assets) +4. **Restarts containers:** `docker compose up -d --force-recreate` +5. **Health check:** Polls `/api/health` for up to 60 seconds +6. **Logs deployment:** Records all steps to `logs/deploy.log` + +### Output Example + +``` +[2026-03-03 18:30:00] === Deploy started === +[2026-03-03 18:30:01] Pulling latest code... +[2026-03-03 18:30:05] Commit: 53f4df6 | Date: 2026-03-03T18:30:00Z +[2026-03-03 18:30:06] Building images (--no-cache)... +[2026-03-03 18:30:45] Starting containers... +[2026-03-03 18:30:50] Health check... 
+[2026-03-03 18:30:55] Backend healthy +[2026-03-03 18:30:56] === Deploy complete: 53f4df6 === +``` + +--- + +## Checking Deployment Status + +### Build Status Check + +```bash +cd /workspace/gravl +scripts/build-check.sh +``` + +### Output Example + +``` +Local HEAD: 53f4df6 (53f4df6f8a5c4d2e1f0a9b8c7d6e5f4a3b2c1d0) + +[gravl-backend] Built: 53f4df6 on 2026-03-03T18:30:00Z +[gravl-backend] OK: up to date + +[gravl-frontend] Built: 53f4df6 on 2026-03-03T18:30:00Z +[gravl-frontend] OK: up to date +``` + +### What the Check Tells You + +- **OK: up to date** — Containers match the local git commit (everything is current) +- **STALE: container is behind local code** — Code has changed but containers haven't been redeployed yet +- **WARNING: no build label found** — Container is old (pre-07-02) and lacks build tracking labels + +--- + +## Troubleshooting + +### Health Check Failures + +**Symptom:** Deployment fails with "ERROR: Health check failed after 60s" + +**Possible Causes & Solutions:** + +| Cause | Solution | +|-------|----------| +| Backend not starting | Check logs: `docker compose logs gravl-backend` | +| Health endpoint not implemented | Implement `GET /api/health` in backend (returns `200 OK`) | +| Network issues | Verify network: `docker network inspect gravl` or restart: `docker compose restart` | +| Port already in use | Check: `lsof -i :3001` and kill the process or change port | +| Insufficient resources | Free disk space: `df -h` or reduce image size | + +**Manual Restart:** +```bash +docker compose restart gravl-backend +# Wait a few seconds +curl -sf http://localhost:3001/api/health +``` + +--- + +### Stale Containers + +**Symptom:** `build-check.sh` shows "STALE: container is behind local code" + +**Cause:** Code has been updated but containers haven't been redeployed. 
+ +**Solution:** +```bash +scripts/deploy.sh +scripts/build-check.sh # Should now show OK +``` + +--- + +### Missing Docker Labels + +**Symptom:** `build-check.sh` shows "WARNING: no build label found" + +**Cause:** Containers were built before phase 07-02 (before labels were added). + +**Solution:** +```bash +scripts/deploy.sh # Rebuilds with labels +``` + +--- + +### Deployment Hangs + +**Symptom:** `scripts/deploy.sh` doesn't complete or appears stuck. + +**Possible Causes & Solutions:** + +| Symptom | Solution | +|---------|----------| +| Stuck at "Building images" | Docker build is slow. Check: `docker builder prune` to free cache | +| Stuck at "Health check" | Backend not responding. Try: `docker compose logs` to see errors | +| Git pull conflicts | Resolve conflicts manually: `cd /workspace/gravl && git status` | + +**Force Stop:** +```bash +# Kill the deploy script +pkill -f scripts/deploy.sh + +# Manually check status +docker compose ps +docker compose logs +``` + +--- + +## Rollback Procedures + +### Quick Rollback + +If the current deployment is broken: + +```bash +# Revert to previous commit +git reset --hard HEAD~1 + +# Redeploy +scripts/deploy.sh + +# Verify +scripts/build-check.sh +``` + +### Multi-Commit Rollback + +If you need to go back several commits: + +```bash +# View recent commits +git log --oneline -10 + +# Rollback to a specific commit (example: abc1234) +git reset --hard abc1234 + +# Redeploy +scripts/deploy.sh +``` + +### Rollback Verification + +After rolling back, verify the system is stable: + +```bash +# Check containers match the previous code +scripts/build-check.sh + +# Check API is healthy +curl -sf http://localhost:3001/api/health | jq . 
+ +# Check frontend is responsive +curl -sf http://localhost:3000/ | head -c 500 +``` + +--- + +## Manual Container Cleanup + +If containers become corrupted or stuck: + +```bash +# Stop all containers +docker compose down + +# Remove volumes (WARNING: deletes data) +docker compose down -v + +# Verify they're gone +docker compose ps + +# Full redeploy +scripts/deploy.sh +``` + +--- + +## Monitoring & Logs + +### Deployment Log + +```bash +tail -f logs/deploy.log +``` + +### Container Logs + +```bash +# Backend logs +docker compose logs gravl-backend + +# Frontend logs +docker compose logs gravl-frontend + +# All logs with timestamps +docker compose logs --timestamps --follow +``` + +### Build Info + +```bash +# List deployed images +docker images | grep gravl + +# Inspect container labels (build metadata) +docker inspect gravl-backend | jq '.Config.Labels' +``` + +--- + +## Best Practices + +1. **Always test in staging first** — Validate the deploy in a non-production environment +2. **Check status before deploying** — Run `scripts/build-check.sh` to ensure no stale containers +3. **Review logs after deployment** — Check `logs/deploy.log` for warnings or errors +4. **Plan rollbacks** — Know which commits are stable before deploying +5. **Monitor health endpoints** — Regularly ping `/api/health` in production +6. **Backup before major changes** — Tag releases in git before significant deployments +7. **Use semantic commits** — Make it easy to identify which commits introduced changes + +--- + +## FAQ + +**Q: Can I deploy without building (e.g., just restart containers)?** +A: No. The script always rebuilds to prevent stale code. This is intentional for safety. + +**Q: How long should a deployment take?** +A: Typically 60-90 seconds (build time + health check). If longer, check Docker build performance. 
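To see whether a given deploy falls inside that 60-90 second range, the run can be timed. The following is a minimal sketch, assuming bash; `timed_run` is a hypothetical helper that is not part of the Gravl repository.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the Gravl repo): run a command, report
# how many whole seconds it took on stderr, and preserve its exit code.
timed_run() {
  local start end status=0
  start=$(date +%s)
  # Capture the exit code without aborting if the caller uses `set -e`.
  "$@" || status=$?
  end=$(date +%s)
  echo "elapsed: $((end - start))s (exit code $status)" >&2
  return "$status"
}

# Usage (from the project root):
#   timed_run scripts/deploy.sh
```

Because the helper returns the wrapped command's exit code, a failed health check in `scripts/deploy.sh` would still surface as a non-zero status.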
+ +**Q: What if I need to deploy a specific commit?** +A: Check it out first, then deploy: +```bash +git checkout abc1234  # example commit hash +scripts/deploy.sh +``` + +**Q: Can I skip the health check?** +A: Not recommended. The health check prevents deploying broken code. Fix the health endpoint instead. + +**Q: What data is lost if I roll back?** +A: Container rollback only reverts code. Database data persists unless you run `docker compose down -v`. + +--- + +**Last Updated:** 2026-03-03 +**Document Version:** 1.0 +**Phase:** 07-03 diff --git a/docs/DEPLOYMENT_TEST_PLAN.md b/docs/DEPLOYMENT_TEST_PLAN.md new file mode 100644 index 0000000..ba28e78 --- /dev/null +++ b/docs/DEPLOYMENT_TEST_PLAN.md @@ -0,0 +1,549 @@ +# Gravl Deployment Testing Plan + +## Overview + +This document outlines unit, integration, and rollback testing procedures for the Gravl deployment automation scripts: +- `scripts/deploy.sh`: Pulls code, builds fresh images (--no-cache), starts containers +- `scripts/build-check.sh`: Verifies deployed containers match local git HEAD + +--- + +## Part A: Unit Tests + +### Unit Test Suite for `deploy.sh` + +#### UT-D1: Git Pull Functionality +**Objective:** Verify that `git pull` successfully fetches and merges latest code. + +**Setup:** +- Create a test branch with at least one commit ahead of current HEAD +- Have a clean working tree + +**Test Steps:** +1. Note current git HEAD: `GIT_BEFORE=$(git rev-parse HEAD)` +2. Manually push a new commit to remote +3. Run `scripts/deploy.sh` +4. Verify commit was pulled: `git rev-parse HEAD` should differ from `$GIT_BEFORE` + +**Success Criteria:** +- `git pull` completes without merge conflicts +- Script continues to build step +- New commit is reflected in logs: `git log --oneline -1` + +**Failure Handling:** +- If a merge conflict occurs, the script exits with a non-zero status (because of `set -e`) +- Manual resolution required before retry + +--- + +#### UT-D2: Docker Build with --no-cache +**Objective:** Verify that `docker compose build --no-cache` forces fresh image builds. 
+ +**Setup:** +- Clear Docker build cache: `docker builder prune -af` +- Have a recent layer in backend/Dockerfile that changes behavior + +**Test Steps:** +1. Build images normally: `docker compose build` +2. Note build output time +3. Immediately run `scripts/deploy.sh` +4. Capture build output: `docker compose build --no-cache 2>&1 | tee /tmp/build-output.txt` + +**Success Criteria:** +- No layers are cached (all FROM statements rebuild) +- Build completes successfully +- Final images have new `org.opencontainers.image.revision` label set to current `GIT_COMMIT` + +**Failure Handling:** +- If a layer fails to rebuild, check Dockerfile syntax and dependencies +- Clear `node_modules` and rebuild if necessary + +--- + +#### UT-D3: Health Check Success Path +**Objective:** Verify backend service responds to health endpoint within timeout. + +**Setup:** +- Backend service responds quickly on `/api/health` +- Network connectivity is stable + +**Test Steps:** +1. Run `scripts/deploy.sh` +2. Observe health check loop in logs +3. Verify backend responds: `curl -sf http://localhost:3001/api/health` + +**Success Criteria:** +- Health check completes on first or second attempt (within 10s) +- Log shows: `[...] Backend healthy` +- Script exits with code 0 + +**Failure Handling:** +- See health check timeout scenario (UT-D4) + +--- + +#### UT-D4: Health Check Timeout (Negative Test) +**Objective:** Verify script fails gracefully when backend doesn't respond. + +**Setup:** +- Stop backend service before health check loop +- Health endpoint returns 500 or times out + +**Test Steps:** +1. Run `scripts/deploy.sh` +2. Observe health check loop iterate 12 times (60 seconds total) +3. 
Verify script exits with error code 1 + +**Success Criteria:** +- Loop runs all 12 iterations (5-second intervals) +- Final log shows: `ERROR: Health check failed after 60s` +- Process exits non-zero +- Containers remain running (so you can debug manually) + +**Failure Handling:** +- Check backend logs: `docker logs gravl-backend` +- Verify port 3001 is exposed: `docker port gravl-backend` +- Test endpoint manually: `curl -v http://localhost:3001/api/health` + +--- + +#### UT-D5: Metadata Labeling +**Objective:** Verify build metadata is correctly stored in container labels. + +**Setup:** +- After a successful deploy, query container labels + +**Test Steps:** +1. Run `scripts/deploy.sh` +2. Inspect backend container: `docker inspect gravl-backend --format '{{json .Config.Labels}}'` +3. Verify labels contain: + - `org.opencontainers.image.revision`: matches `git rev-parse HEAD` + - `org.opencontainers.image.created`: matches build timestamp + +**Success Criteria:** +- Both labels are present and non-empty +- Revision matches current HEAD +- Created timestamp is recent (within 1 minute of deploy time) + +**Failure Handling:** +- Check docker-compose.yml build args are being passed +- Verify Dockerfile includes label copy from build args + +--- + +### Unit Test Suite for `build-check.sh` + +#### UT-B1: Label Detection - Matching Commit +**Objective:** Verify build-check correctly identifies up-to-date containers. + +**Setup:** +- Deploy using `scripts/deploy.sh` (creates proper labels) +- Run build-check immediately after deploy + +**Test Steps:** +1. Execute: `scripts/build-check.sh` +2. Observe output for gravl-backend and gravl-frontend + +**Success Criteria:** +- Output shows: `[gravl-backend] OK: up to date` +- Output shows: `[gravl-frontend] OK: up to date` +- No STALE or WARNING messages + +--- + +#### UT-B2: Label Detection - Missing Labels (Negative) +**Objective:** Verify build-check warns when containers lack revision labels. 
+ +**Setup:** +- Manually build and run container without deploy.sh +- Container has no `org.opencontainers.image.revision` label + +**Test Steps:** +1. Build without labels: `docker build -t gravl-backend:test .` +2. Run container manually +3. Execute: `scripts/build-check.sh` + +**Success Criteria:** +- Output shows: `WARNING: no build label found — redeploy with scripts/deploy.sh to add tracking` +- No crash or error exit code +- Script provides remediation guidance + +--- + +#### UT-B3: Stale Detection - Behind HEAD +**Objective:** Verify build-check detects containers built from old commits. + +**Setup:** +- Deploy at commit A +- Push new commit B to remote +- `git pull` locally (so local HEAD = B, but container is at A) +- Don't redeploy + +**Test Steps:** +1. Note current HEAD: `BEFORE=$(git rev-parse HEAD)` +2. Create a dummy commit and push: `echo "test" >> test.txt && git add test.txt && git commit -m "test" && git push` +3. In test environment, pull but don't deploy: `git pull` +4. Run: `scripts/build-check.sh` + +**Success Criteria:** +- Output shows: `[gravl-backend] STALE: container is behind local code — run scripts/deploy.sh` +- Commit hash differs between "Built:" and "Local HEAD:" +- Exit code is 0 (warning only, not error) + +--- + +#### UT-B4: Container Not Running +**Objective:** Verify build-check handles missing containers gracefully. + +**Setup:** +- Stop one of the containers (e.g., frontend) +- Run build-check + +**Test Steps:** +1. Stop frontend: `docker stop gravl-frontend` +2. Run: `scripts/build-check.sh` + +**Success Criteria:** +- Output shows: `[gravl-frontend] Not running` +- Output for backend is normal +- No error; script completes with exit code 0 + +--- + +#### UT-B5: Commit Comparison Logic +**Objective:** Verify build-check correctly compares local HEAD against container labels. 
+ +**Setup:** +- Deploy at commit with known hash (e.g., abc1234) +- Verify container label has exact match +- Then create new commit without redeploying + +**Test Steps:** +1. Get deployed commit: `docker inspect gravl-backend --format '{{index .Config.Labels "org.opencontainers.image.revision"}}'` +2. Verify it matches current HEAD: `git rev-parse HEAD` +3. Create and commit new code: `git commit -am "test"` +4. Run build-check again + +**Success Criteria:** +- Before new commit: "OK: up to date" +- After new commit: "STALE: container is behind local code" +- Commit hashes are extracted and compared correctly + +--- + +## Part B: Integration Tests + +### Integration Test Suite + +#### IT-1: Full Deploy Cycle in Staging +**Objective:** Verify entire deployment workflow from code to running containers. + +**Preconditions:** +- Staging environment isolated from production +- Docker daemon running +- Git remotes configured +- Backend health endpoint functional + +**Test Steps:** + +1. **Baseline:** Document initial state + ```bash + git rev-parse HEAD > /tmp/baseline-commit.txt + scripts/build-check.sh | tee /tmp/baseline-check.txt + ``` + +2. **Commit code:** Push a non-breaking change + ```bash + git checkout -b test/it-1-$$ + echo "// test change" >> backend/src/index.js + git add backend/src/index.js + git commit -m "test: IT-1 change" + git push origin test/it-1-$$ + ``` + +3. **Deploy:** Run the full deployment + ```bash + scripts/deploy.sh | tee /tmp/deploy-log.txt + ``` + +4. **Verify:** Check health and container state + ```bash + scripts/build-check.sh | tee /tmp/postdeploy-check.txt + docker compose ps + curl -sf http://localhost:3001/api/health + ``` + +5. 
**Cleanup:** Revert test branch + ```bash + git checkout - + git branch -D test/it-1-$$ + ``` + +**Success Criteria:** +- `scripts/deploy.sh` completes with exit code 0 +- Health check passes within 60s +- `build-check.sh` shows "OK: up to date" for both containers +- Containers remain running after deploy completes +- Logs show proper git pull, build, and health check steps + +**Rollback Path (if failure occurs during IT-1):** +- See rollback procedures below + +--- + +#### IT-2: Deploy with Health Check Failure Recovery +**Objective:** Verify deployment handles intermittent health check failures and recovers. + +**Preconditions:** +- Backend can be temporarily paused/resumed +- System has `docker pause`/`docker unpause` available + +**Test Steps:** + +1. **Pre-deploy:** Baseline state + ```bash + scripts/build-check.sh > /tmp/it2-baseline.txt + ``` + +2. **Deploy start:** Trigger deployment (background) + ```bash + scripts/deploy.sh > /tmp/it2-deploy.log 2>&1 & + DEPLOY_PID=$! + ``` + +3. **Introduce pause:** After 3 seconds, pause backend (simulates slow startup) + ```bash + sleep 3 + docker pause gravl-backend + ``` + +4. **Allow recovery:** Unpause before timeout + ```bash + sleep 15 + docker unpause gravl-backend + ``` + +5. **Verify completion:** + ```bash + wait $DEPLOY_PID + RESULT=$? + ``` + +**Success Criteria:** +- Deploy script retries health check multiple times +- When backend recovers, health check passes +- Script completes with exit code 0 +- Containers transition to healthy state + +--- + +#### IT-3: Multi-Service Coordination +**Objective:** Verify frontend and backend both restart and sync properly. + +**Preconditions:** +- Both services configured in docker-compose.yml +- Frontend depends on backend being healthy + +**Test Steps:** + +1. **Deploy:** + ```bash + scripts/deploy.sh + ``` + +2. 
**Check startup order:** + - Grep logs for `[gravl-backend]` and `[gravl-frontend]` timestamps + - Verify backend logs appear before frontend health check + +3. **Verify networking:** + ```bash + docker exec gravl-frontend curl -sf http://gravl-backend:3001/api/health + docker exec gravl-backend curl -sf http://localhost:3001/api/health + ``` + +4. **Verify labels on both:** + ```bash + docker inspect gravl-backend gravl-frontend --format '{{.Name}} => {{index .Config.Labels "org.opencontainers.image.revision"}}' + ``` + +**Success Criteria:** +- Both containers start successfully +- Both containers have matching revision labels (same commit) +- Frontend can reach backend via container hostname +- Build-check shows "OK: up to date" for both + +--- + +## Part C: Rollback Procedures & Safety Checks + +### RB-1: Manual Rollback to Previous Commit + +**When to use:** Deployed code is broken and is affecting production. + +**Prerequisites:** +- Know the last good commit hash +- Database migrations (if any) are reversible +- Users can be impacted for under 5 minutes + +**Steps:** + +```bash +# 1. Document current state +git rev-parse HEAD > /tmp/rollback-from.txt + +# 2. Check out previous good commit +git checkout abc1234  # example hash; use your last good commit + +# 3. Redeploy (pulls and rebuilds) +scripts/deploy.sh + +# 4. Verify recovery +scripts/build-check.sh +curl -sf http://localhost:3001/api/health + +# 5. Log the incident +echo "Rolled back from $(cat /tmp/rollback-from.txt) to $(git rev-parse HEAD)" >> logs/rollback.log +``` + +**Safety Checks:** +- ✅ Always verify health endpoint responds after rollback +- ✅ Check logs for errors: `docker logs gravl-backend | tail -50` +- ✅ Check database state if applicable (query active sessions, etc.) +- ✅ Notify team of rollback and reason + +--- + +### RB-2: Emergency Container Cleanup & Restart + +**When to use:** Containers are hung, corrupted, or in an unknown state. 
+ +**Prerequisites:** +- OK to restart services temporarily +- Data is persistent in volumes + +**Steps:** + +```bash +# 1. Stop all containers +docker compose down + +# 2. Remove images (to force fresh rebuild on next deploy) +docker rmi gravl-backend gravl-frontend + +# 3. Redeploy fresh +scripts/deploy.sh + +# 4. Verify +docker compose ps +scripts/build-check.sh +``` + +**Safety Checks:** +- ✅ Confirm volumes are not removed: `docker volume ls | grep gravl` +- ✅ Verify all containers start: `docker compose ps` shows all "Up" +- ✅ Health check passes within 60s +- ✅ No data loss from persistent stores + +--- + +### RB-3: Staged Rollback (Blue-Green Alternative) + +**When to use:** Can't tolerate any downtime. + +**Prerequisites:** +- Two separate services running (blue = prod, green = staging) +- Load balancer or router can switch traffic +- Synchronized database + +**Steps:** + +```bash +# 1. Deploy to green environment +cd /path/to/green/environment +git pull +docker compose build --no-cache +docker compose up -d + +# 2. Health check green +curl -sf http://green-backend:3001/api/health + +# 3. Route traffic to green (via load balancer/DNS) +# (This step is environment-specific) + +# 4. If issues, revert traffic to blue immediately +# (No containers to roll back on blue; it kept serving) + +# 5. 
Debug green offline +# (No downtime for users) +``` + +--- + +## Safety Checks Summary + +| Check | When | Command | Pass Criteria | +|-------|------|---------|---------------| +| Health | After deploy | `curl -sf http://localhost:3001/api/health` | HTTP 200 within 60s | +| Labels | After deploy | `docker inspect gravl-backend --format '{{index .Config.Labels "org.opencontainers.image.revision"}}'` | Non-empty, matches `git rev-parse HEAD` | +| Build status | Before deploy | `scripts/build-check.sh` | No STALE warnings | +| Container state | After deploy | `docker compose ps` | All containers "Up" | +| Logs | After deploy | `docker logs gravl-backend \| tail -20` | No ERROR or CRITICAL lines | + +--- + +## Running Tests Locally + +### Quick Test (5 min) +```bash +cd /workspace/gravl + +# UT-D1: Git pull +git pull + +# UT-D2: Build with no-cache +docker compose build --no-cache + +# UT-D3: Health check +curl -sf http://localhost:3001/api/health + +# UT-B1: Build-check +scripts/build-check.sh +``` + +### Full Suite (30 min) +```bash +# Clone test repo in /tmp +mkdir -p /tmp/gravl-test +cd /tmp/gravl-test +git clone /workspace/gravl . 
+git remote set-url origin /workspace/gravl + +# Run all UTs and IT-1 +# (See individual test steps above) +``` + +--- + +## Metrics to Monitor + +After each test, log these metrics to `logs/test-results.json`: +- Deploy time (seconds) +- Health check time (seconds) +- Build cache hit rate (% of layers reused; expected to be 0 because deploys use `--no-cache`) +- Container restart count +- Error count in logs + +Example: +```json +{ + "timestamp": "2026-03-03T18:21:00Z", + "test_name": "IT-1", + "deploy_time_sec": 45, + "health_check_time_sec": 8, + "result": "pass" +} +``` + +--- + +*Last updated: 2026-03-03 | Next review: After phase 07-04 completion* diff --git a/scripts/build-check.sh b/scripts/build-check.sh index 5cca7ef..539ae85 100755 --- a/scripts/build-check.sh +++ b/scripts/build-check.sh @@ -1,6 +1,25 @@ #!/bin/bash # Compare deployed container versions against local git HEAD # Warns if containers are stale (built from an older commit) +# +# Usage: +# ./scripts/build-check.sh +# +# What it shows: +# - Local git HEAD commit SHA +# - Each container's built commit SHA (from Docker labels) +# - Whether containers are up-to-date or stale +# - Warnings if labels are missing (pre-07-02 containers) +# +# Label fields read: +# - org.opencontainers.image.revision = Git commit SHA embedded by deploy.sh +# - org.opencontainers.image.created = Build timestamp (ISO 8601 format) +# +# Exit codes: +# 0 = Check completed; stale containers and missing labels are reported +# as warnings in the output and do not change the exit code +# +# See: /docs/DEPLOYMENT.md for troubleshooting SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" REPO_DIR="$(dirname "$SCRIPT_DIR")" @@ -11,14 +30,19 @@ LOCAL_COMMIT=$(git rev-parse HEAD) echo "Local HEAD: $(git rev-parse --short HEAD) ($LOCAL_COMMIT)" echo "" +# Check a single container's build status +# Args: $1 = container name check() { local name="$1" + # Container not running if ! 
docker inspect "$name" &>/dev/null; then echo "[$name] Not running" return fi + # Read Docker labels set by deploy.sh + # If labels are missing, container was built before phase 07-02 local commit date commit=$(docker inspect "$name" --format '{{index .Config.Labels "org.opencontainers.image.revision"}}' 2>/dev/null) date=$(docker inspect "$name" --format '{{index .Config.Labels "org.opencontainers.image.created"}}' 2>/dev/null) @@ -28,8 +52,10 @@ check() { return fi + # Display container info echo "[$name] Built: ${commit:0:7} on ${date:-unknown}" + # Compare container commit to local HEAD if [ "$commit" = "$LOCAL_COMMIT" ]; then echo "[$name] OK: up to date" else @@ -37,5 +63,6 @@ check() { fi } +# Check both containers check "gravl-backend" check "gravl-frontend" diff --git a/scripts/deploy.sh b/scripts/deploy.sh index 14d5fda..48ab2b2 100755 --- a/scripts/deploy.sh +++ b/scripts/deploy.sh @@ -1,6 +1,24 @@ #!/bin/bash # Gravl deployment script # Prevents stale containers by always building fresh with --no-cache +# +# Usage: +# ./scripts/deploy.sh +# +# What it does: +# 1. Pulls latest code from git +# 2. Captures build metadata (commit SHA, timestamp) +# 3. Builds fresh Docker images with --no-cache (no layer caching) +# 4. Restarts containers to use new images +# 5. Polls /api/health endpoint until backend is ready +# 6. Logs all steps to logs/deploy.log +# +# Rationale for --no-cache: +# Docker caching can hide stale assets (JS, CSS, images) when source files change. +# Using --no-cache ensures all layers rebuild fresh, guaranteeing new code is deployed. +# Trade-off: Slightly slower builds (30-60s vs 10-20s with cache), but safer. +# +# See: /docs/DEPLOYMENT.md for troubleshooting set -euo pipefail @@ -18,25 +36,32 @@ cd "$REPO_DIR" log "=== Deploy started ===" -# Pull latest code +# Pull latest code from remote +# Fails if there are local changes or merge conflicts log "Pulling latest code..." 
git pull -# Capture build metadata +# Capture build metadata to embed in Docker image labels +# These labels allow build-check.sh to verify deployed containers match local code GIT_COMMIT=$(git rev-parse HEAD) BUILD_DATE=$(date -u +"%Y-%m-%dT%H:%M:%SZ") log "Commit: $(git rev-parse --short HEAD) | Date: $BUILD_DATE" -# Build fresh images — no-cache prevents stale assets +# Build fresh images — no-cache prevents Docker layer caching +# This is critical for frontend deployments where CSS/JS changes might not be obvious +# to Docker's layer detection algorithm log "Building images (--no-cache)..." export GIT_COMMIT BUILD_DATE docker compose build --no-cache # Restart containers with new images +# --force-recreate stops old containers and removes them before starting new ones log "Starting containers..." docker compose up -d --force-recreate -# Health check: wait up to 60s for backend +# Health check: poll /api/health endpoint until it responds with 200 OK +# Timeout: 60 seconds (12 retries × 5 seconds each) +# This prevents deployment from completing if the backend is broken log "Health check..." for i in $(seq 1 12); do if curl -sf "$BACKEND_HEALTH" >/dev/null 2>&1; then