Gravl Deployment Guide
This guide covers how to deploy Gravl's backend and frontend services using automated scripts, verify deployment status, and handle troubleshooting and recovery scenarios.
Overview
Gravl uses Docker and Docker Compose for containerization. Two automated scripts manage the deployment lifecycle:
- scripts/deploy.sh: Pulls the latest code, builds fresh images (with --no-cache to prevent stale assets), and starts containers with health checks
- scripts/build-check.sh: Verifies that running containers match the current git HEAD (detects stale deployments)
Prerequisites
Before deploying, ensure you have:
- Docker & Docker Compose installed and running
    docker --version
    docker compose version
- Git configured with push/pull access to the repository
    git remote -v
- Network access to required ports:
  - Backend: localhost:3001 (health check at http://localhost:3001/api/health)
  - Frontend: localhost:3000 (or as configured in docker-compose.yml)
- Sufficient disk space for Docker images and volumes
    docker system df
- No conflicting services using ports 3000-3001
    lsof -i :3000 -i :3001 # (macOS/Linux only)
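The tool checks above can be bundled into a single preflight step. The following is a minimal sketch, not part of deploy.sh; the function names `require_cmd` and `preflight` are hypothetical, and it only covers the Docker, Compose, and Git checks.

```shell
# Hypothetical helper: succeed only if the named command is available.
require_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Hypothetical preflight: report every missing prerequisite, then fail once.
preflight() {
  missing=0
  for cmd in docker git; do
    if ! require_cmd "$cmd"; then
      echo "missing: $cmd" >&2
      missing=1
    fi
  done
  # The Compose plugin is a docker subcommand, so probe it separately.
  if require_cmd docker && ! docker compose version >/dev/null 2>&1; then
    echo "missing: docker compose plugin" >&2
    missing=1
  fi
  return "$missing"
}
```

Calling `preflight && scripts/deploy.sh` would then skip the deploy when a tool is missing.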
How to Run deploy.sh
Basic Usage
cd /workspace/gravl
scripts/deploy.sh
What It Does
- Git Pull: Fetches and merges the latest code from the remote
  - Exits if merge conflicts occur (manual resolution required)
- Captures Metadata:
  - Current git commit hash
  - Build timestamp
  - These are stored as Docker image labels for later verification
- Builds Docker Images (--no-cache):
  - Rebuilds all layers (no caching) to prevent stale assets
  - Applies git commit and build timestamp as labels
- Starts Containers:
  - Uses docker compose up -d --force-recreate to ensure a clean start
  - Both backend and frontend containers are started
- Health Check:
  - Waits up to 60 seconds for the backend to respond on /api/health
  - Retries every 5 seconds (12 attempts max)
  - Fails with exit code 1 if the health check times out
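The health check step can be pictured as a simple retry loop. This is a sketch under the numbers stated above (12 attempts, 5 seconds apart), not the actual code in deploy.sh; the function name `wait_for_health` and its argument order are assumptions made for illustration.

```shell
# Poll a URL until it answers successfully or the attempts run out.
# Arguments: URL, max attempts (default 12), seconds between attempts (default 5).
wait_for_health() {
  wfh_url=$1
  wfh_attempts=${2:-12}
  wfh_interval=${3:-5}
  wfh_try=1
  while [ "$wfh_try" -le "$wfh_attempts" ]; do
    # -s hides progress output; -f makes curl fail on HTTP error responses.
    if curl -sf -o /dev/null "$wfh_url"; then
      echo "healthy after $wfh_try attempt(s)"
      return 0
    fi
    wfh_try=$((wfh_try + 1))
    sleep "$wfh_interval"
  done
  echo "ERROR: Health check failed after $((wfh_attempts * wfh_interval))s" >&2
  return 1
}
```

For example, `wait_for_health http://localhost:3001/api/health` uses the defaults and returns 1 once the 60-second budget is spent.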
Exit Codes
| Code | Meaning | Next Steps |
|---|---|---|
| 0 | Success | Deployment complete; containers healthy |
| 1 | Failure | See troubleshooting below |
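Because deploy.sh reports only 0 or 1, a calling script can branch on the exit code directly. A minimal sketch, with `deploy_and_report` as a hypothetical wrapper name:

```shell
# Run a deploy command and translate its exit code into the next step.
# "$@" is the command to run, for example scripts/deploy.sh.
deploy_and_report() {
  if "$@"; then
    echo "deploy succeeded: containers healthy"
  else
    dar_status=$?
    echo "deploy failed with exit code $dar_status: check logs/deploy.log" >&2
    return "$dar_status"
  fi
}
```

Usage would be `deploy_and_report scripts/deploy.sh`, which passes the script's exit code through unchanged.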
Logs
All deploy activity is logged to logs/deploy.log:
tail -50 logs/deploy.log # Last 50 lines
grep ERROR logs/deploy.log # Find errors
Environment Variables
Optional env vars can be set before running deploy.sh:
| Variable | Default | Purpose |
|---|---|---|
| GIT_COMMIT | auto-detected | Override git commit label (not recommended) |
| BUILD_DATE | auto-detected | Override build timestamp (not recommended) |
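The "auto-detected unless overridden" behavior can be expressed with shell default expansion. The sketch below assumes the commit comes from `git rev-parse HEAD` and the timestamp is UTC in the format shown in the build-check.sh output; the function name `resolve_build_metadata` is hypothetical, and how deploy.sh actually passes these values into the image build is not covered in this guide.

```shell
# Resolve the values used for image labels: keep an override if the variable
# is already set, otherwise detect the value.
resolve_build_metadata() {
  GIT_COMMIT=${GIT_COMMIT:-$(git rev-parse HEAD)}
  BUILD_DATE=${BUILD_DATE:-$(date -u +%Y-%m-%dT%H:%M:%SZ)}
  export GIT_COMMIT BUILD_DATE
}
```

Running `GIT_COMMIT=abc1234 scripts/deploy.sh` would then keep the override, while an unset variable falls back to detection.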
How to Check Build Status (build-check.sh)
Run this command anytime to verify deployed containers match your local code:
scripts/build-check.sh
Output Example
Healthy deployment:
Local HEAD: abc1234 (abc1234567890abcdef1234567890abcdef123456)
[gravl-backend] Built: abc1234 on 2026-03-03T18:21:00Z
[gravl-backend] OK: up to date
[gravl-frontend] Built: abc1234 on 2026-03-03T18:21:00Z
[gravl-frontend] OK: up to date
Stale containers (code updated, not redeployed):
Local HEAD: xyz5678 (xyz5678...)
[gravl-backend] Built: abc1234 on 2026-03-03T18:21:00Z
[gravl-backend] STALE: container is behind local code — run scripts/deploy.sh
[gravl-frontend] Built: abc1234 on 2026-03-03T18:21:00Z
[gravl-frontend] STALE: container is behind local code — run scripts/deploy.sh
Missing labels (container built manually, not via deploy.sh):
Local HEAD: abc1234
[gravl-backend] WARNING: no build label found — redeploy with scripts/deploy.sh to add tracking
[gravl-frontend] Not running
Exit Codes
| Code | Meaning |
|---|---|
| 0 | All checks completed. Warnings and missing containers are reported in the output but do not cause a non-zero exit, so read the output for status |
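The comparison behind the three output states can be sketched as follows. This is an illustration, not the source of build-check.sh: the function names are hypothetical, and the label key `git.commit` is a placeholder for whatever key deploy.sh actually writes.

```shell
# Compare a container's recorded commit with the local HEAD.
# Arguments: container name, commit label value (may be empty), local HEAD hash.
classify_build() {
  if [ -z "$2" ]; then
    echo "[$1] WARNING: no build label found"
  elif [ "$2" = "$3" ]; then
    echo "[$1] OK: up to date"
  else
    echo "[$1] STALE: container is behind local code"
  fi
}

# Read the commit label from a container. "git.commit" is a placeholder key.
read_commit_label() {
  docker inspect --format '{{ index .Config.Labels "git.commit" }}' "$1" 2>/dev/null
}
```

A caller could combine them as `classify_build gravl-backend "$(read_commit_label gravl-backend)" "$(git rev-parse HEAD)"`.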
Troubleshooting
Health Check Failures
Symptom: ERROR: Health check failed after 60s
Causes & Solutions:
- Backend service didn't start
    docker logs gravl-backend 2>&1 | tail -20
    # Look for:
    # - Port conflicts (EADDRINUSE)
    # - Missing dependencies (module not found)
    # - Database connection errors
- Port 3001 is already in use
    lsof -i :3001 # Find what's using it
    docker port gravl-backend # Check exposed port
    kill -9 <PID> # Kill conflicting process (if safe)
    scripts/deploy.sh # Retry
- Network issue between host and container
    docker inspect gravl-backend --format '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}'
    curl -sf http://<container-ip>:3001/api/health # Test directly (container IPs are reachable from the host on Linux only)
- Backend code has syntax error
    docker logs gravl-backend 2>&1 | grep -iE "syntax|error|exception"
    # Check backend/src/index.js for obvious errors
    # Revert recent changes:
    git log --oneline -5 && git checkout <good-commit>
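To speed up the first pass through the logs, the causes listed above can be matched against known error strings. This sketch assumes a Node.js backend (the guide references backend/src/index.js) and uses the error strings Node typically prints; the function name `triage_backend_log` and the category labels are hypothetical.

```shell
# Read backend log text on stdin and print the first likely cause found.
triage_backend_log() {
  tbl_log=$(cat)
  case "$tbl_log" in
    *EADDRINUSE*)           echo "port conflict" ;;
    *"Cannot find module"*) echo "missing dependency" ;;
    *ECONNREFUSED*)         echo "database connection error" ;;
    *SyntaxError*)          echo "syntax error" ;;
    *)                      echo "unknown" ;;
  esac
}
```

Usage: `docker logs gravl-backend 2>&1 | triage_backend_log`. An "unknown" result means the logs still need to be read by hand.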
Quick recovery:
# 1. Stop everything
docker compose down
# 2. Start only the backend and check its logs
docker compose up -d gravl-backend
sleep 5
docker logs --tail 50 gravl-backend
# 3. If logs show errors, fix code and retry
git diff HEAD~1..HEAD backend/src/
# ... fix issues ...
scripts/deploy.sh
Stale Containers
Symptom: build-check.sh shows STALE: container is behind local code
Causes:
- Code was updated (git pull) but deploy.sh hasn't been run
- Deployment failed partway through
- Manual restart without redeploy
Solution:
scripts/deploy.sh
scripts/build-check.sh # Verify update
Missing Build Labels
Symptom: WARNING: no build label found — redeploy with scripts/deploy.sh
Causes:
- Container was built with docker compose build directly (not via deploy.sh)
- Container predates the labeling system
Solution:
# Re-deploy to add labels
scripts/deploy.sh
Container Won't Start (Restarting / Exited)
Symptom: docker compose ps -a shows the container in the "Exited" or "Restarting" state
Steps:
- Check container logs
    docker logs gravl-backend --tail 50
    docker logs gravl-frontend --tail 50
- Check docker-compose.yml for typos
    docker compose config # Validates syntax
- Inspect health check endpoint
    curl -v http://localhost:3001/api/health # Should see HTTP 200, not 404 or 500
- If all else fails, clean rebuild
    docker compose down
    docker rmi gravl-backend gravl-frontend
    docker system prune -f
    scripts/deploy.sh
Database Connection Issues
Symptom: Backend logs show Connection refused or ECONNREFUSED
Causes:
- Database service not running
- Wrong host/port in .env or backend code
- Network issue between containers
Solutions:
- Check database service status (if applicable)
    docker compose ps # All services running?
    docker network ls # Check gravl network exists
- Verify connection string in .env
    grep -i database .env # Should match docker-compose.yml service name (e.g., gravl-db:5432)
- Test connection from backend container
    docker exec gravl-backend ping -c 3 gravl-db
    docker exec gravl-backend curl http://gravl-db:5432 # Only if the database speaks HTTP; adjust the port
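Checking the connection string by eye is error-prone, so it can help to extract the host and port and compare them with the Compose service name. The sketch below assumes the setting is a URL stored under a variable named `DATABASE_URL`; that variable name, the URL shape, and the function name `extract_db_endpoint` are all assumptions, since the guide does not show the contents of .env.

```shell
# Print the host:port part of a DATABASE_URL line in an env file.
# Argument: path to the env file. Prints nothing if no such line exists.
extract_db_endpoint() {
  # Skip the scheme and optional user:password@ prefix, keep host:port.
  sed -n 's|^DATABASE_URL=[a-z]*://\([^@/]*@\)\{0,1\}\([^@/?]*\).*$|\2|p' "$1" | head -n 1
}
```

For a line such as `DATABASE_URL=postgres://user:pass@gravl-db:5432/gravl`, the function prints `gravl-db:5432`, which should match the service name and port in docker-compose.yml.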
Disk Space Issues
Symptom: no space left on device during build
Solution:
# Check disk usage
docker system df
# Clean up unused images, containers, and volumes
# (--volumes deletes unused volumes and their data; drop the flag if unsure)
docker system prune -a --volumes
# Then retry deploy
scripts/deploy.sh
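A build with --no-cache needs room for every layer, so checking free space before deploying can avoid a failed build. This is a sketch using POSIX `df`; the function names and any threshold you pass are assumptions, and on Docker Desktop the relevant disk is inside the VM, so `docker system df` remains the better indicator there.

```shell
# Print free space, in KiB, on the filesystem that holds the given path.
free_kib() {
  # -P forces the portable one-line-per-filesystem format; -k reports KiB.
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed if at least $2 KiB are free at path $1.
has_free_space() {
  [ "$(free_kib "$1")" -ge "$2" ]
}
```

For example, `has_free_space /var/lib/docker 5242880 || echo "less than 5 GiB free"` checks the default Docker data directory on Linux.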
Recovery Procedures
Manual Rollback to Previous Commit
Use this when the deployed code is broken and you need to quickly revert.
# 1. Find the last good commit
git log --oneline -10 # Review recent commits
# 2. Check out the known-good commit
git checkout <commit-hash>
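# 3. Redeploy (deploy.sh begins with git pull, which on a detached HEAD either fails or moves you back toward the latest code, so confirm the result in step 4)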
scripts/deploy.sh
# 4. Verify
scripts/build-check.sh
curl -sf http://localhost:3001/api/health
# 5. Document the incident
echo "Rolled back to <commit-hash> due to <reason>" >> logs/rollback.log
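The logging step is easier to keep consistent if it always records a timestamp. A small sketch, with `log_rollback` as a hypothetical helper name and logs/rollback.log as the default target taken from the command above:

```shell
# Append a timestamped rollback entry.
# Arguments: commit hash, reason, log file (default logs/rollback.log).
log_rollback() {
  lr_file=${3:-logs/rollback.log}
  mkdir -p "$(dirname "$lr_file")"
  printf '%s Rolled back to %s due to %s\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$1" "$2" >> "$lr_file"
}
```

Usage: `log_rollback <commit-hash> "health check failures after deploy"`.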
Emergency Container Cleanup
Use this when containers are hung, corrupted, or in an unknown state.
# 1. Stop all services
docker compose down
# 2. Remove images (forces fresh rebuild)
docker rmi gravl-backend gravl-frontend
# 3. Clear unused volumes (optional; use with caution!)
# docker volume prune
# 4. Rebuild from scratch
scripts/deploy.sh
# 5. Verify all containers running and healthy
docker compose ps
scripts/build-check.sh
curl -sf http://localhost:3001/api/health
Safety Check: If your data is in Docker volumes, docker volume prune will destroy them. Skip this step unless you're sure you don't need the data.
Staged Rollback (Zero-Downtime)
If you're running a blue-green deployment setup:
# 1. Deploy to green environment
cd /path/to/green
git pull && docker compose build --no-cache && docker compose up -d
# 2. Test green (health check, smoke tests)
curl -sf http://green-backend:3001/api/health
# 3. Switch traffic to green (via load balancer or DNS)
# (Implementation depends on your infrastructure)
# 4. If green has issues, revert traffic to blue immediately
# (Blue kept serving; no downtime)
# 5. Debug green offline
docker logs gravl-backend
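The decision in steps 2 to 4 can be reduced to one check: send traffic to green only if its health endpoint answers. The sketch below shows just that decision; the function name `pick_live_color` is hypothetical, and the actual traffic switch still depends on your load balancer or DNS.

```shell
# Decide which environment should receive traffic after a green deploy.
# Argument: green health URL. Prints "green" if it answers, otherwise "blue".
pick_live_color() {
  if curl -sf -o /dev/null --max-time 5 "$1"; then
    echo green
  else
    echo blue
  fi
}
```

Usage: `pick_live_color http://green-backend:3001/api/health`.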
Monitoring After Deployment
Immediate Checks (after deploy.sh completes)
# Containers are running
docker compose ps
# Backend is healthy
curl -sf http://localhost:3001/api/health | jq .
# Containers match local code
scripts/build-check.sh
# Logs have no errors
docker logs gravl-backend 2>&1 | grep -i error | head -5
Ongoing Checks (periodically)
# Run build-check regularly (cron every 30 min, or manual)
scripts/build-check.sh
# Monitor resource usage
docker stats gravl-backend gravl-frontend
# Audit logs for issues
docker logs gravl-backend --since 1h 2>&1 | grep ERROR
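Since build-check.sh exits 0 even when it prints warnings, a scheduled check has to inspect the output rather than the exit code. A minimal sketch of such a filter, with `scan_build_check` as a hypothetical name:

```shell
# Read build-check.sh output on stdin, print the lines that need attention,
# and return 1 if any were found (0 otherwise).
scan_build_check() {
  sbc_hits=$(grep -E 'STALE|WARNING' || true)
  if [ -n "$sbc_hits" ]; then
    printf '%s\n' "$sbc_hits"
    return 1
  fi
  return 0
}
```

Placed in a wrapper script (for instance a hypothetical scripts/build-alert.sh that runs `scripts/build-check.sh | scan_build_check`), it could be scheduled with a crontab entry such as `*/30 * * * * cd /workspace/gravl && scripts/build-alert.sh`.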
Example Monitoring Script
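#!/bin/bash
# Save as scripts/health-monitor.sh
set -euo pipefail
HEALTHY=true
# Check containers running. Output is captured first: piping straight into
# grep -q can fail under pipefail when grep exits early and closes the pipe.
ps_output=$(docker compose ps) || HEALTHY=false
grep -q "Up" <<<"${ps_output:-}" || HEALTHY=false
# Check health endpoint
curl -sf http://localhost:3001/api/health || HEALTHY=false
# Check for stale containers (print the report so "See above" has context)
check_output=$(scripts/build-check.sh) || HEALTHY=false
echo "${check_output:-}"
if grep -q "STALE" <<<"${check_output:-}"; then
  HEALTHY=false
fi
if [ "$HEALTHY" = "true" ]; then
  echo "[$(date)] Gravl is healthy ✓"
else
  echo "[$(date)] Gravl has issues! See above." >&2
  exit 1
fi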
Best Practices
- Always run build-check.sh before deploying changes
  - Ensures you know the current state
  - Catches stale containers early
- Review changes before deploying
    git log --oneline -5 # Recent commits
    git diff origin/main..HEAD # What will be deployed
- Test in staging first
  - Separate staging environment for pre-production testing
  - Deploy to staging, verify, then deploy to production
- Keep logs rotated
  - logs/deploy.log can grow large
  - Use logrotate or manual cleanup: tail -n 1000 logs/deploy.log > logs/deploy.log.1 && : > logs/deploy.log
- Automate regular checks
  - Cron job to run build-check.sh every 30 minutes
  - Send alerts if "STALE" or "WARNING" found
- Document rollbacks
  - Always log why you rolled back
  - Review patterns (e.g., "rolled back 3 times this week" = code review process failing)
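The manual log cleanup mentioned under "Keep logs rotated" can be wrapped in a small function so it is safe to run when the log does not exist yet. This is a sketch of the same tail-and-truncate approach, not a replacement for logrotate; the function name `rotate_log` is hypothetical.

```shell
# Keep the last N lines of a log in FILE.1, then empty FILE.
# Arguments: log file, number of lines to keep (default 1000).
rotate_log() {
  [ -f "$1" ] || return 0
  tail -n "${2:-1000}" "$1" > "$1.1" && : > "$1"
}
```

Usage: `rotate_log logs/deploy.log`. Lines written between the `tail` and the truncation are lost, which is one reason logrotate is preferable on busy systems.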
See Also
- Testing: DEPLOYMENT_TEST_PLAN.md — comprehensive test scenarios
- Code style: CODING-CONVENTIONS.md
- Architecture: Backend README or architecture docs (if available)
Last updated: 2026-03-03 | Maintained by: Gravl Development Team