# Gravl Deployment Guide

This guide covers how to deploy Gravl's backend and frontend services using automated scripts, verify deployment status, and handle troubleshooting and recovery scenarios.

---

## Overview

Gravl uses Docker and Docker Compose for containerization. Two automated scripts manage the deployment lifecycle:

- **`scripts/deploy.sh`**: Pulls latest code, builds fresh images (with `--no-cache` to prevent stale assets), and starts containers with health checks
- **`scripts/build-check.sh`**: Verifies that running containers match the current git HEAD (detects stale deployments)

---

## Prerequisites

Before deploying, ensure you have:

1. **Docker & Docker Compose** installed and running
   ```bash
   docker --version
   docker compose version
   ```

2. **Git** configured with push/pull access to the repository
   ```bash
   git remote -v
   ```

3. **Network access** to required ports:
   - Backend: `localhost:3001` (health check at `http://localhost:3001/api/health`)
   - Frontend: `localhost:3000` (or configured in `docker-compose.yml`)

4. **Sufficient disk space** for Docker images and volumes
   ```bash
   docker system df
   ```

5. **No conflicting services** using ports 3000-3001
   ```bash
   lsof -i :3000 -i :3001  # (macOS/Linux only)
   ```

---

## How to Run `deploy.sh`

### Basic Usage

```bash
cd /workspace/gravl
scripts/deploy.sh
```

### What It Does

1. **Git Pull**: Fetches and merges latest code from remote
   - Exits if merge conflicts occur (manual resolution required)

2. **Captures Metadata**:
   - Current git commit hash
   - Build timestamp
   - These are stored as Docker image labels for later verification

3. **Builds Docker Images** (`--no-cache`):
   - Rebuilds all layers (no caching) to prevent stale assets
   - Applies git commit and build timestamp as labels

4. **Starts Containers**:
   - Uses `docker compose up -d --force-recreate` to ensure clean start
   - Both backend and frontend containers are started

5. **Health Check**:
   - Waits up to 60 seconds for backend to respond on `/api/health`
   - Retries every 5 seconds (12 attempts max)
   - Fails with exit code 1 if health check times out

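
The health-check step above can be sketched as a small polling loop. This is an illustrative sketch, not the contents of `scripts/deploy.sh`: the function name `wait_for_health` and its parameters are assumptions, and only the endpoint, the 12 attempts, and the 5-second interval come from the description above.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; not the contents of scripts/deploy.sh.
# Usage: wait_for_health URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_health() {
  local url="$1" attempts="${2:-12}" delay="${3:-5}" i=1
  while [ "$i" -le "$attempts" ]; do
    # curl -f returns non-zero on HTTP 4xx/5xx; -s hides progress output
    if curl -sf -o /dev/null "$url"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "ERROR: Health check failed after $((attempts * delay))s" >&2
  return 1
}
```

With the defaults, `wait_for_health http://localhost:3001/api/health || exit 1` polls 12 times at 5-second intervals and exits with code 1 on timeout, which matches the behavior listed above.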
### Exit Codes

| Code | Meaning | Next Steps |
|------|---------|-----------|
| 0 | Success | Deployment complete; containers healthy |
| 1 | Failure | See troubleshooting below |

### Logs

All deploy activity is logged to `logs/deploy.log`:

```bash
tail -50 logs/deploy.log     # Last 50 lines
grep ERROR logs/deploy.log   # Find errors
```

### Environment Variables

Optional env vars can be set before running `deploy.sh`:

| Variable | Default | Purpose |
|----------|---------|---------|
| `GIT_COMMIT` | auto-detected | Override git commit label (not recommended) |
| `BUILD_DATE` | auto-detected | Override build timestamp (not recommended) |

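
A common way to get "auto-detected unless overridden" behavior in a shell script is the `${VAR:-default}` expansion. The sketch below shows that pattern under stated assumptions: `resolve_build_metadata` is a hypothetical name, and this guide does not show how `deploy.sh` actually resolves these values. Calling it as `GIT_COMMIT=abc1234 resolve_build_metadata` would print the override without running `git rev-parse HEAD`.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; not the contents of scripts/deploy.sh.
resolve_build_metadata() {
  # Use the caller's value when set and non-empty; otherwise auto-detect.
  # The command substitution only runs when the variable is unset or empty.
  local commit="${GIT_COMMIT:-$(git rev-parse HEAD)}"
  local build_date="${BUILD_DATE:-$(date -u +%Y-%m-%dT%H:%M:%SZ)}"
  printf 'GIT_COMMIT=%s\nBUILD_DATE=%s\n' "$commit" "$build_date"
}
```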
---

## How to Check Build Status (`build-check.sh`)

Run this command anytime to verify deployed containers match your local code:

```bash
scripts/build-check.sh
```

### Output Example

**Healthy deployment:**
```
Local HEAD: abc1234 (abc1234567890abcdef1234567890abcdef123456)

[gravl-backend] Built: abc1234 on 2026-03-03T18:21:00Z
[gravl-backend] OK: up to date
[gravl-frontend] Built: abc1234 on 2026-03-03T18:21:00Z
[gravl-frontend] OK: up to date
```

**Stale containers (code updated, not redeployed):**
```
Local HEAD: xyz5678 (xyz5678...)

[gravl-backend] Built: abc1234 on 2026-03-03T18:21:00Z
[gravl-backend] STALE: container is behind local code — run scripts/deploy.sh
[gravl-frontend] Built: abc1234 on 2026-03-03T18:21:00Z
[gravl-frontend] STALE: container is behind local code — run scripts/deploy.sh
```

**Missing labels (container built manually, not via deploy.sh):**
```
Local HEAD: abc1234

[gravl-backend] WARNING: no build label found — redeploy with scripts/deploy.sh to add tracking
[gravl-frontend] Not running
```

### Exit Codes

| Code | Meaning |
|------|---------|
| 0 | All checks completed. `STALE`, `WARNING`, and `Not running` results are reported in the output but do not cause a non-zero exit |

Because the exit code does not distinguish these cases, automation should inspect the output text rather than the exit status.

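
The three outcomes in the sample output come down to comparing the commit recorded on the image with the local HEAD. A minimal sketch of that comparison follows. The function name `classify_build` and the label key `git.commit` mentioned in the comment are assumptions for illustration, since this guide does not name the label that `deploy.sh` writes.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; not the contents of scripts/build-check.sh.
# Usage: classify_build CONTAINER LOCAL_HEAD LABEL_COMMIT
# LABEL_COMMIT could be read with something like (label key is hypothetical):
#   docker inspect --format '{{ index .Config.Labels "git.commit" }}' gravl-backend
classify_build() {
  local name="$1" head="$2" label="$3"
  if [ -z "$label" ]; then
    # No commit recorded on the image
    echo "[$name] WARNING: no build label found"
  elif [ "$label" = "$head" ]; then
    echo "[$name] OK: up to date"
  else
    echo "[$name] STALE: container is behind local code"
  fi
}
```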
---

## Troubleshooting

### Health Check Failures

**Symptom:** `ERROR: Health check failed after 60s`

**Causes & Solutions:**

1. **Backend service didn't start**
   ```bash
   docker logs --tail 20 gravl-backend
   # Look for:
   # - Port conflicts (EADDRINUSE)
   # - Missing dependencies (module not found)
   # - Database connection errors
   ```

2. **Port 3001 is already in use**
   ```bash
   lsof -i :3001               # Find what's using it
   docker port gravl-backend   # Check published ports
   kill <PID>                  # Stop the conflicting process (if safe); use kill -9 only if it ignores SIGTERM
   scripts/deploy.sh           # Retry
   ```

3. **Network issue between host and container**
   ```bash
   # Print the container's IP on each attached network
   # (the top-level .NetworkSettings.IPAddress field is empty on Compose-created networks)
   docker inspect gravl-backend --format '{{range .NetworkSettings.Networks}}{{.IPAddress}} {{end}}'
   curl -sf http://<container-ip>:3001/api/health  # Test directly
   ```

4. **Backend code has a syntax error**
   ```bash
   docker logs gravl-backend 2>&1 | grep -i "syntax\|error\|exception"
   # Check backend/src/index.js for obvious errors
   # Revert recent changes: git log --oneline -5 && git checkout <good-commit>
   ```

**Quick recovery:**

```bash
# 1. Stop everything
docker compose down

# 2. Start only the backend and check its logs
#    (assumes the Compose service is also named gravl-backend)
docker compose up -d gravl-backend
sleep 5
docker logs --tail 50 gravl-backend

# 3. If logs show errors, fix code and retry
git diff HEAD~1..HEAD backend/src/
# ... fix issues ...
scripts/deploy.sh
```

---

### Stale Containers

**Symptom:** `build-check.sh` shows `STALE: container is behind local code`

**Causes:**

- Code was updated (`git pull`) but `deploy.sh` hasn't been run
- Deployment failed partway through
- Manual restart without redeploy

**Solution:**

```bash
scripts/deploy.sh
scripts/build-check.sh  # Verify update
```

---

### Missing Build Labels

**Symptom:** `WARNING: no build label found — redeploy with scripts/deploy.sh`

**Causes:**

- Container was built with `docker compose build` directly (not via `deploy.sh`)
- Container predates the labeling system

**Solution:**

```bash
# Re-deploy to add labels
scripts/deploy.sh
```

---

### Container Won't Start (CrashLoopBackOff / Exited)

**Symptom:** `docker compose ps -a` shows the container in the "Exited" state (without `-a`, stopped containers may not be listed)

**Steps:**

1. **Check container logs**
   ```bash
   docker logs gravl-backend --tail 50
   docker logs gravl-frontend --tail 50
   ```

2. **Check docker-compose.yml for typos**
   ```bash
   docker compose config  # Validates syntax
   ```

3. **Inspect health check endpoint**
   ```bash
   curl -v http://localhost:3001/api/health
   # Should see HTTP 200, not 404 or 500
   ```

4. **If all else fails, clean rebuild**
   ```bash
   docker compose down
   docker rmi gravl-backend gravl-frontend
   docker system prune -f
   scripts/deploy.sh
   ```

---

### Database Connection Issues

**Symptom:** Backend logs show `Connection refused` or `ECONNREFUSED`

**Causes:**

- Database service not running
- Wrong host/port in `.env` or backend code
- Network issue between containers

**Solutions:**

1. **Check database service status** (if applicable)
   ```bash
   docker compose ps   # All services running?
   docker network ls   # Check gravl network exists
   ```

2. **Verify connection string in `.env`**
   ```bash
   grep -i database .env
   # Should match docker-compose.yml service name (e.g., gravl-db:5432)
   ```

3. **Test connection from backend container**
   ```bash
   docker exec gravl-backend ping -c 3 gravl-db         # Name resolution and reachability
   docker exec gravl-backend curl http://gravl-db:5432  # Only meaningful if the database speaks HTTP; adjust port
   ```

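
When the database does not speak HTTP, a plain TCP connection test is usually more informative than `curl`. The sketch below is one way to do that from any machine with bash and the coreutils `timeout` command. The helper name `tcp_check` is hypothetical, and `gravl-db:5432` is only resolvable from inside the Compose network, so from the host you would point it at a published port instead. Inside the backend container the same idea would be `docker exec gravl-backend bash -c 'exec 3<>/dev/tcp/gravl-db/5432'`, assuming the image ships bash.

```bash
#!/usr/bin/env bash
# Illustrative sketch only. Succeeds if a TCP connection to HOST:PORT
# opens within 3 seconds; relies on bash's /dev/tcp redirection.
# Usage: tcp_check HOST PORT
tcp_check() {
  local host="$1" port="$2"
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}
```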
---

### Disk Space Issues

**Symptom:** `no space left on device` during build

**Solution:**

```bash
# Check disk usage
docker system df

# Clean up unused images/containers
# WARNING: --volumes also deletes unused volumes and any data in them; drop the flag if unsure
docker system prune -a --volumes

# Then retry deploy
scripts/deploy.sh
```

---

## Recovery Procedures

### Manual Rollback to Previous Commit

Use this when the deployed code is broken and you need to quickly revert.

```bash
# 1. Find the last good commit
git log --oneline -10  # Review recent commits

# 2. Check out the known-good commit
#    Note: this leaves a detached HEAD, where a plain `git pull` fails;
#    step 4 confirms which commit was actually built
git checkout <commit-hash>

# 3. Redeploy
scripts/deploy.sh

# 4. Verify
scripts/build-check.sh
curl -sf http://localhost:3001/api/health

# 5. Document the incident
echo "Rolled back to <commit-hash> due to <reason>" >> logs/rollback.log
```

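
Step 5 is easier to keep consistent if the log line is written by a small helper that adds a timestamp. The sketch below is one possible shape. `log_rollback` is a hypothetical name, and the UTC timestamp prefix is an addition that the `echo` in step 5 does not have.

```bash
#!/usr/bin/env bash
# Illustrative sketch only. Appends one timestamped rollback record.
# Usage: log_rollback COMMIT REASON [LOGFILE]
log_rollback() {
  local commit="$1" reason="$2" logfile="${3:-logs/rollback.log}"
  # Create the log directory on first use
  mkdir -p "$(dirname "$logfile")"
  printf '%s Rolled back to %s due to %s\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$commit" "$reason" >> "$logfile"
}
```

For example, `log_rollback abc1234 "failed health check"` appends a single line to `logs/rollback.log`.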
### Emergency Container Cleanup

Use this when containers are hung, corrupted, or in an unknown state.

```bash
# 1. Stop all services
docker compose down

# 2. Remove images (forces fresh rebuild)
docker rmi gravl-backend gravl-frontend

# 3. Clear unused volumes (optional; use with caution!)
# docker volume prune

# 4. Rebuild from scratch
scripts/deploy.sh

# 5. Verify all containers running and healthy
docker compose ps
scripts/build-check.sh
curl -sf http://localhost:3001/api/health
```

**Safety Check:** `docker volume prune` permanently deletes unused Docker volumes together with any data stored in them. Leave step 3 commented out unless you are sure you don't need that data.

### Staged Rollback (Zero-Downtime)

If you're running a blue-green deployment setup:

```bash
# 1. Deploy to green environment
cd /path/to/green
git pull && docker compose build --no-cache && docker compose up -d

# 2. Test green (health check, smoke tests)
curl -sf http://green-backend:3001/api/health

# 3. Switch traffic to green (via load balancer or DNS)
# (Implementation depends on your infrastructure)

# 4. If green has issues, revert traffic to blue immediately
# (Blue kept serving; no downtime)

# 5. Debug green offline
docker logs gravl-backend
```

---

## Monitoring After Deployment

### Immediate Checks (after `deploy.sh` completes)

```bash
# Containers are running
docker compose ps

# Backend is healthy
curl -sf http://localhost:3001/api/health | jq .

# Containers match local code
scripts/build-check.sh

# Logs have no errors
docker logs gravl-backend 2>&1 | grep -i error | head -5
```

### Ongoing Checks (periodically)

```bash
# Run build-check regularly (cron every 30 min, or manual)
scripts/build-check.sh

# Monitor resource usage (--no-stream prints one snapshot instead of a live view)
docker stats --no-stream gravl-backend gravl-frontend

# Audit the last hour of logs for issues (2>&1 includes the container's stderr)
docker logs --since 1h gravl-backend 2>&1 | grep ERROR
```

### Example Monitoring Script

```bash
#!/bin/bash
# Save as scripts/health-monitor.sh
set -euo pipefail

HEALTHY=true

# Check containers running (passes if at least one container is "Up").
# Output is captured first: with pipefail, piping straight into grep -q
# can report failure when grep exits early and closes the pipe.
ps_output="$(docker compose ps)" || HEALTHY=false
grep -q "Up" <<<"$ps_output" || HEALTHY=false

# Check health endpoint (response body is discarded)
curl -sf -o /dev/null http://localhost:3001/api/health || HEALTHY=false

# Check for stale containers
check_output="$(scripts/build-check.sh)" || HEALTHY=false
echo "$check_output"
if grep -q "STALE" <<<"$check_output"; then HEALTHY=false; fi

if [ "$HEALTHY" = "true" ]; then
  echo "[$(date)] Gravl is healthy ✓"
else
  echo "[$(date)] Gravl has issues! See above." >&2
  exit 1
fi
```

---

## Best Practices

1. **Always run `build-check.sh` before deploying changes**
   - Ensures you know the current state
   - Catches stale containers early

2. **Review changes before deploying**
   ```bash
   git log --oneline -5         # Recent commits
   git diff origin/main..HEAD   # What will be deployed
   ```

3. **Test in staging first**
   - Keep a separate staging environment for pre-production testing
   - Deploy to staging, verify, then deploy to production

4. **Keep logs rotated**
   - `logs/deploy.log` can grow large
   - Use `logrotate` or manual cleanup: `tail -1000 logs/deploy.log > logs/deploy.log.1 && : > logs/deploy.log`

5. **Automate regular checks**
   - Cron job to run `build-check.sh` every 30 minutes
   - Send alerts if "STALE" or "WARNING" found

6. **Document rollbacks**
   - Always log why you rolled back
   - Review patterns (e.g., "rolled back 3 times this week" may mean code review is missing problems)

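
Because `build-check.sh` exits with 0 even when it reports problems, an automated check has to look at the output text. The sketch below shows one way to do that. `report_problems` is a hypothetical helper, and the crontab line in the comment uses placeholder paths and a placeholder `mail` command that you would replace with your own alerting.

```bash
#!/usr/bin/env bash
# Illustrative sketch only. Reads build-check output on stdin, prints the
# lines that need attention, and returns 0 only if there were any.
report_problems() {
  grep -E 'STALE|WARNING'
}

# Example crontab entry (every 30 minutes; path and mail command are placeholders):
# */30 * * * * cd /workspace/gravl && scripts/build-check.sh | grep -E 'STALE|WARNING' | mail -E -s "Gravl build-check" ops@example.com
```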
---

## See Also

- **Testing**: [DEPLOYMENT_TEST_PLAN.md](./DEPLOYMENT_TEST_PLAN.md) (comprehensive test scenarios)
- **Code style**: [CODING-CONVENTIONS.md](./CODING-CONVENTIONS.md)
- **Architecture**: Backend README or architecture docs (if available)

---

*Last updated: 2026-03-03 | Maintained by: Gravl Development Team*