chore(07-03): Stage deployment scripts and documentation updates

2026-03-03 19:24:29 +01:00
parent fa766b21f7
commit 1104f6360e
4 changed files with 722 additions and 377 deletions
+180 -145
@@ -1,155 +1,167 @@
# Gravl Backend
Node.js / Express API server for the Gravl application.
Backend service for the Gravl exercise and fitness tracking platform.
## Development
## Overview
### Prerequisites
- Node.js 18+
- PostgreSQL 14+ (or use Docker Compose)
- Docker & Docker Compose (for containerized development)
### Getting Started
```bash
# Install dependencies
npm install
# Create .env file (copy from .env.example)
cp .env.example .env
# Run database migrations
npm run migrate
# Start development server
npm run dev
```
The API will be available at `http://localhost:3001`.
### Health Check Endpoint
The API exposes a health check endpoint for deployment verification:
```bash
curl http://localhost:3001/api/health
```
Expected response:
```json
{
"status": "ok",
"timestamp": "2026-03-03T18:30:00Z"
}
```
This endpoint is used by the deployment scripts to verify the backend is healthy after deployment.
The Gravl backend is a Node.js/Express application that provides:
- REST API for exercise data management
- User authentication and authorization
- Integration with frontend via HTTP
- Health check endpoint for deployment monitoring
---
## Deployment
## Local Development
### Quick Start
### Prerequisites
See `/docs/DEPLOYMENT.md` for comprehensive deployment documentation.
- Node.js 18+
- npm or yarn
- Docker & Docker Compose (for local container development)
### Installation
```bash
# Deploy the application
scripts/deploy.sh
# Check deployment status
scripts/build-check.sh
cd backend
npm install
```
### How It Works
### Running Locally
1. **Automatic build:** `scripts/deploy.sh` builds fresh Docker images
2. **Zero downtime:** Old containers are replaced with `--force-recreate`
3. **Health verification:** API health endpoint is polled before deployment completes
4. **Rollback:** Use git to revert and redeploy if issues arise
**Development mode (with hot reload):**
```bash
npm run dev
```
### Prerequisites for Deployment
The server starts on `http://localhost:3001`
- Docker and Docker Compose installed
- Git remote configured and accessible
- Backend listening on port 3001
- Health endpoint (`/api/health`) responding with 200 OK
**Production mode:**
```bash
npm run build
npm start
```
### Example Deployment Workflow
### Environment Variables
Create a `.env` file in the backend directory:
```bash
# 1. Make code changes and commit
git add . && git commit -m "feat: new API endpoint"
# 2. Deploy from project root
cd /workspace/gravl
scripts/deploy.sh
# 3. Verify deployment
scripts/build-check.sh
# 4. Check logs if needed
docker compose logs gravl-backend
NODE_ENV=development
PORT=3001
DATABASE_URL=postgresql://user:password@localhost:5432/gravl
```
### Container Labels
See `.env.example` (if available) for all supported variables.
All deployed containers include build metadata labels for tracking:
- `org.opencontainers.image.revision` — Git commit SHA
- `org.opencontainers.image.created` — Build timestamp
---
These are used by `scripts/build-check.sh` to detect stale deployments.
## API Endpoints
### Health Check (Monitoring & Deployment)
```
GET /api/health
```
Used by deployment scripts to verify the backend is running and responsive.
**Response:**
```json
{
"status": "ok",
"timestamp": "2026-03-03T18:21:00Z"
}
```
**Status Codes:**
- `200 OK` — Backend is healthy
- `500 Internal Server Error` — Backend has errors (check logs)
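The two status codes above can be mapped to a pass/fail result in a small shell helper. This is a minimal sketch, not part of the Gravl scripts: `interpret_health_status` is a hypothetical name, and treating `000` as "unreachable" relies on curl printing `000` for `%{http_code}` when no HTTP response was received.

```bash
# Hypothetical helper (not from the Gravl repo): translate an HTTP status
# code from /api/health into a message and an exit status.
interpret_health_status() {
  case "$1" in
    200) echo "healthy" ;;
    500) echo "backend error: check logs"; return 1 ;;
    000) echo "no response: backend unreachable"; return 1 ;;
    *)   echo "unexpected status $1"; return 1 ;;
  esac
}

# Possible usage against a running backend:
#   code="$(curl -s -o /dev/null -w '%{http_code}' http://localhost:3001/api/health)"
#   interpret_health_status "$code"
```

Keeping the curl call separate from the interpretation makes the mapping easy to check without a running server.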
### Other Endpoints
(Document your API endpoints here; placeholder for now)
---
## Testing
```bash
# Run unit tests
npm test
# Run integration tests
npm run test:integration
# Run with coverage
npm run test:coverage
npm test # Run all tests
npm run test:watch # Run tests in watch mode
```
---
## Database
## Docker
### Migrations
### Building the Image
```bash
# Run pending migrations
npm run migrate
# Rollback last migration
npm run migrate:rollback
# Create new migration
npm run migrate:create -- my_migration_name
docker build -t gravl-backend:latest .
```
### Connection
### Running in Container
Configure via `.env`:
```
DATABASE_URL=postgresql://user:password@localhost:5432/gravl
```bash
docker run -p 3001:3001 \
-e NODE_ENV=production \
-e DATABASE_URL=postgresql://... \
gravl-backend:latest
```
### With Docker Compose
See the root `docker-compose.yml` for multi-container setup.
---
## Environment Variables
## Deployment
See `.env.example` for all available variables.
### Automated Deployment
Key variables:
- `NODE_ENV` — Development/production mode
- `PORT` — Server port (default: 3001)
- `DATABASE_URL` — PostgreSQL connection string
- `JWT_SECRET` — Token signing secret
The backend is deployed using scripts in the root `scripts/` directory:
- **`scripts/deploy.sh`** — Pulls latest code, builds fresh Docker image, starts container with health checks
- **`scripts/build-check.sh`** — Verifies deployed container matches local git HEAD
### How to Deploy
```bash
cd /workspace/gravl
scripts/deploy.sh
```
### Checking Deployment Status
```bash
cd /workspace/gravl
scripts/build-check.sh
```
For complete deployment documentation, see: **[`docs/DEPLOYMENT.md`](../docs/DEPLOYMENT.md)**
That guide includes:
- Prerequisites and setup
- How to run deploy.sh
- How to check build status
- Troubleshooting (health check failures, stale containers, etc.)
- Recovery procedures (rollbacks, cleanup)
### Health Check Configuration
The backend exposes a health check endpoint at `GET /api/health`. The deployment script (`scripts/deploy.sh`) waits up to 60 seconds for this endpoint to return HTTP 200.
**In your backend code:**
```javascript
app.get('/api/health', (req, res) => {
res.json({ status: 'ok', timestamp: new Date().toISOString() });
});
```
**Deployment timeout:** 60 seconds (12 retries × 5 seconds)
- If this endpoint takes >5 seconds to respond, deployment will time out
- Ensure health check is lightweight (no expensive DB queries)
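The 12 × 5 second polling described above could look roughly like the loop below. This is a sketch under assumptions, not the actual `deploy.sh` code: `wait_for_health`, the `HEALTH_PROBE` override hook, and the `--max-time 5` per-request limit are all illustrative choices.

```bash
# Hypothetical helper, not the real deploy.sh implementation.
# wait_for_health URL [RETRIES] [DELAY_SECONDS]
# Returns 0 as soon as one probe succeeds, 1 once all attempts are used up.
wait_for_health() {
  local url="$1" retries="${2:-12}" delay="${3:-5}" attempt=1
  while [ "$attempt" -le "$retries" ]; do
    # HEALTH_PROBE is an assumed hook so the loop can be exercised without
    # a running backend; by default it is a quiet curl with a 5s limit.
    if ${HEALTH_PROBE:-curl -sf -o /dev/null --max-time 5} "$url"; then
      echo "healthy after $attempt attempt(s)"
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if [ "$attempt" -lt "$retries" ]; then sleep "$delay"; fi
    attempt=$((attempt + 1))
  done
  echo "health check failed after $retries attempts" >&2
  return 1
}
```

With the defaults, `wait_for_health http://localhost:3001/api/health` makes at most 12 probes, which is why a slow health handler can exhaust the budget.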
---
@@ -158,61 +170,84 @@ Key variables:
```
backend/
├── src/
│ ├── api/ # Express route handlers
│ ├── middleware/ # Express middleware
│ ├── models/ # Database models
│ ├── services/ # Business logic
│ └── index.js # App entry point
├── tests/ # Unit and integration tests
├── migrations/ # Database migrations
├── docker/ # Dockerfile
├── .env.example # Environment template
└── README.md # This file
│ ├── index.js # Server entry point
│ ├── routes/ # API endpoints
│ ├── controllers/ # Business logic
│ ├── models/ # Data models (if using ORM)
│ └── middleware/ # Express middleware
├── test/ # Test files
├── Dockerfile # Container image definition
├── package.json # Dependencies
└── README.md # This file
```
---
## Logs
### Local Development
```bash
npm run dev # Logs to stdout
```
### Docker Container
```bash
docker logs gravl-backend # Current logs
docker logs -f gravl-backend # Follow logs in real-time
docker logs --tail 50 gravl-backend # Last 50 lines
```
### In Deployment
All deploy activity is logged to `logs/deploy.log` at the root:
```bash
tail logs/deploy.log
```
---
## Troubleshooting
### API Won't Start
### Health Check Endpoint Not Responding
Check the logs:
```bash
docker compose logs gravl-backend
```
**Symptom:** Deployment fails with "Health check failed after 60s"
Common issues:
- Port 3001 already in use: Kill the process or change the port
- Database connection failed: Verify `.env` DATABASE_URL
- Node modules missing: Run `npm install`
**Causes & Fixes:**
1. **Port 3001 is already in use**
```bash
lsof -i :3001
# Kill the conflicting process or use a different port
```
### Health Check Fails
2. **Backend code has a syntax error**
```bash
npm run dev # Look for error messages
```
Ensure the `/api/health` endpoint is implemented:
3. **Health check endpoint is not implemented**
- Ensure `app.get('/api/health', ...)` is in src/index.js
```javascript
// backend/src/api/health.js
app.get('/api/health', (req, res) => {
res.json({ status: 'ok', timestamp: new Date().toISOString() });
});
```
4. **Database connection is failing**
- Backend might be stuck trying to connect to DB
- Check `DATABASE_URL` in `.env`
- Ensure database is running
### Database Issues
Check Docker container status:
```bash
docker compose ps
docker compose logs gravl-db
```
See **[`docs/DEPLOYMENT.md`](../docs/DEPLOYMENT.md#troubleshooting)** for more deployment troubleshooting.
---
## Contributing
See `CODING-CONVENTIONS.md` in the project root for code style and standards.
See the root project README or CONTRIBUTING.md for guidelines on:
- Code style ([CODING-CONVENTIONS.md](../docs/CODING-CONVENTIONS.md))
- Testing requirements
- Pull request process
---
**Last Updated:** 2026-03-03
**Phase:** 07-03
**Related:** `/docs/DEPLOYMENT.md`
## License
[Specify your license here]
---
*Last updated: 2026-03-03*
+380 -172
@@ -1,17 +1,52 @@
# Gravl Deployment Guide
This guide covers how to deploy the Gravl application, verify deployments, and troubleshoot common issues.
This guide covers how to deploy Gravl's backend and frontend services using automated scripts, verify deployment status, and handle troubleshooting and recovery scenarios.
---
## Overview
Gravl uses Docker and Docker Compose for containerization. Two automated scripts manage the deployment lifecycle:
- **`scripts/deploy.sh`**: Pulls latest code, builds fresh images (with `--no-cache` to prevent stale assets), and starts containers with health checks
- **`scripts/build-check.sh`**: Verifies that running containers match the current git HEAD (detects stale deployments)
---
## Prerequisites
- Docker and Docker Compose installed
- Git repository with remote configured
- Access to `/workspace/gravl` directory
- Backend API listening on `http://localhost:3001/api/health`
Before deploying, ensure you have:
## Deployment Script
1. **Docker & Docker Compose** installed and running
```bash
docker --version
docker compose version
```
### Running a Deployment
2. **Git** configured with push/pull access to the repository
```bash
git remote -v
```
3. **Network access** to required ports:
- Backend: `localhost:3001` (health check at `http://localhost:3001/api/health`)
- Frontend: `localhost:3000` (or configured in `docker-compose.yml`)
4. **Sufficient disk space** for Docker images and volumes
```bash
docker system df
```
5. **No conflicting services** using ports 3000-3001
```bash
lsof -i :3000 -i :3001 # (macOS/Linux only)
```
---
## How to Run `deploy.sh`
### Basic Usage
```bash
cd /workspace/gravl
@@ -20,57 +55,98 @@ scripts/deploy.sh
### What It Does
1. **Pulls latest code:** `git pull`
2. **Captures build metadata:**
- Git commit SHA
1. **Git Pull**: Fetches and merges latest code from remote
- Exits if merge conflicts occur (manual resolution required)
2. **Captures Metadata**:
- Current git commit hash
- Build timestamp
3. **Builds fresh images:** `docker compose build --no-cache`
- `--no-cache` ensures all layers are rebuilt (prevents stale assets)
4. **Restarts containers:** `docker compose up -d --force-recreate`
5. **Health check:** Polls `/api/health` for up to 60 seconds
6. **Logs deployment:** Records all steps to `logs/deploy.log`
- These are stored as Docker image labels for later verification
3. **Builds Docker Images** (`--no-cache`):
- Rebuilds all layers (no caching) to prevent stale assets
- Applies git commit and build timestamp as labels
4. **Starts Containers**:
- Uses `docker compose up -d --force-recreate` to ensure clean start
- Both backend and frontend containers are started
5. **Health Check**:
- Waits up to 60 seconds for backend to respond on `/api/health`
- Retries every 5 seconds (12 attempts max)
- Fails with exit code 1 if health check times out
### Output Example
### Exit Codes
| Code | Meaning | Next Steps |
|------|---------|-----------|
| 0 | Success | Deployment complete; containers healthy |
| 1 | Failure | See troubleshooting below |
### Logs
All deploy activity is logged to `logs/deploy.log`:
```bash
tail -50 logs/deploy.log # Last 50 lines
grep ERROR logs/deploy.log # Find errors
```
[2026-03-03 18:30:00] === Deploy started ===
[2026-03-03 18:30:01] Pulling latest code...
[2026-03-03 18:30:05] Commit: 53f4df6 | Date: 2026-03-03T18:30:00Z
[2026-03-03 18:30:06] Building images (--no-cache)...
[2026-03-03 18:30:45] Starting containers...
[2026-03-03 18:30:50] Health check...
[2026-03-03 18:30:55] Backend healthy
[2026-03-03 18:30:56] === Deploy complete: 53f4df6 ===
```
### Environment Variables
Optional env vars can be set before running `deploy.sh`:
| Variable | Default | Purpose |
|----------|---------|---------|
| `GIT_COMMIT` | auto-detected | Override git commit label (not recommended) |
| `BUILD_DATE` | auto-detected | Override build timestamp (not recommended) |
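"Auto-detected" with an optional override can be expressed with shell default expansion. The sketch below is an assumption about how such detection might work, not the code in `deploy.sh`; `resolve_build_metadata` is a hypothetical name.

```bash
# Sketch only: resolve build metadata, letting pre-set values win.
# resolve_build_metadata is a hypothetical name; deploy.sh may differ.
resolve_build_metadata() {
  # Fall back to "unknown" outside a git checkout; build-check.sh treats
  # that value the same as a missing label.
  GIT_COMMIT="${GIT_COMMIT:-$(git rev-parse HEAD 2>/dev/null || echo unknown)}"
  # UTC timestamp in the same shape as the examples (2026-03-03T18:21:00Z).
  BUILD_DATE="${BUILD_DATE:-$(date -u +%Y-%m-%dT%H:%M:%SZ)}"
  export GIT_COMMIT BUILD_DATE
}
```

Exporting both variables makes them visible to child processes such as `docker compose build`, which is one way the values could reach the image labels.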
---
## Checking Deployment Status
## How to Check Build Status (`build-check.sh`)
### Build Status Check
Run this command anytime to verify deployed containers match your local code:
```bash
cd /workspace/gravl
scripts/build-check.sh
```
### Output Example
**Healthy deployment:**
```
Local HEAD: 53f4df6 (53f4df6f8a5c4d2e1f0a9b8c7d6e5f4a3b2c1d0)
Local HEAD: abc1234 (abc1234567890abcdef1234567890abcdef123456)
[gravl-backend] Built: 53f4df6 on 2026-03-03T18:30:00Z
[gravl-backend] Built: abc1234 on 2026-03-03T18:21:00Z
[gravl-backend] OK: up to date
[gravl-frontend] Built: 53f4df6 on 2026-03-03T18:30:00Z
[gravl-frontend] Built: abc1234 on 2026-03-03T18:21:00Z
[gravl-frontend] OK: up to date
```
### What the Check Tells You
**Stale containers (code updated, not redeployed):**
```
Local HEAD: xyz5678 (xyz5678...)
- **OK: up to date** — Containers match the local git commit (everything is current)
- **STALE: container is behind local code** — Code has changed but containers haven't been redeployed yet
- **WARNING: no build label found** — Container is old (pre-07-02) and lacks build tracking labels
[gravl-backend] Built: abc1234 on 2026-03-03T18:21:00Z
[gravl-backend] STALE: container is behind local code — run scripts/deploy.sh
[gravl-frontend] Built: abc1234 on 2026-03-03T18:21:00Z
[gravl-frontend] STALE: container is behind local code — run scripts/deploy.sh
```
**Missing labels (container built manually, not via deploy.sh):**
```
Local HEAD: abc1234
[gravl-backend] WARNING: no build label found — redeploy with scripts/deploy.sh to add tracking
[gravl-frontend] Not running
```
### Exit Codes
| Code | Meaning |
|------|---------|
| 0 | All checks completed (warnings don't fail; see output for status) |
| (no error exit) | Missing containers are noted but don't cause failure |
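Because `build-check.sh` exits 0 even when it reports problems, automation that needs a failing exit status has to inspect the output. The following is a minimal sketch of such a filter; `build_status_exit_code` is a hypothetical name and is not part of the repository.

```bash
# Hypothetical filter; build-check.sh itself always exits 0.
# Reads build-check.sh output on stdin and returns:
#   0 = no problem lines were seen
#   1 = at least one STALE, WARNING, or "Not running" line was seen
build_status_exit_code() {
  local line status=0
  # The "|| [ -n ... ]" part also handles a final line without a newline.
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      *"STALE:"* | *"WARNING:"* | *"Not running"*) status=1 ;;
    esac
  done
  return "$status"
}

# Possible usage:
#   scripts/build-check.sh | build_status_exit_code || echo "needs attention"
```

The filter reads its input to the end before returning, so the producing script is never cut off mid-write.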
---
@@ -78,215 +154,347 @@ Local HEAD: 53f4df6 (53f4df6f8a5c4d2e1f0a9b8c7d6e5f4a3b2c1d0)
### Health Check Failures
**Symptom:** Deployment fails with "ERROR: Health check failed after 60s"
**Symptom:** `ERROR: Health check failed after 60s`
**Possible Causes & Solutions:**
**Causes & Solutions:**
| Cause | Solution |
|-------|----------|
| Backend not starting | Check logs: `docker compose logs gravl-backend` |
| Health endpoint not implemented | Implement `GET /api/health` in backend (returns `200 OK`) |
| Network issues | Verify network: `docker network inspect gravl` or restart: `docker compose restart` |
| Port already in use | Check: `lsof -i :3001` and kill the process or change port |
| Insufficient resources | Free disk space: `df -h` or reduce image size |
1. **Backend service didn't start**
```bash
docker logs gravl-backend | tail -20
# Look for:
# - Port conflicts (EADDRINUSE)
# - Missing dependencies (module not found)
# - Database connection errors
```
2. **Port 3001 is already in use**
```bash
lsof -i :3001 # Find what's using it
docker port gravl-backend # Check exposed port
kill -9 <PID> # Kill conflicting process (if safe)
scripts/deploy.sh # Retry
```
3. **Network issue between host and container**
```bash
docker inspect gravl-backend --format '{{.NetworkSettings.IPAddress}}'
curl -sf http://<container-ip>:3001/api/health # Test directly
```
4. **Backend code has syntax error**
```bash
docker logs gravl-backend 2>&1 | grep -i "syntax\|error\|exception"
# Check backend/src/index.js for obvious errors
# Revert recent changes: git log --oneline -5 && git checkout <good-commit>
```
**Quick recovery:**
**Manual Restart:**
```bash
docker compose restart gravl-backend
# Wait a few seconds
curl -sf http://localhost:3001/api/health
# 1. Stop everything
docker compose down
# 2. Check backend logs
docker compose up -d gravl-backend
sleep 5
docker logs gravl-backend | tail -50
# 3. If logs show errors, fix code and retry
git diff HEAD~1..HEAD backend/src/
# ... fix issues ...
scripts/deploy.sh
```
---
### Stale Containers
**Symptom:** `build-check.sh` shows "STALE: container is behind local code"
**Symptom:** `build-check.sh` shows `STALE: container is behind local code`
**Cause:** Code has been updated but containers haven't been redeployed.
**Causes:**
- Code was updated (`git pull`) but `deploy.sh` hasn't been run
- Deployment failed partway through
- Manual restart without redeploy
**Solution:**
```bash
scripts/deploy.sh
scripts/build-check.sh # Should now show OK
scripts/build-check.sh # Verify update
```
---
### Missing Docker Labels
### Missing Build Labels
**Symptom:** `build-check.sh` shows "WARNING: no build label found"
**Symptom:** `WARNING: no build label found — redeploy with scripts/deploy.sh`
**Cause:** Containers were built before phase 07-02 (before labels were added).
**Causes:**
- Container was built with `docker compose build` directly (not via `deploy.sh`)
- Container predates the labeling system
**Solution:**
```bash
scripts/deploy.sh # Rebuilds with labels
```
---
### Deployment Hangs
**Symptom:** `scripts/deploy.sh` doesn't complete or appears stuck.
**Possible Causes & Solutions:**
| Symptom | Solution |
|---------|----------|
| Stuck at "Building images" | Docker build is slow. Check: `docker builder prune` to free cache |
| Stuck at "Health check" | Backend not responding. Try: `docker compose logs` to see errors |
| Git pull conflicts | Resolve conflicts manually: `cd /workspace/gravl && git status` |
**Force Stop:**
```bash
# Kill the deploy script
pkill -f scripts/deploy.sh
# Manually check status
docker compose ps
docker compose logs
```
---
## Rollback Procedures
### Quick Rollback
If the current deployment is broken:
```bash
# Revert to previous commit
git reset --hard HEAD~1
# Redeploy
scripts/deploy.sh
# Verify
scripts/build-check.sh
```
### Multi-Commit Rollback
If you need to go back several commits:
```bash
# View recent commits
git log --oneline -10
# Rollback to a specific commit (example: abc1234)
git reset --hard abc1234
# Redeploy
# Re-deploy to add labels
scripts/deploy.sh
```
### Rollback Verification
---
After rolling back, verify the system is stable:
### Container Won't Start (Restarting / Exited)
**Symptom:** `docker compose ps` shows container in "Exited" state
**Steps:**
1. **Check container logs**
```bash
docker logs gravl-backend --tail 50
docker logs gravl-frontend --tail 50
```
2. **Check docker-compose.yml for typos**
```bash
docker compose config # Validates syntax
```
3. **Inspect health check endpoint**
```bash
curl -v http://localhost:3001/api/health
# Should see HTTP 200, not 404 or 500
```
4. **If all else fails, clean rebuild**
```bash
docker compose down
docker rmi gravl-backend gravl-frontend
docker system prune -f
scripts/deploy.sh
```
---
### Database Connection Issues
**Symptom:** Backend logs show `Connection refused` or `ECONNREFUSED`
**Causes:**
- Database service not running
- Wrong host/port in `.env` or backend code
- Network issue between containers
**Solutions:**
1. **Check database service status** (if applicable)
```bash
docker compose ps # All services running?
docker network ls # Check gravl network exists
```
2. **Verify connection string in `.env`**
```bash
cat .env | grep -i database
# Should match docker-compose.yml service name (e.g., gravl-db:5432)
```
3. **Test connection from backend container**
```bash
docker exec gravl-backend ping gravl-db
docker exec gravl-backend nc -zv gravl-db 5432 # TCP check (PostgreSQL does not speak HTTP; requires nc in the image)
```
---
### Disk Space Issues
**Symptom:** `no space left on device` during build
**Solution:**
```bash
# Check containers match the previous code
scripts/build-check.sh
# Check disk usage
docker system df
# Check API is healthy
curl -sf http://localhost:3001/api/health | jq .
# Clean up unused images/containers
docker system prune -a --volumes
# Check frontend is responsive
curl -sf http://localhost:3000/ | head -c 500
# Then retry deploy
scripts/deploy.sh
```
---
## Manual Container Cleanup
## Recovery Procedures
If containers become corrupted or stuck:
### Manual Rollback to Previous Commit
Use this when the deployed code is broken and you need to quickly revert.
```bash
# Stop all containers
# 1. Find the last good commit
git log --oneline -10 # Review recent commits
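# 2. Check out the known-good commit
# (deploy.sh starts with `git pull`, which fails on a detached HEAD and
# otherwise fast-forwards to the remote tip; reverting the bad commit
# on the branch with `git revert` may be needed instead)
git checkout <commit-hash>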
# 3. Redeploy
scripts/deploy.sh
# 4. Verify
scripts/build-check.sh
curl -sf http://localhost:3001/api/health
# 5. Document the incident
echo "Rolled back to <commit-hash> due to <reason>" >> logs/rollback.log
```
### Emergency Container Cleanup
Use this when containers are hung, corrupted, or in an unknown state.
```bash
# 1. Stop all services
docker compose down
# Remove volumes (WARNING: deletes data)
docker compose down -v
# 2. Remove images (forces fresh rebuild)
docker rmi gravl-backend gravl-frontend
# Verify they're gone
docker compose ps
# 3. Clear unused volumes (optional; use with caution!)
# docker volume prune
# Full redeploy
# 4. Rebuild from scratch
scripts/deploy.sh
# 5. Verify all containers running and healthy
docker compose ps
scripts/build-check.sh
curl -sf http://localhost:3001/api/health
```
**Safety Check:** If your data is in Docker volumes, `docker volume prune` will destroy them. Skip this step unless you're sure you don't need the data.
### Staged Rollback (Zero-Downtime)
If you're running a blue-green deployment setup:
```bash
# 1. Deploy to green environment
cd /path/to/green
git pull && docker compose build --no-cache && docker compose up -d
# 2. Test green (health check, smoke tests)
curl -sf http://green-backend:3001/api/health
# 3. Switch traffic to green (via load balancer or DNS)
# (Implementation depends on your infrastructure)
# 4. If green has issues, revert traffic to blue immediately
# (Blue kept serving; no downtime)
# 5. Debug green offline
docker logs gravl-backend
```
---
## Monitoring & Logs
## Monitoring After Deployment
### Deployment Log
### Immediate Checks (after `deploy.sh` completes)
```bash
tail -f logs/deploy.log
# Containers are running
docker compose ps
# Backend is healthy
curl -sf http://localhost:3001/api/health | jq .
# Containers match local code
scripts/build-check.sh
# Logs have no errors
docker logs gravl-backend 2>&1 | grep -i error | head -5
```
### Container Logs
### Ongoing Checks (periodically)
```bash
# Backend logs
docker compose logs gravl-backend
# Run build-check regularly (cron every 30 min, or manual)
scripts/build-check.sh
# Frontend logs
docker compose logs gravl-frontend
# Monitor resource usage
docker stats gravl-backend gravl-frontend
# All logs with timestamps
docker compose logs --timestamps --follow
# Audit logs for issues
docker logs --since 1h gravl-backend 2>&1 | grep ERROR
```
### Build Info
### Example Monitoring Script
```bash
# List deployed images
docker images | grep gravl
#!/bin/bash
# Save as scripts/health-monitor.sh
set -euo pipefail
# Inspect container labels (build metadata)
docker inspect gravl-backend | jq '.Config.Labels'
HEALTHY=true
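# Check containers running
# (capture output first: with pipefail, `cmd | grep -q` can fail when grep exits early)
PS_OUT="$(docker compose ps)" || HEALTHY=false
grep -q "Up" <<<"$PS_OUT" || HEALTHY=false
# Check health endpoint
curl -sf http://localhost:3001/api/health || HEALTHY=false
# Check for stale containers
CHECK_OUT="$(scripts/build-check.sh)" || HEALTHY=false
if grep -q "STALE" <<<"$CHECK_OUT"; then HEALTHY=false; fi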
if [ "$HEALTHY" = "true" ]; then
echo "[$(date)] Gravl is healthy ✓"
else
echo "[$(date)] Gravl has issues! See above." >&2
exit 1
fi
```
---
## Best Practices
1. **Always test in staging first** — Validate the deploy in a non-production environment
2. **Check status before deploying** — Run `scripts/build-check.sh` to ensure no stale containers
3. **Review logs after deployment** — Check `logs/deploy.log` for warnings or errors
4. **Plan rollbacks** — Know which commits are stable before deploying
5. **Monitor health endpoints** — Regularly ping `/api/health` in production
6. **Backup before major changes** — Tag releases in git before significant deployments
7. **Use semantic commits** — Make it easy to identify which commits introduced changes
1. **Always run `build-check.sh` before deploying changes**
- Ensures you know current state
- Catches stale containers early
2. **Review changes before deploying**
```bash
git log --oneline -5 # Recent commits
git diff origin/main..HEAD # What will be deployed
```
3. **Test in staging first**
- Separate staging environment for pre-production testing
- Deploy to staging, verify, then deploy to production
4. **Keep logs rotated**
- `logs/deploy.log` can grow large
- Use `logrotate` or manual cleanup: `tail -1000 logs/deploy.log > logs/deploy.log.1 && > logs/deploy.log`
5. **Automate regular checks**
- Cron job to run `build-check.sh` every 30 minutes
- Send alerts if "STALE" or "WARNING" found
6. **Document rollbacks**
- Always log why you rolled back
- Review patterns (e.g., "rolled back 3 times this week" = code review process failing)
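The manual log cleanup from practice 4 can be wrapped in a small function so the line count and file path are not hard-coded. This is a sketch only; `rotate_log` is a hypothetical name and not one of the repository's scripts.

```bash
# Sketch of the manual cleanup as a reusable function (hypothetical name).
# rotate_log FILE [KEEP_LINES]: keep the last KEEP_LINES lines in FILE.1,
# then truncate FILE in place so later appends land in the same file.
rotate_log() {
  local file="$1" keep="${2:-1000}"
  [ -f "$file" ] || return 0   # nothing to rotate
  tail -n "$keep" "$file" > "$file.1" && : > "$file"
}

# Possible usage:
#   rotate_log logs/deploy.log 1000
```

Truncating in place rather than moving the file keeps the path stable for anything that appends to it; for unattended rotation, `logrotate` remains the more robust option.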
---
## FAQ
## See Also
**Q: Can I deploy without building (e.g., just restart containers)?**
A: No. The script always rebuilds to prevent stale code. This is intentional for safety.
**Q: How long should a deployment take?**
A: Typically 60-90 seconds (build time + health check). If longer, check Docker build performance.
**Q: What if I need to deploy a specific commit?**
A: Check it out first, then deploy:
```bash
git checkout <commit-sha>
scripts/deploy.sh
```
**Q: Can I skip the health check?**
A: Not recommended. The health check prevents deploying broken code. Fix the health endpoint instead.
**Q: What data is lost if I rollback?**
A: Container rollback only reverts code. Database data persists unless you `docker compose down -v`.
- **Testing**: [DEPLOYMENT_TEST_PLAN.md](./DEPLOYMENT_TEST_PLAN.md) — comprehensive test scenarios
- **Code style**: [CODING-CONVENTIONS.md](./CODING-CONVENTIONS.md)
- **Architecture**: Backend README or architecture docs (if available)
---
**Last Updated:** 2026-03-03
**Document Version:** 1.0
**Phase:** 07-03
*Last updated: 2026-03-03 | Maintained by: Gravl Development Team*
+64 -26
@@ -1,68 +1,106 @@
#!/bin/bash
# Compare deployed container versions against local git HEAD
# Warns if containers are stale (built from an older commit)
# Gravl Build Status Checker
#
# Purpose:
# Verifies that deployed containers match the current git HEAD.
# Warns if containers are stale (built from older commits).
# Helps you catch situations where code was updated but not redeployed.
#
# How it works:
# 1. Gets current local git commit (HEAD)
# 2. Queries each container's build labels
# 3. Compares container label commit vs local HEAD
# 4. Reports status: "OK", "STALE", or "WARNING"
#
# Exit codes:
# 0 = All checks completed (see output for individual status)
# (Warnings don't cause non-zero exit)
#
# Usage:
# ./scripts/build-check.sh
#
# What it shows:
# - Local git HEAD commit SHA
# - Each container's built commit SHA (from Docker labels)
# - Whether containers are up-to-date or stale
# - Warnings if labels are missing (pre-07-02 containers)
# Example output:
# Local HEAD: abc1234 (abc1234567890abcdef...)
#
# Label fields read:
# - org.opencontainers.image.revision = Git commit SHA embedded by deploy.sh
# - org.opencontainers.image.created = Build timestamp (ISO 8601 format)
#
# Exit codes:
# 0 = All containers up to date
# 1+ = Warnings or stale containers detected
#
# See: /docs/DEPLOYMENT.md for troubleshooting
# [gravl-backend] Built: abc1234 on 2026-03-03T18:21:00Z
# [gravl-backend] OK: up to date
# [gravl-frontend] Built: abc1234 on 2026-03-03T18:21:00Z
# [gravl-frontend] OK: up to date
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO_DIR="$(dirname "$SCRIPT_DIR")"
cd "$REPO_DIR"
# Get the current local git commit (what's checked out locally)
LOCAL_COMMIT=$(git rev-parse HEAD)
echo "Local HEAD: $(git rev-parse --short HEAD) ($LOCAL_COMMIT)"
echo ""
# Check a single container's build status
# Args: $1 = container name
# ============================================================================
# check() helper function
# ============================================================================
# Queries a container's build labels and compares against local HEAD.
#
# Parameters:
# $1 = Container name (e.g., "gravl-backend")
#
# Label fields used:
# org.opencontainers.image.revision = commit hash when image was built
# Format: 40-character SHA (same as git rev-parse HEAD)
# Set by: scripts/deploy.sh -> docker compose build args
#
# org.opencontainers.image.created = RFC3339 timestamp when image was built
# Format: 2026-03-03T18:21:00Z
# Set by: scripts/deploy.sh -> docker compose build args
# Purpose: Shows humans when the image was built (for diagnostics)
#
# Status outcomes:
# - "Not running": Container doesn't exist or isn't running
# - "WARNING": Container exists but has no revision label
# Fix: Re-deploy with scripts/deploy.sh
# - "OK": Container label commit = local HEAD (up to date)
# - "STALE": Container label commit != local HEAD
# Fix: Run scripts/deploy.sh to update container
check() {
local name="$1"
# Container not running
# Check if container exists and is running
if ! docker inspect "$name" &>/dev/null; then
echo "[$name] Not running"
return
fi
# Read Docker labels set by deploy.sh
# If labels are missing, container was built before phase 07-02
# Extract build labels from container config
# These labels are set in the docker-compose.yml build args,
# and the Dockerfile applies them as image labels (e.g. via ARG + LABEL).
local commit date
commit=$(docker inspect "$name" --format '{{index .Config.Labels "org.opencontainers.image.revision"}}' 2>/dev/null)
date=$(docker inspect "$name" --format '{{index .Config.Labels "org.opencontainers.image.created"}}' 2>/dev/null)
# Check if revision label exists
if [ -z "$commit" ] || [ "$commit" = "unknown" ]; then
echo "[$name] WARNING: no build label found — redeploy with scripts/deploy.sh to add tracking"
return
fi
# Display container info
# Display when this container's image was built
echo "[$name] Built: ${commit:0:7} on ${date:-unknown}"
# Compare container commit to local HEAD
# Compare container's commit against local HEAD
# If they match, container is up to date.
# If they differ, code has changed locally but container hasn't been redeployed.
if [ "$commit" = "$LOCAL_COMMIT" ]; then
echo "[$name] OK: up to date"
echo "[$name] OK: up to date"
else
echo "[$name] STALE: container is behind local code — run scripts/deploy.sh"
echo "[$name] STALE: container is behind local code — run scripts/deploy.sh"
fi
}
# Check both containers
# ============================================================================
# Check Each Service
# ============================================================================
# These are the service names defined in docker-compose.yml.
# Adjust if you rename services.
check "gravl-backend"
check "gravl-frontend"
+98 -34
@@ -1,24 +1,31 @@
#!/bin/bash
# Gravl deployment script
# Prevents stale containers by always building fresh with --no-cache
#
# Gravl Deployment Script
#
# Purpose:
# Automates the deployment of Gravl services to production/staging.
# Ensures fresh builds and verifies service health after startup.
#
# Prevents stale containers by always building fresh with --no-cache:
# The --no-cache flag rebuilds all Docker layers from scratch.
# This prevents stale application code, assets, or dependencies
# from being cached and deployed. Essential for reliable deployments.
#
# Workflow:
# 1. Pull latest code from git
# 2. Capture build metadata (commit hash, timestamp)
# 3. Build Docker images (--no-cache for freshness)
# 4. Start containers with new images
# 5. Health check: wait for backend to respond
#
# Exit codes:
# 0 = Success (deployment complete, services healthy)
# 1 = Failure (see error message in logs)
#
# Usage:
# ./scripts/deploy.sh
#
# What it does:
# 1. Pulls latest code from git
# 2. Captures build metadata (commit SHA, timestamp)
# 3. Builds fresh Docker images with --no-cache (no layer caching)
# 4. Restarts containers to use new images
# 5. Polls /api/health endpoint until backend is ready
# 6. Logs all steps to logs/deploy.log
#
# Rationale for --no-cache:
# Docker reuses a cached layer when the instruction and its COPY/ADD inputs are unchanged,
# so RUN steps that depend on external state (for example an unpinned dependency install)
# can silently reuse old results. Using --no-cache rebuilds every layer on each deploy.
# Trade-off: Slightly slower builds (30-60s vs 10-20s with cache), but safer.
#
# See: /docs/DEPLOYMENT.md for troubleshooting
# Logs:
# Messages written through log() are appended to logs/deploy.log
# (follow with: tail -f logs/deploy.log)
set -euo pipefail
@@ -27,49 +34,106 @@ REPO_DIR="$(dirname "$SCRIPT_DIR")"
LOG_FILE="$REPO_DIR/logs/deploy.log"
BACKEND_HEALTH="http://localhost:3001/api/health"
# Logging helper: prints timestamp + message to both stdout and log file
log() {
echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*" | tee -a "$LOG_FILE"
}
# Ensure logs directory exists
mkdir -p "$REPO_DIR/logs"
cd "$REPO_DIR"
log "=== Deploy started ==="
# ============================================================================
# STEP 1: Git Pull
# ============================================================================
# Fetches the latest code from the remote and merges it into the current branch.
# Fails on merge conflicts or on local changes the merge would overwrite
# (manual intervention required).
log "Pulling latest code..."
git pull
# ============================================================================
# STEP 2: Capture Build Metadata
# ============================================================================
# Build metadata is embedded in the Docker images as labels, which containers inherit.
# build-check.sh reads these labels to verify deployed containers match local HEAD.
#
# Labels:
# org.opencontainers.image.revision = git commit hash (40-char SHA)
# Purpose: Track which commit the image was built from
# Example: abc1234567890abcdef1234567890abcdef12345
#
# org.opencontainers.image.created = RFC3339 timestamp
# Purpose: Track when the image was built
# Example: 2026-03-03T18:21:00Z
GIT_COMMIT=$(git rev-parse HEAD)
BUILD_DATE=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
log "Commit: $(git rev-parse --short HEAD) | Date: $BUILD_DATE"
# ============================================================================
# STEP 3: Build Docker Images (--no-cache)
# ============================================================================
# Why --no-cache?
# Docker reuses a cached layer when the instruction and its COPY/ADD inputs are unchanged.
# RUN steps that depend on external state can therefore go stale.
# Example: with an unpinned dependency, a cached npm install keeps the old resolved version.
# --no-cache forces a full rebuild of all layers every time.
#
# GIT_COMMIT and BUILD_DATE are exported so docker compose can pass them as the
# build args declared in docker-compose.yml; the Dockerfile can then use them
# in RUN instructions or labels.
log "Building images (--no-cache to prevent stale assets)..."
export GIT_COMMIT BUILD_DATE
docker compose build --no-cache
# ============================================================================
# STEP 4: Start Containers with New Images
# ============================================================================
# docker compose up -d --force-recreate:
# -d = Run in background (detached mode)
# --force-recreate = Stop and remove existing containers, start fresh
# Ensures old containers with old images are not reused.
#
# This step also creates (or reuses) the Compose network and attaches the containers to it.
log "Starting containers..."
docker compose up -d --force-recreate
# ============================================================================
# STEP 5: Health Check
# ============================================================================
# Waits for backend to respond on /api/health endpoint.
# This proves the service started correctly and is ready for traffic.
#
# Timeout configuration:
# Loop: 12 iterations
# Interval: 5 seconds per iteration
# Total: 60 seconds max wait time
#
# Why 60 seconds?
# - Docker startup: ~5-10 seconds
# - Node.js app initialization: ~5 seconds
# - Database connection: ~5-10 seconds
# - Buffer for system load: ~30 seconds
#
# If this timeout is too short, you may see false negatives (healthy app fails check).
# If too long, deployment takes unnecessarily long to fail.
#
# Endpoint details:
# URL: http://localhost:3001/api/health
# Method: GET
# Expected status: 200
# Should complete in <1 second
log "Health check: waiting for backend (60s timeout)..."
for i in $(seq 1 12); do
if curl -sf "$BACKEND_HEALTH" >/dev/null 2>&1; then
log "Backend healthy"
break
fi
if [ "$i" -eq 12 ]; then
log "✗ ERROR: Health check failed after 60s"
log " Try: docker logs --tail 20 gravl-backend"
exit 1
fi
log " Waiting... ($i/12 attempts, 5s intervals)"
sleep 5
done
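The polling loop above could be factored into a reusable helper. This is a minimal sketch rather than the script's actual implementation: `wait_for` is a hypothetical name, and its arguments (attempt count, sleep interval in seconds, then the command to retry) are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not part of deploy.sh):
# runs a command up to <attempts> times, sleeping <interval> seconds
# between failed attempts. Returns 0 on the first success, 1 otherwise.
wait_for() {
    local attempts="$1" interval="$2"
    shift 2
    local i
    for i in $(seq 1 "$attempts"); do
        if "$@"; then
            return 0
        fi
        # Skip the sleep after the final attempt so failure is reported promptly.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$interval"
        fi
    done
    return 1
}

# Example call mirroring the loop above (not executed here):
#   wait_for 12 5 curl -sf -o /dev/null "$BACKEND_HEALTH" || exit 1
```

Because the helper sleeps only between attempts, 12 attempts at 5-second intervals wait at most 55 seconds plus the time spent in the requests themselves.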