feat(08-01): Health monitoring & logging infrastructure

- Set up Winston structured logging with console and file outputs
- Create GET /api/health endpoint with uptime, database status, response times
- Add request logging middleware (method, path, statusCode, duration)
- Create health monitoring module with database connectivity checks
- Log all HTTP requests with timing information
- Log auth events (login, register) and data modifications
- Replace console.log/error with structured logger calls
- Update backend README with logging configuration documentation
- Add tests for health endpoint and logging middleware
- Logs directory: logs/combined.log and logs/error.log

Deliverables met:
✓ Structured logging (Winston) integrated
✓ Enhanced health endpoint with uptime & database info
✓ Request logging middleware attached to all routes
✓ Comprehensive logging documentation in README.md
✓ Tests passing for health and logging functionality
✓ All critical operations logged with context
2026-03-03 21:28:46 +01:00
parent 1104f6360e
commit e09017d2e0
11 changed files with 867 additions and 114 deletions
@@ -8,7 +8,8 @@ The Gravl backend is a Node.js/Express application that provides:
 - REST API for exercise data management
 - User authentication and authorization
 - Integration with frontend via HTTP
- Health check endpoint for deployment monitoring
+- Structured logging for monitoring and debugging
+- Health check endpoint with system metrics for deployment monitoring

 ---

@@ -56,6 +57,64 @@ See `.env.example` (if available) for all supported variables.

 ---

+## Logging & Monitoring
+
+### Structured Logging (Winston)
+
+The backend uses Winston for structured logging with multiple transports:
+
+**Console Output (Development):**
+- Human-readable format with timestamps and color coding
+- Logs all INFO, WARN, ERROR, and DEBUG messages
+
+**File Output:**
+- `logs/combined.log` — All application logs
+- `logs/error.log` — Error-level logs only
+- Max file size: 5 MB per file, rotating across up to 5 files
+
+**Log Levels:**
+- `debug` — Development debugging info
+- `info` — General information events
+- `warn` — Warning conditions
+- `error` — Error conditions
+
+**Example Log Format:**
+```
+2026-03-03 18:21:00 [info] User registered { userId: 42, email: 'user@example.com' }
+2026-03-03 18:21:15 [info] HTTP Request { method: 'GET', path: '/api/health', statusCode: 200, duration: '12ms' }
+```
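
A minimal sketch of a logger configuration that would match the transports described above. It is not the project's actual `src/utils/logger.js`: the `NODE_ENV`-based level switch, the `logs` path resolution, and the exact console line layout are assumptions made for illustration.

```javascript
const path = require('node:path');
const winston = require('winston');

const { combine, timestamp, colorize, printf, json } = winston.format;

// Console line layout: "<timestamp> [level] message {meta}" (assumed layout).
const consoleLine = printf(({ timestamp, level, message, ...meta }) => {
  const extra = Object.keys(meta).length ? ` ${JSON.stringify(meta)}` : '';
  return `${timestamp} [${level}] ${message}${extra}`;
});

// Assumption: log files live in ./logs relative to the working directory.
const logDir = path.join(process.cwd(), 'logs');

// 5 MB per file, at most 5 files kept per transport.
const rotation = { maxsize: 5 * 1024 * 1024, maxFiles: 5 };

const logger = winston.createLogger({
  // Assumption: debug output outside production, info and above otherwise.
  level: process.env.NODE_ENV === 'production' ? 'info' : 'debug',
  format: timestamp({ format: 'YYYY-MM-DD HH:mm:ss' }),
  transports: [
    // All application logs, written as JSON lines.
    new winston.transports.File({
      filename: path.join(logDir, 'combined.log'),
      format: json(),
      ...rotation,
    }),
    // Error-level logs only.
    new winston.transports.File({
      filename: path.join(logDir, 'error.log'),
      level: 'error',
      format: json(),
      ...rotation,
    }),
    // Colorized, human-readable console output.
    new winston.transports.Console({ format: combine(colorize(), consoleLine) }),
  ],
});

module.exports = logger;
```

With a logger like this, `logger.info('User registered', { userId: 42 })` would write one entry to the console and one to `logs/combined.log`.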
+
+### Request Logging Middleware
+
+All HTTP requests are automatically logged with:
+- HTTP method and path
+- Response status code
+- Request duration (milliseconds)
+- Client IP address
+- User-Agent
+
+Example:
+```
+[info] HTTP Request { method: 'POST', path: '/api/logs', statusCode: 200, duration: '45ms' }
+```
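
The middleware described above could look roughly like the sketch below. It is an illustration rather than the contents of `src/middleware/requestLogger.js`: the factory name `createRequestLogger` and the metadata key names `ip` and `userAgent` are hypothetical, and `logger` is assumed to be any object with an `info(message, meta)` method, such as a Winston logger.

```javascript
// Hypothetical sketch of an Express-style request logging middleware.
function createRequestLogger(logger) {
  return function requestLogger(req, res, next) {
    const start = process.hrtime.bigint();

    // 'finish' is emitted once the response has been fully handed off,
    // so res.statusCode holds its final value here.
    res.on('finish', () => {
      const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
      logger.info('HTTP Request', {
        method: req.method,
        path: req.path || req.url, // req.path is set by Express; req.url is the plain Node fallback
        statusCode: res.statusCode,
        duration: `${Math.round(elapsedMs)}ms`,
        ip: req.ip, // set by Express
        userAgent: req.headers['user-agent'],
      });
    });

    next();
  };
}
```

Registering it before the routes, for example `app.use(createRequestLogger(logger))`, would let it observe every request that reaches the application.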
+
+### Accessing Logs
+
+**Local Development:**
+```bash
+npm run dev  # Logs print to console in real-time
+tail -f logs/combined.log  # Follow all logs
+tail -f logs/error.log     # Follow errors only
+```
+
+**Docker Container:**
+```bash
+docker logs -f gravl-backend           # Real-time logs
+docker logs --tail 100 gravl-backend   # Last 100 lines
+```
+
+---
+
 ## API Endpoints

 ### Health Check (Monitoring & Deployment)
@@ -64,23 +123,46 @@ See `.env.example` (if available) for all supported variables.
 GET /api/health
 ```

-Used by deployment scripts to verify the backend is running and responsive.
+Returns system status, uptime, and database connectivity. Deployment scripts use this endpoint to verify that the backend is operational.

-**Response:**
+**Response (Healthy):**
 ```json
 {
-  "status": "ok",
-  "timestamp": "2026-03-03T18:21:00Z"
+  "status": "healthy",
+  "uptime": 3600,
+  "timestamp": "2026-03-03T18:21:00.000Z",
+  "database": {
+    "connected": true,
+    "responseTime": "15ms"
+  }
 }
 ```

-**Status Codes:**
- `200 OK` — Backend is healthy
- `500 Internal Server Error` — Backend has errors (check logs)
+**Response (Degraded):**
+```json
+{
+  "status": "degraded",
+  "uptime": 3600,
+  "timestamp": "2026-03-03T18:21:00.000Z",
+  "database": {
+    "connected": false,
+    "error": "Connection timeout"
+  }
+}
+```

-### Other Endpoints
+**Status Values:**
+- `healthy` — All systems operational (HTTP 200)
+- `degraded` — Some systems degraded but functional (HTTP 503; the handler in `src/index.js` returns 200 only for `healthy`)
+- `unhealthy` — Critical systems down (HTTP 503)

-(Document your API endpoints here; placeholder for now)
+**Response Fields:**
+- `status` — Overall health status
+- `uptime` — Seconds since application started
+- `timestamp` — ISO 8601 timestamp of check
+- `database.connected` — Boolean database connectivity status
+- `database.responseTime` — Database query response time
+- `database.error` — Error message if connection failed (optional)
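
As a rough illustration of how these fields might be assembled, here is a sketch of a `getHealthStatus(pool)` helper. It is not the project's `src/utils/health.js`. It assumes `pool` exposes a `query(sql)` method that returns a Promise, that `SELECT 1` is an acceptable connectivity probe, and that a failed probe maps to `degraded`; the conditions for `unhealthy` are not described in this README and are left out.

```javascript
// Hypothetical sketch: builds the health payload documented above.
async function getHealthStatus(pool) {
  const base = {
    uptime: Math.floor(process.uptime()), // whole seconds since the process started
    timestamp: new Date().toISOString(),  // ISO 8601
  };

  const start = Date.now();
  try {
    // Assumed probe query; cheap enough to run on every health check.
    await pool.query('SELECT 1');
    return {
      status: 'healthy',
      ...base,
      database: { connected: true, responseTime: `${Date.now() - start}ms` },
    };
  } catch (err) {
    // Report the failure in the payload instead of throwing,
    // so the endpoint can still respond.
    return {
      status: 'degraded',
      ...base,
      database: { connected: false, error: err.message },
    };
  }
}
```

Catching the database error inside the helper keeps the route handler small: it only has to translate `status` into an HTTP status code.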

 ---

@@ -91,6 +173,15 @@ npm test              # Run all tests
 npm run test:watch   # Run tests in watch mode
 ```

+### Health & Logging Tests
+
+The test suite includes:
+- Health endpoint status validation
+- Uptime tracking accuracy
+- Database connectivity checking
+- Request logging middleware functionality
+- Error handling for database failures
+
 ---

 ## Docker
@@ -110,6 +201,11 @@ docker run -p 3001:3001 \
  gravl-backend:latest
 ```

+**Viewing logs from container:**
+```bash
+docker logs -f gravl-backend
+```
+
 ### With Docker Compose

 See the root `docker-compose.yml` for multi-container setup.
@@ -150,18 +246,21 @@ That guide includes:

 ### Health Check Configuration

-The backend exposes a health check endpoint at `GET /api/health`. The deployment script (`scripts/deploy.sh`) waits up to 60 seconds for this endpoint to return HTTP 200.
+The backend exposes a comprehensive health check endpoint at `GET /api/health`. The deployment script (`scripts/deploy.sh`) waits up to 60 seconds for this endpoint to return HTTP 200.

 **In your backend code:**
 ```javascript
-app.get('/api/health', (req, res) => {
-  res.json({ status: 'ok', timestamp: new Date().toISOString() });
+// Auto-integrated in src/index.js
+app.get('/api/health', async (req, res) => {
+  const health = await getHealthStatus(pool);
+  const statusCode = health.status === 'healthy' ? 200 : 503;
+  res.status(statusCode).json(health);
 });
 ```

 **Deployment timeout:** 60 seconds (12 retries × 5 seconds)
 - If this endpoint takes more than 5 seconds to respond, the deployment will time out
- Ensure health check is lightweight (no expensive DB queries)
+- Health check is lightweight and includes database connectivity test
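
The retry budget above can be sketched in JavaScript as follows. The real logic lives in `scripts/deploy.sh`, a shell script that is not reproduced here, so this is only an illustration of the 12 × 5 second polling idea. `waitForHealthy` and `probe` are hypothetical names; `probe` is assumed to be an async function that resolves to an HTTP status code.

```javascript
// Hypothetical sketch of the deployment wait loop (not the actual deploy.sh logic).
async function waitForHealthy(probe, { retries = 12, intervalMs = 5000 } = {}) {
  for (let attempt = 1; attempt <= retries; attempt += 1) {
    try {
      // Only HTTP 200 counts as healthy, matching the deployment requirement.
      if ((await probe()) === 200) return attempt;
    } catch (err) {
      // A refused connection while the server is still starting
      // is treated as a failed attempt, not a fatal error.
    }
    if (attempt < retries) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Health check did not return 200 after ${retries} attempts`);
}
```

On Node 18 or later, a probe could be written as `() => fetch('http://localhost:3001/api/health').then((r) => r.status)`, assuming the backend is published on port 3001 as in the Docker example.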

 ---

@@ -170,37 +269,21 @@ app.get('/api/health', (req, res) => {
 ```
 backend/
 ├── src/
-│   ├── index.js           # Server entry point
-│   ├── routes/            # API endpoints
-│   ├── controllers/       # Business logic
-│   ├── models/            # Data models (if using ORM)
-│   └── middleware/        # Express middleware
-├── test/                  # Test files
-├── Dockerfile             # Container image definition
-├── package.json           # Dependencies
-└── README.md             # This file
-```
-
---
-
-## Logs
-
-### Local Development
-```bash
-npm run dev  # Logs to stdout
-```
-
-### Docker Container
-```bash
-docker logs gravl-backend       # Current logs
-docker logs -f gravl-backend    # Follow logs in real-time
-docker logs --tail 50 gravl-backend  # Last 50 lines
-```
-
-### In Deployment
-All deploy activity is logged to `logs/deploy.log` at the root:
-```bash
-tail logs/deploy.log
+│   ├── index.js                # Server entry point
+│   ├── utils/
+│   │   ├── logger.js           # Winston logger configuration
+│   │   └── health.js           # Health monitoring utilities
+│   ├── middleware/
+│   │   └── requestLogger.js    # HTTP request logging middleware
+│   ├── routes/                 # API endpoints
+│   ├── controllers/            # Business logic
+│   ├── models/                 # Data models (if using ORM)
+│   └── services/               # External integrations
+├── test/                       # Test files
+├── logs/                       # Log files (created at runtime)
+├── Dockerfile                  # Container image definition
+├── package.json                # Dependencies
+└── README.md                   # This file
 ```

 ---
@@ -220,19 +303,42 @@ tail logs/deploy.log

 2. **Backend code has a syntax error**
   ```bash
-   npm run dev  # Look for error messages
+   npm run dev  # Look for error messages in logs
+   tail -f logs/error.log
   ```

-3. **Health check endpoint is not implemented**
-   - Ensure `app.get('/api/health', ...)` is in src/index.js
+3. **Database connection is failing**
+   - Backend is stuck trying to connect to DB
+   - Check `DB_HOST`, `DB_PORT`, `DB_USER`, `DB_PASSWORD` in `.env`
+   - Ensure database is running and accessible

-4. **Database connection is failing**
-   - Backend might be stuck trying to connect to DB
-   - Check `DATABASE_URL` in `.env`
-   - Ensure database is running
+4. **Logs directory not writable**
+   ```bash
+   mkdir -p logs
+   chmod 755 logs
+   ```

 See **[`docs/DEPLOYMENT.md`](../docs/DEPLOYMENT.md#troubleshooting)** for more deployment troubleshooting.

+### Checking Logs for Errors
+
+**Console (Development):**
+```bash
+npm run dev  # Full logs with colors
+```
+
+**Log Files:**
+```bash
+tail -50 logs/combined.log  # Last 50 lines of all logs
+tail -50 logs/error.log     # Last 50 lines of errors only
+grep -i "error" logs/combined.log  # Find all error messages (case-insensitive; levels are logged in lowercase)
+```
+
+**Docker:**
+```bash
+docker logs gravl-backend 2>&1 | grep -i error
+```
+
 ---

 ## Contributing
@@ -251,3 +357,4 @@ See the root project README or CONTRIBUTING.md for guidelines on:
 ---

 *Last updated: 2026-03-03*
+*Phase 08-01: Health Monitoring & Logging Infrastructure*