Add: LLM harness MANIFEST, CLAUDE entry point, LOADER, and templates

2026-04-25 23:15:58 +02:00
parent 421945c9e5
commit dd0cdfdc13
5 changed files with 726 additions and 0 deletions
@@ -0,0 +1,141 @@
+# CLAUDE.md - Claude Code Agent Harness
+
+**Entry point for Claude Code within the LLM Harness system.**
+
+---
+
+## Who You Are (from agent's perspective)
+
+You are Claude Code running within a task-specific harness.
+- You receive minimal, targeted context
+- You execute ONE specific task
+- You report results back to Bumblebee
+- You iterate based on feedback
+
+---
+
+## How to Run
+
+```bash
+# Via LOADER
+source LOADER.sh
+brief_agent claude-code "architecture" "gravl"
+
+# Or direct CLI
+claude-code \
+  --task "architecture" \
+  --project "gravl" \
+  --model "opus"
+```
+
+---
+
+## You Will Receive
+
+**The ONLY context loaded for your task:**
+
+1. **MANIFEST.md** (which task, which model)
+2. **gravl/HARNESS.md** (project structure)
+3. **gravl/modules/design/** (design-specific specs)
+4. **NOT loaded:**
+   - Implementation details
+   - Testing modules
+   - Other projects
+   - Full codebase
+
+---
+
+## Your Task Template
+
+You will be briefed like:
+
+```
+Task: Design Database Schema
+Project: gravl
+Module: design
+
+Context Loaded:
+- ~/workspace/gravl/HARNESS.md
+- ~/workspace/gravl/modules/design/SPEC.md
+- ~/workspace/gravl/modules/design/CONSTRAINTS.md
+
+Output Expected:
+- ~/workspace/gravl/modules/design/SCHEMA.md
+- ~/workspace/gravl/modules/design/ER-DIAGRAM.md
+
+Quality Gate:
+- All constraints satisfied
+- Performance targets met (see CONSTRAINTS.md)
+- Ready for implementation review
+```
+
+---
+
+## Your Output Format
+
+**Always structure as:**
+
+```markdown
+# [Task Name]
+
+## Overview
+[Brief summary of what was designed]
+
+## Key Decisions
+- Decision 1: Rationale
+- Decision 2: Rationale
+
+## Deliverable
+[The actual design/spec/PRD]
+
+## Validation
+- ✓ Constraint 1: [how met]
+- ✓ Constraint 2: [how met]
+- ? Open question: [if any]
+
+## Next Steps
+[What should happen next - implementation? review?]
+```
+
+---
+
+## Working with Bumblebee
+
+**Bumblebee will:**
+1. Load your context (minimally)
+2. Brief you with this harness
+3. Receive your output
+4. Review it CRITICALLY
+5. Give feedback if not approved
+6. Loop until you get it right
+
+**You should:**
+- Follow the task precisely
+- Output in the template above
+- Ask clarifying questions if ambiguous
+- Validate against constraints
+- Be ready to iterate
+
+---
+
+## Model Assignment
+
+This harness will assign you to different models based on task:
+
+- **Opus**: Architecture, analysis, complex problems
+- **Sonnet**: Implementation, straightforward tasks
+- **Haiku**: Simple tasks, low-context work
+
+(See MANIFEST.md for full mapping)
+
+---
+
+## Remember
+
+You're not working alone. You're part of a system where:
+- Bumblebee plans and critiques
+- You implement to specs
+- Feedback loops improve quality
+- Each task is focused and minimal
+
+Do one thing excellently, not many things poorly.
@@ -0,0 +1,111 @@
+#!/bin/bash
+
+# LLM Harness Universal Loader
+# Loads minimal necessary context for agent tasks
+# Follows PSR-4 / module loading principle
+
+set -e
+
+HARNESS_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+MANIFEST="${HARNESS_ROOT}/MANIFEST.md"
+
+# Color output
+RED='\033[0;31m'
+GREEN='\033[0;32m'
+YELLOW='\033[1;33m'
+NC='\033[0m'
+
+# ============================================================================
+# CORE FUNCTIONS
+# ============================================================================
+
+get_agent_for_task() {
+    local task="$1"
+    
+    if [[ ! -f "$MANIFEST" ]]; then
+        echo -e "${RED}Error: MANIFEST.md not found${NC}"
+        return 1
+    fi
+    
+    # Parse JSON from MANIFEST (simplified - would use jq in production)
+    echo "Task: $task → Check MANIFEST.md for assignment"
+}
+
+get_model_for_task() {
+    local task="$1"
+    echo "Model assignment from MANIFEST.md"
+}
+
+load_project_harness() {
+    local project="$1"
+    local harness_file="${HARNESS_ROOT}/../${project}/HARNESS.md"
+    
+    if [[ ! -f "$harness_file" ]]; then
+        echo -e "${RED}Error: ${project}/HARNESS.md not found${NC}"
+        return 1
+    fi
+    
+    echo -e "${GREEN}✓ Loaded: ${project}/HARNESS.md${NC}"
+    cat "$harness_file" | head -50
+}
+
+load_module() {
+    local project="$1"
+    local module="$2"
+    local module_path="${HARNESS_ROOT}/../${project}/modules/${module}"
+    
+    if [[ ! -d "$module_path" ]]; then
+        echo -e "${RED}Error: Module ${project}/modules/${module} not found${NC}"
+        return 1
+    fi
+    
+    echo -e "${GREEN}✓ Loading module: ${project}/${module}${NC}"
+    
+    # Load module files in order
+    [[ -f "${module_path}/APPROACH.md" ]] && echo "• APPROACH.md"
+    [[ -f "${module_path}/CHECKLIST.md" ]] && echo "• CHECKLIST.md"
+    [[ -f "${module_path}/SPEC.md" ]] && echo "• SPEC.md"
+    
+    echo ""
+}
+
+brief_agent() {
+    local agent="$1"
+    local task="$2"
+    local project="$3"
+    
+    echo -e "${YELLOW}=== Agent Brief ===${NC}"
+    echo "Agent: $agent"
+    echo "Task: $task"
+    echo "Project: $project"
+    echo ""
+    echo "Loading context..."
+    
+    # Load agent entry point
+    local agent_file="${HARNESS_ROOT}/${agent^^}.md"
+    if [[ -f "$agent_file" ]]; then
+        echo -e "${GREEN}✓ Agent instructions loaded${NC}"
+    fi
+}
+
+# ============================================================================
+# USAGE
+# ============================================================================
+
+if [[ $# -eq 0 ]]; then
+    echo "LLM Harness Loader"
+    echo ""
+    echo "Usage:"
+    echo "  source LOADER.sh"
+    echo "  get_agent_for_task <task>"
+    echo "  load_project_harness <project>"
+    echo "  load_module <project> <module>"
+    echo "  brief_agent <agent> <task> <project>"
+    echo ""
+    echo "Example:"
+    echo "  brief_agent claude-code architecture gravl"
+    exit 0
+fi
+
+# Execute command
+"$@"
@@ -0,0 +1,145 @@
+# LLM HARNESS MANIFEST
+
+**Universal task router for all LLM tools and models.**
+
+---
+
+## Model/Agent Assignment Matrix
+
+```json
+{
+  "tasks": {
+    "architecture": {
+      "model": "opus",
+      "agent": "claude-code",
+      "use_case": "System design, planning, long-context analysis",
+      "token_budget": 100000
+    },
+    "implementation": {
+      "model": "sonnet",
+      "agent": "claude-code",
+      "use_case": "Code generation, feature building",
+      "token_budget": 50000
+    },
+    "code-review": {
+      "model": "opus",
+      "agent": "cursor",
+      "use_case": "Critical code review, quality gates",
+      "token_budget": 50000
+    },
+    "testing": {
+      "model": "sonnet",
+      "agent": "claude-code",
+      "use_case": "Test generation, validation",
+      "token_budget": 30000
+    },
+    "research": {
+      "model": "sonnet",
+      "agent": "gemini",
+      "use_case": "Topic research, knowledge synthesis",
+      "token_budget": 40000
+    },
+    "documentation": {
+      "model": "sonnet",
+      "agent": "claude-code",
+      "use_case": "Writing docs, guides, specifications",
+      "token_budget": 30000
+    },
+    "analysis": {
+      "model": "opus",
+      "agent": "claude-code",
+      "use_case": "Deep analysis, trade-off evaluation",
+      "token_budget": 60000
+    }
+  },
+  
+  "agents": {
+    "claude-code": {
+      "entry_point": "CLAUDE.md",
+      "cli": "claude-code",
+      "models": ["opus", "sonnet", "haiku"],
+      "strengths": ["architecture", "implementation", "detailed code work"]
+    },
+    "cursor": {
+      "entry_point": "CURSOR.md",
+      "cli": "cursor",
+      "models": ["opus", "sonnet"],
+      "strengths": ["code review", "refactoring", "IDE integration"]
+    },
+    "gemini": {
+      "entry_point": "GEMINI.md",
+      "cli": "gemini",
+      "models": ["gemini-3-pro", "gemini-3-flash"],
+      "strengths": ["research", "multimodal", "web search"]
+    }
+  },
+  
+  "fallback": {
+    "if_agent_unavailable": "use_next_best_agent",
+    "if_model_unavailable": "downgrade_to_sonnet"
+  }
+}
+```
+
+---
+
+## Usage
+
+```bash
+# Load manifest
+source LOADER.sh
+
+# Get task assignment
+get_agent_for_task "architecture"
+# Output: opus + claude-code
+
+# Get module path for project
+get_module_path "gravl" "design"
+# Output: ~/workspace/gravl/modules/design
+```
+
+---
+
+## Project Harness Structure
+
+Each project (gravl, job-portal, etc) should have:
+
+```
+~/workspace/[project]/
+├── HARNESS.md          ← Project-specific structure
+├── modules/
+│   ├── design/
+│   │   ├── APPROACH.md
+│   │   └── specs/
+│   ├── implementation/
+│   ├── testing/
+│   └── docs/
+└── [project files]
+```
+
+---
+
+## Minimal Data Feeding Principle
+
+When instructing an agent:
+1. Load MANIFEST.md (which model/agent)
+2. Load PROJECT/HARNESS.md (project structure)
+3. Load ONLY relevant modules (task-specific data)
+4. DON'T load:
+   - Other projects
+   - Unrelated modules
+   - Implementation details if doing architecture
+   - Full specs if doing design only
+
+---
+
+## RAPL Loop Integration
+
+When task fails validation:
+1. Capture error
+2. Replan (fetch failing data)
+3. Adjust approach
+4. Log learning (store in `learnings/`)
+5. Retry with new strategy
+
+Document: `~/workspace/learnings/YYYY-MM-DD.md`
@@ -0,0 +1,190 @@
+# MODULE STRUCTURE TEMPLATE
+
+**Standard structure for any module within a project harness.**
+
+---
+
+## Module Anatomy
+
+```
+modules/[module-name]/
+├── APPROACH.md       ← How to execute this phase
+├── CHECKLIST.md      ← What to verify
+├── SPEC.md           ← What to build (input for agent)
+└── [outputs]/        ← Generated by agent
+    ├── DESIGN.md
+    ├── SCHEMA.md
+    └── ...
+```
+
+---
+
+## APPROACH.md
+
+**Purpose:** Tell agents HOW to execute this phase.
+
+```markdown
+# [Phase Name] Approach
+
+## Goal
+[What should be accomplished]
+
+## Process
+1. [Step 1]
+2. [Step 2]
+3. [Step 3]
+
+## Constraints to Respect
+- [Constraint 1]
+- [Constraint 2]
+
+## Success Criteria
+- [Criterion 1]
+- [Criterion 2]
+
+## Output Files
+- [file1.md]
+- [file2.md]
+```
+
+---
+
+## CHECKLIST.md
+
+**Purpose:** Verify the phase completed correctly.
+
+```markdown
+# [Phase Name] Checklist
+
+## Functional
+- [ ] All requirements from SPEC.md are met
+- [ ] All constraints are satisfied
+- [ ] No blocking open questions
+
+## Quality
+- [ ] Follows code/design standards
+- [ ] Performance targets met
+- [ ] Security considerations addressed
+
+## Documentation
+- [ ] All decisions documented
+- [ ] Rationales explained
+- [ ] Ambiguities clarified
+
+## Ready for Next Phase?
+- [ ] All checkboxes above passed
+- [ ] No blockers remain
+```
+
+---
+
+## SPEC.md (Example)
+
+**Purpose:** Define requirements for the agent.
+
+```markdown
+# [Module] Specification
+
+## Overview
+[What problem does this solve?]
+
+## Requirements
+1. [Functional requirement 1]
+2. [Functional requirement 2]
+3. [Non-functional requirement]
+
+## Constraints (from project CONSTRAINTS.md)
+- [Constraint 1]
+- [Constraint 2]
+
+## Acceptance Criteria
+- [Criterion 1]
+- [Criterion 2]
+
+## Related Files
+- Architecture: [link]
+- Database: [link]
+- Testing: [link]
+```
+
+---
+
+## Module Loading Example
+
+When bribing Claude Code for "design" phase:
+
+```bash
+# Load these files (and ONLY these):
+- ~/workspace/[project]/HARNESS.md (project structure)
+- ~/workspace/[project]/modules/design/APPROACH.md (how to execute)
+- ~/workspace/[project]/modules/design/SPEC.md (what to build)
+- ~/workspace/[project]/modules/design/CHECKLIST.md (success criteria)
+- ~/workspace/gravl/modules/design/CONSTRAINTS.md (from parent)
+
+# Output expected:
+- ~/workspace/[project]/modules/design/ARCHITECTURE.md
+- ~/workspace/[project]/modules/design/ER-DIAGRAM.md
+- ~/workspace/[project]/modules/design/DECISION_LOG.md
+```
+
+---
+
+## Minimal Data Principle
+
+✅ **DO load:**
+- Module specs for THIS phase
+- Related specs from same project
+- Constraints that affect THIS phase
+
+❌ **DON'T load:**
+- Future phase modules
+- Implementation code
+- Test files
+- Other projects
+- Unused historical context
+
+**Goal:** Agent has EXACTLY what it needs, nothing more.
+
+---
+
+## Example: Design Module
+
+```
+modules/design/
+├── APPROACH.md              ← "Design system architecture"
+├── CHECKLIST.md             ← "All constraints verified?"
+├── SPEC.md                  ← "Build database schema + API spec"
+├── CONSTRAINTS.md           ← "Performance, security, scaling"
+├── EXISTING_ARCHITECTURE.md ← "Current system overview"
+└── [outputs]/
+    ├── SCHEMA.md            ← Agent creates this
+    ├── API_SPEC.md          ← Agent creates this
+    ├── DECISIONS.md         ← Agent creates this
+    └── ER_DIAGRAM.md        ← Agent creates this
+```
+
+When bribing agent for design:
+```
+Load:
+- APPROACH.md
+- SPEC.md
+- CONSTRAINTS.md
+- EXISTING_ARCHITECTURE.md
+
+Don't load:
+- implementation/
+- testing/
+- docs/
+```
+
+---
+
+## Best Practices
+
+1. **Keep specs focused** — One module = one task
+2. **Document decisions** — Why, not just what
+3. **Version constraints** — Tag major constraint changes
+4. **Link related modules** — Show dependencies
+5. **Checkpoint at boundaries** — Save state between phases
+
+This keeps context minimal and agents focused.
@@ -0,0 +1,139 @@
+# PROJECT HARNESS TEMPLATE
+
+**Copy this to your project and customize.**
+
+---
+
+## [Project Name] Structure
+
+```
+~/workspace/[project]/
+├── HARNESS.md              ← This file (project-specific)
+├── modules/
+│   ├── design/             ← Architecture & planning phase
+│   │   ├── SPEC.md         ← What to build
+│   │   ├── CONSTRAINTS.md  ← Non-functional requirements
+│   │   ├── SCHEMA.md       ← Data model (if applicable)
+│   │   └── ARCHITECTURE.md ← System design
+│   │
+│   ├── implementation/      ← Building phase
+│   │   ├── GUIDE.md        ← How to build
+│   │   ├── CHECKLIST.md    ← What to verify
+│   │   └── CODE_STANDARDS.md ← Style & patterns
+│   │
+│   ├── testing/            ← QA phase
+│   │   ├── TEST_PLAN.md    ← What to test
+│   │   ├── CASES.md        ← Test cases
+│   │   └── COVERAGE.md     ← Coverage targets
+│   │
+│   └── docs/               ← Documentation phase
+│       ├── USER_GUIDE.md   ← For end users
+│       ├── API.md          ← For developers
+│       └── DEPLOYMENT.md   ← For ops
+│
+├── [project files/directories]
+└── README.md               ← Project overview
+```
+
+---
+
+## Phases
+
+### Phase 1: Design
+- **Agent:** claude-code (opus model)
+- **Module:** design/
+- **Input:** SPEC.md, CONSTRAINTS.md
+- **Output:** ARCHITECTURE.md, SCHEMA.md
+- **Validation:** All constraints addressed
+
+### Phase 2: Implementation
+- **Agent:** claude-code (sonnet model)
+- **Module:** implementation/
+- **Input:** ARCHITECTURE.md from Phase 1
+- **Output:** Working code
+- **Validation:** Passes CODE_STANDARDS.md
+
+### Phase 3: Review
+- **Agent:** cursor (opus model)
+- **Module:** [code to review]
+- **Input:** Implementation from Phase 2
+- **Output:** Review report, approved/rejected
+- **Validation:** Critical feedback addressed
+
+### Phase 4: Testing
+- **Agent:** claude-code (sonnet model)
+- **Module:** testing/
+- **Input:** TEST_PLAN.md
+- **Output:** Test code, coverage report
+- **Validation:** Coverage targets met
+
+### Phase 5: Documentation
+- **Agent:** claude-code (sonnet model)
+- **Module:** docs/
+- **Input:** Code + ARCHITECTURE.md
+- **Output:** USER_GUIDE.md, API.md, DEPLOYMENT.md
+- **Validation:** Complete, accurate, usable
+
+---
+
+## Key Constraints
+
+**[List your project's non-functional requirements]**
+
+- Performance: [e.g., API responses <100ms p95]
+- Scalability: [e.g., support 10k concurrent users]
+- Security: [e.g., encryption at rest/transit]
+- Availability: [e.g., 99.9% uptime SLA]
+- Compatibility: [e.g., Node.js 20+, PostgreSQL 14+]
+
+---
+
+## Technology Stack
+
+**Backend:** [e.g., Node.js + Express]  
+**Database:** [e.g., PostgreSQL + Redis]  
+**Frontend:** [e.g., React + Tailwind]  
+**DevOps:** [e.g., Docker + Kubernetes]
+
+---
+
+## Module Loading Rules
+
+When briering agents:
+- Only load the CURRENT phase's module
+- Don't load future phases
+- Include CONSTRAINTS.md in every brief
+- Include project README for context
+
+Example: When designing, DON'T load implementation/ or testing/
+
+---
+
+## Checkpoint & Recovery
+
+Each phase saves checkpoint:
+```json
+{
+  "phase": "design",
+  "status": "completed|in_progress|blocked|error",
+  "timestamp": "2026-04-25T23:00:00Z",
+  "agent": "claude-code",
+  "model": "opus",
+  "result_file": "modules/design/ARCHITECTURE.md",
+  "next_phase": "implementation"
+}
+```
+
+If phase fails, save error and replan.
+
+---
+
+## Customization
+
+Change this template to fit YOUR project:
+- Add/remove phases
+- Change agent assignments
+- Define your own constraints
+- Add project-specific workflows
+
+Just keep the structure PSR-4 compatible.