AI Workflow Optimization Tool for Claude Code
A comprehensive harness that optimizes Claude Code sessions by addressing the four most common failures:
- Early "done" - Agent declares victory too soon → Feature list as source of truth
- Messy repo - Half-finished, no history → Git + progress log ritual
- No real testing - Marks features done without verification → E2E browser tests
- Chaotic setup - Re-learns how to run app every time → Single `init.sh` startup script
- Session Continuity: `progress.md` maintains context between sessions
- Feature Management: Track features/tasks with status, subtasks, and E2E validation
- Context Tracking: Monitor estimated token usage with session-based lifecycle
- Compaction Indicator: Shows estimated compaction count when usage exceeds 100%
- Auto-Save Handoff: Automatically saves handoff document on session exit
- Discoveries Tracking: Capture findings, requirements, and institutional knowledge
- Startup Ritual: `init.sh` (Bash) and `init.ps1` (PowerShell) scripts
- Git Safety Hooks: Block dangerous operations (commits to main, force pushes)
- Auto-Hooks Setup: Creates `.claude/settings.local.json` with hooks during init
- Subagent Delegation: Rule-based task delegation with token savings estimation
- Orchestration Engine: Coordinate automatic subagent delegation with state machine
- Optimization Suite: Exploration cache, file filtering, output compression, lazy loading
- E2E Testing: Playwright integration with test generation
- MCP Server: Playwright browser automation via Model Context Protocol
- Stack Detection: Automatically detects your project's language, framework, database
# Clone the repository
git clone https://github.com/xanthar/claude-harness.git
# Install in development mode
cd claude-harness
pip install -e .
# Or install directly
pip install git+https://github.com/xanthar/claude-harness.git
cd your-project
claude-harness init
The initializer will:
- Detect your project stack (language, framework, database)
- Ask configuration questions
- Generate harness files in `.claude-harness/`
- Create `scripts/init.sh` and `scripts/init.ps1` startup scripts
- Set up E2E testing structure
- Update/create `.claude/CLAUDE.md`
- Create `.claude/settings.local.json` with hooks (project-specific)
./scripts/init.sh
This will:
- Check git status (warn if on protected branch)
- Activate virtual environment (Python)
- Check if app is running, optionally start it
- Verify database connection
- Run quick test check
- Show session progress and current feature
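To illustrate the "check if app is running" step, here is a minimal Python sketch that builds a health-check URL from the `startup` settings in `.claude-harness/config.json` (`port` and `health_endpoint`) and probes it. The function names, the `localhost` host, and the plain-HTTP scheme are assumptions made for this example; the generated `init.sh` may perform the check differently.

```python
from urllib.error import URLError
from urllib.request import urlopen


def health_url(config: dict, host: str = "localhost") -> str:
    """Build the health-check URL from the `startup` section of config.json."""
    startup = config["startup"]
    # Assumption: host and scheme are not stored in config.json, only port and path.
    return f"http://{host}:{startup['port']}{startup['health_endpoint']}"


def app_is_running(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (URLError, OSError):
        # Connection refused, timeout, or an HTTP error status all count as "not running".
        return False


config = {"startup": {"port": 8000, "health_endpoint": "/api/v1/health"}}
print(health_url(config))
```

Calling `app_is_running(health_url(config))` before starting work gives a quick yes/no answer without launching the app a second time.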
After upgrading claude-harness, refresh your project's scripts:
claude-harness refresh
This regenerates init.sh, init.ps1, and hooks with the latest improvements while preserving your data (features.json, progress.md, config.json).
To also update CLAUDE.md with the latest harness integration section:
claude-harness refresh --update-claude-md
# List features
claude-harness feature list
# Add a feature with subtasks
claude-harness feature add "User authentication" -s "Login form" -s "JWT handling" -s "Logout"
# Start working on a feature
claude-harness feature start F-001
# Mark subtask as done
claude-harness feature done F-001 0
# Mark tests as passing
claude-harness feature tests F-001
# Complete the feature
claude-harness feature complete F-001
# Show current progress
claude-harness progress show
# Add completed item
claude-harness progress completed "Implemented login form"
# Add work in progress
claude-harness progress wip "Working on JWT handling"
# Add blocker
claude-harness progress blocker "Need API keys for OAuth"
# Start new session (archives previous)
claude-harness progress new-session
# Install Playwright
claude-harness e2e install
# Generate test for a feature
claude-harness e2e generate F-001
# Run E2E tests
claude-harness e2e run
claude-harness e2e run --headed # Visible browser
Claude Harness includes an MCP server for browser automation, allowing Claude Code to interact with web applications directly.
Setup for Claude Desktop:
Add to claude_desktop_config.json:
{
  "mcpServers": {
    "playwright": {
      "command": "python",
      "args": ["-m", "claude_harness.mcp.playwright_server"]
    }
  }
}
Available Tools:
| Tool | Description |
|---|---|
| `browser_launch` | Launch browser (chromium/firefox/webkit) |
| `browser_navigate` | Navigate to URL |
| `browser_click` | Click elements |
| `browser_fill` | Fill form inputs |
| `browser_type` | Type with keystroke simulation |
| `browser_screenshot` | Take screenshots |
| `browser_get_text` | Get element text |
| `browser_wait` | Wait for elements |
| `browser_evaluate` | Run JavaScript |
| `browser_select` | Select dropdown options |
| `browser_check` | Check checkboxes |
| `browser_press` | Press keyboard keys |
| `browser_close` | Close browser |
| `browser_content` | Get page HTML |
| `browser_query_all` | Query multiple elements |
Run standalone:
python -m claude_harness.mcp.playwright_server
Monitor estimated token usage with session-based lifecycle:
# Show context usage
claude-harness context show
claude-harness context show --full # Detailed view with compaction info
# Show session info
claude-harness context session-info
# Mark session as closed (triggers reset on next start)
claude-harness context session-close
# Reset for new session
claude-harness context reset
# Set context budget
claude-harness context budget 200000
# Track per-task usage
claude-harness context start-task F-001
# ... do work ...
claude-harness context end-task F-001
# Output metadata for embedding
claude-harness context metadata
Session-Based Features:
- Each session gets a unique `session_id`
- Metrics automatically reset when a closed session is detected
- Shows compaction indicator when usage exceeds 100% (e.g., `250% (~2 compactions)`)
The status command shows compact context usage:
[ * ] Context: 15.2% used | ~169,600 tokens remaining | 12 files read | 5 commands
[!!!] Context: 250% (~2 compactions) | 12 files read | 5 commands
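The numbers in these status lines can be reproduced with simple arithmetic. The sketch below is an illustration only: the function name is hypothetical, the 200,000-token budget is taken from the `context budget 200000` example, and counting one compaction per full multiple of the budget is an assumption rather than the harness's confirmed formula.

```python
def format_context_status(used_tokens: int, budget: int = 200_000) -> str:
    """Render a one-line context summary in the style of the status output."""
    percent = used_tokens / budget * 100
    if percent <= 100:
        remaining = budget - used_tokens
        return f"Context: {percent:.1f}% used | ~{remaining:,} tokens remaining"
    # Assumption: each full multiple of the budget counts as one compaction.
    compactions = used_tokens // budget
    return f"Context: {percent:.0f}% (~{compactions} compactions)"


print(format_context_status(30_400))   # Context: 15.2% used | ~169,600 tokens remaining
print(format_context_status(500_000))  # Context: 250% (~2 compactions)
```

With a 200,000-token budget, 30,400 estimated tokens corresponds to the 15.2% line, and 500,000 tokens corresponds to the 250% line.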
When context is filling up, compress your session for seamless continuation:
# Generate a session summary
claude-harness context summary
# Create a handoff document for the next session
claude-harness context handoff
claude-harness context handoff --save # Save to file
# Full compression: handoff + archive progress + reset metrics
claude-harness context compress
The handoff document includes:
- Project context and stack info
- Current feature progress and subtasks
- Completed work this session
- Files modified
- Pending features
- Recommended next steps
Workflow for long sessions:
- Work until context hits warning level (~70%)
- Run `claude-harness context compress`
- Start a new Claude Code session
- Read the saved handoff document for context
- Continue seamlessly
Claude Code hooks enable automatic tracking and safety enforcement. During claude-harness init, hooks are automatically configured in .claude/settings.local.json (project-specific, not committed).
Auto-created hooks include:
- PreToolUse: Git safety checks (block commits to protected branches)
- PostToolUse:
  - Context tracking (file reads)
  - Auto-progress tracking (file writes/edits added to progress.md)
  - Activity logging
- SessionEnd: Auto-save handoff, mark session closed, show summary
Note: The `SessionEnd` hook fires on all session endings, including `/exit`. The older `Stop` hook only fires when Claude naturally stops, so we use `SessionEnd` to ensure handoffs are always saved.
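To make the PreToolUse git safety check more concrete, here is a minimal Python sketch of the decision logic. It is not the shipped `check-git-safety.sh`; the function names and regular expressions are hypothetical. It assumes the hook receives a JSON payload whose `tool_input.command` holds the Bash command, and that exit code 2 signals a blocked tool call, which matches Claude Code's hook convention as far as we know.

```python
import re
import sys
from typing import Optional

# Mirrors the git.protected_branches setting from the config.json example.
PROTECTED_BRANCHES = {"main", "master"}


def blocked_reason(command: str, current_branch: str) -> Optional[str]:
    """Return a reason if the shell command should be blocked, else None."""
    if re.search(r"\bgit\s+push\b.*(\s--force\b|\s-f\b)", command):
        return "force push is not allowed"
    if re.search(r"\bgit\s+commit\b", command) and current_branch in PROTECTED_BRANCHES:
        return f"direct commit to protected branch '{current_branch}'"
    return None


def hook_exit_code(payload: dict, current_branch: str) -> int:
    """Map a hook payload to an exit code: 0 allows the call, 2 blocks it."""
    command = payload.get("tool_input", {}).get("command", "")
    reason = blocked_reason(command, current_branch)
    if reason:
        # The reason goes to stderr so it can be reported back to the agent.
        print(f"Blocked: {reason}", file=sys.stderr)
        return 2
    return 0
```

A real hook script would read the payload with `json.load(sys.stdin)`, look up the branch with `git rev-parse --abbrev-ref HEAD`, and pass the result to `sys.exit`.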
See docs/HOOKS.md for detailed manual setup and customization.
Manual setup - add to .claude/settings.local.json:
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [{"type": "command", "command": ".claude-harness/hooks/check-git-safety.sh"}]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Read",
        "hooks": [{"type": "command", "command": ".claude-harness/hooks/track-read.sh"}]
      },
      {
        "matcher": "Write",
        "hooks": [{"type": "command", "command": ".claude-harness/hooks/track-write.sh"}]
      },
      {
        "matcher": "Edit",
        "hooks": [{"type": "command", "command": ".claude-harness/hooks/track-edit.sh"}]
      },
      {
        "matcher": "Bash",
        "hooks": [{"type": "command", "command": ".claude-harness/hooks/log-activity.sh"}]
      }
    ],
    "SessionEnd": [
      {
        "hooks": [{"type": "command", "command": ".claude-harness/hooks/session-stop.sh"}]
      }
    ]
  }
}
Capture findings, requirements, and institutional knowledge during sessions:
# Add a discovery
claude-harness discovery add "Auth requires JWT secret in env" --context "Found during testing" --tags security,config
# List all discoveries
claude-harness discovery list
claude-harness discovery list --tag security # Filter by tag
claude-harness discovery list --feature F-001 # Filter by feature
# Search discoveries
claude-harness discovery search "JWT"
# Show discovery details
claude-harness discovery show D-001
# View statistics
claude-harness discovery stats
# Generate summary for handoff
claude-harness discovery summary
# List all tags
claude-harness discovery tags
Discoveries are persisted in .claude-harness/discoveries.json and included in handoff documents.
your-project/
├── .claude/
│   ├── CLAUDE.md                # Enhanced with harness integration
│   └── settings.local.json      # Claude Code hooks (project-specific)
├── .claude-harness/
│   ├── config.json              # Project configuration
│   ├── features.json            # Feature/task tracking
│   ├── progress.md              # Session continuity log
│   ├── context_metrics.json     # Context usage tracking
│   ├── discoveries.json         # Captured findings and knowledge
│   ├── hooks/
│   │   ├── check-git-safety.sh
│   │   ├── track-read.sh
│   │   ├── track-write.sh
│   │   ├── track-edit.sh
│   │   ├── log-activity.sh
│   │   └── session-stop.sh
│   └── session-history/         # Archived sessions
├── scripts/
│   ├── init.sh                  # Startup ritual (Bash)
│   └── init.ps1                 # Startup ritual (PowerShell)
└── e2e/
    ├── conftest.py              # Playwright fixtures
    ├── pytest.ini
    └── tests/                   # E2E test files
The .claude-harness/config.json file contains all project settings:
{
  "project_name": "my-project",
  "stack": {
    "language": "python",
    "framework": "flask",
    "database": "postgresql"
  },
  "startup": {
    "port": 8000,
    "health_endpoint": "/api/v1/health",
    "start_command": "python run.py"
  },
  "git": {
    "protected_branches": ["main", "master"],
    "require_merge_confirmation": true
  },
  "testing": {
    "framework": "pytest",
    "coverage_threshold": 80
  }
}
The .claude-harness/features.json file tracks all features:
{
  "current_phase": "Phase 1 - Core Features",
  "features": [
    {
      "id": "F-001",
      "name": "User authentication",
      "status": "in_progress",
      "priority": 1,
      "tests_passing": false,
      "e2e_validated": false,
      "subtasks": [
        {"name": "Login form", "done": true},
        {"name": "JWT handling", "done": false}
      ]
    }
  ],
  "completed": [],
  "blocked": []
}
The .claude-harness/progress.md file maintains session continuity:
# Session Progress Log
## Last Session: 2025-12-12 17:30 UTC
### Completed This Session
- [x] Implemented login form
- [x] Added form validation
### Current Work In Progress
- [ ] F-001: User authentication - JWT handling
### Blockers
- None
### Next Session Should
1. Run `./scripts/init.sh` to verify environment
2. Continue with JWT handling subtask
3. Write unit tests for auth module
### Files Modified This Session
- app/auth/login.py
- app/templates/login.html
The harness adds mandatory rituals to your CLAUDE.md:
- Run `./scripts/init.sh`
- Read `.claude-harness/progress.md`
- Check `.claude-harness/features.json`
- Pick ONE feature to work on
- Update status to "in_progress"
- Update progress.md with session summary
- Update feature status/subtasks
- Commit work if appropriate
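The "check `features.json`, pick ONE feature" steps can be sketched in a few lines of Python. This is an illustration under assumptions: the helper names are hypothetical, and the embedded JSON is a trimmed copy of the `features.json` example in this README rather than a guaranteed schema.

```python
import json
from typing import Optional

# Trimmed copy of the features.json example from this README.
FEATURES_JSON = """
{
  "current_phase": "Phase 1 - Core Features",
  "features": [
    {
      "id": "F-001",
      "name": "User authentication",
      "status": "in_progress",
      "subtasks": [
        {"name": "Login form", "done": true},
        {"name": "JWT handling", "done": false}
      ]
    }
  ]
}
"""


def in_progress_features(data: dict) -> list:
    """Return every feature whose status is "in_progress"."""
    return [f for f in data.get("features", []) if f.get("status") == "in_progress"]


def next_subtask(feature: dict) -> Optional[str]:
    """Return the name of the first unfinished subtask, or None if all are done."""
    for subtask in feature.get("subtasks", []):
        if not subtask.get("done"):
            return subtask["name"]
    return None


data = json.loads(FEATURES_JSON)
active = in_progress_features(data)
if len(active) > 1:
    # The harness principle is ONE feature at a time.
    print("Warning: more than one feature is in progress")
for feature in active:
    print(f"{feature['id']}: {feature['name']} -> next subtask: {next_subtask(feature)}")
```

In a real project the data would come from the file itself, for example `json.loads(Path(".claude-harness/features.json").read_text())` with `pathlib.Path` imported.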
| Command | Description |
|---|---|
| `claude-harness init` | Initialize harness in project |
| `claude-harness refresh [--update-claude-md]` | Refresh scripts without losing data |
| `claude-harness status` | Show current status |
| `claude-harness detect` | Preview stack detection |
| `claude-harness run` | Execute init.sh |
| Command | Description |
|---|---|
| `feature list` | List features |
| `feature add NAME` | Add new feature |
| `feature info ID` | Show feature details |
| `feature start ID` | Start working on feature |
| `feature complete ID` | Complete feature |
| `feature block ID` | Block feature with reason |
| `feature unblock ID` | Unblock feature |
| `feature subtask ID NAME` | Add subtask |
| `feature done ID INDEX/NAME` | Complete subtask |
| `feature note ID TEXT` | Add note to feature |
| `feature tests ID` | Mark tests as passing |
| `feature e2e ID` | Mark E2E as validated |
| `feature sync` | Infer subtask status from modified files |
| `feature phase NAME` | Set current phase |
| Command | Description |
|---|---|
| `progress show` | Show progress |
| `progress completed ITEM` | Add completed item |
| `progress wip ITEM` | Add WIP item |
| `progress blocker ITEM` | Add blocker |
| `progress file PATH` | Track modified file |
| `progress new-session` | Start new session |
| `progress history` | Show session history |
| `progress update` | Update progress fields |
| Command | Description |
|---|---|
| `context show` | Show context usage |
| `context reset` | Reset context metrics |
| `context budget N` | Set token budget |
| `context start-task ID` | Start tracking task |
| `context end-task ID` | End tracking task |
| `context summary` | Generate session summary |
| `context handoff` | Generate handoff document |
| `context compress` | Compress session |
| `context session-info` | Show session details |
| `context session-close` | Mark session as closed |
| `context metadata` | Output metadata for embedding |
| Command | Description |
|---|---|
| `discovery add SUMMARY` | Add a discovery |
| `discovery list` | List all discoveries |
| `discovery show ID` | Show discovery details |
| `discovery search QUERY` | Search discoveries |
| `discovery delete ID` | Delete a discovery |
| `discovery tags` | List all unique tags |
| `discovery stats` | Show statistics |
| `discovery summary` | Generate summary |
| Command | Description |
|---|---|
| `delegation status` | Show delegation status |
| `delegation enable` | Enable delegation |
| `delegation disable` | Disable delegation |
| `delegation rules` | List delegation rules |
| `delegation add-rule` | Add custom rule |
| `delegation remove-rule NAME` | Remove rule |
| `delegation enable-rule NAME` | Enable specific rule |
| `delegation disable-rule NAME` | Disable specific rule |
| `delegation suggest ID` | Get suggestions for feature |
| `delegation auto --on/--off` | Configure auto-delegation |
| Command | Description |
|---|---|
| `orchestrate status` | Show orchestration status |
| `orchestrate evaluate` | Evaluate feature for delegation |
| `orchestrate queue [ID]` | Generate delegation queue |
| `orchestrate start ID` | Start a delegation |
| `orchestrate complete ID` | Complete a delegation |
| `orchestrate reset` | Reset orchestration session |
| Command | Description |
|---|---|
| `optimize status` | Show optimization status |
| `optimize cache` | Show cache info |
| `optimize cache-clear` | Clear cache |
| `optimize prune` | Prune stale cache entries |
| `optimize categorize PATH` | Categorize a file |
| `optimize filter` | Show filter configuration |
| `optimize compress TEXT` | Compress output text |
| `optimize loading-plan` | Show lazy loading plan |
| `optimize summary` | Show optimization summary |
| Command | Description |
|---|---|
| `e2e install` | Install Playwright |
| `e2e run` | Run E2E tests |
| `e2e generate ID` | Generate E2E test |
| Command | Description |
|---|---|
| `commands generate` | Generate slash commands |
| `commands list` | List available commands |
The ch alias is also available:
ch init
ch status
ch feature list
- Python (pip, poetry, pipenv)
- JavaScript/TypeScript (npm, yarn, pnpm)
- Go, Rust (basic detection)
- Python: Flask, Django, FastAPI
- JS/TS: Express, Next.js, React, Vue, NestJS
- PostgreSQL, MySQL, SQLite, MongoDB, Redis
- pytest, unittest, Jest, Vitest, Mocha, Playwright
The harness enforces these principles:
- ONE feature at a time - Focus prevents half-finished work
- Progress over perfection - Track what's done, what's blocked
- Tests or it didn't happen - Features need tests and E2E validation
- Clean repo always - Every session ends with a commit
- Context is king - Progress.md ensures no context is lost
Claude Harness and the Sequential Thinking MCP Server serve different purposes and can be used together:
| Aspect | Claude Harness | Sequential Thinking MCP |
|---|---|---|
| Purpose | Project workflow management | Structured reasoning process |
| Focus | Session continuity & task tracking | Step-by-step thinking during tasks |
| Persistence | Saves to disk (features.json, progress.md) | In-memory only (session-scoped) |
| Scope | Across multiple sessions | Within a single reasoning task |
| What it tracks | Features, progress, context usage, git | Individual thought steps & revisions |
┌─────────────────────────────────────────────────────────┐
│                     Claude Session                      │
│                                                         │
│  ┌──────────────────┐    ┌──────────────────────────┐   │
│  │ Sequential       │    │ Claude Harness           │   │
│  │ Thinking MCP     │    │                          │   │
│  │                  │    │ • What feature am I on?  │   │
│  │ • How do I solve │    │ • What's done/remaining? │   │
│  │   this problem?  │    │ • How much context used? │   │
│  │ • Step 1...      │    │ • Session handoff        │   │
│  │ • Revise step 2  │    │                          │   │
│  │ • Branch idea... │    │                          │   │
│  └──────────────────┘    └──────────────────────────┘   │
│           ↑                           ↑                 │
│    MICRO: reasoning            MACRO: workflow          │
│     within a task              across sessions          │
└─────────────────────────────────────────────────────────┘
- Harness tells Claude "Work on F-003: Add authentication"
- Sequential Thinking helps Claude reason through HOW to implement it
- Harness tracks that F-003 is complete and what files changed
- CHANGELOG.md - Version history and release notes
- ROADMAP.md - Planned features and improvements
- docs/HOOKS.md - Detailed hook configuration guide
- Fork the repository
- Create a feature branch (`feat/your-feature`)
- Make your changes
- Write tests (aim for 100% coverage on new code)
- Update CHANGELOG.md
- Submit a pull request
See ROADMAP.md for planned features accepting contributions.
MIT License - see LICENSE file
Created by Morten Elmstroem Hansen
Optimizing Claude Code workflows, one harness at a time.