Production-hardened agent system for Claude Code that actually uses the full agent spec.
Most Claude Code agent collections are just system prompts with tools: [Read, Write, Edit, Bash]. They ignore 80% of the agent specification — no turn limits, no permission modes, no memory, no MCP integration, no tool restrictions, no real orchestration.
This project is different. Every agent is built for real work, not demos.
| Feature | Other collections | Forge |
|---|---|---|
| maxTurns + turn budgets | - | Per-agent limits with phase budgets |
| permissionMode | - | bypassPermissions for CI, default for orchestration |
| disallowedTools | - | Read-only quality agents can't write files |
| memory scopes | - | user / project / local per agent role |
| mcpServers in agents | - | ESLint, PHPStan, Semgrep, Memory Service |
| Task() orchestration | - | Conductor spawns specific agents, not "any" |
| Bash timeout tables | - | Per-command timeouts to prevent hangs |
| Quality gate pipeline | - | Mandatory review after every code change |
| Verdict-driven flow | - | Agent verdicts control pipeline progression |
We reviewed the top Claude Code agent repos on GitHub (including ones with 30k+ and 60k+ stars). None use maxTurns, permissionMode, disallowedTools, or Task() syntax. See docs/comparison.md for the full comparison.
User Task
    |
    v
[Conductor] ── orchestrates, never writes code
    |
    |── delegates to ──> [Dev Agent] (backend / frontend / devops)
    |                         |
    |                         v
    |                    code changes
    |                         |
    |── quality gates ──> [Code Reviewer]        always
    |                     [Security Auditor]     if auth/input/API
    |                     [Test Engineer]        if logic changed
    |                     [Doc Writer]           if behavior changed
    |                     [Monitoring Engineer]  if infra changed
    |
    v
Structured Report + Verdict
Key rules:
- Conductor never writes code: Write/Edit are in disallowedTools
- Quality agents are read-only: they can't modify your codebase
- Dev agents run sequentially — no file conflicts
- Quality gates run in parallel — fast feedback
- Every agent has a turn budget — no infinite loops
- Pipeline stops on CRITICAL verdicts — no broken code gets through
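
The ordering rules above can be pictured with a small Python sketch. It is illustrative only: `run_pipeline`, the callables passed to it, and the thread pool are assumptions made for this example, not how Claude Code or the conductor is actually implemented.

```python
from concurrent.futures import ThreadPoolExecutor


def run_pipeline(dev_steps, quality_gates):
    """Run dev steps one at a time, then all quality gates concurrently.

    dev_steps: list of zero-argument callables (hypothetical dev agent runs).
    quality_gates: dict mapping a gate name to a zero-argument callable
    that returns a verdict string such as "PASS" or "CRITICAL ISSUES".
    """
    # Dev agents run sequentially so two of them never edit the same file at once.
    for step in dev_steps:
        step()

    # Quality gates only read the code, so they can run in parallel.
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(gate) for name, gate in quality_gates.items()}
        verdicts = {name: future.result() for name, future in futures.items()}

    # Any CRITICAL verdict blocks the pipeline.
    blocked = any(v == "CRITICAL ISSUES" for v in verdicts.values())
    return verdicts, blocked
```

Calling `run_pipeline` with two dev steps and two gates returns every gate's verdict plus a single `blocked` flag that the caller can use to stop the pipeline.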
| Agent | Category | Model | Turns | Memory | Role |
|---|---|---|---|---|---|
| conductor | orchestration | opus | 200 | user | Decomposes tasks, delegates, runs quality gates |
| code-reviewer | quality | sonnet | 60 | local | Bugs, design, performance, tech debt |
| security-auditor | quality | sonnet | 60 | local | OWASP, injection, auth, secrets, SAST |
| test-engineer | quality | sonnet | 90 | local | Coverage gaps + writes tests (dual mode) |
| doc-writer | quality | sonnet | 80 | local | README, API docs, changelog (dual mode) |
| monitoring-engineer | quality | sonnet | 70 | local | Logging, metrics, alerts, health checks |
| backend-dev | development | sonnet | 120 | project | Backend APIs, models, migrations, services |
| frontend-dev | development | sonnet | 120 | project | Components, pages, state, routing, styling |
| devops-engineer | development | sonnet | 100 | project | Docker, CI/CD, Nginx, deployment |
# Clone the repo
git clone https://github.com/Mationetap/claude-agents-forge.git

# Copy agents to Claude Code's global agent directory (create it first if it does not exist)
mkdir -p ~/.claude/agents
cp claude-agents-forge/agents/**/*.md ~/.claude/agents/

# Start Claude Code with the conductor as the main agent
claude --agent conductor

The conductor will:
- Detect your project (reads package.json, composer.json, CLAUDE.md)
- Decompose your task into subtasks
- Delegate to the right dev agent
- Run quality gates automatically
- Report results with structured verdicts
# Direct code review
claude --agent code-reviewer
# Security audit
claude --agent security-auditor
# Just write code with a dev agent
claude --agent backend-dev

Every agent has a maxTurns limit and an internal phase budget:
| Phase | Max turns |
|------------------|-----------|
| Stack detection | 2 |
| File reading | 20 |
| Static analysis | 5 |
| Analysis | 14 |
| Memory update | 1 |
| Report | 2 |
| Reserve | 7 |
If an agent reaches ~75% of its turn limit without starting the report phase, it immediately compiles a partial report. A partial report is always better than no report.
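
A minimal sketch of the 75% rule, assuming the agent can count its own turns. The function name and the `threshold` parameter are hypothetical; the phase numbers are copied from the table above.

```python
# Phase budget copied from the table above (turns per phase).
PHASE_BUDGET = {
    "stack_detection": 2,
    "file_reading": 20,
    "static_analysis": 5,
    "analysis": 14,
    "memory_update": 1,
    "report": 2,
    "reserve": 7,
}


def should_force_partial_report(turns_used, max_turns, report_started, threshold=0.75):
    """Return True when the agent should stop and compile a partial report.

    The rule: if roughly 75% of the turn limit is spent and the report
    phase has not started yet, report immediately with what is known.
    """
    if report_started:
        return False
    return turns_used >= threshold * max_turns
```

With a 60-turn limit, the check starts returning True at turn 45 unless the report phase is already underway.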
Every Bash command has an explicit timeout to prevent hangs:
| Command | Timeout (ms) |
|----------------------------|-------------|
| git diff/log/status | 30000 |
| phpstan/eslint | 60000 |
| semgrep | 120000 |
| test execution | 120000 |
| file checks (test -f, wc) | 5000 |
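
The agents set these values through the Bash tool's timeout setting. The same idea can be sketched in plain Python, assuming a lookup keyed by executable name; `timeout_ms_for`, `run_with_timeout`, and the 30000 ms default for unlisted commands are hypothetical choices for this example.

```python
import shlex
import subprocess
from typing import Optional

# Timeouts in milliseconds, copied from the table above and keyed by executable.
TIMEOUTS_MS = {
    "git": 30_000,
    "phpstan": 60_000,
    "eslint": 60_000,
    "semgrep": 120_000,
    "test": 5_000,  # file checks such as `test -f`
    "wc": 5_000,
}
# Test runners (120000 ms in the table) differ per stack, so add yours above.
DEFAULT_TIMEOUT_MS = 30_000  # assumption: fallback for commands not listed


def timeout_ms_for(command: str) -> int:
    """Look up the timeout for a command line by its first word."""
    executable = shlex.split(command)[0]
    return TIMEOUTS_MS.get(executable, DEFAULT_TIMEOUT_MS)


def run_with_timeout(command: str, timeout_ms: Optional[int] = None):
    """Run a command; return the CompletedProcess, or None if it timed out."""
    ms = timeout_ms_for(command) if timeout_ms is None else timeout_ms
    try:
        return subprocess.run(
            shlex.split(command),
            capture_output=True,
            text=True,
            timeout=ms / 1000,  # subprocess expects seconds
        )
    except subprocess.TimeoutExpired:
        return None
```

Returning `None` on timeout lets the caller record "command hung" and move on instead of burning turns waiting.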
Quality agents use disallowedTools to enforce read-only behavior:
disallowedTools:
- Write
- Edit
- NotebookEdit
- Agent # can't spawn sub-agents
- WebSearch # no network access
- WebFetch # no network access

This is a hard guarantee, not a prompt instruction that can be ignored.
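
One way to picture the effect, assuming denied tool names are simply subtracted from whatever the agent was granted. Claude Code performs the real resolution; `effective_tools` and the granted list below are hypothetical.

```python
# Deny list copied from the frontmatter above.
QUALITY_DISALLOWED = ["Write", "Edit", "NotebookEdit", "Agent", "WebSearch", "WebFetch"]


def effective_tools(granted, disallowed):
    """Return the granted tools minus anything on the deny list, keeping order."""
    blocked = set(disallowed)
    return [tool for tool in granted if tool not in blocked]


# Hypothetical grant: even if Write and Edit were granted by mistake,
# the deny list removes them before the agent can use them.
granted = ["Read", "Glob", "Grep", "Bash", "Write", "Edit"]
remaining = effective_tools(granted, QUALITY_DISALLOWED)
```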
Quality agents return structured verdicts that control pipeline flow:
| Verdict | Action |
|---|---|
| PASS | Continue / mark done |
| PASS WITH NOTES | Ask user: fix or accept? |
| NEEDS CHANGES | Route back to dev agent |
| CRITICAL ISSUES | Block; route back for fix |
Max 3 correction round-trips per pipeline to prevent infinite loops.
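
The verdict table and the round-trip cap can be sketched as a small routing function. The action names and the "stop_and_report" behavior once the cap is reached are assumptions made for this example; the conductor's prompt defines the real behavior.

```python
from enum import Enum


class Verdict(Enum):
    PASS = "PASS"
    PASS_WITH_NOTES = "PASS WITH NOTES"
    NEEDS_CHANGES = "NEEDS CHANGES"
    CRITICAL_ISSUES = "CRITICAL ISSUES"


MAX_ROUND_TRIPS = 3  # from the rule above


def next_action(verdict, round_trips_used):
    """Map a quality-agent verdict to the next pipeline action."""
    if verdict is Verdict.PASS:
        return "continue"
    if verdict is Verdict.PASS_WITH_NOTES:
        return "ask_user"
    # NEEDS CHANGES and CRITICAL ISSUES both go back to the dev agent,
    # but only while correction round-trips remain.
    if round_trips_used >= MAX_ROUND_TRIPS:
        return "stop_and_report"  # assumption: give up and report to the user
    return "route_to_dev"
```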
Create a new file in agents/development/:
---
name: rails-dev
description: >
  Ruby on Rails backend developer. Implements APIs, models, migrations,
  background jobs, and admin panels. Only modifies Ruby files.
tools:
- Read
- Write
- Edit
- Glob
- Grep
- Bash
disallowedTools:
- Agent
- WebSearch
- WebFetch
- NotebookEdit
model: sonnet
permissionMode: bypassPermissions
maxTurns: 120
memory: project
---
# Rails Dev — Backend Developer
Your system prompt here...

Then add it to the conductor's Task() list:

tools:
- Task(backend-dev, frontend-dev, rails-dev, ...)

In conductor.md, the quality gate matrix controls which gates run:
| Change type | security | tests | review | docs | monitoring |
|--------------------------|:---:|:---:|:---:|:---:|:---:|
| Auth/login/permissions | YES | YES | YES | YES | - |
| New API endpoint | YES | YES | YES | YES | - |
| Frontend component | - | if complex | YES | - | - |
| Docker/CI/CD | - | - | YES | - | YES |
| CSS/styling only | - | - | YES | - | - |
Edit this matrix to match your workflow.
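
To make the matrix logic concrete, here is a sketch that encodes the rows above as data and picks the gates for a change. The row keys, `gates_for`, and the `is_complex` flag are hypothetical names; in the project itself the matrix lives in the conductor's prompt, not in code.

```python
GATES = ("security", "tests", "review", "docs", "monitoring")

# Rows copied from the matrix above. True = always run, False = skip,
# "if complex" = run only when the caller flags the change as complex.
GATE_MATRIX = {
    "auth": (True, True, True, True, False),
    "api_endpoint": (True, True, True, True, False),
    "frontend_component": (False, "if complex", True, False, False),
    "docker_ci": (False, False, True, False, True),
    "css_only": (False, False, True, False, False),
}


def gates_for(change_type, is_complex=False):
    """Return the list of quality gates to run for a change type."""
    selected = []
    for gate, rule in zip(GATES, GATE_MATRIX[change_type]):
        if rule is True or (rule == "if complex" and is_complex):
            selected.append(gate)
    return selected
```

Every row has review enabled, which matches the rule that the code reviewer always runs.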
| Scope | Location | Best for |
|---|---|---|
| user | ~/.claude/agent-memory/<agent>/ | Cross-project knowledge (conductor) |
| project | .claude/agent-memory/<agent>/ | Project-specific, shared via git (dev agents) |
| local | .claude/agent-memory-local/<agent>/ | Project-specific, private (quality agents) |
Quality agents use local memory — they learn your project's conventions, linting config, and common patterns across sessions without polluting git.
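
The scope table can be read as a path lookup. The sketch below assumes the project and local paths are relative to the project root; `memory_dir` is a hypothetical helper, not part of Claude Code.

```python
from pathlib import Path


def memory_dir(scope, agent, project_root):
    """Resolve the memory directory for an agent, following the table above."""
    if scope == "user":
        # Shared across all projects for the current user.
        return Path.home() / ".claude" / "agent-memory" / agent
    if scope == "project":
        # Lives in the repo and is committed to git.
        return Path(project_root) / ".claude" / "agent-memory" / agent
    if scope == "local":
        # Lives in the repo but stays private.
        return Path(project_root) / ".claude" / "agent-memory-local" / agent
    raise ValueError(f"unknown memory scope: {scope!r}")
```

To keep local memory out of git, as the paragraph above describes, make sure `.claude/agent-memory-local/` is listed in the project's `.gitignore`.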
Agents can declare MCP servers in their frontmatter:
mcpServers:
- eslint # JS/TS linting
- phpstan # PHP static analysis
- semgrep # Security scanning
- memory # MCP Memory Service

MCP tools are tried first; if they fail, agents fall back to CLI equivalents. This "MCP first, Bash fallback" pattern handles known MCP reliability issues in subagents.
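
The fallback pattern itself is small. Here is a sketch, assuming both paths are wrapped as zero-argument callables; `mcp_first` and the callables are hypothetical, and in the agents this logic is expressed as prompt instructions rather than code.

```python
def mcp_first(mcp_call, cli_call):
    """Try the MCP tool first; on any failure, fall back to the CLI equivalent.

    Returns (result, source) where source is "mcp" or "cli", so the
    report can state which path produced the findings.
    """
    try:
        return mcp_call(), "mcp"
    except Exception:
        # MCP server missing, unreachable, or erroring: use the CLI instead.
        return cli_call(), "cli"
```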
Configure your MCP servers in ~/.claude.json. The exact package names depend on which MCP server implementations you use. Example structure:
{
  "mcpServers": {
    "eslint": {
      "command": "npx",
      "args": ["-y", "your-eslint-mcp-package"]
    },
    "semgrep": {
      "command": "npx",
      "args": ["-y", "your-semgrep-mcp-package"]
    }
  }
}

See the Claude Code MCP documentation for available servers and setup instructions.
claude-agents-forge/
  agents/
    orchestration/
      conductor.md             # Central orchestrator (opus, 200 turns)
    quality/
      code-reviewer.md         # Read-only code analysis (sonnet, 60 turns)
      security-auditor.md      # Read-only security audit (sonnet, 60 turns)
      test-engineer.md         # Dual-mode: review + write tests (sonnet, 90 turns)
      doc-writer.md            # Dual-mode: review + write docs (sonnet, 80 turns)
      monitoring-engineer.md   # Dual-mode: review + write configs (sonnet, 70 turns)
    development/
      backend-dev.md           # Backend development (sonnet, 120 turns)
      frontend-dev.md          # Frontend development (sonnet, 120 turns)
      devops-engineer.md       # Infrastructure & CI/CD (sonnet, 100 turns)
  docs/
    getting-started.md         # Detailed setup guide
    agent-spec.md              # Full agent YAML spec reference
    comparison.md              # Side-by-side comparison with other repos
  LICENSE
  CONTRIBUTING.md
Q: Why not 135 agents like other repos?
A: Because 9 well-configured agents that actually work beat 135 generic prompts. Each of our agents uses features that none of theirs do. Quality over quantity.
Q: Do I need all the MCP servers?
A: No. MCP integration is optional: agents fall back to CLI tools automatically. Start without MCP, add servers when you need them.
Q: Can I use this with my own stack?
A: Yes. The quality agents (code-reviewer, security-auditor, test-engineer) are stack-agnostic. Dev agents are templates — fork backend-dev.md and customize for your stack.
Q: Does this work on Windows?
A: Yes. Agents use Bash tool timeouts (not the timeout CLI command), POSIX paths in Git Bash, and handle Windows-specific edge cases.
Q: Why does the conductor use opus?
A: The conductor makes complex routing decisions and manages multi-agent pipelines. Opus handles this better than sonnet. Dev and quality agents use sonnet: they execute, not decide.
| Version | Status | Scope |
|---|---|---|
| v1.0 Foundation | ✅ Done | 9 agents, full spec coverage, verdict pipeline |
| v1.1 More Agents | 🚧 Planned | database-engineer, api-designer, refactoring-agent, and more |
| v2.0 Agent Teams | 🚧 Planned | Parallel agents with shared tasks and mailbox |
| v2.2 Memory | 🚧 Planned | MCP Memory Service, cross-session learning, auto-calibration |
| v3.0 Plugin System | 🚧 Planned | One-command install, agent marketplace |
| v3.1 CI/CD | 🚧 Planned | GitHub Actions, PR comment bot, status checks |
See CONTRIBUTING.md for guidelines.
Key rules:
- Every agent MUST use maxTurns, memory, and disallowedTools
- Quality agents MUST be read-only (no Write/Edit)
- Dev agents MUST declare their file scope
- No "awesome list" padding — only agents you've tested in real work
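
The first rule lends itself to a mechanical check before opening a pull request. The sketch below is a hypothetical helper, not a script shipped with the repo. It assumes agent files start with a `---` frontmatter block and only looks at top-level keys, so it avoids needing a YAML library.

```python
REQUIRED_KEYS = ("maxTurns", "memory", "disallowedTools")


def frontmatter_keys(markdown_text):
    """Collect top-level keys from the frontmatter block of an agent file."""
    lines = markdown_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return set()
    keys = set()
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        # Top-level keys start in column 0; skip list items, comments, blanks.
        if line and not line[0].isspace() and line[0] not in "-#" and ":" in line:
            keys.add(line.split(":", 1)[0].strip())
    return keys


def missing_required_keys(markdown_text):
    """Return the required frontmatter keys that the agent file lacks."""
    present = frontmatter_keys(markdown_text)
    return [key for key in REQUIRED_KEYS if key not in present]
```

An empty list from `missing_required_keys` means the file declares all three fields; it does not verify that their values are sensible.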
License: MIT