Autonomous AI development toolkit for Claude Code. AgentSquad gives Claude Code a safe, completion-enforcing runtime for autonomous task execution: structured task repositories, worker spawning, compliance enforcement, multi-agent teams, and cross-model collaboration. It encodes 24 lessons learned from production use.

    cd your-project
    npx agentsquad start

This auto-detects your project type, installs AgentSquad, syncs GitHub issues, and starts the Conductor. Works with Node.js, Python, Go, and Rust projects.

    cd your-project
    npx agentsquad init

- Task repository (`.tasks/`) — structured file-based tracking with status.json, acceptance criteria, execution logs
- Worker spawning — tmux-based autonomous Claude Code workers with iteration budgets
- Status updates — safe jq-based JSON mutations with path traversal validation
- Worker monitoring — JSON output of all active workers and their health
- Notifications — webhook-based alerts on status transitions (Slack, Discord, generic)
- Compliance enforcement — 7-point anti-premature-stopping checklist every iteration
- Cross-model collaboration — delegate to a secondary model (Codex, etc.) for think/build/debug
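The path traversal validation mentioned for status updates can be illustrated with a small check on task IDs. This is a minimal sketch in POSIX shell, not AgentSquad's actual code: the function names `validate_task_id` and `task_dir` and the allowed character set are assumptions made for illustration.

```shell
# Hypothetical helper: accept only simple task IDs so that a path built
# from the ID cannot escape the task directory.
validate_task_id() {
  case "${1:-}" in
    ''|.|..|*/*|*..*) return 1 ;;    # empty, dot names, slashes, or ".." anywhere
    *[!A-Za-z0-9._-]*) return 1 ;;   # anything outside a conservative charset
    *) return 0 ;;
  esac
}

# Hypothetical helper: resolve a task directory, refusing invalid IDs.
task_dir() {
  validate_task_id "${1:-}" || { echo "invalid task id: ${1:-}" >&2; return 1; }
  printf '%s/%s\n' "${AGENTSQUAD_TASKS_DIR:-.tasks}" "$1"
}
```

With the default task directory, `task_dir issue-42` prints `.tasks/issue-42`, while `task_dir ../etc` fails before any file is touched.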
AgentSquad uses a loop methodology where Claude Code runs autonomously inside a structured framework:
- Task files define what to do (acceptance criteria, environment, interfaces)
- Workers are spawned in tmux windows, each with a dynamically built prompt
- Status tracking happens via scripts (never direct JSON edits) for safety
- Compliance hooks prevent premature stopping, context rot, and scope creep
- Notifications keep you informed via webhooks
| Type | Purpose |
|---|---|
| Plan | Architecture, design, research — produces a plan document |
| Implement | Build features, fix bugs — produces code + tests + PR |
| Debug | Investigate and fix issues — hypothesis-driven with execution log |
Every task in `.tasks/<task-id>/` has:
| File | Purpose |
|---|---|
| `status.json` | Machine-readable state (status, complexity, attempts, timestamps) |
| `acceptance-criteria.md` | What "done" looks like — the worker's contract |
| `execution-log.md` | Real-time progress log — you monitor this |
Optional files: environment.md (task-specific env setup), screenshots/, .worker-prompt.md (auto-generated).
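To make the layout concrete, here is a rough sketch that scaffolds a task directory with the three required files. In practice the toolkit creates these for you; the function name `create_task`, the `pending` status value, and the exact `status.json` field names are assumptions inferred from the table, not the real schema.

```shell
# Hypothetical scaffold for a task directory. Field names in status.json
# are assumptions based on the table above, not AgentSquad's real schema.
create_task() {
  task_id=$1
  complexity=${2:-medium}
  dir="${AGENTSQUAD_TASKS_DIR:-.tasks}/$task_id"
  now=$(date -u +%Y-%m-%dT%H:%M:%SZ)

  mkdir -p "$dir" || return 1
  cat > "$dir/status.json" <<EOF
{
  "status": "pending",
  "complexity": "$complexity",
  "attempts": 0,
  "created_at": "$now",
  "updated_at": "$now"
}
EOF
  printf '# Acceptance criteria\n\n' > "$dir/acceptance-criteria.md"
  printf '# Execution log\n\n' > "$dir/execution-log.md"
  printf '%s\n' "$dir"   # report where the task was created
}
```

Once a task exists, later status changes should still go through the status script rather than direct edits to the JSON file.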
The loop framework injects a 7-point anti-premature-stopping checklist every iteration:
- Are all acceptance criteria met?
- Do build and tests pass?
- Is the execution log up to date?
- Has status been updated via the script?
- Are there no unresolved blockers?
- Has the completion promise been fulfilled?
- Is there any remaining work you haven't attempted?
Workers cannot stop until all 7 points are satisfied. Push and PR creation happen after worker completion (handled by the orchestrator).
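The all-or-nothing nature of the checklist can be sketched as a simple gate. This is only an illustration of the rule, not the framework's hook: the function name `may_stop` and the `yes`/`no` answer format are assumptions.

```shell
# Hypothetical gate: each argument is the answer ("yes" or "no") to one
# checklist item, in the order listed above. Succeeds only if all 7 pass.
may_stop() {
  [ "$#" -eq 7 ] || { echo "expected 7 checklist answers, got $#" >&2; return 2; }
  i=0
  for answer in "$@"; do
    i=$((i + 1))
    if [ "$answer" != "yes" ]; then
      echo "checklist item $i not satisfied" >&2
      return 1
    fi
  done
  return 0
}
```

A single `no` anywhere in the list keeps the worker in the loop for another iteration.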
The Conductor is AgentSquad's single orchestration engine. One idempotent tick handles the full lifecycle:

    # Single tick (default)
    bash scripts/agentsquad/conductor.sh --once

    # Continuous mode (tick every 3 minutes)
    bash scripts/agentsquad/conductor.sh --loop 3m

    # Dry-run (preview, no changes)
    bash scripts/agentsquad/conductor.sh --dry-run

    # From Claude Code
    /conductor             # single tick
    /loop 5m /conductor    # continuous

Each tick finalizes completed workers (push + PR), checks review artifacts, applies the approval policy, merges approved tasks, runs health checks (warning about or killing stuck workers), and spawns new workers until at capacity (MAX_WORKERS). See docs/conductor.md.
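The "spawn until at capacity" step comes down to a small calculation. The sketch below assumes the Conductor already knows how many workers are active; the helper name `spawn_slots` is hypothetical, and the default of 3 matches the documented `AGENTSQUAD_MAX_WORKERS` default.

```shell
# Hypothetical helper: how many new workers a tick may still spawn,
# given the number of currently active workers.
spawn_slots() {
  active=$1
  max=${AGENTSQUAD_MAX_WORKERS:-3}
  slots=$((max - active))
  [ "$slots" -gt 0 ] || slots=0   # never report a negative number of slots
  printf '%s\n' "$slots"
}
```

Because the result depends only on current state, running the same tick twice does not over-spawn, which is consistent with the tick being idempotent.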
To triage GitHub issues into the task queue, use /orchestrate — it fetches issues, parses dependencies, creates task directories, then hands off to the Conductor.
| Mode | Who approves | When to use |
|---|---|---|
| `manual` (default) | Human reviews PR, then runs `update-status.sh <id> status approved` | Production repos, security-sensitive work |
| `auto` | Conductor auto-approves after CI green + policy check | Trusted repos, low-risk tasks |
| `paused` | Nobody — all merges halted | Emergencies, code freezes |
Configure in .claude/agentsquad.json:

    { "approval": { "default": "manual" } }

Per-task override via the approval_mode field in status.json. Sensitive paths (configured in approval.auto_merge.sensitive_paths) force manual review regardless of mode.
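How the default, the per-task override, and sensitive paths interact can be sketched as a small resolver. The function name `effective_approval_mode` and its argument order are hypothetical. The sketch also assumes that `paused` stays paused even when sensitive paths are touched, which the text above does not spell out.

```shell
# Hypothetical resolver. Arguments:
#   $1 default mode from .claude/agentsquad.json
#   $2 per-task approval_mode override (may be empty)
#   $3 "yes" if the task touches a configured sensitive path
effective_approval_mode() {
  default_mode=$1
  task_mode=${2:-}
  touches_sensitive=${3:-no}

  mode=${task_mode:-$default_mode}   # per-task override wins over the default
  # Sensitive paths downgrade auto to manual review.
  # Assumption: paused is left untouched, since all merges are halted anyway.
  if [ "$touches_sensitive" = "yes" ] && [ "$mode" = "auto" ]; then
    mode=manual
  fi
  printf '%s\n' "$mode"
}
```

For example, `effective_approval_mode auto "" yes` prints `manual`, while `effective_approval_mode manual auto no` prints `auto`.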
For complex tasks spanning multiple domains, spawn specialist agents:
| Files touched | Agent suggestion |
|---------------------|---------------------------|
| `src/lib/ai/**` | voice/AI specialist |
| `src/components/**` | UI/frontend specialist |
| `src/lib/api/**` | systems/backend specialist |
| `**/*.test.*` | QA/testing specialist |
Define agents in .claude/agents/ with paired skills in .claude/skills/.
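The file-to-agent mapping in the table can be expressed as a pattern match on the touched path. This is a sketch only: the function name `suggest_agent` is hypothetical, and checking test files first (so a test under `src/components/` goes to QA) is an assumption about precedence that the table does not state.

```shell
# Hypothetical mapping from a touched file path to a suggested specialist.
# Assumption: test files take precedence over directory-based rules.
suggest_agent() {
  case "$1" in
    *.test.*)         echo "QA/testing specialist" ;;
    src/lib/ai/*)     echo "voice/AI specialist" ;;
    src/components/*) echo "UI/frontend specialist" ;;
    src/lib/api/*)    echo "systems/backend specialist" ;;
    *)                echo "none" ;;
  esac
}
```

In a shell `case` pattern, `*` also matches `/`, so `src/lib/ai/*` covers nested paths much like the `**` globs in the table.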
Install additional capabilities:

    agentsquad add collab         # Cross-model collaboration (think/build/debug)
    agentsquad add github         # GitHub issue orchestration with label management
    agentsquad add vercel         # Vercel preview deployment + E2E testing
    agentsquad add notifications  # Slack and Telegram notification scripts
    agentsquad add supabase       # Supabase branch database management

Workers get iteration budgets based on task complexity:
| Complexity | Max Iterations |
|---|---|
| simple | 15 |
| medium | 20 |
| high | 30 |
Override per-task: `spawn-worker.sh <task-id> 40`
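The budget lookup and the per-task override can be sketched as follows. The function name `iteration_budget` is hypothetical; the numbers come from the table above, and the optional second argument mirrors the override passed to `spawn-worker.sh`.

```shell
# Hypothetical lookup: map task complexity to an iteration budget.
# An optional second argument overrides the table, like the explicit
# number passed to spawn-worker.sh.
iteration_budget() {
  complexity=$1
  override=${2:-}
  if [ -n "$override" ]; then
    printf '%s\n' "$override"
    return 0
  fi
  case "$complexity" in
    simple) echo 15 ;;
    medium) echo 20 ;;
    high)   echo 30 ;;
    *) echo "unknown complexity: $complexity" >&2; return 1 ;;
  esac
}
```

So `iteration_budget medium` prints `20`, and `iteration_budget high 40` prints `40`.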
| Variable | Default | Purpose |
|---|---|---|
| `AGENTSQUAD_TASKS_DIR` | `.tasks` | Task repository directory |
| `AGENTSQUAD_TMUX_SESSION` | project dirname | tmux session name |
| `AGENTSQUAD_NOTIFY_WEBHOOK` | (none) | Webhook URL for notifications |
| `AGENTSQUAD_MAX_WORKERS` | `3` | Max concurrent workers |
| `AGENTSQUAD_SECONDARY_MODEL` | `gpt-5.4` | Model for collab pack |
| `AGENTSQUAD_SECONDARY_CLI` | `codex` | CLI command for collab pack |
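A script that consumes these variables might resolve them with shell default expansion, roughly as below. The function name `print_settings` and the output keys are hypothetical, and treating the "project dirname" as the basename of the current directory is an assumption.

```shell
# Hypothetical: print the effective settings, one key=value per line,
# falling back to the documented defaults when a variable is unset.
print_settings() {
  printf 'tasks_dir=%s\n'       "${AGENTSQUAD_TASKS_DIR:-.tasks}"
  # Assumption: the project directory is the current working directory.
  printf 'tmux_session=%s\n'    "${AGENTSQUAD_TMUX_SESSION:-$(basename "$PWD")}"
  printf 'notify_webhook=%s\n'  "${AGENTSQUAD_NOTIFY_WEBHOOK:-}"
  printf 'max_workers=%s\n'     "${AGENTSQUAD_MAX_WORKERS:-3}"
  printf 'secondary_model=%s\n' "${AGENTSQUAD_SECONDARY_MODEL:-gpt-5.4}"
  printf 'secondary_cli=%s\n'   "${AGENTSQUAD_SECONDARY_CLI:-codex}"
}
```

Setting a variable for a single invocation, for example `AGENTSQUAD_MAX_WORKERS=5 print_settings`, changes only that run.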
AgentSquad encodes 24 hard-won lessons from production autonomous development. See docs/learnings.md for the full list, including:
- Three-layer session isolation (env var, state file, session ID)
- `export $()` is the most dangerous shell pattern
- Must use Opus model for workers (Sonnet runs out of context)
- Sleep 8 seconds after tmux window creation
- Fresh Claude sessions per issue (prevents context rot)
- Topological sort with cycle detection for dependencies
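As an illustration of the last lesson, dependency ordering with cycle detection can be approximated with the standard `tsort` utility. This is a substitute technique for demonstration, not necessarily how AgentSquad implements it; the function name `order_tasks` and the input format are assumptions.

```shell
# Hypothetical: order tasks so that dependencies come first.
# Input on stdin: one "dependency dependent" pair per line.
# Prints the order, or fails if the dependencies contain a cycle.
order_tasks() {
  pairs=$(cat)
  # tsort reports loops on stderr; treat any stderr output as a cycle.
  err=$(printf '%s\n' "$pairs" | tsort 2>&1 >/dev/null) || true
  if [ -n "$err" ]; then
    echo "dependency cycle detected" >&2
    return 1
  fi
  printf '%s\n' "$pairs" | tsort
}
```

For example, `printf 'a b\nb c\n' | order_tasks` lists `a` before `b` before `c`, while a pair of tasks that depend on each other makes the function return a non-zero status.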
- Getting Started — 2-minute quickstart
- Concepts — methodology deep-dive
- Configuration — settings, hooks, env vars
- Agents — domain-specific agent definitions
- Tasks — task repository format and lifecycle
- Packs — optional pack system
- Collab — cross-model collaboration
- Learnings — 24 battle-tested lessons
MIT -- Alessio Zazzarini