A Claude Code plugin that evolves any project toward its final form. One command picks the single highest-impact change the project needs right now — a bug to fix, dead code to remove, or a capability to add — implements it, verifies it, commits it, and loops. It stops when there's nothing meaningful left to do.
Autonomous. One change per cycle. Reverts cleanly on failure. Stays inside the target directory.
Version: 2.4.0 | License: MIT
- Why Evolve?
- Installation
- How to Use It
- The Priority Model
- The Loop — What Each Iteration Does
- Safety Rails
- Goal-Directed Mode (--goals)
- Resume Mode (--resume)
- Integrations
- Artifacts
- Troubleshooting
- Sister Plugins
- License
"Refactor this codebase" is too vague. Linters find style issues. Test runners report failures. Dependabot bumps versions. None of those are the same as asking: what is the single most valuable change this project needs next?
Evolve answers that question on every iteration. It triages the project across three categories (Fix, Clean, Upgrade), picks the top item from the highest-priority category that has work, implements it, verifies that the tests pass, commits it, and loops. Over 20 iterations you get 20 surgical, independently revertible commits, each with a clear source trail.
What separates it from a linter or refactor tool:
- It decides, then acts. A linter tells you what's wrong. Evolve picks one thing, fixes it, verifies it, and commits it — every cycle.
- Priority-ordered, not batch. Bugs block new features. Dead code blocks new code. Evolve enforces that order. You never get an "add feature" commit on top of a failing test.
- Verifies every change. Each iteration re-runs the test suite after the change. If verification fails, the working tree is reverted with git checkout -- . and the candidate is retried with a different approach (max 2 attempts) or skipped.
- Self-terminating. Evolve stops when the remaining candidates are cosmetic, busywork, or diminishing-returns. A project in final form gets a "nothing left to do" report, not 50 cycles of pointless renames.
- Containment guarantee. Files outside the target directory are never touched. No global installs, no settings changes, no cross-project edits. If a change would require reaching outside, it's skipped.
- Source-annotated history. Every commit body records what surfaced the change (test failure, codemap finding, web research, wiki, claude-mem). git log is self-documenting.
curl -fsSL https://raw.githubusercontent.com/charleschenai/evolve/main/install.sh | bash

The installer clones the repo into ~/.claude/plugins/marketplaces/loops, verifies the file structure, clears stale plugin cache, and adds the required entries to ~/.claude/settings.json. Requires git and python3 (for non-destructive settings merge).
The marketplace directory is named loops (the plugin name); the repo is evolve. The user-facing command is /evolve.
- Clone into Claude Code's plugin directory:

git clone https://github.com/charleschenai/evolve.git \
  ~/.claude/plugins/marketplaces/loops

- Add to ~/.claude/settings.json:
{
"enabledPlugins": {
"loops@loops": true
},
"extraKnownMarketplaces": {
"loops": {
"source": {
"source": "directory",
"path": "/home/YOUR_USER/.claude/plugins/marketplaces/loops"
}
}
}
}

Replace /home/YOUR_USER with your actual home directory path.
- Restart Claude Code. Skills are cached at session start.
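The installer's merge logic is not reproduced in this README. As a rough sketch of what a non-destructive merge of the entries above could look like (the helper name `merge_plugin_settings` is hypothetical and not taken from install.sh):

```python
import copy
import json


def merge_plugin_settings(settings: dict, marketplace_path: str) -> dict:
    """Return a copy of `settings` with the loops plugin registered.

    Existing keys are preserved; only the two entries shown above are added.
    Hypothetical helper, not the code shipped in install.sh.
    """
    merged = copy.deepcopy(settings)
    merged.setdefault("enabledPlugins", {})["loops@loops"] = True
    merged.setdefault("extraKnownMarketplaces", {})["loops"] = {
        "source": {"source": "directory", "path": marketplace_path}
    }
    return merged


# Example: a settings file that already has unrelated content.
existing = {"enabledPlugins": {"other@market": True}, "theme": "dark"}
updated = merge_plugin_settings(
    existing, "/home/YOUR_USER/.claude/plugins/marketplaces/loops"
)
print(json.dumps(updated, indent=2))
```

The point of the sketch is that unrelated keys and other enabled plugins survive the merge, and the input dictionary is left untouched.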
bash ~/.claude/plugins/marketplaces/loops/install.sh --check

Checks that required files are present, cache is clean, and settings.json has the plugin registered.

cd ~/.claude/plugins/marketplaces/loops && git pull

Then restart Claude Code.

bash ~/.claude/plugins/marketplaces/loops/install.sh --uninstall

Removes the plugin directory and cache. Leaves settings.json entries in place (harmless); remove them manually if desired.
/evolve [count] [target] [--dry-run] [--goals] [--resume] [--focus <subdir>] [--only <mode>]
| Argument | Description |
|---|---|
| count | Number of iterations. Omit or 0 for infinite: runs until nothing is left or evolve asks a human. |
| target | Directory to evolve. Defaults to current working directory. |
| --dry-run | Scan and triage only. Reports what it would do; makes no changes and creates no commits. |
| --goals | Interactive goal picker. Runs discovery, shows a numbered menu, persists selection to .evolve-goals, then grinds. |
| --resume | Continue a previous run. Reads EVOLUTION.log and .evolve-goals to pick up where the last session left off. |
| --focus <subdir> | Restrict scanning and changes to a subdirectory. Tests and git operations still run at the project root. |
| --only <mode> | Only act on one category: fix, clean, or upgrade. Skip all others. |
From simple to advanced:
/evolve 5 # 5 cycles on current directory
/evolve 10 ~/Desktop/codemap # 10 cycles on a specific project
/evolve ~/Desktop/my-project # run until final form (infinite)
/evolve --dry-run ~/Desktop/my-app # preview what it would do — no changes
/evolve --only clean 20 ~/Desktop/app # 20 cycles, cleanup only
/evolve --only fix ~/Desktop/app # grind through all fixable bugs, ignore the rest
/evolve --focus src/api 10 ~/Desktop/app # only evolve the API subdirectory
/evolve --goals ~/Desktop/my-project # pick goals from a menu, then grind
/evolve --resume ~/Desktop/my-project # continue where the last session left off
Flags compose. The skill documents these combined behaviors explicitly:
| Flags | Behavior |
|---|---|
| --goals --resume | Load .evolve-goals, skip discovery, continue grinding remaining goals |
| --dry-run --goals | Run discovery and show the menu, but don't start grinding |
| --dry-run --resume | Show what the next iteration would do based on current state |
| --focus --goals | Discovery phase only scans the focused subdirectory |
| --only fix --goals | Only show fix goals in the discovery menu |
| --only clean | Clean dead code only; skip bugs and upgrades entirely |
Every iteration, evolve triages the project and acts on the highest-priority category that has work. The order is fixed:
| Priority | Category | What it looks for | Commit prefix |
|---|---|---|---|
| 1 | Fix | Security issues, crashes, compile errors, wrong output, failing tests, linter errors | fix: |
| 2 | Clean | Dead files, unused functions, dead dependencies, unused imports, commented-out code blocks | clean: |
| 3 | Upgrade | Missing capabilities, better patterns, new features aligned with project purpose | upgrade: |
The rule: Fix first. Clean second. Upgrade last.
A project with bugs shouldn't get new features. A project with dead code shouldn't grow more code. Upgrades only land on top of a clean, working foundation. This ordering matters more than people expect — it's why evolve can run for 20+ iterations without producing the usual "add feature on top of broken test" chaos.
Within each category, evolve picks the single highest-impact item — the worst bug, the largest dead code block, the most useful missing feature — not the easiest one.
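The selection rule can be illustrated with a small sketch. The `Candidate` type and its numeric `impact` score are assumptions made for the example; the skill's real ranking is a judgment call, not a number.

```python
from dataclasses import dataclass
from typing import Optional

# Fixed category order from the table above: lower number wins.
PRIORITY = {"fix": 1, "clean": 2, "upgrade": 3}


@dataclass
class Candidate:
    category: str  # "fix", "clean", or "upgrade"
    summary: str
    impact: float  # hypothetical score, used only for this illustration


def pick_one(candidates: list) -> Optional[Candidate]:
    """Pick the highest-impact item from the highest-priority category with work."""
    if not candidates:
        return None  # nothing found: final form
    return min(candidates, key=lambda c: (PRIORITY[c.category], -c.impact))


found = [
    Candidate("upgrade", "add rate limiting", impact=9.0),
    Candidate("clean", "remove unused import", impact=1.0),
    Candidate("clean", "delete 12 dead functions", impact=6.0),
]
chosen = pick_one(found)
print(chosen.summary)  # delete 12 dead functions
```

Note that the high-impact upgrade loses to a lower-impact cleanup: category order is applied first, impact only breaks ties inside a category.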
In order of priority:
- Dead files — zero imports/references from the rest of the project
- Dead functions/classes — defined but never called
- Dead dependencies — in the manifest but never imported
- Unused imports — imported but never referenced
- Commented-out code blocks (>3 lines of actual code)
Not waste: explanatory comments, test fixtures, type contracts, feature flags.
Evolve verifies before removing — greps for all references (including strings, configs, dynamic imports, CLI entry points, reflection, metaprogramming). If there's any doubt, the candidate is skipped.
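A simplified sketch of that reference check, assuming a Python project and plain-text matching (the function name `find_references` is hypothetical). Text matching cannot see reflection or computed names, which is one reason doubtful candidates are skipped rather than removed.

```python
import re


def find_references(name: str, files: dict) -> list:
    """Return (path, line number) for every mention of `name` except its definition.

    `files` maps a path to that file's text. Matching is whole-word and
    textual, so strings, configs, and comments count as references too.
    """
    mention = re.compile(rf"\b{re.escape(name)}\b")
    definition = re.compile(rf"^\s*(def|class)\s+{re.escape(name)}\b")
    hits = []
    for path, text in files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            if mention.search(line) and not definition.match(line):
                hits.append((path, lineno))
    return hits


project = {
    "utils.py": "def old_helper():\n    return 1\n\ndef used():\n    return 2\n",
    "main.py": "from utils import used\nprint(used())\n",
    "config.yaml": "entry: old_helper\n",
}
print(find_references("old_helper", project))  # [('config.yaml', 1)]
```

Here `old_helper` is never called from Python code, but a config file still names it, so it would not be treated as dead.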
Research-informed, on-mission additions. Evolve uses WebSearch (and context7/wiki/claude-mem when relevant — see Integrations) to find what similar tools do that this one doesn't. It picks the highest-impact gap.
Scope guard: Upgrades must align with the project's existing purpose. Don't add a web UI to a CLI tool. Don't add AI features to a logging library. Extend what the project already does — don't pivot it. If an upgrade doesn't clearly fit, evolve skips it.
Evolve executes 11 steps per cycle:
Test → Codemap → Triage → Research → Pick ONE → Safety → Implement → Validate → Verify → Log+Commit → Report
↓ nothing found → final form → end report + push + release
↓ risky → STOP and ask human
↓ verify fails → revert + retry (max 2)
1. Test. Runs the project's test suite and captures output, errors, warnings. If there's no test suite, falls back to the build step, CLI --help, library parse/import check, or skips to step 2 for config/docs projects. Also runs linters when available: cargo clippy, eslint, ruff / flake8, go vet. Linter errors count as bugs; warnings as upgrade candidates.
2. Deep scan with codemap. If codemap is installed (see Integrations), runs dead-functions, orphan-files, complexity, hotspots, and unreachable analyses. Without codemap, falls back to Grep/Glob for unused exports, unreferenced files, and TODO/FIXME/HACK markers. Also checks for outdated dependencies (npm outdated, pip list --outdated, cargo outdated, go list -m -u all). Minor/patch bumps become upgrade candidates; major bumps get flagged for the safety check.
Large codebases (>500 files): Evolve dispatches parallel subagents via the Agent tool — one for dead code, one for complexity, one for TODO markers — and merges findings in step 3. Much faster than sequential scanning.
3. Triage. Combines test + lint + codemap findings and runs a security sweep via Grep: hardcoded credentials (password\s*=\s*['"]), known security debt (TODO.*security|FIXME.*auth|HACK.*token), and taint analysis if codemap supports it. Categorizes everything as Security/Bug → Fix, Waste → Clean, Gap → Upgrade. Acts on the highest-priority category with items. Security issues are the top Fix priority.
4. Research. WebSearch for upgrade patterns, error messages, or framework best practices. WebFetch for docs, changelogs, examples. When applicable, queries wiki (AI/ML projects), context7 (framework docs), and claude-mem (past session context) — see Integrations.
5. Pick ONE. The single highest-impact item from the active category. Evolve states what it's doing and why in one sentence.
6. Safety check. Classifies the risk:
- High risk (could break production, data loss, irreversible) → STOP and ask human.
- Medium risk (might break tests, uncertain scope) → test in an isolated worktree created with git worktree add. Merge back on success, discard on failure.
- Low risk → proceed directly.
If --dry-run, prints the picked item and source, then skips to the next iteration without modifying anything.
7. Implement. One focused change, minimal blast radius. No unrelated refactors, no surrounding-code improvements — just the one thing.
8. Post-change codemap validation. If codemap is available, runs blast-radius, complexity, and dead-functions on the changed files. Reverts if blast radius is unexpectedly large (>20% of codebase), complexity spiked, or the change created new dead code.
9. Verify. Re-runs the step-1 test. The change must be visible with no regressions. If verification fails, the working tree is reverted with git checkout -- . and the candidate is retried with a different approach (max 2 attempts). After two failures, it's skipped and the next candidate is picked.
10. Log + commit. Appends to EVOLUTION.log and commits in a single Bash call (the skill explicitly fuses these so there's no stale log-without-commit state). Commit message body includes source and iteration number — git log stays self-documenting even without the log file.
11. Report. Prints one line: [N/total] <prefix>: <what> — verified on <target>. Updates a visual task tracker via TaskCreate / TaskUpdate. Then loops.
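The Grep-based security sweep from step 3 can be approximated in a few lines. This sketch uses the two patterns quoted in that step; the function name and the finding labels are assumptions made for the example.

```python
import re

# The two patterns quoted in step 3 (both are case-sensitive as written).
CREDENTIAL = re.compile(r"""password\s*=\s*['"]""")
SECURITY_DEBT = re.compile(r"TODO.*security|FIXME.*auth|HACK.*token")


def security_sweep(path: str, text: str) -> list:
    """Return (path, line number, label) for each line matching a pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if CREDENTIAL.search(line):
            findings.append((path, lineno, "hardcoded credential"))
        if SECURITY_DEBT.search(line):
            findings.append((path, lineno, "security debt marker"))
    return findings


sample = 'password = "hunter2"\n# TODO: review security of session cookies\nx = 1\n'
findings = security_sweep("app.py", sample)
print(findings)
```

Anything this sweep reports would land in the Fix category, ahead of ordinary bugs.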
- Push every 20 iterations: git push origin HEAD after step 11 when iteration_count % 20 == 0. Silent no-op if no remote.
- Always push on completion: all remaining commits get pushed when the run ends.
- GitHub release on completion: if gh is installed and there's a remote, evolve creates a release titled "Evolution: <N> changes applied" with EVOLUTION.log entries as notes. Skipped if fewer than 3 changes.
- iMessage notification: if iMessage MCP tools are available and a self-chat is configured, sends a completion summary via reply. Never messages other contacts.
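The push and release conditions above reduce to two small predicates. This is a sketch with hypothetical function names, not code from the skill.

```python
def should_push(iteration_count: int, has_remote: bool, run_finished: bool) -> bool:
    """Push on every 20th iteration and once more when the run ends."""
    if not has_remote:
        return False  # silent no-op without a remote
    if run_finished:
        return True
    return iteration_count > 0 and iteration_count % 20 == 0


def should_release(changes_applied: int, gh_installed: bool, has_remote: bool) -> bool:
    """Create a GitHub release only when all documented conditions hold."""
    return gh_installed and has_remote and changes_applied >= 3


print(should_push(20, has_remote=True, run_finished=False))     # True
print(should_push(7, has_remote=True, run_finished=False))      # False
print(should_release(2, gh_installed=True, has_remote=True))    # False
```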
Evolve is autonomous. The safety model is what makes that acceptable.
Before the first iteration, evolve verifies:
- Target exists and is a git repo with at least one commit. Refuses otherwise (evolve needs git for commits, reverts, and push).
- Working tree is clean: git status --porcelain must be empty. Refuses to start on a dirty tree to avoid mixing evolve's commits with the user's unfinished work. Tells the user to commit or stash first.
- Warns but proceeds if the branch has an open PR (changes will land on the PR) or is behind the remote (suggests git pull first).
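The two blocking checks can be reproduced outside evolve with standard git commands. A minimal sketch, assuming git is on the PATH; the helper names and the wording of the messages are hypothetical.

```python
import subprocess
from pathlib import Path


def git(target: Path, *args: str) -> subprocess.CompletedProcess:
    """Run a git command inside `target` and capture its output."""
    return subprocess.run(
        ["git", "-C", str(target), *args], capture_output=True, text=True
    )


def preflight(target: Path) -> list:
    """Return blocking problems; an empty list means it is safe to start."""
    if not target.is_dir():
        return [f"target does not exist: {target}"]
    # `rev-parse --verify HEAD` fails outside a repo and in a repo with no commits.
    if git(target, "rev-parse", "--verify", "HEAD").returncode != 0:
        return ["not a git repo with at least one commit"]
    # Any output from `status --porcelain` means the tree is dirty.
    if git(target, "status", "--porcelain").stdout.strip():
        return ["working tree has uncommitted changes; commit or stash first"]
    return []


problems = preflight(Path("/nonexistent/evolve-demo"))
print(problems)
```

The warning-only checks (open PR, branch behind remote) are left out of the sketch because they do not block a run.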
If the target project has a CLAUDE.md, evolve reads it first and respects everything in it — build commands, conventions, constraints, "don't touch this module" instructions. Project-specific rules override evolve's defaults.
Evolve only modifies files inside the target directory. Hard rule, no exceptions:
- No edits to ~/.claude/, settings.json, plugins, build systems, or anything the user's other tools depend on.
- No system-level installs (brew, apt, pip install --global, cargo install). Project-local manifest additions (npm/cargo/pip dependencies) are fine.
- No modifications to other projects, even if they depend on this one.
- If a change would require reaching outside the target, it's skipped and the next candidate is picked.
- New files are fine — inside the target only.
- The project's CLI/output contract can be extended (new flags, new actions) but existing behavior is never broken.
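One way to express the containment rule is a path check that resolves `..` segments and symlinks before comparing. This is an illustrative sketch; `is_inside_target` is a hypothetical name and the README does not say how the skill enforces the rule.

```python
from pathlib import Path


def is_inside_target(candidate: str, target: str) -> bool:
    """True if `candidate` resolves to the target directory or a path below it."""
    target_path = Path(target).resolve()
    # Relative candidates are interpreted relative to the target;
    # absolute candidates replace it, as pathlib's `/` operator defines.
    candidate_path = (target_path / candidate).resolve()
    try:
        candidate_path.relative_to(target_path)
    except ValueError:
        return False
    return True


print(is_inside_target("src/api/app.py", "/srv/demo-project"))    # True
print(is_inside_target("../other/file.py", "/srv/demo-project"))  # False
print(is_inside_target("/etc/hosts", "/srv/demo-project"))        # False
```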
Each iteration produces exactly one commit with one change. Small, verifiable, reversible. This is what makes individual reverts safe — no entangled multi-change commits.
If verification fails:
- git checkout -- . restores the working tree.
- Retry with a different approach (max 2 attempts).
- If both attempts fail, the candidate is skipped and the loop picks the next item.
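The revert-and-retry rule can be sketched as a small loop. The callables here are stand-ins for illustration: in a real run `verify` would re-run the test suite and `revert` would run git checkout -- . in the target.

```python
from itertools import islice

MAX_ATTEMPTS = 2  # limit stated above


def apply_with_retry(approaches, verify, revert) -> bool:
    """Try at most MAX_ATTEMPTS approaches, reverting after each failure."""
    for implement in islice(approaches, MAX_ATTEMPTS):
        implement()
        if verify():
            return True  # caller proceeds to log + commit
        revert()
    return False  # candidate is skipped


# Toy demonstration with an in-memory "working tree".
tree = {"value": 0, "reverts": 0}


def broken_fix() -> None:
    tree["value"] = -1


def working_fix() -> None:
    tree["value"] = 1


def tests_pass() -> bool:
    return tree["value"] == 1


def revert_tree() -> None:
    tree["value"] = 0
    tree["reverts"] += 1


first_result = apply_with_retry([broken_fix, working_fix], tests_pass, revert_tree)
print(first_result, tree["reverts"])  # True 1
```

A third approach is never tried: if the first two fail, the function returns False and the loop moves to the next candidate.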
Medium-risk changes run in a git worktree (git worktree add) before merging — if verification fails in the worktree, the main branch never saw the change.
The skill has explicit criteria for "stop, don't invent busywork." After each triage, evolve asks whether the next best action is actually worth doing. It stops and reports final form if any of these are true:
- No bugs, no dead code, and no upgrades that align with the project's purpose.
- The remaining candidates are all cosmetic (renaming, rewording, reformatting).
- The last 3 iterations were all in the same category with diminishing impact.
- Evolve is tempted to add features the project doesn't need just to keep going.
- The only upgrades left would increase complexity without clear user value.
If you have to justify why a change matters, it probably doesn't. That's in the skill verbatim.
Instead of letting evolve pick what to work on, pass --goals and choose from a menu.
Evolve runs steps 1–4 (Test, Codemap, Triage, Research) but collects all findings across all categories instead of picking one. It presents them as a numbered menu:
=== Evolution Goals ===
Project: ~/Desktop/my-project
Fixes available:
1. [fix] Auth middleware crashes on expired JWT tokens
2. [fix] Race condition in WebSocket reconnect
Cleanup available:
3. [clean] 12 dead functions in utils/ (codemap)
4. [clean] 3 unused dependencies in package.json
Upgrades possible:
5. [upgrade] Add rate limiting to API endpoints
6. [upgrade] Migrate from CommonJS to ESM
7. [upgrade] Add OpenTelemetry tracing
8. [upgrade] Connection pooling for database
Pick goals (comma-separated, or 'all'): _
The selection is written to .evolve-goals in the target directory and committed with the first change. This enables --resume to continue across sessions.
Evolve works through goals in priority order — fixes first, then clean, then upgrades. Each goal may take multiple iterations. .evolve-goals is updated as each goal completes (pending → in-progress → done / skipped). The triage step is narrowed: it only considers items that serve selected goals and ignores other findings. If a goal fails twice, it's marked [skipped] and evolve moves on.
Progress reporting:
[3/∞] upgrade: add rate limiting to /api/users — 1 of 3 goals complete
Pass --resume to continue a previous run. Evolve:
- Reads EVOLUTION.log in the target directory and parses entries to understand what's already been applied.
- Loads .evolve-goals if present and continues in goal-directed mode with remaining [pending] items.
- Counts completed iterations from the log and adjusts the remaining count accordingly.
- Prints a summary of prior progress, then continues from the next iteration.
If no EVOLUTION.log exists, --resume behaves identically to a fresh run.
# Session 1 — crashes or gets interrupted after 7 iterations
/evolve 20 ~/Desktop/my-project
# ... ctrl-C or session ends ...
# Session 2 — pick up where you left off
/evolve --resume ~/Desktop/my-project
# reads log, continues from iteration 8

Combined with --goals:
# Session 1 — pick goals, start grinding
/evolve --goals ~/Desktop/my-project
# ... crash at iteration 5 of 10 goals ...
# Session 2 — resume with the same goal list
/evolve --goals --resume ~/Desktop/my-project
# skips discovery, loads .evolve-goals, continues

Evolve uses several optional tools when available and degrades gracefully when they're not.
| Tool | Purpose | Detection | Fallback |
|---|---|---|---|
| codemap | AST-level dead code, orphan files, complexity, blast radius, unreachable paths | which codemap | Grep/Glob for unused exports, unreferenced files, TODO/FIXME |
| WebSearch / WebFetch | Research upgrades, best practices, error messages | Built-in | None — upgrades require research |
| context7 MCP | Framework docs for React, Next.js, Express, Django, Tailwind, Prisma, etc. | MCP server available | Skip — falls back to WebSearch |
| wiki (pgwiki hybrid_search) | AI/ML techniques, papers, architectural decisions | MCP server available | Skip — AI/ML projects only, not general code |
| claude-mem | Past-session context for this project | MCP server available | Skip — no historical context |
| Agent tool | Parallel subagent scanning for >500-file projects | Built-in | Sequential scanning |
| TaskCreate / TaskUpdate | Visual progress tracking | Built-in | None — silent progress |
| git worktree | Isolated testing for medium-risk changes | git worktree add | Direct application with revert-on-failure |
| gh CLI | GitHub release on completion | which gh | Skip release, still pushes commits |
| iMessage MCP | Completion notification for long runs | MCP server + self-chat configured | Skip notification |
Codemap integration specifically — when present, evolve uses it in three places:
- Step 2 (Deep Scan): runs dead-functions, orphan-files, complexity, hotspots, unreachable to populate triage with structural issues tests won't catch.
- Step 3 (Triage): if taint-analysis is supported, uses it to find unsanitized inputs reaching sensitive sinks.
- Step 8 (Post-Change Validation): runs blast-radius, complexity, dead-functions on changed files. Reverts if blast radius >20%, complexity spiked, or new dead code appeared.
Evolve writes two files into the target repo.
Append-only log of every change applied. One block per iteration:
[2026-04-23 14:45] fix: reorder Steps 10-11 so EVOLUTION.log is written before commit
source: manual review — Step 10 committed before Step 11 wrote the log, so log entries were never included in the commit they described
[2026-04-23 14:50] upgrade: add batch push (every 20 iterations) and GitHub release on completion
source: user feedback — pushing every iteration spams GitHub, batch at 20 with release summary
[2026-04-23 15:02] clean: remove 47 lines — DOT graph and verbose bash examples that LLM doesn't need
source: 361 lines was too long — DOT doesn't render in Claude Code, bash syntax is known
Each entry: timestamp, prefix (fix/clean/upgrade), description, and source (what surfaced the change — test output, codemap finding, web research, wiki, claude-mem, user feedback). The log is what --resume reads to reconstruct prior progress.
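For scripting against the log, a parser along these lines should be enough. It is a sketch that assumes the entry layout shown above (a bracketed timestamp line followed by a source line); the function names are hypothetical and the sample entries are shortened.

```python
import re

ENTRY = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+(?P<prefix>fix|clean|upgrade):\s+(?P<what>.+)$"
)


def parse_log(text: str) -> list:
    """Parse EVOLUTION.log text into a list of entry dictionaries."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        match = ENTRY.match(line)
        if match:
            entries.append({**match.groupdict(), "source": ""})
        elif line.startswith("source:") and entries:
            entries[-1]["source"] = line[len("source:"):].strip()
    return entries


def next_iteration(entries: list) -> int:
    """One logged entry per completed iteration, so the next one is len + 1."""
    return len(entries) + 1


sample_log = """\
[2026-04-23 14:45] fix: reorder Steps 10-11 so EVOLUTION.log is written before commit
  source: manual review
[2026-04-23 14:50] upgrade: add batch push (every 20 iterations) and GitHub release on completion
  source: user feedback
"""
entries = parse_log(sample_log)
print(len(entries), next_iteration(entries))  # 2 3
```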
Created by --goals mode. Tracks selected goals and their status:
# .evolve-goals — auto-generated by /evolve --goals
# Status: pending | in-progress | done | skipped
[pending] fix: Auth middleware crashes on expired JWT tokens
[pending] clean: 12 dead functions in utils/
[pending] upgrade: Add rate limiting to API endpoints
Committed with the first change. Updated as goals progress. Read by --resume to continue goal-directed runs.
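The file format above is simple enough to read with a short parser. This sketch assumes one goal per line in the shape shown, and it treats in-progress goals as still open on resume, which is an assumption rather than documented behavior.

```python
import re

GOAL = re.compile(
    r"^\[(pending|in-progress|done|skipped)\]\s+(fix|clean|upgrade):\s+(.+)$"
)
ORDER = {"fix": 1, "clean": 2, "upgrade": 3}  # fixes first, upgrades last


def parse_goals(text: str) -> list:
    """Parse .evolve-goals text, ignoring comments and blank lines."""
    goals = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        match = GOAL.match(line)
        if match:
            status, category, summary = match.groups()
            goals.append({"status": status, "category": category, "summary": summary})
    return goals


def next_goal(goals: list):
    """Return the open goal with the highest-priority category, or None."""
    open_goals = [g for g in goals if g["status"] in ("pending", "in-progress")]
    return min(open_goals, key=lambda g: ORDER[g["category"]], default=None)


sample_goals = """\
# .evolve-goals
# Status: pending | in-progress | done | skipped
[done] fix: Auth middleware crashes on expired JWT tokens
[pending] upgrade: Add rate limiting to API endpoints
[pending] clean: 12 dead functions in utils/
"""
goals = parse_goals(sample_goals)
print(next_goal(goals)["summary"])  # 12 dead functions in utils/
```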
Every commit from evolve looks like this:
upgrade: add rate limiting to /api/users
Source: WebSearch — Express rate-limit best practices + codemap hotspot
Iteration: 3 of 10
Prefix (fix:/clean:/upgrade:) matches the priority category. The body records source and iteration so git log is self-documenting even without EVOLUTION.log.
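To generate or validate messages in this shape from other tooling, a formatter could look like the sketch below. The function name is hypothetical, and the blank line between subject and body follows general git convention rather than anything stated above.

```python
VALID_PREFIXES = ("fix", "clean", "upgrade")


def format_commit(prefix: str, what: str, source: str, iteration: int, total: int) -> str:
    """Build a commit message with a prefixed subject and a Source/Iteration body."""
    if prefix not in VALID_PREFIXES:
        raise ValueError(f"unknown prefix: {prefix}")
    subject = f"{prefix}: {what}"
    body = f"Source: {source}\nIteration: {iteration} of {total}"
    return f"{subject}\n\n{body}"


message = format_commit(
    "upgrade",
    "add rate limiting to /api/users",
    "WebSearch: Express rate-limit best practices + codemap hotspot",
    iteration=3,
    total=10,
)
print(message)
```

The sketch requires a finite total; how the iteration line is written for infinite runs is not specified here, so that case is left out.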
| Symptom | Cause | Fix |
|---|---|---|
| /evolve not appearing as a command | Plugin not enabled | Check enabledPlugins in ~/.claude/settings.json, restart Claude Code |
| Plugin not detected at all | Missing marketplace entry | Check extraKnownMarketplaces in ~/.claude/settings.json |
| Stale behavior after update | Skills cached at session start | Restart Claude Code |
| "Refusing to start — working tree has uncommitted changes" | Pre-flight check: dirty tree | git commit or git stash before running evolve |
| "Refusing to start — not a git repo" | Target isn't initialized for git | git init and make at least one commit first |
| Iteration reverts repeatedly | Verification failing twice in a row | Evolve skips the candidate after 2 failures and moves on — normal behavior |
| Evolve stops early with "final form" | No meaningful work left, or remaining candidates are cosmetic / busywork | Intentional — evolve self-terminates rather than invent work |
| No push happening | No git remote configured | Silent no-op — add a remote or ignore |
| No GitHub release created | gh not installed, no remote, or fewer than 3 changes | All three are the documented skip conditions |
| No codemap step running | codemap not installed | Optional; falls back to Grep/Glob scanning |
| Dry-run still seems to do work | Dry-run only skips steps 7–11 | Steps 1–6 (test, scan, triage, research, pick, safety) still run to produce the report, with no writes |
| --focus <subdir> tests still run at project root | By design: tests/git operate at project root, only scanning/changes are scoped | Working as documented |
| install.sh --check shows "BROKEN" | Missing required files | Re-run the installer |
Evolve is part of a family of Claude Code plugins by the same author:
| Plugin | What it does |
|---|---|
| auditor | Audits a codebase from 8 expert perspectives in parallel. Produces a report with numbered findings; fix #3 implements a fix. Great as a first pass before running evolve. |
| codemap | AST-powered structural analysis — dead functions, orphan files, complexity, call graphs, blast radius. Evolve uses codemap internally when available. |
| second-opinion | /so — gets a second opinion on a plan from 5 dynamically chosen expert reviewers before you commit to it. |
Typical flow: /audit to survey → fix all critical to handle emergencies → /evolve to grind through the long tail → /so before any big architectural commit.
MIT — see LICENSE.