Evolve

A Claude Code plugin that evolves any project toward its final form. One command picks the single highest-impact change the project needs right now — a bug to fix, dead code to remove, or a capability to add — implements it, verifies it, commits it, and loops. It stops when there's nothing meaningful left to do.

Autonomous. One change per cycle. Reverts cleanly on failure. Stays inside the target directory.

Version: 2.4.0 | License: MIT

Why Evolve?

"Refactor this codebase" is too vague. Linters find style issues. Test runners report failures. Dependabot bumps versions. None of those are the same as asking: what is the single most valuable change this project needs next?

Evolve answers that question on every iteration. It triages the project across three categories (Fix, Clean, Upgrade), picks the top item from the highest-priority category that has work, implements it, verifies that the tests pass, commits it, and loops. Over 20 iterations you get 20 surgical, independently revertible commits with a clear source trail for each.

What separates it from a linter or refactor tool:

It decides, then acts. A linter tells you what's wrong. Evolve picks one thing, fixes it, verifies it, and commits it — every cycle.
Priority-ordered, not batch. Bugs block new features. Dead code blocks new code. Evolve enforces that order. You never get an "add feature" commit on top of a failing test.
Verifies every change. Each iteration re-runs the test suite after the change. If verification fails, the working tree is reverted with git checkout -- . and the candidate is retried with a different approach (max 2 attempts) or skipped.
Self-terminating. Evolve stops when the remaining candidates are cosmetic, busywork, or diminishing-returns. A project in final form gets a "nothing left to do" report, not 50 cycles of pointless renames.
Containment guarantee. Files outside the target directory are never touched. No global installs, no settings changes, no cross-project edits. If a change would require reaching outside, it's skipped.
Source-annotated history. Every commit body records what surfaced the change (test failure, codemap finding, web research, wiki, claude-mem). git log is self-documenting.

Installation

One-line install

curl -fsSL https://raw.githubusercontent.com/charleschenai/evolve/main/install.sh | bash

The installer clones the repo into ~/.claude/plugins/marketplaces/loops, verifies the file structure, clears stale plugin cache, and adds the required entries to ~/.claude/settings.json. Requires git and python3 (for non-destructive settings merge).

The marketplace directory is named loops (the plugin name); the repo is evolve. The user-facing command is /evolve.

Manual install

Clone into Claude Code's plugin directory:

git clone https://github.com/charleschenai/evolve.git \
  ~/.claude/plugins/marketplaces/loops

Add to ~/.claude/settings.json:

{
  "enabledPlugins": {
    "loops@loops": true
  },
  "extraKnownMarketplaces": {
    "loops": {
      "source": {
        "source": "directory",
        "path": "/home/YOUR_USER/.claude/plugins/marketplaces/loops"
      }
    }
  }
}

Replace /home/YOUR_USER with your actual home directory path.

Restart Claude Code. Skills are cached at session start.
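The installer is described above as using python3 for a non-destructive settings merge. A minimal sketch of what such a merge could look like is given here for readers editing settings.json by script. The function name merge_plugin_settings is hypothetical, and this is not the code from install.sh; only the two JSON entries come from the manual install step.

```python
import json


def merge_plugin_settings(settings: dict, marketplace_path: str) -> dict:
    """Return a copy of `settings` with the loops plugin registered.

    Hypothetical helper: existing keys are preserved, and only the two
    entries from the manual install step are added or overwritten.
    """
    merged = json.loads(json.dumps(settings))  # deep copy via JSON round-trip
    merged.setdefault("enabledPlugins", {})["loops@loops"] = True
    merged.setdefault("extraKnownMarketplaces", {})["loops"] = {
        "source": {"source": "directory", "path": marketplace_path}
    }
    return merged
```

To use it, load ~/.claude/settings.json with json.load, pass the result and your absolute marketplace path to the function, and write the returned dictionary back with json.dump.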

Verify

bash ~/.claude/plugins/marketplaces/loops/install.sh --check

Checks that required files are present, cache is clean, and settings.json has the plugin registered.

Update

cd ~/.claude/plugins/marketplaces/loops && git pull

Then restart Claude Code.

Uninstall

bash ~/.claude/plugins/marketplaces/loops/install.sh --uninstall

Removes the plugin directory and cache. Leaves settings.json entries in place (harmless) — remove manually if desired.

How to Use It

Syntax

/evolve [count] [target] [--dry-run] [--goals] [--resume] [--focus <subdir>] [--only <mode>]

Argument	Description
`count`	Number of iterations. Omit or pass `0` to run indefinitely: evolve continues until nothing is left to do or it stops to ask a human.
`target`	Directory to evolve. Defaults to current working directory.
`--dry-run`	Scan and triage only. Reports what it would do; makes no changes and creates no commits.
`--goals`	Interactive goal picker. Runs discovery, shows a numbered menu, persists selection to `.evolve-goals`, then grinds.
`--resume`	Continue a previous run. Reads `EVOLUTION.log` and `.evolve-goals` to pick up where the last session left off.
`--focus <subdir>`	Restrict scanning and changes to a subdirectory. Tests and git operations still run at the project root.
`--only <mode>`	Only act on one category: `fix`, `clean`, or `upgrade`. Skip all others.
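To make the argument rules in the table concrete, here is a rough sketch of how the tokens after /evolve could be interpreted. EvolveArgs and parse_evolve_args are hypothetical names, and treating any bare integer as the count is an assumption; how the skill itself reads its arguments may differ.

```python
from dataclasses import dataclass
from typing import List, Optional

VALID_MODES = ("fix", "clean", "upgrade")


@dataclass
class EvolveArgs:
    count: int = 0                # 0 means run until nothing is left
    target: str = "."             # defaults to the current directory
    dry_run: bool = False
    goals: bool = False
    resume: bool = False
    focus: Optional[str] = None
    only: Optional[str] = None


def parse_evolve_args(tokens: List[str]) -> EvolveArgs:
    """Hypothetical parser for the argument list that follows /evolve."""
    args = EvolveArgs()
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in ("--dry-run", "--goals", "--resume"):
            setattr(args, tok[2:].replace("-", "_"), True)
        elif tok in ("--focus", "--only"):
            if i + 1 >= len(tokens):
                raise ValueError(f"{tok} requires a value")
            i += 1
            setattr(args, tok[2:], tokens[i])
        elif tok.startswith("--"):
            raise ValueError(f"unknown flag: {tok}")
        elif tok.isdigit():
            args.count = int(tok)   # assumption: a bare integer is the count
        else:
            args.target = tok       # anything else is treated as the target
        i += 1
    if args.only is not None and args.only not in VALID_MODES:
        raise ValueError(f"--only must be one of {VALID_MODES}")
    return args
```

One consequence of this simplification is that a target directory whose name is purely numeric would be read as the count.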

Examples

From simple to advanced:

/evolve 5                              # 5 cycles on current directory
/evolve 10 ~/Desktop/codemap           # 10 cycles on a specific project
/evolve ~/Desktop/my-project           # run until final form (infinite)

/evolve --dry-run ~/Desktop/my-app     # preview what it would do — no changes
/evolve --only clean 20 ~/Desktop/app  # 20 cycles, cleanup only
/evolve --only fix ~/Desktop/app       # grind through all fixable bugs, ignore the rest
/evolve --focus src/api 10 ~/Desktop/app  # only evolve the API subdirectory

/evolve --goals ~/Desktop/my-project   # pick goals from a menu, then grind
/evolve --resume ~/Desktop/my-project  # continue where the last session left off

Flag combinations

Flags compose. The skill documents these combined behaviors explicitly:

Flags	Behavior
`--goals --resume`	Load `.evolve-goals`, skip discovery, continue grinding remaining goals
`--dry-run --goals`	Run discovery and show the menu, but don't start grinding
`--dry-run --resume`	Show what the next iteration would do based on current state
`--focus --goals`	Discovery phase only scans the focused subdirectory
`--only fix --goals`	Only show fix goals in the discovery menu
`--only clean`	Clean dead code only — skip bugs and upgrades entirely

The Priority Model

Every iteration, evolve triages the project and acts on the highest-priority category that has work. The order is fixed:

Priority	Category	What it looks for	Commit prefix
1	Fix	Security issues, crashes, compile errors, wrong output, failing tests, linter errors	`fix:`
2	Clean	Dead files, unused functions, dead dependencies, unused imports, commented-out code blocks	`clean:`
3	Upgrade	Missing capabilities, better patterns, new features aligned with project purpose	`upgrade:`

The rule: Fix first. Clean second. Upgrade last.

A project with bugs shouldn't get new features. A project with dead code shouldn't grow more code. Upgrades only land on top of a clean, working foundation. This ordering matters more than people expect — it's why evolve can run for 20+ iterations without producing the usual "add feature on top of broken test" chaos.

Within each category, evolve picks the single highest-impact item — the worst bug, the largest dead code block, the most useful missing feature — not the easiest one.
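The selection rule can be sketched in a few lines. This is an illustration only: the Candidate type, the numeric impact score, and the pick_one function are hypothetical, since the README does not say how impact is measured.

```python
from dataclasses import dataclass
from typing import List, Optional

PRIORITY = ["fix", "clean", "upgrade"]  # fixed order from the table above


@dataclass
class Candidate:
    category: str       # "fix", "clean", or "upgrade"
    description: str
    impact: float       # hypothetical score; higher means more valuable


def pick_one(candidates: List[Candidate], only: Optional[str] = None) -> Optional[Candidate]:
    """Return the highest-impact item from the first category that has work."""
    categories = [only] if only else PRIORITY
    for category in categories:
        in_category = [c for c in candidates if c.category == category]
        if in_category:
            return max(in_category, key=lambda c: c.impact)
    return None  # nothing to do in any allowed category
```

Note that a low-impact fix still wins over a high-impact upgrade, because the category order is checked before impact.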

What counts as waste (Clean mode)

In order of priority:

Dead files — zero imports/references from the rest of the project
Dead functions/classes — defined but never called
Dead dependencies — in the manifest but never imported
Unused imports — imported but never referenced
Commented-out code blocks (>3 lines of actual code)

Not waste: explanatory comments, test fixtures, type contracts, feature flags.

Evolve verifies before removing — greps for all references (including strings, configs, dynamic imports, CLI entry points, reflection, metaprogramming). If there's any doubt, the candidate is skipped.
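A plain-text reference check of this kind might look like the following sketch. The names find_references and is_safe_to_remove are hypothetical, and the input is an in-memory mapping of path to file contents to keep the example self-contained. It only catches literal mentions of a name, so it covers strings and config entries but not reflection or names built at runtime.

```python
import re
from typing import Dict, List, Tuple


def find_references(name: str, files: Dict[str, str]) -> List[Tuple[str, int]]:
    """Return (path, line number) for every whole-word mention of `name`."""
    pattern = re.compile(r"\b" + re.escape(name) + r"\b")
    hits = []
    for path, text in files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((path, lineno))
    return hits


def is_safe_to_remove(name: str, files: Dict[str, str], definition: Tuple[str, int]) -> bool:
    """True only when the definition itself is the sole mention of `name`."""
    return all(hit == definition for hit in find_references(name, files))
```

Because the match is textual, a mention inside a config file or a string literal is enough to block removal, which matches the "if there's any doubt, skip" rule.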

What counts as an upgrade

Research-informed, on-mission additions. Evolve uses WebSearch (and context7/wiki/claude-mem when relevant — see Integrations) to find what similar tools do that this one doesn't. It picks the highest-impact gap.

Scope guard: Upgrades must align with the project's existing purpose. Don't add a web UI to a CLI tool. Don't add AI features to a logging library. Extend what the project already does — don't pivot it. If an upgrade doesn't clearly fit, evolve skips it.

The Loop — What Each Iteration Does

Evolve executes 11 steps per cycle:

Test → Codemap → Triage → Research → Pick ONE → Safety → Implement → Validate → Verify → Log+Commit → Report
                    ↓ nothing found → final form → end report + push + release
                    ↓ risky → STOP and ask human
                    ↓ verify fails → revert + retry (max 2)

1. Test. Runs the project's test suite and captures output, errors, warnings. If there's no test suite, falls back to the build step, CLI --help, library parse/import check, or skips to step 2 for config/docs projects. Also runs linters when available: cargo clippy, eslint, ruff / flake8, go vet. Linter errors count as bugs; warnings as upgrade candidates.

2. Deep scan with codemap. If codemap is installed (see Integrations), runs dead-functions, orphan-files, complexity, hotspots, and unreachable analyses. Without codemap, falls back to Grep/Glob for unused exports, unreferenced files, and TODO/FIXME/HACK markers. Also checks for outdated dependencies (npm outdated, pip list --outdated, cargo outdated, go list -m -u all). Minor/patch bumps become upgrade candidates; major bumps get flagged for the safety check.

Large codebases (>500 files): Evolve dispatches parallel subagents via the Agent tool (one for dead code, one for complexity, one for TODO markers) and merges findings in step 3, so the scans run concurrently instead of one after another.

3. Triage. Combines test + lint + codemap findings and runs a security sweep via Grep: hardcoded credentials (password\s*=\s*['"]), known security debt (TODO.*security|FIXME.*auth|HACK.*token), and taint analysis if codemap supports it. Categorizes everything as Security/Bug → Fix, Waste → Clean, Gap → Upgrade. Acts on the highest-priority category with items. Security issues are the top Fix priority.

4. Research. WebSearch for upgrade patterns, error messages, or framework best practices. WebFetch for docs, changelogs, examples. When applicable, queries wiki (AI/ML projects), context7 (framework docs), and claude-mem (past session context) — see Integrations.

5. Pick ONE. The single highest-impact item from the active category. Evolve states what it's doing and why in one sentence.

6. Safety check. Classifies the risk:

High risk (could break production, data loss, irreversible) → STOP and ask human.
Medium risk (might break tests, uncertain scope) → test in an isolated git worktree (git worktree add). Merge back on success, discard on failure.
Low risk → proceed directly.

If --dry-run, prints the picked item and source, then skips to the next iteration without modifying anything.

7. Implement. One focused change, minimal blast radius. No unrelated refactors, no surrounding-code improvements — just the one thing.

8. Post-change codemap validation. If codemap is available, runs blast-radius, complexity, and dead-functions on the changed files. Reverts if blast radius is unexpectedly large (>20% of codebase), complexity spiked, or the change created new dead code.

9. Verify. Re-runs the step-1 test. The intended effect of the change must be observable, and nothing that passed before may regress. If verification fails, the working tree is reverted with git checkout -- . and the candidate is retried with a different approach (max 2 attempts). After two failures, it's skipped and the next candidate is picked.

10. Log + commit. Appends to EVOLUTION.log and commits in a single Bash call (the skill explicitly fuses these so there's no stale log-without-commit state). Commit message body includes source and iteration number — git log stays self-documenting even without the log file.

11. Report. Prints one line: [N/total] <prefix>: <what> — verified on <target>. Updates a visual task tracker via TaskCreate / TaskUpdate. Then loops.
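The security sweep in step 3 names two Grep patterns. The sketch below applies those same two patterns in Python to show what they match; the function name security_sweep, the finding labels, and the in-memory file mapping are assumptions made for illustration, and taint analysis is not covered.

```python
import re
from typing import Dict, List, Tuple

# The two patterns quoted in step 3.
CREDENTIAL = re.compile(r"password\s*=\s*['\"]")
SECURITY_DEBT = re.compile(r"TODO.*security|FIXME.*auth|HACK.*token")


def security_sweep(files: Dict[str, str]) -> List[Tuple[str, int, str]]:
    """Return (path, line number, label) for each line matching a pattern."""
    findings = []
    for path, text in files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            if CREDENTIAL.search(line):
                findings.append((path, lineno, "hardcoded-credential"))
            elif SECURITY_DEBT.search(line):
                findings.append((path, lineno, "security-debt"))
    return findings
```

Both patterns are case-sensitive as written, so a line such as PASSWORD = "x" would not be flagged by this sketch.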

Publishing behavior

Push every 20 iterations — git push origin HEAD after step 11 when iteration_count % 20 == 0. Silent no-op if no remote.
Always push on completion — all remaining commits get pushed when the run ends.
GitHub release on completion — if gh is installed and there's a remote, evolve creates a release titled "Evolution: <N> changes applied" with EVOLUTION.log entries as notes. Skipped if fewer than 3 changes.
iMessage notification — if iMessage MCP tools are available and a self-chat is configured, sends a completion summary via reply. Never messages other contacts.
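The push and release conditions above reduce to two small predicates. This sketch uses hypothetical function names; the batch size of 20 and the minimum of 3 changes come from the list above, while the guard against iteration 0 is an added assumption.

```python
def should_push_batch(iteration_count: int, has_remote: bool, batch: int = 20) -> bool:
    """Push after step 11 on every 20th iteration, and only if a remote exists."""
    # Assumption: iteration 0 never triggers a push even though 0 % 20 == 0.
    return has_remote and iteration_count > 0 and iteration_count % batch == 0


def should_create_release(changes_applied: int, has_remote: bool, gh_installed: bool) -> bool:
    """Create a GitHub release on completion unless a skip condition applies."""
    return gh_installed and has_remote and changes_applied >= 3
```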

Safety Rails

Evolve is autonomous. The safety model is what makes that acceptable.

Pre-flight checks

Before the first iteration, evolve verifies:

Target exists and is a git repo with at least one commit. Refuses otherwise (evolve needs git for commits, reverts, and push).
Working tree is clean — git status --porcelain must be empty. Refuses to start on a dirty tree to avoid mixing evolve's commits with the user's unfinished work. Tells the user to commit or stash first.
Warns but proceeds if the branch has an open PR (changes will land on the PR) or is behind the remote (suggests git pull first).
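The refusal logic in the first two checks can be sketched as a pure function. The name preflight_refusal and the message wording are hypothetical. The inputs are assumed to come from git rev-parse --is-inside-work-tree, git rev-parse --verify HEAD, and git status --porcelain, of which only the last is named in this README.

```python
from typing import Optional


def preflight_refusal(is_git_repo: bool, has_commit: bool, porcelain_output: str) -> Optional[str]:
    """Return a refusal message, or None when it is safe to start."""
    if not is_git_repo:
        return "Refusing to start: not a git repo"
    if not has_commit:
        return "Refusing to start: repo has no commits"
    if porcelain_output.strip():
        # Any output from `git status --porcelain` means the tree is dirty.
        return "Refusing to start: working tree has uncommitted changes"
    return None
```

Keeping the decision separate from the git calls makes the rule easy to check without a repository on disk.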

CLAUDE.md awareness

If the target project has a CLAUDE.md, evolve reads it first and respects everything in it — build commands, conventions, constraints, "don't touch this module" instructions. Project-specific rules override evolve's defaults.

Containment

Evolve only modifies files inside the target directory. Hard rule, no exceptions:

No edits to ~/.claude/, settings.json, plugins, build systems, or anything the user's other tools depend on.
No system-level installs (brew, apt, global pip installs, cargo install). Project-local manifest additions (npm/cargo/pip dependencies) are fine.
No modifications to other projects, even if they depend on this one.
If a change would require reaching outside the target, it's skipped and the next candidate is picked.
New files are fine — inside the target only.
The project's CLI/output contract can be extended (new flags, new actions) but existing behavior is never broken.

One change per cycle

Each iteration produces exactly one commit with one change. Small, verifiable, reversible. This is what makes individual reverts safe — no entangled multi-change commits.

Revert-on-failure

If verification fails:

git checkout -- . restores the working tree.
Retry with a different approach (max 2 attempts).
If both attempts fail, the candidate is skipped and the loop picks the next item.

Medium-risk changes run in a git worktree (git worktree add) before merging — if verification fails in the worktree, the main branch never saw the change.
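The revert-and-retry rule can be sketched as a small loop. Everything here is illustrative: attempt_candidate is a hypothetical name, and the apply, verify, and revert steps are passed in as callables so the sketch stays independent of git. In a real run, revert would correspond to git checkout -- . and verify to re-running the tests.

```python
from typing import Callable, Sequence


def attempt_candidate(
    approaches: Sequence[Callable[[], None]],
    verify: Callable[[], bool],
    revert: Callable[[], None],
    max_attempts: int = 2,
) -> bool:
    """Try up to `max_attempts` approaches; revert after each failed one.

    Returns True if some approach verified, False if the candidate
    should be skipped.
    """
    for apply_change in list(approaches)[:max_attempts]:
        apply_change()
        if verify():
            return True
        revert()  # restore the working tree before the next attempt
    return False
```

A False result leaves the tree in its reverted state, so the loop can move on to the next candidate without cleanup.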

Self-termination

The skill has explicit criteria for "stop, don't invent busywork." After each triage, evolve asks whether the next best action is actually worth doing. It stops and reports final form if any of these are true:

No bugs, no dead code, and no upgrades that align with the project's purpose.
The remaining candidates are all cosmetic (renaming, rewording, reformatting).
The last 3 iterations were all in the same category with diminishing impact.
Evolve is tempted to add features the project doesn't need just to keep going.
The only upgrades left would increase complexity without clear user value.

If you have to justify why a change matters, it probably doesn't. That's in the skill verbatim.
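Of the criteria above, the "last 3 iterations" rule is the most mechanical, and one possible reading of it is sketched here. The numeric impact score and the function name diminishing_returns are assumptions; the README does not define how impact is compared between iterations.

```python
from typing import List, Tuple


def diminishing_returns(history: List[Tuple[str, float]]) -> bool:
    """True when the last 3 iterations share a category and impact keeps falling.

    `history` holds (category, impact) pairs, oldest first. The impact
    score is a hypothetical stand-in for evolve's own judgement.
    """
    last_three = history[-3:]
    if len(last_three) < 3:
        return False
    categories = {category for category, _ in last_three}
    impacts = [impact for _, impact in last_three]
    return len(categories) == 1 and impacts[0] > impacts[1] > impacts[2]
```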

Goal-Directed Mode (`--goals`)

Instead of letting evolve pick what to work on, pass --goals and choose from a menu.

Phase 1: Discovery

Evolve runs steps 1–4 (Test, Codemap, Triage, Research) but collects all findings across all categories instead of picking one. It presents them as a numbered menu:

=== Evolution Goals ===
Project: ~/Desktop/my-project

Fixes available:
  1. [fix] Auth middleware crashes on expired JWT tokens
  2. [fix] Race condition in WebSocket reconnect

Cleanup available:
  3. [clean] 12 dead functions in utils/ (codemap)
  4. [clean] 3 unused dependencies in package.json

Upgrades possible:
  5. [upgrade] Add rate limiting to API endpoints
  6. [upgrade] Migrate from CommonJS to ESM
  7. [upgrade] Add OpenTelemetry tracing
  8. [upgrade] Connection pooling for database

Pick goals (comma-separated, or 'all'): _

Phase 2: Persist

The selection is written to .evolve-goals in the target directory and committed with the first change. This enables --resume to continue across sessions.

Phase 3: Grind

Evolve works through goals in priority order — fixes first, then clean, then upgrades. Each goal may take multiple iterations. .evolve-goals is updated as each goal completes (pending → in-progress → done / skipped). The triage step is narrowed: it only considers items that serve selected goals and ignores other findings. If a goal fails twice, it's marked [skipped] and evolve moves on.

Progress reporting:

[3/∞] upgrade: add rate limiting to /api/users — 1 of 3 goals complete

Resume Mode (`--resume`)

Pass --resume to continue a previous run. Evolve:

Reads EVOLUTION.log in the target directory and parses entries to understand what's already been applied.
Loads .evolve-goals if present and continues in goal-directed mode with remaining [pending] items.
Counts completed iterations from the log and adjusts the remaining count accordingly.
Prints a summary of prior progress, then continues from the next iteration.

If no EVOLUTION.log exists, --resume behaves identically to a fresh run.

Typical workflow

# Session 1 — crashes or gets interrupted after 7 iterations
/evolve 20 ~/Desktop/my-project
# ... ctrl-C or session ends ...

# Session 2 — pick up where you left off
/evolve --resume ~/Desktop/my-project
# reads log, continues from iteration 8

Combined with --goals:

# Session 1 — pick goals, start grinding
/evolve --goals ~/Desktop/my-project
# ... crash at iteration 5 of 10 goals ...

# Session 2 — resume with the same goal list
/evolve --goals --resume ~/Desktop/my-project
# skips discovery, loads .evolve-goals, continues

Integrations

Evolve uses several optional tools when available and degrades gracefully when they're not.

Tool	Purpose	Detection	Fallback
codemap	AST-level dead code, orphan files, complexity, blast radius, unreachable paths	`which codemap`	Grep/Glob for unused exports, unreferenced files, TODO/FIXME
WebSearch / WebFetch	Research upgrades, best practices, error messages	Built-in	None — upgrades require research
context7 MCP	Framework docs for React, Next.js, Express, Django, Tailwind, Prisma, etc.	MCP server available	Skip — falls back to WebSearch
wiki (pgwiki hybrid_search)	AI/ML techniques, papers, architectural decisions	MCP server available	Skip — AI/ML projects only, not general code
claude-mem	Past-session context for this project	MCP server available	Skip — no historical context
Agent tool	Parallel subagent scanning for >500-file projects	Built-in	Sequential scanning
TaskCreate / TaskUpdate	Visual progress tracking	Built-in	None — silent progress
git worktree	Isolated testing for medium-risk changes	`git worktree add`	Direct application with revert-on-failure
gh CLI	GitHub release on completion	`which gh`	Skip release, still pushes commits
iMessage MCP	Completion notification for long runs	MCP server + self-chat configured	Skip notification

Codemap integration specifically — when present, evolve uses it in three places:

Step 2 (Deep Scan): runs dead-functions, orphan-files, complexity, hotspots, unreachable to populate triage with structural issues tests won't catch.
Step 3 (Triage): if taint-analysis is supported, uses it to find unsanitized inputs reaching sensitive sinks.
Step 8 (Post-Change Validation): runs blast-radius, complexity, dead-functions on changed files. Reverts if blast radius >20%, complexity spiked, or new dead code appeared.
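The three revert conditions in step 8 can be written as one predicate. In this sketch the 20% blast-radius threshold comes from the text above, while the spike_factor of 1.5 is an invented placeholder, because the README does not say what counts as a complexity spike. The function name should_revert is also hypothetical.

```python
def should_revert(
    affected_files: int,
    total_files: int,
    complexity_before: float,
    complexity_after: float,
    new_dead_symbols: int,
    spike_factor: float = 1.5,  # assumption: the README gives no number
) -> bool:
    """Return True if post-change validation says the change should be undone."""
    blast_ratio = affected_files / total_files if total_files else 0.0
    if blast_ratio > 0.20:          # blast radius above 20% of the codebase
        return True
    if complexity_after > complexity_before * spike_factor:
        return True                 # complexity spiked
    return new_dead_symbols > 0     # the change created new dead code
```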

Artifacts

Evolve writes two files into the target repo.

`EVOLUTION.log`

Append-only log of every change applied. One block per iteration:

[2026-04-23 14:45] fix: reorder Steps 10-11 so EVOLUTION.log is written before commit
  source: manual review — Step 10 committed before Step 11 wrote the log, so log entries were never included in the commit they described

[2026-04-23 14:50] upgrade: add batch push (every 20 iterations) and GitHub release on completion
  source: user feedback — pushing every iteration spams GitHub, batch at 20 with release summary

[2026-04-23 15:02] clean: remove 47 lines — DOT graph and verbose bash examples that LLM doesn't need
  source: 361 lines was too long — DOT doesn't render in Claude Code, bash syntax is known

Each entry: timestamp, prefix (fix/clean/upgrade), description, and source (what surfaced the change — test output, codemap finding, web research, wiki, claude-mem, user feedback). The log is what --resume reads to reconstruct prior progress.
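Since --resume depends on reading this file back, a small parser helps show how regular the format is. This is a sketch based only on the sample entries above: LogEntry, parse_evolution_log, and remaining_iterations are hypothetical names, and counting one completed iteration per log entry is an assumption.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

ENTRY = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2})\] (fix|clean|upgrade): (.+)$")
SOURCE = re.compile(r"^\s+source: (.+)$")


@dataclass
class LogEntry:
    timestamp: str
    prefix: str
    description: str
    source: str = ""


def parse_evolution_log(text: str) -> List[LogEntry]:
    """Parse header lines and attach each indented source line to its entry."""
    entries: List[LogEntry] = []
    for line in text.splitlines():
        header = ENTRY.match(line)
        if header:
            entries.append(LogEntry(*header.groups()))
            continue
        source = SOURCE.match(line)
        if source and entries:
            entries[-1].source = source.group(1)
    return entries


def remaining_iterations(requested: int, entries: List[LogEntry]) -> Optional[int]:
    """Return iterations left, or None when the run is unlimited (count 0)."""
    if requested == 0:
        return None
    return max(requested - len(entries), 0)
```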

`.evolve-goals`

Created by --goals mode. Tracks selected goals and their status:

# .evolve-goals — auto-generated by /evolve --goals
# Status: pending | in-progress | done | skipped
[pending] fix: Auth middleware crashes on expired JWT tokens
[pending] clean: 12 dead functions in utils/
[pending] upgrade: Add rate limiting to API endpoints

Committed with the first change. Updated as goals progress. Read by --resume to continue goal-directed runs.
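The file format shown above is simple enough to read and update line by line. The following sketch assumes the layout in the sample is complete; parse_goals and set_status are hypothetical helpers, not functions shipped with the plugin.

```python
import re
from typing import List, Tuple

STATUSES = ("pending", "in-progress", "done", "skipped")
GOAL_LINE = re.compile(r"^\[(pending|in-progress|done|skipped)\] (fix|clean|upgrade): (.+)$")


def parse_goals(text: str) -> List[Tuple[str, str, str]]:
    """Return (status, prefix, description) for each goal; comments are ignored."""
    goals = []
    for line in text.splitlines():
        match = GOAL_LINE.match(line.strip())
        if match:
            goals.append(match.groups())
    return goals


def set_status(text: str, description: str, new_status: str) -> str:
    """Return the file text with the matching goal moved to `new_status`."""
    if new_status not in STATUSES:
        raise ValueError(f"unknown status: {new_status}")
    out = []
    for line in text.splitlines():
        match = GOAL_LINE.match(line.strip())
        if match and match.group(3) == description:
            line = f"[{new_status}] {match.group(2)}: {match.group(3)}"
        out.append(line)
    return "\n".join(out) + "\n"
```

Filtering parse_goals output for the pending status gives the list a resumed run would continue with.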

Commit format

Every commit from evolve looks like this:

upgrade: add rate limiting to /api/users

Source: WebSearch — Express rate-limit best practices + codemap hotspot
Iteration: 3 of 10

Prefix (fix:/clean:/upgrade:) matches the priority category. The body records source and iteration so git log is self-documenting even without EVOLUTION.log.
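A commit message in this shape could be assembled as in the sketch below. The function name format_commit_message is hypothetical, and omitting the "of N" part for unlimited runs is an assumption, since the README only shows a bounded example.

```python
from typing import Optional

PREFIXES = ("fix", "clean", "upgrade")


def format_commit_message(
    prefix: str,
    summary: str,
    source: str,
    iteration: int,
    total: Optional[int] = None,
) -> str:
    """Build a subject line plus a body recording source and iteration."""
    if prefix not in PREFIXES:
        raise ValueError(f"prefix must be one of {PREFIXES}")
    # Assumption: unlimited runs record only the iteration number.
    progress = f"{iteration} of {total}" if total else str(iteration)
    return f"{prefix}: {summary}\n\nSource: {source}\nIteration: {progress}"
```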

Troubleshooting

Symptom	Cause	Fix
`/evolve` not appearing as a command	Plugin not enabled	Check `enabledPlugins` in `~/.claude/settings.json`, restart Claude Code
Plugin not detected at all	Missing marketplace entry	Check `extraKnownMarketplaces` in `~/.claude/settings.json`
Stale behavior after update	Skills cached at session start	Restart Claude Code
"Refusing to start — working tree has uncommitted changes"	Pre-flight check: dirty tree	`git commit` or `git stash` before running evolve
"Refusing to start — not a git repo"	Target isn't initialized for git	`git init` and make at least one commit first
Iteration reverts repeatedly	Verification failing twice in a row	Evolve skips the candidate after 2 failures and moves on — normal behavior
Evolve stops early with "final form"	No meaningful work left, or remaining candidates are cosmetic / busywork	Intentional — evolve self-terminates rather than invent work
No push happening	No git remote configured	Silent no-op — add a remote or ignore
No GitHub release created	`gh` not installed, no remote, or fewer than 3 changes	All three are the documented skip conditions
No codemap step running	`codemap` not installed	Optional — falls back to Grep/Glob scanning
Dry-run still seems to do work	Dry-run only skips steps 7–11	Steps 1–6 (test, scan, triage, research, pick, safety) still run to produce the report — no writes
`--focus <subdir>` tests still run at project root	By design — tests/git operate at project root, only scanning/changes are scoped	Working as documented
`install.sh --check` shows "BROKEN"	Missing required files	Re-run the installer

Sister Plugins

Evolve is part of a family of Claude Code plugins by the same author:

Plugin	What it does
auditor	Audits a codebase from 8 expert perspectives in parallel. Produces a report with numbered findings; `fix #3` implements a fix. Great as a first pass before running evolve.
codemap	AST-powered structural analysis — dead functions, orphan files, complexity, call graphs, blast radius. Evolve uses codemap internally when available.
second-opinion	`/so` — gets a second opinion on a plan from 5 dynamically chosen expert reviewers before you commit to it.

Typical flow: /audit to survey → fix all critical to handle emergencies → /evolve to grind through the long tail → /so before any big architectural commit.

License

MIT — see LICENSE.
