Evolve

A Claude Code plugin that evolves any project toward its final form. One command picks the single highest-impact change the project needs right now — a bug to fix, dead code to remove, or a capability to add — implements it, verifies it, commits it, and loops. It stops when there's nothing meaningful left to do.

Autonomous. One change per cycle. Reverts cleanly on failure. Stays inside the target directory.

Version: 2.4.0 | License: MIT



Why Evolve?

"Refactor this codebase" is too vague. Linters find style issues. Test runners report failures. Dependabot bumps versions. None of those are the same as asking: what is the single most valuable change this project needs next?

Evolve answers that question on every iteration. It triages the project across three categories (Fix, Clean, Upgrade), picks the top item from the highest-priority category that has work, implements it, verifies that the tests pass, commits it, and loops. Over 20 iterations you get 20 surgical, independently revertible commits with a clear source trail for each.

What separates it from a linter or refactor tool:

  • It decides, then acts. A linter tells you what's wrong. Evolve picks one thing, fixes it, verifies it, and commits it — every cycle.
  • Priority-ordered, not batch. Bugs block new features. Dead code blocks new code. Evolve enforces that order. You never get an "add feature" commit on top of a failing test.
  • Verifies every change. Each iteration re-runs the test suite after the change. If verification fails, the working tree is reverted with git checkout -- . and the candidate is retried with a different approach (max 2 attempts) or skipped.
  • Self-terminating. Evolve stops when the remaining candidates are cosmetic, busywork, or diminishing-returns. A project in final form gets a "nothing left to do" report, not 50 cycles of pointless renames.
  • Containment guarantee. Files outside the target directory are never touched. No global installs, no settings changes, no cross-project edits. If a change would require reaching outside, it's skipped.
  • Source-annotated history. Every commit body records what surfaced the change (test failure, codemap finding, web research, wiki, claude-mem). git log is self-documenting.

Installation

One-line install

curl -fsSL https://raw.githubusercontent.com/charleschenai/evolve/main/install.sh | bash

The installer clones the repo into ~/.claude/plugins/marketplaces/loops, verifies the file structure, clears stale plugin cache, and adds the required entries to ~/.claude/settings.json. Requires git and python3 (for non-destructive settings merge).

The marketplace directory is named loops (the plugin name); the repo is evolve. The user-facing command is /evolve.

Manual install

  1. Clone into Claude Code's plugin directory:
git clone https://github.com/charleschenai/evolve.git \
  ~/.claude/plugins/marketplaces/loops
  2. Add to ~/.claude/settings.json:
{
  "enabledPlugins": {
    "loops@loops": true
  },
  "extraKnownMarketplaces": {
    "loops": {
      "source": {
        "source": "directory",
        "path": "/home/YOUR_USER/.claude/plugins/marketplaces/loops"
      }
    }
  }
}

Replace /home/YOUR_USER with your actual home directory path.

  3. Restart Claude Code. Skills are cached at session start.
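
If you would rather script the settings.json edit than do it by hand, a non-destructive merge could look roughly like the sketch below. This is not the code in install.sh; `merge_plugin_settings` is a hypothetical helper, and the only details taken from this README are the two keys and the `loops@loops` / `loops` entries shown above.

```python
import json


def merge_plugin_settings(settings, plugin_dir):
    """Return a copy of `settings` with the loops plugin registered.

    Hypothetical helper: existing keys and other plugins are left untouched.
    """
    # Round-trip through JSON as a cheap deep copy of JSON-compatible data.
    merged = json.loads(json.dumps(settings))
    merged.setdefault("enabledPlugins", {})["loops@loops"] = True
    merged.setdefault("extraKnownMarketplaces", {})["loops"] = {
        "source": {"source": "directory", "path": plugin_dir}
    }
    return merged
```

To use it, load ~/.claude/settings.json with `json.loads`, pass the result and your absolute plugin path to the function, and write the returned dict back with `json.dumps(..., indent=2)`. Keep a backup of the original file first.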

Verify

bash ~/.claude/plugins/marketplaces/loops/install.sh --check

Checks that required files are present, cache is clean, and settings.json has the plugin registered.

Update

cd ~/.claude/plugins/marketplaces/loops && git pull

Then restart Claude Code.

Uninstall

bash ~/.claude/plugins/marketplaces/loops/install.sh --uninstall

Removes the plugin directory and cache. Leaves settings.json entries in place (harmless) — remove manually if desired.


How to Use It

Syntax

/evolve [count] [target] [--dry-run] [--goals] [--resume] [--focus <subdir>] [--only <mode>]
| Argument | Description |
| --- | --- |
| count | Number of iterations. Omit or pass 0 for infinite: the run continues until nothing is left or evolve asks a human. |
| target | Directory to evolve. Defaults to the current working directory. |
| --dry-run | Scan and triage only. Reports what it would do; makes no changes and creates no commits. |
| --goals | Interactive goal picker. Runs discovery, shows a numbered menu, persists the selection to .evolve-goals, then grinds. |
| --resume | Continue a previous run. Reads EVOLUTION.log and .evolve-goals to pick up where the last session left off. |
| --focus <subdir> | Restrict scanning and changes to a subdirectory. Tests and git operations still run at the project root. |
| --only <mode> | Only act on one category: fix, clean, or upgrade. Skip all others. |

Examples

From simple to advanced:

/evolve 5                              # 5 cycles on current directory
/evolve 10 ~/Desktop/codemap           # 10 cycles on a specific project
/evolve ~/Desktop/my-project           # run until final form (infinite)

/evolve --dry-run ~/Desktop/my-app     # preview what it would do — no changes
/evolve --only clean 20 ~/Desktop/app  # 20 cycles, cleanup only
/evolve --only fix ~/Desktop/app       # grind through all fixable bugs, ignore the rest
/evolve --focus src/api 10 ~/Desktop/app  # only evolve the API subdirectory

/evolve --goals ~/Desktop/my-project   # pick goals from a menu, then grind
/evolve --resume ~/Desktop/my-project  # continue where the last session left off
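
How the skill interprets its arguments internally is not documented here. Purely to illustrate the grammar in the Syntax line, here is a minimal sketch of a parser for it. `EvolveArgs` and `parse_evolve_args` are hypothetical names, and the rule "a bare number is the count, any other bare token is the target" is an assumption inferred from the examples above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EvolveArgs:
    count: int = 0              # 0 means "run until final form"
    target: str = "."           # defaults to the current directory
    dry_run: bool = False
    goals: bool = False
    resume: bool = False
    focus: Optional[str] = None
    only: Optional[str] = None


def parse_evolve_args(argv):
    """Parse tokens such as ["--only", "clean", "20", "~/Desktop/app"]."""
    args = EvolveArgs()
    positionals = []
    tokens = iter(argv)
    for tok in tokens:
        if tok == "--dry-run":
            args.dry_run = True
        elif tok == "--goals":
            args.goals = True
        elif tok == "--resume":
            args.resume = True
        elif tok in ("--focus", "--only"):
            value = next(tokens, None)  # these two flags take a value
            if value is None:
                raise ValueError(f"{tok} needs a value")
            if tok == "--focus":
                args.focus = value
            elif value in ("fix", "clean", "upgrade"):
                args.only = value
            else:
                raise ValueError(f"unknown mode: {value}")
        elif tok.startswith("--"):
            raise ValueError(f"unknown flag: {tok}")
        else:
            positionals.append(tok)
    for tok in positionals:
        if tok.isdigit():       # assumption: a bare number is the count
            args.count = int(tok)
        else:
            args.target = tok
    return args
```

One limitation of this sketch: a target directory whose name is all digits would be read as the count.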

Flag combinations

Flags compose. The skill documents these combined behaviors explicitly:

| Flags | Behavior |
| --- | --- |
| --goals --resume | Load .evolve-goals, skip discovery, continue grinding remaining goals |
| --dry-run --goals | Run discovery and show the menu, but don't start grinding |
| --dry-run --resume | Show what the next iteration would do based on current state |
| --focus --goals | Discovery phase only scans the focused subdirectory |
| --only fix --goals | Only show fix goals in the discovery menu |
| --only clean | Clean dead code only; skip bugs and upgrades entirely |

The Priority Model

Every iteration, evolve triages the project and acts on the highest-priority category that has work. The order is fixed:

| Priority | Category | What it looks for | Commit prefix |
| --- | --- | --- | --- |
| 1 | Fix | Security issues, crashes, compile errors, wrong output, failing tests, linter errors | fix: |
| 2 | Clean | Dead files, unused functions, dead dependencies, unused imports, commented-out code blocks | clean: |
| 3 | Upgrade | Missing capabilities, better patterns, new features aligned with project purpose | upgrade: |

The rule: Fix first. Clean second. Upgrade last.

A project with bugs shouldn't get new features. A project with dead code shouldn't grow more code. Upgrades only land on top of a clean, working foundation. This ordering matters more than people expect — it's why evolve can run for 20+ iterations without producing the usual "add feature on top of broken test" chaos.

Within each category, evolve picks the single highest-impact item — the worst bug, the largest dead code block, the most useful missing feature — not the easiest one.
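
The selection rule can be sketched in a few lines. This is an illustration, not the skill's code: `Candidate`, `pick_one`, and the numeric `impact` score are hypothetical, since evolve judges impact itself rather than computing a number.

```python
from dataclasses import dataclass

# Fixed order from the table above: Fix, then Clean, then Upgrade.
PRIORITY = ("fix", "clean", "upgrade")


@dataclass
class Candidate:
    category: str       # "fix", "clean", or "upgrade"
    description: str
    impact: float       # hypothetical score; higher means more valuable


def pick_one(candidates, only=None):
    """Return the highest-impact item from the first category that has work."""
    categories = (only,) if only else PRIORITY
    for category in categories:
        in_category = [c for c in candidates if c.category == category]
        if in_category:
            return max(in_category, key=lambda c: c.impact)
    return None  # nothing to do in any allowed category
```

Note that a high-impact upgrade never outranks a low-impact cleanup: the category order is checked before impact is compared. The `only` parameter mirrors the --only flag.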

What counts as waste (Clean mode)

In order of priority:

  1. Dead files — zero imports/references from the rest of the project
  2. Dead functions/classes — defined but never called
  3. Dead dependencies — in the manifest but never imported
  4. Unused imports — imported but never referenced
  5. Commented-out code blocks (>3 lines of actual code)

Not waste: explanatory comments, test fixtures, type contracts, feature flags.

Evolve verifies before removing — greps for all references (including strings, configs, dynamic imports, CLI entry points, reflection, metaprogramming). If there's any doubt, the candidate is skipped.
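
A reference check of this kind can be approximated with a whole-word text search. The sketch below is an assumption about the general approach, not evolve's implementation: `reference_count` and `looks_dead` are hypothetical names, and the in-memory `files` mapping stands in for a real directory walk.

```python
import re


def reference_count(name, files):
    """Count whole-word occurrences of `name` across a {path: text} mapping."""
    pattern = re.compile(rf"\b{re.escape(name)}\b")
    return sum(len(pattern.findall(text)) for text in files.values())


def looks_dead(name, files):
    """True when the only occurrence of `name` is its own definition."""
    # Plain text matching also counts hits in strings, configs, and comments,
    # so it errs toward "still referenced", which means the candidate is skipped.
    return reference_count(name, files) <= 1
```

A text search cannot see names assembled at runtime (reflection, metaprogramming), which is why any remaining doubt should lead to skipping the candidate rather than deleting it.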

What counts as an upgrade

Research-informed, on-mission additions. Evolve uses WebSearch (and context7/wiki/claude-mem when relevant — see Integrations) to find what similar tools do that this one doesn't. It picks the highest-impact gap.

Scope guard: Upgrades must align with the project's existing purpose. Don't add a web UI to a CLI tool. Don't add AI features to a logging library. Extend what the project already does — don't pivot it. If an upgrade doesn't clearly fit, evolve skips it.


The Loop — What Each Iteration Does

Evolve executes 11 steps per cycle:

Test → Codemap → Triage → Research → Pick ONE → Safety → Implement → Validate → Verify → Log+Commit → Report
                    ↓ nothing found → final form → end report + push + release
                    ↓ risky → STOP and ask human
                    ↓ verify fails → revert + retry (max 2)

1. Test. Runs the project's test suite and captures output, errors, warnings. If there's no test suite, falls back to the build step, CLI --help, library parse/import check, or skips to step 2 for config/docs projects. Also runs linters when available: cargo clippy, eslint, ruff / flake8, go vet. Linter errors count as bugs; warnings as upgrade candidates.

2. Deep scan with codemap. If codemap is installed (see Integrations), runs dead-functions, orphan-files, complexity, hotspots, and unreachable analyses. Without codemap, falls back to Grep/Glob for unused exports, unreferenced files, and TODO/FIXME/HACK markers. Also checks for outdated dependencies (npm outdated, pip list --outdated, cargo outdated, go list -m -u all). Minor/patch bumps become upgrade candidates; major bumps get flagged for the safety check.

Large codebases (>500 files): Evolve dispatches parallel subagents via the Agent tool — one for dead code, one for complexity, one for TODO markers — and merges findings in step 3. Much faster than sequential scanning.

3. Triage. Combines test + lint + codemap findings and runs a security sweep via Grep: hardcoded credentials (password\s*=\s*['"]), known security debt (TODO.*security|FIXME.*auth|HACK.*token), and taint analysis if codemap supports it. Categorizes everything as Security/Bug → Fix, Waste → Clean, Gap → Upgrade. Acts on the highest-priority category with items. Security issues are the top Fix priority.
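
As an illustration of the security sweep, the two patterns quoted above can be applied line by line. This is a minimal sketch rather than what the skill runs (it uses the Grep tool); `SECURITY_PATTERNS` and `security_sweep` are hypothetical names.

```python
import re

# The two regular expressions are the ones quoted in step 3.
SECURITY_PATTERNS = {
    "hardcoded credential": re.compile(r"""password\s*=\s*['"]"""),
    "known security debt": re.compile(r"TODO.*security|FIXME.*auth|HACK.*token"),
}


def security_sweep(source):
    """Return (line number, label) pairs for every matching line."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for label, pattern in SECURITY_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label))
    return findings
```

Patterns this simple produce false positives (test fixtures, documentation), so each hit is a triage candidate to inspect, not a confirmed vulnerability.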

4. Research. WebSearch for upgrade patterns, error messages, or framework best practices. WebFetch for docs, changelogs, examples. When applicable, queries wiki (AI/ML projects), context7 (framework docs), and claude-mem (past session context) — see Integrations.

5. Pick ONE. The single highest-impact item from the active category. Evolve states what it's doing and why in one sentence.

6. Safety check. Classifies the risk:

  • High risk (could break production, data loss, irreversible) → STOP and ask human.
  • Medium risk (might break tests, uncertain scope) → test in an isolated git worktree (git worktree add). Merge back on success, discard on failure.
  • Low risk → proceed directly.

If --dry-run, prints the picked item and source, then skips to the next iteration without modifying anything.

7. Implement. One focused change, minimal blast radius. No unrelated refactors, no surrounding-code improvements — just the one thing.

8. Post-change codemap validation. If codemap is available, runs blast-radius, complexity, and dead-functions on the changed files. Reverts if blast radius is unexpectedly large (>20% of codebase), complexity spiked, or the change created new dead code.
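
The revert decision in step 8 boils down to three checks. In the sketch below only the 20% blast-radius threshold comes from this README; `should_revert` is a hypothetical function, and `spike_ratio` is an assumed stand-in for "complexity spiked", which the README does not quantify.

```python
def should_revert(affected_files, total_files, complexity_before,
                  complexity_after, new_dead_functions, spike_ratio=1.5):
    """Decide whether a change should be rolled back after validation."""
    blast_radius = affected_files / total_files if total_files else 0.0
    # Assumption: treat a 50% jump in complexity as a "spike".
    complexity_spiked = complexity_after > complexity_before * spike_ratio
    return blast_radius > 0.20 or complexity_spiked or new_dead_functions > 0
```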

9. Verify. Re-runs the step-1 test. The change must be visible with no regressions. If verification fails, the working tree is reverted with git checkout -- . and the candidate is retried with a different approach (max 2 attempts). After two failures, it's skipped and the next candidate is picked.

10. Log + commit. Appends to EVOLUTION.log and commits in a single Bash call (the skill explicitly fuses these so there's no stale log-without-commit state). Commit message body includes source and iteration number — git log stays self-documenting even without the log file.

11. Report. Prints one line: [N/total] <prefix>: <what> — verified on <target>. Updates a visual task tracker via TaskCreate / TaskUpdate. Then loops.

Publishing behavior

  • Push every 20 iterations: git push origin HEAD runs after step 11 when iteration_count % 20 == 0. Silent no-op if no remote.
  • Always push on completion — all remaining commits get pushed when the run ends.
  • GitHub release on completion — if gh is installed and there's a remote, evolve creates a release titled "Evolution: <N> changes applied" with EVOLUTION.log entries as notes. Skipped if fewer than 3 changes.
  • iMessage notification — if iMessage MCP tools are available and a self-chat is configured, sends a completion summary via reply. Never messages other contacts.
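
The push and release conditions above can be written as small predicates. This is a sketch under stated assumptions: the function names are hypothetical, while the modulo check, the 3-change minimum, and the release title format are taken from the bullets above.

```python
def should_push(iteration_count, has_remote):
    """Periodic push: every 20th iteration, and only when a remote exists."""
    return has_remote and iteration_count > 0 and iteration_count % 20 == 0


def should_release(changes_applied, has_remote, gh_installed):
    """A release needs the gh CLI, a remote, and at least 3 applied changes."""
    return gh_installed and has_remote and changes_applied >= 3


def release_title(changes_applied):
    return f"Evolution: {changes_applied} changes applied"
```

The push on completion is unconditional apart from needing a remote, so it is not modeled here.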

Safety Rails

Evolve is autonomous. The safety model is what makes that acceptable.

Pre-flight checks

Before the first iteration, evolve verifies:

  • Target exists and is a git repo with at least one commit. Refuses otherwise (evolve needs git for commits, reverts, and push).
  • Working tree is clean: git status --porcelain must be empty. Refuses to start on a dirty tree to avoid mixing evolve's commits with the user's unfinished work. Tells the user to commit or stash first.
  • Warns but proceeds if the branch has an open PR (changes will land on the PR) or is behind the remote (suggests git pull first).
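
You can run the same checks yourself before starting a long session. The sketch below is an approximation of the refusal conditions, not the skill's code: `git_runner` and `preflight` are hypothetical helpers and the returned reason strings are made up for illustration. The git commands themselves (`rev-parse --is-inside-work-tree`, `rev-parse --verify HEAD`, `status --porcelain`) are standard.

```python
import subprocess


def git_runner(target):
    """Build a callable that runs git inside `target` and returns (code, stdout)."""
    def run(*args):
        proc = subprocess.run(["git", "-C", target, *args],
                              capture_output=True, text=True)
        return proc.returncode, proc.stdout
    return run


def preflight(run):
    """Return a short refusal reason, or None when it is safe to start."""
    code, out = run("rev-parse", "--is-inside-work-tree")
    if code != 0 or out.strip() != "true":
        return "not a git repo"
    code, _ = run("rev-parse", "--verify", "HEAD")
    if code != 0:
        return "no commits yet"
    _, out = run("status", "--porcelain")
    if out.strip():
        return "dirty working tree"
    return None
```

Call it as `preflight(git_runner("/path/to/project"))`. Passing the runner in as a parameter keeps the checks testable without a real repository.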

CLAUDE.md awareness

If the target project has a CLAUDE.md, evolve reads it first and respects everything in it — build commands, conventions, constraints, "don't touch this module" instructions. Project-specific rules override evolve's defaults.

Containment

Evolve only modifies files inside the target directory. Hard rule, no exceptions:

  • No edits to ~/.claude/, settings.json, plugins, build systems, or anything the user's other tools depend on.
  • No system-level installs (brew, apt, system-wide pip install, cargo install). Project-local manifest additions (npm/cargo/pip dependencies) are fine.
  • No modifications to other projects, even if they depend on this one.
  • If a change would require reaching outside the target, it's skipped and the next candidate is picked.
  • New files are fine — inside the target only.
  • The project's CLI/output contract can be extended (new flags, new actions) but existing behavior is never broken.

One change per cycle

Each iteration produces exactly one commit with one change. Small, verifiable, reversible. This is what makes individual reverts safe — no entangled multi-change commits.

Revert-on-failure

If verification fails:

  1. git checkout -- . restores the working tree.
  2. Retry with a different approach (max 2 attempts).
  3. If both attempts fail, the candidate is skipped and the loop picks the next item.

Medium-risk changes run in a git worktree (git worktree add) before merging — if verification fails in the worktree, the main branch never saw the change.
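
The retry policy can be summarized as a small loop. This is a sketch of the control flow only: `attempt_candidate` and its callback parameters are hypothetical, and in a real run `revert` would correspond to git checkout -- . and `verify` to re-running the test suite.

```python
def attempt_candidate(approaches, apply, verify, revert, max_attempts=2):
    """Try up to `max_attempts` approaches; return the one that verified, or None."""
    for approach in approaches[:max_attempts]:
        apply(approach)
        if verify():
            return approach   # verified: this change is ready to commit
        revert()              # failed: restore the working tree before retrying
    return None               # every attempt failed: skip this candidate
```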

Self-termination

The skill has explicit criteria for "stop, don't invent busywork." After each triage, evolve asks whether the next best action is actually worth doing. It stops and reports final form if any of these are true:

  • No bugs, no dead code, and no upgrades that align with the project's purpose.
  • The remaining candidates are all cosmetic (renaming, rewording, reformatting).
  • The last 3 iterations were all in the same category with diminishing impact.
  • Evolve is tempted to add features the project doesn't need just to keep going.
  • The only upgrades left would increase complexity without clear user value.

If you have to justify why a change matters, it probably doesn't. That's in the skill verbatim.
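
Some of these criteria are judgment calls that only the model can make, but the mechanical ones can be sketched. The function below is an illustration, not the skill's logic: `reached_final_form` is a hypothetical name, the `is_cosmetic` flag and numeric impact values are assumed inputs, and "diminishing" is interpreted here as strictly decreasing impact.

```python
def reached_final_form(candidates, recent):
    """Approximate three of the stop criteria.

    candidates: list of (category, is_cosmetic) still on the table.
    recent: list of (category, impact) for past iterations, oldest first.
    """
    # Nothing left, or everything left is cosmetic.
    if not candidates or all(is_cosmetic for _, is_cosmetic in candidates):
        return True
    # Last 3 iterations in the same category with shrinking impact.
    last3 = recent[-3:]
    if len(last3) == 3:
        same_category = len({category for category, _ in last3}) == 1
        diminishing = last3[0][1] > last3[1][1] > last3[2][1]
        if same_category and diminishing:
            return True
    return False
```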


Goal-Directed Mode (--goals)

Instead of letting evolve pick what to work on, pass --goals and choose from a menu.

Phase 1: Discovery

Evolve runs steps 1–4 (Test, Codemap, Triage, Research) but collects all findings across all categories instead of picking one. It presents them as a numbered menu:

=== Evolution Goals ===
Project: ~/Desktop/my-project

Fixes available:
  1. [fix] Auth middleware crashes on expired JWT tokens
  2. [fix] Race condition in WebSocket reconnect

Cleanup available:
  3. [clean] 12 dead functions in utils/ (codemap)
  4. [clean] 3 unused dependencies in package.json

Upgrades possible:
  5. [upgrade] Add rate limiting to API endpoints
  6. [upgrade] Migrate from CommonJS to ESM
  7. [upgrade] Add OpenTelemetry tracing
  8. [upgrade] Connection pooling for database

Pick goals (comma-separated, or 'all'): _

Phase 2: Persist

The selection is written to .evolve-goals in the target directory and committed with the first change. This enables --resume to continue across sessions.

Phase 3: Grind

Evolve works through goals in priority order: fixes first, then clean, then upgrades. Each goal may take multiple iterations. .evolve-goals is updated as each goal completes (pending → in-progress → done / skipped). The triage step is narrowed: it only considers items that serve selected goals and ignores other findings. If a goal fails twice, it's marked [skipped] and evolve moves on.

Progress reporting:

[3/∞] upgrade: add rate limiting to /api/users — 1 of 3 goals complete

Resume Mode (--resume)

Pass --resume to continue a previous run. Evolve:

  1. Reads EVOLUTION.log in the target directory and parses entries to understand what's already been applied.
  2. Loads .evolve-goals if present and continues in goal-directed mode with remaining [pending] items.
  3. Counts completed iterations from the log and adjusts the remaining count accordingly.
  4. Prints a summary of prior progress, then continues from the next iteration.

If no EVOLUTION.log exists, --resume behaves identically to a fresh run.

Typical workflow

# Session 1 — crashes or gets interrupted after 7 iterations
/evolve 20 ~/Desktop/my-project
# ... ctrl-C or session ends ...

# Session 2 — pick up where you left off
/evolve --resume ~/Desktop/my-project
# reads log, continues from iteration 8

Combined with --goals:

# Session 1 — pick goals, start grinding
/evolve --goals ~/Desktop/my-project
# ... crash at iteration 5 of 10 goals ...

# Session 2 — resume with the same goal list
/evolve --goals --resume ~/Desktop/my-project
# skips discovery, loads .evolve-goals, continues

Integrations

Evolve uses several optional tools when available and degrades gracefully when they're not.

| Tool | Purpose | Detection | Fallback |
| --- | --- | --- | --- |
| codemap | AST-level dead code, orphan files, complexity, blast radius, unreachable paths | which codemap | Grep/Glob for unused exports, unreferenced files, TODO/FIXME |
| WebSearch / WebFetch | Research upgrades, best practices, error messages | Built-in | None: upgrades require research |
| context7 MCP | Framework docs for React, Next.js, Express, Django, Tailwind, Prisma, etc. | MCP server available | Skip; falls back to WebSearch |
| wiki (pgwiki hybrid_search) | AI/ML techniques, papers, architectural decisions | MCP server available | Skip; AI/ML projects only, not general code |
| claude-mem | Past-session context for this project | MCP server available | Skip; no historical context |
| Agent tool | Parallel subagent scanning for >500-file projects | Built-in | Sequential scanning |
| TaskCreate / TaskUpdate | Visual progress tracking | Built-in | None: silent progress |
| git worktree | Isolated testing for medium-risk changes | git worktree add | Direct application with revert-on-failure |
| gh CLI | GitHub release on completion | which gh | Skip release, still pushes commits |
| iMessage MCP | Completion notification for long runs | MCP server + self-chat configured | Skip notification |
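
For the two command-line tools in the table, detection amounts to a PATH lookup. A rough Python equivalent of the `which` checks is shown below; `detect_cli_tools` is a hypothetical helper, and `shutil.which` is the standard-library counterpart of the shell command.

```python
import shutil


def detect_cli_tools(which=shutil.which):
    """Report which optional CLI tools are on PATH."""
    # `which` is injectable so the lookup can be replaced in tests.
    return {name: which(name) is not None for name in ("codemap", "gh")}
```

A result of `{"codemap": False, "gh": True}` would mean the codemap steps fall back to Grep/Glob while the GitHub release step remains available.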

Codemap integration specifically — when present, evolve uses it in three places:

  1. Step 2 (Deep Scan): runs dead-functions, orphan-files, complexity, hotspots, unreachable to populate triage with structural issues tests won't catch.
  2. Step 3 (Triage): if taint-analysis is supported, uses it to find unsanitized inputs reaching sensitive sinks.
  3. Step 8 (Post-Change Validation): runs blast-radius, complexity, dead-functions on changed files. Reverts if blast radius >20%, complexity spiked, or new dead code appeared.

Artifacts

Evolve writes two files into the target repo.

EVOLUTION.log

Append-only log of every change applied. One block per iteration:

[2026-04-23 14:45] fix: reorder Steps 10-11 so EVOLUTION.log is written before commit
  source: manual review — Step 10 committed before Step 11 wrote the log, so log entries were never included in the commit they described

[2026-04-23 14:50] upgrade: add batch push (every 20 iterations) and GitHub release on completion
  source: user feedback — pushing every iteration spams GitHub, batch at 20 with release summary

[2026-04-23 15:02] clean: remove 47 lines — DOT graph and verbose bash examples that LLM doesn't need
  source: 361 lines was too long — DOT doesn't render in Claude Code, bash syntax is known

Each entry: timestamp, prefix (fix/clean/upgrade), description, and source (what surfaced the change — test output, codemap finding, web research, wiki, claude-mem, user feedback). The log is what --resume reads to reconstruct prior progress.
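
Because the format is regular, the log is easy to read back programmatically. The parser below is a sketch based only on the sample entries above; `parse_evolution_log` is a hypothetical name and the real --resume logic may differ.

```python
import re

# Header line, e.g. "[2026-04-23 14:45] fix: reorder Steps 10-11 ..."
ENTRY = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2})\] (fix|clean|upgrade): (.+)$"
)


def parse_evolution_log(text):
    """Return one dict per log entry, in file order."""
    entries = []
    for line in text.splitlines():
        header = ENTRY.match(line)
        if header:
            timestamp, prefix, description = header.groups()
            entries.append({"timestamp": timestamp, "prefix": prefix,
                            "description": description, "source": None})
        elif entries and line.strip().startswith("source:"):
            # The indented source line belongs to the most recent header.
            entries[-1]["source"] = line.strip()[len("source:"):].strip()
    return entries
```

With this, `len(parse_evolution_log(text))` gives the number of completed iterations, which is the figure --resume needs to adjust the remaining count.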

.evolve-goals

Created by --goals mode. Tracks selected goals and their status:

# .evolve-goals — auto-generated by /evolve --goals
# Status: pending | in-progress | done | skipped
[pending] fix: Auth middleware crashes on expired JWT tokens
[pending] clean: 12 dead functions in utils/
[pending] upgrade: Add rate limiting to API endpoints

Committed with the first change. Updated as goals progress. Read by --resume to continue goal-directed runs.
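
The goals file can be read back the same way. The sketch below is not the skill's code: `remaining_goals` is a hypothetical name, and treating in-progress goals as still remaining is an assumption (the README only mentions [pending] items for --resume).

```python
import re

GOAL = re.compile(
    r"^\[(pending|in-progress|done|skipped)\] (fix|clean|upgrade): (.+)$"
)
ORDER = {"fix": 0, "clean": 1, "upgrade": 2}


def remaining_goals(text):
    """Return unfinished (prefix, description) pairs in priority order."""
    goals = []
    for line in text.splitlines():
        match = GOAL.match(line.strip())   # comment lines starting with # never match
        if match and match.group(1) in ("pending", "in-progress"):
            goals.append((match.group(2), match.group(3)))
    # sorted() is stable, so goals keep file order within a category.
    return sorted(goals, key=lambda goal: ORDER[goal[0]])
```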

Commit format

Every commit from evolve looks like this:

upgrade: add rate limiting to /api/users

Source: WebSearch — Express rate-limit best practices + codemap hotspot
Iteration: 3 of 10

Prefix (fix:/clean:/upgrade:) matches the priority category. The body records source and iteration so git log is self-documenting even without EVOLUTION.log.
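
If you want to produce or validate messages in this shape from your own tooling, a formatter could look like the sketch below. `format_commit` is a hypothetical helper; omitting "of N" when no total is given is an assumption, since the README does not show the body for infinite runs.

```python
def format_commit(prefix, summary, source, iteration, total=None):
    """Build a commit message in the layout shown above."""
    if prefix not in ("fix", "clean", "upgrade"):
        raise ValueError(f"unknown prefix: {prefix}")
    # Assumption: leave out "of N" when the run has no fixed count.
    of_total = f" of {total}" if total else ""
    return (f"{prefix}: {summary}\n\n"
            f"Source: {source}\n"
            f"Iteration: {iteration}{of_total}\n")
```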


Troubleshooting

| Symptom | Cause | Fix |
| --- | --- | --- |
| /evolve not appearing as a command | Plugin not enabled | Check enabledPlugins in ~/.claude/settings.json, restart Claude Code |
| Plugin not detected at all | Missing marketplace entry | Check extraKnownMarketplaces in ~/.claude/settings.json |
| Stale behavior after update | Skills cached at session start | Restart Claude Code |
| "Refusing to start — working tree has uncommitted changes" | Pre-flight check: dirty tree | git commit or git stash before running evolve |
| "Refusing to start — not a git repo" | Target isn't initialized for git | git init and make at least one commit first |
| Iteration reverts repeatedly | Verification failing twice in a row | Normal behavior: evolve skips the candidate after 2 failures and moves on |
| Evolve stops early with "final form" | No meaningful work left, or remaining candidates are cosmetic / busywork | Intentional: evolve self-terminates rather than invent work |
| No push happening | No git remote configured | Silent no-op: add a remote or ignore |
| No GitHub release created | gh not installed, no remote, or fewer than 3 changes | All three are the documented skip conditions |
| No codemap step running | codemap not installed | Optional: falls back to Grep/Glob scanning |
| Dry-run still seems to do work | Dry-run only skips steps 7–11 | Steps 1–6 (test, scan, triage, research, pick, safety) still run to produce the report, with no writes |
| --focus <subdir> tests still run at project root | By design: tests/git operate at project root, only scanning/changes are scoped | Working as documented |
| install.sh --check shows "BROKEN" | Missing required files | Re-run the installer |

Sister Plugins

Evolve is part of a family of Claude Code plugins by the same author:

| Plugin | What it does |
| --- | --- |
| auditor | Audits a codebase from 8 expert perspectives in parallel. Produces a report with numbered findings; fix #3 implements a fix. Great as a first pass before running evolve. |
| codemap | AST-powered structural analysis: dead functions, orphan files, complexity, call graphs, blast radius. Evolve uses codemap internally when available. |
| second-opinion | /so gets a second opinion on a plan from 5 dynamically chosen expert reviewers before you commit to it. |

Typical flow: /audit to survey → fix all critical to handle emergencies → /evolve to grind through the long tail → /so before any big architectural commit.


License

MIT — see LICENSE.
