Merged
16 commits
76e3d70
feat(pipeline): autonomous pipeline v0.1.1-v0.1.3
robotlearning123 Mar 5, 2026
a27bd6b
fix(pipeline): memory leak, stale ctx, redundant map lookups
robotlearning123 Mar 5, 2026
1d0fa91
feat(pipeline): error context retry, wave validation, budget-based lo…
robotlearning123 Mar 5, 2026
d3acd53
task(f816b860e3c946ac): feat(classifier): add classifyTask pure function
robotlearning123 Mar 5, 2026
c7235e5
task(b24b8f37e8d64ede): feat(types): add optional model field to Task
robotlearning123 Mar 5, 2026
8276adb
task(654d29040498495e): feat(agent-runner): pickFallbackAgent + per-t…
robotlearning123 Mar 5, 2026
ca855f3
feat(scheduler): task classifier + model fallback on retry (v0.1.5)
robotlearning123 Mar 5, 2026
66bb12b
task(6f72353ece8b4c67): feat(worktree-pool): add getActiveWorkers() a…
robotlearning123 Mar 5, 2026
721084d
task(6986f62d76ef4228): feat(store): persist _originalPrompt; seriali…
robotlearning123 Mar 5, 2026
34fcff9
task(71e20b9544914d85): test(scheduler): prompt accumulation fix, mod…
robotlearning123 Mar 5, 2026
b4d29d2
feat(scheduler): prompt accumulation fix, model escalation, staged re…
robotlearning123 Mar 5, 2026
265e914
feat(v0.1.7): empty commit detection, GPT-5.4 routing, pricing table,…
robotlearning123 Mar 5, 2026
35a59b1
docs: add strategy, research, plans, and pipeline artifacts
robotlearning123 Mar 5, 2026
f0f0e23
docs: fix stale versions, model names, honest flywheel status
robotlearning123 Mar 5, 2026
6d8f462
fix(flywheel): phase-1 repairs for broken flywheel loop
robotlearning123 Mar 5, 2026
b613481
fix(test): add git config for CI in pipeline full-flow test
robotlearning123 Mar 5, 2026
961 changes: 961 additions & 0 deletions .cc-pipeline/plan.md

Large diffs are not rendered by default.

37 changes: 37 additions & 0 deletions .cc-pipeline/tasks.json
@@ -0,0 +1,37 @@
{
"waves": [
{
"waveIndex": 0,
"tasks": [
"Modify src/types.ts only. (1) In the Task interface, after the line `model?: string;`, add two new optional fields: `modelOverride?: string;` and `_originalPrompt?: string;`. (2) Change the existing `dependsOn?: string;` field to `dependsOn?: string | string[];`. No other changes are needed — createTask() already handles the widened type via opts?.dependsOn assignment. Run `npx tsc --noEmit` and verify no errors. Then commit: `git add -A && git commit -m \"feat(types): add modelOverride, _originalPrompt; widen dependsOn to string|string[]\"`"
]
},
{
"waveIndex": 1,
"tasks": [
"Modify src/store.ts only. Apply four changes in sequence: (1) In migrate(), after the existing `review` column ALTER TABLE try/catch block, add: `try { this.db.exec(\"ALTER TABLE tasks ADD COLUMN original_prompt TEXT\"); } catch {}`. (2) In taskToParams(), append `task._originalPrompt ?? null` as the 26th element of the returned array. (3) Update all INSERT SQL strings in save(), updateBatch() insertStmt, and saveBatch() insertStmt to add `original_prompt` to the column list and a 26th `?` to VALUES. Update all UPDATE SQL strings to add `original_prompt=?` before `WHERE id=?` (the params array is already 26 elements from step 2). (4) In update() fieldMap, replace the dependsOn entry with: `dependsOn: { col: 'depends_on', serialize: (v) => v == null ? null : Array.isArray(v) ? JSON.stringify(v as unknown[]) : v as string }`, and add `_originalPrompt: { col: 'original_prompt' }`. In rowToTask(), replace the dependsOn line with: `dependsOn: (() => { const raw = row.depends_on as string|null|undefined; if (!raw) return undefined; if (raw.startsWith('[')) { try { return JSON.parse(raw) as string[]; } catch { return raw; } } return raw; })()`, and add `_originalPrompt: (row.original_prompt as string | null) ?? undefined`. Run `npx tsc --noEmit`. Commit: `git add -A && git commit -m \"feat(store): persist _originalPrompt; serialize dependsOn array as JSON\"`",
"Modify src/worktree-pool.ts only. Insert two new public methods after the getWorkerStats() method, before the private git() helper: (1) `getActiveWorkers(exclude?: string): string[]` — iterate `this.workers.values()`, push `w.name` into result array when `w.busy && w.name !== exclude`, return result. (2) `async rebaseOnMain(workerName: string): Promise<boolean>` — get worker via `this.workers.get(workerName)`, return false if not found; call `const { stdout } = await this.git('rev-parse', 'main')` to get mainSha (trim it); call `await this.gitIn(w.path, 'rebase', mainSha)`; on any error, call `await this.gitIn(w.path, 'rebase', '--abort').catch(() => {})`, log a warn with `log('warn', '[pool] rebaseOnMain: conflict, aborted', { worker: workerName })`, and return false; return true on success. Use the existing `log` import and `this.git`/`this.gitIn` helpers already in the file. Run `npx tsc --noEmit`. Commit: `git add -A && git commit -m \"feat(worktree-pool): add getActiveWorkers() and rebaseOnMain()\"`"
]
},
{
"waveIndex": 2,
"tasks": [
"Modify src/agent-runner.ts only. Make exactly two line changes: (1) In runClaudeSDK(), find the line `model: task.model ?? this.model,` and change it to `model: task.modelOverride ?? task.model ?? this.model,`. (2) In runClaude(), find the line `\"--model\", task.model ?? this.model,` and change it to `\"--model\", task.modelOverride ?? task.model ?? this.model,`. No other changes. Run `npx tsc --noEmit` and verify no errors. Commit: `git add -A && git commit -m \"feat(agent-runner): honour task.modelOverride in runClaude and runClaudeSDK\"`"
]
},
{
"waveIndex": 3,
"tasks": [
"Modify src/scheduler.ts only. Apply four sub-changes then compile and commit. (1) In executeAndRelease(), in the retry block where task.prompt is modified: after `task.retryCount++`, add `if (!task._originalPrompt) { task._originalPrompt = task.prompt; }`, then replace the existing prompt-mutation line with `task.prompt = prevError ? \\`${task._originalPrompt}\\n\\n---\\n## Previous Attempt Failed (attempt ${task.retryCount})\\nError: ${errorContext}\\nFix the error above and try again.\\` : task._originalPrompt;`, then after that add `if (task.retryCount >= 2) { task.modelOverride = 'claude-opus-4-6'; }`. (2) Same fix in requeue(): before the existing prompt mutation, add `if (!task._originalPrompt) { task._originalPrompt = task.prompt; }`, replace the prompt line to rebuild from task._originalPrompt, and after `task.retryCount += 1` add `if (task.retryCount >= 2) { task.modelOverride = 'claude-opus-4-6'; }`. (3) In loop(), replace the single-ID dependency check block with array-aware logic: `const depIds = Array.isArray(task.dependsOn) ? task.dependsOn : [task.dependsOn]; let anyFailed = false, failedDepId: string|undefined, failedDepStatus: string|undefined, allSuccess = true; for (const depId of depIds) { const dep = this.tasks.get(depId) ?? this.store.get(depId) ?? undefined; if (!dep || ['failed','timeout','cancelled'].includes(dep.status)) { anyFailed = true; failedDepId = depId; failedDepStatus = dep?.status ?? 'missing'; break; } if (dep.status !== 'success') allSuccess = false; }` then fail-fast if anyFailed, re-queue if !allSuccess. (4) After `const mergeResult = await this.pool.release(...)`, add: `if (shouldMerge && mergeResult.merged) { for (const w of this.pool.getActiveWorkers(workerName)) { this.pool.rebaseOnMain(w).catch((err: unknown) => { log('warn','staged rebase failed',{worker:w,error:String(err)}); }); } }`. Also update submit() opts signature: change `dependsOn?: string` to `dependsOn?: string | string[]`. Run `npx tsc --noEmit`. Commit: `git add -A && git commit -m \"feat(scheduler): fix prompt accumulation, model escalation, array dependsOn, staged rebase\"`"
]
},
{
"waveIndex": 4,
"tasks": [
"Modify src/__tests__/worktree-pool.test.ts only. Append two describe blocks at end of file. Block 1 'WorktreePool.getActiveWorkers': (a) test returns [] when no workers busy — init pool, call getActiveWorkers(), assert empty; (b) test returns both acquired worker names — acquire two workers, assert getActiveWorkers().length===2 and includes both names; (c) test excludes named worker — acquire two, call getActiveWorkers(w1.name), assert length===1 and result[0]===w2.name. Block 2 'WorktreePool.rebaseOnMain': (a) test returns false for unknown name — init pool, call rebaseOnMain('nonexistent'), assert false; (b) test returns true when up-to-date — acquire worker, call rebaseOnMain(worker.name), assert true; (c) test returns true after rebasing onto new main commits — acquire w0, write a file in w0.path, git add+commit in w0's worktree; acquire w1, write a different file in w1.path, git add+commit in w1's worktree, capture HEAD sha, call `git update-ref refs/heads/main <sha>` in repoPath to advance main; then call rebaseOnMain(w0.name) and assert true. Each test uses makeTempRepo() and cleanup() in finally. Run `node --import tsx --test src/__tests__/worktree-pool.test.ts`. Commit: `git add -A && git commit -m \"test(worktree-pool): getActiveWorkers and rebaseOnMain coverage\"`",
"Modify src/__tests__/scheduler.test.ts only. (1) Update makePool() to add two stubs: `getActiveWorkers: (_exclude?: string) => [] as string[]` and `rebaseOnMain: async (_name: string) => true`. (2) Append three describe blocks. 'Scheduler retry — prompt accumulation fix': (a) test that on 3 attempts (2 failures then success), each retry prompt contains exactly one '## Previous Attempt Failed' section — use regex match count; (b) test that task._originalPrompt equals the original submitted prompt string after first retry. 'Scheduler retry — model escalation': test that across 3 failed attempts (maxRetries:2), modelOverride captured at attempt index 0 and 1 is undefined, and at index 2 is 'claude-opus-4-6'. 'Scheduler dependency DAG — array dependsOn': (a) test task with dependsOn:[dep1.id,dep2.id] succeeds only after both deps complete, using completion order array to verify ordering; (b) test task fails immediately with error referencing the failed dep ID when one dep in the array fails; (c) backward-compat test: string dependsOn (not array) still works without errors. Use makeStore() and inline runner mocks with setTimeout delays. Run `node --import tsx --test src/__tests__/scheduler.test.ts` then `node --import tsx --test src/__tests__/*.test.ts`. Commit: `git add -A && git commit -m \"test(scheduler): prompt accumulation fix, model escalation, array dependsOn coverage\"`"
]
}
],
"totalTasks": 7
}
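The wave 2 and wave 3 task prompts above describe three pieces of scheduler behaviour in prose: rebuilding the retry prompt from `_originalPrompt`, escalating the model once `retryCount` reaches 2, and checking an array-valued `dependsOn`. The following is a minimal sketch of that logic, not the code in `src/scheduler.ts`. The trimmed `Task` shape and the helper names `prepareRetry`, `resolveModel`, and `checkDependencies` are hypothetical, and the sketch uses a single `errorContext` argument where the task text refers to both `prevError` and `errorContext`.

```typescript
// Hypothetical, trimmed-down Task shape; the real interface in src/types.ts has more fields.
type TaskStatus = "pending" | "running" | "success" | "failed" | "timeout" | "cancelled";

interface Task {
  id: string;
  prompt: string;
  status: TaskStatus;
  retryCount: number;
  model?: string;
  modelOverride?: string;
  _originalPrompt?: string;
  dependsOn?: string | string[];
}

// Model name taken from the wave 3 task text.
const ESCALATION_MODEL = "claude-opus-4-6";

// Rebuild the retry prompt from the preserved original, so each retry carries
// exactly one "Previous Attempt Failed" section instead of accumulating them.
function prepareRetry(task: Task, errorContext?: string): void {
  task.retryCount++;
  if (!task._originalPrompt) task._originalPrompt = task.prompt;
  task.prompt = errorContext
    ? `${task._originalPrompt}\n\n---\n## Previous Attempt Failed (attempt ${task.retryCount})\nError: ${errorContext}\nFix the error above and try again.`
    : task._originalPrompt;
  if (task.retryCount >= 2) task.modelOverride = ESCALATION_MODEL;
}

// Precedence described in the wave 2 task: override, then per-task model, then default.
function resolveModel(task: Task, defaultModel: string): string {
  return task.modelOverride ?? task.model ?? defaultModel;
}

type DepState =
  | { kind: "ready" }
  | { kind: "waiting" }
  | { kind: "failed"; depId: string; depStatus: string };

// Array-aware dependency check: fail fast on the first failed or missing
// dependency, wait while any dependency has not yet succeeded.
function checkDependencies(task: Task, lookup: (id: string) => Task | undefined): DepState {
  if (task.dependsOn == null) return { kind: "ready" };
  const depIds = Array.isArray(task.dependsOn) ? task.dependsOn : [task.dependsOn];
  let allSuccess = true;
  for (const depId of depIds) {
    const dep = lookup(depId);
    if (!dep || ["failed", "timeout", "cancelled"].includes(dep.status)) {
      return { kind: "failed", depId, depStatus: dep?.status ?? "missing" };
    }
    if (dep.status !== "success") allSuccess = false;
  }
  return allSuccess ? { kind: "ready" } : { kind: "waiting" };
}
```

Returning a tagged `DepState` instead of mutating flags is a design choice of this sketch only; the task text itself uses `anyFailed` and `allSuccess` variables inside `loop()`.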
23 changes: 23 additions & 0 deletions AGENTS.md
@@ -0,0 +1,23 @@
# Repository Guidelines

## Project Structure & Module Organization
`src/` contains the TypeScript application code. Core runtime modules include `scheduler.ts`, `agent-runner.ts`, `store.ts`, `server.ts`, and the CLI entrypoints `index.ts` and `cli.ts`. Tests live in `src/__tests__/` and follow the source module names, for example `src/__tests__/scheduler.test.ts`. The web dashboard is a single static file at `src/web/index.html`; built output goes to `dist/`. Longer-form design and API docs live in `docs/`, and `ARCHITECTURE.md` explains the dependency flow between modules. Runtime artifacts such as `.cc-manager.db` and `.worktrees/` should not be treated as source.

## Build, Test, and Development Commands
Use Node.js 20+.

- `npm install`: install dependencies and enable the repo’s git hooks via `prepare`.
- `npm run dev`: run the app from source with `tsx`.
- `npm run build`: compile TypeScript to `dist/` and copy the web UI asset.
- `npm test`: run the Node test runner against `src/__tests__/*.test.ts`.
- `npx tsc --noEmit`: run the strict type check used by CI and the pre-commit hook.
- `npm run start -- --repo /path/to/repo`: run the built server locally.

## Coding Style & Naming Conventions
Follow `.editorconfig`: UTF-8, LF, and 2-space indentation. Keep source in TypeScript under `src/`; do not add plain `.js` source files there. Use Node ESM import paths with explicit `.js` extensions, for example `import { Store } from './store.js';`. Match existing file naming: kebab-case module files and `*.test.ts` for tests. Prefer explicit types and keep `strict`-mode compatibility. For the dashboard, keep `src/web/index.html` framework-free and self-contained.

## Testing Guidelines
Add or update targeted tests in `src/__tests__/` whenever behavior changes. Keep test filenames aligned with the module under test and cover both success and failure paths for scheduler, store, server, or worktree behavior. Before opening a PR, run `npx tsc --noEmit`, `npm test`, and `npm run build` if your change affects runtime packaging.

## Commit & Pull Request Guidelines
Recent history uses Conventional Commits, usually with a scope: `feat(scheduler): ...`, `fix(pipeline): ...`, `docs: ...`. Keep commits focused and descriptive. PRs should answer: what changed, why it changed, and how to test it. Follow `.github/pull_request_template.md`: confirm tests pass, types compile, `console.log` calls are removed, and docs are updated when needed.
16 changes: 14 additions & 2 deletions CLAUDE.md
@@ -13,6 +13,10 @@ Multi-agent orchestrator that runs parallel Claude Code agents in git worktrees.
- **src/worktree-pool.ts** — Git worktree lifecycle, parallel init, merge
- **src/store.ts** — SQLite persistence (better-sqlite3, WAL mode)
- **src/types.ts** — Shared TypeScript types
- **src/pipeline.ts** — 5-stage autonomous pipeline (research→decompose→execute→verify→done)
- **src/pipeline-types.ts** — Pipeline type definitions
- **src/pipeline-store.ts** — Pipeline run persistence
- **src/task-classifier.ts** — Task routing (quick/standard/deep → model/agent/contextProfile)
- **src/logger.ts** — Structured JSON logger
- **src/web/index.html** — Dashboard (vanilla HTML/JS, dark theme)
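The `src/task-classifier.ts` entry above describes routing from a quick/standard/deep category to a model, agent, and context profile, and the commit list mentions a `classifyTask` pure function. Its signature and heuristics are not shown in this diff, so the following is only an illustrative sketch of what such a pure routing function could look like. The function name `routeTask`, the word-count thresholds, the keyword list, and every model, agent, and profile name are assumptions made for this example.

```typescript
// Hypothetical routing sketch: thresholds, keywords, and all model, agent, and
// profile names below are illustrative assumptions, not values from the repo.
type TaskCategory = "quick" | "standard" | "deep";

interface TaskRoute {
  category: TaskCategory;
  model: string;
  agent: string;
  contextProfile: string;
}

const ROUTES: Record<TaskCategory, Omit<TaskRoute, "category">> = {
  quick: { model: "small-model", agent: "default-agent", contextProfile: "minimal" },
  standard: { model: "default-model", agent: "default-agent", contextProfile: "standard" },
  deep: { model: "large-model", agent: "default-agent", contextProfile: "full" },
};

const DEEP_HINTS = ["refactor", "architecture", "migrate"];

// Pure function: the same prompt always yields the same route, with no I/O.
function routeTask(prompt: string): TaskRoute {
  const text = prompt.toLowerCase();
  const wordCount = text.split(/\s+/).filter(Boolean).length;
  let category: TaskCategory = "standard";
  if (DEEP_HINTS.some((hint) => text.includes(hint)) || wordCount > 200) {
    category = "deep";
  } else if (wordCount <= 20) {
    category = "quick";
  }
  return { category, ...ROUTES[category] };
}
```

Keeping the classifier pure makes it easy to unit test in isolation, which fits the one-file-per-task practice described later in this file.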

@@ -30,7 +34,7 @@ node dist/index.js --repo /path/to/repo --workers 5 --port 8080
```

```bash
# Run tests (282 tests across 8 suites)
# Run tests (372 tests across 10 suites)
node --import tsx --test src/__tests__/*.test.ts
```

@@ -45,6 +49,14 @@ node --import tsx --test src/__tests__/*.test.ts
## Agent Flywheel Strategy
The cc-manager improves itself by running agents against its own codebase.

### Current Status: NOT WORKING (v0.1.7)
Two self-hosting runs (v0.1.5, v0.1.6) achieved only a 43-50% commit rate. All "successful" runs required manual fixes. The flywheel loop does not yet produce reliable, mergeable code autonomously.

**Root causes identified**:
- Agents exit 0 without committing (fixed in v0.1.7: F1 empty commit detection)
- Tasks that touch complex files (e.g. scheduler.ts, 618 LOC) always fail multi-point integration
- System prompt commit instruction too weak (fixed in v0.1.7: F2 CRITICAL warning)
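The first root cause above, an agent exiting 0 without committing, can be detected by comparing the worktree's HEAD before and after the run. The diff does not show how the F1 check is implemented, so this is a minimal sketch of one plausible approach. The function name `classifyAttempt` and the outcome labels are hypothetical, and the two SHAs are assumed to come from running `git rev-parse HEAD` in the worker's worktree before and after the agent runs.

```typescript
// Hypothetical outcome labels for a single agent attempt.
type AttemptOutcome = "success" | "failed" | "empty-commit";

// headBefore and headAfter are assumed to be the output of `git rev-parse HEAD`
// captured in the worktree before and after the agent runs.
function classifyAttempt(exitCode: number, headBefore: string, headAfter: string): AttemptOutcome {
  if (exitCode !== 0) return "failed";
  // A clean exit with an unchanged HEAD means the agent produced no commit.
  return headBefore.trim() === headAfter.trim() ? "empty-commit" : "success";
}
```

Treating `"empty-commit"` as a distinct outcome lets the caller decide whether to retry with a stronger commit instruction or to mark the task as failed.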

### Proven Best Practices
- **240s timeout** — sweet spot (120s = 80% failure, 180s = occasional timeout)
- **One file per task** — prevents merge conflicts between concurrent agents
@@ -146,7 +158,7 @@ pending → running → success (branch merged to main)

## Repository
- **GitHub**: `agent-next/cc-manager` (private)
- **Version**: v0.1.0
- **Version**: v0.1.7

## Security Notes
- **No authentication**: cc-manager has no auth. It is a local dev tool — do NOT expose to the public internet.