fix(flywheel): phase-1 repairs for broken flywheel loop #41
Merged
robotlearning123 merged 16 commits into agent-next:main on Mar 5, 2026
v0.1.1: Production-quality stage prompts
- Add getRepoContext() helper (file tree, language, deps detection)
- ResearchPlan: architect-grade prompt with repo awareness
- Decompose: wave ordering by import deps, self-contained tasks
- Verify: actionable errors with file paths and line numbers
- Verify→Execute feedback: generate fix tasks from errors
v0.1.2: Inter-task coherence + observability
- allowLongPrompt option in scheduler.submit()
- Wave tasks get plan context ("read .cc-pipeline/plan.md")
- verifyResults field on PipelineRun for dashboard
- markStaleRunsFailed() crash recovery on startup
v0.1.3: API polish + test coverage
- Pipeline config overrides from POST /api/pipeline
- Pipeline endpoints added to /api/docs
- 3 new tests: verify-fix grouping, crash recovery, plan context
- Per-run PipelineConfig (stage methods use cfg param)
329/329 tests pass, TSC clean.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add _runConfigs.delete() to drive().catch() error handler
- Refresh RepoContext before doVerify() (execute stage modifies repo)
- Cache cfg() locally in doResearchPlan and doExecute inner loop
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…op (v0.1.4)
P0-a: Retry with error context. Inject the previous error into the prompt on retry instead of clearing it. Both auto-retry (executeAndRelease) and manual requeue() now append error context (capped at 500 chars).
P0-b: Wave file conflict validation. extractFilePaths() extracts file paths from task prompts; validateWaves() detects intra-wave file conflicts and moves conflicting tasks to subsequent waves. This prevents parallel agents from editing the same file.
P1-a: Budget-based retry loop with dead-loop detection. The verify stage now considers three stop conditions: budget exhausted, same errors repeated (dead loop), or max iterations reached. Added totalBudget field to PipelineConfig (default $50).
Tests: 341 pass (was 329), TSC clean.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
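The P0-b behavior can be sketched roughly as follows. This is not the code from the PR: the `WaveTask` shape, the path regex, and the "first task in a wave keeps the file" rule are all assumptions made for illustration; only the names `extractFilePaths()` and `validateWaves()` come from the commit message.

```typescript
// Hypothetical task shape; the real types in the PR are not shown.
interface WaveTask {
  id: string;
  prompt: string;
}

// Assumption: a "file path" is a run of path segments ending in an extension,
// e.g. "src/scheduler.ts" or "README.md". The real heuristic may differ.
const FILE_PATH_RE = /[\w.-]+(?:\/[\w.-]+)*\.[A-Za-z0-9]+/g;

function extractFilePaths(prompt: string): Set<string> {
  return new Set(prompt.match(FILE_PATH_RE) ?? []);
}

// Returns new waves in which no two tasks of the same wave mention the same file.
// A conflicting task is carried into the next wave (and re-checked there).
function validateWaves(waves: WaveTask[][]): WaveTask[][] {
  const result: WaveTask[][] = [];
  let carried: WaveTask[] = [];
  let i = 0;
  while (i < waves.length || carried.length > 0) {
    const candidates = [...carried, ...(waves[i] ?? [])];
    carried = [];
    const claimed = new Set<string>(); // files already owned by a task in this wave
    const kept: WaveTask[] = [];
    for (const task of candidates) {
      const paths = Array.from(extractFilePaths(task.prompt));
      if (paths.some((p) => claimed.has(p))) {
        carried.push(task); // conflict: defer to a later wave
      } else {
        paths.forEach((p) => claimed.add(p));
        kept.push(task);
      }
    }
    if (kept.length > 0) result.push(kept);
    i++;
  }
  return result;
}
```

The loop always terminates because the first candidate of any non-empty wave is kept, so the carried list shrinks on every pass.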
…ask model override
Task Classifier: classifyTask() pure function auto-assigns model/timeout/budget based on prompt analysis.
- quick (<200 chars, ≤1 file) → haiku/120s/$1
- deep (refactor/redesign/architect, 3+ files) → opus/600s/$10
- standard → sonnet/300s/$5
Integrated into scheduler.submit() with caller-override priority.
Model Fallback: on retry, swap agent (claude↔codex) via AgentRunner.pickFallbackAgent() for a better chance of success.
Pipeline note: task-classifier.ts, types.ts model field, and agent-runner.ts pickFallbackAgent were created by cc-manager's own pipeline (first successful self-hosted run). Scheduler integration, tests, and classifier bug fixes were done manually.
Tests: 357 pass (was 341), TSC clean.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
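A minimal sketch of how such a classifier might look, using the thresholds quoted in the commit message. The `TaskProfile` shape, the file-counting regex, and the reading of the deep rule as "keyword OR 3+ files" are assumptions; the commit message does not say how the two deep conditions combine.

```typescript
type Tier = "quick" | "standard" | "deep";

// Hypothetical result shape; field names are illustrative.
interface TaskProfile {
  tier: Tier;
  model: "haiku" | "sonnet" | "opus";
  timeoutSec: number;
  budgetUsd: number;
}

// Values taken from the commit message.
const PROFILES: Record<Tier, TaskProfile> = {
  quick: { tier: "quick", model: "haiku", timeoutSec: 120, budgetUsd: 1 },
  standard: { tier: "standard", model: "sonnet", timeoutSec: 300, budgetUsd: 5 },
  deep: { tier: "deep", model: "opus", timeoutSec: 600, budgetUsd: 10 },
};

const DEEP_KEYWORDS = /\b(refactor|redesign|architect)/i;
// Assumption: count distinct path-like tokens ending in an extension.
const FILE_RE = /[\w.-]+(?:\/[\w.-]+)*\.[A-Za-z0-9]+/g;

// Pure function: same prompt and overrides always give the same profile.
function classifyTask(
  prompt: string,
  overrides: Partial<TaskProfile> = {},
): TaskProfile {
  const fileCount = new Set(prompt.match(FILE_RE) ?? []).size;
  let tier: Tier = "standard";
  if (DEEP_KEYWORDS.test(prompt) || fileCount >= 3) {
    tier = "deep"; // assumption: either signal is enough
  } else if (prompt.length < 200 && fileCount <= 1) {
    tier = "quick";
  }
  // Caller-supplied values win over the classifier's defaults.
  return { ...PROFILES[tier], ...overrides };
}
```

Spreading `overrides` last is one simple way to give the caller priority, which matches the "caller-override priority" wording above.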
…nd rebaseOnMain()
…ze dependsOn array as JSON
…el escalation, array
…base, multi-dep DAG (v0.1.6)
Phase 1 Critical Path:
- Fix prompt accumulation: use _originalPrompt to rebuild from original + latest error
- Model escalation: retryCount >= 2 upgrades to opus via modelOverride
- Staged rebase: after merge, rebase all active worktrees onto new main
- Dependency DAG: check ALL deps in array, not just first element
- agent-runner respects task.modelOverride in both CLI and SDK paths
Pipeline generated: types, store, worktree-pool, tests
Manual fixes: scheduler integration, multi-dep check, prompt accumulation, rebase wiring
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
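The prompt-accumulation fix and model escalation could look something like the sketch below. The `RetryableTask` shape, the `prepareRetry()` helper name, and the wording of the injected error header are hypothetical; `_originalPrompt`, `retryCount`, `modelOverride`, the `>= 2` threshold, and the 500-character cap are taken from the commit messages. Keeping the first 500 characters (rather than the last) is also an assumption.

```typescript
// Hypothetical task shape for illustration only.
interface RetryableTask {
  _originalPrompt: string; // never mutated after submit
  prompt: string;          // what the agent actually receives
  error?: string;          // error from the most recent attempt
  retryCount: number;
  modelOverride?: string;
}

const MAX_ERROR_CONTEXT = 500; // cap from the v0.1.4 commit message

function prepareRetry(task: RetryableTask): RetryableTask {
  const retryCount = task.retryCount + 1;
  const errorContext = (task.error ?? "").slice(0, MAX_ERROR_CONTEXT);
  // Rebuild from the original prompt every time, so error context from
  // older attempts does not pile up in the prompt.
  const prompt = errorContext
    ? `${task._originalPrompt}\n\nPrevious attempt failed with:\n${errorContext}`
    : task._originalPrompt;
  const next: RetryableTask = { ...task, prompt, retryCount };
  if (retryCount >= 2) {
    next.modelOverride = "opus"; // escalate on the second retry and later
  }
  return next;
}
```

Because each retry starts from `_originalPrompt`, the prompt length stays bounded by the original length plus the header and at most 500 characters of error text.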
… session resume, codex config
F1: Detect empty commits after agent exits; fail task instead of silent success
F2: CRITICAL commit enforcement in task prompt
F3: Post-merge working directory sync (syncMainWorktree)
F4: Complete pricing table with all 6 models (haiku, sonnet, opus, gpt-5.4, gpt-5.4-wide, o4-mini)
F5: Capture sessionId from Claude stream-json, --resume on retry
F6: --json-schema structured output for review agent
F7: Codex GPT-5.4 routing for deep/integration tasks, classifier outputs agent+contextProfile
F9: Codex config.toml profile management (default + wide 1M context)
372 tests pass, TSC clean.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- docs/STRATEGY.md: competitive landscape, four pillars, borrowed patterns
- docs/3-agents-reference.md: Claude CLI, SDK, Codex features + gaps
- docs/research/: agent landscape, model pricing, NeurIPS findings
- docs/plans/: v0.1.6, v0.1.7, v0.2 implementation plans
- docs/ROADMAP.md, GAP-ANALYSIS.md, COMPETITIVE-ANALYSIS.md, etc.
- .cc-pipeline/: pipeline artifacts from self-hosting runs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- CLAUDE.md: v0.1.0→v0.1.7, 282→372 tests, add pipeline modules, honest flywheel status (NOT WORKING)
- CONFIGURATION.md: claude-opus-4-5→claude-opus-4-6
- ROADMAP.md: update current state to v0.1.7, mark completed features, honest assessment
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
8 bug fixes from the full flywheel audit (B1-B8):
- fix(worktree): remove hardcoded v1/ from node_modules symlink (P0-W1)
- fix(worktree): release() defaults to merged:false (P0-W2)
- fix(agent-runner): meta tasks skip commit instructions and build verify (P0-A1)
- fix(pipeline): ensureMetaTaskSucceeded / runMetaTask helper (P0-A2)
- fix(scheduler): review rejection and merge conflict set status="failed" (P0-S1)
- fix(scheduler): populate mergeGate on all review/merge paths
- fix(pipeline): cancel aborts running tasks, cleans all Maps (P0-S2)
- fix(pipeline): per-run state isolation with Maps + scoped dirs (P0-S3)
Additional quality fixes from simplify review:
- cancelledTasks guard: don't overwrite "failed" with "cancelled"
- N→1 DB writes: pipelineStore.save per-wave not per-task
- Extract runMetaTask() to deduplicate 3x meta-task pattern
- MergeGateState type + persistence in store
384 tests pass, 0 fail, TSC clean.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
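The cancelledTasks guard can be illustrated with a small sketch. The status union, the `TaskRecord` shape, and the `cancelRun()` helper are hypothetical and stand in for whatever the scheduler and pipeline actually use; the only behavior taken from the commit message is that cancelling must not overwrite "failed" with "cancelled".

```typescript
// Hypothetical status set and record shape, for illustration only.
type TaskStatus = "pending" | "running" | "success" | "failed" | "cancelled";

interface TaskRecord {
  id: string;
  status: TaskStatus;
}

// Cancels every task of one run that has not finished yet and returns their ids.
// Tasks that already reached a terminal state ("success" or "failed") keep it,
// so a real failure is never hidden behind "cancelled".
function cancelRun(tasks: Map<string, TaskRecord>): string[] {
  const cancelled: string[] = [];
  tasks.forEach((task) => {
    if (task.status === "pending" || task.status === "running") {
      task.status = "cancelled";
      cancelled.push(task.id);
    }
  });
  return cancelled;
}
```

Passing in a per-run `Map` keeps the cancel scoped to a single pipeline run, in the spirit of the per-run state isolation fix listed above.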
GitHub Actions runners lack git user.name/email config, causing `git commit --allow-empty` to fail.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
8 bug fixes from the full flywheel audit, addressing critical failures in every stage of the Plan → Execute → Merge → Learn loop:
- Remove hardcoded v1/ from node_modules symlink; release() defaults to merged:false
- runMetaTask() helper with status checking; per-run state isolation (Map + scoped dirs); cancel aborts running tasks and cleans all Maps; N→1 DB writes per wave
- Review rejection and merge conflict set task.status="failed"; cancelledTasks guard prevents overwriting failure status; mergeGate populated on all review/merge paths
- MergeGateState type + persistence via merge_gate column
Test plan
- npx tsc --noEmit: zero errors
🤖 Generated with Claude Code