Ralph is Flow-Code's repo-local autonomous harness. It loops over tasks, applies multi-model review gates, and produces production-quality code overnight.
TL;DR: External shell loop → fresh Claude session per task → cross-model review gates → receipt-based proof-of-work → iterate until SHIP.
- Quick Start
- Architecture
- Quality Gates
- Configuration Reference
- Review Backends
- Run Artifacts
- Controlling Ralph
- Testing & Debugging
- Safety & Isolation
- Troubleshooting
- Morning Review Workflow
# Inside Claude Code
/flow-code:ralph-init
# Or from terminal
claude -p "/flow-code:ralph-init"

Creates scripts/ralph/ with:

| File | Purpose |
|---|---|
| `ralph.sh` | Main loop |
| `ralph_once.sh` | Single iteration (testing) |
| `config.env` | All settings |
| `runs/` | Artifacts and logs |
Edit scripts/ralph/config.env:
PLAN_REVIEW=codex # rp, codex, or none
WORK_REVIEW=codex # rp, codex, or none

scripts/ralph/ralph_once.sh

Always test first. Runs one iteration then exits. Observe before committing to a full run.

scripts/ralph/ralph.sh

Ralph spawns Claude sessions via claude -p, loops until done, and applies review gates.
Watch mode — see activity in real-time:
scripts/ralph/ralph.sh --watch # Tool calls only
scripts/ralph/ralph.sh --watch verbose # Include model responses
scripts/ralph/ralph.sh --config alt.env # Use alternate config file

To uninstall, run manually in a terminal:

rm -rf scripts/ralph/

┌─────────────────────────────────────────────────────────────┐
│ scripts/ralph/ralph.sh │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ while flowctl next returns work: │ │
│ │ 1. claude -p "/flow-code:plan" or :work │ │
│ │ 2. check review receipts │ │
│ │ 3. if missing/invalid → retry │ │
│ │ 4. if SHIP verdict → next task │ │
│ └────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
flowchart TD
A[ralph.sh loop] --> B[flowctl next]
B -->|plan needed| C[/flow-code:plan/]
C --> D[/flow-code:plan-review/]
B -->|work needed| E[/flow-code:work/]
E --> F[/flow-code:impl-review/]
B -->|completion review needed| K[/flow-code:epic-review/]
D --> G{Receipt valid?}
F --> G
K --> G
G -- yes --> H{Verdict = SHIP?}
H -- yes --> B
H -- no --> I[Fix issues, retry review]
I --> G
G -- no --> J[Force retry iteration]
J --> B
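
The control flow in the two diagrams can be sketched as a small bash loop. This is a minimal illustration, not the actual ralph.sh: `next_status`, `run_session`, and `receipt_ok` are hypothetical hooks standing in for `flowctl next`, a fresh `claude -p` session, and receipt validation, and the `MAX_ITERATIONS` exit message is an assumption.

```bash
#!/usr/bin/env bash
# Sketch of an external loop that starts one fresh session per iteration.
# next_status, run_session and receipt_ok are hypothetical hooks supplied by
# the caller; they are NOT part of flowctl or Claude Code.

ralph_loop() {
  local max_iterations="${1:-25}"
  local iter=0 status

  while [ "$iter" -lt "$max_iterations" ]; do
    status="$(next_status)"        # e.g. plan, work, completion_review, none
    if [ "$status" = "none" ]; then
      echo "NO_WORK"
      return 0
    fi

    iter=$((iter + 1))
    run_session "$status"          # one fresh session per iteration

    if receipt_ok "$status"; then
      echo "iter $iter: $status accepted"
    else
      echo "iter $iter: $status has no valid receipt, will retry"
    fi
  done

  echo "MAX_ITERATIONS"
  return 1
}
```

Because the loop lives outside the session, a failed iteration leaves nothing behind except files on disk, which is what makes the retry path safe.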
Anthropic's official ralph-wiggum uses a Stop hook to keep Claude in the same session. Flow-Code inverts this for production-grade reliability.
| Aspect | ralph-wiggum | Ralph |
|---|---|---|
| Session | Single, accumulating | Fresh per iteration |
| Loop | Stop hook, same session | External bash, new claude -p |
| Context | Grows until full | Clean slate every time |
| Failed attempts | Pollute future work | Gone with session |
| Re-anchoring | None | Every iteration |
| Quality gates | Tests only | Multi-model reviews |
| Stuck detection | `--max-iterations` | Auto-block after N failures |
| Auditability | Session transcript | Logs + receipts + evidence |
The core problems with ralph-wiggum:
- Context pollution — Failed attempts mislead future iterations
- No re-anchoring — Claude loses sight of the spec as context fills
- Single model — Claude grades its own homework
- Binary outcome — Completion promise or max iterations
Ralph's solution: Fresh context + multi-model review gates + receipt-based proof-of-work.
Ralph enforces quality through four mechanisms:
A second model verifies code. Two models catch what one misses.
| Backend | Platform | Context | Recommended |
|---|---|---|---|
| `rp` | macOS (GUI) | Full file context via Builder | Yes |
| `codex` | Cross-platform | Heuristic context from changed files | Fallback |
| `none` | Any | - | Not for production |
Two review types:
- Plan reviews — Verify architecture before coding starts
- Impl reviews — Verify implementation meets spec after coding
The plan review gate ensures epics are architecturally sound before any implementation begins. This catches design issues early when they're cheap to fix.
┌─────────────────────────────────────────────────────────────┐
│ flowctl next --require-plan-review │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ 1. Find epics with plan_review_status = unknown │ │
│ │ 2. Return status=plan, epic=fn-1 │ │
│ │ 3. Ralph invokes /flow-code:plan-review fn-1 │ │
│ │ 4. Skill loops until <verdict>SHIP</verdict> │ │
│ │ 5. flowctl epic set-plan-review-status fn-1 --status ship │
│ │ 6. Next iteration: epic unlocked for work │ │
│ └────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
Both settings are required for plan reviews:
# config.env
REQUIRE_PLAN_REVIEW=1 # Gate: don't start work until plans reviewed
PLAN_REVIEW=codex # Backend: rp, codex, or export

| `REQUIRE_PLAN_REVIEW` | `PLAN_REVIEW` | Behavior |
|---|---|---|
| `0` | any | Plans auto-ship, work starts immediately |
| `1` | `rp` | Plans reviewed via RepoPrompt |
| `1` | `codex` | Plans reviewed via Codex CLI |
| `1` | `export` | Context exported for manual review |
| `1` | `none` | Blocked forever: no backend to review |

Common mistake: Setting `REQUIRE_PLAN_REVIEW=1` without a `PLAN_REVIEW` backend. Ralph will block on every epic with no way to proceed.
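
The configuration matrix can be checked before a run starts. The sketch below encodes the table as a function. `plan_gate_behavior` is a hypothetical helper, not a flowctl command, and treating an empty or unknown backend as blocked is an assumption.

```bash
# Classify the plan-gate behaviour for a REQUIRE_PLAN_REVIEW / PLAN_REVIEW
# pair. Hypothetical helper; the output labels are illustrative only.
plan_gate_behavior() {
  local require="${1:-0}" backend="${2:-}"

  if [ "$require" != "1" ]; then
    echo "auto-ship"
    return 0
  fi

  case "$backend" in
    rp|codex) echo "review:$backend" ;;
    export)   echo "manual-export" ;;
    *)        echo "blocked"; return 1 ;;   # none, empty or unknown (assumed)
  esac
}
```

A pre-flight check could then be `plan_gate_behavior "$REQUIRE_PLAN_REVIEW" "$PLAN_REVIEW" >/dev/null || echo "fix config.env" >&2`.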
When flowctl next returns status=plan:
- Checkpoint: Save epic state before review
  flowctl checkpoint save --epic fn-1 --json
- Review: Invoke the plan review skill
  /flow-code:plan-review fn-1 --review=codex
- Fix loop: If `NEEDS_WORK`:
  - Parse reviewer feedback
  - Update epic spec via `flowctl epic set-plan`
  - Sync affected task specs via `flowctl task set-spec`
  - Re-review (same chat for RP, receipt continuity for Codex)
  - Repeat until `SHIP`
- Receipt: Write proof-of-work
  {"type":"plan_review","id":"fn-1","mode":"codex","timestamp":"..."}
- Unlock: Set status to ship
  flowctl epic set-plan-review-status fn-1 --status ship
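
The fix loop in the steps above is bounded in practice (see `MAX_REVIEW_ITERATIONS`). A minimal sketch of such a bounded cycle follows. `run_review` and `apply_fixes` are hypothetical hooks, and stopping immediately on `MAJOR_RETHINK` or a missing verdict is an assumption, not documented behavior.

```bash
# Bounded fix/re-review cycle. run_review and apply_fixes are hypothetical
# hooks; run_review must print SHIP, NEEDS_WORK or MAJOR_RETHINK.
review_until_ship() {
  local id="$1" max="${MAX_REVIEW_ITERATIONS:-3}"
  local cycle=1 verdict

  while [ "$cycle" -le "$max" ]; do
    verdict="$(run_review "$id")"
    case "$verdict" in
      SHIP)
        echo "SHIP after $cycle review(s)"
        return 0
        ;;
      NEEDS_WORK)
        apply_fixes "$id"          # address feedback, then review again
        ;;
      *)
        echo "stopping on verdict: ${verdict:-none}" >&2
        return 2
        ;;
    esac
    cycle=$((cycle + 1))
  done

  echo "no SHIP after $max review(s)" >&2
  return 1
}
```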
If context compacts during review cycles:
flowctl checkpoint restore --epic fn-1 --json

This restores the epic/task state from before the review started.
# Check all epics
flowctl epics --json | jq '.epics[] | {id, plan_review_status}'
# Check specific epic
flowctl show fn-1 --json | jq '.plan_review_status'
# Find epics needing review
flowctl next --require-plan-review --json

| Aspect | Plan Review | Impl Review |
|---|---|---|
| When | Before coding | After coding |
| Reviews | Epic + task specs | Code changes |
| Blocks | All tasks in epic | Single task |
| Focus | Architecture, feasibility, scope | Correctness, security, tests |
| Config | `PLAN_REVIEW` + `REQUIRE_PLAN_REVIEW` | `WORK_REVIEW` |
The epic-completion review gate ensures implementation matches the spec before closing an epic. It runs after all tasks complete and checks for requirement gaps.
┌─────────────────────────────────────────────────────────────┐
│ flowctl next --require-completion-review │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ 1. All tasks done, completion_review_status != ship │ │
│ │ 2. Return status=completion_review, epic=fn-1 │ │
│ │ 3. Ralph invokes /flow-code:epic-review fn-1 │ │
│ │ 4. Skill loops until <verdict>SHIP</verdict> │ │
│ │ 5. flowctl epic set-completion-review-status fn-1 --status ship │
│ │ 6. Next iteration: epic can close │ │
│ └────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
# config.env
COMPLETION_REVIEW=codex # Backend: rp, codex, or none

When COMPLETION_REVIEW != none, Ralph passes --require-completion-review to the selector. There is no separate REQUIRE_COMPLETION_REVIEW flag; the presence of a backend implies the gate is active.

| `COMPLETION_REVIEW` | Behavior |
|---|---|
| `rp` | Completion reviewed via RepoPrompt |
| `codex` | Completion reviewed via Codex CLI |
| `none` | No completion review, epics close immediately |
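
How the two gates translate into selector flags can be sketched as follows. Only the flag names and config variables come from this document. `selector_flags` is a hypothetical helper, and the real ralph.sh may assemble its arguments differently.

```bash
# Build selector flags from config values.
# selector_flags is a hypothetical helper used for illustration.
selector_flags() {
  local require_plan="${1:-0}" completion_backend="${2:-none}"
  local flags=()

  if [ "$require_plan" = "1" ]; then
    flags+=(--require-plan-review)
  fi
  # No separate REQUIRE_* switch: any backend other than "none" enables the gate.
  if [ "$completion_backend" != "none" ]; then
    flags+=(--require-completion-review)
  fi

  echo "${flags[*]}"
}
```

For example, `selector_flags "$REQUIRE_PLAN_REVIEW" "$COMPLETION_REVIEW"` yields the extra arguments to pass to `flowctl next`.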
When flowctl next returns status=completion_review:
- Review: Invoke the epic-review skill
  /flow-code:epic-review fn-1 --review=codex
- Fix loop: If `NEEDS_WORK`:
  - Parse reviewer feedback (requirement gaps, missing functionality)
  - Implement missing requirements inline
  - Re-review (same chat for RP, receipt continuity for Codex)
  - Repeat until `SHIP`
- Receipt: Skill writes proof-of-work to `receipts/completion-fn-1.json`
  {"type":"completion_review","id":"fn-1","mode":"codex","verdict":"SHIP","timestamp":"..."}
- Unlock: Set status to ship
  flowctl epic set-completion-review-status fn-1 --status ship
- Close: Epic can now close normally
| Issue Type | Example |
|---|---|
| Decomposition gaps | Spec mentioned rate limiting, no task created |
| Partial implementation | Task marked done but only covers happy path |
| Cross-task gaps | Auth task done, logging task done, but no audit trail |
| Missing doc updates | Spec required README update, not done |
| Aspect | Impl Review | Completion Review |
|---|---|---|
| When | After each task | After all tasks done |
| Scope | Single task acceptance | Entire epic spec |
| Checks | Code quality, tests | Spec compliance |
| Focus | "Is this task done right?" | "Did we deliver everything?" |
| Config | `WORK_REVIEW` | `COMPLETION_REVIEW` |
Every review produces a receipt JSON:
{
"type": "impl_review",
"id": "fn-1.1",
"mode": "rp",
"timestamp": "2026-01-09T..."
}

No receipt = no progress. Ralph retries until a receipt exists.
This is at-least-once delivery. The agent is untrusted; receipts are proof-of-work.
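
A receipt check of this kind can be sketched in a few lines. The version below uses grep so that it has no dependencies. `receipt_valid` is a hypothetical helper, and a real implementation would more likely parse the JSON properly.

```bash
# Check that a receipt file exists, is non-empty, and names the expected
# review type and id. Hypothetical helper; grep-based to stay dependency-free.
receipt_valid() {
  local file="$1" type="$2" id="$3"
  local id_re="${id//./\\.}"       # escape dots so fn-1.1 matches literally

  [ -s "$file" ] || return 1
  grep -Eq "\"type\"[[:space:]]*:[[:space:]]*\"$type\"" "$file" || return 1
  grep -Eq "\"id\"[[:space:]]*:[[:space:]]*\"$id_re\"" "$file" || return 1
}
```

Usage: `receipt_valid "$run_dir/receipts/impl-fn-1.1.json" impl_review fn-1.1 || echo "retry"`.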
Reviews block progress until approved:
<verdict>SHIP</verdict>

Fix → re-review → fix → re-review... until the reviewer approves.

Verdict tags:

| Verdict | Meaning |
|---|---|
| `<verdict>SHIP</verdict>` | Approved, proceed |
| `<verdict>NEEDS_WORK</verdict>` | Fix issues, re-review |
| `<verdict>MAJOR_RETHINK</verdict>` | Fundamental problems |
Common failures:
- Plain text "SHIP" → review skill not used correctly
- Interactive prompt (a/b/c) → backend misconfigured
- No verdict → check iteration log
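
To make the tag requirement concrete, here is a minimal sketch of verdict extraction from an iteration log. `extract_verdict` is a hypothetical helper, and taking the last tag in the log as authoritative is an assumption.

```bash
# Print the last verdict tag found in a log, or fail if there is none.
# Plain-text "SHIP" without the XML-style tags is deliberately rejected.
extract_verdict() {
  local log="$1" verdict

  verdict="$(grep -oE '<verdict>(SHIP|NEEDS_WORK|MAJOR_RETHINK)</verdict>' "$log" 2>/dev/null \
    | tail -n 1 \
    | sed -E 's#</?verdict>##g')"

  [ -n "$verdict" ] || return 1
  echo "$verdict"
}
```

A caller might run `extract_verdict scripts/ralph/runs/<run-id>/iter-001.log` and treat a non-zero exit as "no verdict".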
When enabled, NEEDS_WORK reviews auto-capture learnings:
flowctl config set memory.enabled true

This builds .flow/memory/pitfalls.md, a record of things reviewers catch that models miss.

Note: Memory config is in `.flow/config.json`, separate from Ralph's `config.env`.
Edit scripts/ralph/config.env:
| Variable | Values | Default | Description |
|---|---|---|---|
| `PLAN_REVIEW` | `rp`, `codex`, `none` | - | Plan review backend |
| `WORK_REVIEW` | `rp`, `codex`, `none` | - | Impl review backend |
| `COMPLETION_REVIEW` | `rp`, `codex`, `none` | - | Completion review backend |
| `REQUIRE_PLAN_REVIEW` | `0`, `1` | `0` | Block work until plan approved |
| Variable | Values | Default | Description |
|---|---|---|---|
| `BRANCH_MODE` | `new`, `current`, `worktree` | `new` | Branch strategy |

- `new`: One branch for entire run (`ralph-<run-id>`)
- `current`: Work on current branch
- `worktree`: Git worktrees (advanced)
| Variable | Default | Description |
|---|---|---|
| `MAX_ITERATIONS` | `25` | Total loop iterations |
| `MAX_TURNS` | ∞ | Claude turns per iteration |
| `MAX_ATTEMPTS_PER_TASK` | `5` | Retries before auto-blocking |
| `MAX_REVIEW_ITERATIONS` | `3` | Fix+re-review cycles per review |
| `WORKER_TIMEOUT` | `3600` | Seconds before killing stuck worker |
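
One way a limit like `WORKER_TIMEOUT` could be enforced is with the coreutils `timeout` utility, which exits with status 124 when the limit is hit. The wrapper below is a sketch under that assumption. `run_with_timeout` is hypothetical, and ralph.sh may kill stuck workers by other means.

```bash
# Run a command under a time limit, in the spirit of WORKER_TIMEOUT.
# Assumes GNU coreutils `timeout` is installed (not present on stock macOS).
run_with_timeout() {
  local limit="${WORKER_TIMEOUT:-3600}" rc=0

  timeout "$limit" "$@" || rc=$?
  if [ "$rc" -eq 124 ]; then
    echo "worker timed out after ${limit}s" >&2
  fi
  return "$rc"
}
```

Usage: `WORKER_TIMEOUT=1800 run_with_timeout some_worker_command`, where `some_worker_command` is whatever starts the session.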
| Variable | Example | Description |
|---|---|---|
| `EPICS` | `fn-1,fn-2` | Limit to specific epics (empty = all) |
| Variable | Default | Description |
|---|---|---|
| `YOLO` | `1` | Skip permission prompts |

Note: `YOLO=1` is required for unattended runs. Set `YOLO=0` for interactive testing.
| Variable | Default | Description |
|---|---|---|
| `RALPH_UI` | `1` | Colored/emoji output |
| Variable | Default | Description |
|---|---|---|
| `CODEX_SANDBOX` | `auto` | `read-only`, `workspace-write`, `danger-full-access`, `auto` |
| `FLOW_CODEX_EMBED_MAX_BYTES` | `500000` | Max bytes embedded in prompts |

Windows: Use `auto` or `danger-full-access`. The `read-only` mode blocks all shell commands.
When using PLAN_REVIEW=rp or WORK_REVIEW=rp:
flowctl rp pick-window --repo-root . # Find window
flowctl rp builder ... # Build context
flowctl rp chat-send ... # Send to reviewer

Never call `rp-cli` directly in Ralph mode. Use flowctl wrappers.
Window selection is automatic. With RP 1.5.68+, --create auto-opens windows.
When using PLAN_REVIEW=codex or WORK_REVIEW=codex:
flowctl codex check # Verify available
flowctl codex impl-review ... # Run impl review
flowctl codex plan-review <id> --files "src/auth.ts,src/config.ts"

Requirements:

npm install -g @openai/codex && codex auth

Advantages:

- Cross-platform (Windows, Linux, macOS)
- Terminal-based (no GUI)
- Session continuity via `thread_id`
Each run creates:
scripts/ralph/runs/<run-id>/
├── iter-001.log # Raw Claude output
├── iter-002.log
├── progress.txt # Append-only run log
├── attempts.json # Per-task retry counts
├── branches.json # Branch info
├── receipts/
│ ├── plan-fn-1.json # Plan review receipt
│ ├── impl-fn-1.1.json # Impl review receipt
│ └── completion-fn-1.json # Completion review receipt
└── block-fn-1.2.md # Written when task auto-blocked
flowctl status # Epic/task counts + active runs
flowctl ralph pause # Pause run
flowctl ralph resume # Resume run
flowctl ralph stop # Graceful stop
flowctl ralph status # Show run state
flowctl ralph pause --run <id> # Specify run when multiple active

# Pause
touch scripts/ralph/runs/<run-id>/PAUSE
# Resume
rm scripts/ralph/runs/<run-id>/PAUSE
# Stop (kept for audit)
touch scripts/ralph/runs/<run-id>/STOP

Ralph checks sentinels at iteration boundaries.
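
A sentinel check at an iteration boundary might look like the sketch below. The `PAUSE` and `STOP` file names come from this document. The function names, the polling interval, and giving `STOP` priority over `PAUSE` are assumptions.

```bash
# Report the run state from sentinel files: "stop", "pause" or "run".
# STOP taking priority over PAUSE is an assumption of this sketch.
sentinel_state() {
  local run_dir="$1"
  if [ -e "$run_dir/STOP" ]; then
    echo "stop"
  elif [ -e "$run_dir/PAUSE" ]; then
    echo "pause"
  else
    echo "run"
  fi
}

# Block while PAUSE exists, then succeed only if the loop may continue.
wait_for_go() {
  local run_dir="$1" poll="${2:-5}"
  while [ "$(sentinel_state "$run_dir")" = "pause" ]; do
    sleep "$poll"
  done
  [ "$(sentinel_state "$run_dir")" = "run" ]
}
```

A loop would call `wait_for_go "$run_dir" || break` before starting each iteration, so a pause or stop never interrupts a session midway.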
flowctl unblock fn-1.2 # Re-enable blocked task
flowctl update fn-1.2 --status pending # Reset to pending

scripts/ralph/ralph_once.sh

Runs one iteration then exits. Verify setup before full runs.
scripts/ralph/ralph.sh --watch # Tool calls
scripts/ralph/ralph.sh --watch verbose # Include responses

Real-time visibility without blocking autonomy.

scripts/ralph/ralph.sh --config my-codex-config.env
scripts/ralph/ralph.sh --watch --config rp-reviews.env

Use alternate config files for different platforms or review backends without editing config.env. Useful for:
- Separate configs for RepoPrompt vs Codex reviews
- Platform-specific settings (macOS vs Linux vs Windows)
- Testing different `MAX_ITERATIONS` or `WORKER_TIMEOUT` values
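
For readers curious how `--watch` and `--config` can combine in any order, here is a minimal argument-parsing sketch. `parse_ralph_args`, `WATCH_MODE`, and `CONFIG_FILE` are hypothetical names, and the default config path is assumed. The real script's parsing and path resolution may differ.

```bash
# Parse "--watch [verbose]" and "--config FILE" into shell variables.
# All names here are hypothetical and chosen for illustration.
parse_ralph_args() {
  WATCH_MODE="off"
  CONFIG_FILE="scripts/ralph/config.env"

  while [ "$#" -gt 0 ]; do
    case "$1" in
      --watch)
        WATCH_MODE="tools"
        if [ "${2:-}" = "verbose" ]; then
          WATCH_MODE="verbose"
          shift
        fi
        ;;
      --config)
        [ "$#" -ge 2 ] || { echo "--config needs a file" >&2; return 2; }
        CONFIG_FILE="$2"
        shift
        ;;
      *)
        echo "unknown argument: $1" >&2
        return 2
        ;;
    esac
    shift
  done
}
```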
FLOW_RALPH_VERBOSE=1 scripts/ralph/ralph.sh

Detailed logs → scripts/ralph/runs/<run>/ralph.log
Ralph inherits Claude Code's default model (Opus) for both the main session and worker subagents (model: inherit). Only set FLOW_RALPH_CLAUDE_MODEL if you want to override.
FLOW_RALPH_CLAUDE_MODEL=claude-opus-4-6 # only needed to override default
FLOW_RALPH_CLAUDE_DEBUG=hooks
FLOW_RALPH_CLAUDE_PERMISSION_MODE=bypassPermissions

Run Ralph inside Docker for isolation:

docker sandbox run claude "scripts/ralph/ralph.sh"
docker sandbox run -w ~/my-project claude "scripts/ralph/ralph.sh"

See Docker sandbox docs.
Community sandbox setups:
- devcontainer-for-claude-yolo-and-flow-code — VS Code devcontainer with Playwright, firewall whitelisting, RepoPrompt MCP bridge
- agent-sandbox — Docker Sandbox (Desktop 4.50+) with seccomp/namespace isolation
DCG blocks destructive commands before execution.
What it blocks:
| Command | Without DCG | With DCG |
|---|---|---|
| `git reset --hard` | Loses work | Blocked |
| `rm -rf ./src` | Deletes source | Blocked |
| `git push --force` | Overwrites history | Blocked |
| `git clean -f` | Deletes files | Blocked |
Install:
curl -fsSL "https://raw.githubusercontent.com/Dicklesworthstone/destructive_command_guard/master/install.sh?$(date +%s)" | bash -s -- --easy-mode

Compatibility: DCG uses a fail-open design, so a timeout lets the command through. Flow-Code uses safe git patterns and quoted heredocs that DCG handles correctly.

Note: DCG will block `rm -rf .flow/` and `rm -rf scripts/ralph/`. This is correct behavior. Uninstall commands should be run manually, not via AI agents. Your epics and tasks are protected.
Verify:
dcg test 'git reset --hard HEAD' # Should block
dcg test 'git checkout -b feature' # Should allow

Uninstall:

rm ~/.local/bin/dcg
# Edit ~/.claude/settings.json to remove dcg hook
rm -rf ~/.config/dcg/

More info: DCG GitHub · Pack Reference
Plugin hooks enforce workflow rules deterministically.
Only active when `FLOW_RALPH=1`: zero overhead for non-Ralph users.
| Rule | Purpose |
|---|---|
| No `--json` on chat-send | Preserve review text output |
| No `--new-chat` on re-reviews | Keep conversation context |
| Receipt before Stop | Prevent skipping reviews |
| Required flags on setup | Ensure proper targeting |
Location:
plugins/flow-code/
hooks/hooks.json # Config
bin/flowctl hook ralph-guard # Logic
Disable temporarily: Unset FLOW_RALPH
Disable permanently: Delete hooks/ directory
Symptoms: Ralph exits with NO_WORK but epics have plan_review_status: unknown.
Check config:
grep -E "REQUIRE_PLAN_REVIEW|PLAN_REVIEW" scripts/ralph/config.env

Common causes:

| Config | Problem | Fix |
|---|---|---|
| `REQUIRE_PLAN_REVIEW=0` | Plan gate disabled | Set to `1` |
| `PLAN_REVIEW=none` + `REQUIRE_PLAN_REVIEW=1` | No backend to review | Set `PLAN_REVIEW=codex` or `rp` |
| `PLAN_REVIEW` unset | Defaults to template placeholder | Set explicitly |
Verify selector sees plan work:
flowctl next --require-plan-review --json

Should return status: "plan" if epics need review.
Symptoms: Ralph loops on plan review, never progresses to work.
Check:
# What's the epic status?
flowctl show fn-1 --json | jq '.plan_review_status'
# Is there a receipt?
ls scripts/ralph/runs/*/receipts/plan-fn-1.json
# What verdict did we get?
grep -i verdict scripts/ralph/runs/*/iter-*.log | grep plan

Common causes:

- `PLAN_REVIEW=none` with `REQUIRE_PLAN_REVIEW=1` → blocked forever
- Review returns `NEEDS_WORK` repeatedly → plan has fundamental issues
- No verdict tag in response → backend misconfigured
Fix: Either set a review backend or disable the gate:
# Option A: Enable codex reviews
PLAN_REVIEW=codex
# Option B: Disable gate (plans auto-ship)
REQUIRE_PLAN_REVIEW=0

Symptoms: Epic A completes, but Epic B (depends on A) never starts.
Check:
# Is A actually closed?
flowctl show fn-1 --json | jq '.status'
# Does B depend on A?
flowctl show fn-2 --json | jq '.depends_on_epics'

Common cause: Race condition where the selector runs before maybe_close_epics(). Fixed in v0.18.23+.
Workaround for older versions:
# Manually close the epic
flowctl epic close fn-1 --json
# Re-run Ralph
scripts/ralph/ralph.sh

Symptoms: Ralph keeps retrying the same task.

Check receipts:

ls scripts/ralph/runs/*/receipts/

Check verdict:

grep -i verdict scripts/ralph/runs/*/iter-*.log | tail -5

Common causes:
- No receipt file → review skill not invoked
- Wrong verdict format → plain text instead of XML tags
- Receipt exists but verdict is NEEDS_WORK → implementation has issues
After MAX_ATTEMPTS_PER_TASK failures:
- Ralph writes `block-<task>.md` with context
- Marks task blocked via `flowctl block`
- Moves to next task
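
The counting behind auto-blocking can be sketched as below. This version keeps one plain-text counter file per task for simplicity. The real run stores counts in `attempts.json`, whose format is not shown in this document, so `record_attempt` and its file layout are purely illustrative.

```bash
# Count attempts per task and report when the limit is reached.
# Hypothetical helper: uses attempts-<task>.txt rather than attempts.json.
record_attempt() {
  local dir="$1" task="$2" max="${MAX_ATTEMPTS_PER_TASK:-5}"
  local file="$dir/attempts-$task.txt" count=0

  if [ -f "$file" ]; then
    count="$(cat "$file")"
  fi
  count=$((count + 1))
  echo "$count" > "$file"

  if [ "$count" -ge "$max" ]; then
    echo "block"     # caller would write block-<task>.md and block the task
  else
    echo "retry"
  fi
}
```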
To retry:
flowctl unblock fn-1.2

"rp-cli not found":

# Install RepoPrompt, then:
which rp-cli

Window not found:

- RP 1.5.68+: Use `--create` flag
- Older: Open RepoPrompt on your repo manually
Alternative: Switch to Codex backend.
"codex not found":
npm install -g @openai/codex
codex auth

Windows "blocked by policy":

# In config.env:
CODEX_SANDBOX=auto

The read-only sandbox blocks all commands on Windows.
# Progress
cat scripts/ralph/runs/*/progress.txt
# Latest iteration
tail -100 "$(ls -t scripts/ralph/runs/*/iter-*.log | head -1)"
# Blocked tasks
ls scripts/ralph/runs/*/block-*.md

After overnight runs, review and merge the work.
# Run status
cat scripts/ralph/runs/*/progress.txt | tail -5
# Blocked tasks
ls scripts/ralph/runs/*/block-*.md 2>/dev/null
# Pending tasks
flowctl ready --json

Partial run? Review block-*.md, fix issues, re-run ralph.sh (resumes from pending).
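
The status checks above can be folded into one helper for a quick morning triage. `run_summary` is a hypothetical function built on the run directory layout described under Run Artifacts. It only counts files and does not inspect verdicts.

```bash
# Summarise one run directory: receipts written and tasks auto-blocked.
# Returns non-zero when at least one task was blocked.
run_summary() {
  local run_dir="$1" receipts=0 blocked=0 f

  for f in "$run_dir"/receipts/*.json; do
    if [ -e "$f" ]; then receipts=$((receipts + 1)); fi
  done
  for f in "$run_dir"/block-*.md; do
    if [ -e "$f" ]; then blocked=$((blocked + 1)); fi
  done

  echo "receipts=$receipts blocked=$blocked"
  [ "$blocked" -eq 0 ]
}
```

Usage: `run_summary scripts/ralph/runs/<run-id> || echo "review block files first"`.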
# Summary
cat scripts/ralph/runs/*/progress.txt
# All reviews passed
ls scripts/ralph/runs/*/receipts/
# Commits
git log --oneline

Commits include task IDs (feat(fn-1.1): ...):

git log --oneline --grep="fn-1"
git log --oneline --grep="fn-2"

All good:
git checkout main
git merge ralph-<run-id>
# Or: gh pr create

One epic is bad (cherry-pick good ones):
git checkout main
git cherry-pick <fn-1-commits>
git cherry-pick <fn-2-commits>
# Skip fn-3

One epic is bad (revert and merge):
git checkout ralph-<run-id>
git revert <fn-3-commits>
git checkout main
git merge ralph-<run-id>

git log --oneline --grep="fn-1"
flowctl show fn-1.1 --json | jq '.evidence.commits'

- flowctl CLI
- Flow-Code README
- flow-code-tui
- Test scripts: `plugins/flow-code/scripts/ralph_e2e_*.sh`