Ralph — Autonomous Loop

Ralph is Flow-Code's repo-local autonomous harness. It loops over tasks, gates each one behind multi-model reviews, and is designed to run unattended overnight.

TL;DR: External shell loop → fresh Claude session per task → cross-model review gates → receipt-based proof-of-work → iterate until SHIP.

Quick Start
Architecture
- How It Works
- Why Ralph vs ralph-wiggum
Quality Gates
Configuration Reference
Review Backends
- RepoPrompt
- Codex CLI
Run Artifacts
Controlling Ralph
Testing & Debugging
Safety & Isolation
Troubleshooting
Morning Review Workflow

Quick Start

1. Initialize

# Inside Claude Code
/flow-code:ralph-init

# Or from terminal
claude -p "/flow-code:ralph-init"

Creates scripts/ralph/ with:

File	Purpose
`ralph.sh`	Main loop
`ralph_once.sh`	Single iteration (testing)
`config.env`	All settings
`runs/`	Artifacts and logs

2. Configure

Edit scripts/ralph/config.env:

PLAN_REVIEW=codex   # rp, codex, or none
WORK_REVIEW=codex   # rp, codex, or none

3. Test

scripts/ralph/ralph_once.sh

Always test first. ralph_once.sh runs a single iteration and exits, so you can inspect the result before committing to a full run.

4. Run

scripts/ralph/ralph.sh

Ralph spawns Claude sessions via claude -p, loops until done, and applies review gates.

Watch mode shows activity in real time; --config selects an alternate config file:

scripts/ralph/ralph.sh --watch           # Tool calls only
scripts/ralph/ralph.sh --watch verbose   # Include model responses
scripts/ralph/ralph.sh --config alt.env  # Use alternate config file

Uninstall

Run manually in terminal:

rm -rf scripts/ralph/

Architecture

How It Works

┌─────────────────────────────────────────────────────────────┐
│  scripts/ralph/ralph.sh                                      │
│  ┌────────────────────────────────────────────────────────┐  │
│  │  while flowctl next returns work:                      │  │
│  │    1. claude -p "/flow-code:plan" or :work             │  │
│  │    2. check review receipts                            │  │
│  │    3. if missing/invalid → retry                       │  │
│  │    4. if SHIP verdict → next task                      │  │
│  └────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────┘

flowchart TD
  A[ralph.sh loop] --> B[flowctl next]
  B -->|plan needed| C[/flow-code:plan/]
  C --> D[/flow-code:plan-review/]
  B -->|work needed| E[/flow-code:work/]
  E --> F[/flow-code:impl-review/]
  B -->|completion review needed| K[/flow-code:epic-review/]
  D --> G{Receipt valid?}
  F --> G
  K --> G
  G -- yes --> H{Verdict = SHIP?}
  H -- yes --> B
  H -- no --> I[Fix issues, retry review]
  I --> G
  G -- no --> J[Force retry iteration]
  J --> B
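
The loop in the diagrams above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual ralph.sh: `select_next` and `run_worker` are hypothetical stand-ins for `flowctl next` and a fresh `claude -p` session, and the `none` status value is assumed.

```shell
# Sketch only: select_next and run_worker are hypothetical stand-ins for
# `flowctl next` and a fresh `claude -p` session; override them to experiment.
select_next() { echo "none"; }   # prints plan, work, or none (assumed values)
run_worker() { :; }              # args: status, iteration number

ralph_loop() {
  local max="${MAX_ITERATIONS:-25}" iter=0 status
  while [ "$iter" -lt "$max" ]; do
    status="$(select_next)"
    if [ "$status" = "none" ]; then
      echo "NO_WORK after $iter iteration(s)"
      return 0
    fi
    iter=$((iter + 1))
    run_worker "$status" "$iter"   # each call stands for one clean session
  done
  echo "MAX_ITERATIONS ($max) reached"
  return 1
}
```

Because the loop lives outside the model session, nothing from a failed iteration carries over except what was written to disk.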

Why Ralph vs ralph-wiggum

Anthropic's official ralph-wiggum uses a Stop hook to keep Claude in the same session. Flow-Code inverts this for production-grade reliability.

Aspect	ralph-wiggum	Ralph
Session	Single, accumulating	Fresh per iteration
Loop	Stop hook, same session	External bash, new `claude -p`
Context	Grows until full	Clean slate every time
Failed attempts	Pollute future work	Gone with session
Re-anchoring	None	Every iteration
Quality gates	Tests only	Multi-model reviews
Stuck detection	`--max-iterations`	Auto-block after N failures
Auditability	Session transcript	Logs + receipts + evidence

The core problems with ralph-wiggum:

Context pollution — Failed attempts mislead future iterations
No re-anchoring — Claude loses sight of the spec as context fills
Single model — Claude grades its own homework
Binary outcome — Completion promise or max iterations

Ralph's solution: Fresh context + multi-model review gates + receipt-based proof-of-work.

Quality Gates

Ralph enforces quality through four mechanisms:

1. Multi-Model Reviews

A second model verifies code. Two models catch what one misses.

Backend	Platform	Context	Recommended
`rp`	macOS (GUI)	Full file context via Builder	Yes
`codex`	Cross-platform	Heuristic context from changed files	Fallback
`none`	Any	—	Not for production

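Three review types:

Plan reviews: verify architecture before coding starts
Impl reviews: verify each task's implementation meets its spec after coding
Completion reviews: verify the finished epic against its spec after all tasks are done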

Plan Review Gate

The plan review gate ensures epics are architecturally sound before any implementation begins. This catches design issues early when they're cheap to fix.

How It Works

┌─────────────────────────────────────────────────────────────┐
│  flowctl next --require-plan-review                         │
│  ┌────────────────────────────────────────────────────────┐ │
│  │  1. Find epics with plan_review_status = unknown       │ │
│  │  2. Return status=plan, epic=fn-1                      │ │
│  │  3. Ralph invokes /flow-code:plan-review fn-1          │ │
│  │  4. Skill loops until <verdict>SHIP</verdict>          │ │
│  │  5. flowctl epic set-plan-review-status fn-1 --status ship │
│  │  6. Next iteration: epic unlocked for work             │ │
│  └────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘

Configuration

Both settings are required for plan reviews:

# config.env
REQUIRE_PLAN_REVIEW=1   # Gate: don't start work until plans reviewed
PLAN_REVIEW=codex       # Backend: rp, codex, or export

`REQUIRE_PLAN_REVIEW`	`PLAN_REVIEW`	Behavior
`0`	any	Plans auto-ship, work starts immediately
`1`	`rp`	Plans reviewed via RepoPrompt
`1`	`codex`	Plans reviewed via Codex CLI
`1`	`export`	Context exported for manual review
`1`	`none`	Blocked forever — no backend to review

Common mistake: Setting REQUIRE_PLAN_REVIEW=1 without a PLAN_REVIEW backend. Ralph will block on every epic with no way to proceed.
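
A pre-flight check for this mistake could look like the sketch below. `plan_gate_behavior` is a hypothetical helper (not part of flowctl or ralph.sh) that simply mirrors the table above.

```shell
# Mirrors the REQUIRE_PLAN_REVIEW / PLAN_REVIEW table; plan_gate_behavior is a
# hypothetical helper, not part of flowctl or ralph.sh.
plan_gate_behavior() {
  local require="${1:-0}" backend="${2:-none}"
  if [ "$require" != "1" ]; then
    echo "auto-ship"                       # gate disabled, work starts at once
    return 0
  fi
  case "$backend" in
    rp|codex) echo "review via $backend" ;;
    export)   echo "export for manual review" ;;
    *)        echo "blocked: REQUIRE_PLAN_REVIEW=1 needs a PLAN_REVIEW backend" >&2
              return 1 ;;
  esac
}
```

Called with the two values from config.env, it prints the expected behavior and returns non-zero for the combination that would block forever.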

The Review Cycle

When flowctl next returns status=plan:

Checkpoint — Save epic state before review
```
flowctl checkpoint save --epic fn-1 --json
```

Review — Invoke the plan review skill

/flow-code:plan-review fn-1 --review=codex

Fix loop — If NEEDS_WORK:
- Parse reviewer feedback
- Update epic spec via flowctl epic set-plan
- Sync affected task specs via flowctl task set-spec
- Re-review (same chat for RP, receipt continuity for Codex)
- Repeat until SHIP

Receipt — Write proof-of-work

{"type":"plan_review","id":"fn-1","mode":"codex","timestamp":"..."}

Unlock — Set status to ship

flowctl epic set-plan-review-status fn-1 --status ship
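
The fix loop in this cycle can be sketched as a bounded retry. The function names are hypothetical: `run_review` stands in for the reviewer backend and `apply_fixes` for the spec update step. Stopping on any verdict other than SHIP or NEEDS_WORK is an assumption.

```shell
# Sketch only: run_review and apply_fixes are hypothetical stand-ins for the
# reviewer backend and the fix step; override them to experiment.
run_review() { echo "SHIP"; }    # prints SHIP, NEEDS_WORK, or MAJOR_RETHINK
apply_fixes() { :; }

review_until_ship() {
  local max="${MAX_REVIEW_ITERATIONS:-3}" round=1 verdict
  while [ "$round" -le "$max" ]; do
    verdict="$(run_review "$round")"
    case "$verdict" in
      SHIP) echo "SHIP after $round review(s)"; return 0 ;;
      NEEDS_WORK) apply_fixes "$round" ;;          # fix, then re-review
      *) echo "stopping on verdict: ${verdict:-none}" >&2; return 2 ;;
    esac
    round=$((round + 1))
  done
  echo "no SHIP after $max review(s)" >&2
  return 1
}
```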

Recovery

If context compacts during review cycles:

flowctl checkpoint restore --epic fn-1 --json

This restores the epic/task state from before the review started.

Inspecting Plan Review Status

# Check all epics
flowctl epics --json | jq '.epics[] | {id, plan_review_status}'

# Check specific epic
flowctl show fn-1 --json | jq '.plan_review_status'

# Find epics needing review
flowctl next --require-plan-review --json

Plan Review vs Impl Review

Aspect	Plan Review	Impl Review
When	Before coding	After coding
Reviews	Epic + task specs	Code changes
Blocks	All tasks in epic	Single task
Focus	Architecture, feasibility, scope	Correctness, security, tests
Config	`PLAN_REVIEW` + `REQUIRE_PLAN_REVIEW`	`WORK_REVIEW`

Epic-Completion Review Gate

The epic-completion review gate ensures implementation matches the spec before closing an epic. Runs after all tasks complete, checking for requirement gaps.

How It Works

┌─────────────────────────────────────────────────────────────┐
│  flowctl next --require-completion-review                    │
│  ┌────────────────────────────────────────────────────────┐ │
│  │  1. All tasks done, completion_review_status != ship   │ │
│  │  2. Return status=completion_review, epic=fn-1         │ │
│  │  3. Ralph invokes /flow-code:epic-review fn-1          │ │
│  │  4. Skill loops until <verdict>SHIP</verdict>          │ │
│  │  5. flowctl epic set-completion-review-status fn-1 --status ship │
│  │  6. Next iteration: epic can close                     │ │
│  └────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘

Configuration

# config.env
COMPLETION_REVIEW=codex       # Backend: rp, codex, or none

When COMPLETION_REVIEW != none, Ralph passes --require-completion-review to the selector. There is no separate REQUIRE_COMPLETION_REVIEW flag—the presence of a backend implies the gate is active.

`COMPLETION_REVIEW`	Behavior
`rp`	Completion reviewed via RepoPrompt
`codex`	Completion reviewed via Codex CLI
`none`	No completion review, epics close immediately
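
How the two gates might translate into selector flags is sketched below. `selector_flags` is a hypothetical helper; the flag names are the ones used in this document, and treating an unset COMPLETION_REVIEW as `none` is an assumption.

```shell
# selector_flags is a hypothetical helper; the two flags are the ones shown
# in this document, and the variables are read from the environment.
selector_flags() {
  local flags=""
  if [ "${REQUIRE_PLAN_REVIEW:-0}" = "1" ]; then
    flags="--require-plan-review"
  fi
  if [ "${COMPLETION_REVIEW:-none}" != "none" ]; then
    # a configured backend alone switches the completion gate on
    flags="${flags:+$flags }--require-completion-review"
  fi
  echo "$flags"
}
```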

The Review Cycle

When flowctl next returns status=completion_review:

Review — Invoke the epic-review skill

/flow-code:epic-review fn-1 --review=codex

Fix loop — If NEEDS_WORK:
- Parse reviewer feedback (requirement gaps, missing functionality)
- Implement missing requirements inline
- Re-review (same chat for RP, receipt continuity for Codex)
- Repeat until SHIP

Receipt — Skill writes proof-of-work to receipts/completion-fn-1.json

{"type":"completion_review","id":"fn-1","mode":"codex","verdict":"SHIP","timestamp":"..."}

Unlock — Set status to ship

flowctl epic set-completion-review-status fn-1 --status ship

Close — Epic can now close normally

What Completion Review Catches

Issue Type	Example
Decomposition gaps	Spec mentioned rate limiting, no task created
Partial implementation	Task marked done but only covers happy path
Cross-task gaps	Auth task done, logging task done, but no audit trail
Missing doc updates	Spec required README update, not done

Completion Review vs Impl Review

Aspect	Impl Review	Completion Review
When	After each task	After all tasks done
Scope	Single task acceptance	Entire epic spec
Checks	Code quality, tests	Spec compliance
Focus	"Is this task done right?"	"Did we deliver everything?"
Config	`WORK_REVIEW`	`COMPLETION_REVIEW`

2. Receipt-Based Gating

Every review produces a receipt JSON:

{
  "type": "impl_review",
  "id": "fn-1.1",
  "mode": "rp",
  "timestamp": "2026-01-09T..."
}

No receipt = no progress. Ralph retries the iteration until the receipt exists; tasks that keep failing are auto-blocked after MAX_ATTEMPTS_PER_TASK attempts.

This is at-least-once delivery. The agent is untrusted; receipts are proof-of-work.
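
A receipt check along these lines might look like the following sketch. `receipt_ok` is a hypothetical helper, and which fields the real gate inspects is an assumption; only `type` and `id` are compared here.

```shell
# receipt_ok is a hypothetical helper. It pattern-matches with grep to stay
# dependency-free; a JSON parser such as jq would be more robust.
receipt_ok() {
  local file="$1" type="$2" id="$3" id_re
  [ -s "$file" ] || return 1                       # missing or empty receipt
  id_re="$(printf '%s' "$id" | sed 's/[.]/\\./g')"  # match dots literally
  grep -Eq "\"type\"[[:space:]]*:[[:space:]]*\"$type\"" "$file" || return 1
  grep -Eq "\"id\"[[:space:]]*:[[:space:]]*\"$id_re\"" "$file" || return 1
}
```

The helper returns 0 only when the file exists and names the expected review type and ID, so a stale or mismatched receipt does not count as proof.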

3. Review Loops Until SHIP

Reviews block progress until approved:

<verdict>SHIP</verdict>

Fix → re-review → fix → re-review... until the reviewer approves.

Verdict tags:

Verdict	Meaning
`<verdict>SHIP</verdict>`	Approved, proceed
`<verdict>NEEDS_WORK</verdict>`	Fix issues, re-review
`<verdict>MAJOR_RETHINK</verdict>`	Fundamental problems

Common failures:

Plain text "SHIP" → review skill not used correctly

Interactive prompt (a/b/c) → backend misconfigured

No verdict → check iteration log
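
To see why a plain-text "SHIP" does not count, here is a minimal sketch of tag-based verdict extraction. `last_verdict` is a hypothetical helper, not the parser Ralph actually uses.

```shell
# last_verdict is a hypothetical helper: prints the value of the last verdict
# tag found in a log file, or nothing if the log has no well-formed tag.
last_verdict() {
  grep -oE '<verdict>(SHIP|NEEDS_WORK|MAJOR_RETHINK)</verdict>' "$1" 2>/dev/null \
    | tail -n 1 \
    | sed -E 's#</?verdict>##g'
}
```

Only the last tag is reported, so a log that contains NEEDS_WORK followed by SHIP resolves to SHIP.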

4. Memory Capture (Opt-in)

When enabled, NEEDS_WORK reviews auto-capture learnings:

flowctl config set memory.enabled true

Builds .flow/memory/pitfalls.md — things reviewers catch that models miss.

Note: Memory config is in .flow/config.json, separate from Ralph's config.env.

Configuration Reference

Edit scripts/ralph/config.env:

Reviews

Variable	Values	Default	Description
`PLAN_REVIEW`	`rp`, `codex`, `none`	—	Plan review backend
`WORK_REVIEW`	`rp`, `codex`, `none`	—	Impl review backend
`COMPLETION_REVIEW`	`rp`, `codex`, `none`	—	Completion review backend
`REQUIRE_PLAN_REVIEW`	`0`, `1`	`0`	Block work until plan approved

Branches

Variable	Values	Default	Description
`BRANCH_MODE`	`new`, `current`, `worktree`	`new`	Branch strategy

new — One branch for entire run (ralph-<run-id>)
current — Work on current branch
worktree — Git worktrees (advanced)

Limits

Variable	Default	Description
`MAX_ITERATIONS`	`25`	Total loop iterations
`MAX_TURNS`	∞	Claude turns per iteration
`MAX_ATTEMPTS_PER_TASK`	`5`	Retries before auto-blocking
`MAX_REVIEW_ITERATIONS`	`3`	Fix+re-review cycles per review
`WORKER_TIMEOUT`	`3600`	Seconds before killing stuck worker

Scope

Variable	Example	Description
`EPICS`	`fn-1,fn-2`	Limit to specific epics (empty = all)

Permissions

Variable	Default	Description
`YOLO`	`1`	Skip permission prompts

Note: YOLO=1 is required for unattended runs. Set YOLO=0 for interactive testing.

Display

Variable	Default	Description
`RALPH_UI`	`1`	Colored/emoji output

Codex-Specific

Variable	Default	Description
`CODEX_SANDBOX`	`auto`	`read-only`, `workspace-write`, `danger-full-access`, `auto`
`FLOW_CODEX_EMBED_MAX_BYTES`	`500000`	Max bytes embedded in prompts

Windows: Use auto or danger-full-access. The read-only mode blocks all shell commands.

Review Backends

RepoPrompt Integration

When using PLAN_REVIEW=rp or WORK_REVIEW=rp:

flowctl rp pick-window --repo-root .  # Find window
flowctl rp builder ...                 # Build context
flowctl rp chat-send ...               # Send to reviewer

Never call rp-cli directly in Ralph mode. Use flowctl wrappers.

Window selection is automatic. With RP 1.5.68+, --create auto-opens windows.

Codex Integration

When using PLAN_REVIEW=codex or WORK_REVIEW=codex:

flowctl codex check                    # Verify available
flowctl codex impl-review ...          # Run impl review
flowctl codex plan-review <id> --files "src/auth.ts,src/config.ts"

Requirements:

npm install -g @openai/codex && codex auth

Advantages:

Cross-platform (Windows, Linux, macOS)
Terminal-based (no GUI)
Session continuity via thread_id

Run Artifacts

Each run creates:

scripts/ralph/runs/<run-id>/
├── iter-001.log           # Raw Claude output
├── iter-002.log
├── progress.txt           # Append-only run log
├── attempts.json          # Per-task retry counts
├── branches.json          # Branch info
├── receipts/
│   ├── plan-fn-1.json        # Plan review receipt
│   ├── impl-fn-1.1.json      # Impl review receipt
│   └── completion-fn-1.json  # Completion review receipt
└── block-fn-1.2.md        # Written when task auto-blocked

Controlling Ralph

CLI Commands

flowctl status                    # Epic/task counts + active runs
flowctl ralph pause               # Pause run
flowctl ralph resume              # Resume run
flowctl ralph stop                # Graceful stop
flowctl ralph status              # Show run state
flowctl ralph pause --run <id>    # Specify run when multiple active

Sentinel Files

# Pause
touch scripts/ralph/runs/<run-id>/PAUSE

# Resume
rm scripts/ralph/runs/<run-id>/PAUSE

# Stop (kept for audit)
touch scripts/ralph/runs/<run-id>/STOP

Ralph checks sentinels at iteration boundaries.
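
A sentinel check at an iteration boundary could be as small as the sketch below. `sentinel_state` is a hypothetical helper, and giving STOP precedence over PAUSE is an assumption.

```shell
# sentinel_state is a hypothetical helper. Checking STOP before PAUSE is an
# assumption about precedence, not documented behavior.
sentinel_state() {
  local run_dir="$1"
  if [ -e "$run_dir/STOP" ]; then
    echo "stop"
  elif [ -e "$run_dir/PAUSE" ]; then
    echo "pause"
  else
    echo "run"
  fi
}
```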

Task Retry/Rollback

flowctl unblock fn-1.2                    # Re-enable blocked task
flowctl update fn-1.2 --status pending    # Reset to pending

Testing & Debugging

Single Iteration

scripts/ralph/ralph_once.sh

Runs one iteration then exits. Verify setup before full runs.

Watch Mode

scripts/ralph/ralph.sh --watch           # Tool calls
scripts/ralph/ralph.sh --watch verbose   # Include responses

Real-time visibility without blocking autonomy.

Custom Config

scripts/ralph/ralph.sh --config my-codex-config.env
scripts/ralph/ralph.sh --watch --config rp-reviews.env

Use alternate config files for different platforms or review backends without editing config.env. Useful for:

Separate configs for RepoPrompt vs Codex reviews
Platform-specific settings (macOS vs Linux vs Windows)
Testing different MAX_ITERATIONS or WORKER_TIMEOUT values
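
Assuming config files are plain KEY=value shell syntax (as the snippets in this document suggest), loading one with fallbacks might look like this sketch. `load_config` is a hypothetical helper; the defaults are taken from the Configuration Reference tables.

```shell
# load_config is a hypothetical helper: source an env file, then fill in the
# defaults listed in the Configuration Reference for anything left unset.
load_config() {
  local file="$1"
  [ -r "$file" ] || { echo "config not readable: $file" >&2; return 1; }
  case "$file" in
    */*) ;;                 # already has a directory part
    *) file="./$file" ;;    # keep `.` from searching PATH for bare names
  esac
  . "$file"
  : "${MAX_ITERATIONS:=25}" "${MAX_ATTEMPTS_PER_TASK:=5}"
  : "${MAX_REVIEW_ITERATIONS:=3}" "${WORKER_TIMEOUT:=3600}"
  : "${REQUIRE_PLAN_REVIEW:=0}" "${BRANCH_MODE:=new}"
}
```

Values set in the file win; anything the file leaves unset falls back to the documented default.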

Verbose Logging

FLOW_RALPH_VERBOSE=1 scripts/ralph/ralph.sh

Detailed logs are written to scripts/ralph/runs/<run-id>/ralph.log

Debug Environment Variables

Ralph inherits Claude Code's default model (Opus) for both the main session and worker subagents (model: inherit). Only set FLOW_RALPH_CLAUDE_MODEL if you want to override.

FLOW_RALPH_CLAUDE_MODEL=claude-opus-4-6  # only needed to override default
FLOW_RALPH_CLAUDE_DEBUG=hooks
FLOW_RALPH_CLAUDE_PERMISSION_MODE=bypassPermissions

Safety & Isolation

Docker Sandbox

Run Ralph inside Docker for isolation:

docker sandbox run claude "scripts/ralph/ralph.sh"
docker sandbox run -w ~/my-project claude "scripts/ralph/ralph.sh"

See Docker sandbox docs.

Community sandbox setups:

devcontainer-for-claude-yolo-and-flow-code — VS Code devcontainer with Playwright, firewall whitelisting, RepoPrompt MCP bridge
agent-sandbox — Docker Sandbox (Desktop 4.50+) with seccomp/namespace isolation

DCG (Destructive Command Guard)

DCG blocks destructive commands before execution.

What it blocks:

Command	Without DCG	With DCG
`git reset --hard`	Loses work	Blocked
`rm -rf ./src`	Deletes source	Blocked
`git push --force`	Overwrites history	Blocked
`git clean -f`	Deletes files	Blocked

Install:

curl -fsSL "https://raw.githubusercontent.com/Dicklesworthstone/destructive_command_guard/master/install.sh?$(date +%s)" | bash -s -- --easy-mode

Compatibility: DCG is fail-open by design, so a check that times out lets the command through. Flow-Code uses safe git patterns and quoted heredocs that DCG handles correctly.

Note: DCG will block rm -rf .flow/ and rm -rf scripts/ralph/ — this is correct behavior. Uninstall commands should be run manually, not via AI agents. Your epics and tasks are protected.

Verify:

dcg test 'git reset --hard HEAD'    # Should block
dcg test 'git checkout -b feature'  # Should allow

Uninstall:

rm ~/.local/bin/dcg
# Edit ~/.claude/settings.json to remove dcg hook
rm -rf ~/.config/dcg/

More info: DCG GitHub · Pack Reference

Guard Hooks

Plugin hooks enforce workflow rules deterministically.

Only active when FLOW_RALPH=1 — zero overhead for non-Ralph users.

Rule	Purpose
No `--json` on chat-send	Preserve review text output
No `--new-chat` on re-reviews	Keep conversation context
Receipt before Stop	Prevent skipping reviews
Required flags on setup	Ensure proper targeting
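
To illustrate what a deterministic rule means here, the sketch below implements only the first rule from the table. `guard_check` is hypothetical; the real logic lives in the ralph-guard hook, and both the string matching and the exit code 2 are assumptions.

```shell
# guard_check is a hypothetical illustration of one rule from the table.
# The string matching and the exit code 2 are assumptions.
guard_check() {
  local cmd="$1"
  [ "${FLOW_RALPH:-0}" = "1" ] || return 0     # guards are off outside Ralph
  case "$cmd" in
    *chat-send*--json*)
      echo "blocked: --json on chat-send hides the review text" >&2
      return 2 ;;
  esac
  return 0
}
```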

Location:

plugins/flow-code/
  hooks/hooks.json              # Config
  bin/flowctl hook ralph-guard  # Logic

Disable temporarily: Unset FLOW_RALPH

Disable permanently: Delete hooks/ directory

Troubleshooting

Plan Review Never Starts

Symptoms: Ralph exits with NO_WORK but epics have plan_review_status: unknown.

Check config:

grep -E "REQUIRE_PLAN_REVIEW|PLAN_REVIEW" scripts/ralph/config.env

Common causes:

Config	Problem	Fix
`REQUIRE_PLAN_REVIEW=0`	Plan gate disabled	Set to `1`
`PLAN_REVIEW=none` + `REQUIRE_PLAN_REVIEW=1`	No backend to review	Set `PLAN_REVIEW=codex` or `rp`
`PLAN_REVIEW` unset	Defaults to template placeholder	Set explicitly

Verify selector sees plan work:

flowctl next --require-plan-review --json

Should return status: "plan" if epics need review.

Plan Review Blocked Forever

Symptoms: Ralph loops on plan review, never progresses to work.

Check:

# What's the epic status?
flowctl show fn-1 --json | jq '.plan_review_status'

# Is there a receipt?
ls scripts/ralph/runs/*/receipts/plan-fn-1.json

# What verdict did we get?
grep -i verdict scripts/ralph/runs/*/iter-*.log | grep plan

Common causes:

PLAN_REVIEW=none with REQUIRE_PLAN_REVIEW=1 → blocked forever
Review returns NEEDS_WORK repeatedly → plan has fundamental issues
No verdict tag in response → backend misconfigured

Fix: Either set a review backend or disable the gate:

# Option A: Enable codex reviews
PLAN_REVIEW=codex

# Option B: Disable gate (plans auto-ship)
REQUIRE_PLAN_REVIEW=0

Dependent Epics Not Starting

Symptoms: Epic A completes, but Epic B (depends on A) never starts.

Check:

# Is A actually closed?
flowctl show fn-1 --json | jq '.status'

# Does B depend on A?
flowctl show fn-2 --json | jq '.depends_on_epics'

Common cause: Race condition — selector runs before maybe_close_epics(). Fixed in v0.18.23+.

Workaround for older versions:

# Manually close the epic
flowctl epic close fn-1 --json

# Re-run Ralph
scripts/ralph/ralph.sh

Review Gate Loops

Symptoms: Ralph keeps retrying the same task.

Check receipts:

ls scripts/ralph/runs/*/receipts/

Check verdict:

grep -i verdict scripts/ralph/runs/*/iter-*.log | tail -5

Common causes:

No receipt file → review skill not invoked
Wrong verdict format → plain text instead of XML tags
Receipt exists but verdict is NEEDS_WORK → implementation has issues

Auto-Blocked Tasks

After MAX_ATTEMPTS_PER_TASK failures:

Ralph writes block-<task>.md with context
Marks task blocked via flowctl block
Moves to next task

To retry:

flowctl unblock fn-1.2
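
The auto-block rule can be sketched as a per-task counter. `bump_attempt` is a hypothetical helper that keeps one counter file per task for simplicity; the real run records counts in attempts.json.

```shell
# bump_attempt is a hypothetical helper. It keeps one counter file per task in
# a directory; the real run stores counts in attempts.json.
bump_attempt() {
  local dir="$1" task="$2" max="${MAX_ATTEMPTS_PER_TASK:-5}" n=0
  if [ -f "$dir/$task" ]; then
    n="$(cat "$dir/$task")"
  fi
  n=$((n + 1))
  echo "$n" > "$dir/$task"
  if [ "$n" -ge "$max" ]; then
    echo "block"      # caller would write block-<task>.md and mark it blocked
  else
    echo "retry"
  fi
}
```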

RepoPrompt Issues

"rp-cli not found":

# Install RepoPrompt, then:
which rp-cli

Window not found:

RP 1.5.68+: Use --create flag
Older: Open RepoPrompt on your repo manually

Alternative: Switch to Codex backend.

Codex Issues

"codex not found":

npm install -g @openai/codex
codex auth

Windows "blocked by policy":

# In config.env:
CODEX_SANDBOX=auto

The read-only sandbox blocks all commands on Windows.

Run Inspection

# Progress
cat scripts/ralph/runs/*/progress.txt

# Latest iteration
tail -100 "$(ls -t scripts/ralph/runs/*/iter-*.log | head -1)"

# Blocked tasks
ls scripts/ralph/runs/*/block-*.md

Morning Review Workflow

After overnight runs, review and merge the work.

1. Check Completion

# Run status
cat scripts/ralph/runs/*/progress.txt | tail -5

# Blocked tasks
ls scripts/ralph/runs/*/block-*.md 2>/dev/null

# Pending tasks
flowctl ready --json

Partial run? Review block-*.md, fix issues, re-run ralph.sh (resumes from pending).

2. Review Changes

# Summary
cat scripts/ralph/runs/*/progress.txt

# Review receipts (check that every expected review has one)
ls scripts/ralph/runs/*/receipts/

# Commits
git log --oneline

3. Review by Epic

Commits include task IDs (feat(fn-1.1): ...):

git log --oneline --grep="fn-1"
git log --oneline --grep="fn-2"
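
Note that `--grep="fn-1"` is a substring match, so it also returns commits for fn-10, fn-11, and so on. When a run covers many epics, grouping by the parsed prefix is more precise. The sketch below assumes subjects shaped like `feat(fn-1.1): ...`; `epic_of_subject` is a hypothetical filter.

```shell
# epic_of_subject is a hypothetical filter: reads commit subjects on stdin and
# prints the epic ID from prefixes shaped like "feat(fn-1.1): ...".
# Subjects without such a prefix are skipped.
epic_of_subject() {
  sed -nE 's/^[a-z]+\((fn-[0-9]+)(\.[0-9]+)?\):.*/\1/p'
}
```

For a per-epic commit count, pipe subjects through it: `git log --format=%s | epic_of_subject | sort | uniq -c`.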

4. Merge

All good:

git checkout main
git merge ralph-<run-id>
# Or: gh pr create

One epic is bad — cherry-pick good ones:

git checkout main
git cherry-pick <fn-1-commits>
git cherry-pick <fn-2-commits>
# Skip fn-3

One epic is bad — revert and merge:

git checkout ralph-<run-id>
git revert <fn-3-commits>
git checkout main
git merge ralph-<run-id>

5. Find Commit SHAs

git log --oneline --grep="fn-1"
flowctl show fn-1.1 --json | jq '.evidence.commits'

References

flowctl CLI
Flow-Code README
flow-code-tui
Test scripts: plugins/flow-code/scripts/ralph_e2e_*.sh
