A Golden Ralph Loop orchestrator that runs fresh CLI-agent sessions (Codex, Claude Code, Copilot) in a deterministic loop, using the repo filesystem as durable memory.
Doc requirements:
- Audience: users and contributors (intermediate CLI + git experience)
- Scope: installing, configuring, and operating the loop; contributor workflow; support/security paths
- Owner: jscraik
- Review cadence: quarterly or when CLI behavior changes
- Last updated: 2026-03-05
- TL;DR
- Quickstart
- Prerequisites
- Install (with uv)
- Initialize a repo
- Run the loop
- Machine-friendly output and global output controls
- Authorization coverage and receipts
- Troubleshooting
- Documentation
The problem: agent runs drift without durable state, reproducible gates, or exit rules.
The solution: a loop that selects one task per iteration, runs gates, logs state to disk, and only exits when the tracker is complete and the agent says it is done.
Why use ralph-gold:
- File-based memory under .ralph/ keeps state and logs deterministic.
- Runner-agnostic invocation with stdin-first prompt handling.
- Optional TUI and VS Code bridge for operator visibility.
- Receipts + context snapshots per iteration for auditability and review.
- Optional review gate to require a final SHIP decision before exit.
uv tool install -e .
ralph quickstart
Then run your first guided iteration:
ralph step --agent codex
Prerequisites:
- git
- uv
- At least one agent CLI (codex, claude, or copilot)
- Optional: prek (universal gate runner)
- Optional: rp-cli (RepoPrompt context packs and review backend)
uv tool install -e .
uv tool update-shell # optional: add uv tool bin dir to PATH
Verify:
ralph --help
From your project root:
ralph init
This creates the recommended default layout:
- .ralph/ralph.toml (config)
- .ralph/PRD.md (task tracker, Markdown)
- .ralph/AGENTS.md (build/test/run commands for your repo)
- .ralph/progress.md (append-only progress log)
- .ralph/specs/ (requirement specs)
- .ralph/PROMPT_build.md / .ralph/PROMPT_plan.md / .ralph/PROMPT_judge.md / .ralph/PROMPT_review.md
- .ralph/FEEDBACK.md (operator feedback + review notes)
- .ralph/logs/ (per-iteration logs)
- .ralph/receipts/ (command receipts per iteration)
- .ralph/context/ (anchor/context snapshots per iteration)
- .ralph/attempts/ (attempt records per task)
- .ralph/state.json (session state)
You can switch trackers by changing files.prd in .ralph/ralph.toml.
Project layout and scaffolding details live in docs/PROJECT_STRUCTURE.md.
Re-initializing with --force: If you need to reset your .ralph directory, use ralph init --force. This automatically archives existing files to .ralph/archive/<timestamp>/ before creating fresh templates, preventing accidental data loss. See docs/INIT_ARCHIVING.md for details.
Run N iterations:
ralph run --agent codex --max-iterations 10
# Stream runner output during the loop
# (sequential runs only; ignored with --parallel)
ralph run --agent codex --stream
Exit codes:
- 0: loop completed successfully (EXIT_SIGNAL true, gates/judge/review ok)
- 1: loop ended without a successful exit (e.g., max iterations / no-progress)
- 2: one or more iterations failed (non-zero return code, gate failure, judge failure, or review BLOCK)
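A wrapper script can branch on these exit codes. The following is a minimal sketch, not part of ralph-gold: the helper names `describe_exit_code` and `run_loop` and the message wording are hypothetical, and only the codes 0, 1, and 2 and their meanings come from the list above.

```python
import subprocess

# Meanings paraphrased from the exit-code list above; the wording is illustrative.
EXIT_MEANINGS = {
    0: "loop completed successfully",
    1: "loop ended without a successful exit",
    2: "one or more iterations failed",
}


def describe_exit_code(code: int) -> str:
    """Map a `ralph run` exit code to a short human-readable summary."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_loop(agent: str = "codex", max_iterations: int = 10) -> int:
    """Invoke `ralph run` and report how it ended (requires ralph on PATH)."""
    result = subprocess.run(
        ["ralph", "run", "--agent", agent, "--max-iterations", str(max_iterations)]
    )
    print(describe_exit_code(result.returncode))
    return result.returncode
```

In CI, a caller might treat 1 as "retry later" and 2 as "needs attention", but that policy is a choice for the operator, not something the tool prescribes.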
Run a single iteration:
ralph step --agent claude
Run a single iteration with interactive task selection:
ralph step --interactive
Run a specific task directly:
ralph step --task-id 42
In interactive mode, you'll see a list of available tasks and can:
- Select a task by number
- Search/filter tasks with s <keyword>
- View task details with d <number>
- Clear search filter with c
- Quit without selecting with q
Show status:
ralph status
Explain task selection and blocker context:
ralph explain
Logs are written under .ralph/logs/.
Receipts and context snapshots (anchor + optional RepoPrompt pack) live under:
- .ralph/receipts/
- .ralph/context/
Use global output controls for both humans and agent integrations:
ralph --format json status
ralph --verbosity verbose diagnose
Resolution precedence for output settings:
- Global CLI flags (--format, --verbosity)
- Environment variables (RALPH_FORMAT, RALPH_VERBOSITY)
- .ralph/ralph.toml [output] defaults
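The precedence list can be read as a first-match lookup. The sketch below assumes the list is ordered from highest to lowest priority. The function name `resolve_setting`, the `[output]` key names, and the `"text"` fallback are assumptions for illustration, not the tool's actual implementation.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    cli_value: Optional[str],
    env_name: str,
    config_output: Mapping[str, str],
    config_key: str,
    fallback: str,
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the first value that is set: CLI flag, then env var, then config."""
    env = os.environ if environ is None else environ
    if cli_value is not None:
        return cli_value
    if env.get(env_name):
        return env[env_name]
    # config_output stands in for the parsed [output] table (key names assumed).
    return config_output.get(config_key, fallback)


# Hypothetical usage: no CLI flag given, so the environment variable wins.
fmt = resolve_setting(
    None, "RALPH_FORMAT", {"format": "text"}, "format", "text",
    environ={"RALPH_FORMAT": "json"},
)
print(fmt)
```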
When JSON mode is enabled, command envelopes include:
- schema_version (ralph.cli.v1)
- cmd
- exit_code
- timestamp
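An integration consuming JSON output might validate the envelope before using it. This is a hedged sketch: the sample payload, the `cmd` value, and the timestamp format are hypothetical. Only the four key names and the `ralph.cli.v1` schema version come from the list above.

```python
import json

REQUIRED_KEYS = ("schema_version", "cmd", "exit_code", "timestamp")


def parse_envelope(raw: str) -> dict:
    """Parse a JSON envelope and check that the documented keys are present."""
    envelope = json.loads(raw)
    missing = [key for key in REQUIRED_KEYS if key not in envelope]
    if missing:
        raise ValueError(f"envelope is missing keys: {missing}")
    if envelope["schema_version"] != "ralph.cli.v1":
        raise ValueError(f"unsupported schema_version: {envelope['schema_version']}")
    return envelope


# Hypothetical sample; real envelopes may carry additional fields.
sample = (
    '{"schema_version": "ralph.cli.v1", "cmd": "status", '
    '"exit_code": 0, "timestamp": "2026-03-05T12:00:00Z"}'
)
envelope = parse_envelope(sample)
print(envelope["cmd"], envelope["exit_code"])
```

Checking `schema_version` first lets a consumer fail loudly if the envelope format changes, rather than misreading fields.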
Authorization can run in warn or block mode via [authorization] and .ralph/permissions.json.
As of 2026-03-05, authorization checks cover:
- prep artifact writes (for example, ANCHOR.md)
- post-run write effects (tracked + untracked changed files)
Authorization receipts are saved per iteration under .ralph/receipts/...:
- authorization_prewrite_anchor.json
- authorization_post_write_effects.json
In block mode, denied write effects fail the iteration and are recorded with path-level reasons.
For unattended runs (with a periodic heartbeat and best-effort OS notifications), use:
ralph supervise --agent codex
For long sessions, we still recommend running under tmux so the process survives terminal disconnects.
ralph tui
Keys:
- s: step once
- r: run loop.max_iterations iterations
- a: cycle agent
- p: pause/resume (between iterations)
- q: quit
Watch mode automatically runs gates when files change, providing instant feedback during development:
ralph watch
Features:
- Automatically runs gates when configured file patterns change
- Debounces rapid changes (500ms default) to avoid excessive runs
- Shows real-time gate results
- Optional auto-commit when gates pass
- Graceful shutdown with Ctrl+C
Configuration:
Enable watch mode in .ralph/ralph.toml:
[watch]
enabled = true
patterns = ["**/*.py", "**/*.md"] # File patterns to watch
debounce_ms = 500 # Debounce delay in milliseconds
auto_commit = false # Auto-commit when gates pass
Command options:
# Run watch mode (gates only - default)
ralph watch
# Auto-commit changes when gates pass
ralph watch --auto-commit
How it works:
- Watch mode monitors files matching the configured patterns
- When a file changes, it waits for the debounce period
- After the debounce period, gates are executed
- Results are displayed in real-time
- If --auto-commit is enabled and gates pass, changes are automatically committed
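The debounce step can be sketched as a small state machine. This is an illustration of the general technique, not ralph-gold's code. The `Debouncer` class is hypothetical, and whether each new change restarts the wait (as it does here) is an assumption.

```python
from typing import Optional


class Debouncer:
    """Collapse bursts of change events into one trigger after a quiet period."""

    def __init__(self, delay_ms: int = 500) -> None:
        self.delay_s = delay_ms / 1000.0
        self._last_change: Optional[float] = None

    def record_change(self, now: float) -> None:
        """Note a file change; each new change restarts the quiet period."""
        self._last_change = now

    def should_run(self, now: float) -> bool:
        """Return True once, when the quiet period has elapsed since the last change."""
        if self._last_change is None:
            return False
        if now - self._last_change >= self.delay_s:
            self._last_change = None  # reset so gates run once per burst
            return True
        return False


debouncer = Debouncer(delay_ms=500)
debouncer.record_change(now=0.0)
debouncer.record_change(now=0.3)  # a second edit restarts the wait
early = debouncer.should_run(now=0.6)  # only 0.3s since the last change
late = debouncer.should_run(now=0.9)   # 0.6s since the last change
print(early, late)
```

Passing the clock in as `now` keeps the sketch deterministic. A real watcher would typically read `time.monotonic()` instead.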
Use cases:
- Get instant feedback while developing
- Ensure code quality before committing
- Automate repetitive gate runs during active development
- Catch issues early in the development cycle
Example workflow:
# Enable watch mode in config
# Edit .ralph/ralph.toml and set watch.enabled = true
# Start watch mode
ralph watch
# In another terminal, make changes to your code
# Watch mode automatically runs gates and shows results
# When satisfied, stop watch mode with Ctrl+C
Notes:
- Watch mode requires watch.enabled = true in configuration
- Uses OS-native file watching when available (inotify on Linux, FSEvents on macOS)
- Falls back to polling if native watching is unavailable
- Respects .gitignore patterns and ignores common directories (.ralph/, .git/, __pycache__/, etc.)
- JSON output format is not supported for watch mode (it's interactive)
ralph-gold includes a project wrapper at scripts/harness that runs the globally installed @brainwav/coding-harness from your npm/pnpm global root.
Install or update to latest:
npm i -g @brainwav/coding-harness@latest
# or
pnpm add -g @brainwav/coding-harness@latest
Verify:
scripts/harness --version
Use harness commands to turn .ralph history/receipts into a regression-friendly quality dataset and report.
# Build dataset from recent history
ralph harness collect --days 30 --limit 200
# Evaluate dataset and write a run report
ralph harness run --dataset .ralph/harness/cases.json
# Or run live targeted execution for each dataset case
ralph harness run --dataset .ralph/harness/cases.json --execution-mode live --strict-targeting
# Pin failing cases so they remain covered in future runs
ralph harness pin --run .ralph/harness/runs/<run>.json
# CI-friendly collect + evaluate
ralph harness ci --enforce-regression-threshold
# View latest run (text/json/csv)
ralph harness report --format text
Config (optional):
[harness]
enabled = false
dataset_path = ".ralph/harness/cases.json"
runs_dir = ".ralph/harness/runs"
pinned_dataset_path = ".ralph/harness/pinned.json"
baseline_run_path = ".ralph/harness/runs/baseline.json"
append_pinned_by_default = true
max_cases_per_task = 2
regression_threshold = 0.05
[harness.buckets]
small_max_seconds = 120
medium_max_seconds = 600
[harness.ci]
execution_mode = "historical"
enforce_regression_threshold = true
require_baseline = true
baseline_missing_policy = "fail"
Enable in .ralph/ralph.toml:
[git]
branch_strategy = "per_prd" # per_prd|none
branch_prefix = "ralph/"
auto_commit = true
amend_if_needed = true
Add PRD metadata:
Put near the top:
Branch: ralph/my-feature
If no branch is specified, a fallback branch is generated from the repo name using branch_prefix.
Enable a cross-model review that must end with SHIP:
[gates.review]
enabled = true
backend = "runner" # runner|repoprompt
agent = "claude"
required_token = "SHIP"
When enabled, ralph run will not exit until the review returns SHIP.
codex exec --full-auto expects the prompt via stdin (or an explicit prompt argument).
Default runner config in .ralph/ralph.toml uses:
[runners.codex]
argv = ["codex", "exec", "--full-auto", "-"]
The - means "read prompt from stdin".
Built-ins:
- Markdown tracker: checkbox tasks in PRD.md
- YAML tracker: tasks.yaml with parallel execution support
Select via:
[tracker]
kind = "auto" # auto|markdown|json|yaml|beads
beads is supported as an optional tracker (requires the bd CLI to be installed).
Blocked tasks:
- Markdown tracker: [-] marks a blocked task.
- YAML tracker: set blocked: true on a task.
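To make the Markdown markers concrete, here is a minimal parsing sketch. It is not the tracker's real parser. The `[ ]` and `[-]` markers come from this document, while treating `[x]` as complete follows the common Markdown checkbox convention and is an assumption, as are the function name and status labels.

```python
import re
from typing import List

# "[ ]" and "[-]" appear in this document; "[x]" as complete is an assumption.
STATUS_BY_MARK = {" ": "open", "x": "complete", "-": "blocked"}
_TASK_LINE = re.compile(r"^- \[([ xX-])\] (.+)$")


def parse_tasks(markdown: str) -> List[dict]:
    """Collect top-level checkbox lines and classify each by its marker."""
    tasks = []
    for line in markdown.splitlines():
        match = _TASK_LINE.match(line.rstrip())
        if match:
            mark, text = match.group(1).lower(), match.group(2)
            tasks.append({"text": text, "status": STATUS_BY_MARK[mark]})
    return tasks


prd = """## Tasks
- [ ] 1. Setup database schema
  - Create users table
- [-] 2. Implement user authentication
- [x] 3. Create post API
"""
for task in parse_tasks(prd):
    print(task["status"], task["text"])
```

Indented sub-bullets such as acceptance criteria are skipped because the pattern only matches lines that start at column zero.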
The YAML tracker provides structured task tracking with native parallel execution grouping:
version: 1
metadata:
  project: my-app
  branch: ralph/my-feature
tasks:
  - id: 1
    title: Implement authentication API
    group: backend
    completed: false
    acceptance:
      - User can login with email/password
      - JWT token returned on success
  - id: 2
    title: Create login UI component
    group: frontend
    completed: false
    acceptance:
      - Login form with email and password fields
      - Error messages displayed on failure
Initialize with YAML:
ralph init --format yaml
Convert existing PRD to YAML:
# From Markdown
ralph convert .ralph/PRD.md tasks.yaml
Benefits:
- Structured schema with validation
- Parallel execution groups (tasks in different groups can run concurrently)
- Comments support for documentation
- Machine-editable format
See docs/YAML_TRACKER.md for complete documentation.
Ralph supports task dependencies to enforce execution order. Tasks with unmet dependencies are automatically skipped until their dependencies are completed.
Markdown tracker (PRD.md):
Add a Depends on: line in the task's acceptance criteria with task numbers:
## Tasks
- [ ] 1. Setup database schema
  - Create users table
  - Create posts table
- [ ] 2. Implement user authentication
  - Depends on: 1
  - User can register
  - User can login
- [ ] 3. Create post API
  - Depends on: 1, 2
  - User can create posts
  - User can view their posts
YAML tracker (tasks.yaml):
Add a depends_on list with task IDs:
tasks:
  - id: 1
    title: Setup database schema
    completed: false
  - id: 2
    title: Implement user authentication
    depends_on: [1]
    completed: false
  - id: 3
    title: Create post API
    depends_on: [1, 2]
    completed: false
View the dependency graph:
ralph status --graph
Example output:
============================================================
Task Dependency Graph
============================================================
Level 0:
  ○ 1
Level 1:
  ○ 2
      depends on: 1
Level 2:
  ○ 3
      depends on: 1, 2
============================================================
Total tasks: 3
Total dependencies: 3
============================================================
Ralph automatically detects circular dependencies during diagnostics:
ralph diagnose
If circular dependencies are found, you'll see:
ERRORS:
  ✗ Found 1 circular dependency cycle(s)
    → Remove circular dependencies to allow tasks to execute
  ✗ Circular dependencies detected:
    → Cycle 1: task-2 → task-3 → task-2
    → Break the cycle by removing one or more 'depends_on' relationships
- Tasks are only selected when all their dependencies are marked complete
- The loop automatically skips tasks with unmet dependencies
- Dependencies are checked on every iteration
- Circular dependencies prevent the loop from making progress and must be fixed
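The selection and cycle rules above can be sketched in a few lines. This is an illustrative model rather than Ralph's implementation. The function names are hypothetical, tasks are plain dicts shaped like the YAML example, and picking the first ready task in file order is an assumption.

```python
from typing import Dict, List, Optional, Set


def next_ready_task(tasks: List[dict]) -> Optional[dict]:
    """Return the first incomplete task whose dependencies are all complete."""
    done: Set[int] = {t["id"] for t in tasks if t.get("completed")}
    for task in tasks:
        if task.get("completed"):
            continue
        if all(dep in done for dep in task.get("depends_on", [])):
            return task
    return None


def find_cycle(tasks: List[dict]) -> Optional[List[int]]:
    """Depth-first search for a dependency cycle; returns the cycle path or None."""
    graph: Dict[int, List[int]] = {t["id"]: list(t.get("depends_on", [])) for t in tasks}
    visiting: List[int] = []   # current DFS path
    finished: Set[int] = set()  # nodes fully explored without finding a cycle

    def visit(node: int) -> Optional[List[int]]:
        if node in finished:
            return None
        if node in visiting:
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for dep in graph.get(node, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        visiting.pop()
        finished.add(node)
        return None

    for task_id in graph:
        cycle = visit(task_id)
        if cycle:
            return cycle
    return None


tasks = [
    {"id": 1, "completed": True},
    {"id": 2, "depends_on": [1], "completed": False},
    {"id": 3, "depends_on": [1, 2], "completed": False},
]
ready = next_ready_task(tasks)
print(ready["id"] if ready else None)
```

When every remaining task sits on a cycle, `next_ready_task` returns `None`, which is why a cycle stalls the loop until a `depends_on` entry is removed.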
Ralph provides reusable task templates to quickly create common task types with pre-defined acceptance criteria. Templates help maintain consistency across your PRD and save time when adding similar tasks.
Ralph includes three built-in templates:
- bug-fix: For bug fixes (high priority)
- feature: For new features (medium priority)
- refactor: For refactoring tasks (low priority)
View all available templates:
ralph task templates
Example output:
Available Task Templates:
============================================================
bug-fix [built-in]
Description: Template for bug fixes
Title format: Fix: {title}
Priority: high
Variables: title
Acceptance criteria: 4 items
feature [built-in]
Description: Template for new features
Title format: Feature: {title}
Priority: medium
Variables: title
Acceptance criteria: 4 items
refactor [built-in]
Description: Template for refactoring tasks
Title format: Refactor: {title}
Priority: low
Variables: title
Acceptance criteria: 4 items
============================================================
Total: 3 template(s)
Add a new task using a template:
ralph task add --template bug-fix --title "Login fails on Safari"
This creates a new task with:
- Title: "Fix: Login fails on Safari"
- Priority: high
- Pre-defined acceptance criteria for bug fixes
The task is automatically added to your configured PRD file (Markdown, JSON, or YAML).
Create custom templates for your project by adding JSON files to .ralph/templates/:
mkdir -p .ralph/templates
Create a template file (e.g., .ralph/templates/api-endpoint.json):
{
"name": "api-endpoint",
"description": "Template for new API endpoints",
"title_template": "API: {title}",
"acceptance_criteria": [
"Endpoint is implemented with proper HTTP methods",
"Request/response validation is in place",
"Unit tests cover happy path and error cases",
"API documentation is updated",
"Integration tests pass"
],
"priority": "medium",
"variables": ["title"],
"metadata": {
"author": "your-team",
"version": "1.0"
}
}
Use your custom template:
ralph task add --template api-endpoint --title "Create user profile endpoint"
Templates support variable substitution using {variable} syntax. The title variable is always available. You can add additional variables:
ralph task add --template custom --title "Fix bug" --var component=auth --var severity=critical
Custom templates must include:
- name: Unique template identifier
- description: Human-readable description
- title_template: Template string with {variable} placeholders
- acceptance_criteria: Array of acceptance criteria strings
Optional fields:
- priority: "low", "medium", or "high" (default: "medium")
- variables: Array of variable names (default: ["title"])
- metadata: Additional metadata (author, version, etc.)
Custom templates override built-in templates with the same name.
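The template rules can be approximated with Python's `str.format`. This is a sketch under stated assumptions: `render_task` is a hypothetical helper, substitution is applied only to `title_template`, and the returned task shape is invented for illustration. The required fields and the `"medium"` default come from the lists above.

```python
from typing import Dict, Optional

REQUIRED_FIELDS = ("name", "description", "title_template", "acceptance_criteria")


def render_task(template: dict, title: str, extra_vars: Optional[Dict[str, str]] = None) -> dict:
    """Validate a template dict and build a task from it."""
    missing = [field for field in REQUIRED_FIELDS if field not in template]
    if missing:
        raise ValueError(f"template is missing required fields: {missing}")
    # "title" is always available; --var style extras are merged in.
    variables = {"title": title, **(extra_vars or {})}
    return {
        "title": template["title_template"].format(**variables),
        "priority": template.get("priority", "medium"),  # documented default
        "acceptance_criteria": list(template["acceptance_criteria"]),
    }


template = {
    "name": "api-endpoint",
    "description": "Template for new API endpoints",
    "title_template": "API: {title}",
    "acceptance_criteria": ["Integration tests pass"],
}
task = render_task(template, "Create user profile endpoint")
print(task["title"])
```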
Ralph provides shell completion scripts for bash and zsh to enable tab completion for commands, flags, and dynamic values.
Generate and install bash completion:
# Generate completion script
ralph completion bash > ~/.ralph-completion.sh
# Add to your ~/.bashrc
echo "source ~/.ralph-completion.sh" >> ~/.bashrc
# Reload your shell
source ~/.bashrc
System-wide installation (optional):
# Install for all users (sudo does not apply to a shell redirect, so pipe through sudo tee)
ralph completion bash | sudo tee /etc/bash_completion.d/ralph > /dev/null
# Reload bash completion
source /etc/bash_completion.d/ralph
Generate and install zsh completion:
# Create completion directory
mkdir -p ~/.zsh/completion
# Generate completion script
ralph completion zsh > ~/.zsh/completion/_ralph
# Add to your ~/.zshrc (if not already present)
echo "fpath=(~/.zsh/completion \$fpath)" >> ~/.zshrc
echo "autoload -Uz compinit && compinit" >> ~/.zshrc
# Reload your shell
source ~/.zshrc
Shell completion provides intelligent suggestions for:
- Commands: All ralph commands (init, run, step, status, etc.)
- Flags: Command-specific and global flags
- Agent names: Configured runners (codex, claude, copilot, custom)
- Templates: Available task templates (built-in and custom)
- Snapshots: Existing snapshot names for rollback
- File paths: For flags that accept files (--prd-file, --export, etc.)
- Formats: Output formats (text, json) and tracker formats (markdown, json, yaml)
# Tab completion for commands
ralph <TAB>
# Shows: init doctor diagnose stats resume clean step run status ...
# Tab completion for flags
ralph run --<TAB>
# Shows: --agent --max-iterations --prompt-file --prd-file --parallel --stream ...
# Tab completion for agent names
ralph step --agent <TAB>
# Shows: codex claude copilot
# Tab completion for templates
ralph task add --template <TAB>
# Shows: bug-fix feature refactor (and any custom templates)
# Tab completion for snapshots
ralph rollback <TAB>
# Shows: before-refactor my-checkpoint (your snapshot names)
Bash completion not working:
- Ensure the bash-completion package is installed
- Check that ~/.ralph-completion.sh exists and is sourced in ~/.bashrc
- Try reloading: source ~/.bashrc
Zsh completion not working:
- Ensure ~/.zsh/completion/_ralph exists
- Check that fpath includes ~/.zsh/completion in ~/.zshrc
- Run compinit to rebuild completion cache
- Try: rm ~/.zcompdump && compinit
Dynamic completions not showing:
- Dynamic completions (templates, snapshots) require ralph to be run from a valid ralph project directory
- Ensure .ralph/ralph.toml exists in your project
Ralph provides powerful progress tracking and visualization features to help you understand your project's velocity and completion timeline.
View detailed progress metrics including velocity and ETA:
ralph status --detailed
Example output:
Progress: [██████████████████████████████░░░░░░░░░░░░░░░░░░░░] 60% (12/20 tasks)
Detailed Progress Metrics:
Total Tasks: 20
Completed: 12
In Progress: 1
Blocked: 0
Completion: 60.0%
Velocity: 1.50 tasks/day
Estimated ETA: 2024-02-15
The --detailed flag shows:
- Progress Bar: Visual representation of task completion
- Task Counts: Total, completed, in-progress, and blocked tasks
- Completion Percentage: Overall progress percentage
- Velocity: Average tasks completed per day (calculated from history)
- Estimated ETA: Projected completion date based on current velocity
Visualize task completion over time with an ASCII burndown chart:
ralph status --chart
Example output:
Tasks
 20 │ ●
    │  ●
 15 │   ●●
    │     ●
 10 │      ●●
    │        ●
  5 │         ●●
    │           ●
  0 └─────────────────
    Day 1  3  5  7  9
The burndown chart shows:
- Y-axis: Number of remaining tasks
- X-axis: Days since project start
- Data Points (●): Task completion milestones
- Trend: Visual representation of progress velocity
- Velocity Calculation: Based on successful iterations in .ralph/state.json
- ETA Estimation: Uses velocity to project completion date
- History Required: At least 2 successful iterations needed for velocity calculation
- Automatic Updates: Progress metrics update after each successful iteration
- Daily Standups: Quick progress overview with ralph status --detailed
- Sprint Planning: Velocity data helps estimate future work
- Stakeholder Updates: Burndown charts visualize progress trends
- Bottleneck Detection: Identify when velocity drops
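One plausible way to derive velocity and ETA from completion history is sketched below. The exact formula Ralph uses is not stated here, so this is an assumption: velocity is completions divided by the observed span in days, and the ETA rounds the remaining days up. The function names and the reference date are hypothetical.

```python
from datetime import date, timedelta
from math import ceil
from typing import List, Optional


def velocity_per_day(completion_dates: List[date]) -> Optional[float]:
    """Tasks completed per day across the observed span; None with fewer than 2 points."""
    if len(completion_dates) < 2:
        return None
    span_days = (max(completion_dates) - min(completion_dates)).days
    return len(completion_dates) / max(span_days, 1)  # avoid dividing by zero


def estimate_eta(remaining: int, velocity: Optional[float], today: date) -> Optional[date]:
    """Project a completion date by dividing remaining tasks by velocity."""
    if not velocity or velocity <= 0:
        return None
    return today + timedelta(days=ceil(remaining / velocity))


# Hypothetical history: 12 completions spread over an 8-day span.
history = [date(2024, 2, 1) + timedelta(days=i % 9) for i in range(12)]
velocity = velocity_per_day(history)
eta = estimate_eta(remaining=8, velocity=velocity, today=date(2024, 2, 9))
print(velocity, eta)
```

Returning `None` when there are fewer than two data points mirrors the "History Required" note above.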
Runners are configured in .ralph/ralph.toml.
Prompt transport rules:
- codex: if argv contains -, prompt is sent via stdin
- claude: if argv contains -p, prompt is inserted immediately after -p
- copilot: if argv contains --prompt, prompt is inserted immediately after --prompt
- You can also use {prompt} in argv to inline
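The transport rules can be modelled as a function from a configured argv and a prompt to the final command line plus optional stdin text. This sketch is an approximation: it applies the rules by argv content rather than by runner name, the check order is an assumption, and the source does not say what happens when no rule matches (here nothing is sent). `build_invocation` is a hypothetical name, and the `claude` argv in the usage lines is illustrative.

```python
from typing import List, Optional, Tuple


def build_invocation(argv: List[str], prompt: str) -> Tuple[List[str], Optional[str]]:
    """Return (final_argv, stdin_text) following the transport rules above."""
    # Explicit placeholder: substitute the prompt inline.
    if any("{prompt}" in arg for arg in argv):
        return [arg.replace("{prompt}", prompt) for arg in argv], None
    # Flag-style transport: insert the prompt right after the flag.
    for flag in ("-p", "--prompt"):
        if flag in argv:
            i = argv.index(flag)
            return argv[: i + 1] + [prompt] + argv[i + 1:], None
    # A bare "-" means the runner reads the prompt from stdin.
    if "-" in argv:
        return list(argv), prompt
    # No rule matched: behaviour is unspecified in the source; send nothing.
    return list(argv), None


codex_argv, codex_stdin = build_invocation(["codex", "exec", "--full-auto", "-"], "Fix the bug")
claude_argv, claude_stdin = build_invocation(["claude", "-p"], "Fix the bug")
print(codex_argv, codex_stdin)
print(claude_argv, claude_stdin)
```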
ralph specs check
ralph specs check --strict
Uses files.specs_dir from config by default.
Run diagnostic checks to validate your Ralph configuration:
ralph diagnose
This command checks:
- Configuration file existence and syntax (.ralph/ralph.toml)
- TOML syntax validation
- Configuration schema validation
- Runner configuration
- PRD file existence and format
- PRD structure validation
Test gate commands:
ralph diagnose --test-gates
This runs each configured gate command individually to verify that it works correctly. Useful for debugging gate failures before running the loop.
Exit codes:
- 0: All diagnostics passed
- 2: Issues found (errors or warnings)
Example output:
Ralph Diagnostics Report
============================================================
PASSED:
✓ Configuration file found
✓ Configuration file ralph.toml has valid TOML syntax
✓ Configuration schema is valid
✓ Found 2 configured runner(s)
✓ PRD file found: PRD.md
✓ PRD file has valid markdown format
✓ PRD structure is valid (3/10 tasks complete)
============================================================
Summary: 7/7 checks passed
✓ All diagnostics passed!
View iteration statistics to understand loop performance:
ralph stats
This command displays:
- Total iterations (successful and failed)
- Success rate
- Duration statistics (average, min, max)
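For custom scripts that post-process iteration data, the same summary figures can be recomputed as follows. This is a sketch only. The record shape (`success`, `duration_s`) and the `summarize` helper are hypothetical and do not reflect the layout of `.ralph/state.json`.

```python
from statistics import mean
from typing import List


def summarize(iterations: List[dict]) -> dict:
    """Compute totals, success rate (percent), and duration statistics."""
    total = len(iterations)
    successful = sum(1 for it in iterations if it["success"])
    durations = [it["duration_s"] for it in iterations]
    return {
        "total": total,
        "successful": successful,
        "failed": total - successful,
        "success_rate": 100.0 * successful / total if total else 0.0,
        "avg_s": mean(durations) if durations else 0.0,
        "min_s": min(durations, default=0.0),
        "max_s": max(durations, default=0.0),
    }


# Hypothetical records: 15 iterations, the first 12 successful.
records = [{"success": i < 12, "duration_s": 100.0 + i} for i in range(15)]
summary = summarize(records)
print(summary["success_rate"], summary["avg_s"])
```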
Show per-task breakdown:
ralph stats --by-task
This shows detailed statistics for each task, including:
- Number of attempts per task
- Success/failure counts
- Average and total duration per task
- Tasks sorted by total duration (slowest first)
Export to CSV:
ralph stats --export stats.csv
Exports statistics to a CSV file for analysis in spreadsheet tools or custom scripts. The CSV includes both overall statistics and per-task breakdowns.
Example output:
============================================================
Ralph Gold - Iteration Statistics
============================================================
Overall Statistics:
Total Iterations: 15
Successful: 12
Failed: 3
Success Rate: 80.0%
Duration Statistics:
Average: 245.50s
Minimum: 120.30s
Maximum: 450.75s
============================================================
Use cases:
- Identify slow tasks that need optimization
- Track success rates over time
- Estimate time for remaining work
- Export data for trend analysis
Create git-based snapshots before risky changes and rollback if needed:
Create a snapshot:
ralph snapshot my-snapshot-name
Optionally add a description:
ralph snapshot before-refactor --description "Snapshot before major refactoring"
List all snapshots:
ralph snapshot --list
Rollback to a snapshot:
ralph rollback my-snapshot-name
The rollback command will ask for confirmation before proceeding. To skip confirmation:
ralph rollback my-snapshot-name --force
How it works:
- Snapshots use git stash to save your working tree state
- Ralph state (.ralph/state.json) is backed up separately
- Rollback restores both git state and Ralph state
- Snapshots are stored in .ralph/snapshots/ with metadata in state.json
Example workflow:
# Before making risky changes
ralph snapshot before-experiment -d "Before trying new approach"
# Make changes, run iterations
ralph step --agent codex
# If something goes wrong, rollback
ralph rollback before-experiment
# Or if everything works, continue and the snapshot remains available
Use cases:
- Create checkpoints before major refactoring
- Save state before experimenting with new approaches
- Quick recovery from failed iterations
- Safe exploration of different solutions
Notes:
- Rollback requires a clean working tree (or use --force)
- Snapshot names must use only letters, numbers, hyphens, and underscores
- Old snapshots can be cleaned up manually from .ralph/snapshots/
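Scripts that generate snapshot names can pre-check them against the naming rule in the notes above. The sketch below assumes "letters" means ASCII letters. The helper name is hypothetical, and Ralph's own validation may differ.

```python
import re

# Letters, numbers, hyphens, and underscores only (ASCII letters assumed).
_SNAPSHOT_NAME = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_snapshot_name(name: str) -> bool:
    """True when the whole name matches the allowed character set."""
    return _SNAPSHOT_NAME.fullmatch(name) is not None


for candidate in ("before-refactor", "my_checkpoint1", "bad name", "a/b"):
    print(candidate, is_valid_snapshot_name(candidate))
```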
- git errors: ensure you are inside a git repository and have at least one commit.
- Unknown agent: check runners.* in .ralph/ralph.toml or install the CLI.
- No prompt provided from Codex: ensure runner argv includes - so stdin is used.
- Assumption: agents run with least-privilege credentials and do not write secrets into .ralph/*.
- Risk: long-running loops can produce large logs; prune .ralph/logs/ if needed.
- Risk: auto-commit may amend unintended changes if the worktree is dirty; review git status before running unattended loops.
- Treat prompts and logs as potentially sensitive. Avoid storing secrets in .ralph/*.
- Run long loops in a least-privilege environment (container or isolated dev VM).
Core Guides:
- Configuration: docs/CONFIGURATION.md - Complete configuration reference
- Commands: docs/COMMANDS.md - Complete CLI command reference
- Authorization: docs/AUTHORIZATION.md - File write permission system
- Troubleshooting: docs/TROUBLESHOOTING.md - Common issues and solutions
Features:
- Evidence System: docs/EVIDENCE.md - Evidence citations and tracking
- Progress: docs/PROGRESS.md - Velocity, ETA, and burndown charts
- YAML Tracker: docs/YAML_TRACKER.md - Structured task tracking
- UX Modes: docs/SIMPLE_EXPERT_MODE.md - Simple vs expert workflow policy
- Watch Mode: README#watch-mode - File watching and auto-gates
Reference:
- Project Structure: docs/PROJECT_STRUCTURE.md - Directory layout and lifecycles
- Parallel Config: docs/PARALLEL_CONFIG.md - Parallel execution guide
- VS Code Bridge: docs/VSCODE_BRIDGE_PROTOCOL.md - Extension protocol
See CONTRIBUTING.md.
See SECURITY.md for vulnerability reporting.
See SUPPORT.md.
See CODE_OF_CONDUCT.md.
- Create/update implementation plan using .agent/PLANS.md contract.
- Validate plan graph: python3 /Users/jamiecraik/.codex/scripts/plan-graph-lint.py .agent/PLANS.md
- Run canonical verification: /Users/jamiecraik/.codex/scripts/verify-work.sh
- Validate version sync: uv run python scripts/check_version_sync.py
- Follow global scaffold policy: /Users/jamiecraik/.codex/instructions/agent-first-scaffold-spec.md
