Cub

⚠️ Status: Alpha - This is an early-stage release with breaking changes possible. See ALPHA-NOTES.md for known limitations and stability considerations.

Work ahead of your AI coding agents, then let them run.

Cub is for developers who are already running AI coding CLIs (Claude Code, Codex, OpenCode) in autonomous mode and want more structure. If you're juggling multiple agent sessions, manually routing work to different models, or finding that fully hands-off agents tend to run amok—Cub helps you work ahead of execution so you can be more hands-off during execution.

Quick Install

curl -LsSf https://install.cub.tools | bash

Then restart your shell (or open a new terminal) and run:

cub init --global

(Already installed? Run pipx upgrade cub or re-run the installer.)

The Problem

AI coding agents in 2026 are powerful. They can operate for hours, produce working code, run tests, and iterate toward production quality. But there's a gap:

Too hands-on: Sitting in an IDE, approving every tool call, staying close to the work
Too hands-off: Letting agents run wild with vague instructions, hoping for the best

Cub finds the balance. You invest time before code starts flying—breaking work into agent-sized tasks, routing complexity to the right models, reviewing the plan—then step back and let execution happen more seamlessly.

Two Main Steps: Planning and Running

Step 1. `cub plan`: Go From a Vision to Structured Tasks

Bring your ideas (a sentence, a spec, a whole design doc) and go through structured planning phases to generate clear tasks for an LLM:

Orient — Research and understand the problem space
Architect — Design the solution architecture
Itemize — Break it into agent-sized chunks with clear acceptance criteria

The goal: observable, reviewable work before any code is written. No gaps in understanding slip through.

cub plan run                # Run full planning pipeline
cub plan orient             # Or run phases individually
cub plan architect
cub plan itemize
cub plan list               # List all plans

Step 2: `cub run`: Turn Tasks Into Code

Once you have structured tasks, Cub runs the Ralph Wiggum loop—picking ready tasks, generating prompts, invoking your chosen AI harness, and iterating until done or budget exhausted.

cub run                     # Run until complete
cub run --once              # Single iteration
cub run --epic my-feature   # Target specific work

The execution loop handles dependency ordering, failure recovery, git commits, and structured logging. You can watch it stream or check in later.

Key Features

Right Model for the Task

Not everything needs Opus. Cub supports per-task model selection:

bd label add cub-abc model:haiku     # Simple rename, use fast model
bd label add cub-xyz model:sonnet    # Medium complexity
bd label add cub-123 model:opus      # Complex architecture work

Route simple refactoring to Haiku, medium tasks to Sonnet, reserve Opus for planning and complex work. Manage tokens as a resource.

Multi-Harness Flexibility

Cub abstracts across multiple AI coding CLIs:

Claude Code — General coding, complex refactoring (default)
OpenAI Codex — Quick fixes, OpenAI ecosystem
Google Gemini — Alternative perspective
OpenCode — Open-source option

Each harness evolves rapidly. New capabilities emerge in one that may not exist in others. Cub lets you use the right tool without vendor lock-in.

cub run --harness claude    # Explicit selection
cub run --harness codex

Deterministic Control Layer

Building outside any single harness means the core loop—task selection, success/failure detection, retry logic, state transitions—runs as traditional software, not LLM inference. This enables:

Reliable hooks: Email when a task completes, not "hopefully the agent remembers"
Consistent logging: Structured JSONL, not scattered console output
Predictable budgets: Hard limits that actually stop execution

Features

Autonomous Loop: Runs until all tasks complete or budget exhausted
Dependency Tracking: Respects task dependencies, picks ready tasks
Priority Scheduling: P0-P4 priority-based task selection
Epic/Label Filtering: Target specific epics or labeled tasks
Budget Management: Token tracking with configurable limits and warnings
Guardrails: Iteration limits, secret redaction, safety controls
Failure Handling: Configurable modes (stop, move-on, retry, triage)
Session Management: Named sessions, artifact bundles per task
Git Workflow: Auto-branching, commit per task, clean state enforcement
Hooks System: Custom scripts at 5 lifecycle points
Structured Logging: JSONL logs with timestamps, durations, git SHAs
Dual Task Backend: Use beads CLI or simple JSON file
Streaming Output: Watch agent activity in real-time
Dashboard [EXPERIMENTAL]: Unified Kanban board visualization of project state across 8 columns
Task State Sync [EXPERIMENTAL]: Git-based sync of task state to cub-sync branch for persistence across clones
Planning Pipeline [EXPERIMENTAL]: Orient → Architect → Itemize phases for structured task generation

Dashboard `[EXPERIMENTAL]`

Cub includes an integrated dashboard that provides a unified Kanban view of your entire project state—from initial ideas through release. This feature is experimental and the API may change.

8-Column Kanban Workflow

The dashboard visualizes work across 8 stages:

CAPTURES — Raw ideas and notes
SPECS — Specifications being researched
PLANNED — Specs in planning or existing plans
READY — Tasks ready to work (no blockers)
IN_PROGRESS — Active development and implementing specs
NEEDS_REVIEW — Awaiting code review or approval
COMPLETE — Done but not released
RELEASED — Shipped to production/users

Data Aggregation

The dashboard automatically aggregates data from multiple sources:

Specs — From specs/**/*.md with frontmatter
Plans — From .cub/sessions/*/plan.jsonl
Tasks — From your beads or JSON backend
Ledger — Completion records from .cub/ledger/
Changelog — Released versions from CHANGELOG.md

Running the Dashboard

# Start the dashboard API server
uvicorn cub.core.dashboard.api.app:app --reload --port 8000

# Open in your browser
open http://localhost:8000

# View API documentation
open http://localhost:8000/docs

Customizing Views

Create custom Kanban views in .cub/views/my-view.yaml:

id: my-view
name: My Workflow View
description: Customized view for my team

columns:
  - id: ready
    title: Ready to Start
    stages: [READY]
  - id: active
    title: In Progress
    stages: [IN_PROGRESS]
    group_by: epic_id  # Group by epic

filters:
  exclude_labels: [archived]
  include_types: [task, epic]

display:
  show_cost: true
  show_tokens: false
  card_size: compact

For details, see .cub/views/README.md.

Linking Entities

Use explicit relationship markers in frontmatter to link specs, plans, and tasks:

Spec to Plan:

---
id: spec-auth
title: Authentication Flow
spec_id: cub-vd6  # Links to plan cub-vd6
---

Plan to Epic:

---
id: plan-auth-flow
title: Auth Implementation Plan
plan_id: cub-vd6  # Links to epic cub-vd6
---

Task to Epic: Created automatically by bootstrap, or add epic_id label:

bd label add cub-abc epic:cub-vd6

⚠️ Security Considerations

For Alpha users, be aware:

Permissions Skipping — Cub includes --no-verify and --no-gpg-sign flags that bypass git safety hooks. Use these carefully in production environments as they can skip important validation and signing checks.
AI Code Execution — Cub runs AI-generated code in your environment without manual approval of each change. Only use in projects where automated execution is acceptable, with proper git history and recovery procedures in place.
Repository Access — When running multi-task sessions, cub modifies git branches, commits code, and can sync state to remote branches. Ensure your credentials and repository access are properly secured.
Task State Persistence — Task state is stored in .cub/tasks.jsonl and synced to the cub-sync git branch. Don't store sensitive credentials or secrets in task descriptions.
Sandbox Recommendations — For untrusted or experimental code generation workflows, consider using cub run --sandbox to isolate execution in Docker.

For details on all known limitations and stability issues, see ALPHA-NOTES.md.

Prerequisites

Python 3.10+ (required)
Harness (Claude and others):
- Claude Code CLI (claude) - Recommended for cub run (used by default).
- OpenAI Codex CLI (codex)
- Google Gemini CLI (gemini)
- OpenCode CLI (opencode)
Task Backend (optional):
- beads CLI (bd) - For advanced task management

Installation

One-Liner (Recommended)

curl -LsSf https://install.cub.tools | bash

This will:

Install cub via pipx (installing pipx if needed)
Add cub to your PATH
Run cub init --global to set up config directories

Restart your shell after installation.

Alternative Methods

Using pipx manually

pipx install git+https://github.com/lavallee/cub.git
cub init --global

Using uv

uv tool install git+https://github.com/lavallee/cub.git
cub init --global

From source (for development)

git clone https://github.com/lavallee/cub ~/tools/cub
cd ~/tools/cub
uv sync  # or: python3.10 -m venv .venv && source .venv/bin/activate && pip install -e .
export PATH="$HOME/tools/cub/.venv/bin:$PATH"
cub init --global

Add the PATH export to your ~/.bashrc or ~/.zshrc.

Quick Start

# Option A: Create a new project from scratch
cub new my-project          # Creates directory, git init, cub init

# Option B: Initialize an existing project
cd my-project
cub init                    # Initialize cub in current directory

Cub uses a JSONL-based task backend by default (.cub/tasks.jsonl). This persists task state as JSON lines without external dependencies. If you prefer beads CLI for advanced task management, ensure it's installed and run cub init --backend beads.

Path A: Start with Planning (Recommended)

When starting new work, use the planning pipeline to turn your ideas into structured tasks:

# Run the full planning pipeline
cub plan run

# Or run phases individually for more control
cub plan orient      # Research the problem space
cub plan architect   # Design the solution architecture
cub plan itemize     # Break into agent-sized tasks

After planning, tasks are ready to run. View your plans:

cub plan list        # List all plans
cub plan orient      # Jump back to earlier phases

Path B: Create Tasks Directly

If you already know what needs doing, use task management commands:

# Create tasks (auto-detects JSONL, beads, or JSON backend)
cub task create "Implement user authentication" --type feature --priority 0
cub task create "Add login form" --type task --priority 0

# List and manage tasks
cub task list        # Show all open tasks
cub task ready       # Show tasks ready to work on
cub task show <id>   # Show task details

Priority levels: 0 (highest) → 4 (backlog)

Run the Loop

cub status              # Check what's ready
cub run                 # Run until complete
cub run --once          # Single iteration
cub run --epic my-epic  # Target specific work
cub run --stream        # Watch in real-time

Upgrading from v0.20 (Bash)? See UPGRADING.md for migration guide.

Usage

Planning Commands (Vision → Tasks) `[EXPERIMENTAL]`

The planning pipeline is experimental. Command names and outputs may change in future releases.

# Full planning pipeline
cub plan run                # Run orient → architect → itemize

# Individual phases (for more control)
cub plan orient             # Research and understand the problem space
cub plan architect          # Design the solution architecture
cub plan itemize            # Break into agent-sized tasks

# Manage plans
cub plan list               # List all plans
cub interview <task-id>     # Deep-dive on a specific task (after planning)

Run Commands (Tasks → Code)

# Execute the loop
cub run                     # Run until all tasks complete
cub run --once              # Single iteration (-1 shorthand)
cub run --ready             # Show ready (unblocked) tasks (-r shorthand)
cub run --task <id>         # Run specific task by ID (-t shorthand)

# Session naming
cub run --name myname       # Custom session name (-n shorthand)

# Filtering
cub run --epic <id>         # Target tasks within a specific epic (-e shorthand)
cub run --label <name>      # Target tasks with a specific label (-l shorthand)
cub run --epic cub-1gq --label phase-1  # Combine filters

# Harness selection
cub run --harness claude    # Use Claude Code (default, -h shorthand)
cub run --harness codex     # Use OpenAI Codex CLI
cub run --harness gemini    # Use Google Gemini
cub run --harness opencode  # Use OpenCode

# Model selection
cub run --model haiku       # Use Haiku model (-m shorthand)
cub run --model sonnet      # Use Sonnet model
cub run --model opus        # Use Opus model

# Budget control
cub run --budget 10         # Max budget in USD (-b shorthand)
cub run --budget-tokens 100000  # Max token budget

# Output modes
cub run --stream            # Stream harness activity in real-time (-s shorthand)
cub run --debug             # Enable verbose debug logging
cub run --monitor           # Launch live dashboard in tmux split

# Isolation modes
cub run --worktree          # Run in isolated git worktree
cub run --sandbox           # Run in Docker sandbox for isolation
cub run --parallel 4        # Run 4 tasks in parallel (-p shorthand)
cub run --direct "task description"  # Run directly with provided task (-d shorthand)

# Git workflow
cub run --main-ok           # Allow running on main/master branch
cub run --use-current-branch  # Run in current branch (don't create new)
cub run --from-branch <ref>  # Base branch for new feature branch

# Sync control
cub run --no-sync           # Disable auto-sync for this run
cub run --no-circuit-breaker  # Disable circuit breaker timeout (for long operations)

Other Commands

# Setup
cub new my-project          # Create new project (mkdir + git init + cub init)
cub init                    # Initialize current project
cub init --global           # Set up global config

# Status and inspection
cub status                  # Show task progress
cub status --json           # JSON output for scripting
cub status -v               # Verbose status with details
cub explain-task <task-id>  # Show full task details
cub artifacts               # List task outputs
cub artifacts <task-id>     # Show specific task artifacts

# Task management (backend-agnostic)
cub task create "Title"     # Create a new task
cub task list               # List all tasks
cub task show <id>          # Show task details
cub task update <id>        # Update task fields
cub task close <id>         # Close a task
cub task ready              # List ready (unblocked) tasks
cub task counts             # Show task statistics
cub task dep                # Manage dependencies

# Git workflow
cub branch <epic-id>        # Create branch bound to epic
cub branches                # List branch-epic bindings
cub pr <epic-id>            # Create pull request for epic

# Sync task state to git
cub sync status             # Check sync status
cub sync init               # Initialize sync branch
cub sync -m "msg"           # Commit current task state with message
cub sync --push             # Push changes to remote after syncing
cub sync --pull             # Pull remote changes before syncing

# Utilities
cub doctor                  # Diagnose configuration issues
cub doctor --fix            # Automatically fix issues
cub version                 # Show cub version
cub update                  # Update project templates
cub system-upgrade          # Upgrade cub to newer version

For a complete command reference, run:

cub --help                  # Show all available commands
cub <command> --help        # Show help for a specific command

Note: Running cub without a subcommand defaults to cub run.

Project Structure

After running cub init, your project will have:

my-project/
├── prd.json        # Task backlog (JSONL-based, default)
├── PROMPT.md       # Loop prompt template (system instructions)
├── AGENT.md        # Build/run instructions for the agent
├── AGENTS.md       # Symlink to AGENT.md (for Codex compatibility)
├── progress.txt    # Session learnings (agent appends)
├── fix_plan.md     # Discovered issues and plans
├── specs/          # Detailed specifications
└── .cub/          # Cub runtime data
    ├── config.json # Project configuration
    ├── hooks/      # Project-specific hooks
    ├── tasks.jsonl # Task backend (primary location)
    └── ledger/     # Task completion ledger
        ├── index.jsonl   # Index of all ledger entries
        ├── by-task/      # Entries organized by task ID
        ├── by-epic/      # Entries organized by epic ID
        ├── by-run/       # Entries organized by run/session ID
        └── forensics/    # Session event logs (per session)

Ledger Storage

Task completion records are stored in a unified ledger structure:

.cub/ledger/
├── index.jsonl                           # Index of all completion entries
├── by-task/cub-abc/                      # All work on task cub-abc
│   ├── run-1-20260111-114543.jsonl       # Entry from first run
│   └── direct-session-20260112-090000.jsonl  # Entry from direct session
├── by-epic/cub-048a-5/                   # All work in epic cub-048a-5
│   ├── cub-abc-...jsonl
│   └── cub-def-...jsonl
├── by-run/porcupine-20260111-114543/     # All entries from a run
│   ├── cub-abc-...jsonl
│   └── cub-def-...jsonl
└── forensics/                            # Session event logs
    ├── porcupine-20260111-114543.jsonl   # Events from cub run session
    └── direct-session-20260112-090000.jsonl  # Events from direct Claude Code session

View ledger data with:

cub ledger show                  # List all completion entries
cub ledger stats                 # Show ledger statistics
cub ledger search <query>        # Search ledger entries
cub retro <id>                   # Generate retrospective for epic or plan
cub verify                       # Verify ledger consistency

Task Backends

Cub supports two task management backends:

JSON Backend (Default)

Simple file-based task management using prd.json:

{
  "projectName": "my-project",
  "prefix": "myproj",
  "tasks": [
    {
      "id": "myproj-a1b2",
      "type": "feature",
      "title": "User authentication",
      "description": "Implement login functionality",
      "acceptanceCriteria": ["Login form renders", "Tests pass"],
      "priority": "P1",
      "status": "open",
      "dependsOn": [],
      "notes": ""
    }
  ]
}

Beads Backend

For projects using the beads CLI:

# Install beads
brew install steveyegge/beads/bd

# Initialize in project
bd init

# Cub auto-detects .beads/ directory
cub status  # Uses beads backend automatically

Task Fields

Field	Description
`id`	Unique identifier (prefix + hash, e.g., `prd-a1b2`)
`type`	`epic`, `feature`, `task`, `bug`, `chore`
`title`	Short description
`description`	Full details, can use user story format
`acceptanceCriteria`	Array of verifiable conditions
`priority`	P0 (critical) to P4 (backlog)
`status`	`open`, `in_progress`, `closed`
`dependsOn`	Array of task IDs that must be closed first
`parent`	(Optional) Parent epic ID
`labels`	(Optional) Array of labels for filtering and model selection
`notes`	Agent-maintained notes

Per-Task Model Selection

Tasks can specify which Claude model to use via a model: label:

# In beads:
bd label add cub-abc model:haiku     # Use fast model for simple tasks
bd label add cub-xyz model:sonnet    # Use balanced model for complex tasks
bd label add cub-123 model:opus-4.5  # Use most capable model for hard tasks

In JSON backend, add labels to the task:

{
  "id": "prd-abc",
  "title": "Quick fix",
  "labels": ["model:haiku", "phase-1"]
}

When cub picks up a task with a model: label, it automatically sets CUB_MODEL to pass to the Claude harness.

Task Selection Algorithm

Find tasks where status == "open"
Filter to tasks where all dependsOn items are closed
Sort by priority (P0 first)
Pick the first one

AI Harnesses

Cub abstracts the AI coding CLI into a "harness" layer, supporting multiple backends.

For detailed capability matrix and technical reference, see docs/HARNESSES.md.

Claude Code (Default)

cub --harness claude
# or
export HARNESS=claude

Uses Claude Code's --append-system-prompt for clean prompt separation.

OpenAI Codex

cub --harness codex
# or
export HARNESS=codex

Uses Codex's --full-auto mode with combined prompts.

Google Gemini

cub --harness gemini
# or
export HARNESS=gemini

Uses Gemini CLI's -y (YOLO mode) for autonomous operation.

OpenCode

cub --harness opencode
# or
export HARNESS=opencode

Uses OpenCode's run subcommand with JSON output for token tracking.

Auto-Detection

By default, cub auto-detects available harnesses using this priority order:

Explicit HARNESS setting (CLI flag --harness or env var HARNESS)
Config priority array (harness.priority in config file)
Default detection order: claude > opencode > codex > gemini

Configuration Example

You can customize the harness priority in .cub.json or global config:

{
  "harness": {
    "priority": ["gemini", "claude", "codex", "opencode"]
  }
}

Cub will try each harness in order and use the first one available. If none are found, it falls back to the default order.

Budget Management

Cub provides token budget tracking to control AI API costs and prevent runaway spending.

How It Works

Cub tracks token usage across all tasks and enforces budget limits:

Per-task tracking: Each harness reports tokens used (where available)
Cumulative tracking: Total tokens tracked per session in logs
Warning threshold: Alert when budget usage reaches a configurable percentage
Hard limit: Loop exits when budget is exceeded

Budget Configuration

Set budget in your config file or via environment variable:

Global config (~/.config/cub/config.json):

{
  "budget": {
    "default": 1000000,
    "warn_at": 0.8
  }
}

Project override (.cub.json):

{
  "budget": {
    "default": 500000,
    "warn_at": 0.75
  }
}

Environment variable:

export CUB_BUDGET=2000000  # Overrides both config files
cub

Budget Parameters

Parameter	Default	Description
`budget.default`	1,000,000	Token budget limit per session
`budget.warn_at`	0.8	Warn when usage reaches this % (0.0-1.0)

Common Budget Examples

For development/testing (small projects):

export CUB_BUDGET=100000  # 100k tokens
cub

For medium projects (most use cases):

export CUB_BUDGET=1000000  # 1M tokens (default)
cub

For large projects (extensive refactoring):

export CUB_BUDGET=5000000  # 5M tokens
cub

For multi-day sessions:

# Set higher budget if running multiple iterations
export CUB_BUDGET=10000000  # 10M tokens
cub --max-iterations 200

Monitoring Budget Usage

Check token usage in structured logs:

# View all budget warnings
jq 'select(.event_type=="budget_warning")' ~/.local/share/cub/logs/myproject/*.jsonl

# Track total tokens per session
jq -s '[.[].data.tokens_used // 0] | add' ~/.local/share/cub/logs/myproject/*.jsonl

# Find high-cost tasks
jq 'select(.data.tokens_used > 10000)' ~/.local/share/cub/logs/myproject/*.jsonl

Guardrails

Cub includes safety guardrails to prevent runaway loops and protect sensitive information.

Iteration Limits

{
  "guardrails": {
    "max_task_iterations": 3,    // Max retries per task
    "max_run_iterations": 50,    // Max total iterations per run
    "iteration_warning_threshold": 0.8  // Warn at 80% of limit
  }
}

When a task exceeds max_task_iterations, it's marked as failed and skipped. When a run exceeds max_run_iterations, the entire run stops.

Circuit Breaker: Stagnation Detection

Cub includes a circuit breaker to detect and prevent infinite hangs when the AI harness becomes unresponsive. If no activity is detected for the configured timeout period, the run stops with a clear error message.

{
  "circuit_breaker": {
    "enabled": true,
    "timeout_minutes": 30
  }
}

When the circuit breaker trips:

Run stops immediately
Clear error message indicating stagnation detected
Task marked as failed
Artifacts captured for debugging

Disable for long operations:

If you're running legitimately long operations (e.g., model training, large dataset processing), disable the circuit breaker:

cub run --no-circuit-breaker    # Disable timeout protection for this run

Or configure in your project's .cub.json:

{
  "circuit_breaker": {
    "enabled": false
  }
}

Default timeout: 30 minutes of inactivity. Override in config:

{
  "circuit_breaker": {
    "enabled": true,
    "timeout_minutes": 60
  }
}

Secret Redaction

Cub automatically redacts sensitive patterns in logs and debug output:

{
  "guardrails": {
    "secret_patterns": [
      "api[_-]?key",
      "password",
      "token",
      "secret",
      "authorization",
      "credentials"
    ]
  }
}

Add custom patterns for project-specific secrets.

Failure Handling

Cub provides configurable failure handling modes:

{
  "failure": {
    "mode": "retry",
    "max_retries": 3
  }
}

Failure Modes

Mode	Behavior
`stop`	Stop immediately on first failure
`move-on`	Mark task failed, continue to next task
`retry`	Retry task with failure context (up to max_retries)
`triage`	(Future) Human-in-the-loop intervention

Failure Context

When using retry mode, subsequent attempts include context about what failed:

## Previous Attempt Failed
Exit code: 1
Error: Test failures in auth_test.py

Please fix the issues and try again.

Session Management

Each cub run creates a unique session with an auto-generated animal name:

# Auto-generated session name
cub run    # Creates: porcupine-20260111-114543

# Custom session name
cub run --name release-1.0    # Creates: release-1.0-20260111-114543

Session names are used for:

Git branch naming: cub/{session}/{timestamp}
Ledger organization: .cub/ledger/by-run/{session}/
Forensic logs: .cub/ledger/forensics/{session}.jsonl
Log identification

Session Assignment

Tasks can be assigned to specific sessions (useful for parallel work):

# With beads backend
bd assign cub-abc porcupine

# View task assignment
bd show cub-abc | grep Assignee

Git Workflow

Cub follows a disciplined git workflow:

Branch Per Run

Each run creates a feature branch (when using the auto-branch hook):

main
└── cub/porcupine/20260111-114543

Commit Per Task

The AI commits after each completed task with a structured message:

task(cub-abc): Implement user authentication

- Added login form component
- Created auth API endpoints
- Added tests for auth flow

Task-Id: cub-abc
Co-Authored-By: Claude Sonnet <noreply@anthropic.com>

Clean State Enforcement

Cub verifies clean git state before and after tasks:

{
  "clean_state": {
    "require_commit": true,   // Require all changes committed
    "require_tests": false    // Optionally require tests pass
  }
}

Override via CLI:

cub run --require-clean      # Force clean state check
cub run --no-require-clean   # Disable clean state check

Task State Synchronization `[EXPERIMENTAL]`

Cub automatically syncs task state to a dedicated git branch (cub-sync) to ensure task progress persists across git clones and team collaboration. This sync happens independently of your working tree—no checkout required. This feature is experimental and the sync mechanism may change.

How Auto-Sync Works

When you run cub run, the sync service:

Initializes the cub-sync branch (if not already created)
Commits task state to the branch after each task completion
Tracks changes using git plumbing commands (no working tree impact)
Persists .cub/tasks.jsonl in git history for recovery

Auto-Sync Behavior

By default, auto-sync triggers after every task completion during cub run:

cub run                      # Auto-syncs after each task (default)
cub run --no-sync            # Disable auto-sync for this run

Configuration

Control auto-sync behavior in your config file:

Global config (~/.config/cub/config.json):

{
  "sync": {
    "enabled": true,
    "auto_sync": "run"
  }
}

Project override (.cub.json):

{
  "sync": {
    "enabled": true,
    "auto_sync": "never"
  }
}

Auto-Sync Modes

Mode	Behavior
`run`	Auto-sync during `cub run` only (default)
`always`	Auto-sync on every task mutation
`never`	Manual sync only (use `cub sync` with flags)

Sync Commands

Manual sync operations are available via the CLI:

# Initialize sync branch (required once)
cub sync init

# Check sync status
cub sync status

# Commit current task state with custom message
cub sync -m "Update tasks"

# Push changes to remote after syncing
cub sync --push

# Pull and merge remote changes before syncing
cub sync --pull

# Combine operations
cub sync --pull --push -m "Sync from local"

Failure Handling

Auto-sync failures are logged but don't stop task execution:

Sync fails: Warning logged, task still completes
Git unavailable: Sync disabled for the run
No changes: Commit skipped (same task state)

Example output:

✓ Task cub-abc completed
⚠ Warning: Failed to sync task state: git not found in PATH

Multi-Task Runs

During multi-task runs, each task completion triggers a sync commit:

# Run with 3 tasks
cub run

# Creates 3 commits on cub-sync branch:
# - "Task cub-001 completed"
# - "Task cub-002 completed"
# - "Task cub-003 completed"

Disabling Auto-Sync

To run without auto-sync (e.g., for testing):

# Temporary disable (this run only)
cub run --no-sync

# Permanent disable (config)
# Edit .cub.json:
{
  "sync": {
    "auto_sync": "never"
  }
}

Collaboration

The cub-sync branch enables team collaboration on task state:

Push your local sync branch: cub sync push
Team member pulls: cub sync pull
Conflicts resolved using last-write-wins (based on updated_at)
Recovery: Clone repo, pull sync branch, continue work

Technical Details

Branch: cub-sync (configurable)
File: .cub/tasks.jsonl committed to sync branch
Method: Git plumbing (git hash-object, git commit-tree)
Working tree: Never touched (uses git update-ref)
Conflict resolution: Last-write-wins based on task updated_at timestamp

Environment Variables

Variable	Default	Description
`CUB_PROJECT_DIR`	`$(pwd)`	Project directory
`CUB_MAX_ITERATIONS`	`100`	Max loop iterations
`CUB_DEBUG`	`false`	Enable debug mode
`CUB_STREAM`	`false`	Enable streaming output
`CUB_BACKEND`	`auto`	Task backend: `auto`, `beads`, `json`
`CUB_EPIC`		Filter to tasks within this epic ID
`CUB_LABEL`		Filter to tasks with this label
`CUB_MODEL`		Override model for Claude harness
`CUB_BUDGET`		Override token budget (overrides config)
`HARNESS`	`auto`	AI harness: `auto`, `claude`, `codex`, `opencode`, `gemini`
`CLAUDE_FLAGS`		Extra flags for Claude Code
`CODEX_FLAGS`		Extra flags for Codex CLI
`GEMINI_FLAGS`		Extra flags for Gemini CLI
`OPENCODE_FLAGS`		Extra flags for OpenCode CLI

Configuration

Cub uses XDG-compliant configuration with global and project-level overrides.

For a complete reference of all configuration options, see docs/CONFIG.md.

Global Setup

cub init --global

Creates:

~/.config/cub/config.json - Global configuration
~/.config/cub/hooks/ - Hook directories
~/.local/share/cub/logs/ - Log storage
~/.cache/cub/ - Cache directory

Configuration Precedence

CLI flags (highest priority)
Environment variables
Project config (.cub.json in project root)
Global config (~/.config/cub/config.json)
Hardcoded defaults (lowest priority)

Config File Format

{
  "harness": {
    "default": "auto",
    "priority": ["claude", "codex"]
  },
  "budget": {
    "default": 1000000,
    "warn_at": 0.8
  },
  "loop": {
    "max_iterations": 100
  },
  "clean_state": {
    "require_commit": true,
    "require_tests": false
  },
  "hooks": {
    "enabled": true
  }
}

Project Override

Create .cub.json in your project root to override global settings:

{
  "budget": {
    "default": 500000
  },
  "loop": {
    "max_iterations": 50
  }
}

Structured Logging

Cub logs all task executions in JSONL format for debugging and analytics.

Log Location

~/.local/share/cub/logs/{project}/{session}.jsonl

Session ID format: YYYYMMDD-HHMMSS (e.g., 20260109-214858)

Log Events

Each task produces structured events:

{"timestamp":"2026-01-09T21:48:58Z","event_type":"task_start","data":{"task_id":"cub-abc","task_title":"Fix bug","harness":"claude"}}
{"timestamp":"2026-01-09T21:52:30Z","event_type":"task_end","data":{"task_id":"cub-abc","exit_code":0,"duration":212,"tokens_used":0,"git_sha":"abc123..."}}

Querying Logs

# Find all task starts
jq 'select(.event_type=="task_start")' ~/.local/share/cub/logs/myproject/*.jsonl

# Find failed tasks
jq 'select(.event_type=="task_end" and .data.exit_code != 0)' logs/*.jsonl

# Calculate total duration
jq -s '[.[].data.duration // 0] | add' logs/*.jsonl

Hooks

Cub provides a flexible hook system to integrate with external services and tools. Hooks are executable scripts that run at specific points in the cub lifecycle.

Hook Lifecycle

The hook execution flow through a typical cub session:

┌─────────────────────────────────────────────────┐
│                   cub Start                     │
└──────────────────┬──────────────────────────────┘
                   │
                   ▼
            ┌──────────────┐
            │ pre-loop ✓   │  (setup, initialization)
            └──────────────┘
                   │
                   ▼
        ┌──────────────────────┐
        │  Main Loop Starts    │
        └──────┬───────────────┘
               │
        ┌──────▼──────────┐
        │ pre-task ✓      │  (for each task)
        └────────┬────────┘
                 │
                 ▼
          ┌─────────────────┐
          │ Execute Task    │
          │  (harness)      │
          └────────┬────────┘
                   │
        ┌──────────┴──────────┐
        │                     │
        ▼                     ▼
   ┌──────────┐         ┌──────────┐
   │ Success  │         │ Failure  │
   └────┬─────┘         └────┬─────┘
        │                    │
        │              ┌─────▼──────┐
        │              │ on-error ✓ │  (alert, logs)
        │              └─────┬──────┘
        │                    │
        └────────┬───────────┘
                 │
                 ▼
           ┌────────────────┐
           │ post-task ✓    │  (metrics, notify)
           └────────┬───────┘
                    │
        ┌───────────┴──────────┐
        │                      │
        ▼                      ▼
    ┌────────┐           ┌──────────┐
    │ More   │           │ All Done │
    │ Tasks? │           └──────┬───┘
    └───┬────┘                  │
        │ yes                   │
        ▼                       │
   (Loop Back)                  │
        │                       │
        └───────────────────────┘
                   │
                   ▼
            ┌──────────────┐
            │ post-loop ✓  │  (cleanup, reports)
            └──────────────┘
                   │
                   ▼
            ┌──────────────┐
            │  Exit Loop   │
            └──────────────┘

Hook Points

Cub supports eight hook points:

Hook	When It Runs	Use Cases
`pre-loop`	Before starting the main loop	Setup, initialization, cleanup from previous run
`pre-task`	Before each task execution	Prepare environment, start timers
`post-task`	After each task (success or failure)	Notifications, metrics, logging
`on-error`	When a task fails	Alerts, incident creation, diagnostics
`on-budget-warning`	When budget crosses threshold (80%)	Cost alerts, scaling decisions
`on-all-tasks-complete`	When all tasks are done	Final notifications, trigger next phase
`post-loop`	After the main loop completes	Cleanup, final notifications, reports
`post-init`	After `cub init` completes	Custom project setup, install team hooks

Hook Locations

Hooks are discovered from two locations (in order):

Global hooks: ~/.config/cub/hooks/{hook-name}.d/ - Available to all projects
Project hooks: ./.cub/hooks/{hook-name}.d/ - Specific to a project

All executable files in these directories are run in sorted order (alphabetically).

Context Variables

All hooks receive context via environment variables:

Variable	Available In	Description
`CUB_HOOK_NAME`	All	Name of the hook being executed
`CUB_PROJECT_DIR`	All	Project directory
`CUB_SESSION_ID`	pre-loop, post-loop	Unique session identifier
`CUB_HARNESS`	pre-loop, post-loop	Harness in use (claude, codex, etc.)
`CUB_TASK_ID`	pre-task, post-task, on-error	ID of the current task
`CUB_TASK_TITLE`	pre-task, post-task, on-error	Title of the current task
`CUB_EXIT_CODE`	post-task, on-error	Exit code from task execution (0 = success)

Example Hooks

Cub includes example hooks for common integrations:

examples/hooks/post-task/slack-notify.sh - Posts task completion to Slack
examples/hooks/post-loop/datadog-metric.sh - Sends metrics to Datadog
examples/hooks/on-error/pagerduty-alert.sh - Creates PagerDuty incidents on failure

To install an example hook:

# Copy to global hooks directory
mkdir -p ~/.config/cub/hooks/{post-task,post-loop,on-error}.d
cp examples/hooks/post-task/slack-notify.sh ~/.config/cub/hooks/post-task.d/01-slack.sh
chmod +x ~/.config/cub/hooks/post-task.d/01-slack.sh

# Or to project-specific hooks
mkdir -p .cub/hooks/post-task.d
cp examples/hooks/post-task/slack-notify.sh .cub/hooks/post-task.d/01-slack.sh
chmod +x .cub/hooks/post-task.d/01-slack.sh

Each example script includes detailed installation and configuration instructions.

Writing Custom Hooks

Creating a hook is simple - just write a bash script:

#!/usr/bin/env bash
# Example hook script

# Hooks receive context as environment variables
echo "Task $CUB_TASK_ID completed with exit code $CUB_EXIT_CODE"

# Exit with 0 for success, non-zero for failure
exit 0

Requirements:

Script must be executable (chmod +x)
Script must exit with status 0 (success) or non-zero (failure)
Script should handle missing environment variables gracefully
Hook failures are logged but don't stop the loop by default (unless hooks.fail_fast is enabled in config)

Configuration

Hook behavior is controlled in your config file:

{
  "hooks": {
    "enabled": true,
    "fail_fast": false
  }
}

Option	Default	Description
`hooks.enabled`	`true`	Enable/disable all hooks
`hooks.fail_fast`	`false`	Stop loop if a hook fails (true) or continue (false)
`hooks.async_notifications`	`true`	Run post-task/on-error hooks asynchronously

Async Hooks

By default, post-task and on-error hooks run asynchronously (non-blocking). This means:

These hooks fire in the background and don't slow down the main loop
The next task can start while notifications are being sent
All async hooks are collected before the session ends

To disable async execution and wait for each hook:

{
  "hooks": {
    "async_notifications": false
  }
}

Note: pre-loop, pre-task, and post-loop hooks always run synchronously because they must complete before the next phase begins.

Troubleshooting Hooks

Hooks not running:

Verify hooks.enabled: true in config
Check scripts are executable: chmod +x <script>
Verify directory structure: .cub/hooks/{hook-name}.d/

Hook timeout:

Hooks have a 5-minute timeout (300 seconds)
Long operations should use timeout command internally

Hook failed:

Check output for error messages
Exit code 0 = success, non-zero = failure
With fail_fast: false (default), failures are logged but don't stop cub

Naming convention:

Use numeric prefixes to control order: 01-first.sh, 02-second.sh
Scripts run in alphabetical order (global before project)

Hook Best Practices

Make hooks idempotent - Safe to run multiple times
Handle missing variables - Use ${VAR:-default} syntax
Exit cleanly - Always exit 0 for success
Log with prefix - echo "[hook-name] message" for clarity
Use numeric prefixes - 10-setup.sh, 20-main.sh, 90-cleanup.sh
Store state in .cub/ - For inter-hook communication
Keep hooks fast - Under 30 seconds ideally
Test locally first - CUB_TASK_ID=test ./hook.sh

Example with best practices:

#!/usr/bin/env bash
# 10-slack-notify.sh - Post task completion to Slack

set -euo pipefail

# Handle missing variables gracefully
TASK_ID="${CUB_TASK_ID:-unknown}"
EXIT_CODE="${CUB_EXIT_CODE:-0}"

# Skip if no webhook configured
if [[ -z "${SLACK_WEBHOOK_URL:-}" ]]; then
    echo "[slack-notify] No SLACK_WEBHOOK_URL set, skipping"
    exit 0
fi

# Send notification
curl -s -X POST "$SLACK_WEBHOOK_URL" \
    -H 'Content-type: application/json' \
    -d "{\"text\": \"Task $TASK_ID completed (exit code: $EXIT_CODE)\"}" \
    > /dev/null

echo "[slack-notify] Notification sent for $TASK_ID"
exit 0

Toolsmith: Tool Discovery and Management

Cub includes Toolsmith, a built-in tool discovery system to help you find, evaluate, and catalog tools (MCP servers, skills, and integrations) from multiple sources.

Quick Start

Sync tools from all sources:

cub toolsmith sync

Search for tools:

cub toolsmith search "database"

View catalog statistics:

cub toolsmith stats

Commands

Command	Purpose
`cub toolsmith sync [--source NAME]`	Populate catalog from external sources
`cub toolsmith search QUERY [--source NAME]`	Find tools by name, description, or capability
`cub toolsmith stats`	View catalog statistics and last sync time

Sources

Toolsmith syncs from multiple sources:

Smithery - MCP server registry
Glama - AI tool and skill marketplace
SkillsMP - Multi-provider skills platform
ClawdHub - Claude-focused tools

Configuration

Toolsmith stores the tool catalog in:

SQLite database: ~/.cub/toolsmith.db
Override: Set CUB_TOOLSMITH_DB environment variable

Common Use Cases

# Populate your catalog (do this first)
cub toolsmith sync

# Find database tools
cub toolsmith search "database"

# Filter by source
cub toolsmith search "api" --source smithery

# Check what's been synced
cub toolsmith stats

Documentation

For detailed documentation, examples, and troubleshooting, see docs/toolsmith.md.

How It Works

The Loop

┌──────────────────────────────────────────┐
│                 cub                      │
│                                           │
│  Tasks ────▶ Find Ready Task             │
│                     │                     │
│                     ▼                     │
│              Generate Prompt              │
│                     │                     │
│                     ▼                     │
│           AI Harness (claude/codex)       │
│                     │                     │
│                     ▼                     │
│              Task Complete?               │
│                /        \                 │
│               ▼          ▼                │
│            Loop        Done               │
└──────────────────────────────────────────┘

Prompt Structure

Cub generates two prompts for each iteration:

System Prompt (from PROMPT.md): Static instructions about workflow, rules, and completion signals
Task Prompt: Current task details including ID, description, and acceptance criteria

Feedback Loops

The agent runs these before committing:

Type checking (tsc, mypy, etc.)
Tests (jest, pytest, etc.)
Linting (eslint, ruff, etc.)
Build (if applicable)

If any fail, the agent must fix before proceeding.

Completion Signal

When all tasks have status: "closed", the agent outputs:

<promise>COMPLETE</promise>

This signals cub to exit the loop.

Advanced Usage

Streaming Mode

Watch agent activity in real-time:

cub run --stream

Shows tool calls, responses, and costs as they happen.

Debug Mode

Get verbose output for troubleshooting:

cub run --debug --once

Includes:

Full prompts being sent
Full harness command line (for copy-paste debugging)
Task selection details
Timing information
Acceptance criteria logging
Saves prompts to temp files

Planning Mode

Analyze codebase and update fix_plan.md:

cub run --plan

Uses parallel subagents to study code, find TODOs, and document issues.

Task Backend Auto-Detection

Cub automatically detects your task backend:

Beads — If .beads/ directory exists, uses beads CLI
JSON — If prd.json exists, uses JSON backend
JSONL — If .cub/tasks.jsonl exists, uses JSONL backend (default)

To switch backends, use the CUB_BACKEND environment variable:

# Explicitly use JSON backend
CUB_BACKEND=json cub status

# Use beads backend
CUB_BACKEND=beads cub status

# Auto-detect (default)
cub status

Tips

Task Sizing

Keep tasks small enough to complete in one iteration (~one context window). If a task feels big, break it into subtasks.

Specifications

The more detailed your specs, the better the output. Put them in specs/ and reference them in task descriptions.

Progress Memory

The agent appends to progress.txt after each task. This creates memory across iterations - patterns discovered, gotchas encountered.

Recovery

If the codebase gets into a broken state:

git reset --hard HEAD~1  # Undo last commit
cub                      # Restart loop

Choosing a Harness

Harness	Best For
Claude Code	General coding, complex refactoring, multi-file changes
Codex	Quick fixes, OpenAI ecosystem projects

Source Code Reference

Module	Purpose
`src/cub/cli/`	Typer CLI subcommands (run, status, init, planning commands)
`src/cub/core/config.py`	Configuration loading and merging
`src/cub/core/models.py`	Pydantic data models (Task, Config, etc.)
`src/cub/core/tasks/`	Task backends (beads, JSON)
`src/cub/core/harness/`	AI harness backends (Claude, Codex, Gemini, OpenCode)
`src/cub/core/logger.py`	Structured JSONL logging
`templates/PROMPT.md`	Default system prompt
`templates/AGENT.md`	Default agent instructions

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 837 Commits
.beads		.beads
.claude		.claude
.cub		.cub
.github/workflows		.github/workflows
docker		docker
docs-src		docs-src
docs		docs
examples/hooks		examples/hooks
plans		plans
scripts		scripts
sketches		sketches
specs		specs
src/cub		src/cub
templates		templates
tests		tests
.coverage		.coverage
.cub.json		.cub.json
.dockerignore		.dockerignore
.gitattributes		.gitattributes
.gitignore		.gitignore
.python-version		.python-version
@progress.txt		@progress.txt
AGENT.md		AGENT.md
AGENTS.md		AGENTS.md
CHANGELOG.md		CHANGELOG.md
CLAUDE.md		CLAUDE.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
PROMPT.md		PROMPT.md
README.md		README.md
ROADMAP.md		ROADMAP.md
UPGRADING.md		UPGRADING.md
coverage.json		coverage.json
plan_review.md		plan_review.md
pyproject.toml		pyproject.toml
uv.lock		uv.lock

License

lavallee/cub

Folders and files

Latest commit

History

Repository files navigation

Cub

Quick Install

The Problem

Two Main Steps: Planning and Running

Step 1. cub plan: Go From a Vision to Structured Tasks

Step 2: cub run: Turn Tasks Into Code

Key Features

Right Model for the Task

Multi-Harness Flexibility

Deterministic Control Layer

Features

Dashboard [EXPERIMENTAL]

8-Column Kanban Workflow

Data Aggregation

Running the Dashboard

Customizing Views

Linking Entities

⚠️ Security Considerations

Prerequisites

Installation

One-Liner (Recommended)

Alternative Methods

Quick Start

Path A: Start with Planning (Recommended)

Path B: Create Tasks Directly

Run the Loop

Usage

Planning Commands (Vision → Tasks) [EXPERIMENTAL]

Run Commands (Tasks → Code)

Other Commands

Project Structure

Ledger Storage

Task Backends

JSON Backend (Default)

Beads Backend

Task Fields

Per-Task Model Selection

Task Selection Algorithm

AI Harnesses

Claude Code (Default)

OpenAI Codex

Google Gemini

OpenCode

Auto-Detection

Configuration Example

Budget Management

How It Works

Budget Configuration

Budget Parameters

Common Budget Examples

Monitoring Budget Usage

Guardrails

Iteration Limits

Circuit Breaker: Stagnation Detection

Secret Redaction

Failure Handling

Failure Modes

Failure Context

Session Management

Session Assignment

Git Workflow

Branch Per Run

Commit Per Task

Clean State Enforcement

Task State Synchronization [EXPERIMENTAL]

How Auto-Sync Works

Auto-Sync Behavior

Configuration

Auto-Sync Modes

Sync Commands

Failure Handling

Multi-Task Runs

Disabling Auto-Sync

Collaboration

Step 1. `cub plan`: Go From a Vision to Structured Tasks

Step 2: `cub run`: Turn Tasks Into Code

Dashboard `[EXPERIMENTAL]`

Planning Commands (Vision → Tasks) `[EXPERIMENTAL]`

Task State Synchronization `[EXPERIMENTAL]`

Packages