Autonomous development pipeline that pairs with any GitHub-hosted TypeScript/Node repo. Three components work together: bugbot scans the codebase for issues and files them on GitHub, nightshift picks them up and implements fixes autonomously using Claude Code, and self-review triages its own PRs before you see them.
Note: Currently TypeScript-only. The verification gates (`npm run verify`, dependency guard, etc.) assume a Node/TS project. Supporting other languages would require swappable verify commands and language-aware guardrails.
- Node.js 18+
- GitHub CLI (`gh`), authenticated
- Claude Code CLI (`claude`), authenticated
- Claude Max subscription recommended: nightshift and bugbot run headless Claude Code sessions that consume significant tokens. Built and tested with unlimited Max tokens; API billing may be expensive for large queues.
- GitHub Actions minutes — every PR nightshift creates triggers CI on the target repo (branch push + merge-to-main = 2 CI triggers per issue). The actual cost depends on your target repo's CI workflow — matrix builds, long test suites, or multiple jobs multiply the minutes per trigger. A full overnight run of 40+ issues can easily consume 200+ minutes on a modest CI setup, and much more with heavier workflows. GitHub Free includes 2,000 min/month for private repos (public repos are unlimited). Monitor usage at Settings → Billing → Actions, and set a $0 spending limit to avoid surprise charges. Consider pausing nightshift late in your billing cycle if you're above 80%.
- `npx tsx` (`npx` is included with Node 18+)
- tmux (for `nightshift start`)
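
The CI-minutes arithmetic above can be sketched as a rough estimator. Only the two-triggers-per-issue rule and the 2,000-minute free allowance come from this README; the function names and the 2.5 minutes-per-trigger figure are assumptions for illustration.

```typescript
// Rough CI budget estimate. estimateCiMinutes and shareOfFreeTier are
// hypothetical helper names, not part of nightshift.
const TRIGGERS_PER_ISSUE = 2; // branch push + merge-to-main
const FREE_MINUTES_PER_MONTH = 2000; // GitHub Free, private repos

function estimateCiMinutes(issues: number, minutesPerTrigger: number): number {
  return issues * TRIGGERS_PER_ISSUE * minutesPerTrigger;
}

function shareOfFreeTier(minutes: number): number {
  return minutes / FREE_MINUTES_PER_MONTH;
}

// Assumed workload: 40 issues at 2.5 CI minutes per trigger.
const nightly = estimateCiMinutes(40, 2.5);
console.log(nightly, shareOfFreeTier(nightly));
```

Under those assumed numbers, one overnight run uses a tenth of the free allowance, so ten such runs in a billing cycle would exhaust it.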
The diagram uses color-coded swim lanes:
- Blue (Bugbot): scanners, dedup, and issue filing
- Green (Bash phases, P1–P5): worktree setup and execution via `auto-dev.sh`
- Purple (TypeScript phases, P6–P9): panel review and publish via `worker.ts`
- Orange (Self-review phases, P10–P12): PR triage via `pr-self-review.ts`
- Red (bottom bar, infrastructure): state management, circuit breaker, PID lock
Dashed arrows represent label transitions and feedback loops (see Label Lifecycle).
See the interactive pipeline explainer for a visual breakdown (source).

Clone the repo and install dependencies:

    git clone <this-repo-url>
    cd auto-dev
    npm install
    cp .env.example .env   # edit TARGET_REPO to point at your project

Add the CLI functions to your shell:

    echo 'source /path/to/auto-dev/nightshift/nightshift.zsh' >> ~/.zshrc

If you clone to a non-standard location, set `AUTO_DEV_REPO` to the repo root:

    export AUTO_DEV_REPO=~/projects/auto-dev   # default: ~/Repos/auto-dev

Bugbot commands:

    bugbot                       # Run all scanners against TARGET_REPO
    bugbot --dry-run             # Show findings without filing issues
    bugbot --category dead-code  # Run specific scanner only

Nightshift commands:

    nightshift start                     # Launch in tmux with dashboard
    nightshift start --at 2:00           # Start at 2:00 AM tonight (sleeps in tmux)
    nightshift start --in 1h             # Start in 1 hour
    nightshift start --fresh             # Ignore prior state, start clean
    nightshift start --issue 184,185     # Process specific issues (comma-separated, passed individually to auto-dev.sh)
    nightshift start --concurrency 3     # Parallel workers
    nightshift start --dry-run           # Preview queue without executing
    nightshift start --max-failures 3    # Stop after N consecutive failures
    nightshift status                    # One-shot status check
    nightshift log                       # Tail $STATE_DIR/nightshift.log
    nightshift stop                      # Kill the tmux session (also cancels scheduled runs)
    nightshift promote                   # Label next wave of unblocked issues

Flags can be combined: `nightshift start --at 2:00 --concurrency 2 --fresh`

The queue builder converts labeled GitHub issues into a prioritized work queue; the worker pool processes them with configurable concurrency (`--concurrency N`) controlled by a semaphore.
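
A semaphore-bounded worker pool of the kind described above might look roughly like this. It is a minimal sketch, not nightshift's actual implementation; `Semaphore` and `processQueue` are hypothetical names.

```typescript
// Minimal counting semaphore; an illustrative stand-in, not nightshift's code.
class Semaphore {
  private permits: number;
  private waiters: Array<() => void> = [];

  constructor(permits: number) {
    this.permits = permits;
  }

  async acquire(): Promise<void> {
    if (this.permits > 0) {
      this.permits--;
      return;
    }
    // No permit free: park until release() hands one over.
    await new Promise<void>((resolve) => this.waiters.push(resolve));
  }

  release(): void {
    const next = this.waiters.shift();
    if (next) next(); // pass the permit directly to the next waiter
    else this.permits++;
  }
}

// Run every queued item, never exceeding `concurrency` in flight.
// Results come back in queue order.
async function processQueue<T, R>(
  queue: T[],
  concurrency: number,
  work: (item: T) => Promise<R>,
): Promise<R[]> {
  const sem = new Semaphore(concurrency);
  return Promise.all(
    queue.map(async (item) => {
      await sem.acquire();
      try {
        return await work(item);
      } finally {
        sem.release(); // free the slot even if the worker throws
      }
    }),
  );
}
```

Releasing in `finally` matters here: a crashed worker would otherwise hold its slot forever and quietly lower the effective concurrency.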

Optimize commands:

    nightshift optimize                       # Run with defaults (10 experiments, PR after 5 wins)
    nightshift optimize --max-experiments 20  # Cap experiment count
    nightshift optimize --wins-before-pr 3    # Draft PR after 3 wins
    nightshift optimize --dry-run             # Show config without executing
    nightshift optimize status                # Show optimize state

Inspired by Karpathy's autoresearch pattern. Runs in a git worktree (`TARGET_REPO--optimize`) so the user's working tree is never touched. Each experiment cycle:

- Claude generates a hypothesis (reads `program.md` + prior results)
- Claude implements the change headlessly
- `npm run verify` gates correctness (build + lint + test)
- Benchmark runs 15 diverse screenings, measures p50/p95 latency
- Win detection requires both >=5% p50 improvement AND statistical significance (paired t-test, alpha=0.05)
- Win → commit kept, baseline ratchets forward. Loss → rollback to snapshot SHA
After N wins accumulate, a draft PR is created with before/after metrics. Wall time is ~5 minutes per experiment.
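
The win rule (at least 5% p50 improvement and a significant paired t-test) can be sketched as follows. This is an illustration under stated assumptions, not the optimizer's real code: the function names are hypothetical, the test is taken to be two-sided, and the critical value is hard-coded for 15 paired screenings (14 degrees of freedom).

```typescript
// Two-sided critical t for alpha = 0.05 with df = 14 (15 paired runs).
// Assumption: the README does not say one- or two-sided; two-sided is used.
const T_CRIT_DF14 = 2.145;
const MIN_P50_GAIN = 0.05;

function median(xs: number[]): number {
  const s = [...xs].sort((a, b) => a - b);
  const mid = Math.floor(s.length / 2);
  return s.length % 2 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
}

// Paired t statistic on per-screening differences (baseline - candidate).
function pairedT(baseline: number[], candidate: number[]): number {
  const d = baseline.map((b, i) => b - candidate[i]);
  const n = d.length;
  const mean = d.reduce((a, b) => a + b, 0) / n;
  const variance = d.reduce((a, x) => a + (x - mean) ** 2, 0) / (n - 1);
  return mean / Math.sqrt(variance / n);
}

function isWin(baseline: number[], candidate: number[]): boolean {
  if (baseline.length !== 15 || candidate.length !== 15) {
    throw new Error("this sketch hard-codes the critical value for 15 pairs");
  }
  const gain = (median(baseline) - median(candidate)) / median(baseline);
  const t = pairedT(baseline, candidate);
  // Lower latency is better, so a win needs a positive, significant t.
  return gain >= MIN_P50_GAIN && t > T_CRIT_DF14;
}
```

Requiring both conditions filters out two failure modes: a consistent but negligible speedup (significant, under 5%) and a large median shift that comes from noisy screenings (over 5%, not significant).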
| Phase | Name | Tool | Description |
|---|---|---|---|
| 1 | Discover | auto-dev.sh | Find nightshift-labeled issues via gh |
| 2 | Setup | auto-dev.sh | Create git worktree from origin/main |
| 3 | Execute | auto-dev.sh | Claude Code implements the fix headlessly + scope constraints (no cascade deletions) |
| 4 | Simplify | auto-dev.sh | Code quality pass (Claude reviews own work) + scope-limited (won't delete code the spec didn't mention) |
| 5 | Verify | auto-dev.sh | Gates 1–5: verify, file limits (≤15), line limits (≤500 total), dependency guard (positive-verb match), deletion budget (≤15 net / ≤40 gross). On success, writes sentinel.json — the handoff artifact (with net_deletions, head_sha) that activates the TypeScript layer |
| 6 | Panel Review | worker.ts | Always: code-reviewer, spec-compliance-checker, test-coverage-checker, scope-checker. Conditional: red-team (security files), ml-specialist (scoring files) |
| 6.5 | Simplify Filter | worker.ts | Remove overengineered review suggestions (internal step, not shown in diagram) |
| 7 | Fix | worker.ts | Apply actionable review findings |
| 8 | Re-verify | worker.ts | npm run verify after fixes |
| 9 | Publish | worker.ts | Create draft PR, post review brief |
| 10 | Self-Review | pr-self-review.ts | Review each PR diff, classify findings |
| 11 | Auto-Fix | pr-self-review.ts | Fix auto-fixable findings, re-verify, push |
| 12 | Triage | pr-self-review.ts | Deterministic: approve / self-fix / flag for human |
| 12+ | Wave Promotion | pr-self-review.ts | Promotes dependency-unblocked issues (Kahn's algorithm), labels them auto-ready + nightshift to re-enter the queue |
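
The numeric limits in Phase 5 can be read as a simple set of threshold checks. The sketch below is an interpretation, not the logic in `auto-dev.sh`: the `DiffStats` shape is hypothetical, and it assumes "net" means deleted lines minus added lines while "gross" means raw deleted lines.

```typescript
// Hypothetical diff summary; field names are assumptions for this sketch.
interface DiffStats {
  filesChanged: number;
  linesAdded: number;
  linesDeleted: number;
}

// Limits as listed for Phase 5 in the table above.
const LIMITS = { files: 15, totalLines: 500, netDeletions: 15, grossDeletions: 40 };

// Returns the names of the gates that fail; an empty array means all pass.
function checkLimits(d: DiffStats): string[] {
  const failures: string[] = [];
  if (d.filesChanged > LIMITS.files) failures.push("file-limit");
  if (d.linesAdded + d.linesDeleted > LIMITS.totalLines) failures.push("line-limit");
  // Assumption: net = deletions minus additions, gross = raw deletions.
  if (d.linesDeleted - d.linesAdded > LIMITS.netDeletions) failures.push("deletion-budget-net");
  if (d.linesDeleted > LIMITS.grossDeletions) failures.push("deletion-budget-gross");
  return failures;
}
```

Returning the list of failed gates, rather than a single boolean, makes it easier to report why an issue was rejected.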
| Variable | Default | Description |
|---|---|---|
| `TARGET_REPO` | `~/Repos/your-target-repo` | Repository to scan and process |
| `STATE_DIR` | `~/.auto-dev` | Nightshift state, logs, and run artifacts |
| `BUGBOT_STATE` | `~/.bugbot` | Bugbot state directory |
| `SCAN_ROOT` | Same as `TARGET_REPO` | Bugbot scan target (if different) |
| `BUGBOT_ROOT` | `~/Repos/auto-dev/bugbot` | Bugbot source directory |
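
As an illustration of how these defaults fit together, including `SCAN_ROOT` falling back to `TARGET_REPO`, here is a small resolver. The `Config` shape and function names are hypothetical; the real tools may read the environment differently.

```typescript
// Hypothetical config shape mirroring the variables in the table above.
interface Config {
  targetRepo: string;
  stateDir: string;
  bugbotState: string;
  scanRoot: string;
  bugbotRoot: string;
}

// Expand a leading "~" using the supplied home directory.
function expandHome(p: string, home: string): string {
  return p === "~" || p.startsWith("~/") ? home + p.slice(1) : p;
}

function resolveConfig(env: Record<string, string | undefined>, home: string): Config {
  const get = (key: string, fallback: string) => expandHome(env[key] ?? fallback, home);
  const targetRepo = get("TARGET_REPO", "~/Repos/your-target-repo");
  return {
    targetRepo,
    stateDir: get("STATE_DIR", "~/.auto-dev"),
    bugbotState: get("BUGBOT_STATE", "~/.bugbot"),
    scanRoot: get("SCAN_ROOT", targetRepo), // defaults to TARGET_REPO
    bugbotRoot: get("BUGBOT_ROOT", "~/Repos/auto-dev/bugbot"),
  };
}
```

In a Node script this would typically be called as `resolveConfig(process.env, os.homedir())`; taking both as parameters keeps the function easy to test.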
The circuit breaker (`--max-failures N`, default 3) halts the queue after N consecutive systemic failures (crashes, timeouts, setup errors). Spec-level failures (verify gate, panel review) reset the counter. A PID lock at `$STATE_DIR/nightshift.pid` prevents overlapping runs.
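
The counting rule for the circuit breaker can be sketched in a few lines. The class and the `Outcome` type are hypothetical, and treating a successful issue as a reset is an assumption; the README only states that spec-level failures reset the counter.

```typescript
type Outcome = "success" | "spec-failure" | "systemic-failure";

class CircuitBreaker {
  private consecutive = 0;
  private readonly maxFailures: number;

  constructor(maxFailures = 3) {
    this.maxFailures = maxFailures;
  }

  // Record one issue's outcome; returns true when the queue should halt.
  record(outcome: Outcome): boolean {
    if (outcome === "systemic-failure") {
      this.consecutive++;
    } else {
      // Spec-level failures reset the counter; success resetting is assumed.
      this.consecutive = 0;
    }
    return this.consecutive >= this.maxFailures;
  }
}
```

Separating the two failure kinds keeps a run of hard-to-fix issues from stopping the queue, while a broken environment still trips the breaker quickly.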
Nightshift includes a dependency-aware promoter (`nightshift promote`) that uses Kahn's algorithm to find issues whose dependencies are all satisfied (closed or PR-ready). Promotion uses an allowlist approach: only issues whose labels are all in the `AUTONOMOUS_LABELS` set get promoted. Any unknown label blocks promotion (fails safe). When promotion succeeds, issues are labeled `auto-ready` + `nightshift`, re-entering the queue for the next nightshift run. This creates the self-sustaining feedback loop shown as the dashed arrow in the diagram.
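
To make the dependency step concrete, here is a sketch of Kahn's algorithm grouped into waves. The `Issue` shape and `planWaves` are hypothetical, and it assumes a dependency that is not in the open set (already done, or unknown) does not block; the real promoter may differ.

```typescript
interface Issue {
  id: number;
  dependsOn: number[]; // ids of issues that must be done first
  done: boolean; // closed or PR-ready
}

// Kahn's algorithm, grouped into waves: wave 0 can be promoted now,
// wave 1 once wave 0 is done, and so on. Issues in a cycle never appear.
function planWaves(issues: Issue[]): number[][] {
  const open = issues.filter((i) => !i.done);
  const openIds = new Set(open.map((i) => i.id));
  const inDegree = new Map<number, number>();
  const dependents = new Map<number, number[]>();

  for (const issue of open) {
    // Only dependencies that are still open count as blocking edges.
    const blocking = issue.dependsOn.filter((d) => openIds.has(d));
    inDegree.set(issue.id, blocking.length);
    for (const d of blocking) {
      dependents.set(d, [...(dependents.get(d) ?? []), issue.id]);
    }
  }

  const waves: number[][] = [];
  let frontier = open.filter((i) => inDegree.get(i.id) === 0).map((i) => i.id);
  while (frontier.length > 0) {
    waves.push(frontier);
    const next: number[] = [];
    for (const id of frontier) {
      for (const dep of dependents.get(id) ?? []) {
        const remaining = inDegree.get(dep)! - 1;
        inDegree.set(dep, remaining);
        if (remaining === 0) next.push(dep);
      }
    }
    frontier = next;
  }
  return waves;
}
```

The first wave is the set a single `nightshift promote` call would consider; issues caught in a dependency cycle never reach in-degree zero, so they are left for a human to untangle.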

Label lifecycle:

    [bugbot files issue]
      → adds: auto-ready + nightshift + bugbot + bugbot:{category} + priority:{level}
      → [nightshift picks up]
          → removes: auto-ready + nightshift
          → [success]    adds: auto-pr-ready (draft PR created)
          → [no changes] issue closed with comment (no label added)
          → [failure]    adds: auto-failed

- `nightshift`: scoped for autonomous processing
- `auto-ready`: specced and ready (broader gate)
- `auto-pr-ready`: PR created, awaiting human merge
- `auto-failed`: pipeline failed, needs investigation
- `bugbot`: filed by bugbot (vs. manually created)
- `bugbot:{category}`: scanner category (e.g., `bugbot:dead-code`, `bugbot:type-holes`)
- `priority:{level}`: severity-based priority (e.g., `priority:high`, `priority:low`)
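
The transitions above amount to a small pure function over an issue's label set. This is a sketch with hypothetical names (`applyOutcome`, `Transition`); the actual pipeline applies labels through `gh`.

```typescript
type Result = "success" | "no-changes" | "failure";

interface Transition {
  labels: string[];
  closeIssue: boolean;
}

// Apply the pickup and outcome steps from the lifecycle to an issue's labels.
function applyOutcome(labels: string[], result: Result): Transition {
  // Pickup removes the two queue labels regardless of outcome.
  const next = labels.filter((l) => l !== "auto-ready" && l !== "nightshift");
  if (result === "success") next.push("auto-pr-ready");
  if (result === "failure") next.push("auto-failed");
  // "no-changes" adds no label; the issue is closed with a comment instead.
  return { labels: next, closeIssue: result === "no-changes" };
}
```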
The wave promoter only promotes issues whose labels are all in the autonomous-compatible set. Labels outside this set block promotion (unknown labels fail safe).
Compatible labels: `bug`, `enhancement`, `documentation`, `frontend`, `pipeline`, `extraction`, `dir:*`, `tier:*`, `priority:*`, `bugbot`, `bugbot:*`, `nightshift`, `auto-ready`, `auto-pr-ready`, `auto-failed`, `auto-review-failed`
Blocked by default: `research`, `design`, `admin`, `in-progress`, `blocked`, and any label not in the set above. To make a new label autonomous-compatible, add it to `AUTONOMOUS_LABELS` in `nightshift/src/promoter.ts`.
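
The allowlist check can be illustrated like this. The array below copies the compatible-labels list from this README, but treating entries ending in `*` as prefix matches is an assumption, and the real `AUTONOMOUS_LABELS` in `nightshift/src/promoter.ts` may be structured differently.

```typescript
// Copied from the compatible-labels list; entries ending in "*" are
// treated as prefixes (an assumption about the wildcard semantics).
const AUTONOMOUS_LABELS = [
  "bug", "enhancement", "documentation", "frontend", "pipeline", "extraction",
  "dir:*", "tier:*", "priority:*", "bugbot", "bugbot:*", "nightshift",
  "auto-ready", "auto-pr-ready", "auto-failed", "auto-review-failed",
];

function isAllowed(label: string): boolean {
  return AUTONOMOUS_LABELS.some((pattern) =>
    pattern.endsWith("*") ? label.startsWith(pattern.slice(0, -1)) : label === pattern,
  );
}

// Promotable only if every label is allowed; one unknown label blocks.
function canPromote(labels: string[]): boolean {
  return labels.every(isAllowed);
}
```

Because the check is an allowlist, a label someone invents next month blocks promotion until it is explicitly added, which is the fail-safe behavior described above.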
