
田螺姑娘 (tianluo)

🌐 Languages: English · 简体中文

Keep Claude Code on a multi-hour task without it pinging you at 3am. Tianluo is the methodology that survives context compaction, plans every fork up front, diagnoses failures structurally, and knows when to escalate versus retry — so a single instruction starts an 18-hour run that ends with one report, not a wakeup call.

License: MIT · Claude Plugin


What it actually does

Five concrete capabilities. Each one fixes a specific failure mode that breaks long-running agent runs:

| Capability | Failure it fixes |
|---|---|
| **Persistent state survives context compaction** — `state.json` on shared storage, atomically updated; cron prompts rebuild context from scratch on every wake-up | Agent forgets the `task_id` it just submitted, leaks an unowned external job |
| **Plan-time fork enumeration** — every future user decision is listed as Q1..Qn before the run starts, answered once; the run then reaches a terminal state without further questions | Agent pauses at hour 4 asking permission for a routine yes/no, idles until you wake up |
| **5-layer structured diagnosis** — never decide "task is stuck" from one signal; check scheduler / container / log / progress / output before classifying | Agent triggers `manual_hold` on slices that already finished, or retries phantom failures |
| **Budget-bounded retry with escalation ladder** — transient → patch → replan → human; never infinite retry, never retry a code-level bug | Agent burns hours retrying a real bug, or gives up on a recoverable transient blip |
| **File-role strict separation** — machine state (JSON), human plan (frozen Markdown), failure artifacts (append-only), cross-session memory (curated) — never merged | State gets polluted by natural-language narrative, can't be grepped or hook-validated |

These are five inversions of common multi-hour-agent failure modes, distilled into a single Claude Code skill plugin.
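The budget-bounded retry ladder above can be sketched in a few lines. This is an illustrative Python sketch, not code from the skill: the level names match the ladder (transient → patch → replan → human), but the class, budgets, and action strings are invented for demonstration.

```python
# Hypothetical sketch of a budget-bounded retry ladder. Budgets and action
# names are illustrative; the real skill describes this in prose, not code.
from dataclasses import dataclass, field

LADDER = ["transient", "patch", "replan", "human"]   # escalation order
BUDGETS = {"transient": 3, "patch": 2, "replan": 1}  # per-level retry caps

@dataclass
class RetryState:
    spent: dict = field(default_factory=lambda: {k: 0 for k in BUDGETS})

    def next_action(self, failure_class: str) -> str:
        """Return the action for a classified failure, escalating one rung
        whenever a level's budget is exhausted. 'human' means manual_hold —
        it is never retried automatically."""
        level = failure_class
        while level != "human" and self.spent[level] >= BUDGETS[level]:
            level = LADDER[LADDER.index(level) + 1]  # escalate one rung
        if level == "human":
            return "manual_hold"
        self.spent[level] += 1
        return f"retry_{level}"

s = RetryState()
# three transient retries; the fourth transient failure escalates to patch
actions = [s.next_action("transient") for _ in range(4)]
```

The key property is that every path terminates: each level has a finite budget, so a persistent failure always reaches `manual_hold` instead of looping forever.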


Why "tianluo"?

田螺姑娘 (Tianluo Guniang) is a Chinese folk tale: a fisherman returns home each day to find his hut tidied and dinner ready. A snail-spirit had been quietly doing the work while he was at sea — never asking, never interrupting, only present when something genuinely required him.

That's the contract: quiet progress while you're away, a single report on return, a whisper only when something is genuinely out of scope.


What this is (and isn't)

This is a methodology skill — a SKILL.md plus reference docs and templates that teach Claude Code how to design and drive long-horizon, cron-driven tasks. It is not a framework, runtime, or executable. Nothing to install beyond the plugin itself; you bring your own scheduler, storage, and task domain.

  • In scope: persistent state design, self-contained cron prompts, retry-budget ladders, multi-layer failure diagnosis, plan-time fork enumeration, file-role separation
  • Out of scope: any specific cloud, scheduler, framework, or task domain. The skill speaks in generic terms ("scheduler", "shared storage", "instance × stream × role"); you map them to your stack

Where it fits: any task that runs for hours under a cron-like trigger, has multiple tracked entities, has pollable state, has classifiable failure modes, and has an auto-decidable terminal state. Concrete examples: scheduled batch jobs, multi-stage build → test → deploy → verify pipelines, parameter sweeps, multi-target health monitors, long CI suites, data-pipeline runs, training/eval workflows, simulation sweeps. The skill doesn't care which.

Where it doesn't fit: sub-10-minute tasks (just run them yourself), tasks with no external state to poll (the agent has to hold the connection), tasks where every step needs human judgement (use a different pattern).


Compatibility

The methodology itself is agent-agnostic — SKILL.md, the four reference docs, and the four templates are plain Markdown / JSON with no Claude-specific code. Different agents reach it via different install paths:

| Agent | Install path | Notes |
|---|---|---|
| Claude Code (CLI) | `/plugin marketplace add yeewangcn/tianluo`, then `/plugin install tianluo@tianluo` | Native, one step. The skill loads automatically when you ask about long-running orchestration. |
| Claude.ai (web) | Settings → Capabilities → Skills → upload `plugins/tianluo/skills/tianluo/` as a custom skill | Manual upload of the `skills/tianluo/` directory. |
| Claude API | Reference SKILL.md content via the `/v1/skills` endpoint | Paste the SKILL.md content into a skill resource. |
| Codex / GitHub Copilot / Cursor / Aider / Continue / Gemini CLI / OpenCode / Windsurf | `git clone https://github.com/yeewangcn/tianluo` and point your agent at `plugins/tianluo/skills/tianluo/SKILL.md` as a system-prompt reference or rules file | The methodology is just Markdown — any agent that can read a `.md` rules file or system prompt can use it. Format may need light adaptation per agent (e.g. Cursor's `.cursor/rules`, Aider's `.aider.conf`). |

Bottom line: install via Claude Code if you can; clone-and-reference works for everyone else. Either way you get the same SKILL.md, the same reference docs, the same templates.


Quickstart

Install via Claude Code plugin marketplace

```
/plugin marketplace add yeewangcn/tianluo
/plugin install tianluo@tianluo
```

The skill becomes available as tianluo and Claude Code will load SKILL.md automatically when you ask about long-running orchestration.

Use the skill

Ask Claude something like:

"I have a 14-hour multi-stage job to run overnight across 6 variants. Use the tianluo skill to design the orchestration."

Claude will then:

  1. Read SKILL.md (the 8 invariants and 5 components)
  2. Apply templates/plan-doc-skeleton.md to enumerate every future fork once, up front
  3. Wait for you to answer all Q1...Qn decisions
  4. Build a self-contained cron prompt from templates/cron-prompt-skeleton.md
  5. Drive the run autonomously, escalating to manual_hold only on real ambiguity

The 4 unbreakable rules (essence)

The full SKILL.md lists 8 invariants and 5 components. If you read nothing else, the four rules below are the delta vs other long-running-agent patterns (Anthropic Effective Harnesses, Manus, Plan Mode, ReWOO, StateFlow, Cursor, etc.):

① Submit ≡ Record

The very next tool call after submit must write task_id to state.json. They form an atomic pair. If the prompt is compacted between submit and record, you've leaked an unowned external job. (This is the most common sin in multi-hour runs and the hardest to recover from.)
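The "atomic pair" can be made concrete with a small sketch. `submit_job` is a stand-in for whatever scheduler API you use; the part that matters is the state write, which goes through a temp file plus `os.replace` so a cron reader never sees a half-written `state.json`. The file layout is illustrative, not the skill's real schema.

```python
# Illustrative sketch of the submit-record atomic pair. The temp-file +
# os.replace dance guarantees readers see either the old or the new state,
# never a torn write.
import json, os, tempfile

def record_task_id(state_path: str, entity: str, task_id: str) -> None:
    with open(state_path) as f:
        state = json.load(f)
    state["entities"][entity]["task_id"] = task_id
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(state_path) or ".")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f, indent=2)
    os.replace(tmp, state_path)  # atomic on POSIX filesystems

# usage: the record call is the very next step after submit — nothing between
# task_id = submit_job(...)               # external submission (stand-in)
# record_task_id("state.json", "slice_03", task_id)
```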

② Plan-time fork enumeration

Before the cron starts, walk the entire task chain and list every future user-decision-fork as Q1...Qn — branch decisions, resource thresholds, dependencies, fallbacks, output destinations. The user answers them all once. The cron then runs to terminal state with zero further interruptions, except for the explicit manual_hold triggers you declared.

This is what separates tianluo from "ask when stuck" patterns. An agent that mid-run pings the user at 3am for routine confirmation is broken design, not poor execution.
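A hypothetical Q1..Qn fragment shows what plan-time enumeration looks like in practice (the real skeleton is `templates/plan-doc-skeleton.md`; the questions and answers here are invented for illustration):

```markdown
## Decisions (answer all before the run starts)

- **Q1 (branch):** if stage-2 accuracy is below threshold, continue to
  stage 3 or stop the variant? → _stop the variant_
- **Q2 (resource):** max concurrent jobs when the queue is free? → _4_
- **Q3 (fallback):** shared storage unreachable for over 10 minutes —
  retry or manual_hold? → _manual_hold_
- **Q4 (output):** where does the final report go? → _reports/<exp>_report.md_
```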

③ File-role strict separation

Four file types, four roles, never merged:

| File | Reader | Format | Mutability |
|---|---|---|---|
| `state.json` | machine (cron, hooks) | JSON | high-frequency overwrite |
| `<exp>_plan.md` | human | Markdown | frozen at plan time |
| `failures/<entity>_<n>.md` | human (postmortem) | Markdown | append-only artifacts |
| `~/.claude/.../memory/feedback_*.md` | future agent sessions | Markdown | curated, cross-session |

The Manus 3-file pattern (task_plan + findings + log mono-file) looks tidy but breaks here: state mutates inside a markdown narrative, you can't grep it, you can't pre-commit-hook-validate it, and natural language pollution causes self-reference drift in cron.
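The hook-validation claim is easy to demonstrate: because `state.json` is pure JSON with no narrative, a pre-commit hook can mechanically reject a broken state file. This is a sketch under assumed conventions — the required keys here are invented, not the skill's real schema.

```python
# Sketch of a pre-commit validator that strict file roles make possible.
# Required keys are illustrative; substitute your own state schema.
import json, sys

REQUIRED_TOP_LEVEL = {"phase", "entities", "updated_at"}

def validate_state(path: str) -> list[str]:
    """Return a list of problems; an empty list means the file is valid."""
    try:
        with open(path) as f:
            state = json.load(f)
    except (OSError, json.JSONDecodeError) as e:
        return [f"unreadable or not JSON: {e}"]
    problems = [f"missing key: {k}"
                for k in sorted(REQUIRED_TOP_LEVEL - state.keys())]
    if not isinstance(state.get("entities"), dict):
        problems.append("entities must be an object keyed by entity id")
    return problems

if __name__ == "__main__":
    issues = validate_state(sys.argv[1] if len(sys.argv) > 1 else "state.json")
    for msg in issues:
        print("state.json:", msg)
    sys.exit(1 if issues else 0)
```

A mono-file like `progress.md`, where state mutates inside prose, admits no such check — which is exactly the argument against merging roles.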

④ The cron prompt is ground truth — not the plan doc

Plan doc is the human contract at plan-time. Once cron is running, the agent never reads its own plan.md as runtime ground truth. Runtime ground truth is cron prompt + state.json only. This is the deliberate inverse of Claude Code's Plan Mode and Manus's PreToolUse re-read pattern: those approaches risk the agent narrating itself into drift.
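What a wake-up handler looks like under this rule can be sketched as follows. Phase names, statuses, and action strings are all illustrative; the point is what is deliberately absent — no read of the plan doc, no parsing of free-text narrative.

```python
# Sketch of a cron wake-up that treats cron prompt + state.json as the only
# runtime ground truth. Everything is rebuilt from the state file each wake-up.
import json

def decide_next_action(state_path: str) -> str:
    with open(state_path) as f:
        state = json.load(f)          # full context rebuilt from scratch
    phase = state["phase"]
    if phase == "terminal":
        return "write_final_report"
    if phase == "manual_hold":
        return "noop_wait_for_human"  # the human owns the next transition
    pending = [k for k, v in state["entities"].items()
               if v.get("status") != "done"]
    return f"poll:{','.join(sorted(pending))}" if pending else "advance_phase"
```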


Reference docs

| File | Topic |
|---|---|
| `SKILL.md` | Core methodology: 5 components + 8 invariants + when to use |
| `reference/5-layer-debugging.md` | Scheduler / container / log / progress / output — never diagnose from one layer |
| `reference/retry-budget-ladder.md` | StateFlow-style 4-level escalation: transient → patch → replan → human |
| `reference/plan-time-fork-types.md` | The 5 fork categories + anti-patterns |
| `reference/industry-comparison.md` | Side-by-side with 13 other frameworks |

| Template | Use |
|---|---|
| `templates/state-schema.json` | Skeleton for `state.json` |
| `templates/cron-prompt-skeleton.md` | Self-contained cron prompt (500–1500 words) |
| `templates/plan-doc-skeleton.md` | Plan doc with Q1..Qn and Success Gate |
| `templates/failure-analysis.md` | What to write at `manual_hold` |

Case study

docs/case-studies/overnight-18h-ml-run.md — an 18-hour autonomous overnight run across a 4-stage pipeline (data prep → main job → output selection → scoring), operator asleep the entire time. Single block of terminal report next morning. Zero mid-run interruptions. Three near-miss interruptions each prevented by a specific invariant. The example domain is ML, but the same five capabilities apply identically to any other long-running multi-stage workload.


How tianluo compares

| Approach | Plan once? | Cron self-contained? | Failure artifacts | File roles | HITL model |
|---|---|---|---|---|---|
| **tianluo** | yes (forks enumerated) | yes | independent `failures/*.md` | strict 4-way separation | conditional + plan-time enumerated |
| Anthropic Effective Harnesses | yes | partial (`init.sh`) | inlined in `progress.txt` | progress + git only | approval |
| Anthropic Plan Mode | yes | n/a (single-pass) | n/a | `plan.md` re-read | approval |
| Manus 3-file | mid-run editable | no (re-reads plan) | inlined in `progress.md` | mono-file | varies |
| ReWOO | fully frozen | n/a (worker-fills) | re-plan on fail | n/a | n/a |
| StateFlow | one FSM | event-driven | failure transitions | n/a | varies |
| Cursor Self-Driving | mid-run editable | judge-driven | judge artifacts | git history | audit |
| LangGraph checkpointers | mid-run editable | per-node | exception nodes | JSON snapshots | varies |

Full breakdown: reference/industry-comparison.md.


Project-specific knowledge does NOT belong in this skill

A core discipline: the skill stays pure methodology. Domain-specific pitfalls (one framework's quirks, one project's data layout) belong in your project's:

  • ~/.claude/projects/<path>/memory/feedback_*.md for cross-session pitfalls
  • <project>/experiments/<exp>_plan.md / _findings.md / _lessons.md for run-specific notes

Don't bloat SKILL.md with "X framework on Y filesystem triggers race Z" — that knowledge is specific to those two components and lives in project memory.


Contributing

Issues and PRs welcome. The skill is opinionated by design; if you propose adding a new invariant or component, please cite at least one concrete failure case where the existing 8 invariants weren't sufficient.

When proposing changes:

  • Methodology only — no domain-specific examples in SKILL.md (those go in reference/*.md clearly marked as illustrative)
  • Keep SKILL.md ≤ 250 lines; reference docs absorb depth
  • Add comparison entries to industry-comparison.md if you reference a new framework

License

MIT — see LICENSE.

Author

Built by @yeewangcn. The methodology was extracted from many failed multi-hour runs where the failure mode was always "the agent paused asking permission for something it could have asked at plan-time" or "the agent's own narrative drifted across context compactions". Each of the eight invariants in SKILL.md is the inversion of one specific recurring failure.
