Time-travel chat: snapshot + rollback + lessons subsystem #242

HomenShum · 2026-04-26T22:37:54Z

Why now

While dogfooding the Phase 1 decompose flow from #225 (PR #241), I hit a failure mode that the existing architecture has no graceful recovery for: the agent kept retrying fs.replace against fragile whitespace, broke the file structure, then tried to "fix" the damage and broke it more. ~14 minutes of compute lost; the only remedy was closing the design and starting over.

The existing primitives are all there (snapshots-db, action-timeline, DiagnosticEventRow) — they just aren't being woven into a recovery loop. This Discussion proposes the shape of that loop before any code lands.

This isn't blocking #225's decompose PR. But it's a hard prereq for hands-off async use — if the user steps away mid-run, the agent must be able to recover from its own missteps. Otherwise async work is dead on arrival.

Proposed shape — 3 layers

Layer	Trigger	State	Cost
1. Auto-snapshot per turn	Pre-hook before any `fs.write` / `fs.replace` / `fs.insert` agent tool call	Extends `snapshots-db` with `(designId, turnId, filePath, sha256)`, content-deduped, capped at 100 most-recent per design (BOUND rule)	$0
2. Chat-native rollback	User types `/rollback`, `/rollback 3`, "undo that", "revert that" in composer; OR agent self-invokes after detecting damage	Restores files from snapshot(s); emits typed `rollback` diagnostic event into chat thread as a visible audit message	$0
3. Lesson capture + injection	Toast after rollback asks "what should the agent do differently next time?" → user types one sentence (skip-able) → stored in `LESSONS.md` per-design → top-K relevant lessons injected into next agent system prompt	`LESSONS.md` in design root, structured JSON-in-markdown for human readability	~50 input tokens per relevant lesson per turn
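
As a rough illustration of layer 1, the sketch below shows one way content dedup and the 100-per-design cap could interact. It is an in-memory stand-in, not the actual `snapshots-db` code: `SnapshotStore`, `record`, `contentAt`, and the helper methods are hypothetical names, and it assumes rows are appended in turn order.

```typescript
import { createHash } from "node:crypto";

// Hypothetical row shape mirroring the (designId, turnId, filePath, sha256) tuple.
interface SnapshotRow {
  designId: string;
  turnId: number;
  filePath: string;
  sha256: string;
}

const MAX_SNAPSHOTS_PER_DESIGN = 100; // retention cap from the proposal

// In-memory stand-in for snapshots-db; the real store is assumed to be persistent.
class SnapshotStore {
  private rows: SnapshotRow[] = [];
  private blobs = new Map<string, string>(); // sha256 -> content, stored once

  // Meant to run as a pre-hook, before a destructive fs tool touches filePath.
  record(designId: string, turnId: number, filePath: string, content: string): SnapshotRow {
    const sha256 = createHash("sha256").update(content).digest("hex");
    if (!this.blobs.has(sha256)) this.blobs.set(sha256, content); // content dedup
    const row: SnapshotRow = { designId, turnId, filePath, sha256 };
    this.rows.push(row);
    this.evictOldest(designId);
    return row;
  }

  // Drop the earliest-recorded rows of one design beyond the cap, then unreferenced blobs.
  private evictOldest(designId: string): void {
    const mine = this.rows.filter((r) => r.designId === designId);
    const excess = mine.length - MAX_SNAPSHOTS_PER_DESIGN;
    if (excess <= 0) return;
    const dropped = new Set<SnapshotRow>(mine.slice(0, excess));
    this.rows = this.rows.filter((r) => !dropped.has(r));
    const live = new Set<string>(this.rows.map((r) => r.sha256));
    Array.from(this.blobs.keys()).forEach((hash) => {
      if (!live.has(hash)) this.blobs.delete(hash);
    });
  }

  // Content of the newest retained snapshot of filePath recorded at or before turnId.
  contentAt(designId: string, filePath: string, turnId: number): string | undefined {
    for (let i = this.rows.length - 1; i >= 0; i--) {
      const r = this.rows[i];
      if (r.designId === designId && r.filePath === filePath && r.turnId <= turnId) {
        return this.blobs.get(r.sha256);
      }
    }
    return undefined;
  }

  rowCount(designId: string): number {
    return this.rows.filter((r) => r.designId === designId).length;
  }

  blobCount(): number {
    return this.blobs.size;
  }
}
```

Keeping rows and blobs separate means repeated snapshots of an unchanged file cost one row each but only one stored copy of the content, which is what makes a per-turn snapshot cheap.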

Why all three layers, not just snapshot+rollback

Without layer 3, you have git stash. With layer 3, the agent stops repeating the same mistake: its next system prompt includes a lesson such as "Last time you tried find-and-insert against this file's whitespace, it broke the file. Prefer read-then-replace." This is the meta-cognition piece that turns rollback from "user fixes their tool" into "tool fixes itself."

Per-design isolation is intentional — what failed in one design doesn't pollute unrelated work. Cross-design "global guardrails" defer to a future iteration once we know which lessons actually generalize.
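
To make the per-design `LESSONS.md` more concrete, here is one possible JSON-in-markdown layout with a round-trip reader. The field names (`id`, `pinned`, `text`, `tools`) and the functions `serializeLessons` / `parseLessons` are assumptions for illustration; the proposal only fixes the file name and the "structured JSON-in-markdown" idea.

```typescript
// Hypothetical lesson record; the proposal does not define the fields.
interface Lesson {
  id: string;
  pinned: boolean;
  text: string;
  tools: string[];
}

// Built at runtime so this example does not contain a literal code fence.
const FENCE = "`".repeat(3);

// Emit one heading plus one fenced JSON block per lesson, sorted by id so that
// the same set of lessons always produces byte-identical output.
function serializeLessons(lessons: Lesson[]): string {
  const sorted = lessons.slice().sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
  const blocks = sorted.map((l) => {
    // Keys are listed explicitly to keep their order fixed across writes.
    const json = JSON.stringify(
      { id: l.id, pinned: l.pinned, text: l.text, tools: l.tools.slice().sort() },
      null,
      2,
    );
    return "## " + l.id + "\n\n" + FENCE + "json\n" + json + "\n" + FENCE + "\n";
  });
  return ["# Lessons", ""].concat(blocks).join("\n");
}

// Read every fenced JSON block back; anything outside the fences is ignored,
// so humans can add free-form notes between entries.
function parseLessons(markdown: string): Lesson[] {
  const pattern = new RegExp(FENCE + "json\\n([\\s\\S]*?)\\n" + FENCE, "g");
  const lessons: Lesson[] = [];
  let match: RegExpExecArray | null = pattern.exec(markdown);
  while (match !== null) {
    lessons.push(JSON.parse(match[1]) as Lesson);
    match = pattern.exec(markdown);
  }
  return lessons;
}
```

Because newlines inside JSON strings are escaped, a lesson's text cannot close a block early, and the sorted output keeps diffs small when the file is edited by hand.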

Anti-spiral safety net (deserves its own callout)

After 3 consecutive turns that touch the same file with no measurable progress (defined as: file size oscillating, or same tool call signature repeating with different args), the orchestrator should auto-pause and ask the user:

"I've tried 3 times to add the tab wrapper and the file structure broke each time. Should I rollback to turn 42 and try a different approach, or do you want to take over manually?"

This is the spiral-circuit-breaker. Without it, the agent burns tokens forever. (Filed as its own Discussion: "Spiral detector: auto-pause after N consecutive non-progressive turns".)
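
The trigger condition above could be checked roughly as follows. This is a sketch under stated assumptions: `TurnRecord` and its fields are hypothetical, "oscillating" is interpreted as the file size changing direction between consecutive turns, and "different args" is interpreted as every turn in the window carrying a distinct argument digest.

```typescript
// Hypothetical per-turn record; field names are illustrative, not an existing schema.
interface TurnRecord {
  turnId: number;
  filePath: string;
  toolName: string;
  argsDigest: string; // any stable hash of the tool arguments
  fileSizeAfter: number; // bytes on disk once the turn finished
}

const SPIRAL_WINDOW = 3; // "3 consecutive turns" from the proposal

// True when successive size changes flip direction (grow then shrink, or the reverse).
function sizeOscillates(sizes: number[]): boolean {
  for (let i = 2; i < sizes.length; i++) {
    const previousDelta = sizes[i - 1] - sizes[i - 2];
    const nextDelta = sizes[i] - sizes[i - 1];
    if (previousDelta * nextDelta < 0) return true;
  }
  return false;
}

// True when every turn used the same tool but the arguments changed each time.
function sameToolDifferentArgs(turns: TurnRecord[]): boolean {
  const tools = new Set<string>(turns.map((t) => t.toolName));
  const args = new Set<string>(turns.map((t) => t.argsDigest));
  return tools.size === 1 && args.size === turns.length;
}

// Decide whether the orchestrator should pause and ask the user.
function shouldAutoPause(history: TurnRecord[]): boolean {
  if (history.length < SPIRAL_WINDOW) return false;
  const recent = history.slice(-SPIRAL_WINDOW);
  const sameFile = recent.every((t) => t.filePath === recent[0].filePath);
  if (!sameFile) return false;
  return sizeOscillates(recent.map((t) => t.fileSizeAfter)) || sameToolDifferentArgs(recent);
}
```

One caveat with this reading: the second heuristic would also fire on three legitimate `fs.insert` calls that steadily grow the same file, so a real detector would probably need an additional progress signal before pausing.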

Implementation outline (8 items, sequenced as 2 phases: PR-A and PR-B)

PR-A foundation (ship first) — items 1-3 below give chat-native rollback alone, which would have saved my 14 minutes:

1. Extend apps/desktop/src/main/snapshots-db.ts schema: turnId, toolName, parentTurnId, content sha-dedup, retention cap 100/design
2. New packages/core/src/tools/snapshot-checkpoint.ts: pre-hook fires implicitly before destructive fs tools, not agent-callable
3. New packages/core/src/tools/rollback-to-checkpoint.ts: agent-callable + chat-keyword interceptor in store.ts (fuzzy /rollback, "undo that", "revert that") + rollback chat message kind for visible audit
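
A minimal sketch of the chat-keyword interceptor from the rollback item might look like this. Everything here is an assumption for discussion: `RollbackIntent` and `parseRollbackIntent` are hypothetical names, the number in `/rollback 3` is read as "undo 3 turns" (the proposal does not pin this down), and matching is deliberately strict rather than fuzzy so that ordinary sentences mentioning "revert" are not intercepted.

```typescript
// Hypothetical intent shape produced by the composer-side interceptor.
interface RollbackIntent {
  steps: number; // how many turns to undo (assumed meaning of "/rollback 3")
  source: "slash" | "keyword";
}

// The whole message must be the command or phrase; partial matches are ignored.
const SLASH_PATTERN = /^\/rollback(?:\s+(\d+))?$/;
const KEYWORD_PATTERN = /^(?:please\s+)?(?:undo|revert)\s+that[.!]?$/;

function parseRollbackIntent(message: string): RollbackIntent | null {
  const text = message.trim().toLowerCase();
  const slash = SLASH_PATTERN.exec(text);
  if (slash !== null) {
    const steps = slash[1] === undefined ? 1 : parseInt(slash[1], 10);
    // "/rollback 0" is treated as a no-op and passed through as a normal message.
    return steps >= 1 ? { steps, source: "slash" } : null;
  }
  if (KEYWORD_PATTERN.test(text)) return { steps: 1, source: "keyword" };
  return null;
}
```

Returning `null` for anything that is not an exact match keeps the fast path predictable; a looser match could be layered on later with a confirmation step.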

PR-B learning (ship second) — items 4-7 layer the lesson loop on top:
4. New packages/core/src/tools/capture-lesson.ts — sort-stable JSON-in-markdown writer to LESSONS.md
5. New apps/desktop/src/renderer/src/hooks/lessonCapturePrompt.ts — locale-aware EN/ZH toast prompt fired after rollback (skip-able)
6. Extend packages/core/src/agent.ts system prompt builder — read LESSONS.md, filter relevant by upcoming intent (string-match on tool name), inject top-5 verbatim before user instruction
7. Files panel surfaces LESSONS.md as editable — bad lessons get pruned by humans, pinned lessons never expire
8. Scenario tests: rollback during streaming, rollback past retention, dedup, lesson-vs-intent contradiction handling, 1000-turn long-running session memory bound
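
The filter-and-inject step of the system prompt builder could be sketched as below. The `Lesson` fields, `selectLessons`, `buildSystemPrompt`, and the ranking rule (pinned first, then most recently used) are assumptions; the outline only specifies a string match on tool name and a top-5 cut.

```typescript
// Hypothetical lesson record as it might look after parsing LESSONS.md.
interface Lesson {
  id: string;
  text: string;
  tools: string[]; // tool names the lesson applies to, e.g. "fs.replace"
  pinned: boolean;
  lastUsedTurn: number;
}

const TOP_K = 5; // "inject top-5" from the outline

// Keep lessons that mention the upcoming tool, rank them, and cut to k.
function selectLessons(lessons: Lesson[], upcomingTool: string, k: number = TOP_K): Lesson[] {
  return lessons
    .filter((l) => l.tools.indexOf(upcomingTool) !== -1) // filter() copies, so sort() is safe
    .sort((a, b) => {
      if (a.pinned !== b.pinned) return a.pinned ? -1 : 1; // pinned first
      if (a.lastUsedTurn !== b.lastUsedTurn) return b.lastUsedTurn - a.lastUsedTurn; // then recent
      return a.id < b.id ? -1 : a.id > b.id ? 1 : 0; // deterministic tie-break
    })
    .slice(0, k);
}

// Append the selected lessons verbatim to the base system prompt.
function buildSystemPrompt(basePrompt: string, lessons: Lesson[], upcomingTool: string): string {
  const picked = selectLessons(lessons, upcomingTool);
  if (picked.length === 0) return basePrompt;
  const lines = picked.map((l) => "- " + l.text);
  return [basePrompt, "", "Lessons from earlier rollbacks in this design:"].concat(lines).join("\n");
}
```

With a fixed `k`, the prompt overhead stays bounded no matter how many lessons accumulate in the file.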

Risks & mitigations

Risk	Mitigation
Lesson pollution balloons system prompt	Cap top-10 per turn, age-out unused after 30 turns, pin permanent ones manually
Rollback during streaming corrupts in-flight write	Abort generation first via `AbortController`, then snapshot-restore (same pattern `config:v1:test-endpoint` uses)
User reverts past snapshot window	Honest `{ok: false, error: 'snapshot_expired', oldestAvailable: turnN}` — never fake-success (HONEST_STATUS rule)
Captured lesson is actually wrong	All `LESSONS.md` entries editable + deletable in Files panel — bad lessons get pruned by humans
Lessons don't transfer between designs	Intentional v1 — per-design isolation. Cross-design lessons defer to v2 once generalizable patterns are clear
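
Two of the mitigations above (abort before restore, and an honest `snapshot_expired` result) can be combined in one small function. This is a sketch, not the planned `rollback-to-checkpoint.ts`: `rollbackToTurn`, the `restore` callback, and the choice to fall back to the newest snapshot at or before the target turn are all assumptions, and `oldestAvailable` is widened to allow `null` for the case where no snapshot exists at all.

```typescript
// Result shape following the {ok: false, error: 'snapshot_expired', ...} example above.
type RollbackResult =
  | { ok: true; restoredTurn: number }
  | { ok: false; error: "snapshot_expired"; oldestAvailable: number | null };

function rollbackToTurn(
  targetTurn: number,
  retainedTurns: number[], // turn ids that still have a snapshot
  inFlight: AbortController | null, // controller of the streaming generation, if any
  restore: (turnId: number) => void, // hypothetical callback that rewrites the files
): RollbackResult {
  // Stop any streaming generation first so it cannot write over restored files.
  if (inFlight !== null) inFlight.abort();

  // Pick the newest retained snapshot at or before the requested turn.
  const sorted = retainedTurns.slice().sort((a, b) => a - b);
  let candidate: number | null = null;
  for (const turn of sorted) {
    if (turn <= targetTurn) candidate = turn;
  }

  // Nothing old enough is retained: report it instead of pretending to succeed.
  if (candidate === null) {
    return {
      ok: false,
      error: "snapshot_expired",
      oldestAvailable: sorted.length > 0 ? sorted[0] : null,
    };
  }

  restore(candidate);
  return { ok: true, restoredTurn: candidate };
}
```

The discriminated union forces callers to check `ok` before reading `restoredTurn`, which makes a fake-success path hard to write by accident.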

Alignment check before code

A few choices I'd want maintainer alignment on before any PR:

1. Where does LESSONS.md live: design root (per-design) vs workspace root (per-workspace) vs global? My lean: per-design root for v1, per-workspace later if usage proves it generalizes.
2. Is the spiral-circuit-breaker its own subsystem or part of this one? Filing it separately, but they share the auto-pause UX. Could merge if you'd rather one PR.
3. Should /rollback be a literal slash command or just keyword interception? Slash commands need a UI for discoverability; keyword is invisible. My lean: both, with /rollback shown in the composer's + menu for discoverability and keyword for fast-path.
4. Snapshot retention: 100/design hard cap vs time-based (e.g. 30 days)? Hard cap is simpler; time-based is more user-friendly for long projects. My lean: hard cap, with a "pinned snapshots" exception for important checkpoints.

Cross-references

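Surfaced during dogfooding of issue #225, "[Feature]: image 2 is already powerful enough; what is most needed is the process of turning the generated UI into components and then into a prototype!", and PR #241, "feat(core): add decompose-to-ui-kit + boolean parity verifiers (Phase 1 of #225)" (decompose to UI kit)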
Sibling Discussions: "Capability-aware multi-provider failover with budget gates", "Spiral detector: auto-pause after N consecutive non-progressive turns"
Together these three become a unified "Autonomous continuation" subsystem — the agent self-heals from both semantic (this) and infrastructure (failover) failures, and the design accumulates per-project memory of what works

Looking forward to direction before writing any of the actual code.
