Step 1: AI-Output sediment system (writeback + stale sweep) + bilingual user guide #5
Persona outputs now sediment into {vault}/00-Inbox/AI-Output/{persona}/
with a 6-field provenance frontmatter (generated-by / generated-at /
agent / parent-query / source-nodes / status). Gardener can
subsequently flip aged drafts to stale with a backlink-source test.
vault.writeAIOutput
- Typed params with persona regex ^vault-[a-z]+$ validation
- Auto-slug from parentQuery (first 6 words, kebab-case, fs-safe);
YYYY-MM-DD-slug.md under per-persona subdir
- Collision loop appends -2 through -99
- YAML-subset serialization round-trips through the existing
parseFrontmatter (no reformatting drift)
- dryRun default true (matches project convention); on dry run
returns the computed path + frontmatter without writing
- Status hardcoded `draft` on write
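The naming rules above (persona validation, slug from the first 6 words, `-2` through `-99` on collision) can be sketched as follows. This is a minimal illustration, not the code in index.ts: `slugify`, `resolveFileName`, the `exists` callback, and the `untitled` fallback are hypothetical names and choices, and the real slugger may filter words differently.

```typescript
// Hypothetical helpers illustrating the naming rules; not the actual index.ts code.

// Persona names must look like "vault-architect" (regex taken from the list above).
const PERSONA_RE = /^vault-[a-z]+$/;

/** Kebab-case slug from the first `maxWords` words, restricted to [a-z0-9-]. */
function slugify(parentQuery: string, maxWords = 6): string {
  const words = parentQuery
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, " ") // drop characters that are unsafe in file names
    .split(/[\s-]+/)
    .filter((w) => w.length > 0)
    .slice(0, maxWords);
  // Assumption: fall back to a fixed slug when nothing survives (e.g. non-ASCII queries).
  return words.length > 0 ? words.join("-") : "untitled";
}

/** First free file name: base.md, then base-2.md through base-99.md. */
function resolveFileName(
  date: string, // YYYY-MM-DD
  slug: string,
  exists: (name: string) => boolean, // hypothetical lookup inside the persona subdir
): string {
  const base = `${date}-${slug}`;
  if (!exists(`${base}.md`)) return `${base}.md`;
  for (let n = 2; n <= 99; n++) {
    const candidate = `${base}-${n}.md`;
    if (!exists(candidate)) return candidate;
  }
  throw new Error(`too many collisions for ${base}`);
}
```

Passing `exists` as a callback keeps the sketch independent of the file system, which is also what makes it easy to exercise against a tmpdir in tests.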
vault.sweepAIOutput
- Hardcoded per-persona thresholds: architect=45d, gardener=30d,
historian=180d, librarian=60d, catch-all=60d (MVP; graduate to
yaml config in Step 1.5 once usage justifies)
- Stale rule = age-past-threshold AND zero non-AI-Output backlinks
(source files whose own frontmatter carries `generated-by` do
NOT anchor -- AI->AI references would be self-anchoring
hallucination chains)
- Supersede candidates reported on same-persona reviewed pairs with
source-nodes Jaccard >= 0.6 and newer generated-at; human
confirms, never auto-applies -- AI self-grading is a second-layer
hallucination deliberately avoided
- dry_run=false rewrites frontmatter `status: draft` to `stale`
in-place via narrow regex substitution (body + other fm fields
untouched)
- `now` param provides ISO timestamp injection seam for tests
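The stale rule and the supersede gate described above reduce to two small predicates. The sketch below is illustrative only: the function names are hypothetical, the persona keys are assumed to carry the `vault-` prefix, "past threshold" is read as strictly greater than, and two empty source-node lists are assumed not to count as overlap.

```typescript
// Thresholds in days, copied from the list above. Key format is an assumption.
const THRESHOLD_DAYS = new Map<string, number>([
  ["vault-architect", 45],
  ["vault-gardener", 30],
  ["vault-historian", 180],
  ["vault-librarian", 60],
]);
const DEFAULT_THRESHOLD_DAYS = 60; // catch-all
const MS_PER_DAY = 24 * 60 * 60 * 1000;

/** Stale = older than the persona threshold AND no backlinks from non-AI-Output notes. */
function isStale(
  persona: string,
  generatedAt: string, // ISO timestamp from frontmatter
  now: string, // ISO timestamp, injectable for tests
  realBacklinks: number, // count of backlinks from human-authored notes
): boolean {
  const limit = THRESHOLD_DAYS.get(persona) ?? DEFAULT_THRESHOLD_DAYS;
  const ageDays = (Date.parse(now) - Date.parse(generatedAt)) / MS_PER_DAY;
  return ageDays > limit && realBacklinks === 0;
}

/** Jaccard similarity |A ∩ B| / |A ∪ B| over two source-node lists. */
function jaccard(a: readonly string[], b: readonly string[]): number {
  const setA = new Set(a);
  const setB = new Set(b);
  if (setA.size === 0 && setB.size === 0) return 0; // assumption: no evidence, no overlap
  const shared = Array.from(setA).filter((x) => setB.has(x)).length;
  return shared / (setA.size + setB.size - shared);
}

/** Supersede gate from the list above: overlap of at least 0.6. */
function isSupersedeCandidate(older: readonly string[], newer: readonly string[]): boolean {
  return jaccard(older, newer) >= 0.6;
}
```

For example, source-node lists `[a, b, c]` and `[a, b, c, d]` share 3 of 4 distinct nodes (0.75) and pass the gate, while `[a, b, c]` and `[b, c, d]` share 2 of 4 (0.5) and do not.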
Tests: 10 unit tests in mcp-server/src/ai-output.test.ts covering
the write contract (all 6 fields, dryRun default, collision,
persona validation, empty sourceNodes serialization) and the sweep
contract (threshold detection with injected now, backlink filter,
AI-Output->AI-Output non-anchoring, in-place status flip preserving
body, supersede candidate Jaccard gate). node:test + tmpdir
isolation, zero new deps.
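The in-place status flip that the sweep tests check (body and other frontmatter fields untouched) can be approximated with a substitution confined to the leading frontmatter block. This is a sketch under assumptions: `flipDraftToStale` is a hypothetical name, and the frontmatter is assumed to be delimited by `---` lines at the very top of the file.

```typescript
/**
 * Flip `status: draft` to `status: stale` inside the leading frontmatter only.
 * Returns the input unchanged when there is no frontmatter or no draft status.
 */
function flipDraftToStale(fileText: string): string {
  // Frontmatter = text between an opening "---" line and the next "---" line.
  const match = /^---\r?\n([\s\S]*?)\r?\n---(\r?\n|$)/.exec(fileText);
  if (!match) return fileText;
  const header = match[0];
  // Multiline flag so ^ and $ anchor on the single status line.
  const flipped = header.replace(/^status:[ \t]*draft[ \t]*$/m, "status: stale");
  // Everything after the frontmatter is copied through byte for byte.
  return flipped + fileText.slice(header.length);
}
```

Restricting the replace to the header means a body line that happens to read `status: draft` is never rewritten.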
Incidental fix: added an `import.meta.url === process.argv[1]` guard
around main() in index.ts so importing VaultFs from tests does not
start the stdio MCP server. Without the guard, StdioServerTransport
keeps the process alive and subsequent test files never run.
Also exported `class VaultFs` (was previously private to the module)
so ai-output.test.ts can construct a fresh instance per test against
a tmpdir vault.
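One way to write such an entrypoint guard is sketched below. Note that `import.meta.url` is a `file://` URL while `process.argv[1]` is a plain file-system path, so a robust comparison converts one side first; `pathToFileURL` from `node:url` does that. `isEntrypoint` is a hypothetical helper name, and the guard actually committed in index.ts may be written differently.

```typescript
import { pathToFileURL } from "node:url";

/**
 * True when the module identified by `metaUrl` is the script Node was started with.
 * `metaUrl` is expected to be import.meta.url; `argv1` is process.argv[1].
 */
function isEntrypoint(metaUrl: string, argv1: string | undefined): boolean {
  if (!argv1) return false;
  // Convert the plain path to a file:// URL before comparing.
  return metaUrl === pathToFileURL(argv1).href;
}

// Usage at the bottom of an ES module such as index.ts
// (main() stands for the existing server bootstrap):
//
//   if (isEntrypoint(import.meta.url, process.argv[1])) {
//     main().catch((e) => { console.error(e); process.exit(1); });
//   }
```

This simple form does not resolve symlinks, so a launcher that starts the script through a symlinked path would need an extra realpath step.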
Total: 98 tests pass (88 baseline + 10 new). tsc 0 errors.
See docs/ai-output-convention.md for the human-readable schema
contract + FAQ. Closes Step 1 of the AI-Output sediment system.
Every persona now instructs its runtime to persist meaningful
analyses via vault.writeAIOutput, so the vault sediments both the
human's notes and the AI collaboration history.
- skills/vault-{architect,curator,teacher,historian,janitor,
librarian,gardener}.md: appended `## Sediment convention` block
with a per-persona vault.writeAIOutput invocation template +
status lifecycle pointer.
- skills/vault-gardener.md: additionally appended `## Sweep
convention (gardener-only responsibility)` -- dry-run first,
require explicit user confirmation before dry_run:false flips,
never auto-apply supersede candidates.
- docs/ai-output-convention.md (NEW, ~120 lines): schema table,
legal status transitions, per-persona thresholds, FAQ covering
"who flips reviewed", "what counts as a backlink", "why exclude
AI->AI references", "what happens to stale entries".
Non-trivial design reasoning (why B->A beats A->B for the
draft->stale/reviewed split, why the 6-field schema is minimal,
why supersede stays semi-automatic) lives in planning notes, not
in-tree.
Auto-generated via `npm run generate-tools-doc` after adding vault.writeAIOutput + vault.sweepAIOutput. Drift guard test now passes (mcp-server/src/scripts/generate-tools-doc.test.ts). Purely mechanical diff: 2 new ops appear under the `vault.*` namespace section with their param tables; no hand-edits.
…ess handoff
- docs/GUIDE.md (NEW, 268 lines): user-journey oriented walkthrough -- pitch, 30-second install, first useful 5-minute session with 5 real prompt examples, 7-persona table, AI-Output sediment plain-English explanation, common prompts cheat sheet, troubleshooting, FAQ. Links to INSTALL.md for install-depth, ai-output-convention.md for sediment schema, mcp-tools-reference.md for tool catalog.
- docs/GUIDE.zh-CN.md (NEW, 268 lines): structure-mirrored Chinese translation. Idiomatic (not literal) rendering of the same content for the Chinese dev community. Top-of-page language switch link.
- README.md: single-line language-switch header pointing at both GUIDE files. Keeps README lean; GUIDE.md carries the depth.
- progress.txt: Step 1 (AI-Output sediment) shipping summary, engineering notes for next editor, updated known-gaps roster.
Code Review
This pull request introduces the 'AI-Output sediment' feature, which allows persona-authored analyses to be persisted in the vault. It adds two new MCP tools: vault.writeAIOutput for saving analyses with structured frontmatter and vault.sweepAIOutput for managing the lifecycle of these files (staling old drafts and identifying supersede candidates). The PR also includes comprehensive documentation, unit tests, and updates to persona skill files to utilize the new writeback capability. Feedback focuses on optimizing the sweep operation's performance and ensuring YAML frontmatter integrity by sanitizing newlines in user-provided strings.
const hasRealBacklink = (targetRel: string): boolean => {
  const targetBase = basename(targetRel, ".md");
  let found = false;
  this.walkMd((relPath, content) => {
    if (found) return;
    if (relPath === targetRel) return;
    if (aiOutputPaths.has(relPath)) return; // AI-Output -> AI-Output doesn't anchor
    for (const l of this.parseWikilinks(content)) {
      const linkPath = l.link.split("#")[0];
      if (!linkPath) continue;
      if (linkPath === targetRel || linkPath === targetBase || linkPath + ".md" === targetRel) {
        found = true;
        return;
      }
    }
  });
  return found;
};
The current implementation of hasRealBacklink is inefficient: it performs a full vault scan (this.walkMd) for every AI-Output draft found, so a sweep over M drafts in a vault of N files costs O(M*N) file visits.
It is recommended to build a set of all human-authored wikilinks by walking the vault once before iterating through the AI-Output entries.
// Pre-compute all human-authored links to avoid O(M*N) vault scans
const humanLinks = new Set<string>();
this.walkMd((relPath, content) => {
  if (aiOutputPaths.has(relPath)) return;
  for (const l of this.parseWikilinks(content)) {
    const link = l.link.split("#")[0];
    if (link) humanLinks.add(link);
  }
});
const hasRealBacklink = (targetRel: string): boolean => {
  const targetBase = basename(targetRel, ".md");
  const targetNoExt = targetRel.replace(/\.md$/, "");
  return humanLinks.has(targetRel) || humanLinks.has(targetBase) || humanLinks.has(targetNoExt);
};

if (typeof body !== "string") throw err(-32602, "body required");

// Sanitize parent-query: truncate to 200 chars, replace " with right-double-quote
const parentQuery = parentQueryRaw.slice(0, 200).replace(/"/g, "\u201D");
The parentQuery string is embedded directly into the YAML frontmatter. If the user's query contains newlines, it will break the simple line-based frontmatter parser implemented in parseFrontmatter (which splits by \n and expects key: value pairs). Sanitizing newlines by replacing them with spaces ensures the generated file remains valid and parseable.
- const parentQuery = parentQueryRaw.slice(0, 200).replace(/"/g, "\u201D");
+ const parentQuery = parentQueryRaw.slice(0, 200).replace(/"/g, "\u201D").replace(/\n/g, " ");
} else {
  yamlLines.push(`source-nodes:`);
  for (const node of sourceNodes) {
    const escaped = String(node).replace(/"/g, "\u201D");
Similar to parentQuery, items in sourceNodes should be sanitized for newlines to prevent breaking the YAML frontmatter structure during serialization, especially since these values are provided by the LLM and might contain unexpected formatting.
- const escaped = String(node).replace(/"/g, "\u201D");
+ const escaped = String(node).replace(/"/g, "\u201D").replace(/\n/g, " ");
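Since both suggestions apply the same transformation, a shared helper would keep the two call sites consistent. The sketch below is one possible shape, with a hypothetical name (`sanitizeFrontmatterValue`); it also covers `\r`, which a `/\n/g` replacement alone leaves in place for CRLF input, and the Unicode line separators. Whether the existing parseFrontmatter is sensitive to those characters is not confirmed here.

```typescript
/**
 * Make an untrusted string safe to embed as a one-line, double-quoted
 * value in a line-based frontmatter block.
 */
function sanitizeFrontmatterValue(raw: string, maxLength = 200): string {
  return raw
    .replace(/[\r\n\u2028\u2029]+/g, " ") // any line break would split the key: value line
    .replace(/"/g, "\u201D") // same quote substitution as the reviewed code
    .slice(0, maxLength)
    .trim();
}

// Possible call sites, mirroring the two suggestions above:
//   const parentQuery = sanitizeFrontmatterValue(parentQueryRaw);
//   const escaped = sanitizeFrontmatterValue(String(node));
```

Collapsing line breaks before truncating means the 200-character budget is spent on visible text rather than on whitespace runs.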
…dless MCP)
Consolidates PRs #4 → #5 → #6 → #7 → #8 into one commit.
## Headline
- 7 vault-* persona skills (architect/curator/gardener/historian/janitor/librarian/teacher)
- AI-Output sediment pipeline (writeAIOutput + sweepAIOutput + review-status + scope + quarantine-state + history audit trail)
- Step 2.5 input gate with warning emission
- Step 2.6-2.8: tag migration + axis sub-key + sweep metrics trend log
- Bilingual user guide (EN + CN)
- Auto-generated tools reference with drift guard
- End-to-end stdio smoke test
- Paste-install UX (setup / setup.ps1)
- Graph viewer (static HTML)
- Brand: LLM Wiki Bridge (display) / obsidian-llm-wiki (slug)
## Merge-prep (c8651f5)
- loadConfig precedence flipped to env > ./yaml > ../yaml (fixes silent vault redirect)
- pglite vector.tar.gz bundle path fix via esbuild externalize (5ee746a)
- compiler ruff clean (unblocks lint-python CI)
- docs/ICEBOX.md: 2026-04-20 persona+MCP audit 12 items deferred to v3
- CHANGELOG.md v2.0.0 entry
121/121 tests green.
What this adds
The "persona outputs survive the session" design: Step 1 of the dependency chain ③ writeback → ① provenance → ② active-push. Without this, every /vault-architect / /vault-gardener / etc. analysis evaporates at session end; the vault only sediments the human's notes, not the collaboration history.
Two new MCP ops
vault.writeAIOutput
Persona calls this to persist a meaningful analysis. Lands at {vault}/00-Inbox/AI-Output/{persona}/YYYY-MM-DD-slug.md, with a 6-field frontmatter that a future sweep (or human) can reason about: generated-by, generated-at, agent, parent-query, source-nodes, status.
Each field solves a distinct future-failure class: who / when / what-quality / why / from-where / still-valid. Every persona's skill file got a per-persona invocation template appended.
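vault.sweepAIOutput
Gardener calls this to manage the sediment lifecycle. Two report types:
- Stale candidates: drafts past their persona's age threshold with zero non-AI-Output backlinks.
- Supersede candidates: same-persona reviewed pairs with source-nodes Jaccard >= 0.6 and a newer generated-at. Reported only; never auto-applied.
dry_run: true (default) returns the report without writing. dry_run: false flips status: draft → stale in-place via narrow regex substitution (body + other fm fields preserved).
Design rationale (condensed)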
Install + try it
New comprehensive bilingual guide lands in this PR: docs/GUIDE.md (English) and docs/GUIDE.zh-CN.md (Chinese).
Minimum viable try after merge:
Then in Claude Code:
/vault-architect suggest 3 refactors for my vault structure. Watch the file appear at 00-Inbox/AI-Output/vault-architect/YYYY-MM-DD-suggest-3-refactors.md.
Verification
- npm run build: tsc 0 errors
- node.exe --test dist/**/*.test.js: 98 tests pass, 0 fail (88 baseline + 10 new)
- npm run generate-tools-doc: deterministic output; drift guard green
New tests cover the full contract:
- `status: draft` on write
- `-2` suffix on same day/slug collision
- empty sourceNodes serialized as `[]`
- injected `now` for determinism
- `dry_run: false` flips status in-place, preserves body
Incidental fixes included
- class VaultFs is now exported from mcp-server/src/index.ts (was module-private).
- main() now guarded by import.meta.url === process.argv[1]. Without this, importing VaultFs from tests would spin up StdioServerTransport and hold stdin open, blocking all subsequent test files. One-liner, preserves runnable-as-entrypoint behavior.
Files modified
- mcp-server/src/core/operations.ts: 2 new Operation entries (+27)
- mcp-server/src/index.ts: 2 dispatch cases + export + main guard (+190)
- mcp-server/src/ai-output.test.ts: NEW, 10 tests (+297)
- skills/vault-{architect,curator,gardener,historian,janitor,librarian,teacher}.md: +17 each (gardener +23)
- docs/ai-output-convention.md: NEW schema + lifecycle + FAQ (~120)
- docs/mcp-tools-reference.md: auto-regenerated (+29)
- docs/GUIDE.md + docs/GUIDE.zh-CN.md: NEW bilingual guide (268 each)
- README.md: language-switch blockquote above description
- progress.txt: Step 1 shipping summary + handoff
Explicitly deferred
- reviewed → superseded flip (currently reports only)
- user-verdict frontmatter field
Ready for review or merge. Base is v2-staging so this layers on top of PR #4; merge order: #4 first, then this.