feat(mcp): Step 2.6–2.8 — tag migration + warnings + axis + trend log#8
feat(mcp): Step 2.6–2.8 — tag migration + warnings + axis + trend log#82233admin wants to merge 14 commits intofeat/step2-5-input-gatefrom
Conversation
…y tag Previously (PR #6 commit 136c40c): added review-status as a 9th frontmatter field as a cache over history[].trigger == manual-user-confirmed-write. Two design bugs surfaced during session-2 external-review audit: - P1 drift: sweep had no sync code from history appends back to the cache; any hand-edit of history or a future write-path emission of manual-user-confirmed-write would leave the cache wrong. - P2 adapt-over-build violation: Obsidian ships with native #tag infrastructure that indexes for free via vault.searchByTag. Re-inventing that as a frontmatter cache is exactly the kind of substrate-duplication the adapt-over-build rule forbids. Fix: drop review-status frontmatter field. When caller passes reviewStatus: user-confirmed, append '#user-confirmed' at body end (idempotent: skip if already present). Enum validation + API param shape preserved so internal callers (librarian skill) don't need updates. Files: - index.ts: drop frontmatter field + YAML line, add body-tag injection - operations.ts: update description (9-field → 8-field) + param doc - ai-output.test.ts: delete default-frontmatter test, rewrite user-confirmed test to assert body tag (not frontmatter), add idempotency test - docs/ai-output-convention.md: schema header (9 → 8 fields), remove the review-status row, reframe axes section as '3 axes + body tag' and describe the Obsidian-native routing - docs/mcp-tools-reference.md: regenerated (drift guard green) Verification: npm run build clean, 108/108 tests green (was 107; +1 net for new idempotency test, -1 for removed default test, ±0 for rewrite). Net diff +22 lines (projection was -30). Overrun comes from two new tests (idempotency + stronger body-tag assertion) and the kept enum validator. Trade accepted: preserves API surface + pins a real invariant (tag dedup), at the cost of the line-count target.
Step 2.5 (commit 4374ce3) chose three thresholds as guesses — body >= 50 chars, single-shell-cmd reject, query+sourceNodes both-empty reject — and hard-rejected on any miss. Step 2.6 converts the three throws to structured warnings collected into a warnings[] array on the write result, plus a stderr log line. The write still lands on disk. Rationale: - Thresholds were estimates, not measurements. 'body >= 50' was picked to block obvious noise but also blocks short-but-legitimate analyses (one-paragraph librarian lookups, stub references). - Hard reject means we never build up distribution data to retune from. - Warnings preserve the signal-quality instinct, make it observable (stderr + response field), and keep writes unblocked for 2-4 weeks of data collection. Retune or re-promote to reject after that. Response shape gain: writeAIOutput now always returns warnings: string[] (empty when clean) Values: 'body-too-short' / 'query-looks-like-shell-cmd' / 'no-anchor'. Files: - index.ts: swap 3 throws for warnings.push() + stderr log + include warnings in both dryRun and real-write response - ai-output.test.ts: 3 'rejects' tests -> 'warns but still writes', plus a clean-write test asserting warnings=[]; WriteOk interface gets warnings?: string[] - docs/ai-output-convention.md: 'Input gate (Step 2.5)' section rewritten as 'Input gate (Step 2.5 -> Step 2.6: warning mode)' Verification: npm run build clean, 114/114 tests green (same count as before; test count preserved, semantics flipped).
Step 2 history entries carried {from, to} without naming which axis
moved. Unambiguous for draft->stale (status always), but future
manual-promote entries {from: reviewed, to: promoted} are ambiguous —
both status and quarantine-state have a 'reviewed' value. Adding an
'axis: status' | 'axis: quarantine-state' sub-key resolves it.
Backward compat: existing entries are parsed as opaque strings by the
scalar-only frontmatter parser, so unchanged entries keep round-tripping.
New entries get axis:. No migration pass needed — entries without
axis are treated as 'axis unknown' (human reviews them manually if
ever audited).
Files:
- index.ts sweep path: historyEntry template gets 'axis: status'
inserted right after ts, before from/to (gardener only auto-flips
status). Comment block explains why.
- ai-output.test.ts: first-transition test asserts both
'axis: status' is present and that it sits immediately before from/to.
- docs/ai-output-convention.md: history example shows three entries
covering both axis values; required-fields table gains an axis row
with the Step 2.7 rationale; from/to row simplified to 'before/after
value on axis'.
Verification: 114/114 green, build clean.
Step 2.5 metrics gave a point-in-time snapshot per sweep but no
time-series, so tuning decisions had to eyeball the latest sweep and
trust memory for the trend. This commit appends one flow-style YAML
line per real sweep to 00-Inbox/AI-Output/sweep.log.md so a future
vault.lint or vault.query can reconstruct the trend.
Line shape:
- {ts: "...", totalEntries: N, staleHits: M, supersedeHits: K,
realBacklinkHitRate: 0.XXX}
Gates:
- Skipped on dry_run (no-write-on-dry-run invariant).
- Skipped when entries.length === 0 (no point spamming empty-vault
rows into the log).
- File created with a '# Sweep trend log' header on first non-empty
real sweep; subsequent sweeps append only the data line.
Files:
- index.ts sweepAIOutput: add the conditional append block right
before return. Reuses appendFileSync (already imported) and mkdirSync
(already imported). ~14 real lines of code + comment.
- ai-output.test.ts: 3 new tests pinning the semantics — dry_run=true
must not create the file, two !dry_run sweeps produce two lines
under the header, empty-vault !dry_run sweeps leave no file.
- docs/ai-output-convention.md: new 'Trend log (Step 2.8)' subsection
under 'Sweep metrics' with rationale + example entries + gate list.
Verification: 117/117 green (+3 new tests), build clean.
…w handoff Records two-PR (#6 governance + #7 Step 2.5 input gate/metrics) outcome, external-review absorption from agent-memory-runtime, and an explicit self-audited backlog for next session: P1: review-status cache has no sync code (drift lurking); history from/to axis is ambiguous (needs axis: sub-key) P2: review-status should be an Obsidian tag not a fm field (adapt-over-build rule violation); input gate thresholds are guesses (should downgrade to warnings); metrics is snapshot without trend log; no e2e MCP smoke test P3: MUST-append-history undocumented enforcement; borrow-3/skip-3 was inference not measurement; PR should have been single Next-session priority: migrate review-status frontmatter -> Obsidian tag (~-30 lines, kills two gaps at once), then downgrade input gate to warnings (~10 lines), then history axis sub-key (~20 lines), then metrics trend log (~15 lines). Total ~70 lines closes all P1 + half of P2.
Records the four-item close-out (items 1-4 from session-2's next-step list): review-status migration, input-gate warning downgrade, axis sub-key, sweep metrics trend log. 117/117 tests green, +88 net lines (+25% over projection, traded for regression safety), explicit gap list, PR strategy decision queued for next session.
Post-Step-2.5 the input gate pushes to warnings: string[] instead of throwing. Every skills/vault-*.md example showed the call without reading the result, so warnings from a real run would have landed silently. Each example now assigns const result = vault.writeAIOutput and surfaces result.warnings via console.warn when non-empty. Pure doc change. No mcp-server code touched. Tests 117/117 green.
Adds a five-step Hand-test path section to both GUIDE.md and GUIDE.zh-CN.md, sitting between the AI-Output sediment convention and the Vault structure sections. The payoff for the sediment system is most concrete in Obsidian Local Graph: a #user-confirmed tag cluster visually connects a human-confirmed AI-Output to every source-node it cites. Section walks from install -> one real writeAIOutput call -> open in Obsidian -> Local Graph depth 2 -> round-trip through sweep to see an axis: status history entry. Screenshot placeholders (TODO comments) ship today; images go in after a first real run against a live vault.
There was a problem hiding this comment.
Code Review
This pull request implements steps 2.6 through 2.8 of the governance plan, focusing on refining the input gate and improving auditability. Key changes include downgrading the low-signal input gate from hard rejections to structured warnings, migrating the human-confirmation signal from a frontmatter field to a native Obsidian body tag (#user-confirmed), and adding an 'axis' field to history entries to disambiguate state transitions. Additionally, a new trend log (sweep.log.md) was introduced to track sweep metrics over time. Feedback is provided regarding the idempotency check for the new body tag to ensure it correctly handles tags within code blocks.
|
|
||
| // Body tag injection for human-confirmed entries. Obsidian treats | ||
| // repeated tags as one, but we skip duplicate writes to keep diffs clean. | ||
| const bodyWithTag = reviewStatus === "user-confirmed" && !/(^|\s)#user-confirmed(\s|$)/m.test(body) |
There was a problem hiding this comment.
The regex-based idempotency check for the #user-confirmed tag is fragile because it doesn't account for tags mentioned inside code blocks (which Obsidian typically ignores for indexing). Reusing the existing parseTags logic would be more robust as it already handles code block stripping.
| const bodyWithTag = reviewStatus === "user-confirmed" && !/(^|\s)#user-confirmed(\s|$)/m.test(body) | |
| const bodyWithTag = reviewStatus === "user-confirmed" && !this.parseTags(body).includes("#user-confirmed") |
Spawn mcp-server/bundle.js as a child process and drive it with the MCP SDK client. Three round-trips: tools/list returns the core vault ops, vault.list finds a seeded note, vault.exists agrees. Uses node --test + tmpdir isolation. Closes the three-session carry-forward from Step 1. Surfaces two latent issues documented in progress.txt: - loadConfig precedence: `./vault-mind.yaml` and `../vault-mind.yaml` shadow $VAULT_MIND_VAULT_PATH silently. - bundled pglite/vector resolves `vector.tar.gz` relative to bundle.js, not to pglite; vaultbrain init crashes when enabled in a deployed (non-dev-workspace) build. Smoke test sidesteps both by writing a minimal yaml in the temp vault with `adapters: filesystem`. Bundle rebuilt so the shipped artifact tracks src. 120/120 tests green.
Problem: `@electric-sql/pglite/vector` resolves its extension bundle via
`new URL('../vector.tar.gz', import.meta.url)`. When esbuild inlined
the vector module into bundle.js, `import.meta.url` pointed at
bundle.js and the URL resolved to
`D:/projects/obsidian-llm-wiki/vector.tar.gz` (non-existent) instead
of `pglite/dist/vector.tar.gz`. Any deploy enabling the default
adapter list crashed on VaultBrain init with:
Error: Extension bundle not found: file:///.../vector.tar.gz
Fix: add `--external:@electric-sql/pglite --external:@electric-sql/pglite/*`
to the esbuild bundle script. pglite is a regular dependency so
consumers already install it alongside bundle.js; at runtime pglite
loads from node_modules/ and `import.meta.url` points to pglite's
own dist/ where `../vector.tar.gz` correctly resolves.
Side effect: bundle.js shrinks from ~1.54MB to ~963KB (-38%).
Also: extend the smoke test with a regression guard that spawns a
second server with the default adapter list (vaultbrain enabled).
Drop the workaround `adapters: filesystem` from the main smoke
block now that the default path is safe.
121/121 tests green.
…NGELOG Prepares feat/step2-6-tag-migration for squash-merge into main as v2.0.0. loadConfig precedence flipped from yaml-first to env-first (env > ./vault-mind.yaml > ../vault-mind.yaml). Previously an abandoned parent-dir yaml could silently shadow VAULT_MIND_VAULT_PATH. Smoke-test comments updated; 121/121 green. compiler ruff now clean (was 23 errors blocking PR #4 CI): - drop unused strip_noise import in concept_graph.py - bump line-length 100 -> 120 (LLM prompts + CLI help strings) - per-file ignore E501 in tests/*.py (long assert literals) - wrap remaining 3 long report strings in frontmatter_generator - isort auto-fix on link_discovery.py docs/ICEBOX.md: 2026-04-20 persona+MCP audit 12 items deferred to v3 (item #2 mcp-tools-reference drift already closed via generate-tools-doc); bridge v2 architecture noted as separate-repo concern; P2/P3 carry-forward items (sweep.log rotation, screenshots, CRLF, smoke payload tolerance) parked. CHANGELOG.md: v2.0.0 entry enumerates breaking changes and features. Bundle regenerated after loadConfig edit.
…dless MCP) Consolidates PRs #4 → #5 → #6 → #7 → #8 into one commit. ## Headline - 7 vault-* persona skills (architect/curator/gardener/historian/janitor/librarian/teacher) - AI-Output sediment pipeline (writeAIOutput + sweepAIOutput + review-status + scope + quarantine-state + history audit trail) - Step 2.5 input gate with warning emission - Step 2.6-2.8: tag migration + axis sub-key + sweep metrics trend log - Bilingual user guide (EN + CN) - Auto-generated tools reference with drift guard - End-to-end stdio smoke test - Paste-install UX (setup / setup.ps1) - Graph viewer (static HTML) - Brand: LLM Wiki Bridge (display) / obsidian-llm-wiki (slug) ## Merge-prep (c8651f5) - loadConfig precedence flipped to env > ./yaml > ../yaml (fixes silent vault redirect) - pglite vector.tar.gz bundle path fix via esbuild externalize (5ee746a) - compiler ruff clean (unblocks lint-python CI) - docs/ICEBOX.md: 2026-04-20 persona+MCP audit 12 items deferred to v3 - CHANGELOG.md v2.0.0 entry 121/121 tests green.
Summary
Stacks on top of PR #7 (feat/step2-5-input-gate). Closes out the four P1/P2 items flagged in session-2 handoff.
Commits
b447e45
refactor(mcp): migrate review-status from frontmatter to Obsidian body tagDrops the
review-statusfrontmatter field; user-confirmed writes append#user-confirmedto body (idempotent). Kills drift risk (no cache to sync) and restores adapt-over-build (Obsidian tag index is native signal).vault.searchByTag("user-confirmed")replacessearchByFrontmatterwith zero handler code change.58a9029
refactor(mcp): downgrade Step 2.5 input gate from throw to warningsConverts three
throwbranches towarnings.push()+ stderr log.writeAIOutputreturn shape now always carrieswarnings: string[](empty when clean). Values:body-too-short/query-looks-like-shell-cmd/no-anchor.14ce9aa
feat(mcp): add axis sub-key to history entries (Step 2.7)Resolves the
from/toambiguity —reviewedexists on both status and quarantine-state axes.axis: status|quarantine-statenow names which field moved.eab80d6
feat(mcp): sweep metrics trend log (Step 2.8)Each
dry_run: falsenon-empty sweep appends a flow-style YAML line to00-Inbox/AI-Output/sweep.log.md. Hook for retuning input-gate thresholds with 2-4 weeks of real data.Verification
npm test→ 117/117 greennpm run build→ cleanMerge chain
PR #8 → PR #7 → PR #6 →
feat/ai-output-sediment→ v2-staging → main