feat(mcp): Step 2.5 — input gate + sweep metrics by 2233admin · Pull Request #7 · 2233admin/obsidian-llm-wiki

2233admin · 2026-04-20T21:15:33Z

Layers on top of PR #6. Implements the three items borrowed from the external agent-memory-runtime governance review (governance-layer.md + decision-table.md):

writeAIOutput input gate — reject body<50 chars / single shell command as parent-query / empty query+empty sourceNodes. Placed after schema+enum validation so those errors still surface first.
sweep metrics — every sweep response now carries {totalEntries, byPersona, byStatus, byQuarantineState, realBacklinkHitRate}. Signals future threshold-tuning needs: a vault where <10% of entries are cited by human notes is a persona-prompt problem, not a staleness-threshold problem.
default-reject rules documented — governance-layer §七 mapped to our vault-level base in docs/ai-output-convention.md.

Skipped from the external review (and why)

capture/observation layer (append-only log for low-signal persona thought). They need it because their base is Bash-PostToolUse streams; we already filter at the writeAIOutput boundary. Double-layering would be redundant.
durable_memory + UserPromptSubmit injection gate. Their gate is a Claude Code hook; our recall is a librarian skill reading vault_search. Different interface surface — Step 3 should put relevance/trust logic in the librarian, not a new frontmatter field.

Verification

npm run build: 0 tsc errors
ai-output suite: 22/22 pass (6 new tests: 3 gate rejections, 1 gate allow, 2 metrics cases)
Full mcp-server suite: 110/110 — no regressions

Stats

~180 insertions across 3 files (code + tests + docs). All existing short-body tests updated to use a shared LONG_BODY constant that clears the 50-char gate.

Merge order

PR #6 (feat/step2-governance) first, then this. Both target merges back through feat/ai-output-sediment → v2-staging → main.

gemini-code-assist

Code Review

This pull request introduces an 'Input gate' to filter low-signal AI outputs and adds detailed metrics to the sweep process. The input gate enforces quality by rejecting entries with short bodies, shell commands as queries, or missing anchors. The sweep metrics provide insights into persona distribution, status, and backlink hit rates. A performance bottleneck was identified in the metrics calculation where repeated vault scans could cause significant latency in large vaults; a pre-indexed backlink lookup was suggested to optimize this to linear complexity.

gemini-code-assist · 2026-04-20T21:17:20Z

+        let withRealBacklink = 0;
+        for (const e of entries) {
+          metrics.byPersona[e.persona] = (metrics.byPersona[e.persona] ?? 0) + 1;
+          metrics.byStatus[e.status || "(none)"] = (metrics.byStatus[e.status || "(none)"] ?? 0) + 1;
+          const qs = typeof e.fm["quarantine-state"] === "string"
+            ? (e.fm["quarantine-state"] as string) : "(none)";
+          metrics.byQuarantineState[qs] = (metrics.byQuarantineState[qs] ?? 0) + 1;
+          if (hasRealBacklink(e.relPath)) withRealBacklink++;
+        }


The current implementation of metrics calculation has a significant performance bottleneck. Calling hasRealBacklink(e.relPath) inside a loop over all entries results in $O(N \times M)$ complexity, where $N$ is the number of AI-Output entries and $M$ is the total number of markdown files in the vault. Since hasRealBacklink performs a full vault scan via walkMd and reads every file, this will lead to severe latency in larger vaults.

I suggest building a backlink index once by walking the vault a single time before the loop. This reduces the complexity to $O(N + M)$. I've also included a minor fix to ensure consistent handling of empty strings in the metrics keys (matching the logic used for status).

Suggested change

let withRealBacklink = 0;

for (const e of entries) {

metrics.byPersona[e.persona] = (metrics.byPersona[e.persona] ?? 0) + 1;

metrics.byStatus[e.status || "(none)"] = (metrics.byStatus[e.status || "(none)"] ?? 0) + 1;

const qs = typeof e.fm["quarantine-state"] === "string"

? (e.fm["quarantine-state"] as string) : "(none)";

metrics.byQuarantineState[qs] = (metrics.byQuarantineState[qs] ?? 0) + 1;

if (hasRealBacklink(e.relPath)) withRealBacklink++;

}

const backlinked = new Set<string>();

this.walkMd((relPath, content) => {

if (aiOutputPaths.has(relPath)) return;

for (const l of this.parseWikilinks(content)) {

const link = l.link.split("#")[0];

if (link) {

backlinked.add(link);

if (!link.endsWith(".md")) backlinked.add(link + ".md");

}

}

});

let withRealBacklink = 0;

for (const e of entries) {

metrics.byPersona[e.persona] = (metrics.byPersona[e.persona] ?? 0) + 1;

metrics.byStatus[e.status || "(none)"] = (metrics.byStatus[e.status || "(none)"] ?? 0) + 1;

const qs = (typeof e.fm["quarantine-state"] === "string" && e.fm["quarantine-state"]) || "(none)";

metrics.byQuarantineState[qs] = (metrics.byQuarantineState[qs] ?? 0) + 1;

if (backlinked.has(e.relPath) || backlinked.has(basename(e.relPath, ".md"))) withRealBacklink++;

}

Informed by Eridanus117/agent-memory-runtime external review: borrow the default-reject rules (decision-table §七) and the promotion-ready metrics (governance-layer §P0.5), adapted to a vault-level base. Skipped: capture/observation layer and the durable-memory+injection path (wrong base — our recall is a librarian skill, not a Claude Code hook). Input gate in vault.writeAIOutput (after schema/enum validation, so those errors still surface first): - body >= 50 chars required - parent-query must not be a single shell command (pwd/ls/cd/cat/ rg/grep/echo/git status|diff) - at least one of parent-query or sourceNodes must be non-empty Sweep metrics on every vault.sweepAIOutput response: - totalEntries, byPersona, byStatus, byQuarantineState - realBacklinkHitRate (fraction of entries with a non-AI-Output wikilink anchoring them) Purpose: future threshold tuning needs evidence; a vault where <10% of entries are ever cited is a persona-prompt problem, not a staleness-threshold problem. Tests: 22/22 ai-output pass (6 new: 3 gate rejections, 1 gate allow, 2 metrics cases). Full mcp-server suite 110/110 green. Convention doc gains two new sections (input gate, sweep metrics) with rationale grounded in the external governance plan.

…w handoff Records two-PR (#6 governance + #7 Step 2.5 input gate/metrics) outcome, external-review absorption from agent-memory-runtime, and an explicit self-audited backlog for next session: P1: review-status cache has no sync code (drift lurking); history from/to axis is ambiguous (needs axis: sub-key) P2: review-status should be an Obsidian tag not a fm field (adapt-over-build rule violation); input gate thresholds are guesses (should downgrade to warnings); metrics is snapshot without trend log; no e2e MCP smoke test P3: MUST-append-history undocumented enforcement; borrow-3/skip-3 was inference not measurement; PR should have been single Next-session priority: migrate review-status frontmatter -> Obsidian tag (~-30 lines, kills two gaps at once), then downgrade input gate to warnings (~10 lines), then history axis sub-key (~20 lines), then metrics trend log (~15 lines). Total ~70 lines closes all P1 + half of P2.

2233admin · 2026-04-21T08:04:23Z

Collapsed into PR #4 for unified v2 merge (squash into main). feat/step2-6-tag-migration fast-forwarded into v2-staging; all commits in this PR are now part of #4's diff.

…dless MCP) Consolidates PRs #4 → #5 → #6 → #7 → #8 into one commit. ## Headline - 7 vault-* persona skills (architect/curator/gardener/historian/janitor/librarian/teacher) - AI-Output sediment pipeline (writeAIOutput + sweepAIOutput + review-status + scope + quarantine-state + history audit trail) - Step 2.5 input gate with warning emission - Step 2.6-2.8: tag migration + axis sub-key + sweep metrics trend log - Bilingual user guide (EN + CN) - Auto-generated tools reference with drift guard - End-to-end stdio smoke test - Paste-install UX (setup / setup.ps1) - Graph viewer (static HTML) - Brand: LLM Wiki Bridge (display) / obsidian-llm-wiki (slug) ## Merge-prep (c8651f5) - loadConfig precedence flipped to env > ./yaml > ../yaml (fixes silent vault redirect) - pglite vector.tar.gz bundle path fix via esbuild externalize (5ee746a) - compiler ruff clean (unblocks lint-python CI) - docs/ICEBOX.md: 2026-04-20 persona+MCP audit 12 items deferred to v3 - CHANGELOG.md v2.0.0 entry 121/121 tests green.

gemini-code-assist Bot reviewed Apr 20, 2026

View reviewed changes

2233admin force-pushed the feat/step2-5-input-gate branch from 09d0af8 to 4374ce3 Compare April 20, 2026 21:18

2233admin mentioned this pull request Apr 21, 2026

feat(mcp): Step 2.6–2.8 — tag migration + warnings + axis + trend log #8

Closed

2233admin closed this Apr 21, 2026

2233admin deleted the feat/step2-5-input-gate branch April 21, 2026 08:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(mcp): Step 2.5 — input gate + sweep metrics#7

feat(mcp): Step 2.5 — input gate + sweep metrics#7
2233admin wants to merge 1 commit intofeat/step2-governancefrom
feat/step2-5-input-gate

2233admin commented Apr 20, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Apr 20, 2026

Uh oh!

2233admin commented Apr 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

2233admin commented Apr 20, 2026

Skipped from the external review (and why)

Verification

Stats

Merge order

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

2233admin commented Apr 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant