feat: ecosystem miners — index plugins, agents, hooks, MCP servers by shahe-dev · Pull Request #3 · NickCirv/engram

shahe-dev · 2026-04-14T10:22:29Z

Summary

Adds two new miners that index the Claude Code plugin ecosystem as graph nodes:

plugin-miner walks ~/.claude/plugins/installed_plugins.json, indexing each installed plugin plus its nested skills/ and agents/ directories. Skills are scored against the project's detected stack (reusing detectStack output) and linked to their parent plugin via provided_by edges. Skills scoring EXTRACTED or INFERRED also get relevant_to edges to matching project files.
config-miner parses global and project-local settings JSON, indexing configured hooks and MCP servers. No scoring — these are always-on infrastructure, confidence fixed at EXTRACTED 1.0.

One shared utility (src/graph/stack-detect.ts) provides language/framework detection from a GraphNode[] snapshot, so future miners that need project context don't reinvent the detection logic.
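For reviewers who want a feel for what "detection from a GraphNode[] snapshot" could look like, here is a minimal sketch. It is not the code in src/graph/stack-detect.ts: the GraphNode fields, the extension table, and the detectLanguages name are all assumptions made for illustration. The only behavior taken from this PR is that nodes which are neither file nor class nodes are ignored.

```typescript
// Hypothetical node shape; engram's real GraphNode has more fields.
interface GraphNode {
  kind: string;
  sourceFile: string;
}

// Illustrative extension table, not the one shipped in the PR.
const LANGUAGE_BY_EXTENSION: Record<string, string> = {
  ".ts": "typescript",
  ".tsx": "typescript",
  ".py": "python",
  ".go": "go",
  ".rs": "rust",
};

function detectLanguages(nodes: GraphNode[]): Set<string> {
  const languages = new Set<string>();
  for (const node of nodes) {
    // Invariant from the test suite: non-file, non-class nodes are ignored.
    if (node.kind !== "file" && node.kind !== "class") continue;
    const dot = node.sourceFile.lastIndexOf(".");
    if (dot < 0) continue;
    const extension = node.sourceFile.slice(dot).toLowerCase();
    const language = LANGUAGE_BY_EXTENSION[extension];
    if (language) languages.add(language);
  }
  return languages;
}
```

A miner could call this once per run and pass the resulting set into its scoring step, which is the reuse the shared utility is meant to enable.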

Schema discipline

Follows the skills-miner convention: no new NodeKind values. All new nodes use kind: "concept" with a metadata.subkind discriminator (plugin, agent, hook, mcp_server).

Two additive EdgeRelation values: provided_by (skill/agent → plugin) and relevant_to (skill/agent → file, gated on non-AMBIGUOUS confidence to keep the graph sparse).
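A rough sketch of the node and edge shapes this section describes, in case it helps review. The interface names, the id format, and the two helper functions are hypothetical; only kind: "concept", metadata.subkind with its four values, the two relation names, and the EXTRACTED / INFERRED / AMBIGUOUS labels come from the PR itself.

```typescript
// Hypothetical shapes for illustration; engram's real schema types may differ.
type Confidence = "EXTRACTED" | "INFERRED" | "AMBIGUOUS";
type Subkind = "plugin" | "agent" | "hook" | "mcp_server";
type NewEdgeRelation = "provided_by" | "relevant_to";

interface EcosystemNode {
  id: string;
  kind: "concept"; // no new NodeKind values
  label: string;
  confidence: Confidence;
  metadata: { subkind: Subkind };
}

interface GraphEdge {
  source: string;
  target: string;
  relation: NewEdgeRelation;
}

// skill/agent -> plugin
function providedBy(childId: string, pluginId: string): GraphEdge {
  return { source: childId, target: pluginId, relation: "provided_by" };
}

// skill/agent -> file, emitted only for non-AMBIGUOUS confidence
// so the graph stays sparse.
function relevantTo(
  childId: string,
  confidence: Confidence,
  fileIds: string[],
): GraphEdge[] {
  if (confidence === "AMBIGUOUS") return [];
  return fileIds.map(
    (fileId): GraphEdge => ({
      source: childId,
      target: fileId,
      relation: "relevant_to",
    }),
  );
}
```

Keeping the discriminator in metadata means existing queries that filter on kind keep working, and consumers that care about plugins or hooks can filter on subkind.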

Why

The skills-miner you shipped in v0.2 indexes SKILL.md files but treats each skill as orphan metadata — you can see the skill, not the plugin that provided it, the agents it ships alongside, or the hooks / MCP servers configured in the same tree. With these two miners, queries like "what plugins does this project use" and "what hooks fire in this repo" start returning real answers from the graph.

Backward compatibility

No schema migration. New EdgeRelation values are additive.
Both miners are gated behind the existing options.withSkills flag in core.ts, so default engram init behavior is unchanged and empty-project + stress tests stay isolated from the real ~/.claude tree.
ENGRAM_SKIP_ECOSYSTEM=1 env var provides a second-level escape hatch for CI.
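The two-level gate described above can be sketched as follows. This is an illustration under assumptions, not the wiring in core.ts: the function name and the InitOptions shape are hypothetical, and the environment is passed in as a plain object so the sketch stays testable. Only options.withSkills and ENGRAM_SKIP_ECOSYSTEM=1 come from the PR.

```typescript
// Hypothetical option shape; only withSkills is named in the PR.
interface InitOptions {
  withSkills?: boolean;
}

type Env = Record<string, string | undefined>;

// Returns true only when ecosystem indexing is both opted into and not
// overridden by the CI escape hatch.
function shouldRunEcosystemMiners(options: InitOptions, env: Env): boolean {
  if (!options.withSkills) return false; // default init stays unchanged
  if (env["ENGRAM_SKIP_ECOSYSTEM"] === "1") return false; // second-level opt-out
  return true;
}
```

In real code the second argument would presumably be process.env; passing it explicitly just makes the escape hatch easy to exercise in tests without touching global state.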

Tests

Full suite: 548/548 pass (520 baseline from v0.5.3 + 28 new).

tests/graph/stack-detect.test.ts — 6 tests covering language/framework detection and the "non-file non-class nodes are ignored" invariant.
tests/plugin-miner-scoring.test.ts — 6 tests covering the full EXTRACTED / INFERRED / AMBIGUOUS decision table.
tests/plugin-miner.test.ts — 7 integration tests against a fixture plugin tree (2 skills, 1 agent), covering silent failure on missing ~/.claude, the env-var escape hatch, provided_by edges, scoring, relevant_to gating, and plugins with no skills/agents dirs.
tests/config-miner.test.ts — 8 tests covering global + local settings merge, MCP-servers-global-only precedence, malformed JSON handling, and the env-var escape hatch.
One smoke test appended to tests/core.test.ts confirming both miners are invokable via the pipeline integration.

Verified

Built and manually verified against a real project: 261 → 731 nodes after enabling. Plugin / agent / skill / hook / mcp nodes appear as expected with correct provided_by edges.
ENGRAM_SKIP_ECOSYSTEM=1 verified: node count drops by exactly the ecosystem contribution (731 → 488).
Windows + Node 25 clean (uses toPosixPath from your v0.3.2 fix for all sourceFile writes).

Design docs

Full design rationale and implementation plan are in the PR branch at docs/superpowers/specs/2026-04-14-engram-ecosystem-miners-design.md and docs/superpowers/plans/2026-04-14-engram-ecosystem-miners.md. Happy to strip those from the PR if you'd rather keep the repo's doc tree clean — they're my workflow artifacts, not prescriptive.

Commits

feat(schema): add provided_by and relevant_to edge relations
feat(graph): add stack-detect utility
feat(plugin-miner): relevance scoring with stack awareness
feat(plugin-miner): index plugins, skills, and agents
feat(config-miner): index hooks and MCP servers
feat(core): wire both miners into pipeline, gated on withSkills
docs: changelog entry

Each commit has tests + builds green. Happy to split into two PRs (plugin-miner and config-miner) if that's easier to review.

NickCirv · 2026-04-18T13:09:39Z

Hey @shahe-dev, thanks for this. Reading through the plugin-miner + config-miner design, it's in the same vein as skills-miner but goes the next mile by tying skills/agents back to their plugin via provided_by and linking them to matching project files via relevant_to, gated on a score against the detected stack. That's the right shape.

Before we dig in for merge, two asks:

1. Please rebase onto current main. The branch forked pre-v2.0.0 (548 baseline tests) and main is now at 673 tests with v2.0.0 Ecosystem + v2.0.1 Windows CI fix + v2.0.2 security hotfix all landed since Apr 14. Several files your diff touches have moved — src/core.ts, src/miners/skills-miner.ts (v0.5.3 Node-25 dirent fix), and src/graph/schema.ts (v2 migrations). I'd rather not do the rebase for you because the semantic choices around ecosystem + skills interaction are subtle (e.g. when your provided_by edge meets skills-miner's existing triggered_by edge) and you'll do a better job than a mechanical 3-way merge.

2. Happy to split into two PRs if that simplifies review. You already offered. Given the surface area (2.7K additions, 7 commits, two new miners, new edge relations, new env-var escape hatch), reviewing plugin-miner first and config-miner second lets us land the primary capability fast and iterate on the config path separately. If you'd rather keep it as one PR, that's fine too — just depends on how split-ready the commit graph already is.

Not blocking the v2.1.0 cycle (currently stacking on top of #9 + #8 + #5 — security hotfix already shipped as 2.0.2). Target merge window for this one is v2.2.

When the rebase is done, CI needs maintainer approval to run (first-time contributor default on Actions) — ping me and I'll approve.

…rena reference plugin

ITEM #2: Plugin contract v2

Extends ContextProviderPlugin so plugin authors can declare an MCP server via 'mcpConfig' and skip writing resolve()/isAvailable() by hand. The loader auto-wraps via createMcpProvider() from item #1. Classic plugins (custom resolve()) continue to work unchanged. If both fields are present, the author's resolve() wins (they opted into custom logic).

Type changes (src/providers/types.ts):
- ContextProviderPlugin stays strict (extends ContextProvider fully). This is the POST-VALIDATION shape the resolver consumes.
- NEW: RawPluginShape, the pre-validation shape a plugin-file author writes in .mjs. tier/tokenBudget/timeoutMs/resolve/isAvailable are all optional (the loader fills them from the factory when mcpConfig is present).

Loader changes (src/providers/plugin-loader.ts):
- validatePlugin() branches on 'has mcpConfig vs. has resolve()'
- name/label/version always required
- Classic path: tier/tokenBudget/timeoutMs/isAvailable required
- mcpConfig path: config validated via validateProviderConfig(), merged with plugin fields (author overrides win over factory defaults)
- One clear error per rejection: 'invalid mcpConfig: <reason>' tells you exactly which sub-field on which plugin is broken

Tests (+7 cases in tests/providers/plugin-loader.test.ts):
- mcpConfig-only plugin auto-wraps resolve + isAvailable
- Plugin with neither resolve nor mcpConfig rejected (clear message)
- Invalid mcpConfig rejected (bad command, bad http url)
- Custom resolve wins over mcpConfig when both present
- Plugin tokenBudget override wins over factory default
- Missing version rejected even for mcpConfig plugins

ITEM #6: Serena plugin reference

- docs/plugins/examples/serena-plugin.mjs (~60 lines incl. docs): the full Serena (oraios/serena) wrapper as an mcpConfig-only plugin. Install is cp + enable. Thanks to item #2, NO custom transport code needed.
- docs/plugins/examples/static-context-plugin.mjs: the classic-path reference showing a tier 1 plugin with hand-rolled resolve() for users who just want to inject a fixed string on every Read.
- docs/plugins/README.md: author-facing guide. Shape 1 (MCP-backed), Shape 2 (classic), template tokens, safety guarantees, debugging checklist, publishing notes.

FULL SUITE
808 -> 815 tests (+7), all passing. TypeScript clean, lint clean.

V3.0 PROGRESS
Done: #1 foundation, #2, #6, #7, #9, #10, #11 = 7 of 12 scope items.
Next: #3 budget-weighted resolver + mistakes-boost (~2-3d).
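The branching that the plugin-contract commit message describes for validatePlugin() can be pictured roughly like this. It is a sketch, not the loader's code: the field types, the mcpConfig sub-fields (command, url), the error wording, and the classifyPlugin name are guesses made for illustration. The rules it encodes (name/label/version always required, classic path needs tier/tokenBudget/timeoutMs/isAvailable, neither path is rejected, custom resolve wins when both are present) are the ones listed in the commit message.

```typescript
// Hypothetical pre-validation shape; field names follow the commit message,
// the types are assumptions.
interface RawPluginShape {
  name?: string;
  label?: string;
  version?: string;
  tier?: number;
  tokenBudget?: number;
  timeoutMs?: number;
  resolve?: (filePath: string) => Promise<string | null>;
  isAvailable?: () => Promise<boolean>;
  mcpConfig?: { command?: string; url?: string };
}

type Verdict =
  | { ok: true; resolveSource: "author" | "factory" }
  | { ok: false; error: string };

function classifyPlugin(raw: RawPluginShape): Verdict {
  // Identity fields are required on every path.
  for (const field of ["name", "label", "version"] as const) {
    if (!raw[field]) return { ok: false, error: `missing ${field}` };
  }
  const hasResolve = typeof raw.resolve === "function";
  if (raw.mcpConfig) {
    // Stand-in for validateProviderConfig(); the real checks are stricter.
    if (!raw.mcpConfig.command && !raw.mcpConfig.url) {
      return { ok: false, error: "invalid mcpConfig: need command or url" };
    }
    // When both are present, the author's resolve() wins.
    return { ok: true, resolveSource: hasResolve ? "author" : "factory" };
  }
  if (!hasResolve) {
    return { ok: false, error: "plugin needs resolve() or mcpConfig" };
  }
  // Classic path: nothing is filled in by a factory.
  const required = ["tier", "tokenBudget", "timeoutMs", "isAvailable"] as const;
  const missing = required.filter((field) => raw[field] === undefined);
  return missing.length > 0
    ? { ok: false, error: `classic plugin missing ${missing.join(", ")}` }
    : { ok: true, resolveSource: "author" };
}
```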

Two orthogonal improvements to the resolver's assembly pipeline. Both exported from resolver.ts so they're testable in isolation, and both run in the main resolveRichPacket() flow before the final priority sort.

1. PER-PROVIDER BUDGET ENFORCEMENT (enforcePerProviderBudget)

Providers are SUPPOSED to self-truncate their content to 'tokenBudget', but a bad plugin or a non-conforming MCP server shouldn't be able to spend our entire total budget on one section. New helper truncates each result to the provider's declared budget BEFORE assembly.
- Under-budget content passes through unchanged (zero-cost)
- Over-budget content is line-truncated (never cut mid-word)
- Edge: first line alone > budget -> hard-cap characters with marker

Default budget for unknown/missing providers is 200 tokens (matches the MCP-config default from item #1).

2. MISTAKES-BOOST RERANKING (boostByMistakes)

If the engram:mistakes provider fires for this file, scan OTHER providers' content for substring matches against mistake labels (extracted from the ' ! <label> (flagged <age>)' format). Matching results get confidence * 1.5 (capped at 1.0). Runs BEFORE the priority sort, but the secondary sort is now (priority asc, confidence desc), so boost breaks ties WITHIN a priority tier without overriding priority across tiers.
- Case-insensitive matching (labels normalized to lowercase)
- Does NOT boost the mistakes provider itself
- No-op if no mistakes are reported for this file (common case)

Examples of the intended effect:
- An engram:git commit message mentioning a known-broken function sorts UP within the git tier
- A mempalace decision that references a mistaken architectural choice bubbles ahead of unrelated decisions

TESTS (+10 cases in tests/providers/resolver.test.ts)

enforcePerProviderBudget:
- Under-budget untouched
- Over-budget truncated by line with marker
- Hard-cap when first line alone exceeds budget
- Default 200 tokens when provider not found

boostByMistakes:
- No-op when no mistakes provider in set
- Matching substring boosts confidence 0.6 -> 0.9
- Cap enforced (0.8 * 1.5 = 1.2 -> 1.0)
- Non-matching results left alone
- Mistakes provider itself is never self-boosted
- Case-insensitive matching across upper/lower case variations

Full suite: 815 -> 825 tests (+10), all passing. TypeScript clean.

V3.0 PROGRESS: 8 of 12 scope items done.
✅ #1 foundation ✅ #2 ✅ #3 ✅ #6 ✅ #7 ✅ #9 ✅ #10 ✅ #11
Remaining: #4 Auto-Memory (blocked on MEMORY.md fixture), #5 SSE streaming, #8 pre-mortem warnings, #12 MCP Registry submit, and #1 completion (HTTP transport + real-server integration tests).
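To make the mistakes-boost in that resolver commit message concrete, here is a small sketch of the rerank and the two-key sort. It follows the rules stated there (1.5x boost capped at 1.0, case-insensitive substring match, no self-boost, sort by priority ascending then confidence descending), but the ProviderResult fields, the label-parsing regex, and the sortForAssembly helper are assumptions; the real boostByMistakes in resolver.ts may be structured differently. Budget enforcement is left out.

```typescript
// Hypothetical result shape; engram's real ProviderResult may differ.
interface ProviderResult {
  provider: string;
  priority: number; // lower sorts first
  confidence: number; // 0..1
  content: string;
}

const MISTAKES_PROVIDER = "engram:mistakes";

// Pulls labels out of lines shaped like "  ! <label> (flagged <age>)".
function extractMistakeLabels(content: string): string[] {
  const labels: string[] = [];
  for (const line of content.split("\n")) {
    const match = /^\s*!\s+(.+?)\s+\(flagged [^)]*\)\s*$/.exec(line);
    const label = match?.[1];
    if (label) labels.push(label.toLowerCase());
  }
  return labels;
}

function boostByMistakes(results: ProviderResult[]): ProviderResult[] {
  const mistakes = results.find((r) => r.provider === MISTAKES_PROVIDER);
  if (!mistakes) return results; // common case: nothing flagged for this file
  const labels = extractMistakeLabels(mistakes.content);
  if (labels.length === 0) return results;
  return results.map((r) => {
    if (r.provider === MISTAKES_PROVIDER) return r; // never self-boost
    const haystack = r.content.toLowerCase();
    const hit = labels.some((label) => haystack.includes(label));
    return hit ? { ...r, confidence: Math.min(1, r.confidence * 1.5) } : r;
  });
}

// Priority decides across tiers; confidence only breaks ties within a tier.
function sortForAssembly(results: ProviderResult[]): ProviderResult[] {
  return [...results].sort(
    (a, b) => a.priority - b.priority || b.confidence - a.confidence,
  );
}
```

Because the boost only changes confidence, a boosted result can move ahead of its tier-mates but never jumps over a provider with a better priority, which is the property the commit message calls out.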

Opt-in warnings that fire BEFORE Claude Code runs an Edit/Write/Bash tool call against code previously flagged as a mistake. Fully gated via the ENGRAM_MISTAKE_GUARD env var: zero overhead when unset.

MODES
  unset / '0' → off (default: no database read, no overhead)
  '1' → permissive: tool proceeds, a warning is prepended to any additionalContext the primary handler emits
  '2' → strict: tool is denied with the warning as reason

Hooks Edit/Write/Bash only. Read already surfaces mistakes via the engram:mistakes context provider, so duplicating at tool-call time would be noise.

MATCHING

Edit/Write:
- Normalize tool_input.file_path to relative POSIX vs projectRoot
- Indexed lookup via store.getNodesByFile() (uses idx_nodes_source_file)
- Dedupe by node id when both relative + raw shapes are stored

Bash:
- Substring match on mistake.metadata.commandPattern (length >2)
- Fallback: substring match on mistake.sourceFile (length >3 to avoid accidentally matching single-char paths like 'a')
- Full-table scan of mistakes (unavoidable: no file axis to index on). Bounded by project size; only runs when the guard is explicitly on.

BI-TEMPORAL FILTER (item #7 interop)

Mistakes with validUntil <= now are suppressed, since they refer to code that has since been refactored away. Prevents stale-warning fatigue.

INTEGRATION

New file: src/intercept/handlers/mistake-guard.ts
- currentGuardMode(): reads the env var at call time, not module load, so tests can flip between cases cleanly
- findMatchingMistakesAsync(target, projectRoot): the matcher
- formatWarning(matches): human-readable warning block
- applyMistakeGuard(rawResult, payload, kind): wrapping fn that augments additionalContext (permissive) or overrides to deny (strict)

src/intercept/dispatch.ts wiring: after runHandler() returns for Edit/Write/Bash, pass the result through applyMistakeGuard() before returning. Two-line diff. Doesn't touch the existing handlers.

SAFETY

Every code path in mistake-guard is wrapped in try/catch with a null return. A guard failure MUST NEVER break the primary handler. If the store open fails, the env var is wrong, or the payload is malformed, the guard silently returns the raw result unchanged.

TESTS (+21 cases in tests/intercept/handlers/mistake-guard.test.ts)
- currentGuardMode: off/permissive/strict recognition, bogus values coerced to off
- formatWarning: empty-match string, single-match header, >5-match collapse with '… and N more'
- findMatchingMistakesAsync (file): rel path, abs path normalization, no-match, validUntil filter
- findMatchingMistakesAsync (bash): commandPattern substring match, sourceFile-in-command match, case-insensitive, too-short pattern guard, validUntil filter
- applyMistakeGuard: mode=off no-op, permissive augments additional context, permissive no-match no-op, strict denies with reason, permissive from passthrough emits fresh allow-with-warning

Full suite: 825 -> 846 tests (+21), all passing. TypeScript clean.

V3.0 PROGRESS: 9 of 12 scope items
✅ #1 foundation ✅ #2 ✅ #3 ✅ #6 ✅ #7 ✅ #8 ✅ #9 ✅ #10 ✅ #11
Remaining:
- #1 completion (HTTP transport + real-server integration tests)
- #4 Anthropic Auto-Memory bridge (blocked: needs MEMORY.md fixture)
- #5 SSE streaming for rich packet assembly
- #12 Official MCP Registry submission (post-ship)
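The mode handling from the mistake-guard commit message can be sketched as below. The mode values, the validUntil rule, and the permissive/strict behavior are taken from the message; the HandlerResult shape, the function signatures, and passing the environment in as an argument are simplifications of my own (the commit says the real currentGuardMode() reads the env var itself at call time, and applyMistakeGuard takes different parameters).

```typescript
type GuardMode = "off" | "permissive" | "strict";
type Env = Record<string, string | undefined>;

// Unset, '0', and any unrecognized value all mean "off".
function currentGuardMode(env: Env): GuardMode {
  const raw = env["ENGRAM_MISTAKE_GUARD"];
  if (raw === "1") return "permissive";
  if (raw === "2") return "strict";
  return "off";
}

// Mistakes with validUntil <= now refer to code that has been refactored away.
function isStillValid(validUntil: number | undefined, now: number): boolean {
  return validUntil === undefined || validUntil > now;
}

// Hypothetical handler result; the real hook payload has more fields.
interface HandlerResult {
  decision: "allow" | "deny";
  additionalContext?: string;
  reason?: string;
}

// warning is null when no mistake matched the tool call.
function applyGuardSketch(
  result: HandlerResult,
  warning: string | null,
  mode: GuardMode,
): HandlerResult {
  if (mode === "off" || warning === null) return result;
  if (mode === "strict") return { decision: "deny", reason: warning };
  const context = result.additionalContext
    ? `${warning}\n${result.additionalContext}`
    : warning;
  return { ...result, additionalContext: context };
}
```

Returning the original result object untouched in the off and no-match cases mirrors the safety rule in the commit: the guard only ever adds to or overrides the primary handler's output, and otherwise stays out of the way.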

Adds progressive delivery for rich packet assembly. Instead of blocking on Promise.allSettled (which waits for the slowest provider, e.g. Serena cold-start or mempalace ChromaDB warmup), clients can stream results as they arrive and render each section immediately.

NEW: resolveRichPacketStreaming generator (src/providers/resolver.ts)

AsyncGenerator<StreamEvent> that yields:
  { type: 'provider', result: ProviderResult } as each resolves
  { type: 'done', providerCount, durationMs } with final totals

Order = ARRIVAL order (fast providers first). Consumers who want priority order use the non-streaming resolveRichPacket(), which applies full priority + mistakes-boost + budget logic.

Implementation: fan-out all providers, funnel outcomes into a FIFO queue + wake-on-arrival pattern. No extra deps. Per-provider timeouts preserved (same resolveWithTimeout path as non-streaming).

NEW: /context/stream SSE endpoint (src/server/http.ts)

GET /context/stream?file=<relative-path> (auth required). Emits one SSE frame per StreamEvent. Frame shape matches MCP SEP-1699 (SSE resumption):

  id: 0
  event: provider
  data: {"provider":"engram:ast", …}

  id: 1
  event: provider
  data: {"provider":"engram:mistakes", …}

  id: N
  event: done
  data: {"providerCount":N,"durationMs":347}

Supports the Last-Event-ID header: clients reconnecting via 'Last-Event-ID: 3' skip events 0-3 and pick up from 4. Useful for long-running sessions that drop WiFi mid-stream without losing context.

Client-disconnect aborts the stream cleanly (req.close handler short-circuits the generator loop).

TESTS (+6 new)

resolver.test.ts (+2):
- Smoke: streaming generator terminates with a 'done' event for any project (no hang, no runaway)
- Arrival-order invariant: toy generator mirrors production shape, verifies fast results yield before slow ones

server/http.test.ts (+4):
- Missing 'file' param returns 400
- Valid request returns 200 + text/event-stream + ends with 'done'
- Every frame carries an 'id:' header (SEP-1699 resumption)
- Auth required: unauthenticated returns 401

Full suite: 870 -> 876 tests (+6), all passing. TypeScript clean.

V3.0 PROGRESS: 11 of 12 scope items done
✅ #1 foundation ✅ #2 ✅ #3 ✅ #4 ✅ #5 ✅ #6 ✅ #7 ✅ #8 ✅ #9 ✅ #10 ✅ #11
Only remaining in-scope work:
- #12 MCP Registry submission (~2h, post-ship only)

Plus item #1 completion (HTTP transport + minimal MCP server fixture for integration tests), which is technically part of #1. #1 shipped its foundation as c719591; the HTTP transport path was explicitly deferred until this SSE work landed. Now it can.
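The "FIFO queue + wake-on-arrival" pattern named in the streaming commit message can be sketched as a standalone async generator. This is a toy in the spirit of the arrival-order test, not resolveRichPacketStreaming itself: the ProviderResult fields and the streamInArrivalOrder name are assumptions, failed providers are simply skipped, providerCount here counts delivered results, and per-provider timeouts are left out.

```typescript
// Hypothetical result shape for illustration.
interface ProviderResult {
  provider: string;
  content: string;
}

type StreamEvent =
  | { type: "provider"; result: ProviderResult }
  | { type: "done"; providerCount: number; durationMs: number };

async function* streamInArrivalOrder(
  tasks: Array<Promise<ProviderResult>>,
): AsyncGenerator<StreamEvent> {
  const startedAt = Date.now();
  const queue: ProviderResult[] = []; // FIFO of results not yet yielded
  let settled = 0;
  let wake: (() => void) | null = null;

  const onSettled = (): void => {
    settled += 1;
    const resume = wake;
    wake = null;
    if (resume) resume(); // wake the consumer loop if it is waiting
  };

  // Fan out: every task pushes into the queue when it resolves.
  for (const task of tasks) {
    task
      .then(
        (result) => {
          queue.push(result);
        },
        () => {
          // Failed provider: skipped in this sketch.
        },
      )
      .then(onSettled);
  }

  let delivered = 0;
  while (true) {
    let next = queue.shift();
    while (next !== undefined) {
      delivered += 1;
      yield { type: "provider", result: next };
      next = queue.shift();
    }
    if (settled >= tasks.length) break;
    // Sleep until the next task settles.
    await new Promise<void>((resolve) => {
      wake = resolve;
    });
  }
  yield {
    type: "done",
    providerCount: delivered,
    durationMs: Date.now() - startedAt,
  };
}
```

A consumer iterates with for await and renders each provider event as it arrives; an SSE endpoint would assign each event an incrementing id before writing the frame.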

Extends engram to index Claude Code plugins, agents, hooks, and MCP servers as concept nodes with subkind discriminators. Follows Nick's schema discipline (no new NodeKinds) and silent-failure conventions. Two new miners (plugin-miner, config-miner) plus a shared stack-detect utility. Design approved for implementation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

13-task TDD plan covering stack-detect utility, plugin-miner with relevance scoring, config-miner for hooks and MCP servers, schema extensions for new edge relations, and pipeline integration. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>


…n withSkills Reuses the existing options.withSkills flag so ecosystem indexing is opt-in by the same mechanism as the skills-miner. Keeps stress tests and empty-project tests isolated from the real ~/.claude directory. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>


shahe-dev · 2026-04-25T17:34:54Z

Rebased onto current main (5567d61, v3.0.2). Test count is now 906 passing / 6 pre-existing failures in tests/watcher.test.ts — same 6 failures present on upstream main independent of this branch.

Re your "subtle interaction" question between provided_by and triggered_by: I checked empirically by running the rebased branch against my real ~/.claude tree. The two relations sit on the same skill nodes without conflict — they answer different questions:

triggered_by: keyword → skill (what summons it)
provided_by: skill → plugin (who ships it)

Live counts on a real session: 201 provided_by + 85 triggered_by + 66 relevant_to, with 173 skills, 23 plugins, 33 agents, 4 hooks, 2 mcp_servers indexed. No edge collisions; schema migration to v8 runs clean.

Re splitting into two PRs: I'd prefer to keep it as one — commits are already topical (4 plugin-miner, 2 config-miner, 1 wiring, 2 docs) and config-miner is small enough that splitting feels like overhead. Happy to split if you'd rather; just say the word.

CI needs your maintainer approval to run on the new push when you have a moment.

One thing flagged but not blocking: v3.0.0 added valid_until and invalidated_by_commit columns for bi-temporal mistake validity. My miners write metadata.subkind cleanly without touching those new columns, so they're harmless — but if you want my plugin/agent/hook/mcp nodes to support pre-mortem invalidation too, that's a follow-up PR worth scoping.

shahe-dev and others added 9 commits April 25, 2026 21:09

feat(schema): add provided_by and relevant_to edge relations

eb531ec

Additive change to EdgeRelation union for the ecosystem miners. No existing code touches these new values; rolled out in later tasks. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

feat(graph): add stack-detect utility for project language/framework detection

1fbf383

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

feat(plugin-miner): add relevance scoring with stack awareness

5344baf

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

feat(plugin-miner): index plugins, skills, and agents with relevance scoring

366bf6d

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

feat(config-miner): index hooks and MCP servers from Claude Code settings

5806d49

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

docs: changelog entry for ecosystem miners

3379651

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

shahe-dev force-pushed the feat/ecosystem-miners branch from 2b5aa6e to 3379651 · April 25, 2026 17:34

shahe-dev wants to merge 9 commits into NickCirv:main from shahe-dev:feat/ecosystem-miners
