fix: prevent OOM crashes during init on large codebases #6

Open: mechtar-ru wants to merge 7 commits into NickCirv:main from mechtar-ru:main

Conversation

@mechtar-ru

Summary

Two fixes to prevent out-of-memory crashes when running engram init on large projects:

  1. MAX_DEPTH limit in extractDirectory (ast-miner.ts) — prevents stack overflow on deep directory trees
  2. MAX_FILES_PER_COMMIT limit in mineGitHistory (git-miner.ts) — prevents O(n²) memory explosion on commits with many files

Changes

ast-miner.ts

  • Added MAX_DEPTH = 100 depth limit to recursive walk() function
  • Wrapped readdirSync in try-catch to skip unreadable directories
  • Added .engramignore support for custom exclusions
  • Expanded default exclusions: target, .venv, .next, .nuxt, .output, coverage, .turbo, .cache
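
The ast-miner changes above can be sketched roughly as follows. This is a minimal illustration, not the code in ast-miner.ts: `collectFiles` and `SKIP_DIRS` are hypothetical names, only `MAX_DEPTH = 100` and the directory names come from the PR description, and `.engramignore` handling is left out.

```typescript
import { readdirSync, type Dirent } from "node:fs";
import { join } from "node:path";

const MAX_DEPTH = 100; // value from the PR description

// Hypothetical skip set, filled with the exclusions this PR adds.
const SKIP_DIRS = new Set([
  "target", ".venv", ".next", ".nuxt", ".output", "coverage", ".turbo", ".cache",
]);

// Hypothetical helper: collect file paths under `root`, bounded by depth.
export function collectFiles(root: string, maxDepth: number = MAX_DEPTH): string[] {
  const files: string[] = [];

  function walk(dir: string, depth: number): void {
    if (depth > maxDepth) return; // stop descending instead of recursing further

    let entries: Dirent[];
    try {
      // Pinning the encoding keeps entry.name typed as string.
      entries = readdirSync(dir, { withFileTypes: true, encoding: "utf-8" });
    } catch {
      return; // unreadable or missing directory: skip it
    }

    for (const entry of entries) {
      const full = join(dir, entry.name);
      if (entry.isDirectory()) {
        if (!SKIP_DIRS.has(entry.name)) walk(full, depth + 1);
      } else if (entry.isFile()) {
        files.push(full);
      }
    }
  }

  walk(root, 0);
  return files;
}
```

Returning early on a read error means one unreadable directory costs a skipped subtree rather than a failed init.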

git-miner.ts

  • Added MAX_FILES_PER_COMMIT = 50 to prevent O(n²) co-change pair explosion
  • Skip dist/, build/ directories from co-change analysis
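
To make the O(n²) growth concrete: a commit touching n files yields n(n - 1)/2 unordered pairs, so 130 files give 8,385 pairs. The sketch below is an assumption-laden illustration, not the code in git-miner.ts: `pairCount` and `countCoChanges` are hypothetical names, and skipping oversized commits entirely is an assumption (the PR may truncate them instead).

```typescript
const MAX_FILES_PER_COMMIT = 50; // value from the PR description

// Number of unordered file pairs in a commit touching n files.
export function pairCount(n: number): number {
  return (n * (n - 1)) / 2;
}

// Hypothetical helper: count how often two files change in the same commit.
// Assumption: commits above the limit are skipped entirely.
export function countCoChanges(
  commits: string[][],
  maxFiles: number = MAX_FILES_PER_COMMIT,
): Map<string, number> {
  const counts = new Map<string, number>();
  for (const files of commits) {
    if (files.length > maxFiles) continue; // bulk commits carry little co-change signal
    const sorted = [...files].sort(); // stable key order: (a, b) and (b, a) collapse
    for (let i = 0; i < sorted.length; i++) {
      for (let j = i + 1; j < sorted.length; j++) {
        const key = `${sorted[i]} <-> ${sorted[j]}`;
        counts.set(key, (counts.get(key) ?? 0) + 1);
      }
    }
  }
  return counts;
}
```

With the cap at 50, the worst case per commit is `pairCount(50)`, which is 1,225 pairs.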

Testing

Tested on the Axolotl project (2.2 GB, 34K files): engram init now completes successfully in ~600 ms.

Evgeniy Tikhomirov added 2 commits April 17, 2026 06:43
- Add MAX_DEPTH=100 to prevent stack overflow on deep directory trees
- Wrap readdirSync in try-catch to skip unreadable directories
- Add .engramignore support for custom exclusions
- Expand default exclusions (target, .venv, .next, .nuxt, .output, coverage, .turbo, .cache)
- Add MAX_FILES_PER_COMMIT=50 to prevent O(n²) explosion on commits with many files
- Skip build/dist directories to reduce noise
- Axolotl project has commits with 130 files which caused 8,385+ co-change pairs
@NickCirv
Owner

Thanks for the diagnosis + PR — the git-miner O(n²) co-change explosion was spot-on, and the expanded IGNORED_DIRS is gold for modern monorepos.

CI is failing on all 4 matrix configs with the same 5 TypeScript errors in src/miners/ast-miner.ts:

error TS2322: Type 'Dirent<string>[]' is not assignable to type 'Dirent<NonSharedBuffer>[]'.
error TS2339: Property 'startsWith' does not exist on type 'NonSharedBuffer'.

This is the Node 25 strict-types issue that was already fixed for skills-miner.ts in v0.5.3 — readdirSync now returns Dirent<NonSharedBuffer> when encoding is omitted, which breaks all downstream string operations on .name.

One-line fix: pin encoding on the readdirSync call (same pattern as skills-miner.ts:226-229):

const entries = readdirSync(dir, { withFileTypes: true, encoding: "utf-8" });

Push that and CI should go green.

One other ask before merge: would you add a quick regression test? Two would be enough:

  1. tests/git-miner.test.ts — fixture git log with a 51-file commit, assert co-change pairs don't grow unbounded.
  2. tests/ast-miner.test.ts — fixture dir nested 101 levels, assert no throw + walk stops.

The fix is correct (manual verification on Axolotl plus the root-cause analysis); I just want the limits locked in against future regressions. Happy to merge as soon as CI is green and tests are in.

On scope: .engramignore support is a welcome bonus — no need to split that out.

One note on our earlier comment on #5 asking you for du -sh ~/.claude/skills/ — wrong question on our part; you found the real cause faster by profiling directly. Nice work.

NickCirv added a commit that referenced this pull request Apr 21, 2026
Captures the brainstorming session outcome for the three-release plan:

- v2.1 "Reliability + Zero-Friction Install" — close the bleeding,
  merge contributor PRs (#6, #13), fix #11, ship engram update /
  engram doctor / engram setup, close #14 via Bash PostTool parser.
- v2.2 "Spine" — integrate Serena as an engram provider via a new
  reusable MCP-client subsystem. Engram becomes the orchestrator,
  not a fighter of semantic-search tools.
- v3.0 "Landmines" — mistakes-as-moat expansion + R2 repositioning
  ("the context tool that remembers what broke"). Keep the name,
  rebrand the tagline.

Strategic choices recorded:
- Trilogy (alpha) over mega-release (beta) or split (gamma)
- R2 (keep name, rebrand tagline) over R1 (keep everything) or R3
  (rename, prohibitive cost)
- Update UX: option A — passive notify + manual install, zero
  telemetry, ENGRAM_NO_UPDATE_CHECK + $CI opt-out

Grounded in measured research:
- npm downloads 1.3K/week, 10/day organic baseline
- r/LocalLLaMA post ratioed (0.44) due to name collision with 4
  other "Engram" projects launched Mar-Apr 2026
- Serena just hit stable with published evals — credible complement
- 2 active external contributors writing substantive PRs

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
NickCirv added a commit that referenced this pull request Apr 24, 2026
Follow-up to cherry-picks b4f7944 (PR #6 commit 1) + 411ad13 (PR #6 commit 2)
from @mechtar-ru.

Deletes redundant DEFAULT_EXCLUDED_DIRS (15 entries) + loadEngramIgnore()
(16 lines) that lived in parallel with the canonical DEFAULT_SKIP_DIRS +
loadIgnorePatterns() pair (the latter shipped in v2.1.0 via PR #13).
Both pairs implemented the same .engramignore feature — keeping only
the v2.1 canonical pair keeps one source of truth.

Also tightens entries typing: 'let entries: Dirent[]' in extractDirectory
(ReturnType<typeof readdirSync> resolves to the string[] default overload,
not the Dirent[] shape actually returned with { withFileTypes: true }).

All 784 tests pass. TypeScript clean.

Closes issue #5 (via PR #6 content: MAX_DEPTH=100 + MAX_FILES_PER_COMMIT=50
+ .engramignore support + expanded default skip dirs — the OOM crash on
init for 2.2GB/34K-file projects like Axolotl is fixed).
NickCirv added a commit that referenced this pull request Apr 24, 2026
First land of the MCP-client subsystem (item #1 from the v3.0 Spine
implementation plan). Any MCP server can now become an engramx Context
Spine provider via ~/.engram/mcp-providers.json — no code changes needed.

WHAT SHIPS IN THIS COMMIT

src/providers/mcp-config.ts
  - McpProviderConfig type (stdio + http transports, tools array with
    arg templates, tokenBudget, timeoutMs, cacheTtlSec, priority, enabled)
  - loadMcpConfigs(): reads ~/.engram/mcp-providers.json (path overridable
    via ENGRAM_MCP_CONFIG_PATH for tests). Per-entry validation errors
    are COLLECTED not thrown — one bad provider never stops the rest.
  - validateProviderConfig(): strict structural validator with precise
    error messages (tells you which field on which entry failed)
  - applyArgTemplate(): substitutes {filePath}/{projectRoot}/{imports}/
    {fileBasename} tokens into tool args. Unknown tokens pass through.
  - Defaults: tokenBudget=200, timeoutMs=2000, cacheTtlSec=3600, priority
    from array order. Sensible for every MCP server we've seen.
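
The token substitution described for applyArgTemplate() could look roughly like the sketch below. It is not the shipped implementation: `applyTemplate` and `TemplateContext` are hypothetical names, and treating every context value as a plain string (including whatever {imports} expands to) is an assumption.

```typescript
// Hypothetical context shape: token name -> replacement string.
type TemplateContext = Record<string, string>;

// Returns a new args object; the input object is never mutated.
export function applyTemplate(
  args: Record<string, unknown>,
  ctx: TemplateContext,
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(args)) {
    if (typeof value !== "string") {
      out[key] = value; // non-string args pass through unchanged
      continue;
    }
    out[key] = value.replace(/\{(\w+)\}/g, (match: string, token: string): string => {
      const known = Object.prototype.hasOwnProperty.call(ctx, token);
      const replacement = known ? ctx[token] : undefined;
      return replacement ?? match; // unknown tokens are left as written
    });
  }
  return out;
}
```

For example, `applyTemplate({ path: "{filePath}" }, { filePath: "src/a.ts" })` yields an object whose `path` is `"src/a.ts"`.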

src/providers/mcp-client.ts
  - McpClientWrapper — thin wrapper on @modelcontextprotocol/sdk v1.29
    Client + StdioClientTransport. Session-lifetime connection reuse.
    Lazy connect (no process spawned until first resolve). Error backoff
    (30s) prevents thrashing if the server crashes on startup.
  - createMcpProvider(config) — factory returning a ContextProvider that
    plugs into the existing resolver without modification. Tier 2 (matches
    context7 / obsidian semantics). Tools called in parallel per Read.
  - Budget enforcement + line-wise truncation (never mid-word).
  - Graceful shutdown on SIGTERM / SIGINT / beforeExit.
  - HTTP transport declared but deferred — throws 'not yet implemented'
    until item #5 SSE streaming lands with the Host/Origin hardening work.

src/providers/resolver.ts
  - getMcpProviders(): loads MCP configs and wraps them. Cached for session
    lifetime. Test hook _resetMcpProvidersCache() for forced reload.
  - getAllProviders(): now merges BUILTINS + plugins + MCP providers
    (all deduped against built-in names so users can't shadow core).
  - Parse failures emit a single-line stderr warning (per bad entry) —
    visible to users without crashing their session.

package.json
  - Adds @modelcontextprotocol/sdk@1.29.0 (4.3MB unpacked, pure JS,
    no native deps). Pinned behind a thin ProviderClient surface so
    migration to SDK v2 (alpha 2026-04) is a one-file swap later.

TESTS

tests/providers/mcp-config.test.ts — 24 cases covering:
  - File-doesn't-exist → empty configs
  - Valid stdio + http shapes round-trip
  - Invalid JSON reported as single failure
  - Bad entries skipped, good ones kept
  - Duplicate names: first wins
  - All validation rules (empty name/label, bad transport, confidence
    range, negative numeric fields, missing command/url, invalid URL)
  - Arg-template substitution: all tokens, unknown pass-through, non-
    strings unchanged, basename fallback, input-immutability

Full suite: 784 → 808 tests (+24), all passing. TypeScript clean.

WHAT THIS COMMIT DOES NOT DO (follow-up within item #1)

  - HTTP transport implementation — waits on item #5 SSE streaming for
    shared Host/Origin validation + resumable streams
  - Integration tests that actually spawn a real MCP server (needs
    tests/fixtures/minimal-mcp-server.mjs — next commit)
  - Tool-list caching — currently we call tools directly without
    listTools() first; the SDK may cache internally but we should
    verify + explicit-cache if not

With this in place, item #2 (plugin contract v2 — mcpConfig auto-wrap)
becomes a 2-day extension: plugin-loader.ts detects .mcpConfig on a
plugin and auto-calls createMcpProvider(). Item #6 (Serena provider)
becomes a 10-line ~/.engram/plugins/serena.mjs once the mcpConfig path
lands.
NickCirv added a commit that referenced this pull request Apr 24, 2026
…rena reference plugin

ITEM #2 — Plugin contract v2

Extends ContextProviderPlugin so plugin authors can declare an MCP server
via 'mcpConfig' and skip writing resolve()/isAvailable() by hand. The
loader auto-wraps via createMcpProvider() from item #1. Classic plugins
(custom resolve()) continue to work unchanged — if both fields are
present, the author's resolve() wins (they opted into custom logic).

Type changes (src/providers/types.ts):
  - ContextProviderPlugin stays strict (extends ContextProvider fully) —
    this is the POST-VALIDATION shape the resolver consumes
  - NEW: RawPluginShape — the pre-validation shape a plugin-file author
    writes in .mjs. tier/tokenBudget/timeoutMs/resolve/isAvailable all
    optional (loader fills from factory when mcpConfig present)

Loader changes (src/providers/plugin-loader.ts):
  - validatePlugin() branches on 'has mcpConfig vs. has resolve()'
  - name/label/version always required
  - Classic path: tier/tokenBudget/timeoutMs/isAvailable required
  - mcpConfig path: config validated via validateProviderConfig(),
    merged with plugin fields (author overrides win over factory defaults)
  - One clear error per rejection — 'invalid mcpConfig: <reason>' tells
    you exactly which sub-field on which plugin is broken

Tests (+7 cases in tests/providers/plugin-loader.test.ts):
  - mcpConfig-only plugin auto-wraps resolve + isAvailable
  - Plugin with neither resolve nor mcpConfig rejected (clear message)
  - Invalid mcpConfig rejected (bad command, bad http url)
  - Custom resolve wins over mcpConfig when both present
  - Plugin tokenBudget override wins over factory default
  - Missing version rejected even for mcpConfig plugins

ITEM #6 — Serena plugin reference

docs/plugins/examples/serena-plugin.mjs (~60 lines incl. docs) — the
full Serena (oraios/serena) wrapper as an mcpConfig-only plugin. Install
is cp + enable. Thanks to item #2, NO custom transport code needed.

docs/plugins/examples/static-context-plugin.mjs — the classic-path
reference showing a tier 1 plugin with hand-rolled resolve() for users
who just want to inject a fixed string on every Read.

docs/plugins/README.md — author-facing guide. Shape 1 (MCP-backed),
Shape 2 (classic), template tokens, safety guarantees, debugging
checklist, publishing notes.

FULL SUITE

808 -> 815 tests (+7), all passing. TypeScript clean, lint clean.

V3.0 PROGRESS

Done: #1 foundation, #2, #6, #7, #9, #10, #11 = 7 of 12 scope items.
Next: #3 budget-weighted resolver + mistakes-boost (~2-3d).
NickCirv added a commit that referenced this pull request Apr 24, 2026
Two orthogonal improvements to the resolver's assembly pipeline. Both
exported from resolver.ts so they're testable in isolation, and both
run in the main resolveRichPacket() flow before the final priority sort.

1. PER-PROVIDER BUDGET ENFORCEMENT (enforcePerProviderBudget)

Providers are SUPPOSED to self-truncate their content to 'tokenBudget',
but a bad plugin or a non-conforming MCP server shouldn't be able to
spend our entire total budget on one section. New helper truncates
each result to the provider's declared budget BEFORE assembly.

- Under-budget content passes through unchanged (zero-cost)
- Over-budget content is line-truncated (never cut mid-word)
- Edge: first line alone > budget -> hard-cap characters with marker

Default budget for unknown/missing providers is 200 tokens (matches
the MCP-config default from item #1).
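
A rough sketch of the line-wise truncation described above. The names, the marker text, and the 4-characters-per-token estimate are all assumptions made for illustration; the commit does not say how engram counts tokens.

```typescript
const CHARS_PER_TOKEN = 4; // assumption: crude estimate, not engram's tokenizer
const MARKER = "[truncated]"; // hypothetical marker text
const DEFAULT_TOKEN_BUDGET = 200; // default stated in the commit message

// Hypothetical helper: cut content to a token budget on line boundaries.
export function truncateToBudget(
  content: string,
  tokenBudget: number = DEFAULT_TOKEN_BUDGET,
): string {
  const maxChars = tokenBudget * CHARS_PER_TOKEN;
  if (content.length <= maxChars) return content; // under budget: unchanged

  const kept: string[] = [];
  let used = 0;
  for (const line of content.split("\n")) {
    const cost = line.length + (kept.length > 0 ? 1 : 0); // +1 for the joining newline
    if (used + cost > maxChars) break;
    kept.push(line);
    used += cost;
  }

  if (kept.length === 0) {
    // The first line alone exceeds the budget: hard-cap by characters.
    return content.slice(0, maxChars) + MARKER;
  }
  return kept.join("\n") + "\n" + MARKER;
}
```

In this sketch the marker is appended after the cut, so the returned string can exceed the budget by the marker's length.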

2. MISTAKES-BOOST RERANKING (boostByMistakes)

If the engram:mistakes provider fires for this file, scan OTHER
providers' content for substring matches against mistake labels
(extracted from the '  ! <label> (flagged <age>)' format). Matching
results get confidence * 1.5 (capped at 1.0).

Runs BEFORE the priority sort, but the secondary sort is now
(priority asc, confidence desc) — so boost breaks ties WITHIN a
priority tier without overriding priority across tiers.

- Case-insensitive matching (labels normalized to lowercase)
- Does NOT boost the mistakes provider itself
- No-op if no mistakes are reported for this file (common case)
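
The boost and the two-key sort can be sketched as follows. The `Result` shape, the label regex, and `sortResults` are assumptions for illustration; only the provider name, the 1.5 multiplier, the 1.0 cap, and the label line format come from the commit message.

```typescript
// Hypothetical result shape for this sketch.
interface Result {
  provider: string;
  content: string;
  confidence: number;
  priority: number;
}

const MISTAKES_PROVIDER = "engram:mistakes";
// Matches lines of the form "  ! <label> (flagged <age>)".
const LABEL_RE = /^\s*!\s+(.+?)\s+\(flagged [^)]*\)\s*$/;

function extractLabels(mistakesContent: string): string[] {
  const labels: string[] = [];
  for (const line of mistakesContent.split("\n")) {
    const m = LABEL_RE.exec(line);
    if (m && m[1]) labels.push(m[1].toLowerCase());
  }
  return labels;
}

export function boostByMistakes(results: Result[]): Result[] {
  const mistakes = results.find((r) => r.provider === MISTAKES_PROVIDER);
  if (!mistakes) return results; // common case: nothing flagged for this file
  const labels = extractLabels(mistakes.content);
  if (labels.length === 0) return results;

  return results.map((r) => {
    if (r.provider === MISTAKES_PROVIDER) return r; // never self-boost
    const haystack = r.content.toLowerCase();
    const hit = labels.some((label) => haystack.includes(label));
    return hit ? { ...r, confidence: Math.min(1, r.confidence * 1.5) } : r;
  });
}

// Priority ascending first, confidence descending as the tie-breaker.
export function sortResults(results: Result[]): Result[] {
  return [...results].sort(
    (a, b) => a.priority - b.priority || b.confidence - a.confidence,
  );
}
```

Because confidence is only the secondary key, a boosted result moves up within its own priority tier but never jumps ahead of a higher-priority tier.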

Examples of the intended effect:
- An engram:git commit message mentioning a known-broken function
  sorts UP within the git tier
- A mempalace decision that references a mistaken architectural
  choice bubbles ahead of unrelated decisions

TESTS (+10 cases in tests/providers/resolver.test.ts)

enforcePerProviderBudget:
  - Under-budget untouched
  - Over-budget truncated by line with marker
  - Hard-cap when first line alone exceeds budget
  - Default 200 tokens when provider not found

boostByMistakes:
  - No-op when no mistakes provider in set
  - Matching substring boosts confidence 0.6 -> 0.9
  - Cap enforced (0.8 * 1.5 = 1.2 -> 1.0)
  - Non-matching results left alone
  - Mistakes provider itself is never self-boosted
  - Case-insensitive matching across upper/lower case variations

Full suite: 815 -> 825 tests (+10), all passing. TypeScript clean.

V3.0 PROGRESS: 8 of 12 scope items done.
  ✅ #1 foundation ✅ #2 ✅ #3 ✅ #6 ✅ #7 ✅ #9 ✅ #10 ✅ #11
  Remaining: #4 Auto-Memory (blocked on MEMORY.md fixture), #5 SSE
  streaming, #8 pre-mortem warnings, #12 MCP Registry submit, and
  #1 completion (HTTP transport + real-server integration tests).
NickCirv added a commit that referenced this pull request Apr 24, 2026
Opt-in warnings that fire BEFORE Claude Code runs an Edit/Write/Bash
tool call against code previously flagged as a mistake. Fully gated
via ENGRAM_MISTAKE_GUARD env var — zero overhead when unset.

MODES

  unset / '0' → off (default — no database read, no overhead)
  '1'         → permissive: tool proceeds, a warning is prepended
                to any additionalContext the primary handler emits
  '2'         → strict:     tool is denied with the warning as reason
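
The mode table above maps onto a small lookup. This is a sketch under assumptions: `guardModeFromEnv` and the `GuardMode` type are hypothetical names, and taking the environment as a parameter is a convenience for testing rather than a claim about the real signature.

```typescript
type GuardMode = "off" | "permissive" | "strict";

// Reads the variable at call time (not at module load), so callers
// and tests can change it between invocations.
export function guardModeFromEnv(
  env: Record<string, string | undefined> = process.env,
): GuardMode {
  switch (env["ENGRAM_MISTAKE_GUARD"]) {
    case "1":
      return "permissive";
    case "2":
      return "strict";
    default:
      return "off"; // unset, "0", and unrecognized values all disable the guard
  }
}
```

Falling through to "off" for anything unrecognized keeps a mistyped value from accidentally denying tool calls.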

Hooks Edit/Write/Bash only. Read already surfaces mistakes via the
engram:mistakes context provider — duplicating at tool-call time would
be noise.

MATCHING

Edit/Write:
  - Normalize tool_input.file_path to relative POSIX vs projectRoot
  - Indexed lookup via store.getNodesByFile() (uses idx_nodes_source_file)
  - Dedupe by node id when both relative + raw shapes are stored

Bash:
  - Substring match on mistake.metadata.commandPattern (length >2)
  - Fallback: substring match on mistake.sourceFile (length >3 to avoid
    accidentally matching single-char paths like 'a')
  - Full-table scan of mistakes (unavoidable — no file axis to index on).
    Bounded by project size; only runs when the guard is explicitly on.

BI-TEMPORAL FILTER (item #7 interop)

Mistakes with validUntil <= now are suppressed — they refer to code
that has since been refactored away. Prevents stale-warning fatigue.

INTEGRATION

New file: src/intercept/handlers/mistake-guard.ts
  - currentGuardMode() — reads env var at call time, not module load,
    so tests can flip between cases cleanly
  - findMatchingMistakesAsync(target, projectRoot) — the matcher
  - formatWarning(matches) — human-readable warning block
  - applyMistakeGuard(rawResult, payload, kind) — wrapping fn that
    augments additionalContext (permissive) or overrides to deny (strict)

src/intercept/dispatch.ts wiring: after runHandler() returns for Edit/
Write/Bash, pass result through applyMistakeGuard() before returning.
Two-line diff. Doesn't touch the existing handlers.

SAFETY

Every code path in mistake-guard is wrapped in try/catch with a null
return. A guard failure MUST NEVER break the primary handler. If the
store open fails, the env var is wrong, the payload is malformed —
guard silently returns the raw result unchanged.

TESTS (+21 cases in tests/intercept/handlers/mistake-guard.test.ts)

  - currentGuardMode: off/permissive/strict recognition, bogus values
    coerced to off
  - formatWarning: empty-match string, single-match header, >5-match
    collapse with '… and N more'
  - findMatchingMistakesAsync (file): rel path, abs path normalization,
    no-match, validUntil filter
  - findMatchingMistakesAsync (bash): commandPattern substring match,
    sourceFile-in-command match, case-insensitive, too-short pattern
    guard, validUntil filter
  - applyMistakeGuard: mode=off no-op, permissive augments additional
    context, permissive no-match no-op, strict denies with reason,
    permissive from passthrough emits fresh allow-with-warning

Full suite: 825 -> 846 tests (+21), all passing. TypeScript clean.

V3.0 PROGRESS — 9 of 12 scope items

  ✅ #1 foundation ✅ #2 ✅ #3 ✅ #6 ✅ #7 ✅ #8 ✅ #9 ✅ #10 ✅ #11

Remaining:
  - #1 completion (HTTP transport + real-server integration tests)
  - #4 Anthropic Auto-Memory bridge (blocked: needs MEMORY.md fixture)
  - #5 SSE streaming for rich packet assembly
  - #12 Official MCP Registry submission (post-ship)
NickCirv added a commit that referenced this pull request Apr 24, 2026
Adds progressive delivery for rich packet assembly. Instead of blocking
on Promise.allSettled (which waits for the slowest provider — Serena
cold-start, mempalace ChromaDB warmup), clients can stream results
as they arrive and render each section immediately.

NEW — resolveRichPacketStreaming generator (src/providers/resolver.ts)

AsyncGenerator<StreamEvent> that yields:
  { type: 'provider', result: ProviderResult }  — as each resolves
  { type: 'done', providerCount, durationMs }  — final totals

Order = ARRIVAL order (fast providers first). Consumers who want
priority order use the non-streaming resolveRichPacket() which applies
full priority + mistakes-boost + budget logic.

Implementation: fan-out all providers, funnel outcomes into a FIFO
queue + wake-on-arrival pattern. No extra deps. Per-provider timeouts
preserved (same resolveWithTimeout path as non-streaming).
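
The FIFO queue plus wake-on-arrival pattern can be sketched like this. It is a toy version under assumptions: `streamInArrivalOrder` is a hypothetical name, providers are modelled as plain promise-returning functions, and failed providers are silently dropped; per-provider timeouts are omitted.

```typescript
type StreamEvent<T> =
  | { type: "provider"; result: T }
  | { type: "done"; providerCount: number; durationMs: number };

export async function* streamInArrivalOrder<T>(
  tasks: Array<() => Promise<T>>,
): AsyncGenerator<StreamEvent<T>> {
  const started = Date.now();
  const queue: T[] = []; // FIFO of results that have arrived but not been yielded
  let pending = tasks.length;
  let wake: (() => void) | null = null;

  const onSettled = (): void => {
    pending -= 1;
    if (wake) {
      const resume = wake;
      wake = null;
      resume(); // wake the consumer loop if it is waiting
    }
  };

  // Fan out: start every task now; outcomes funnel into the queue.
  for (const task of tasks) {
    Promise.resolve()
      .then(task)
      .then(
        (value) => {
          queue.push(value);
          onSettled();
        },
        () => onSettled(), // assumption: failures are dropped in this sketch
      );
  }

  let delivered = 0;
  while (pending > 0 || queue.length > 0) {
    if (queue.length === 0) {
      // Nothing ready yet: sleep until the next task settles.
      await new Promise<void>((resolve) => {
        wake = resolve;
      });
      continue;
    }
    const result = queue.shift() as T;
    delivered += 1;
    yield { type: "provider", result };
  }
  yield { type: "done", providerCount: delivered, durationMs: Date.now() - started };
}
```

A consumer iterates with `for await (const event of streamInArrivalOrder(tasks))` and renders each `provider` event as it arrives, then stops on `done`.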

NEW — /context/stream SSE endpoint (src/server/http.ts)

GET /context/stream?file=<relative-path> (auth required).
Emits one SSE frame per StreamEvent. Frame shape matches MCP SEP-1699
(SSE resumption):

  id: 0
  event: provider
  data: {"provider":"engram:ast", …}

  id: 1
  event: provider
  data: {"provider":"engram:mistakes", …}

  id: N
  event: done
  data: {"providerCount":N,"durationMs":347}

Supports Last-Event-ID header — clients reconnecting via
'Last-Event-ID: 3' skip events 0-3 and pick up from 4. Useful for
long-running sessions that drop WiFi mid-stream without losing context.
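
A small sketch of the frame layout and the resume rule described above. `formatSseFrame` and `firstIdToSend` are hypothetical names, and falling back to a full replay on a malformed header is an assumption, not documented behavior.

```typescript
// Serialize one event as an SSE frame: id, event, data, then a blank line.
// JSON.stringify without an indent argument emits a single line, so the
// payload fits in one "data:" field.
export function formatSseFrame(id: number, event: string, data: unknown): string {
  return `id: ${id}\nevent: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Given the Last-Event-ID header, return the first event id to (re)send.
// Assumption: a missing or malformed header means "start from 0".
export function firstIdToSend(lastEventId: string | undefined): number {
  if (lastEventId === undefined) return 0;
  const parsed = Number.parseInt(lastEventId, 10);
  return Number.isNaN(parsed) || parsed < 0 ? 0 : parsed + 1;
}
```

With `Last-Event-ID: 3`, `firstIdToSend("3")` returns 4, matching the "skip events 0-3 and pick up from 4" behavior.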

Client-disconnect aborts the stream cleanly (req.close handler short-
circuits the generator loop).

TESTS (+6 new)

resolver.test.ts (+2):
  - Smoke: streaming generator terminates with a 'done' event for any
    project (no hang, no runaway)
  - Arrival-order invariant: toy generator mirrors production shape,
    verifies fast results yield before slow ones

server/http.test.ts (+4):
  - Missing 'file' param returns 400
  - Valid request returns 200 + text/event-stream + ends with 'done'
  - Every frame carries an 'id:' header (SEP-1699 resumption)
  - Auth required — unauthenticated returns 401

Full suite: 870 -> 876 tests (+6), all passing. TypeScript clean.

V3.0 PROGRESS — 11 of 12 scope items done

  ✅ #1 foundation ✅ #2 ✅ #3 ✅ #4 ✅ #5 ✅ #6 ✅ #7 ✅ #8 ✅ #9 ✅ #10 ✅ #11

Only remaining in-scope work:
  - #12 MCP Registry submission (~2h, post-ship only)

Plus item #1 completion (HTTP transport + minimal MCP server fixture
for integration tests) — technically part of #1 which shipped its
foundation as c719591; the HTTP transport path was explicitly deferred
until this SSE work landed. Now it can.
…ility

Add encoding: 'utf-8' to readdirSync calls to fix TypeScript errors in Node 25:
- Type 'Dirent<string>[]' not assignable to type 'Dirent<NonSharedBuffer>[]'
- Property 'startsWith' does not exist on type 'NonSharedBuffer'

Same pattern as skills-miner.ts:226-229
@mechtar-ru
Author

added tests
