Improve episodic memory ranking with reinforcement and decay #2

Merged
mcheemaa merged 2 commits into ghostwright:main from coe0718:memory-ranking-reinforcement-decay
Mar 31, 2026
coe0718 (Contributor) commented Mar 31, 2026

What Changed

This PR improves episodic memory retrieval without adding a second memory pipeline.

  • adds a small deterministic ranking helper for episodic memories
  • blends semantic match with importance, reinforcement from repeated access, and decay over time
  • increments episodic access_count on recall so reinforcement reflects actual usage
  • filters stale, low-signal episodic memories before they are injected into prompt context
  • adds focused tests for ranking behavior and context filtering
  • documents the updated episodic-memory behavior
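The blend described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in ranking.ts: the `EpisodeSignal` shape, the weights, the one-week half-life, the saturation count, and the choice to multiply the blended score by the decay factor are all assumptions made for the example.

```typescript
// Hypothetical input shape; the real episode type in the PR may differ.
interface EpisodeSignal {
  similarity: number; // semantic match, assumed to be in [0, 1]
  importance: number; // assumed to be in [0, 1]
  accessCount: number; // how many times the episode has been recalled
  ageHours: number; // hours since the episode was recorded or last accessed
}

// Exponential decay: the factor halves every `halfLifeHours`.
function decayFactor(ageHours: number, halfLifeHours: number): number {
  return Math.pow(0.5, Math.max(0, ageHours) / halfLifeHours);
}

// Log-saturating reinforcement: early recalls count most, capped at 1.
function reinforcement(accessCount: number, saturationCount = 20): number {
  const n = Math.max(0, accessCount);
  return Math.min(1, Math.log1p(n) / Math.log1p(saturationCount));
}

// Assumed weights, chosen only for illustration.
const WEIGHTS = { similarity: 0.6, importance: 0.25, reinforcement: 0.15 };

// Deterministic score: a weighted blend, scaled down as the episode ages.
function rankScore(e: EpisodeSignal, halfLifeHours = 24 * 7): number {
  const blended =
    WEIGHTS.similarity * e.similarity +
    WEIGHTS.importance * e.importance +
    WEIGHTS.reinforcement * reinforcement(e.accessCount);
  return blended * decayFactor(e.ageHours, halfLifeHours);
}
```

Because the function is pure and takes age as an input rather than reading the clock, the same inputs always yield the same score, which is what makes focused ranking tests straightforward.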

Why

This is a first focused step toward the memory strategy discussed in #1.

Phantom already has good capture and search primitives, but episodic retrieval was mostly recency-biased and prompt injection could still include old one-off memories with little durable value. This change keeps TypeScript in the plumbing lane: ranking, decay, retrieval, and prompt shaping. It does not move semantic reasoning out of the Agent SDK or judges.

How I Tested

Commands run:

/home/coemedia/.bun/bin/bun test src/memory/__tests__/ranking.test.ts
/home/coemedia/.bun/bin/bun test src/memory/__tests__/episodic.test.ts
/home/coemedia/.bun/bin/bun test src/memory/__tests__/context-builder.test.ts
bash -lc 'export PATH="$HOME/.bun/bin:$PATH" && /home/coemedia/.bun/bin/bun test'
/home/coemedia/.bun/bin/bun run typecheck
./node_modules/.bin/biome check src/memory/ranking.ts src/memory/episodic.ts src/memory/context-builder.ts src/memory/__tests__/ranking.test.ts src/memory/__tests__/episodic.test.ts src/memory/__tests__/context-builder.test.ts

Notes:

  • Full test suite passes once Bun is on PATH for subprocess-spawn tests.
  • Full repo lint currently reports pre-existing issues in src/mcp/__tests__/dynamic-handlers.test.ts, unrelated to this PR.

Checklist

  • Tests pass (bun test)
  • Lint passes (bun run lint)
  • Typecheck passes (bun run typecheck)
  • No secrets or .env files included
  • Files stay under 300 lines
  • No Cardinal Rule violations (TypeScript does plumbing only, the Agent SDK does reasoning)
  • No default exports or barrel files added

mcheemaa self-requested a review March 31, 2026 04:43

mcheemaa (Member) left a comment

This is a really strong first contribution. Thank you for spending time understanding the codebase before jumping into code - the issue discussion in #1, the phased approach, and the attention to the Cardinal Rule all show a deep read of the project.

A few things I appreciated:

  • The extraction to ranking.ts is the right refactor. The episodic store was accumulating scoring logic that belongs in a pure-function module. Clean separation.
  • Catching the updateAccessCounts bug (private method not incrementing access_count) is a sharp find. Reinforcement would never have worked from the recall path without this fix.
  • The math is solid. Exponential decay with configurable half-lives, log-saturating reinforcement, sensible bypass rules for high-importance episodes.
  • The context filter is exactly the right scope - a single filter() call that keeps stale low-signal episodes from burning prompt tokens.
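For readers following along, a context filter with a high-importance bypass might look something like the sketch below. The thresholds, the two-week half-life, the `ContextEpisode` shape, and the way importance and reinforcement are combined are assumptions for illustration; the PR's actual values and formula are not shown in this conversation.

```typescript
// Hypothetical episode fields used only for this sketch.
interface ContextEpisode {
  importance: number; // assumed to be in [0, 1]
  accessCount: number; // recall count
  ageHours: number; // hours since the episode was recorded
}

// Assumed thresholds; the real ones may differ.
const HIGH_IMPORTANCE = 0.8;
const MIN_CONTEXT_SCORE = 0.2;
const HALF_LIFE_HOURS = 24 * 14;

// Durable signal (importance or repeated use), discounted by age.
function contextScore(e: ContextEpisode): number {
  const decay = Math.pow(0.5, Math.max(0, e.ageHours) / HALF_LIFE_HOURS);
  const boost = Math.min(1, Math.log1p(Math.max(0, e.accessCount)) / Math.log1p(20));
  return Math.max(e.importance, boost) * decay;
}

// A single filter() pass: high-importance episodes bypass the score check.
function filterForContext<T extends ContextEpisode>(episodes: T[]): T[] {
  return episodes.filter(
    (e) => e.importance >= HIGH_IMPORTANCE || contextScore(e) >= MIN_CONTEXT_SCORE,
  );
}
```

With these assumed numbers, a three-month-old, never-recalled episode of modest importance is dropped, while an equally old episode marked highly important is still injected.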

Two optional suggestions (neither blocks merge):

  1. In calculateEpisodeContextScore, the weightedAverage call passes a dummy zero for the third argument since the function takes 3 inputs but you only need 2. A brief comment or a 2-arg helper would help readability.

  2. The hoursSince IIFE has a redundant Number.isNaN guard - Date.parse already returns NaN for invalid strings and the Number.isFinite check handles it.
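Both suggestions could be addressed along these lines. This is only a sketch: `weightedAverage2` is a hypothetical name, the signatures are guesses rather than the PR's actual ones, and returning `Infinity` for an unparseable timestamp is an assumed fallback.

```typescript
// Hypothetical two-input helper, so callers need not pass a dummy zero.
function weightedAverage2(a: number, weightA: number, b: number, weightB: number): number {
  const total = weightA + weightB;
  return total === 0 ? 0 : (a * weightA + b * weightB) / total;
}

// Date.parse returns NaN for unparseable input, and Number.isFinite(NaN)
// is false, so a single isFinite check covers the invalid-date case.
function hoursSince(timestamp: string, now: number = Date.now()): number {
  const parsed = Date.parse(timestamp);
  if (!Number.isFinite(parsed)) return Number.POSITIVE_INFINITY; // assumed fallback
  return Math.max(0, (now - parsed) / 3_600_000);
}
```

Taking `now` as a parameter is a design choice made here so the function stays deterministic under test.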

Verified locally (789 tests pass, typecheck and lint clean) and deployed to a test VM. Memory recall is working correctly with the new ranking in place.

Approving as-is. Welcome to Phantom.

mcheemaa merged commit bbf89ef into ghostwright:main Mar 31, 2026
1 check passed

coe0718 (Contributor Author) commented Mar 31, 2026 via email
