Engram

Self-correcting memory for LLM agents. A Claude Code plugin that learns from sessions, surfaces relevant memories, measures whether they're actually followed, and fixes the ones that aren't.

The problem

Claude Code has several instruction sources — CLAUDE.md, rules, skills — but no way to know if they're working. An instruction that's always loaded but never followed wastes context budget. A great instruction that only matches narrow keywords goes unseen. Without measurement, instruction sets decay: duplicates accumulate, stale guidance persists, and useful patterns stay buried.

How engram solves it

Engram hooks into every phase of a Claude Code session to create a closed feedback loop:

  Learn ──→ Surface ──→ Maintain
    ↑                      │
    └──────────────────────┘
  1. Learn — Extracts structured memories from user corrections, instructions, and contextual facts. Each memory is a TOML file with tier-based metadata: title, content, principle, anti-pattern, keywords, concepts, and confidence tier (A/B/C).

  2. Surface — At every prompt and tool use, retrieves relevant memories via BM25 keyword scoring and injects them as context. A per-hook token budget caps total injection to avoid overwhelming the model.

  3. Maintain — Periodic diagnosis generates proposals for each memory based on its effectiveness quadrant. Proposals are applied with user confirmation: rewrites for stale content, keyword broadening for hidden gems, escalation for persistent violations, deletion for noise.

Instruction sources

Engram manages one instruction source — memories — and cross-references against other Claude Code sources for deduplication:

| Source type | Description | Surfacing behavior |
| ----------- | ----------- | ------------------ |
| memory | TOML files in ~/.claude/engram/data/memories/ | Keyword-matched per prompt via BM25 |

Cross-reference sources (used for suppression, not managed by engram): CLAUDE.md, .claude/rules/, plugin skills.

Each memory TOML embeds its own registry data: content hash, surfaced count, last surfaced timestamp, evaluation counters (followed/contradicted/ignored), and enforcement level.

Measurements

Every time a memory is surfaced, the registry increments its surfaced_count and updates last_surfaced. At session end, the evaluator classifies each surfaced memory's outcome:

  • Followed — The model's behavior aligned with the instruction
  • Contradicted — The model did the opposite of what the instruction said
  • Ignored — The instruction was surfaced but had no observable effect

From these counters, engram computes effectiveness (followed / total evaluations) and frecency (recency-weighted frequency with configurable half-life decay).

Effectiveness quadrants

The registry classifies every instruction into one of four quadrants:

|                 | High effectiveness            | Low effectiveness           |
| --------------- | ----------------------------- | --------------------------- |
| Often surfaced  | Working — Keep as-is          | Leech — Rewrite or escalate |
| Rarely surfaced | Hidden Gem — Broaden keywords | Noise — Remove              |

All memories are classified by the same thresholds. A fifth bucket, Insufficient, applies when a memory has too few evaluations to classify reliably.
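A minimal sketch of the classification, with invented threshold values (engram's real thresholds may differ):

```go
package main

import "fmt"

type Quadrant string

const (
	Working      Quadrant = "working"
	Leech        Quadrant = "leech"
	HiddenGem    Quadrant = "hidden-gem"
	Noise        Quadrant = "noise"
	Insufficient Quadrant = "insufficient"
)

// classify places a memory into a quadrant from its effectiveness ratio,
// surfaced count, and evaluation count. Thresholds are illustrative.
func classify(effectiveness float64, surfacedCount, evaluations int) Quadrant {
	const (
		minEvaluations = 3   // below this: not enough data
		effThreshold   = 0.5 // followed at least half the time
		surfacedOften  = 10  // surfaced this many times counts as "often"
	)
	if evaluations < minEvaluations {
		return Insufficient
	}
	effective := effectiveness >= effThreshold
	often := surfacedCount >= surfacedOften
	switch {
	case effective && often:
		return Working
	case !effective && often:
		return Leech
	case effective && !often:
		return HiddenGem
	default:
		return Noise
	}
}

func main() {
	fmt.Println(classify(0.9, 50, 20)) // working
	fmt.Println(classify(0.1, 50, 20)) // leech
	fmt.Println(classify(0.9, 4, 4))   // hidden-gem
	fmt.Println(classify(0.1, 4, 4))   // noise
	fmt.Println(classify(0.0, 1, 1))   // insufficient
}
```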

Maintenance actions

Each quadrant has a prescribed action:

  • Working — Check for staleness; rewrite if content is outdated
  • Leech — Diagnose root cause (content quality, wrong keywords, enforcement gap) and either rewrite content or escalate enforcement
  • Hidden Gem — LLM suggests additional keywords/concepts to increase surfacing coverage
  • Noise — Delete the memory TOML file

Maintenance proposals are generated by engram maintain and applied interactively via engram maintain --apply --proposals <file>.

Enforcement escalation

For persistent leech memories (instructions that keep getting violated), engram applies a graduated escalation ladder:

  1. Advisory — Standard surfacing (default)
  2. Emphasized advisory — Surfaced with emphasis markers
  3. Reminder — Reminder injected after tool use via PostToolUse hook

Escalation level is tracked per-memory.
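The ladder can be modeled as a small per-memory state machine. This is an illustrative sketch; the violation threshold and names are assumptions, not engram's code:

```go
package main

import "fmt"

type Enforcement int

const (
	Advisory Enforcement = iota
	EmphasizedAdvisory
	Reminder
)

func (e Enforcement) String() string {
	return [...]string{"advisory", "emphasized-advisory", "reminder"}[e]
}

// escalate raises enforcement one rung when a memory keeps being
// contradicted. The violation threshold is illustrative.
func escalate(level Enforcement, recentViolations int) Enforcement {
	const threshold = 3
	if recentViolations >= threshold && level < Reminder {
		return level + 1
	}
	return level
}

func main() {
	level := Advisory
	for round := 1; round <= 3; round++ {
		level = escalate(level, 3)
		fmt.Printf("round %d: %s\n", round, level)
	}
	// The ladder tops out at "reminder"; further violations do not escalate.
}
```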

Memory graph

Engram builds a directed graph of relationships between memories. When a memory is surfaced, spreading activation propagates to related memories, allowing conceptually linked instructions to surface together even when keyword overlap is low.

Links are created automatically via merge-on-write: when new memories are stored, engram detects duplicates and near-duplicates, merging them and preserving relationship edges. Links are re-computed after each merge to keep the graph consistent.
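A minimal sketch of spreading activation over the memory graph, assuming a per-hop decay factor and an activation cutoff (both invented for this example; engram's actual propagation rules may differ):

```go
package main

import "fmt"

// spread propagates activation from a seed memory through the
// relationship graph, decaying by a factor per hop and dropping
// propagation once activation falls below the cutoff.
func spread(graph map[string][]string, seed string, decay, cutoff float64) map[string]float64 {
	activation := map[string]float64{seed: 1.0}
	frontier := []string{seed}
	for len(frontier) > 0 {
		var next []string
		for _, node := range frontier {
			a := activation[node] * decay
			if a < cutoff {
				continue // too weak to propagate further
			}
			for _, nb := range graph[node] {
				if a > activation[nb] {
					activation[nb] = a
					next = append(next, nb)
				}
			}
		}
		frontier = next
	}
	return activation
}

func main() {
	// Hypothetical memory IDs and edges, for illustration only.
	graph := map[string][]string{
		"use-targ":      {"build-config", "test-commands"},
		"build-config":  {"ci-settings"},
		"test-commands": {},
		"ci-settings":   {},
	}
	for id, a := range spread(graph, "use-targ", 0.5, 0.2) {
		fmt.Printf("%s: %.2f\n", id, a)
	}
}
```

With this scheme, surfacing "use-targ" also activates its neighbors at half strength, so related memories can surface together even with little keyword overlap.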

Session lifecycle

Engram hooks into 7 Claude Code hook points:

| Phase | Hook | What happens |
| ----- | ---- | ------------ |
| Start | SessionStart | Build binary if stale. Run maintain/triage. Notify user that /recall is available. |
| Prompt | UserPromptSubmit | Surface prompt-relevant memories (BM25). Detect inline corrections (UC-3). |
| Prompt (async) | UserPromptSubmit | Incremental learning extraction from transcript delta. |
| Tool use | PreToolUse | Surface tool-specific memories (e.g., file-path-relevant instructions). |
| After tool | PostToolUse | Surface memories relevant to the tool call and its output. |
| Tool failure | PostToolUseFailure | Diagnose errors and surface relevant memories. |
| Compact | PreCompact | Flush: learning extraction. |
| End | Stop | Flush: learning extraction. |

Previous session context is loaded on demand via the /recall skill, which summarizes recent transcripts from the same project using Haiku. This keeps session-start lightweight and avoids injecting stale context automatically.

Data files

All data lives in ~/.claude/engram/data/:

| File | Purpose |
| ---- | ------- |
| memories/*.toml | Structured memory files with embedded registry data (surfaced count, evaluation counters, enforcement level) |
| surfacing-log.jsonl | Running log of which memories were surfaced and when |
| learn-offset.json | Offset tracking for incremental transcript learning |

Memory TOML structure

Each memory is a TOML file with structured fields:

title = "Use targ for builds"
content = "Always use targ build system instead of raw go commands"
observation_type = "workflow_preference"
concepts = ["build-system", "tooling"]
keywords = ["targ", "build", "go test", "go vet"]
principle = "Use targ test, targ check, targ build for all operations"
anti_pattern = "Running go test or go vet directly"
rationale = "targ encodes hard-won lessons about build configuration"
confidence = "A"

Confidence tiers: A (explicit instruction — "always/never/remember"), B (teachable correction), C (contextual fact). Anti-patterns are required for tier A, optional for B, empty for C.
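The tier rules can be sketched as a validation step on the parsed memory. The struct and method names below are invented for illustration, and TOML parsing itself is omitted to keep the sketch dependency-free:

```go
package main

import (
	"errors"
	"fmt"
)

// Memory mirrors the TOML fields above. Parsing would use a TOML
// library; this sketch only shows the tier validation.
type Memory struct {
	Title       string
	Content     string
	Keywords    []string
	AntiPattern string
	Confidence  string // "A", "B", or "C"
}

// validate enforces the tier rules: anti-patterns are required for
// tier A, optional for B, and must be empty for C.
func (m Memory) validate() error {
	switch m.Confidence {
	case "A":
		if m.AntiPattern == "" {
			return errors.New("tier A memories require an anti_pattern")
		}
	case "B":
		// Anti-pattern is optional for teachable corrections.
	case "C":
		if m.AntiPattern != "" {
			return errors.New("tier C memories must not set anti_pattern")
		}
	default:
		return fmt.Errorf("unknown confidence tier %q", m.Confidence)
	}
	return nil
}

func main() {
	m := Memory{Title: "Use targ for builds", Confidence: "A"}
	fmt.Println(m.validate()) // tier A without anti_pattern fails validation
}
```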

Installation

Requires Go 1.25+.

# Clone and install as a Claude Code plugin
git clone https://github.com/toejough/engram.git

# Add to Claude Code — the plugin auto-builds on first session
claude plugin add /path/to/engram

The binary auto-builds on first hook invocation and rebuilds when Go source files change.

Project structure

cmd/engram/          CLI entry point (thin wiring layer)
internal/            Business logic (33 packages, all DI boundaries)
hooks/               Shell scripts for Claude Code hook integration
skills/              Plugin skills (memory triage)
.claude-plugin/      Plugin manifest (plugin.json)
docs/specs/          Specification artifacts (UC, REQ, DES, ARCH, TEST)

Design principles

  • DI everywhere — No function in internal/ calls os.*, http.*, or any I/O directly. All I/O through injected interfaces, wired at cmd/ and cli.go edges.
  • Pure Go, no CGO — TF-IDF/BM25 for retrieval instead of ONNX. External embedding API if vector similarity needed.
  • Fire and forget — Registry writes never block the critical path. Write failures are logged but don't fail the operation.
  • Measure impact, not frequency — A memory surfaced 1000 times but never followed is a leech, not a success.

Specification

23 use cases across 5 specification layers (UC → REQ/DES/ARCH → TEST → IMPL).

| UC | Name |
| -- | ---- |
| UC-1 | Session Learning |
| UC-2 | Hook-Time Surfacing & Enforcement |
| UC-3 | Remember & Correct |
| UC-6 | Memory Effectiveness Review |
| UC-14 | Structured Session Continuity |
| UC-15 | Automatic Outcome Signal |
| UC-16 | Unified Memory Maintenance |
| UC-17 | Context Budget Management |
| UC-18 | PostToolUse Proactive Reminders |
| UC-21 | Enforcement Escalation Ladder |
| UC-23 | Unified Instruction Registry |
| UC-24 | Proposal Application |
| UC-27 | Global Binary Installation |
| UC-28 | Automatic Maintenance and Promotion Triggers |
| UC-32 | Memory Graph with Spreading Activation |
| UC-33 | Merge-on-Write |
| UC-34 | Pre-Classification Duplicate Consolidation |
| UC-P1-1 | Cross-Source Contradiction Detection |
| UC-P5f-1 | Re-compute Links After Merge |
License

See LICENSE for details.
