Skip to content

Latest commit

 

History

History
186 lines (136 loc) · 7.34 KB

File metadata and controls

186 lines (136 loc) · 7.34 KB

7. LLM CLI Integration

< Back to Design Overview


Integration Architecture

Mnemon integrates with LLM CLIs through lifecycle hooks, a skill file, and a behavioral guide. Claude Code's hook system is the reference implementation — all components are deployed automatically via mnemon setup.

7.1 Integration Architecture

Four hooks drive the memory lifecycle:

Session starts
    │
    ▼
  Prime (SessionStart) ─── prime.sh ──→ load guide.md (memory execution manual)
    │
    ▼
  User sends message
    │
    ▼
  Remind (UserPromptSubmit) ─── user_prompt.sh ──→ remind agent to recall & remember
    │
    ▼
  Skill (SKILL.md) ── command syntax reference (auto-discovered)
    │
    ▼
  LLM generates response (following guide.md behavioral rules)
    │
    ▼
  Nudge (Stop) ─── stop.sh ──→ remind agent to remember
    │
    ▼
  (when context compacts)
  Compact (PreCompact) ─── compact.sh ──→ extract critical insights to remember

Three layers work together:

Layer What Where Role
Hooks Shell scripts triggered by Claude Code lifecycle events .claude/hooks/mnemon/ Prime (guide), Remind (recall & remember), Nudge (remember), Compact (critical save)
Skill SKILL.md — command reference in Claude Code skill format .claude/skills/mnemon/ Teaches the LLM how to use mnemon commands
Guide guide.md — detailed execution manual for recall, remember, and delegation ~/.mnemon/prompt/ Teaches the LLM when to recall, what to remember, and how to delegate

7.2 Hook Details

Claude Code fires hooks at specific lifecycle events. Mnemon registers up to four, each with a distinct role in the memory lifecycle:

Prime (SessionStart) — prime.sh

Runs once when a session starts. Loads the behavioral guide — a detailed execution manual that teaches the agent when to recall, what to remember, and how to delegate memory writes:

STATS=$(mnemon status 2>/dev/null)
if [ -n "$STATS" ]; then
  # extract counts from JSON and show in status line
  echo "[mnemon] Memory active (<insights> insights, <edges> edges)."
else
  echo "[mnemon] Memory active."
fi
[ -f ~/.mnemon/prompt/guide.md ] && cat ~/.mnemon/prompt/guide.md

The guide content appears in the LLM's system context, establishing recall/remember/delegation behavior for the entire session.

Remind (UserPromptSubmit) — user_prompt.sh

Runs on every user message. A lightweight prompt that reminds the agent to evaluate whether recall and remember are needed before starting work:

echo "[mnemon] Evaluate: recall needed? After responding, evaluate: remember needed?"

The agent decides whether to act on this reminder based on the guide.md rules — it is a suggestion, not forced execution.

Nudge (Stop) — stop.sh

Runs after each LLM response. Reminds the agent to consider whether the exchange warrants a remember operation. Stays silent if memory was already addressed:

MSG=$(echo "$INPUT" | jq -r '.last_assistant_message // ""' 2>/dev/null)
if echo "$MSG" | grep -qi "mnemon remember\|sub-agent.*remember\|Stored.*imp="; then
  exit 0  # Already handled
fi
echo "[mnemon] Consider: does this exchange warrant a remember sub-agent?"

Compact (PreCompact) — compact.sh (optional)

Fires before context window compression. Instructs the agent to extract the most critical insights and remember them before context is lost:

echo "[mnemon] Context compaction starting. Review this session and remember the most valuable insights (up to 5) before context is compressed. Delegate to Task sub-agents now."

7.3 Automated Setup

mnemon setup handles all deployment automatically:

$ mnemon setup

Detecting LLM CLI environments...
  ✓ Claude Code (v1.x)    .claude/

Select environment: Claude Code
Install scope: Local — this project only (.claude/)

[1/3] Skill
  ✓ Skill     .claude/skills/mnemon/SKILL.md

[2/3] Prompts
  ✓ Prompts   ~/.mnemon/prompt/ (guide.md, skill.md)

[3/3] Optional hooks
  Select hooks to enable:
    [x] Remind  — remind agent to recall & remember (recommended)
    [x] Nudge   — remind agent to remember after work
    [ ] Compact — extract critical insights before compaction

Setup complete!
  Hooks   prime, remind, nudge
  Prompts ~/.mnemon/prompt/ (guide.md, skill.md)

Start a new Claude Code session to activate.
Edit ~/.mnemon/prompt/guide.md to customize behavior.
Run 'mnemon setup --eject' to remove.

Key setup options:

Flag Effect
--global Install to ~/.claude/ (all projects) instead of .claude/ (project-local)
--target claude-code Non-interactive, Claude Code only
--eject Remove all mnemon integrations
--yes Auto-confirm all prompts (CI-friendly)

The Prime hook is always installed. Remind, Nudge, and Compact hooks are optional (Remind and Nudge enabled by default).

7.4 Sub-Agent Delegation

Memory writes don't happen in the main conversation. Instead, the host LLM delegates to a lightweight sub-agent:

Main Agent (Opus)                     Sub-Agent (Sonnet)
┌──────────────────────┐              ┌──────────────────────┐
│ Full conversation     │  delegates   │ ~1000 tokens context │
│ context (~25k tokens) │ ──────────→ │ Reads SKILL.md       │
│                       │              │ Executes commands    │
│ Decides WHAT to       │  result      │ Evaluates candidates │
│ remember              │ ←────────── │ with judgment        │
└──────────────────────┘              └──────────────────────┘

Why sub-agent?

Dimension Main conversation Sub-agent
Context size ~25,000 tokens ~1,000 tokens
Model Opus (expensive) Sonnet (cheaper)
Scope Full conversation Memory task only
Execution Synchronous, blocks user Background, non-blocking

The main agent provides only WHAT to store — content, category, importance, entities. The sub-agent reads SKILL.md, executes the correct mnemon remember command, and evaluates remember's link candidates with judgment — not mechanical rules.

This separation means:

  • Token economy: ~7,000 total tokens per memory write vs ~25,000 if done in main conversation
  • Context isolation: Memory processing doesn't pollute the main conversation context
  • Model efficiency: Sonnet handles routine execution while Opus focuses on high-level decisions

7.5 Adapting to Other LLM CLIs

For CLIs with hook support, replicate the Claude Code pattern: register lifecycle hooks that call mnemon commands, deploy the skill file, and provide the behavioral guide.

For CLIs without hook support, merge the recall/remember guidance into the corresponding system prompt file:

  • Cursor -> .cursorrules
  • Windsurf -> RULES.md
  • OpenClaw -> mnemon setup --target openclaw deploys skill + guide, but hooks require manual plugin configuration
  • Others -> System prompt / rules file