feat: context-aware semantic instinct retrieval #64

@MattDevy

Description

Problem

Currently, the injector selects the top N highest-confidence instincts (default: 20) and injects all of them into every system prompt, regardless of task. The LLM must then read each instinct's trigger and filter for relevance itself, wasting prompt budget on instincts that are irrelevant to the current task.

Proposed Solution

Replace the global top-N injection with a semantic retrieval step that, before each agent start, fetches only the instincts contextually relevant to the current prompt, active file, or codebase location.

Approach options:

  • Embed instinct bodies at write time (e.g. via a lightweight local embedder or Haiku embedding endpoint)
  • At injection time, embed the current context (active file path, recent tool calls, user prompt) and fetch the top-K nearest instincts by cosine similarity
  • Fall back to confidence-weighted top-N if no context signal is available
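The retrieval step above could be sketched roughly as follows. This is a minimal illustration, not the implementation: the `Instinct` shape, the function names, and the assumption that embeddings are plain float vectors (produced by whatever embedder is chosen at write time) are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Optional

@dataclass
class Instinct:
    body: str
    confidence: float
    embedding: Optional[list] = None  # computed at write time (hypothetical field)

def cosine(a: list, b: list) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def select_instincts(instincts, context_embedding=None, k=5, n=20):
    """Top-K nearest instincts by cosine similarity to the context embedding.

    Falls back to confidence-weighted top-N when no context signal is available.
    """
    if context_embedding is None:
        # Fallback: today's behavior, ranked by stored confidence.
        ranked = sorted(instincts, key=lambda i: i.confidence, reverse=True)
        return ranked[:n]
    scored = [
        (cosine(i.embedding, context_embedding), i)
        for i in instincts
        if i.embedding is not None
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [inst for _, inst in scored[:k]]
```

In practice the context embedding would be built from the active file path, recent tool calls, and the user prompt; here it is just a vector passed in by the caller.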

Benefits

  • Keeps system prompt lean — only relevant instincts injected per turn
  • Allows the instinct store to grow large (100s of instincts) without degrading prompt quality
  • Reduces noise from instincts with unrelated triggers being read and ignored by the LLM

References

Comparison with Claude Code's memory system, which uses semantic embedding search to retrieve only contextually relevant memories per prompt.

Metadata


Labels

enhancement (New feature or request), quality (Instinct quality and validation)
