Auto-consolidate similar memories into generalized principles #368

@toejough

Problem

Many memories are narrow observations of the same underlying principle. For example, 9 separate memories might each note a specific instance where code should have been deduplicated or reused. Individually they're noisy: each surfaces in slightly wrong contexts. But the extracted principle ("prefer reuse over duplication") would be high-value and broadly applicable.

This is a common pattern: multiple low-generalizability memories that share a theme but were created independently across sessions. They clutter surfacing, dilute each other's signal, and none individually captures the real lesson.

Desired behavior

Periodically (e.g., during maintain or a new consolidate subcommand):

  1. Cluster memories by semantic similarity (embeddings or keyword overlap)
  2. Identify clusters where N+ memories (e.g., 3+) share a common theme
  3. Propose a single generalized memory that captures the principle
  4. On approval: create the generalized memory, archive or remove the originals
  5. The generalized memory should have a higher generalizability score than its sources and broader (but still precise) keywords
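
Steps 1 and 2 could look roughly like the sketch below. It is not the existing implementation: it computes a simple TF-IDF from scratch rather than calling whatever the codebase already has, and the names `tokenize`, `tfidf_vectors`, `cosine`, and `cluster_memories`, plus the `threshold=0.3` default, are all assumptions made for illustration. Memories are linked when their cosine similarity meets the threshold, and linked groups of at least `min_size` are reported.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split on non-alphanumerics (assumed; real tokenization may differ).
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    # Build one sparse TF-IDF vector (dict of token -> weight) per document.
    token_lists = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(tok for toks in token_lists for tok in set(toks))
    vectors = []
    for toks in token_lists:
        tf = Counter(toks)
        vectors.append({
            t: (c / len(toks)) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, c in tf.items()
        })
    return vectors


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def cluster_memories(docs, threshold=0.3, min_size=3):
    # Union-find over every pair whose similarity meets the threshold.
    vecs = tfidf_vectors(docs)
    parent = list(range(len(docs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if cosine(vecs[i], vecs[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(docs)):
        groups.setdefault(find(i), []).append(i)
    # Only clusters with at least min_size members are consolidation candidates.
    return sorted(g for g in groups.values() if len(g) >= min_size)


memories = [
    "duplicate parsing logic in handler should reuse shared helper",
    "duplicate parsing logic in loader should reuse shared helper",
    "duplicate parsing logic in exporter should reuse shared helper",
    "check git status before force push",
    "parallel tests share a temp directory",
]
print(cluster_memories(memories))  # [[0, 1, 2]]
```

The pairwise comparison is quadratic in the number of memories, which is probably fine for a periodic maintenance pass but would need revisiting for very large stores.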

Examples

  • 9 memories about specific deduplication opportunities → 1 memory: "extract shared logic when 3+ call sites exist"
  • 5 memories about forgetting to check git status before destructive ops → 1 memory: "always check VCS state before destructive operations"
  • 4 memories about specific test isolation failures → 1 memory: "parallel tests must not share mutable state"

Design considerations

  • Clustering could use TF-IDF similarity (already in the codebase) on memory content
  • Haiku can propose the generalized principle from a cluster
  • Should be dry-run capable (show proposed consolidations without applying)
  • Threshold for cluster size should be configurable (default 3)
  • Generalized memory inherits the combined feedback history of its sources
  • Original memories get archived (not deleted) so consolidation can be reviewed/reversed
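
To make the dry-run, threshold, feedback-inheritance, and archive points concrete, here is a minimal sketch. The `Memory` fields (`helpful`, `unhelpful`, `archived`, `source_ids`) and the `consolidate` function are hypothetical and do not reflect the project's actual schema; the principle text is passed in as a plain string, standing in for whatever the model proposes.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    # Hypothetical shape; the real memory record may differ.
    id: str
    content: str
    keywords: list
    helpful: int = 0
    unhelpful: int = 0
    archived: bool = False
    source_ids: list = field(default_factory=list)


def consolidate(store, cluster_ids, principle, keywords, new_id,
                min_cluster_size=3, dry_run=True):
    """Propose (and optionally apply) one consolidation.

    store is a dict of id -> Memory. Returns the proposed generalized
    memory, or None if the cluster is below the configured size.
    """
    if len(cluster_ids) < min_cluster_size:
        return None
    sources = [store[i] for i in cluster_ids]
    merged = Memory(
        id=new_id,
        content=principle,
        keywords=list(keywords),
        # Inherit the combined feedback history of the sources.
        helpful=sum(m.helpful for m in sources),
        unhelpful=sum(m.unhelpful for m in sources),
        # Record sources so the consolidation can be reviewed or reversed.
        source_ids=list(cluster_ids),
    )
    if not dry_run:
        store[new_id] = merged
        for m in sources:
            m.archived = True  # archive, never delete
    return merged
```

With `dry_run=True` (the default here) the store is left untouched and the returned memory is only a proposal; passing `dry_run=False` after approval adds it and flags the originals as archived.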
