This repository was archived by the owner on Feb 23, 2026. It is now read-only.

recall-context returns excessive data causing token limit warnings #2

@mdlopresti

Description


Problem

The recall-context tool retrieves all memories at once, returning massive amounts of data (196K+ characters) that trigger token limit warnings in Claude Code and make the tool impractical for agents with moderate memory usage.

Evidence

First reported: December 17, 2025
Re-confirmed: December 26, 2025 (v0.4.0)

Actual Test Results (Dec 26, 2025)

Error: result (196,595 characters) exceeds maximum allowed tokens. 
Output has been saved to file.

With only:

  • 16 private longterm memories
  • 6 team memories
  • 0 personal memories

Total: 22 memories → 196,595 characters (roughly 9,000 characters per memory on average)

Impact

Severity: MEDIUM 🟡

  • Triggers "large context" warnings in Claude Code
  • Makes recall-context impractical for regular use
  • Agents hit context limits faster
  • Inefficient use of context window
  • Forces workarounds (reading from saved files, manual filtering)

Current Behavior

recall-context returns ALL memories from requested scopes in a single response:

  • All private memories
  • All personal memories
  • All team memories
  • All public memories
  • Complete content for each memory
  • All metadata (tags, priority, timestamps, etc.)

This is a dump-everything approach that doesn't scale.

Expected/Desired Behavior

Phase 1 (Quick Fix):

  • Add limit parameter (default: 20) ✅ ALREADY EXISTS
  • Document that users should use limit parameter
  • Consider lowering the default limit from its current value

Phase 2 (Better Architecture):

  • Return index/summary first (memory IDs, titles, tags, truncated content)
  • Allow retrieval of individual memories by ID on demand
  • Support pagination for large memory sets

Root Cause

Pattern v0.4.0 enhanced recall-context with:

  • Tag filtering
  • Priority filtering
  • Date range filtering
  • Content search

But all filtering is client-side: the tool still retrieves ALL memories first, then filters.

With moderate usage (20-30 memories), this becomes unusable.

Proposed Solutions

Option A: Limit by Default (Quick Fix)

  1. Change default limit from unlimited to 50
  2. Document limit parameter prominently
  3. Add warning if limit reached: "Showing X of Y total memories"
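A minimal sketch of Option A. The `fetch_memories` backend, field names, and result shape below are hypothetical, not Pattern's actual API; the point is only to show the capped default plus the "Showing X of Y" truncation warning.

```python
# Sketch of Option A: cap results by default and surface truncation.
DEFAULT_LIMIT = 50

def fetch_memories(scopes):
    # Stand-in for the real store: 60 dummy memories.
    return [{"id": i, "scope": scopes[0]} for i in range(60)]

def recall_context(scopes, limit=DEFAULT_LIMIT):
    memories = fetch_memories(scopes)
    total = len(memories)
    result = {"memories": memories[:limit], "total": total}
    if total > limit:
        # Tell the agent the set was truncated so it can page or filter.
        result["warning"] = f"Showing {limit} of {total} total memories"
    return result
```

Because the cap is applied after retrieval, this fixes the response size but not the retrieval cost; that is what Option B addresses.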

Option B: Two-Phase Retrieval (Better UX)

  1. recall-context returns summary/index with memory IDs, previews, tags
  2. New tool: get-memory for full retrieval by ID
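Option B's two-phase flow could look like the following sketch. The in-memory `STORE`, the function names, and the preview length are all illustrative assumptions; Pattern's real storage layer differs.

```python
# Sketch of Option B: lightweight index first, full content on demand.
STORE = {
    "m1": {"title": "Deploy notes", "tags": ["ops"], "content": "x" * 5000},
    "m2": {"title": "Key rotation", "tags": ["security"], "content": "y" * 5000},
}

def recall_context_index(preview_chars=120):
    """Phase 1: return only IDs, titles, tags, and truncated content."""
    return [
        {"id": mid, "title": m["title"], "tags": m["tags"],
         "preview": m["content"][:preview_chars]}
        for mid, m in STORE.items()
    ]

def get_memory(memory_id):
    """Phase 2: fetch one memory's full content by ID."""
    return STORE[memory_id]
```

With 120-character previews, the 22-memory case above would index in a few kilobytes instead of 196K, and the agent pays full cost only for the memories it actually opens.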

Option C: Smart Summary (AI-Friendly)

  1. Generate a ~4KB summary at retrieval time
  2. Group by category, prioritize by recency/priority
  3. Include memory IDs for drill-down
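One way Option C's budgeted digest could work, as a sketch. The field names (`id`, `title`, `priority`, `updated`) and the one-line-per-memory format are assumptions for illustration.

```python
# Sketch of Option C: a size-bounded digest that keeps IDs for drill-down.
def smart_summary(memories, budget=4096):
    # Highest priority first, then most recent; both descending.
    ordered = sorted(memories,
                     key=lambda m: (m["priority"], m["updated"]),
                     reverse=True)
    lines, used = [], 0
    for m in ordered:
        # One line per memory; the ID lets the agent fetch full content later.
        line = f"[{m['id']}] {m['title']}"
        if used + len(line) + 1 > budget:
            break
        lines.append(line)
        used += len(line) + 1
    return "\n".join(lines)
```

The budget check counts a newline per line, so the joined output never exceeds 4 KB regardless of how many memories exist.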

Workaround

Currently, agents must:

  1. Read saved file when recall-context exceeds limits
  2. Manually parse/filter JSON
  3. Use specific filters to reduce result set
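Steps 1 and 2 of the workaround amount to something like the sketch below: re-read the file Claude Code saved and filter it client-side. The file path and JSON shape are illustrative only; inspect the actual saved output to confirm its structure.

```python
import json

# Sketch of the manual workaround: parse the saved oversized output
# and keep only memories matching a tag. JSON shape is assumed.
def filter_saved_memories(path, tag):
    with open(path) as f:
        data = json.load(f)
    return [m for m in data.get("memories", [])
            if tag in m.get("tags", [])]
```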

Environment

  • Pattern version: v0.4.0
  • 22 total memories across private + team scopes
  • Result: 196,595 characters (exceeded token limit)

Related

This was noted in Pattern memory as a known issue (Dec 17), but was never formally filed as a GitHub issue.
