Memory strategy: reinforce durable memories, decay stale ones, and surface contradictions #1

@coe0718

Description

Phantom already has strong building blocks for memory: session-end capture, semantic facts, contradiction handling, procedural memory, and hybrid retrieval. What feels missing is a stronger long-term memory strategy around reinforcement, forgetting, and startup retrieval.

I would like to propose an improvement to the existing memory system so it behaves like durable, adaptive memory rather than a growing archive.

The goal is not to add a second parallel memory system. The goal is to evolve the current one so it better answers three questions:

  1. What should become durable memory?
  2. What should fade away if it stops mattering?
  3. What should be surfaced when new information conflicts with old beliefs?

Use Case

Right now, Phantom can remember and retrieve useful information, but over time the memory model may drift toward accumulation without enough reinforcement or forgetting.

For a persistent co-worker, the important behavior is not just "can it search memory?" but "does it feel like it knows me, my preferences, and my working patterns without becoming noisy over time?"

A stronger memory strategy would improve workflows like:

  • remembering repeated user preferences across many sessions
  • surfacing stable patterns instead of many near-duplicate episodes
  • adapting when a user's habits change
  • keeping startup context focused instead of bloated
  • making long-running Phantom instances feel more cumulative and personally tuned

Proposed Solution

I think this fits best as an evolution of the existing episodic + semantic + procedural memory system, not a brand-new 5-layer subsystem.

Proposed direction:

  • Keep session-end extraction as the primary capture point.

    • Avoid per-turn model calls for salience unless there is a strong reason.
    • Let significance emerge from the full session, not from isolated turns.
  • Add reinforcement to durable memories.

    • When similar facts or patterns recur across sessions, increase their weight or confidence instead of storing near-duplicates forever.
    • Repeated signals should strengthen memory.
  • Add explicit decay / forgetting.

    • Episodic memories that are never accessed or reinforced should decay over time.
    • Stale or low-value memories should stop competing with durable ones during retrieval.
    • This should keep the system sharp rather than just larger.
  • Improve contradiction surfacing.

    • Existing contradiction handling is a great start.
    • In addition to superseding old facts, it would help to surface important contradictions during retrieval or startup so the agent can resolve or adapt to changed user behavior.
  • Improve startup retrieval.

    • Instead of loading "all memory", inject a focused top-N set of durable, high-signal memories at session start.
    • Favor reinforced memories, recent high-importance memories, and active contradictions.
  • Preserve the Cardinal Rule.

    • TypeScript should do plumbing: persistence, ranking, decay, retrieval, scheduling.
    • LLM judges should do reasoning: salience, extraction quality, contradiction interpretation, consolidation decisions.
    • Heuristics should stay fallback-only.
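To make the reinforcement and decay points concrete, here is a minimal sketch of what the metadata and plumbing could look like. All names here (`MemoryRecord`, `reinforce`, `decayedWeight`, the 30-day half-life) are illustrative assumptions for this issue, not Phantom's actual API:

```typescript
// Illustrative sketch only: reinforcement/decay metadata on a memory record.
interface MemoryRecord {
  id: string;
  content: string;
  weight: number;        // reinforcement weight; grows when similar signals recur
  lastAccessed: number;  // epoch ms of last retrieval or reinforcement
}

// When a similar fact recurs across sessions, strengthen the existing record
// instead of storing a near-duplicate.
function reinforce(m: MemoryRecord, now: number): MemoryRecord {
  return { ...m, weight: m.weight + 1, lastAccessed: now };
}

// Exponential decay: memories that are never accessed or reinforced lose
// retrieval weight over time, so stale entries stop competing with durable ones.
function decayedWeight(
  m: MemoryRecord,
  now: number,
  halfLifeDays: number = 30
): number {
  const ageDays = (now - m.lastAccessed) / 86_400_000;
  return m.weight * Math.pow(0.5, ageDays / halfLifeDays);
}
```

The key property is that reinforcement and decay are pure TypeScript plumbing over metadata; the LLM judge only decides *whether* two memories are similar enough to reinforce rather than store separately.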

A possible implementation path could be split into small PRs:

  1. Reinforcement and decay metadata for episodic / semantic memory
  2. Ranking updates for retrieval and startup context
  3. Better contradiction surfacing
  4. Tests and docs for the new memory behavior
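For step 2, the startup-retrieval ranking could be sketched as a single score that favors reinforced memories, recent high-importance ones, and active contradictions, then takes the top N. Again, every name and weighting constant below is a hypothetical placeholder, not a proposed final design:

```typescript
// Illustrative sketch only: ranking durable memories for startup context.
interface RankedMemory {
  id: string;
  weight: number;            // reinforcement weight accumulated across sessions
  importance: number;        // 0..1, assigned by the LLM judge at extraction
  ageDays: number;           // days since last access or reinforcement
  hasContradiction: boolean; // an unresolved conflict with a newer signal
}

function startupScore(m: RankedMemory): number {
  const recency = Math.pow(0.5, m.ageDays / 30); // 30-day half-life, as above
  const base = m.weight * m.importance * recency;
  // Active contradictions always surface so the agent can resolve them.
  return m.hasContradiction ? base + 10 : base;
}

// Inject a focused top-N set instead of loading "all memory" at session start.
function selectStartupContext(
  memories: RankedMemory[],
  topN: number
): RankedMemory[] {
  return [...memories]
    .sort((a, b) => startupScore(b) - startupScore(a))
    .slice(0, topN);
}
```

The additive contradiction bonus is a deliberate simplification: it guarantees conflicts outrank ordinary memories without needing a separate retrieval pass.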

Alternatives Considered

Full 5-layer memory pipeline

I considered a more elaborate layered design with capture, observation, reflection, search, and startup loading as separate stages.

I do not think that should be added to Phantom as-is because:

  • it risks duplicating memory and evolution pipelines that already exist
  • per-turn LLM capture adds cost quickly
  • a separate observations file could create a second source of truth
  • loading all memory at startup would likely hurt context quality

Keep the current system unchanged

The current system is already better than most agent projects, but I think it is missing a stronger notion of reinforcement and forgetting, which is what makes long-term memory feel personal instead of archival.

Additional Context

This seems aligned with the contribution guide's "memory strategies" and "evolution pipeline improvements" categories.

I would be happy to help shape this into a focused PR series if the maintainer thinks this direction fits the project.
