feat: start 1.0.6 provenance and layered retrieval #34
Pull request overview
This PR advances Memorix 1.0.6 toward provenance-aware memory and progressive disclosure by introducing sourceDetail/valueCategory metadata end-to-end, then using it to drive layered session context, compact search/timeline/detail formatting, and source-aware retention/scoring.
Changes:
- Add provenance/value-category fields to observation/index/document shapes, persistence, and indexing.
- Implement disclosure-layer classification (L1/L2/L3) and apply it to session context output and compact formatters (search/timeline/detail).
- Make retention decay and session-context scoring source/value aware, and update/extend test coverage accordingly.
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/memory/session.test.ts | Updates expectation to match new L2 “Durable working context” messaging. |
| tests/memory/session-layered.test.ts | Adds layered session context tests (L1 routing, L2 key memories, L3 evidence hints). |
| tests/memory/provenance.test.ts | Adds provenance persistence, search exposure, retention/scoring behavior tests. |
| tests/integration/release-blockers.test.ts | Makes CLI search test less flaky by waiting for user-visible output. |
| tests/compact/search-format.test.ts | Verifies compact search table Src badges + source summary behavior. |
| tests/compact/timeline-provenance.test.ts | Verifies timeline Src column + anchor kind annotations and backward-compat. |
| tests/compact/detail-provenance.test.ts | Verifies provenance header behavior in detail output and backward-compat. |
| src/types.ts | Extends core types with sourceDetail and valueCategory. |
| src/store/orama-store.ts | Adds provenance fields to Orama schema and exposes them in search/timeline results. |
| src/server.ts | Marks explicit writes with provenance and updates tool descriptions for deep retrieval. |
| src/memory/session.ts | Adds disclosure-based partitioning (L1/L2/L3) and exports scoring helper for tests. |
| src/memory/retention.ts | Adds source-aware retention multiplier and core-based immunity. |
| src/memory/observations.ts | Persists provenance/value fields through storage/upsert/indexing paths. |
| src/memory/disclosure-policy.ts | Introduces layer classifier + compact source badge helper. |
| src/hooks/handler.ts | Tags hook-ingested observations as sourceDetail: 'hook'. |
| src/compact/index-format.ts | Adds Src column, tier summary, timeline annotations, and detail provenance header. |
| src/compact/engine.ts | Ensures compact detail docs carry provenance/value fields. |
| src/cli/tui/data.ts | Tags quick memories as explicit provenance. |
| src/cli/index.ts | Tags CLI remember writes as explicit provenance. |
| src/cli/commands/ingest-log.ts | Tags ingest-log writes as git-ingest. |
| src/cli/commands/ingest-commit.ts | Tags ingest-commit writes as git-ingest. |
    sourceDetail: 'string' as const,
    valueCategory: 'string' as const,
getDb() now declares sourceDetail/valueCategory as required string fields in the Orama schema, but hydrateIndex() still inserts documents without those properties. With Orama’s strict schema this can throw during CLI/TUI cold-start hydration (or silently drop docs). Consider defaulting these fields in hydrateIndex() (e.g., empty string) or making the schema tolerant of missing values in a backward-compatible way.
Suggested change:

    - sourceDetail: 'string' as const,
    - valueCategory: 'string' as const,
    + sourceDetail: { type: 'string', nullable: true } as const,
    + valueCategory: { type: 'string', nullable: true } as const,
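The other option mentioned in the comment, defaulting the fields during hydration, could look roughly like the sketch below. The `StoredObservation` and `IndexDocument` shapes and the `toIndexDocument` helper are hypothetical stand-ins for illustration; the real types live in src/types.ts and the real hydration code in hydrateIndex(), neither of which is shown in this review.

```typescript
// Hypothetical shapes for illustration only (not the project's real types).
interface StoredObservation {
  id: number;
  title: string;
  sourceDetail?: string;
  valueCategory?: string;
}

interface IndexDocument {
  id: string;
  title: string;
  sourceDetail: string;
  valueCategory: string;
}

// Build the document that hydration would insert. Missing provenance
// fields default to '' so legacy observations still carry string values
// for both fields declared in the schema.
function toIndexDocument(obs: StoredObservation): IndexDocument {
  return {
    id: String(obs.id),
    title: obs.title,
    sourceDetail: obs.sourceDetail ?? '',
    valueCategory: obs.valueCategory ?? '',
  };
}

// A legacy observation persisted before provenance fields existed.
const legacy = toIndexDocument({ id: 1, title: 'pre-1.0.6 memory' });

// An observation that already carries provenance keeps its values.
const tagged = toIndexDocument({
  id: 2,
  title: 'commit note',
  sourceDetail: 'git-ingest',
  valueCategory: 'core',
});
```

With the classifyLayer() shown in this review, an empty-string sourceDetail falls through to the final `return 'L2'`, the same as undefined, so this default would keep legacy memories in the working-context layer.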
     * Rules (phase 2 first-cut):
     * L2 — default working-context: explicit, undefined, or core-valued
     * L1 — routing signal: hook auto-captures (non-core)
     * L3 — evidence layer: git-ingest (non-core), or any other low-trust source
     *
     * git-ingest defaults to L3 but can be promoted to L2 by valueCategory=core.
     * Rules are kept explicit and easy to extend in future phases.
     */

    export type DisclosureLayer = 'L1' | 'L2' | 'L3';

    export interface ProvenanceFields {
      sourceDetail?: string;
      valueCategory?: string;
    }

    /**
     * Classify a single observation or index entry into a disclosure layer.
     */
    export function classifyLayer(fields: ProvenanceFields): DisclosureLayer {
      const { sourceDetail, valueCategory } = fields;

      // Core-valued memories are always promoted to L2, regardless of source.
      if (valueCategory === 'core') return 'L2';

      // Hook auto-captures without core classification → L1 routing signal.
      if (sourceDetail === 'hook') return 'L1';

      // Git-ingest is evidence-grounded but defaults to L3.
      // Caller may choose to promote selectively (e.g., when L2 is thin).
      if (sourceDetail === 'git-ingest') return 'L3';

      // Explicit, undefined/legacy, manual → L2 working context.
      return 'L2';
    }
The header comment says L3 includes “git-ingest (non-core), or any other low-trust source”, but classifyLayer() currently routes any unknown sourceDetail to L2. Either update the comment to match the implementation (only git-ingest ⇒ L3) or extend the logic to treat unknown/non-enumerated sourceDetail values as L3 so future sources don’t get accidentally promoted into L2.
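The second option in this comment (treating non-enumerated sources as L3) might be sketched as follows. This is an illustrative variant, not the PR's implementation: `classifyLayerStrict` and the `L2_SOURCES` allow-list are hypothetical names, and the choice of which sources count as trusted is an assumption based on the sources named in the code comment above ('explicit', 'manual').

```typescript
type DisclosureLayer = 'L1' | 'L2' | 'L3';

interface ProvenanceFields {
  sourceDetail?: string;
  valueCategory?: string;
}

// Assumption: sources trusted enough for L2 by default.
const L2_SOURCES = new Set<string>(['explicit', 'manual']);

// Hypothetical stricter variant: any source that is not explicitly
// allow-listed falls through to L3 instead of L2.
function classifyLayerStrict(fields: ProvenanceFields): DisclosureLayer {
  const { sourceDetail, valueCategory } = fields;

  // Core-valued memories are promoted to L2 regardless of source.
  if (valueCategory === 'core') return 'L2';

  // Hook auto-captures without core classification are L1 routing signals.
  if (sourceDetail === 'hook') return 'L1';

  // Legacy records with no recorded source stay in L2 for backward compatibility.
  if (sourceDetail === undefined || sourceDetail === '') return 'L2';

  // Allow-listed sources are L2 working context.
  if (L2_SOURCES.has(sourceDetail)) return 'L2';

  // git-ingest and any unknown or future source default to the L3 evidence layer.
  return 'L3';
}
```

Under this variant a new, not-yet-enumerated sourceDetail value would land in L3 until someone deliberately adds it to the allow-list, which matches the header comment's "any other low-trust source" wording.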
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 98642f0554
    sourceDetail: 'string' as const,
    valueCategory: 'string' as const,
Preserve provenance fields during index hydration
After adding sourceDetail/valueCategory to the Orama schema, hydrateIndex() still builds documents without those fields, so persisted memories loaded on cold start lose provenance metadata in search output (and, if schema validation is strict, get silently skipped by the existing catch). This breaks the new provenance-aware retrieval/display path for any pre-existing observations until they are rewritten; hydration should include obs.sourceDetail and obs.valueCategory just like other insert paths.
    projectId: project.id,
    topicKey: targetObs.topicKey,
    progress: progress as import('./types.js').ProgressInfo | undefined,
    sourceDetail: 'explicit',
    });
Propagate value category in formation merge/evolve writes
In the Formation merge/evolve paths, the upsert call now sets sourceDetail but does not pass formationResult.evaluation.category; upsertObservation() only updates existing.valueCategory when that field is provided. As a result, merged/evolved memories can keep stale/undefined categories and miss new core-dependent behaviors (e.g., retention immunity and L2 promotion) even when Formation classifies the incoming memory as core.
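A minimal sketch of the behavior described here, using a simplified in-memory stand-in rather than the project's upsertObservation(). The `Observation` and `UpsertInput` shapes and the `upsert` helper are hypothetical; the only property carried over from the comment is that an optional field is updated only when the caller provides it.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface Observation {
  topicKey: string;
  narrative: string;
  sourceDetail: string | undefined;
  valueCategory: string | undefined;
}

interface UpsertInput {
  topicKey: string;
  narrative: string;
  sourceDetail?: string;
  valueCategory?: string;
}

// Stand-in for an upsert that overwrites optional fields only when provided.
function upsert(store: Map<string, Observation>, input: UpsertInput): Observation {
  const existing = store.get(input.topicKey);
  const next: Observation = {
    topicKey: input.topicKey,
    narrative: input.narrative,
    sourceDetail: input.sourceDetail ?? existing?.sourceDetail,
    valueCategory: input.valueCategory ?? existing?.valueCategory,
  };
  store.set(input.topicKey, next);
  return next;
}

const store = new Map<string, Observation>();

// Initial hook capture with no category.
upsert(store, { topicKey: 'auth', narrative: 'v1', sourceDetail: 'hook' });

// Merge write that sets sourceDetail but omits the category:
// the existing (undefined) category is kept.
const stale = upsert(store, { topicKey: 'auth', narrative: 'v2', sourceDetail: 'explicit' });

// Merge write that also forwards the evaluated category.
const fixed = upsert(store, {
  topicKey: 'auth',
  narrative: 'v2',
  sourceDetail: 'explicit',
  valueCategory: 'core',
});
```

In the real merge/evolve paths, the forwarded value would presumably be formationResult.evaluation.category, as the comment suggests.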
Summary
This PR packages the first three milestones of the 1.0.6 mainline into one reviewable checkpoint:
What changed
Phase 1: provenance foundation
- sourceDetail and valueCategory to observation/index document shapes
Phase 2: layered disclosure
- session_start output into L1/L2/L3-oriented sections
Phase 3: evidence retrieval
- memorix_detail
- memorix_timeline
Why
Memorix 1.0.6 is moving from a flat memory pool toward:
This PR establishes the foundation without jumping ahead into graph fusion, storage migration, or a full citation system.
Validation
- npm run build
- npx vitest run
Deliberately not included