
develop #665

Merged
wileland merged 19 commits into main from develop
Mar 4, 2026

Conversation

@wileland
Owner

@wileland wileland commented Mar 4, 2026

wileland and others added 18 commits February 26, 2026 00:05
* chore(codex): phase2 enrichment relay task

* feat(enrichment): add Phase 2 relay stage
…ifact-registry-v0) (#650)

* feat: oracle foundation (MemoryChunk receipts + context-pack-v0 + artifact-registry-v0)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(specs): remove leading --- from context-pack-v0 and artifact-registry-v0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…ext pack (#654)

* feat(oracle): wire Atlas $vectorSearch retrieval into reflection context pack

Implements Phase 3 of Oracle retrieval (context-pack-v0.md):
- In reflectEntryWithContext (already-async layer), embed the entry
  transcript via generateEmbedding and run MemoryChunk.aggregate with
  $vectorSearch (index=VECTOR_INDEX_NAME, path=embedding, top-k=5,
  userId tenancy filter).
- Hallucination firewall: only chunks scoring >= 0.72 are mapped into
  retrievedReceipts ({content, messageIds, score}).
- Soft-fail: try/catch around the entire retrieval block; reflection
  continues unconditionally if Oracle retrieval throws.
- buildMCPContext signature unchanged (sealed caller in reflection.worker.js).
- 8 new Oracle retrieval tests covering score gating, soft-fail paths,
  userId filter assertion, and messageIds fallback.

All 285 tests pass (5 skipped).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
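The score gate described above can be sketched as a small pure function. This is illustrative only — `gateReceipts` and `MIN_RECEIPT_SCORE` are hypothetical names, not the actual `reflectEntryWithContext` code:

```javascript
// Sketch of the hallucination firewall: only chunks scoring >= 0.72 become
// receipts. Hypothetical helper, not the commit's implementation.
const MIN_RECEIPT_SCORE = 0.72;

function gateReceipts(chunks) {
  return chunks
    .filter((c) => typeof c.score === "number" && c.score >= MIN_RECEIPT_SCORE)
    .map((c) => ({
      content: c.content,
      // messageIds fallback: default to an empty list when absent.
      messageIds: c.messageIds ?? [],
      score: c.score,
    }));
}
```

In the commit, the whole retrieval block additionally sits inside a try/catch so reflection continues even when the $vectorSearch call throws.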

* fix(memory): persist userId tenancy on MemoryChunk ingest

* fix(librarian): pass userId to knowledge-store for tenant-scoped chunks

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…655)

* feat(archive): Phase 4A sovereign archive pipeline

Implements the full sovereign archive ingestion pipeline for Echo Doj0:

- Task 0: MemoryChunk.js schema — adds 5 Phase 4A fields (chunkId,
  densityScore, sourceRole, supabaseUserId, sourceFileSha256) plus
  two new indexes: { userId, sourceRole } and { chunkId } unique sparse.

- Task 7: TDD test suite (parseChatGptExport, chunkConversations,
  scoreDensity) — 342 passed, 5 skipped, 0 failed.

- Task 1: parseChatGptExport.js — pure fn, user-only (doctrine),
  SHA-256 textHash, sourceId = chatgpt:{convId}:{msgId}.

- Task 2: chunkConversations.js — scene merging ≤4000 chars,
  deterministic sort (createdAt + messageId tie-break), stable chunkId,
  MIN_CHUNK_CHARS filter, full provenance on every chunk.

- Task 3: scoreDensity.js + applyKeepRatio — density scoring with
  length/unique/sentence factors; keepRatio slice with tie-break sort.

- Task 4: embedAndStore.js — library module (no dotenv), concurrency-5
  worker pool, 3-attempt retry with exponential backoff, upsert on
  { chunkId } for true idempotency.

- Task 5: importChatGptExport.js — CLI entrypoint; normalizes array
  or {conversations:[]} export shapes; dryRun + quarantineOut modes;
  dynamic import of embedAndStore to avoid ESM hoisting / openai init
  race with dotenv.

- Task 6: validateRetrieval.js — CLI smoke test; Atlas $vectorSearch
  with userId filter; prints receipts with score ≥ 0.72.

ESM dotenv fix: config/openai.js evaluates OPENAI_API_KEY at module
load time. CLI entrypoints use dynamic import() for OpenAI-dependent
modules so dotenv.config() runs first in the module body.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(archive): hard-split oversized chunks

Add splitMessage() with layered split strategy (paragraph → sentence → hard-slice)
so every output chunk satisfies content.length <= MAX_CHUNK_CHARS. Single messages
larger than MAX_CHUNK_CHARS (e.g. the 54,595-char quarantine case) are expanded into
sub-parts before the scene-merging loop, closing the bypass that occurred when buffer
was empty.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
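The layered strategy can be sketched roughly like this — a simplified stand-in for the commit's `splitMessage()`, not the real implementation. Note this sketch drops the boundary separators it splits on:

```javascript
// Layered split: try paragraph boundaries, then sentence boundaries, then a
// hard slice, so every returned part is <= max chars. Simplified sketch only;
// separators between emitted parts are dropped.
function splitMessage(text, max) {
  if (text.length <= max) return [text];
  for (const sep of ["\n\n", ". "]) {
    const pieces = text.split(sep);
    if (pieces.length > 1) {
      const out = [];
      let buf = "";
      for (const p of pieces) {
        const joined = buf ? buf + sep + p : p;
        if (joined.length <= max) {
          buf = joined;
        } else {
          if (buf) out.push(buf);
          buf = p;
        }
      }
      if (buf) out.push(buf);
      // A single piece may still be oversized; split it with the next layer.
      return out.flatMap((s) => (s.length <= max ? [s] : splitMessage(s, max)));
    }
  }
  // Last resort: hard slice into fixed-width windows.
  const out = [];
  for (let i = 0; i < text.length; i += max) out.push(text.slice(i, i + max));
  return out;
}
```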

* feat(archive): --dir batch ingestion + manifest + resume

Add --dir batch mode to importChatGptExport.js:
- Glob conversations-*.json in target dir (lexical, no recursion)
- Process shards sequentially, never in parallel
- Write JSONL manifest entry per file (APPEND, never overwrite):
  {file, sourceFileSha256, startedAt, endedAt, parsedCount,
   chunkedCount, promotedCount, storedCount, skippedCount,
   errorCount, status, error}
- --resume=true: skip paths already status=success in manifest,
  logging "⏭️  Skipping (already success): <file>"
- File-level failures write failed entry and continue; never abort
- --quarantineOut APPENDs across all shards (unified JSONL)
- Prints summary: ✅ Completed | ⏭️ Skipped | ❌ Failed | 💾 Total
- Default manifestOut: ./.backup/import_manifest.jsonl
- --file and --dir are mutually exclusive → hard error if both
- Single-file mode (--file) unchanged; no manifest written

Dry-run verified: 16 shards, 1855 promoted chunks, 0 errors
Live import: 143 chunks stored, all manifest entries success
Atlas: max content 4000 chars, 0 oversized

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
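The --resume check can be sketched as a small manifest reader. `successSet` is a hypothetical helper; the field names follow the manifest entry shape listed above:

```javascript
// Build the set of files already imported successfully, so --resume can skip
// them. Malformed manifest lines are ignored rather than aborting the run.
function successSet(manifestJsonl) {
  const done = new Set();
  for (const line of manifestJsonl.split("\n")) {
    if (!line.trim()) continue;
    try {
      const entry = JSON.parse(line);
      if (entry.status === "success") done.add(entry.file);
    } catch {
      // Skip unparseable lines; a partial append must not block resume.
    }
  }
  return done;
}
```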

* chore(deps): update package manifest and lockfile for archive pipeline

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…ce receipts (#659)

Replace all 4 agent fn stubs with real implementations:
- scribeFn: Whisper transcription + SHA-256 provenance anchor
- enrichmentFn: GPT-4o-mini extraction (Mirror not Oracle)
- reflectionFn: Assembly line (extract → validate → compare → generate)
- archivistFn: MongoDB write with full provenance trail

Add 4 pipeline stages (server/src/agents/pipeline/):
- extractStage: embed transcript via text-embedding-3-small
- validateStage: Atlas $vectorSearch with userId tenancy + score >= 0.72
- compareStage: SHA-256 integrity check, drops tampered chunks
- generateStage: GPT-4o-mini reflection grounded in receipt chunks

Upgrade ReflectionOutput to z.discriminatedUnion on receiptStatus:
- Path A (receipts_found): grounded reflection + content guard + orbXp
- Path B (no_receipts_available): present-tense only, orbXp=0

Content guard enforces provenance: historical claims must match receipt
quote field or throw INVALID_OUTPUT. Hash integrity drops mismatched
chunks with sanitized warnings (no user content in logs).

425 passed, 0 failed, 5 skipped (31 new tests).

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
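The two receiptStatus paths can be illustrated without zod as a plain discriminated check. The real code uses `z.discriminatedUnion`; `validateReflectionOutput` is a hypothetical name showing only the shape of the two paths:

```javascript
// Plain-JS sketch of the discriminated union on receiptStatus.
// Path A requires receipts; Path B forbids XP. Throws INVALID_OUTPUT on
// violation, mirroring the content guard described above.
function validateReflectionOutput(out) {
  if (out.receiptStatus === "receipts_found") {
    if (!Array.isArray(out.receipts) || out.receipts.length === 0) {
      throw new Error("INVALID_OUTPUT: receipts_found requires receipts");
    }
    return out;
  }
  if (out.receiptStatus === "no_receipts_available") {
    if (out.orbXp !== 0) {
      throw new Error("INVALID_OUTPUT: no_receipts_available requires orbXp=0");
    }
    return out;
  }
  throw new Error("INVALID_OUTPUT: unknown receiptStatus");
}
```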
* feat(chat): Phase 6 Context Pack and Companion Chat v1

- 4 new Mongoose models: ChatSession, ChatMessage, ContextSnapshot, ContextPack
- Retrieval adapter maps chat domain to Phase 5 pipeline (text → transcript)
- Companion Chat: gpt-4o-mini, temperature 0, receipts-or-silence prompts
- Context Pack: cryptographically anchored save state (seedHash, not seedText)
- GraphQL mutations/queries in main tree (server/graphql/), not agent tree
- ContextSnapshot audit trail for every chat response and context pack
- 18 new test assertions across 3 test files (445 total passing, 0 regressions)
- Zero Entry modifications — chat is a separate domain
- Tenancy filter (userId) enforced on all vector queries via sealed pipeline

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(chat): merge typedefs + verify jwt + mutation context pack

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…terface (#662)

Adds the client-side companion chat UI that calls the Phase 6 GraphQL
surface (companionChat, buildContextPack, getContextPack). Receipt
status badges distinguish grounded vs present-moment responses. The
ReceiptsDrawer renders verified user quotes with truncated textHash
and sourceEntryId — no AI-generated text inside the drawer.

New components: CompanionChat, ChatMessage, ChatInput, ReceiptsDrawer,
ContextPackPanel, ChatPage. Route at /companion with sidebar nav link.
7 frontend test cases covering optimistic render, mutation shape,
receipt badge paths, drawer content, and sessionId persistence.

No server files modified. Backend: 445 passed, 0 failed.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…Persona lens (#663)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…nfig, seed, README (#664)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions

github-actions bot commented Mar 4, 2026

Audit

Audit artifacts not found.

Schema

No drift detected ✅

Codex

No Codex outputs yet.

@wileland wileland merged commit 843b60f into main Mar 4, 2026
3 checks passed

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3b43082bd8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +37 to +38
const { data } = await companionChat({
variables: { sessionId, message: text },


P1: Send auth token when calling companion chat mutation

This mutation is invoked without any Authorization header or context, but the new server resolver path requires a verified JWT and throws "Authentication required" when the token is empty (server/graphql/resolvers/chat.js, getUserIdFromContext). With the current Apollo setup only static headers are sent, so /companion requests will fail for normal browser sessions unless the token is explicitly attached per request. The same pattern appears in ContextPackPanel and ArenaPortal.

Useful? React with 👍 / 👎.
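One minimal fix shape, per the comment above, is to merge the token into headers per request. `withAuth` is a hypothetical helper; in Apollo Client this logic usually lives in an auth link rather than at each call site:

```javascript
// Hypothetical helper: attach a bearer token per request instead of relying
// on static headers configured once at client setup.
function withAuth(headers, token) {
  return token ? { ...headers, Authorization: `Bearer ${token}` } : { ...headers };
}
```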

if (!entryId) return;

try {
await createEnrichmentTask(entryId);


P1: Preserve reflection enqueue after introducing enrichment stage

The scribe completion listener now enqueues enrichment instead of reflection, and reflection is only enqueued from enrichment worker events. In the standard server startup path (server/index.js), only the scribe and reflection workers are attached, so when running queue mode without a separately started enrichment runner, entries stop progressing after scribe completion. This creates a pipeline dead-end for that deployment mode.

Useful? React with 👍 / 👎.

