Skip to content

[3570] Memory distill + cortex visibility integration across TUI/webchat#3571

Open
njfio wants to merge 15 commits intomasterfrom
codex/issue-3558-timeout-marker-clarity
Open

[3570] Memory distill + cortex visibility integration across TUI/webchat#3571
njfio wants to merge 15 commits intomasterfrom
codex/issue-3558-timeout-marker-clarity

Conversation

@njfio
Copy link
Owner

@njfio njfio commented Mar 1, 2026

Closes #3570

Issue

Spec

  • specs/3570-memory-distill-cortex-visibility.md

What changed

  • Expanded memory-distill candidate extraction coverage for additional user identity/location/constraint phrasing.
  • Enriched gateway memory-distill status contract with last-cycle counters and recent write summaries.
  • Added /memory-distill command alias in TUI (/memory retained) and surfaced richer memory-distill snapshot lines in sync output.
  • Added/updated targeted tests for extraction coverage, status contract fields, and TUI command/snapshot handling.

Why

  • Operators need deterministic evidence that distill is actively processing sessions and writing memory, with parity across gateway status and TUI surfaces.

Test evidence

  • cargo test -p tau-gateway unit_distill_candidates_extracts_alias_location_and_allergy_constraint -- --nocapture
  • cargo test -p tau-gateway integration_memory_distill_cycle_writes_memory_and_checkpoints_processed_entries -- --nocapture
  • cargo test -p tau-gateway integration_gateway_status_endpoint_returns_service_snapshot -- --nocapture
  • cargo test -p tau-tui unit_parse_local_tui_command_maps_dashboard_tools_routines_cortex_memory_sync_and_colors -- --nocapture
  • cargo test -p tau-tui integration_gateway_sync_snapshot_full_mode_reflects_status_contract_for_tools_and_cortex -- --nocapture
  • cargo test -p tau-tui functional_spec_c15_agent_launch_summary_is_compact_and_command_oriented -- --nocapture

Copilot AI review requested due to automatic review settings March 1, 2026 04:47
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3fb2b18abb

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +50 to +54
pub(super) fn start_memory_distill_runtime(
state: Arc<GatewayOpenResponsesServerState>,
heartbeat_enabled: bool,
heartbeat_interval: Duration,
) -> MemoryDistillRuntimeHandle {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Wire memory-distill runtime into gateway startup path

This new runtime entry point is effectively unreachable in this commit: there is no corresponding module wiring/call site in the gateway openresponses startup/status path, so distillation never runs and no live distill metrics are produced for /gateway/status. In practice, the feature added here remains dead code and operators cannot observe or benefit from memory distill behavior.

Useful? React with 👍 / 👎.

Comment on lines +4946 to +4950
let memory_distill_enabled = json_pointer_bool(
&status_payload,
"/gateway/web_ui/memory_distill_runtime/enabled",
)
.unwrap_or(false);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Surface missing distill fields instead of coercing to defaults

These fallbacks coerce absent memory_distill_runtime fields to false/0, so when the gateway payload does not include this contract the TUI reports a seemingly valid "disabled/zero" state instead of signaling missing data. That masks contract regressions and can mislead operators during incident triage because schema breakage appears as healthy telemetry.

Useful? React with 👍 / 👎.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to improve operator observability and parity for memory distill + cortex visibility across gateway status, TUI, and webchat/dashboard surfaces (Issue #3570). It also includes documentation/spec updates for the interactive start marker timeout contract (Issue #3558) and a new milestone spec entry.

Changes:

  • Added a new memory distill runtime implementation (checkpointing, candidate extraction, status snapshot fields, and tests).
  • Updated interactive progress start marker contract to emit both turn_timeout_ms and request_timeout_ms, and aligned docs/tests.
  • Added/updated specs and operator docs to reflect the above behavior/contracts.

Reviewed changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
specs/milestones/m325/index.md Adds milestone writeup for interactive timeout marker contract fix.
specs/3570-memory-distill-cortex-visibility.md Adds implementation spec for memory distill + observability parity.
specs/3558/tasks.md Documents completed task breakdown + verification commands for marker contract change.
specs/3558/spec.md Defines acceptance criteria and verification evidence for dual-timeout marker output.
specs/3558/plan.md Documents approach/risks for the marker contract change.
docs/guides/operator-deployment-guide.md Updates operator-facing marker examples to the new dual-timeout format.
crates/tau-gateway/src/gateway_openresponses/tests.rs Extends gateway tests for webchat/status alignment, tools fixtures, SSE events, and distill status fields.
crates/tau-gateway/src/gateway_openresponses/memory_distill_runtime.rs Introduces a background distill runtime with checkpointing, extraction, and write reporting.
crates/tau-coding-agent/src/startup_local_runtime.rs Wires request_timeout_ms into InteractiveRuntimeConfig.
crates/tau-coding-agent/src/runtime_loop.rs Updates interactive start marker formatting + tests to include both timeout domains.
README.md Updates interactive marker examples to the new dual-timeout format.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

}
};

let semantic_store = semantic_memory_store(state.config.memory_state_dir.as_path());
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This module references state.config.memory_state_dir and a semantic_memory_store(...) helper, but neither exists in the current tau-gateway codebase. As written, the code cannot compile once the module is wired in. Please either implement semantic_memory_store (or switch to an existing store constructor) and add memory_state_dir to GatewayOpenResponsesServerConfig, then ensure the module is declared (e.g. mod memory_distill_runtime;) so it’s actually built and exercised by tests.

Suggested change
let semantic_store = semantic_memory_store(state.config.memory_state_dir.as_path());
let semantic_store = state
.config
.state_dir
.join("openresponses")
.join("semantic_memory");

Copilot uses AI. Check for mistakes.
Comment on lines +88 to +93
let cycle_report = run_memory_distill_cycle(
state.as_ref(),
checkpoint_path.as_path(),
&mut checkpoints,
);
state.record_memory_distill_cycle_report(&cycle_report);
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

run_memory_distill_cycle does synchronous filesystem and memory-store I/O, but it’s executed directly inside a tokio::spawn task on every tick. This can block the async runtime threads and degrade gateway latency. Run the cycle via tokio::task::spawn_blocking (or convert the inner I/O to tokio::fs) so the interval task stays non-blocking.

Suggested change
let cycle_report = run_memory_distill_cycle(
state.as_ref(),
checkpoint_path.as_path(),
&mut checkpoints,
);
state.record_memory_distill_cycle_report(&cycle_report);
// Move the current checkpoints state into a blocking task to avoid
// running synchronous I/O on the async runtime threads.
let checkpoints_in = std::mem::replace(
&mut checkpoints,
MemoryDistillCheckpointState::default(),
);
let state_for_blocking = state.clone();
let checkpoint_path_for_blocking = checkpoint_path.clone();
let blocking_handle = tokio::task::spawn_blocking(move || {
let mut checkpoints = checkpoints_in;
let cycle_report = run_memory_distill_cycle(
state_for_blocking.as_ref(),
checkpoint_path_for_blocking.as_path(),
&mut checkpoints,
);
(checkpoints, cycle_report)
});
match blocking_handle.await {
Ok((new_checkpoints, cycle_report)) => {
checkpoints = new_checkpoints;
state.record_memory_distill_cycle_report(&cycle_report);
}
Err(_) => {
// If the blocking task panicked or was cancelled, keep the
// default checkpoints state and skip reporting this cycle.
}
}

Copilot uses AI. Check for mistakes.
Comment on lines +808 to +809
if value.len() > 120 {
value.truncate(120);
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sanitize_distilled_value uses String::truncate(120), which panics if index 120 is not a UTF-8 character boundary (e.g., non-ASCII user text). That would crash the distill runtime on certain inputs. Truncate on a char boundary instead (e.g., take the first N chars() or compute the nearest valid boundary) to avoid panics.

Suggested change
if value.len() > 120 {
value.truncate(120);
if value.chars().count() > 120 {
value = value.chars().take(120).collect();

Copilot uses AI. Check for mistakes.
Comment on lines +715 to +724
if !value.ends_with('?') {
candidates.push(DistilledMemoryCandidate {
kind: "goal",
summary: format!("User goal: {value}"),
facts: vec![value.clone()],
tags: vec!["goal".to_string()],
memory_type: MemoryType::Goal,
importance: 0.85,
});
}
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The !value.ends_with('?') guard is ineffective because sanitize_distilled_value truncates at ?, so the returned value will never end with ?. If the intent is to skip extracting goals from questions, inspect the raw tail/original text for a trailing ? (or have sanitize_distilled_value report the terminator) instead of checking the sanitized value.

Suggested change
if !value.ends_with('?') {
candidates.push(DistilledMemoryCandidate {
kind: "goal",
summary: format!("User goal: {value}"),
facts: vec![value.clone()],
tags: vec!["goal".to_string()],
memory_type: MemoryType::Goal,
importance: 0.85,
});
}
candidates.push(DistilledMemoryCandidate {
kind: "goal",
summary: format!("User goal: {value}"),
facts: vec![value.clone()],
tags: vec!["goal".to_string()],
memory_type: MemoryType::Goal,
importance: 0.85,
});

Copilot uses AI. Check for mistakes.
Comment on lines +727 to +735
if let Some(value) = extract_phrase_value(&normalized, &["i can't", "i cannot"]) {
candidates.push(DistilledMemoryCandidate {
kind: "constraint",
summary: format!("User constraint: {value}"),
facts: vec![value.clone()],
tags: vec!["constraint".to_string()],
memory_type: MemoryType::Fact,
importance: 0.7,
});
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Constraint extraction for "i can't"/"i cannot" drops the negation (e.g. "I can't use sudo" becomes "User constraint: use sudo"), which flips the meaning. Preserve the negation in the extracted summary/fact (e.g. store "cannot use sudo" or include an explicit constraint_cannot= style fact) so the distilled memory is semantically correct.

Copilot uses AI. Check for mistakes.
Comment on lines +831 to +846
fn build_distilled_memory_id(
session_key: &str,
entry_id: u64,
candidate: &DistilledMemoryCandidate,
candidate_index: usize,
) -> String {
let material = format!(
"session={session_key}|entry_id={entry_id}|kind={}|index={candidate_index}|summary={}",
candidate.kind, candidate.summary
);
let digest = fnv1a64_hex(material.as_bytes());
format!(
"auto:{}:entry:{}:{}:{}",
session_key, entry_id, candidate.kind, digest
)
}
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

build_distilled_memory_id includes candidate_index in the hashed material. Any future change in candidate ordering (e.g., adding a new extractor earlier) will change IDs for the same underlying fact, breaking idempotence and potentially duplicating memories. Prefer an ID derived only from stable inputs (session_key, entry_id, kind, and/or normalized extracted value) without the positional index.

Copilot uses AI. Check for mistakes.
session_lock_wait_ms: 500,
session_lock_stale_ms: 10_000,
state_dir: root.join(".tau/gateway"),
memory_state_dir: root.join(".tau/memory"),
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GatewayOpenResponsesServerConfig does not define a memory_state_dir field (see gateway_openresponses/server_state.rs), so this struct literal will not compile. Either add memory_state_dir: PathBuf to the config/state (and wire it through) or remove/replace this field here and in the other config literals added in this PR.

Suggested change
memory_state_dir: root.join(".tau/memory"),

Copilot uses AI. Check for mistakes.
@njfio
Copy link
Owner Author

njfio commented Mar 1, 2026

Follow-up fix pushed for blank-assistant failure cases.\n\nWhat changed:\n- turn.failed structured events now append an Assistant pane line: assistant error: when no assistant answer text exists in-turn.\n- Plain stderr failure lines (interactive turn failed:, request timed out, request cancelled) now also append the same assistant-visible failure fallback.\n- Duplicate identical failure lines are deduped to reduce noise.\n\nVerification:\n- cargo test -p tau-tui regression_failed_turn_event_adds_assistant_failure_line_when_no_answer_text_exists -- --nocapture\n- cargo test -p tau-tui regression_failed_turn_event_dedupes_identical_assistant_failure_lines -- --nocapture\n- cargo test -p tau-tui regression_plain_failed_turn_line_adds_assistant_failure_line -- --nocapture

@njfio
Copy link
Owner Author

njfio commented Mar 1, 2026

Additional fix pushed based on latest TUI screenshot:\n\n- New turns now always reset stale failed progress to on , even if arrives as a string (or missing).\n- This prevents the Turn panel from showing previous state while a fresh turn is already in progress.\n\nRegression added:\n- \n\nValidation run:\n-
running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 30 filtered out; finished in 0.00s

running 1 test
test tests::regression_turn_submitted_resets_failed_progress_even_with_string_prompt_chars ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 74 filtered out; finished in 0.00s

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 5 filtered out; finished in 0.00s\n-
running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 30 filtered out; finished in 0.00s

running 1 test
test tests::regression_plain_failed_turn_line_adds_assistant_failure_line ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 74 filtered out; finished in 0.00s

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 5 filtered out; finished in 0.00s\n-
running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 30 filtered out; finished in 0.00s

running 1 test
test tests::regression_failed_turn_event_adds_assistant_failure_line_when_no_answer_text_exists ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 74 filtered out; finished in 0.00s

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 5 filtered out; finished in 0.00s

@njfio
Copy link
Owner Author

njfio commented Mar 1, 2026

Additional fix pushed based on latest TUI screenshot.\n\n- New turns now always reset stale failed progress to queued on turn.submitted, even if prompt_chars arrives as a string or is missing.\n- This prevents the Turn panel from showing previous failed state while a fresh turn is already in progress.\n\nRegression added:\n- regression_turn_submitted_resets_failed_progress_even_with_string_prompt_chars\n\nValidation run:\n- cargo test -p tau-tui regression_turn_submitted_resets_failed_progress_even_with_string_prompt_chars -- --nocapture\n- cargo test -p tau-tui regression_plain_failed_turn_line_adds_assistant_failure_line -- --nocapture\n- cargo test -p tau-tui regression_failed_turn_event_adds_assistant_failure_line_when_no_answer_text_exists -- --nocapture

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Memory distill + cortex visibility integration across TUI and webchat

2 participants