[3570] Memory distill + cortex visibility integration across TUI/webchat#3571
[3570] Memory distill + cortex visibility integration across TUI/webchat#3571
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3fb2b18abb
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| pub(super) fn start_memory_distill_runtime( | ||
| state: Arc<GatewayOpenResponsesServerState>, | ||
| heartbeat_enabled: bool, | ||
| heartbeat_interval: Duration, | ||
| ) -> MemoryDistillRuntimeHandle { |
There was a problem hiding this comment.
Wire memory-distill runtime into gateway startup path
This new runtime entry point is effectively unreachable in this commit: there is no corresponding module wiring/call site in the gateway openresponses startup/status path, so distillation never runs and no live distill metrics are produced for /gateway/status. In practice, the feature added here remains dead code and operators cannot observe or benefit from memory distill behavior.
Useful? React with 👍 / 👎.
| let memory_distill_enabled = json_pointer_bool( | ||
| &status_payload, | ||
| "/gateway/web_ui/memory_distill_runtime/enabled", | ||
| ) | ||
| .unwrap_or(false); |
There was a problem hiding this comment.
Surface missing distill fields instead of coercing to defaults
These fallbacks coerce absent memory_distill_runtime fields to false/0, so when the gateway payload does not include this contract the TUI reports a seemingly valid "disabled/zero" state instead of signaling missing data. That masks contract regressions and can mislead operators during incident triage because schema breakage appears as healthy telemetry.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Pull request overview
This PR aims to improve operator observability and parity for memory distill + cortex visibility across gateway status, TUI, and webchat/dashboard surfaces (Issue #3570). It also includes documentation/spec updates for the interactive start marker timeout contract (Issue #3558) and a new milestone spec entry.
Changes:
- Added a new memory distill runtime implementation (checkpointing, candidate extraction, status snapshot fields, and tests).
- Updated interactive progress start marker contract to emit both
turn_timeout_msandrequest_timeout_ms, and aligned docs/tests. - Added/updated specs and operator docs to reflect the above behavior/contracts.
Reviewed changes
Copilot reviewed 11 out of 12 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| specs/milestones/m325/index.md | Adds milestone writeup for interactive timeout marker contract fix. |
| specs/3570-memory-distill-cortex-visibility.md | Adds implementation spec for memory distill + observability parity. |
| specs/3558/tasks.md | Documents completed task breakdown + verification commands for marker contract change. |
| specs/3558/spec.md | Defines acceptance criteria and verification evidence for dual-timeout marker output. |
| specs/3558/plan.md | Documents approach/risks for the marker contract change. |
| docs/guides/operator-deployment-guide.md | Updates operator-facing marker examples to the new dual-timeout format. |
| crates/tau-gateway/src/gateway_openresponses/tests.rs | Extends gateway tests for webchat/status alignment, tools fixtures, SSE events, and distill status fields. |
| crates/tau-gateway/src/gateway_openresponses/memory_distill_runtime.rs | Introduces a background distill runtime with checkpointing, extraction, and write reporting. |
| crates/tau-coding-agent/src/startup_local_runtime.rs | Wires request_timeout_ms into InteractiveRuntimeConfig. |
| crates/tau-coding-agent/src/runtime_loop.rs | Updates interactive start marker formatting + tests to include both timeout domains. |
| README.md | Updates interactive marker examples to the new dual-timeout format. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| } | ||
| }; | ||
|
|
||
| let semantic_store = semantic_memory_store(state.config.memory_state_dir.as_path()); |
There was a problem hiding this comment.
This module references state.config.memory_state_dir and a semantic_memory_store(...) helper, but neither exists in the current tau-gateway codebase. As written, the code cannot compile once the module is wired in. Please either implement semantic_memory_store (or switch to an existing store constructor) and add memory_state_dir to GatewayOpenResponsesServerConfig, then ensure the module is declared (e.g. mod memory_distill_runtime;) so it’s actually built and exercised by tests.
| let semantic_store = semantic_memory_store(state.config.memory_state_dir.as_path()); | |
| let semantic_store = state | |
| .config | |
| .state_dir | |
| .join("openresponses") | |
| .join("semantic_memory"); |
| let cycle_report = run_memory_distill_cycle( | ||
| state.as_ref(), | ||
| checkpoint_path.as_path(), | ||
| &mut checkpoints, | ||
| ); | ||
| state.record_memory_distill_cycle_report(&cycle_report); |
There was a problem hiding this comment.
run_memory_distill_cycle does synchronous filesystem and memory-store I/O, but it’s executed directly inside a tokio::spawn task on every tick. This can block the async runtime threads and degrade gateway latency. Run the cycle via tokio::task::spawn_blocking (or convert the inner I/O to tokio::fs) so the interval task stays non-blocking.
| let cycle_report = run_memory_distill_cycle( | |
| state.as_ref(), | |
| checkpoint_path.as_path(), | |
| &mut checkpoints, | |
| ); | |
| state.record_memory_distill_cycle_report(&cycle_report); | |
| // Move the current checkpoints state into a blocking task to avoid | |
| // running synchronous I/O on the async runtime threads. | |
| let checkpoints_in = std::mem::replace( | |
| &mut checkpoints, | |
| MemoryDistillCheckpointState::default(), | |
| ); | |
| let state_for_blocking = state.clone(); | |
| let checkpoint_path_for_blocking = checkpoint_path.clone(); | |
| let blocking_handle = tokio::task::spawn_blocking(move || { | |
| let mut checkpoints = checkpoints_in; | |
| let cycle_report = run_memory_distill_cycle( | |
| state_for_blocking.as_ref(), | |
| checkpoint_path_for_blocking.as_path(), | |
| &mut checkpoints, | |
| ); | |
| (checkpoints, cycle_report) | |
| }); | |
| match blocking_handle.await { | |
| Ok((new_checkpoints, cycle_report)) => { | |
| checkpoints = new_checkpoints; | |
| state.record_memory_distill_cycle_report(&cycle_report); | |
| } | |
| Err(_) => { | |
| // If the blocking task panicked or was cancelled, keep the | |
| // default checkpoints state and skip reporting this cycle. | |
| } | |
| } |
| if value.len() > 120 { | ||
| value.truncate(120); |
There was a problem hiding this comment.
sanitize_distilled_value uses String::truncate(120), which panics if index 120 is not a UTF-8 character boundary (e.g., non-ASCII user text). That would crash the distill runtime on certain inputs. Truncate on a char boundary instead (e.g., take the first N chars() or compute the nearest valid boundary) to avoid panics.
| if value.len() > 120 { | |
| value.truncate(120); | |
| if value.chars().count() > 120 { | |
| value = value.chars().take(120).collect(); |
| if !value.ends_with('?') { | ||
| candidates.push(DistilledMemoryCandidate { | ||
| kind: "goal", | ||
| summary: format!("User goal: {value}"), | ||
| facts: vec![value.clone()], | ||
| tags: vec!["goal".to_string()], | ||
| memory_type: MemoryType::Goal, | ||
| importance: 0.85, | ||
| }); | ||
| } |
There was a problem hiding this comment.
The !value.ends_with('?') guard is ineffective because sanitize_distilled_value truncates at ?, so the returned value will never end with ?. If the intent is to skip extracting goals from questions, inspect the raw tail/original text for a trailing ? (or have sanitize_distilled_value report the terminator) instead of checking the sanitized value.
| if !value.ends_with('?') { | |
| candidates.push(DistilledMemoryCandidate { | |
| kind: "goal", | |
| summary: format!("User goal: {value}"), | |
| facts: vec![value.clone()], | |
| tags: vec!["goal".to_string()], | |
| memory_type: MemoryType::Goal, | |
| importance: 0.85, | |
| }); | |
| } | |
| candidates.push(DistilledMemoryCandidate { | |
| kind: "goal", | |
| summary: format!("User goal: {value}"), | |
| facts: vec![value.clone()], | |
| tags: vec!["goal".to_string()], | |
| memory_type: MemoryType::Goal, | |
| importance: 0.85, | |
| }); |
| if let Some(value) = extract_phrase_value(&normalized, &["i can't", "i cannot"]) { | ||
| candidates.push(DistilledMemoryCandidate { | ||
| kind: "constraint", | ||
| summary: format!("User constraint: {value}"), | ||
| facts: vec![value.clone()], | ||
| tags: vec!["constraint".to_string()], | ||
| memory_type: MemoryType::Fact, | ||
| importance: 0.7, | ||
| }); |
There was a problem hiding this comment.
Constraint extraction for "i can't"/"i cannot" drops the negation (e.g. "I can't use sudo" becomes "User constraint: use sudo"), which flips the meaning. Preserve the negation in the extracted summary/fact (e.g. store "cannot use sudo" or include an explicit constraint_cannot= style fact) so the distilled memory is semantically correct.
| fn build_distilled_memory_id( | ||
| session_key: &str, | ||
| entry_id: u64, | ||
| candidate: &DistilledMemoryCandidate, | ||
| candidate_index: usize, | ||
| ) -> String { | ||
| let material = format!( | ||
| "session={session_key}|entry_id={entry_id}|kind={}|index={candidate_index}|summary={}", | ||
| candidate.kind, candidate.summary | ||
| ); | ||
| let digest = fnv1a64_hex(material.as_bytes()); | ||
| format!( | ||
| "auto:{}:entry:{}:{}:{}", | ||
| session_key, entry_id, candidate.kind, digest | ||
| ) | ||
| } |
There was a problem hiding this comment.
build_distilled_memory_id includes candidate_index in the hashed material. Any future change in candidate ordering (e.g., adding a new extractor earlier) will change IDs for the same underlying fact, breaking idempotence and potentially duplicating memories. Prefer an ID derived only from stable inputs (session_key, entry_id, kind, and/or normalized extracted value) without the positional index.
| session_lock_wait_ms: 500, | ||
| session_lock_stale_ms: 10_000, | ||
| state_dir: root.join(".tau/gateway"), | ||
| memory_state_dir: root.join(".tau/memory"), |
There was a problem hiding this comment.
GatewayOpenResponsesServerConfig does not define a memory_state_dir field (see gateway_openresponses/server_state.rs), so this struct literal will not compile. Either add memory_state_dir: PathBuf to the config/state (and wire it through) or remove/replace this field here and in the other config literals added in this PR.
| memory_state_dir: root.join(".tau/memory"), |
|
Follow-up fix pushed for blank-assistant failure cases.\n\nWhat changed:\n- turn.failed structured events now append an Assistant pane line: assistant error: when no assistant answer text exists in-turn.\n- Plain stderr failure lines (interactive turn failed:, request timed out, request cancelled) now also append the same assistant-visible failure fallback.\n- Duplicate identical failure lines are deduped to reduce noise.\n\nVerification:\n- cargo test -p tau-tui regression_failed_turn_event_adds_assistant_failure_line_when_no_answer_text_exists -- --nocapture\n- cargo test -p tau-tui regression_failed_turn_event_dedupes_identical_assistant_failure_lines -- --nocapture\n- cargo test -p tau-tui regression_plain_failed_turn_line_adds_assistant_failure_line -- --nocapture |
|
Additional fix pushed based on latest TUI screenshot:\n\n- New turns now always reset stale failed progress to on , even if arrives as a string (or missing).\n- This prevents the Turn panel from showing previous state while a fresh turn is already in progress.\n\nRegression added:\n- \n\nValidation run:\n- test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 30 filtered out; finished in 0.00s running 1 test test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 74 filtered out; finished in 0.00s running 0 tests test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 5 filtered out; finished in 0.00s\n- test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 30 filtered out; finished in 0.00s running 1 test test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 74 filtered out; finished in 0.00s running 0 tests test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 5 filtered out; finished in 0.00s\n- test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 30 filtered out; finished in 0.00s running 1 test test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 74 filtered out; finished in 0.00s running 0 tests test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 5 filtered out; finished in 0.00s |
|
Additional fix pushed based on latest TUI screenshot.\n\n- New turns now always reset stale failed progress to queued on turn.submitted, even if prompt_chars arrives as a string or is missing.\n- This prevents the Turn panel from showing previous failed state while a fresh turn is already in progress.\n\nRegression added:\n- regression_turn_submitted_resets_failed_progress_even_with_string_prompt_chars\n\nValidation run:\n- cargo test -p tau-tui regression_turn_submitted_resets_failed_progress_even_with_string_prompt_chars -- --nocapture\n- cargo test -p tau-tui regression_plain_failed_turn_line_adds_assistant_failure_line -- --nocapture\n- cargo test -p tau-tui regression_failed_turn_event_adds_assistant_failure_line_when_no_answer_text_exists -- --nocapture |
Closes #3570
Issue
Spec
What changed
/memory-distillcommand alias in TUI (/memoryretained) and surfaced richer memory-distill snapshot lines in sync output.Why
Test evidence
cargo test -p tau-gateway unit_distill_candidates_extracts_alias_location_and_allergy_constraint -- --nocapturecargo test -p tau-gateway integration_memory_distill_cycle_writes_memory_and_checkpoints_processed_entries -- --nocapturecargo test -p tau-gateway integration_gateway_status_endpoint_returns_service_snapshot -- --nocapturecargo test -p tau-tui unit_parse_local_tui_command_maps_dashboard_tools_routines_cortex_memory_sync_and_colors -- --nocapturecargo test -p tau-tui integration_gateway_sync_snapshot_full_mode_reflects_status_contract_for_tools_and_cortex -- --nocapturecargo test -p tau-tui functional_spec_c15_agent_launch_summary_is_compact_and_command_oriented -- --nocapture