feat: session transcripts for LLM KV cache stability#512
senamakel merged 6 commits into tinyhumansai:main from
Conversation
…e stability
- Added support for session transcript persistence, allowing the exact `Vec<ChatMessage>` to be stored as human-readable `.md` files.
- Introduced methods for loading previous transcripts to pre-populate messages for KV cache reuse, enhancing session continuity.
- Updated `AgentBuilder` to include an agent definition name for transcript file naming.
- Refactored `Agent` to handle transcript loading and persistence during session turns, ensuring byte-identical message handling across sessions.
- Created a new `transcript.rs` module to encapsulate transcript-related functionality, improving code organization and maintainability.

…tadata
- Added `usage` and `openhuman` fields to the `ApiChatResponse` struct to capture standard OpenAI usage metrics and OpenHuman backend metadata.
- Introduced `ApiUsage`, `OpenHumanMeta`, and related structs to encapsulate detailed usage and billing information.
- Updated `parse_native_response` and `extract_usage` methods to process and return usage data, improving the integration of usage tracking in chat responses.
- Modified the `UsageInfo` struct in traits to include new fields for cached input tokens and charged amount, enhancing the overall usage reporting capabilities.

…der responses
- Added usage extraction functionality to the `OpenAiCompatibleProvider`, enhancing the API response structure to include usage metrics.
- Updated the `parse_native_response` method to utilize a new `wrap_message` function for improved testability and clarity in response handling.
- Refactored tests to accommodate the new usage extraction logic, ensuring comprehensive coverage of the updated response structure.

…e tracking
- Added functionality to persist sub-agent conversation transcripts, enabling inspection for debugging and KV cache analysis.
- Introduced a new `persist_subagent_transcript` function to handle transcript writing, including metadata for input/output tokens and charged amounts.
- Updated the `Agent` implementation to accumulate usage statistics across iterations and include them in the session transcript.
- Enhanced the `write_transcript` function to include additional metadata fields, improving the detail of stored transcripts.
- Refactored tests to validate the new transcript persistence and usage tracking features, ensuring comprehensive coverage.

- Reformatted the `write_transcript` and `read_transcript` functions for better clarity by adjusting line breaks and indentation.
- Enhanced the `parse_meta` function to improve readability by restructuring the `get` closure.
- Updated test cases to maintain consistency with the new formatting, ensuring clarity in the test structure.
- Overall, these changes aim to enhance code maintainability and readability without altering functionality.
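Taken together, the commits above describe a delimiter-based `.md` format for persisting the exact message vector. A minimal sketch of that idea follows; `ChatMessage` and the delimiter constants mirror names used in this PR, but the rendering function and its body are illustrative assumptions, not the actual implementation:

```rust
// Illustrative sketch of the transcript wire format: each message is wrapped
// in HTML-comment delimiters so the .md file stays human-readable while
// remaining machine-parseable. Names follow the PR; the body is assumed.
struct ChatMessage {
    role: String,
    content: String,
}

const MSG_CLOSE: &str = "<!--/MSG-->";
const MSG_CLOSE_ESCAPED: &str = "<!--\\/MSG-->";

fn render_transcript(messages: &[ChatMessage]) -> String {
    let mut out = String::new();
    for msg in messages {
        out.push_str(&format!("<!--MSG role=\"{}\"-->\n", msg.role));
        // Escape the closing delimiter so content cannot terminate a block early.
        out.push_str(&msg.content.replace(MSG_CLOSE, MSG_CLOSE_ESCAPED));
        out.push_str(&format!("\n{}\n", MSG_CLOSE));
    }
    out
}

fn main() {
    let msgs = vec![ChatMessage { role: "user".into(), content: "hello".into() }];
    print!("{}", render_transcript(&msgs));
}
```

Because the delimiters are HTML comments, a Markdown renderer shows only the message bodies, which is what makes the files human-readable.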
📝 Walkthrough
Adds session transcript persistence and resume: new transcript module for read/write/find/resolve, agent-level fields and builder setter for agent name, turn/subagent code to load and persist transcripts and aggregate usage, and provider UsageInfo extended with cached tokens and billing fields.
Changes
Sequence Diagram(s)
sequenceDiagram
participant Agent as Agent
participant Transcript as Transcript Module
participant FS as Filesystem
participant Provider as Provider
Note over Agent,Transcript: Turn start (resume attempt)
Agent->>Agent: if history empty && cached_transcript_messages.is_none()
Agent->>Transcript: find_latest_transcript(workspace, agent_name)
Transcript->>FS: scan today's and yesterday's session dirs
FS-->>Transcript: return latest path?
alt path found
Agent->>Transcript: read_transcript(path)
Transcript->>FS: read & parse .md into messages + meta
FS-->>Transcript: SessionTranscript
Transcript-->>Agent: provide messages + meta
Agent->>Agent: set cached_transcript_messages & cache_boundary
else no path
Transcript-->>Agent: none
end
Note over Agent,Provider: Provider interaction loop
loop per provider iteration
alt resumed & first iteration
Agent->>Agent: consume cached prefix, append history delta
else
Agent->>Agent: build messages from full history
end
Agent->>Provider: send messages
Provider-->>Agent: response + usage
Agent->>Agent: accumulate usage & capture last-sent messages
end
Note over Agent,Transcript: Turn completion
alt turn succeeded
Agent->>Transcript: persist_session_transcript(path?, messages, aggregated_usage, meta)
Transcript->>FS: create date dirs & write .md with metadata + <!--MSG--> blocks
FS-->>Transcript: write success/failure
end
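The "scan today's and yesterday's session dirs" step in the diagram can be sketched as a scan for `{agent}_{index}.md` files that keeps the highest index. The function name and error handling here are assumptions, not the PR's actual code (which also falls back to yesterday's directory):

```rust
use std::fs;
use std::path::{Path, PathBuf};

// Hypothetical condensed view of the find_latest_transcript scan: pick the
// transcript with the highest numeric index for a given agent name.
fn find_latest_in_dir(dir: &Path, agent: &str) -> Option<PathBuf> {
    let prefix = format!("{agent}_");
    let mut best: Option<(u64, PathBuf)> = None;
    for entry in fs::read_dir(dir).ok()? {
        let path = entry.ok()?.path();
        // Only consider files named {agent}_{index}.md with a numeric index.
        if path.extension().and_then(|e| e.to_str()) != Some("md") {
            continue;
        }
        let Some(stem) = path.file_stem().and_then(|s| s.to_str()) else { continue };
        if let Some(idx) = stem.strip_prefix(&prefix).and_then(|i| i.parse::<u64>().ok()) {
            if best.as_ref().map_or(true, |(b, _)| idx > *b) {
                best = Some((idx, path));
            }
        }
    }
    best.map(|(_, p)| p)
}

fn main() {
    let dir = std::env::temp_dir().join("session_scan_demo");
    fs::create_dir_all(&dir).unwrap();
    fs::write(dir.join("coder_1.md"), "one").unwrap();
    fs::write(dir.join("coder_3.md"), "three").unwrap();
    println!("{:?}", find_latest_in_dir(&dir, "coder"));
}
```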
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
Possibly related PRs
🚥 Pre-merge checks | ✅ Passed checks (3 passed)
Actionable comments posted: 2
🧹 Nitpick comments (3)
src/openhuman/providers/compatible.rs (1)
1066-1099: Add debug trace for usage-source selection in `extract_usage`.
This branch is critical for token accounting and billing diagnostics; please add a structured debug log indicating whether values came from `openhuman` or the standard `usage` block and whether billing metadata was present (without sensitive payload logging).
As per coding guidelines: `src/**/*.rs`: Add substantial, development-oriented logs on new/changed flows; include logs at branch decisions and error handling, with structured stable prefixes.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/providers/compatible.rs` around lines 1066 - 1099: In extract_usage, add a structured debug trace that logs which source provided token counts (openhuman vs standard usage) and whether billing metadata was present: inspect oh, std_usage, oh_usage, and oh_billing and emit a development-level logger call (with a stable prefix) that records booleans like from_openhuman, from_standard, and has_billing and the numeric token values (input_tokens, output_tokens, cached_input_tokens) but no sensitive payloads; ensure the log is placed after computing the token variables and before returning the ProviderUsageInfo so the instrumentation reflects the branch decision used to populate ProviderUsageInfo.

src/openhuman/agent/harness/subagent_runner.rs (1)
432-444: Sub-agent transcripts lack usage metrics.
The token and cost fields are hardcoded to zero, whereas the main agent's `turn.rs` accumulates actual usage from provider responses. For consistency and debugging value, consider threading usage info from `run_inner_loop` to the transcript. Currently `run_inner_loop` discards the `ChatResponse.usage` data. This is acceptable for the initial implementation since transcripts are primarily for KV cache prefix replay, and sub-agent usage is typically a small fraction of total cost.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/harness/subagent_runner.rs` around lines 432 - 444: TranscriptMeta is being populated with zeroed usage and cost fields; update the pipeline to thread actual usage from run_inner_loop into the transcript: modify run_inner_loop (and its callers) to return or expose the ChatResponse.usage (or an aggregated Usage struct) instead of discarding it, sum input/output/token usage and charged amount across sub-agent provider responses, and then set TranscriptMeta.input_tokens, output_tokens, cached_input_tokens (if applicable), and charged_amount_usd from that aggregated Usage when constructing TranscriptMeta in subagent_runner.rs so transcripts reflect real usage for debugging and accounting.

src/openhuman/agent/harness/session/transcript.rs (1)
207-213: Edge case in escape/unescape.
If content contains the literal escape sequence `<!--\/MSG-->`, it will be corrupted to `<!--/MSG-->` on round-trip. This is extremely unlikely in practice (LLM conversations won't contain this exact byte sequence), but worth documenting in a code comment for future maintainers.
📝 Suggested documentation

 fn escape_content(content: &str) -> String {
+    // Note: if content contains the literal escape sequence `<!--\/MSG-->`,
+    // it will be incorrectly unescaped on read. This is an accepted limitation
+    // for an extremely unlikely edge case.
     content.replace(MSG_CLOSE, MSG_CLOSE_ESCAPED)
 }

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/harness/session/transcript.rs` around lines 207 - 213, The current escape_content and unescape_content functions perform a simple replace between MSG_CLOSE and MSG_CLOSE_ESCAPED but are not fully reversible if the original content already contains the literal escape sequence (<!--\/MSG-->), which will be transformed on round-trip; add a concise code comment above the functions escape_content and unescape_content documenting this edge case (mention the exact sequences MSG_CLOSE and MSG_CLOSE_ESCAPED and that an input containing MSG_CLOSE_ESCAPED will be corrupted on unescape), and optionally note that a more robust fix would be to use a non-ambiguous encoding or a two-pass escaping scheme if full reversibility is later required.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/openhuman/providers/compatible.rs`:
- Around line 1000-1008: The helper parse_native_response currently returns a
successful ProviderChatResponse with text None when api_response.choices is
empty; change parse_native_response to return a Result<ProviderChatResponse,
Error> (or existing error type) and propagate an Err when
choices.into_iter().next() is None so empty choices are treated as an error;
update callers (e.g., chat() and the other occurrence around
parse_native_response usage at the second site) to handle the Result (propagate
or map the error) to keep the contract consistent and avoid silently accepting
invalid payloads.
- Around line 333-343: OpenHumanUsage currently defines non-optional u64 token
fields (input_tokens/output_tokens/total_tokens/cached_input_tokens) with
#[serde(default)] which deserializes missing fields to 0 and causes
oh_usage.map(|u| u.input_tokens) to return Some(0) and block fallback; change
those fields to Option<u64> (e.g., input_tokens: Option<u64>) and update
extract_usage to use oh_usage.and_then(|u| u.input_tokens) / and_then for each
token so missing keys yield None and allow the fallback chain to proceed; also
add a debug/trace log in extract_usage noting which branch provided the usage
(OpenHuman vs standard source) when making the branch decision so flow changes
are observable.
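The `Option<u64>` fix described above hinges on `and_then` versus `map`: with `#[serde(default)]` on a plain `u64`, a missing key becomes `Some(0)` and blocks the fallback. A reduced sketch (struct shapes here are stand-ins for the real `OpenHumanUsage`/`ApiUsage` types, not the PR's definitions):

```rust
// With Option<u64> fields, a missing OpenHuman key yields None and the
// standard OpenAI usage value wins; with a defaulted u64 it would be Some(0).
struct OpenHumanUsage {
    input_tokens: Option<u64>,
}

struct StdUsage {
    prompt_tokens: u64,
}

fn pick_input_tokens(oh_usage: Option<&OpenHumanUsage>, std_usage: Option<&StdUsage>) -> Option<u64> {
    // and_then (not map) is the crux: a present OpenHuman block with a
    // missing field still falls through to the standard usage block.
    oh_usage
        .and_then(|u| u.input_tokens)
        .or_else(|| std_usage.map(|u| u.prompt_tokens))
}

fn main() {
    let oh = OpenHumanUsage { input_tokens: None };
    let std_u = StdUsage { prompt_tokens: 42 };
    println!("{:?}", pick_input_tokens(Some(&oh), Some(&std_u)));
}
```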
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 4f588128-4c71-45d8-8ba5-4b01eac28a71
📒 Files selected for processing (11)
src/openhuman/agent/harness/session/builder.rs
src/openhuman/agent/harness/session/mod.rs
src/openhuman/agent/harness/session/transcript.rs
src/openhuman/agent/harness/session/turn.rs
src/openhuman/agent/harness/session/types.rs
src/openhuman/agent/harness/subagent_runner.rs
src/openhuman/context/guard.rs
src/openhuman/context/manager.rs
src/openhuman/context/pipeline.rs
src/openhuman/providers/compatible.rs
src/openhuman/providers/traits.rs
…64> for OpenHumanUsage fields
- parse_native_response now returns Result and propagates an error when choices is empty instead of silently returning text: None
- OpenHumanUsage token fields changed to Option<u64> so missing keys yield None and allow fallback to standard OpenAI usage block
- extract_usage now logs which source (openhuman vs standard) provided token counts via structured tracing
- run_inner_loop returns AggregatedUsage so subagent transcripts reflect real usage instead of zeros
- Document escape_content/unescape_content edge case with pre-existing escape sequences
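The `AggregatedUsage` accumulation this commit mentions might look like the reduced sketch below; the field names follow the PR's `UsageInfo` extensions, but the struct itself and its `add` method are assumptions:

```rust
// Sketch of summing usage across provider iterations so sub-agent
// transcripts carry real totals instead of zeros.
#[derive(Default, Debug)]
struct AggregatedUsage {
    input_tokens: u64,
    output_tokens: u64,
    cached_input_tokens: u64,
    charged_amount_usd: f64,
}

impl AggregatedUsage {
    fn add(&mut self, input: u64, output: u64, cached: u64, charged: f64) {
        self.input_tokens += input;
        self.output_tokens += output;
        self.cached_input_tokens += cached;
        self.charged_amount_usd += charged;
    }
}

fn main() {
    let mut total = AggregatedUsage::default();
    total.add(100, 20, 80, 0.01); // iteration 1
    total.add(150, 30, 120, 0.02); // iteration 2
    println!("{total:?}");
}
```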
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/openhuman/agent/harness/session/transcript.rs`:
- Around line 207-223: The current escape_content/unescape_content pair is not
bijective because a pre-existing literal MSG_CLOSE_ESCAPED becomes MSG_CLOSE
during unescape; fix by making the scheme reversible: in escape_content
(function escape_content) first escape backslashes (e.g., replace "\" with an
escaped marker) and then escape MSG_CLOSE (MSG_CLOSE -> MSG_CLOSE_ESCAPED), and
in unescape_content perform the exact reverse order (first unescape
MSG_CLOSE_ESCAPED -> MSG_CLOSE, then unescape the backslash marker -> "\"), or
alternatively switch to a length-delimited encoding (prefix content with its
byte length) inside escape_content and decode accordingly in unescape_content;
update references to MSG_CLOSE and MSG_CLOSE_ESCAPED and ensure both functions
are exact inverses so every input round-trips exactly.
- Around line 166-175: resolve_new_transcript_path currently races by returning
a suggested filename (max+1) without reserving it; change the allocation to
atomically reserve a new transcript file by attempting to create the file with
O_EXCL style semantics in a retry loop: in resolve_new_transcript_path (or a new
helper e.g., reserve_new_transcript_path) compute sanitized name and index, then
try to open/create the candidate path using create_new
(OpenOptions::new().write(true).create_new(true)) or equivalent, and if
create_new fails because the file exists, increment index and retry until
success (with a reasonable cap/backoff); return the successfully created PathBuf
so the later fs::write in subagent_runner (calls at lines referenced) can safely
write without clobbering another concurrent writer. Ensure errors propagate via
Result and clean up partially created files on failure if needed.
- Around line 268-320: The parser currently treats malformed message tags as
end-of-input by using `break`, causing truncated files to load silently; in
`parse_messages()` replace those `break` branches (specifically the `else {
break; }` after failing to find `MSG_OPEN_SUFFIX`/role, and the final
double-`find` that falls back to `break` when `MSG_CLOSE` isn't found) with hard
errors (e.g., `anyhow::bail!` or `Err(anyhow!(...))`) that include context (the
current `search_from`/content_start index, the expected token like
`MSG_OPEN_PREFIX`/`MSG_OPEN_SUFFIX`/`MSG_CLOSE`, and a short snippet of the
remaining `raw`), so callers get an explicit parse error instead of a partial
`Ok(messages)` return; keep the successful message pushes using
`unescape_content` and `ChatMessage` unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 3d0eda17-61b5-4a1b-81b8-da8769149049
📒 Files selected for processing (3)
src/openhuman/agent/harness/session/transcript.rs
src/openhuman/agent/harness/subagent_runner.rs
src/openhuman/providers/compatible.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- src/openhuman/providers/compatible.rs
```rust
pub fn resolve_new_transcript_path(workspace_dir: &Path, agent_name: &str) -> Result<PathBuf> {
    let date_dir = today_session_dir(workspace_dir);
    fs::create_dir_all(&date_dir)
        .with_context(|| format!("create session dir {}", date_dir.display()))?;

    let sanitized = sanitize_agent_name(agent_name);
    let next_index = next_index(&date_dir, &sanitized)?;
    let filename = format!("{}_{}.md", sanitized, next_index);

    Ok(date_dir.join(filename))
```
Path allocation can collide under concurrent writes.
resolve_new_transcript_path() computes max + 1 and returns the filename without reserving it. The new sub-agent call sites in src/openhuman/agent/harness/subagent_runner.rs at Lines 302-308 and 392-398 can hit this concurrently for the same agent_id, choose the same {agent}_{index}.md, and let the later fs::write overwrite the first transcript. This needs an atomic reservation step (create_new + retry loop, or equivalent) in the allocation/write path.
Also applies to: 339-356
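The atomic reservation this comment asks for could be sketched as follows. `create_new(true)` gives O_EXCL semantics, so the first writer to claim an index wins and a concurrent writer retries with the next one; the retry cap is an arbitrary safety bound, and the function name is taken from the review suggestion rather than the PR:

```rust
use std::fs::OpenOptions;
use std::io::ErrorKind;
use std::path::{Path, PathBuf};

// Reserve a transcript path by creating the file exclusively; on collision,
// bump the index and retry.
fn reserve_transcript_path(date_dir: &Path, sanitized: &str, start_index: u64) -> std::io::Result<PathBuf> {
    let mut index = start_index;
    for _ in 0..10_000 {
        let candidate = date_dir.join(format!("{sanitized}_{index}.md"));
        match OpenOptions::new().write(true).create_new(true).open(&candidate) {
            // The file now exists on disk, so no other writer can claim it.
            Ok(_) => return Ok(candidate),
            Err(e) if e.kind() == ErrorKind::AlreadyExists => index += 1,
            Err(e) => return Err(e),
        }
    }
    Err(std::io::Error::new(ErrorKind::Other, "transcript index space exhausted"))
}

fn main() -> std::io::Result<()> {
    let dir = std::env::temp_dir().join("transcript_reserve_demo");
    std::fs::create_dir_all(&dir)?;
    let path = reserve_transcript_path(&dir, "agent", 1)?;
    println!("reserved {}", path.display());
    Ok(())
}
```

The later `fs::write` can then target the reserved path without clobbering a concurrent writer's file.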
```rust
/// Escape the closing delimiter so it cannot appear literally in message
/// content. Replaces `<!--/MSG-->` with `<!--\/MSG-->`.
///
/// **Known edge case:** if the original content already contains the literal
/// escape sequence `<!--\/MSG-->`, `unescape_content` will convert it into
/// `<!--/MSG-->`, corrupting the round-trip. In practice this sequence is
/// vanishingly unlikely in LLM output. A fully reversible fix would require
/// a two-pass escaping scheme (e.g. also escaping the backslash), but that
/// added complexity is not warranted unless the edge case materialises.
fn escape_content(content: &str) -> String {
    content.replace(MSG_CLOSE, MSG_CLOSE_ESCAPED)
}

/// Reverse the escaping applied by [`escape_content`]. See that function's
/// doc comment for the known edge case with pre-existing escape sequences.
fn unescape_content(content: &str) -> String {
    content.replace(MSG_CLOSE_ESCAPED, MSG_CLOSE)
```
Make the content escaping bijective.
This escape/unescape pair still cannot round-trip every valid ChatMessage.content: a literal <!--\/MSG--> is read back as <!--/MSG-->. That breaks the core “exact bytes back out” guarantee for any prompt or tool output that already contains the escaped token. Please switch to a fully reversible encoding here (for example, a length-delimited body or another bijective escaping scheme).
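One reversible two-pass scheme along the lines suggested here: escape the backslash first, then the delimiter, and unescape in the exact reverse order. The delimiter constants mirror the transcript module; treating `\` as the escape character is this sketch's assumption:

```rust
const MSG_CLOSE: &str = "<!--/MSG-->";
const MSG_CLOSE_ESCAPED: &str = "<!--\\/MSG-->";

fn escape_content(content: &str) -> String {
    content
        .replace('\\', "\\\\") // 1. double every backslash
        .replace(MSG_CLOSE, MSG_CLOSE_ESCAPED) // 2. escape the delimiter
}

fn unescape_content(content: &str) -> String {
    content
        .replace(MSG_CLOSE_ESCAPED, MSG_CLOSE) // reverse of step 2
        .replace("\\\\", "\\") // reverse of step 1
}

fn main() {
    // The previously lossy input now round-trips exactly: after step 1 every
    // original backslash is doubled, so a single-backslash `<!--\/MSG-->` can
    // only have been produced by step 2.
    let tricky = "already escaped: <!--\\/MSG-->";
    assert_eq!(unescape_content(&escape_content(tricky)), tricky);
    println!("round-trip ok");
}
```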
```rust
fn parse_messages(raw: &str) -> Result<Vec<ChatMessage>> {
    let mut messages = Vec::new();
    let mut search_from = 0;

    loop {
        // Find next <!--MSG role="..."--> opening tag
        let Some(open_start) = raw[search_from..].find(MSG_OPEN_PREFIX) else {
            break;
        };
        let open_start = search_from + open_start;
        let after_prefix = open_start + MSG_OPEN_PREFIX.len();

        // Extract role from between the quotes
        let Some(role_end) = raw[after_prefix..].find(MSG_OPEN_SUFFIX) else {
            break;
        };
        let role = raw[after_prefix..after_prefix + role_end].to_string();

        // Content starts after the opening tag + newline
        let content_start = after_prefix + role_end + MSG_OPEN_SUFFIX.len();
        let content_start = if raw[content_start..].starts_with('\n') {
            content_start + 1
        } else {
            content_start
        };

        // Find closing tag
        let close_tag = format!("\n{MSG_CLOSE}");
        let Some(content_end_rel) = raw[content_start..].find(&close_tag) else {
            // Try without leading newline for empty content
            let Some(content_end_rel) = raw[content_start..].find(MSG_CLOSE) else {
                break;
            };
            let content = &raw[content_start..content_start + content_end_rel];
            messages.push(ChatMessage {
                role,
                content: unescape_content(content),
            });
            search_from = content_start + content_end_rel + MSG_CLOSE.len();
            continue;
        };

        let content = &raw[content_start..content_start + content_end_rel];
        messages.push(ChatMessage {
            role,
            content: unescape_content(content),
        });

        search_from = content_start + content_end_rel + close_tag.len();
    }

    Ok(messages)
}
```
Malformed transcripts currently load as partial history.
The error branches in parse_messages() break and then return Ok(messages), so a truncated or hand-edited file is treated as a valid-but-shorter conversation. For resume/cache reuse that is worse than failing fast, because it silently changes the prefix sent back to the model. Please turn these malformed-tag paths into hard parse errors with context.
Suggested direction
- let Some(role_end) = raw[after_prefix..].find(MSG_OPEN_SUFFIX) else {
- break;
- };
+ let Some(role_end) = raw[after_prefix..].find(MSG_OPEN_SUFFIX) else {
+ anyhow::bail!(
+ "malformed transcript: unterminated role header at byte {}",
+ open_start
+ );
+ };
...
- let Some(content_end_rel) = raw[content_start..].find(&close_tag) else {
+ let Some(content_end_rel) = raw[content_start..].find(&close_tag) else {
// Try without leading newline for empty content
- let Some(content_end_rel) = raw[content_start..].find(MSG_CLOSE) else {
- break;
- };
+ let Some(content_end_rel) = raw[content_start..].find(MSG_CLOSE) else {
+ anyhow::bail!(
+ "malformed transcript: missing closing tag for role '{}'",
+ role
+ );
+ };📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```rust
fn parse_messages(raw: &str) -> Result<Vec<ChatMessage>> {
    let mut messages = Vec::new();
    let mut search_from = 0;

    loop {
        // Find next <!--MSG role="..."--> opening tag
        let Some(open_start) = raw[search_from..].find(MSG_OPEN_PREFIX) else {
            break;
        };
        let open_start = search_from + open_start;
        let after_prefix = open_start + MSG_OPEN_PREFIX.len();

        // Extract role from between the quotes
        let Some(role_end) = raw[after_prefix..].find(MSG_OPEN_SUFFIX) else {
            anyhow::bail!(
                "malformed transcript: unterminated role header at byte {}",
                open_start
            );
        };
        let role = raw[after_prefix..after_prefix + role_end].to_string();

        // Content starts after the opening tag + newline
        let content_start = after_prefix + role_end + MSG_OPEN_SUFFIX.len();
        let content_start = if raw[content_start..].starts_with('\n') {
            content_start + 1
        } else {
            content_start
        };

        // Find closing tag
        let close_tag = format!("\n{MSG_CLOSE}");
        let Some(content_end_rel) = raw[content_start..].find(&close_tag) else {
            // Try without leading newline for empty content
            let Some(content_end_rel) = raw[content_start..].find(MSG_CLOSE) else {
                anyhow::bail!(
                    "malformed transcript: missing closing tag for role '{}'",
                    role
                );
            };
            let content = &raw[content_start..content_start + content_end_rel];
            messages.push(ChatMessage {
                role,
                content: unescape_content(content),
            });
            search_from = content_start + content_end_rel + MSG_CLOSE.len();
            continue;
        };

        let content = &raw[content_start..content_start + content_end_rel];
        messages.push(ChatMessage {
            role,
            content: unescape_content(content),
        });

        search_from = content_start + content_end_rel + close_tag.len();
    }

    Ok(messages)
}
```
Summary
- The exact `Vec<ChatMessage>` sent to the LLM provider is persisted as `.md` files at `{workspace}/sessions/DDMMYYYY/{agent}_{index}.md`. On session resume, the transcript is read back to produce byte-identical messages, ensuring the inference backend's KV cache prefix is reused rather than re-prefilled.
- `openhuman.usage` and `openhuman.billing` metadata from API responses (including `cached_input_tokens` and `charged_amount_usd`) now flow into `UsageInfo`, which was previously always `None` from the compatible provider.
Key files
- src/openhuman/agent/harness/session/transcript.rs
- src/openhuman/agent/harness/session/turn.rs
- src/openhuman/agent/harness/session/types.rs: `session_transcript_path`, `cached_transcript_messages`, `agent_definition_name`
- src/openhuman/agent/harness/subagent_runner.rs
- src/openhuman/providers/compatible.rs: `usage` + `openhuman` blocks from API response
- src/openhuman/providers/traits.rs: `UsageInfo` gains `cached_input_tokens`, `charged_amount_usd`
Example transcript header
Test plan
- `cargo check` — clean compilation (core + Tauri shell)
- `cargo test` — all 2222 tests pass (0 failures)
- `{workspace}/sessions/` directory and transcript content
- `cache_hit_pct` in transcript header matches expected behavior
Summary by CodeRabbit