[Radar] Track upstream session-log fidelity gaps for cost and tool failures #97

## Background

External agent ecosystems keep surfacing a shared reliability gap: local session logs are useful, but some tools omit cost details, undercount usage, or report tool failures without enough error context. agenttrace should track this as a radar item before deciding whether parser warnings, docs notes, or fixture work are needed.

## Evidence

- Claude Code issue #23254 reports subagent token usage is shown by `/cost` but not persisted to local session JSONL, which limits downstream local reporting: https://github.com/anthropics/claude-code/issues/23254
- Claude Code issue #27361 reports JSONL transcripts can miss final message stop events, making output token counts unreliable for local cost tools: https://github.com/anthropics/claude-code/issues/27361
- Claude Code issue #55430 shows large repeated tool outputs and full-file writes dominating token usage in a real session, with transcript-size evidence: https://github.com/anthropics/claude-code/issues/55430
- Claude Code issue #51218 reports prompt-cache expiry during active sessions causing sudden token spikes after idle periods: https://github.com/anthropics/claude-code/issues/51218
- Codex issue #2420 reports the CLI can show `tool failed` while local logs do not include the underlying error detail: https://github.com/openai/codex/issues/2420
- Duplicate checks for `Claude Code JSONL token usage subagent cost`, `tool failure logs Codex CLI`, `cache token spike Claude Code`, and `session log fidelity upstream cost accounting` found no exact open issue in this repository.
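
To make the stop-event gap from #27361 concrete, here is a minimal sketch of how a local check might flag assistant records that lack a stop reason. The record shape (`type`, `message.stop_reason`, `message.usage`) and the function name `find_unfinished_assistant_records` are assumptions for illustration only; they have not been confirmed against a fixture, which is why this issue stays on radar.

```python
import json
from typing import Iterable


def find_unfinished_assistant_records(lines: Iterable[str]) -> list[int]:
    """Return 1-based line numbers of assistant records lacking a stop reason.

    Assumed (unverified) record shape:
    {"type": "assistant", "message": {"stop_reason": ..., "usage": {...}}}
    """
    suspicious = []
    for lineno, raw in enumerate(lines, start=1):
        raw = raw.strip()
        if not raw:
            continue
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed lines are a separate problem; skip them here
        if not isinstance(record, dict) or record.get("type") != "assistant":
            continue
        message = record.get("message")
        if not isinstance(message, dict):
            message = {}
        # A missing or null stop reason suggests the final event was not persisted,
        # so any output token count on this record may be an undercount.
        if message.get("stop_reason") is None:
            suspicious.append(lineno)
    return suspicious


# Hypothetical sample lines, not taken from a real transcript.
sample = [
    '{"type": "user", "message": {"content": "hi"}}',
    '{"type": "assistant", "message": {"stop_reason": "end_turn", "usage": {"output_tokens": 12}}}',
    '{"type": "assistant", "message": {"stop_reason": null, "usage": {"output_tokens": 1}}}',
]
print(find_unfinished_assistant_records(sample))
```

A check like this only flags records for review; it does not recover the missing token counts.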

## User value

Users reading local reports need to know when a metric is derived from complete local evidence versus when the upstream session log may be missing cost, failure, or final-output details.

## Adoption rationale

Clear confidence boundaries improve developer experience and reliability. They help users trust agenttrace reports while avoiding false precision when an upstream tool did not persist enough evidence.

## Suggested scope

- Keep this as radar until at least one minimal public fixture or reproducible local sample is available.
- Decide whether agenttrace should add parser-level confidence notes for known upstream log gaps.
- Decide whether docs should mention source-specific limitations for cost and tool-failure attribution.
- If fixture evidence becomes available, split concrete parser or product issues by source tool.
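
If parser-level confidence notes are adopted, one possible shape is a small registry of known upstream gaps keyed by source tool and field. The sketch below is an assumption, not an existing agenttrace API: `KnownGap`, `Confidence`, `confidence_for`, and the tool and field identifiers are all hypothetical names. Only the issue references come from the Evidence section.

```python
from dataclasses import dataclass
from enum import Enum


class Confidence(Enum):
    NO_KNOWN_GAP = "no_known_gap"  # nothing on record; not a guarantee of completeness
    KNOWN_GAP = "known_gap"        # an upstream issue documents missing or unreliable data


@dataclass(frozen=True)
class KnownGap:
    source_tool: str  # hypothetical identifier for the upstream tool
    field: str        # hypothetical name of the affected metric
    reference: str    # upstream issue documenting the gap


# Entries mirror the Evidence section; tool and field names are illustrative.
KNOWN_GAPS = (
    KnownGap("claude-code", "subagent_token_usage", "anthropics/claude-code#23254"),
    KnownGap("claude-code", "output_tokens", "anthropics/claude-code#27361"),
    KnownGap("codex", "tool_error_detail", "openai/codex#2420"),
)


def confidence_for(source_tool: str, field: str) -> tuple[Confidence, list[KnownGap]]:
    """Look up whether a reported metric has a documented upstream gap."""
    gaps = [
        gap for gap in KNOWN_GAPS
        if gap.source_tool == source_tool and gap.field == field
    ]
    if gaps:
        return Confidence.KNOWN_GAP, gaps
    return Confidence.NO_KNOWN_GAP, []


level, gaps = confidence_for("codex", "tool_error_detail")
print(level.value, [gap.reference for gap in gaps])
```

Naming the fallback level `NO_KNOWN_GAP` rather than "complete" keeps the note from overclaiming: the absence of a registry entry says nothing about whether the upstream log is actually complete.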

## Non-goals

- Do not infer private billing data that is not present in local logs.
- Do not upload or request user transcripts.
- Do not change parser behavior without fixture-backed evidence.
- Do not treat unrelated model-routing or hosted observability products as direct requirements.

## Acceptance criteria

- Maintainer decides whether this remains radar, becomes docs guidance, or splits into parser/product issues.
- Any follow-up issue names the source tool and the specific missing or unreliable field.
- Follow-up acceptance criteria require local fixture evidence or a reproducible public sample.
- agenttrace public copy avoids overclaiming exact cost attribution where the upstream log is known to be incomplete.

## Suggested lane

lane/radar, priority/P2, status/needs-human

## Risk

Medium. Overreacting could add noisy warnings; ignoring the signal could make reports look more precise than the underlying logs support.

## Source

source/radar: Tavily scan of public GitHub and ecosystem signals on 2026-05-04.
