Persist grounding traces to daily JSONL files #34

@anormang1992

Summary

Persist structured grounding trace data to daily JSONL files on disk, creating a durable, queryable history of all epistemic activity in the system.

Problem Statement

Grounding traces are currently ephemeral — they exist only for the duration of a single check() or learn_all() call and are then discarded. This means there is no historical record of what the agent attempted to ground, what gaps were encountered, what was learned, or how the graph evolved over time.

Without trace persistence, downstream features like the VRE Workstation (#TBD) cannot display historical activity, and any future epistemic memory or confidence decay system has no temporal data to reason over.

Proposed Solution

After each grounding operation, serialize the EpistemicResult (or a structured subset) to a daily JSONL file:

  • File path: one file per day, named YYYY-MM-DD.jsonl, under a configurable base directory (proposed default: ~/.vre/traces/)
  • Format: one JSON object per line, each representing a single grounding trace with timestamp, queried concepts, resolved concepts, gaps, grounding result, and any learning outcomes
  • Append-only: traces are appended to the day's file; no mutation of existing entries
  • Opt-in: trace persistence is off by default and enabled via configuration (e.g. a trace_dir parameter on the VRE class or a separate TraceWriter)

JSONL is chosen for simplicity: it requires no infrastructure, is trivially parseable, and keeps each line independently valid JSON, so files can be read as a stream.
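A minimal sketch of what such a writer could look like, assuming traces arrive as plain dicts. The `TraceWriter` name comes from the proposal above, but its methods, the `now` parameter, and the use of UTC for the day boundary are illustrative assumptions, not the VRE API.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, Optional, Union


class TraceWriter:
    """Hypothetical append-only writer: one JSONL file per (UTC) day."""

    def __init__(self, trace_dir: Union[str, Path] = "~/.vre/traces") -> None:
        self.trace_dir = Path(trace_dir).expanduser()

    def path_for(self, when: datetime) -> Path:
        # Daily file name, e.g. 2024-01-02.jsonl
        return self.trace_dir / f"{when:%Y-%m-%d}.jsonl"

    def write(self, trace: Dict[str, Any], now: Optional[datetime] = None) -> Path:
        # `now` is injectable so tests can pin the date (an assumption of this sketch).
        now = now or datetime.now(timezone.utc)
        record = {"timestamp": now.isoformat(), **trace}
        path = self.path_for(now)
        path.parent.mkdir(parents=True, exist_ok=True)
        # Mode "a" only ever appends; existing entries are never rewritten.
        with path.open("a", encoding="utf-8") as fh:
            # json.dumps escapes embedded newlines, so each record stays on one line.
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")
        return path
```

Because persistence is opt-in, a caller would construct a writer only when a trace directory is configured and skip the `write()` call otherwise.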

VRE Design Alignment

  • Inspectability: CLAUDE.md Section 8 lists inspectability as a core technology principle. Persisted traces make the agent's epistemic history fully inspectable after the fact.
  • Auditability: Section 5.1 requires that VRE returns "a structured representation of the epistemic pathway used." Persistence extends this guarantee beyond the current session.
  • No new graph concerns: Trace files are external artifacts — they do not modify the graph or the grounding contract. VRE remains stateless between calls; the trace writer is a side-effect layer.
  • Minimal footprint: JSONL files require no additional infrastructure, consistent with Section 8's emphasis on minimal dependencies.

Acceptance Criteria

  • Grounding traces are serialized to daily JSONL files after each check() call
  • Each trace entry includes: timestamp, queried concepts, resolved names, grounding result (grounded/not), gaps with types, and epistemic steps
  • Learning outcomes (accepted/rejected/skipped candidates) are included when learn() or learn_all() is invoked
  • Trace directory is configurable; persistence is opt-in
  • Files are append-only and each line is valid JSON
  • Existing grounding behavior is unaffected when tracing is disabled
  • An integration test verifies trace file creation and content structure
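The content-structure check in the last criterion could be sketched as a small reader that parses each line and verifies the expected fields. The key names in `REQUIRED_KEYS` are placeholders chosen for illustration; the actual schema is not fixed by this issue.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterator

# Hypothetical field names mirroring the acceptance criteria above.
REQUIRED_KEYS = {
    "timestamp",
    "queried_concepts",
    "resolved_names",
    "grounded",
    "gaps",
    "steps",
}


def iter_traces(path: Path) -> Iterator[Dict[str, Any]]:
    """Yield each trace entry, raising if a line is invalid or incomplete."""
    with path.open(encoding="utf-8") as fh:
        for line_no, line in enumerate(fh, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            entry = json.loads(line)  # raises json.JSONDecodeError on invalid JSON
            missing = REQUIRED_KEYS - entry.keys()
            if missing:
                raise ValueError(
                    f"{path.name}:{line_no} missing keys: {sorted(missing)}"
                )
            yield entry
```

An integration test could run a `check()` with tracing enabled, then assert that `list(iter_traces(path))` returns the expected number of entries without raising.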

Open Questions

  • Should policy evaluation results also be persisted in the trace, or kept separate?
  • What is the right granularity: one entry per check() call, or one entry per concept within a call?
  • Should there be a rotation/retention policy, or leave file management to the user?
  • Should the trace writer be synchronous (simple) or async (non-blocking)?
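To make the synchronous versus non-blocking trade-off concrete, here is one possible shape for the non-blocking option: a background thread draining a queue. `BackgroundTraceWriter` and its methods are hypothetical names, and the sketch writes to a single fixed path, leaving daily rotation out for brevity.

```python
import json
import queue
import threading
from pathlib import Path
from typing import Any, Dict, Optional


class BackgroundTraceWriter:
    """Hypothetical non-blocking writer: callers enqueue, a thread appends."""

    def __init__(self, path: Path) -> None:
        self._path = Path(path)
        # None is used as a sentinel telling the worker to stop.
        self._queue: "queue.Queue[Optional[Dict[str, Any]]]" = queue.Queue()
        self._thread = threading.Thread(target=self._drain, daemon=True)
        self._thread.start()

    def submit(self, trace: Dict[str, Any]) -> None:
        # Returns immediately; the grounding call is not blocked on disk I/O.
        self._queue.put(trace)

    def _drain(self) -> None:
        self._path.parent.mkdir(parents=True, exist_ok=True)
        with self._path.open("a", encoding="utf-8") as fh:
            while True:
                item = self._queue.get()
                if item is None:
                    break
                fh.write(json.dumps(item) + "\n")
                fh.flush()

    def close(self) -> None:
        # Flush everything queued so far, then stop the worker.
        self._queue.put(None)
        self._thread.join()
```

The cost of this approach is that traces still in the queue are lost if the process dies before `close()` runs, whereas a synchronous writer has each entry on disk before `check()` returns.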

Dependencies

None — this is a foundational feature with no upstream dependencies.
