HEALTH and CTX% columns should reflect real agent state, not just simulator #34

@arcavenai

What

marvel get sessions has HEALTH and CTX% columns. For real agent sessions (forestage, bare claude, generic runtimes), neither column ever carries real data; both stay at their placeholder values:

WORKSPACE  TEAM  ROLE    GEN  NAME              STATE    HEALTH   CTX%  DESK  AGENT
dev        solo  worker  1    solo-worker-g1-0  running  unknown  -     1     forestage

The values stay at unknown / - regardless of how long the agent runs, what it's doing, or how close it is to context exhaustion.

Observed

Live test on alpha-20260419.034700.20d90ac: launched a forestage role under marvel, injected a prompt (marvel inject ...), and watched Claude process it end-to-end. The agent was actively working, with "Channeling… (thinking)" visible in the captured pane, but the HEALTH and CTX% columns in marvel get sessions never changed.

Today these columns only move in the simulator tests (internal/simulator/engine.go).

What should happen

  1. HEALTH: a role with healthcheck.type: heartbeat should see its live agent's state reflected in the column — healthy while the agent is alive and responsive, failing after N missed heartbeats, unknown only before the first heartbeat arrives. The existing FailureCount / LastHeartbeat fields on Session are already the right shape; they're just never written by a real agent.

  2. CTX%: for LLM-backed runtimes (forestage, claude, and anything marvel manages whose runtime is a Claude Code subprocess), the session row should show the agent's current context window utilization as a percentage. The existing ContextPercent field is already displayed; the agent-side reporting is the missing half.
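The HEALTH transitions described in item 1 could be sketched roughly as below. This is only an illustration under stated assumptions: the Session struct copies the field names this issue mentions, but their types, the helper names (missedBeats, healthFor, evalHealth), the interval parameter, and the idea that FailureCount counts missed intervals are all hypothetical and not taken from marvel's code.

```go
package main

import (
	"fmt"
	"time"
)

// Session mirrors only the fields named in this issue.
// The field types are assumptions, not marvel's real definitions.
type Session struct {
	HealthState   string
	LastHeartbeat time.Time // zero value: no heartbeat has arrived yet
	FailureCount  int
}

// missedBeats (hypothetical) returns how many whole heartbeat intervals
// have elapsed since the last heartbeat was received.
func missedBeats(since, interval time.Duration) int {
	if interval <= 0 || since < 0 {
		return 0
	}
	return int(since / interval)
}

// healthFor (hypothetical) maps the heartbeat history to a HEALTH value:
// unknown before the first heartbeat, failing once the number of missed
// beats reaches the threshold, healthy otherwise.
func healthFor(seen bool, missed, threshold int) string {
	switch {
	case !seen:
		return "unknown"
	case missed >= threshold:
		return "failing"
	default:
		return "healthy"
	}
}

// evalHealth (hypothetical) updates FailureCount and HealthState in place
// and returns the new HEALTH value.
func evalHealth(s *Session, now time.Time, interval time.Duration, threshold int) string {
	seen := !s.LastHeartbeat.IsZero()
	if seen {
		s.FailureCount = missedBeats(now.Sub(s.LastHeartbeat), interval)
	}
	s.HealthState = healthFor(seen, s.FailureCount, threshold)
	return s.HealthState
}

func main() {
	now := time.Now()
	s := &Session{}
	fmt.Println(evalHealth(s, now, 10*time.Second, 3)) // before any heartbeat

	s.LastHeartbeat = now.Add(-5 * time.Second)
	fmt.Println(evalHealth(s, now, 10*time.Second, 3)) // recent heartbeat

	s.LastHeartbeat = now.Add(-45 * time.Second)
	fmt.Println(evalHealth(s, now, 10*time.Second, 3)) // four intervals missed
}
```

With a 10-second interval and a threshold of 3, the three calls in main cover the unknown, healthy, and failing cases in that order. The real controller may count failures differently; the point is only that the column has three distinct states driven by LastHeartbeat.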

Why

  • The columns imply information that isn't there. An operator scanning marvel get sessions sees HEALTH unknown and reasonably assumes "the health check hasn't fired yet" — when actually the signal was never going to arrive.
  • The heartbeat healthcheck type is designed to let marvel detect and restart hung agents. Without live agents writing heartbeats, the check's failure_threshold cannot fire, so marvel can't distinguish "working" from "stuck" for real agents.
  • CTX% is load-bearing for the shift feature (rolling rotation of exhausted-context agents). Without real ctx data, marvel can't make shift decisions that match operator intent.

Scope (what, not how)

  • Applies to at least the forestage and claude adapters in internal/runtime/. Whether the generic adapter should participate is a judgment call — probably opt-in per role.
  • Both ends of the wire need a design: what the agent emits and how marvel ingests it. Shipping one without the other leaves the columns empty.
  • The initial cut can be coarse — a heartbeat that just says "I'm alive" and a context-percent ping every N seconds — and get refined later.
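As one possible shape for that coarse first cut, the sketch below treats every agent report as a heartbeat and lets it optionally carry a context percentage. Everything here is hypothetical: the JSON field names, the Report type, the ingest and ctxColumn helpers, the in-memory session map, and the use of a nil pointer to mean "no context reading yet" are assumptions for illustration, not marvel's actual wire format or types.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Report is a hypothetical agent-to-marvel message. Any report counts as
// a heartbeat; context_percent is optional.
type Report struct {
	Session        string   `json:"session"`
	ContextPercent *float64 `json:"context_percent,omitempty"`
}

// Session carries the fields this issue names; the types are assumptions.
type Session struct {
	Name           string
	LastHeartbeat  time.Time
	FailureCount   int
	ContextPercent *float64 // nil until the agent reports a value
}

// now is a package-level variable so tests can swap in a fixed clock.
var now = time.Now

// ingest decodes one report, validates it, and applies it to the session.
// Invalid reports are rejected without changing any state.
func ingest(sessions map[string]*Session, raw []byte) error {
	var r Report
	if err := json.Unmarshal(raw, &r); err != nil {
		return fmt.Errorf("decode report: %w", err)
	}
	s, ok := sessions[r.Session]
	if !ok {
		return fmt.Errorf("unknown session %q", r.Session)
	}
	if r.ContextPercent != nil && (*r.ContextPercent < 0 || *r.ContextPercent > 100) {
		return fmt.Errorf("context_percent %v out of range", *r.ContextPercent)
	}
	s.LastHeartbeat = now() // stamped on receipt, not by the agent
	s.FailureCount = 0
	if r.ContextPercent != nil {
		p := *r.ContextPercent
		s.ContextPercent = &p
	}
	return nil
}

// ctxColumn renders the CTX% cell, keeping "-" until a value is known.
func ctxColumn(s *Session) string {
	if s.ContextPercent == nil {
		return "-"
	}
	return fmt.Sprintf("%.0f%%", *s.ContextPercent)
}

func main() {
	sessions := map[string]*Session{
		"solo-worker-g1-0": {Name: "solo-worker-g1-0"},
	}
	s := sessions["solo-worker-g1-0"]
	fmt.Println(ctxColumn(s)) // no report yet

	msg := []byte(`{"session":"solo-worker-g1-0","context_percent":42}`)
	if err := ingest(sessions, msg); err != nil {
		fmt.Println("ingest failed:", err)
		return
	}
	fmt.Println(ctxColumn(s), !s.LastHeartbeat.IsZero())
}
```

Stamping LastHeartbeat on receipt rather than trusting an agent-supplied timestamp is a design choice made here to sidestep clock skew; the transport (socket, file, CLI call) is deliberately left open.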

Not scope

  • Full OTEL instrumentation. OTEL is the right long-term answer but the daemon's Session struct and marvel get sessions output already expect these two specific values; filling them is narrower than a telemetry overhaul.
  • Agent-internal metrics (tool call counts, token rates, etc.). Those belong to a later observability surface.

Related

  • internal/api/types.go: Session.HealthState, LastHeartbeat, FailureCount, ContextPercent already defined
  • internal/team/controller.go:TestHealthEvalHeartbeatStale etc.: the controller already evaluates heartbeat staleness; it just never receives the data it is built to consume
  • internal/simulator/engine.go: reference implementation of what a compliant agent's state transitions look like

Environment

  • marvel 0.1.0-alpha.20260419.034700.20d90ac (commit 20d90ac)
  • forestage alpha-20260418-050527-3a316b0
  • Linux aarch64 Pi

Metadata

Labels

  • area.cli
  • cmd/marvel
  • priority.p2: Medium, should address this sprint
  • triage.new: Just arrived, not yet reviewed
  • type.bug: Broken behavior, something doesn't work as designed
