Context
Data Machine owns flows, pipelines, jobs, batch scheduling, and processed items. At the scale of 10k+ wiki articles and large graph backfills, operators need run-level visibility beyond individual job status.
Problem
Large backfills need durable metrics: items processed/skipped/failed, retries, duration, provider/source, flow/pipeline/agent/root context, and cost/token usage where available. Without this, a multi-day pipeline can fail quietly or produce poor-quality output before anyone notices.
Proposed scope
- Add generic run metrics emitted by flow/pipeline/system task execution.
- Track counts for processed, skipped, failed, retried, staged actions, and child job totals.
- Track duration and timestamps for start/end/last activity.
- Add optional token/cost fields when AI providers expose them.
- Expose metrics via CLI/REST/abilities.
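The scoped fields above can be sketched as a single generic record. This is a minimal illustration, not the actual DM schema: every field name here (`processed`, `staged`, `tokens_used`, etc.) is an assumption chosen to mirror the bullets.

```python
from dataclasses import dataclass, field
from typing import Optional
import time

@dataclass
class RunMetrics:
    # Context: which flow/pipeline produced this run (IDs are illustrative).
    flow_id: str
    pipeline_id: str
    # Outcome counters, incremented as items complete.
    processed: int = 0
    skipped: int = 0
    failed: int = 0
    retried: int = 0
    staged: int = 0
    child_jobs: int = 0
    # Timestamps as Unix epoch seconds.
    started_at: float = field(default_factory=time.time)
    ended_at: Optional[float] = None
    last_activity_at: Optional[float] = None
    # Optional usage fields, set only when the AI provider exposes them.
    tokens_used: Optional[int] = None
    cost_usd: Optional[float] = None

    def record(self, outcome: str, count: int = 1) -> None:
        """Increment one outcome counter and refresh the last-activity time."""
        setattr(self, outcome, getattr(self, outcome) + count)
        self.last_activity_at = time.time()

    @property
    def duration(self) -> Optional[float]:
        """Elapsed seconds; falls back to last activity for in-flight runs."""
        end = self.ended_at or self.last_activity_at
        return None if end is None else end - self.started_at
```

Because the counters are generic outcome names rather than domain terms, the same record can back a wiki backfill, a graph import, or any future pipeline without schema changes.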
Acceptance criteria
- A background flow can report progress and failure rates without domain-specific code.
- Metrics work for pipeline batches and system-task fan-out.
- Intelligence can display root/topic/source progress using generic DM run data plus its own brain context.
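The first criterion, progress and failure rates with no domain-specific code, can be sketched as a pure function over the generic counters. The function name and return shape are assumptions for illustration only:

```python
def summarize(processed: int, skipped: int, failed: int,
              total_expected: int) -> dict:
    """Derive an operator-facing summary from generic run counters alone.

    No wiki- or graph-specific logic: any pipeline that reports these
    counts gets progress and failure-rate reporting for free.
    """
    attempted = processed + skipped + failed
    return {
        "progress": attempted / total_expected if total_expected else 0.0,
        "failure_rate": failed / attempted if attempted else 0.0,
        "remaining": max(total_expected - attempted, 0),
    }

# e.g. summarize(80, 10, 10, 200)
# -> {"progress": 0.5, "failure_rate": 0.1, "remaining": 100}
```

A consumer such as Intelligence would join this generic summary with its own brain context (root/topic/source) to produce the richer view named in the last criterion.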