Expose run metrics for long background backfills #1735

@chubes4

Description

Context

Data Machine owns flows, pipelines, jobs, batch scheduling, and processed items. At 10k+ wiki articles and graph backfills, operators need run-level visibility beyond individual job status.

Problem

Large backfills need durable metrics: items processed/skipped/failed, retries, duration, provider/source, flow/pipeline/agent/root context, and cost/token usage where available. Without these metrics, a multi-day pipeline can fail quietly or produce poor-quality output before anyone notices.

Proposed scope

  • Add generic run metrics emitted by flow/pipeline/system task execution.
  • Track counts for processed, skipped, failed, retried, staged actions, and child job totals.
  • Track duration and timestamps for start/end/last activity.
  • Add optional token/cost fields when AI providers expose them.
  • Expose metrics via CLI/REST/abilities.
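
As a rough illustration of the fields listed above, a generic run-metrics record might look like the following sketch. All names here (`RunMetrics`, `flow_id`, `staged_actions`, etc.) are hypothetical, not Data Machine's actual schema:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical sketch of a generic run-metrics record.
# Field names are illustrative, not Data Machine's real schema.
@dataclass
class RunMetrics:
    flow_id: str
    pipeline_id: Optional[str] = None
    # Outcome counters accumulated across the run.
    processed: int = 0
    skipped: int = 0
    failed: int = 0
    retried: int = 0
    staged_actions: int = 0
    child_jobs: int = 0
    # Timestamps (Unix epoch seconds); last_activity_at updates on every event.
    started_at: Optional[float] = None
    ended_at: Optional[float] = None
    last_activity_at: Optional[float] = None
    # Optional usage fields, populated only when the AI provider reports them.
    tokens_used: Optional[int] = None
    cost_usd: Optional[float] = None

    @property
    def duration(self) -> Optional[float]:
        """Wall-clock duration, available once the run has ended."""
        if self.started_at is None or self.ended_at is None:
            return None
        return self.ended_at - self.started_at

    @property
    def failure_rate(self) -> float:
        """Failed items as a fraction of items attempted."""
        total = self.processed + self.failed
        return self.failed / total if total else 0.0
```

Keeping the record flat and domain-agnostic like this is what would let CLI/REST/abilities all expose the same payload without per-flow code.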

Acceptance criteria

  • A background flow can report progress and failure rates without domain-specific code.
  • Metrics work for pipeline batches and system-task fan-out.
  • Intelligence can display root/topic/source progress using generic DM run data plus its own brain context.
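
For the fan-out criterion, the key requirement is that child-job counters roll up into the parent run without domain-specific code. A minimal sketch, assuming counters are exposed as plain dicts (the function and key names are illustrative):

```python
from collections import Counter

def aggregate_child_metrics(children: list[dict]) -> dict:
    """Sum per-child outcome counters into a parent run's totals.

    Hypothetical helper: works for any counter keys, so pipeline
    batches and system-task fan-out need no special-casing.
    """
    total = Counter()
    for child in children:
        total.update(child)
    return dict(total)

parent = aggregate_child_metrics([
    {"processed": 40, "failed": 2},
    {"processed": 55, "failed": 0, "retried": 3},
])
# parent == {"processed": 95, "failed": 2, "retried": 3}
```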
