nemo_retriever: Add structured harness metrics reports #1699
jioffe502 wants to merge 7 commits into NVIDIA:main
Conversation
- route harness runs through shared batch, fused, and inprocess reports
- persist runtime and detection metrics without scraping stdout
- tighten tests around config and LanceDB metadata behavior

Signed-off-by: Jacob Ioffe <jioffe@nvidia.com>
Revert non-essential docs/example/utility churn so this branch only carries the structured run report metrics path in harness and mode runners.
Signed-off-by: Jacob Ioffe <jioffe@nvidia.com>
Keep harness run metadata stable across container/worktree setups by falling back to git HEAD refs when rev-parse fails, and persist detection counters as JSON metadata so detection summaries survive LanceDB round-trips. Signed-off-by: Jacob Ioffe <jioffe@nvidia.com>
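Because Arrow/LanceDB table metadata is a flat string-to-string map, nested detection counters can survive a round-trip by serializing them to a single JSON field. A minimal sketch of that idea (the `detection_counters` key and helper names are assumptions, not the PR's actual API):

```python
import json


def encode_detection_counters(counters: dict) -> dict:
    # flatten nested counters into one JSON string metadata field
    return {"detection_counters": json.dumps(counters, sort_keys=True)}


def decode_detection_counters(metadata: dict) -> dict:
    # recover the counters after a LanceDB round-trip; empty dict if absent
    return json.loads(metadata.get("detection_counters", "{}"))
```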
)
class FusedPipelineConfig(ModePipelineConfigModel):
This is almost exactly the same as the other PipelineConfig. Is there nothing we can reuse across both files? Maybe create a generalized version of the config and tweak it only where necessary. Also, we have no support for fused yet, so I don't think you need to add it.
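The suggested consolidation can be sketched with one shared config base; the actual configs in this PR are pydantic models, but the reuse pattern is the same with dataclasses, and the field names here are purely illustrative:

```python
from dataclasses import dataclass


@dataclass
class ModePipelineConfig:
    # fields shared by every run mode (names are illustrative)
    collection_name: str
    max_workers: int = 4


@dataclass
class BatchPipelineConfig(ModePipelineConfig):
    # batch-only tweaks live in the subclass, not in a parallel copy
    batch_size: int = 32


@dataclass
class InProcessPipelineConfig(ModePipelineConfig):
    # in-process mode reuses the base unchanged
    pass
```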
shutdown_ray_safely()
def render_fused_run_report(report: RunReport) -> None:
Rendering the report should be the same no matter the pipeline. We should not have different logic in each module that is particular to an application mode.
Consolidated rendering into shared report helpers in application/modes/reports.py and switched batch/inprocess to use the same render path.
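A minimal sketch of such a shared render path (the `RunReport` fields and output format here are assumptions, not the PR's actual schema):

```python
from dataclasses import dataclass, field


@dataclass
class RunReport:
    run_mode: str
    runtime_seconds: float
    metrics: dict = field(default_factory=dict)


def render_run_report(report: RunReport) -> str:
    # one render path for every mode; only the data differs
    lines = [
        f"run_mode: {report.run_mode}",
        f"runtime_seconds: {report.runtime_seconds:.2f}",
    ]
    lines += [f"{key}: {value}" for key, value in sorted(report.metrics.items())]
    return "\n".join(lines)
```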
)
class InProcessPipelineConfig(ModePipelineConfigModel):
return ingestor, file_patterns
def run_inprocess_pipeline(cfg: InProcessPipelineConfig) -> RunReport:
Shouldn't the only difference between this run pipeline and batch be the run_mode string? If it is, then this shouldn't exist as code; it should just be a switch into one generalized function.
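The reviewer's point, sketched as code under the assumption that the two pipelines really do differ only by the mode string (`run_pipeline` and its internals are hypothetical, not the PR's implementation):

```python
def run_pipeline(cfg, run_mode: str) -> dict:
    # single generalized entry point; mode-specific behavior keys off the string
    if run_mode not in {"batch", "inprocess"}:
        raise ValueError(f"unsupported run_mode: {run_mode}")
    # ... shared setup, ingest, and report assembly would go here ...
    return {"run_mode": run_mode}


def run_batch_pipeline(cfg) -> dict:
    return run_pipeline(cfg, "batch")


def run_inprocess_pipeline(cfg) -> dict:
    return run_pipeline(cfg, "inprocess")
```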
model_config = ConfigDict(extra="forbid")
class _TeeStream:
Why do we need stream controls? I thought this was removing the need for that. We don't care what the display on screen looks like (we leave that to lower-level libraries, like Ray); we just need the datapoints from the run, and we no longer get those from stdout/stderr, right?
Removed stream tee/logging controls from application/modes/shared.py to keep this path focused on structured datapoints and mode orchestration.
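In place of tee-ing stdout/stderr, the structured path can persist datapoints directly as a JSON artifact; a minimal sketch, where the file name and layout are assumptions:

```python
import json
from pathlib import Path


def write_run_report(artifact_dir: Path, datapoints: dict) -> Path:
    # persist run metrics as structured data instead of scraping console output
    artifact_dir.mkdir(parents=True, exist_ok=True)
    path = artifact_dir / "run_report.json"
    path.write_text(json.dumps(datapoints, indent=2, sort_keys=True))
    return path
```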
return import_module("lancedb")
def ensure_lancedb_table(uri: str, table_name: str) -> None:
Don't we have this in other parts of the codebase already? If it's not a duplicate, why move it here? This is LanceDB stuff, nothing to do with run modes. I should not have to import the lancedb package in this module.
Moved LanceDB lifecycle/read helpers out of application/modes/shared.py into vector_store/lancedb_store.py, and updated mode/detection consumers to import from there.
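A sketch of what such a store-level helper might look like, written against any connection object with a lancedb-style surface (`table_names`/`create_table`/`open_table`) so that mode modules never import lancedb themselves; the helper names and signatures are assumptions:

```python
from importlib import import_module


def connect_lancedb(uri: str):
    # lazy import keeps lancedb out of mode modules entirely
    return import_module("lancedb").connect(uri)


def ensure_table(db, table_name: str, schema=None):
    # create-if-missing, then open; idempotent across repeated calls
    if table_name not in db.table_names():
        db.create_table(table_name, schema=schema)
    return db.open_table(table_name)
```

Testing against a fake connection keeps the mode-level code free of the real dependency.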
return datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S_UTC")
_COMMIT_RE = re.compile(r"^[0-9a-fA-F]{7,40}$")
Why are we changing commit string conversion logic in this PR? Did you do other stuff in the PR as well?
Kept commit fallback hardening because it stabilizes latest_commit in container/worktree runs, and added focused tests for the gitdir ref-file and packed-refs fallback to make the scope explicit. It came from running inside a dev container.
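The fallback under discussion can be sketched as a direct read of the gitdir, covering both a loose ref file and packed-refs (the function name is illustrative; the real implementation lives in the harness):

```python
import re
from pathlib import Path
from typing import Optional

_COMMIT_RE = re.compile(r"^[0-9a-fA-F]{7,40}$")


def head_commit_fallback(git_dir: Path) -> Optional[str]:
    # resolve HEAD without invoking git, for containers where rev-parse fails
    head = (git_dir / "HEAD").read_text().strip()
    if _COMMIT_RE.match(head):
        return head  # detached HEAD stores the hash directly
    if not head.startswith("ref: "):
        return None
    ref_name = head[len("ref: "):]
    ref_path = git_dir / ref_name
    if ref_path.exists():
        commit = ref_path.read_text().strip()
        return commit if _COMMIT_RE.match(commit) else None
    packed = git_dir / "packed-refs"
    if packed.exists():
        # packed-refs lines look like "<hash> <refname>"
        for line in packed.read_text().splitlines():
            if line.endswith(" " + ref_name):
                commit = line.split(" ", 1)[0]
                if _COMMIT_RE.match(commit):
                    return commit
    return None
```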
| "max_workers", | ||
| "gpu_devices", | ||
| } | ||
| FUSED_TUNING_FIELDS = { |
We don't have support for fused, so why add the fused entries? Also, why are some of the kwargs prefixed with fused? Is it supposed to mean something different?
Removed the fused harness scope from this PR. harness/config.py and harness/run.py now only support batch/inprocess for this metrics work, and fused-specific harness test coverage was removed. run_fused is back to wrapper-only.
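With fused removed, the remaining harness scope can be made explicit through a per-mode tuning-field map; a hypothetical sketch of that validation (set contents mirror the field names in the diff above but are otherwise assumptions):

```python
SUPPORTED_RUN_MODES = {"batch", "inprocess"}

# hypothetical per-mode tuning fields; the batch entries echo the diff above
TUNING_FIELDS = {
    "batch": {"max_workers", "gpu_devices"},
    "inprocess": {"max_workers"},
}


def validate_tuning(run_mode: str, overrides: dict) -> None:
    # reject unsupported modes and unknown tuning keys up front
    if run_mode not in SUPPORTED_RUN_MODES:
        raise ValueError(f"unsupported run_mode: {run_mode}")
    unknown = set(overrides) - TUNING_FIELDS[run_mode]
    if unknown:
        raise ValueError(f"unknown tuning fields for {run_mode}: {sorted(unknown)}")
```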
Align the harness to structured batch/inprocess reporting boundaries so artifacts and runtime metadata remain interpretable and reviewable. This also removes stale fused-path plumbing and consolidates LanceDB/detection/report helpers to reduce drift in future harness iterations. Signed-off-by: Jacob Ioffe <jioffe@nvidia.com>
Switch the retriever harness from stdout scraping to structured run reports shared across supported run modes. Follow-on commits also harden artifact metadata handling so these reports stay reliable in real container/worktree runs.
latest_commit populated in harness artifacts by falling back to .git/HEAD refs when git rev-parse fails in container/worktree setups