nemo_retriever: Add structured harness metrics reports #1699

Open
jioffe502 wants to merge 7 commits into NVIDIA:main from jioffe502:fix/harness_metrics

Conversation

Collaborator

@jioffe502 jioffe502 commented Mar 23, 2026

Switch the retriever harness from stdout scraping to structured run reports shared across supported run modes. Follow-on commits also harden artifact metadata handling so these reports stay reliable in real container/worktree runs.

  • add shared batch, inprocess, and fused run-report plumbing and wire the harness to consume it
  • persist runtime and detection metrics as structured artifacts instead of parsing console output
  • keep latest_commit populated in harness artifacts by falling back to .git/HEAD refs when git rev-parse fails in container/worktree setups
  • preserve detection counters through LanceDB round-trips by writing metadata as JSON and accepting legacy metadata literals during detection-summary reads
  • validate behavior with real bo20/jp20 harness runs as integration coverage

- route harness runs through shared batch, fused, and inprocess reports
- persist runtime and detection metrics without scraping stdout
- tighten tests around config and LanceDB metadata behavior

Signed-off-by: Jacob Ioffe <jioffe@nvidia.com>
@jioffe502 jioffe502 requested a review from a team as a code owner March 23, 2026 21:52
@jioffe502 jioffe502 marked this pull request as draft March 23, 2026 22:00
jioffe502 and others added 5 commits March 24, 2026 16:55
Revert non-essential docs/example/utility churn so this branch only carries the structured run report metrics path in harness and mode runners.
Signed-off-by: Jacob Ioffe <jioffe@nvidia.com>
Keep harness run metadata stable across container/worktree setups by falling back to git HEAD refs when rev-parse fails, and persist detection counters as JSON metadata so detection summaries survive LanceDB round-trips.

Signed-off-by: Jacob Ioffe <jioffe@nvidia.com>
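The "persist detection counters as JSON metadata" part of this commit can be sketched roughly as follows. The function names here are hypothetical, not the PR's actual helpers; the point is the round-trip: write counters as JSON, and on read accept both JSON and the legacy `str(dict)` literals older artifacts may contain.

```python
import ast
import json


def encode_detection_metadata(counters: dict) -> str:
    """Write detection counters as JSON so they survive a storage round-trip."""
    return json.dumps(counters)


def decode_detection_metadata(raw: str) -> dict:
    """Read metadata written as JSON, accepting legacy Python-literal strings."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Older artifacts stored metadata via str(dict), e.g. "{'tables': 3}",
        # which is not valid JSON but is a valid Python literal.
        return ast.literal_eval(raw)
```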
@jioffe502 jioffe502 marked this pull request as ready for review March 30, 2026 21:17
@jioffe502 jioffe502 requested a review from a team as a code owner March 30, 2026 21:17
@jioffe502 jioffe502 requested a review from edknv March 30, 2026 21:17
)


class FusedPipelineConfig(ModePipelineConfigModel):
Collaborator

This is almost exactly the same as the other PipelineConfig. Is there nothing we can reuse across both files? Maybe create a generalized version of the config and tweak it only where necessary. Also, we have no fused support yet, so I don't think you need to add it.
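The reuse the reviewer is suggesting might look like the following sketch. The real code builds on a Pydantic `ModePipelineConfigModel`; this uses plain dataclasses and invented field names purely to illustrate the pattern of one generalized config with minimal per-mode overrides.

```python
from dataclasses import dataclass


@dataclass
class ModePipelineConfig:
    """Fields shared by every run mode; subclasses override only what differs."""

    run_mode: str = "batch"
    max_workers: int = 4
    gpu_devices: int = 1


@dataclass
class InProcessPipelineConfig(ModePipelineConfig):
    """In-process mode reuses everything, changing only the mode tag."""

    run_mode: str = "inprocess"
```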

shutdown_ray_safely()


def render_fused_run_report(report: RunReport) -> None:
Collaborator

Rendering the report should be the same no matter the pipeline. We should not have mode-specific rendering logic in each module.

Collaborator Author

Consolidated rendering into shared report helpers in application/modes/reports.py and switched batch/inprocess to use the same render path.
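A shared render path of the kind described here could be sketched as below. `RunReport` and `render_run_report` are stand-ins, not the PR's actual types; the point is that one function handles every mode's report.

```python
from dataclasses import dataclass, field


@dataclass
class RunReport:
    """Minimal stand-in for the structured report every run mode returns."""

    run_mode: str
    metrics: dict = field(default_factory=dict)


def render_run_report(report: RunReport) -> str:
    """One shared render path, regardless of which mode produced the report."""
    lines = [f"run_mode: {report.run_mode}"]
    lines.extend(f"  {name}: {value}" for name, value in sorted(report.metrics.items()))
    return "\n".join(lines)
```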

)


class InProcessPipelineConfig(ModePipelineConfigModel):
Collaborator

Same as above: DRY this up.

return ingestor, file_patterns


def run_inprocess_pipeline(cfg: InProcessPipelineConfig) -> RunReport:
Collaborator

Shouldn't the only difference between this run pipeline and batch be the run_mode string? If so, this shouldn't exist as separate code; it should just be a switch into one generalized function.
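The "switch into one generalized function" idea can be sketched like this. The setup values and executor names are invented for illustration; the shape is one entry point with a per-mode dispatch table, so the shared logic is written exactly once.

```python
def run_pipeline(cfg: dict, run_mode: str) -> dict:
    """One generalized entry point; per-mode setup is a dispatch table, not a copy."""
    mode_setup = {
        "batch": lambda c: {"executor": "ray"},
        "inprocess": lambda c: {"executor": "local"},
    }
    if run_mode not in mode_setup:
        raise ValueError(f"unsupported run_mode: {run_mode}")
    runtime = mode_setup[run_mode](cfg)
    # Everything after setup is identical across modes, so it lives here once.
    return {"run_mode": run_mode, **runtime}
```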

model_config = ConfigDict(extra="forbid")


class _TeeStream:
Collaborator

Why do we need stream controls? I thought this PR was removing the need for that. We don't care what the on-screen display looks like (we leave that to lower-level libraries, like Ray); we just need the datapoints from the run, and we no longer get those from stdout/stderr, right?

Collaborator Author

Removed stream tee/logging controls from application/modes/shared.py to keep this path focused on structured datapoints and mode orchestration.
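The "structured datapoints instead of stdout" approach the thread converges on can be sketched as below. The file name and function are hypothetical; the idea is that metrics land in a JSON artifact the harness reads back, so nothing ever parses console output.

```python
import json
from pathlib import Path


def write_metrics_artifact(out_dir: Path, metrics: dict) -> Path:
    """Persist run metrics as a structured JSON artifact; nothing reads stdout."""
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / "run_metrics.json"
    path.write_text(json.dumps(metrics, indent=2, sort_keys=True))
    return path
```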

return import_module("lancedb")


def ensure_lancedb_table(uri: str, table_name: str) -> None:
Collaborator

Don't we have this in other parts of the codebase already? If it's not a duplicate, why move it here? This is LanceDB functionality, nothing to do with run modes; I should not have to import the lance package in this module.

Collaborator Author

Moved LanceDB lifecycle/read helpers out of application/modes/shared.py into vector_store/lancedb_store.py, and updated mode/detection consumers to import from there.

return datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S_UTC")


_COMMIT_RE = re.compile(r"^[0-9a-fA-F]{7,40}$")
Collaborator

Why are we changing commit string conversion logic in this PR? Did you do other stuff in the PR as well?

Collaborator Author

Kept the commit fallback hardening because it stabilizes latest_commit in container/worktree runs; it came from running inside a dev container. Added focused tests for the gitdir ref-file and packed-refs fallbacks to make the scope explicit.
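The fallback described here, resolving HEAD by reading `.git` files directly when `git rev-parse` fails, might look roughly like this. The function name is hypothetical; `_COMMIT_RE` mirrors the pattern visible in the diff context, and the loose-ref and packed-refs paths follow git's standard on-disk repository layout.

```python
import re
from pathlib import Path
from typing import Optional

_COMMIT_RE = re.compile(r"^[0-9a-fA-F]{7,40}$")


def latest_commit_fallback(git_dir: Path) -> Optional[str]:
    """Resolve HEAD by reading .git files directly, for setups where
    `git rev-parse` fails (containers, worktrees)."""
    head = (git_dir / "HEAD").read_text().strip()
    if _COMMIT_RE.match(head):
        return head  # detached HEAD: the file holds the SHA itself
    if head.startswith("ref: "):
        ref = head[len("ref: "):]
        ref_file = git_dir / ref
        if ref_file.exists():
            return ref_file.read_text().strip()  # loose ref file holds the SHA
        packed = git_dir / "packed-refs"
        if packed.exists():
            for line in packed.read_text().splitlines():
                if line.endswith(" " + ref):
                    return line.split(" ", 1)[0]  # packed-refs entry: "<sha> <ref>"
    return None
```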

"max_workers",
"gpu_devices",
}
FUSED_TUNING_FIELDS = {
Collaborator

We don't have support for fused, so why add the fused entries? Also, why are some of the kwargs prefixed with fused... is that supposed to mean something different?

Collaborator Author

Removed the fused harness scope from this PR: harness/config.py and harness/run.py now only support batch/inprocess for this metrics work, fused-specific harness test coverage was removed, and run_fused is back to wrapper-only.

Align the harness to structured batch/inprocess reporting boundaries so artifacts and runtime metadata remain interpretable and reviewable. This also removes stale fused-path plumbing and consolidates LanceDB/detection/report helpers to reduce drift in future harness iterations.

Signed-off-by: Jacob Ioffe <jioffe@nvidia.com>