Conversation
```python
ingestor = create_ingestor(run_mode="batch")
ingestor = (
    ingestor.files(documents)
```
How is this different from the normal batch_pipeline run? Why do we need to set this particular dataset and script here? The harness is not supposed to require more scripts to run; it should use batch_pipeline.py to run the graphs. Or we could run the graphs directly.
Thank you for pointing this out. Yes, batch_pipeline.py does seem to cover the exact same functionality that this script does, and some more.
```python
from nemo_retriever import create_ingestor

ingestor = create_ingestor(run_mode="batch")
```
Again, same as the other script: why do we need this? Can it be created dynamically?
```python
print(f"  Page index key check: {matched}/{len(sampled)} sampled source_ids found")

def main() -> int:
```
Why not make this a tool we can call via import, instead of a main function?
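One way to address this: pull the body of `main()` into an importable function and keep only a thin CLI shim. A minimal sketch, assuming hypothetical names (`run_eval`, `queries_path`, `batch_size`) that would need to mirror whatever the script currently parses from argv:

```python
import argparse


def run_eval(queries_path: str, batch_size: int = 50) -> dict:
    """Importable entry point: callers pass paths/params directly.

    Hypothetical signature -- the real arguments would mirror the
    script's current CLI flags. The existing body of main() (minus
    argv handling) would live here; this stub just echoes its inputs.
    """
    return {"queries_path": queries_path, "batch_size": batch_size}


def main() -> int:
    # Thin CLI wrapper so the script still runs standalone.
    parser = argparse.ArgumentParser()
    parser.add_argument("queries_path")
    parser.add_argument("--batch-size", type=int, default=50)
    args = parser.parse_args()
    run_eval(args.queries_path, batch_size=args.batch_size)
    return 0
```

With this split, the harness (or any notebook) can `from <module> import run_eval` without spawning a subprocess.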
```python
BATCH_SIZE = 50

def load_queries(csv_path: str) -> list[dict]:
```
I feel like these tools would be good to have even outside the harness, no? Like, if I run an evaluation without the harness, I will need all of these functions to actually get results, right?
```python
_HERE = os.path.dirname(os.path.abspath(__file__))

def main() -> int:
```
Seems like this is also a tool: I send it a parquet file (could be a dataframe) and it creates the page-level markdown. This is definitely useful outside the harness.
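The core of that tool could be a pure function over records, with the parquet/DataFrame reading kept at the edge (e.g. `pd.read_parquet(path).to_dict("records")` would produce the input shape). A sketch under that assumption -- the `source_id`/`page`/`text` keys are guesses at the schema, not the harness's actual columns:

```python
from collections import defaultdict


def build_page_markdown(records: list[dict]) -> dict[tuple, str]:
    """Group extraction records by (source_id, page) and join their
    text chunks into one markdown document per page.

    Accepting plain dicts keeps this importable without pandas;
    callers holding a DataFrame can pass df.to_dict("records").
    """
    pages = defaultdict(list)
    for rec in records:
        pages[(rec["source_id"], rec["page"])].append(rec["text"])
    # Blank line between chunks so headings/paragraphs stay separated.
    return {key: "\n\n".join(chunks) for key, chunks in pages.items()}
```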
```python
"""
Protocol definitions and dataclasses for the QA evaluation pipeline.
```
I think tools/harness is not where we want our code; I think we want it in the nv-ingest/nemo_retriever/ subfolder.
```python
)

class QAEvalPipeline:
```
We need to refactor this to work as an AbstractOperator type: inherit from that and put everything this does in the appropriate lifecycle hooks.
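The suggested refactor might look roughly like this. The hook names below (`setup`/`process`/`teardown`) are an assumption about AbstractOperator's contract, which is not shown in this diff -- the real base class's lifecycle methods may differ:

```python
from abc import ABC, abstractmethod


class AbstractOperator(ABC):
    """Stand-in for the real base class; hook names are assumed."""

    def setup(self) -> None:
        """Acquire clients, load prompts, etc."""

    @abstractmethod
    def process(self, records: list[dict]) -> list[dict]:
        """Transform a batch of records."""

    def teardown(self) -> None:
        """Release resources."""


class QAEvalPipeline(AbstractOperator):
    def setup(self) -> None:
        # e.g. construct the judge and LLM client here
        self.ready = True

    def process(self, records: list[dict]) -> list[dict]:
        # Everything the pipeline's current run logic does moves here.
        return [dict(r, evaluated=True) for r in records]
```

The payoff is that the pipeline then composes with other operators and gets its resource management for free from the framework's lifecycle.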
```python
Candidate answer: {candidate}"""

class LLMJudge:
```
This should be an AbstractOperator. The process portion is the judge method: I should be able to pass a list of retrievals (records in a dataframe with query, golden_text, and candidates), and the judge can provide a response judging each record.
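Concretely, the judge-as-operator shape described above could be sketched like this. The record keys (`query`, `golden_text`, `candidates`) come from the comment; the injected `llm_call` and its substring-match default are stubs standing in for the real judge prompt:

```python
class LLMJudge:
    """Judge as an operator: process() takes retrieval records and
    attaches one verdict per candidate per record."""

    def __init__(self, llm_call=None):
        # llm_call: fn(query, golden_text, candidate) -> verdict string.
        # Default is a trivial stub, NOT the real LLM-based judging.
        self.llm_call = llm_call or (
            lambda q, gold, cand: "correct" if gold in cand else "incorrect"
        )

    def process(self, records: list[dict]) -> list[dict]:
        out = []
        for rec in records:
            verdicts = [
                self.llm_call(rec["query"], rec["golden_text"], cand)
                for cand in rec["candidates"]
            ]
            out.append(dict(rec, verdicts=verdicts))
        return out
```

Batched this way, the judge can also be reused outside the harness on any dataframe with those three columns.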
```python
]

class LiteLLMClient:
```
This should be an AbstractOperator class. It should be able to run alone: if I pass it a dataframe with a query and context, it should provide me an answer for that (basically your generate call, plumbed through the process lifecycle hook).
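Plumbing generate through process could look roughly like the following. The `generate` body here is a stub (the real one would issue the LiteLLM completion call), and the `query`/`context` row keys are taken from the comment, not the actual code:

```python
class LiteLLMClient:
    """Client as an operator: given rows with query + context,
    process() returns an answer per row via generate()."""

    def generate(self, query: str, context: str) -> str:
        # Stub standing in for the actual LiteLLM completion call.
        return f"answer to {query!r} given {len(context)} chars of context"

    def process(self, rows: list[dict]) -> list[dict]:
        # The lifecycle hook is just a batched wrapper over generate(),
        # so the client runs alone or inside the pipeline unchanged.
        return [
            dict(row, answer=self.generate(row["query"], row["context"]))
            for row in rows
        ]
```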
Description
Capabilities:
Note - the csv containing the q-a pairs is a subset of the existing https://github.com/NVIDIA/NeMo-Retriever/blob/main/data/digital_corpora_10k_annotations.csv. There is currently a separate PR up with subset annotations for only bo767-specific files here - #1730
Checklist