Opik integration for manual annotation of LLM judge #134
Jacob-Chmura
left a comment
Presumably we'll need to uv add opik.
I have not used this framework before, but the integration seems reasonable. Will read more about it.
Tests are fixed. Ready for review.
Jacob-Chmura
left a comment
Great work!
    # Log to Opik if provided.
    for _dataset_idx, (dataset, stats) in enumerate(zip(datasets, results)):
        for _sample_idx, sample in enumerate(dataset.samples):
            if opik_client is not None:
Nit, but this check can go outside the loop
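A minimal sketch of what hoisting the check could look like. The names `Dataset`, `log_samples`, and `log_fn` are hypothetical stand-ins, not the PR's actual code, and `log_fn` is only a placeholder for whatever Opik logging call the branch uses. The guard runs once before the loops instead of once per sample:

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class Dataset:
    # Hypothetical stand-in for the PR's dataset type; only `.samples` is assumed.
    samples: list = field(default_factory=list)


def log_samples(
    datasets: list,
    results: list,
    log_fn: Optional[Callable[[Any, Any], None]] = None,
) -> int:
    """Call log_fn(sample, stats) for every sample; return how many were logged."""
    # Hoisted guard: evaluated once, not on every inner-loop iteration.
    if log_fn is None:
        return 0

    logged = 0
    for dataset, stats in zip(datasets, results):
        for sample in dataset.samples:
            # Placeholder for the real logging call (assumption, not Opik's API).
            log_fn(sample, stats)
            logged += 1
    return logged


# Usage with a plain list as the recorder instead of a real client.
recorded = []
count = log_samples(
    [Dataset(["a", "b"]), Dataset(["c"])],
    [{"acc": 1.0}, {"acc": 0.5}],
    log_fn=lambda sample, stats: recorded.append((sample, stats)),
)
```

An early return keeps the loop body flat; wrapping both loops in a single `if opik_client is not None:` block would behave the same.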
Jacob-Chmura
left a comment
Are there any other scripts missing here, e.g. one that loops through and sets the env variables? Given the results of our manual annotation trial, I think this merge may be postponed in favor of automated eval via Mechanical Turk etc. (for rebuttals). Will keep this open for the short term.
Jacob-Chmura
left a comment
Thanks for updating everything. I'm not sure if we'll merge this as is, but we can keep this branch for the time being.
Log LLM Judge perspective to Comet Opik for manual annotation.
New environment variables:
TODO: