@jacobthebanana
Collaborator

@jacobthebanana jacobthebanana commented Apr 21, 2025

Log LLM Judge perspective to Comet Opik for manual annotation.

New environment variables:

export OPIK_BASE_URL="..."
export OPIK_PROJECT_NAME="..."

# optional for self-hosted installation
export OPIK_API_KEY="..."
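How the PR reads these variables is not shown in this thread. As a rough sketch only, assuming logging is treated as disabled when either of the first two variables is missing, they could be collected like this. `OpikSettings` and `load_opik_settings` are hypothetical names for illustration, not part of the PR or of the Opik SDK.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class OpikSettings:
    """Hypothetical container for the three variables listed above."""

    base_url: str
    project_name: str
    api_key: Optional[str] = None  # may be left unset, per the note above


def load_opik_settings(
    env: Mapping[str, str] = os.environ,
) -> Optional[OpikSettings]:
    """Return settings when both required variables are set, else None."""
    base_url = env.get("OPIK_BASE_URL")
    project_name = env.get("OPIK_PROJECT_NAME")
    if not base_url or not project_name:
        # Assumption: missing configuration means "do not log to Opik".
        return None
    # Treat an empty OPIK_API_KEY the same as an unset one.
    return OpikSettings(base_url, project_name, env.get("OPIK_API_KEY") or None)
```

Returning `None` instead of raising keeps the logging optional, which matches the "Log to Opik if provided" behaviour discussed later in the review.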

TODO:

  • Include dataset name in data sent to opik server.

Member

@Jacob-Chmura Jacob-Chmura left a comment

Presumably we'll need to run uv add opik to add the dependency.

I have not used this framework before, but the integration seems reasonable. I will read more about it.

@jacobthebanana
Collaborator Author

Tests are fixed. Ready for review.

Member

@Jacob-Chmura Jacob-Chmura left a comment

Great work!

# Log to Opik if provided.
for _dataset_idx, (dataset, stats) in enumerate(zip(datasets, results)):
    for _sample_idx, sample in enumerate(dataset.samples):
        if opik_client is not None:
Member

Nit: the opik_client is not None check does not depend on the loop variables, so it can go outside the loop.
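A minimal sketch of what hoisting the check might look like. The rest of the loop body is not visible in this thread, so `SampleLogger`, its `log` method, the `Dataset` stand-in, and `log_samples` are all hypothetical placeholders rather than the PR's code or the Opik client API.

```python
from dataclasses import dataclass, field
from typing import Any, Optional, Protocol, Sequence


class SampleLogger(Protocol):
    """Hypothetical stand-in for whatever client object the PR uses."""

    def log(self, record: dict[str, Any]) -> None: ...


@dataclass
class Dataset:
    """Hypothetical stand-in exposing only the `samples` attribute."""

    samples: list[Any] = field(default_factory=list)


def log_samples(
    datasets: Sequence[Dataset],
    results: Sequence[Any],
    client: Optional[SampleLogger],
) -> int:
    """Log every sample once; return the number of records sent."""
    # The check runs once here instead of once per sample.
    if client is None:
        return 0

    logged = 0
    for dataset_idx, (dataset, stats) in enumerate(zip(datasets, results)):
        for sample_idx, sample in enumerate(dataset.samples):
            client.log(
                {
                    "dataset_idx": dataset_idx,
                    "sample_idx": sample_idx,
                    "sample": sample,
                    "stats": stats,
                }
            )
            logged += 1
    return logged
```

The early return also removes one level of nesting from the loop body, which is usually the main readability gain from this kind of change.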

Member

@Jacob-Chmura Jacob-Chmura left a comment

Are there any other scripts missing here, i.e. for looping through and setting the env variables? Given the results of our manual annotation trial, I think this merge may be postponed in favor of automated eval via Mechanical Turk etc. (for rebuttals). Will keep this open for the short term.

Member

@Jacob-Chmura Jacob-Chmura left a comment

Thanks for updating everything. I'm not sure if we'll merge this as is, but we can keep this branch for the time being.

3 participants