
Conversation

@AnilSorathiya

Pull Request Description

What and why?

Scorer metrics are currently not returned by the list_tests function, so users have no visibility of them in the list of available metrics/tests.

How to test

Run the notebooks/how_to/explore_tests.ipynb notebook

What needs special review?

Dependencies, breaking changes, and deployment notes

Release notes

Checklist

  • What and why
  • Screenshots or videos (Frontend)
  • How to test
  • What needs special review
  • Dependencies, breaking changes, and deployment notes
  • Labels applied
  • PR linked to Shortcut
  • Unit tests added (Backend)
  • Tested locally
  • Documentation updated (if required)
  • Environment variable additions/changes documented (if required)

@AnilSorathiya added the chore label (Chore tasks that aren't bugs or new features) on Jan 15, 2026
@github-actions

Pull requests must include at least one of the required labels: internal (no release notes required), highlight, enhancement, bug, deprecation, documentation. Except for internal, pull requests must also include a description in the release notes section.

@juanmleng left a comment

Awesome, thanks! Just left a couple of minor comments.

@github-actions

PR Summary

This PR refactors the test loading functionality to improve modularity and support for custom scorers. The primary changes include:

  1. Extraction of helper functions:

    • A new helper, _get_test_function_from_provider, first checks the scorer_store for a custom scorer matching the test ID and, if none is found, falls back to retrieving the test function from the corresponding provider. It encapsulates the custom scorer lookup and the error handling for a missing test provider or a test that cannot be loaded.

    • A new helper, _configure_test_function, sets up the loaded test function: it adds the test ID, ensures a proper docstring, attaches the inputs and parameters, and guarantees that the tags and tasks attributes exist. This moves the configuration details out of the main load_test function.

  2. Updates to the load_test function:

    • load_test now delegates to the new helper functions, which streamlines loading and configuring a test function. It checks for a custom scorer first and only then falls back to the provider-based test loader.

  3. Enhancements to test ID listing:

    • The _list_test_ids function now aggregates test IDs from multiple sources, including the default test provider and the custom scorers in scorer_store. It also includes scorers supplied by a built-in provider, if available, so that the list covers all available test IDs.
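
The aggregation described in item 3 can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: DictProvider, the registry shapes, the public name list_test_ids, and every test ID below are hypothetical stand-ins chosen for the example.

```python
from typing import Callable, Dict, List


class DictProvider:
    """Hypothetical provider that only knows the names of its tests."""

    def __init__(self, test_names: List[str]) -> None:
        self._test_names = test_names

    def list_tests(self) -> List[str]:
        return list(self._test_names)


# Assumed shapes (not taken from the PR):
# namespace -> provider, and scorer ID -> scorer callable.
test_providers: Dict[str, DictProvider] = {
    "builtin": DictProvider(["ExampleTest", "AnotherTest"]),
    "custom": DictProvider(["OtherTest"]),
}
scorer_store: Dict[str, Callable] = {
    "scorers.ExampleScorer": lambda y_true, y_pred: 0.0,
}


def list_test_ids() -> List[str]:
    """Collect IDs from every provider and the scorer store, deduplicated and sorted."""
    ids = set()
    for namespace, provider in test_providers.items():
        ids.update(f"{namespace}.{name}" for name in provider.list_tests())
    ids.update(scorer_store.keys())
    return sorted(ids)


print(list_test_ids())
```

Collecting into a set before sorting keeps an ID from appearing twice if a scorer is registered both in the store and through a provider; whether the real implementation deduplicates this way is an assumption.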

Overall, these changes improve code readability and maintainability while extending functionality to support custom scorer registrations alongside traditional test providers.
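
As a rough illustration of the load path summarized above (scorer store first, provider fallback, then configuration), here is a self-contained sketch. All names are stand-ins: the real helpers are private (_get_test_function_from_provider, _configure_test_function) and their signatures are not shown in this conversation, so the attribute names (__tags__, __tasks__, params), the ID format, and the registry shapes are assumptions.

```python
import inspect
from typing import Any, Callable, Dict


class LoadTestError(Exception):
    """Stand-in for the error type named in the summary."""


class DictProvider:
    """Hypothetical provider backed by a plain dict of callables."""

    def __init__(self, tests: Dict[str, Callable[..., Any]]) -> None:
        self._tests = tests

    def load_test(self, name: str) -> Callable[..., Any]:
        return self._tests[name]  # raises KeyError for unknown names


scorer_store: Dict[str, Callable[..., Any]] = {}   # assumed: scorer ID -> callable
test_providers: Dict[str, DictProvider] = {}       # assumed: namespace -> provider


def get_test_function(test_id: str) -> Callable[..., Any]:
    """Prefer a registered custom scorer; otherwise ask the matching provider."""
    if test_id in scorer_store:
        return scorer_store[test_id]
    namespace, _, name = test_id.partition(".")
    provider = test_providers.get(namespace)
    if provider is None:
        raise LoadTestError(f"No test provider found for namespace '{namespace}'")
    try:
        return provider.load_test(name)
    except Exception as exc:
        raise LoadTestError(f"Could not load test '{test_id}'") from exc


def configure_test_function(func: Callable[..., Any], test_id: str) -> Callable[..., Any]:
    """Attach the metadata the summary lists; attribute names are assumptions."""
    func.test_id = test_id
    if not func.__doc__:
        func.__doc__ = f"{test_id} (no description provided)"
    func.params = list(inspect.signature(func).parameters)
    for attr in ("__tags__", "__tasks__"):
        if not hasattr(func, attr):
            setattr(func, attr, [])
    return func


def load_test(test_id: str) -> Callable[..., Any]:
    return configure_test_function(get_test_function(test_id), test_id)


def example_scorer(y_true, y_pred):
    """Fraction of matching predictions."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


scorer_store["scorers.ExampleScorer"] = example_scorer
test_providers["builtin"] = DictProvider({"ExampleTest": lambda dataset: len(dataset)})

scorer = load_test("scorers.ExampleScorer")   # found in the scorer store
fallback = load_test("builtin.ExampleTest")   # loaded through the provider
```

Wrapping any provider failure in LoadTestError gives callers a single exception type to handle for both the missing-provider and the cannot-load cases; the real code may distinguish them differently.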

Test Suggestions

  • Verify that a custom scorer from the scorer_store is loaded successfully when present.
  • Test that the fallback mechanism correctly loads the test function from the test provider when a custom scorer is not found.
  • Ensure that a LoadTestError is raised when a non-existent test provider is referenced.
  • Check that the test function is properly configured with the expected attributes (test_id, docstring, inputs, params, tags, and tasks) after calling load_test.
  • Validate that _list_test_ids returns a complete and correctly sorted list of test IDs from both the provider and scorer_store.
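
The suggestions above could be turned into tests along these lines. This is only a sketch: it runs against toy stand-ins defined in the same block (load_test, list_test_ids, the registries, and all test IDs are hypothetical), so real tests would import the library's own functions instead. The functions follow the test_ naming convention so that a runner such as pytest could collect them.

```python
from typing import Callable, Dict, List


class LoadTestError(Exception):
    """Stand-in for the error type named in the suggestions."""


# Toy registries; the real scorer store and providers are assumed to differ.
scorer_store: Dict[str, Callable] = {
    "scorers.ExampleScorer": lambda y_true, y_pred: 0.0,
}
provider_tests: Dict[str, Dict[str, Callable]] = {
    "builtin": {"ExampleTest": lambda dataset: None},
}


def load_test(test_id: str) -> Callable:
    """Scorer store first, then provider lookup, then minimal configuration."""
    if test_id in scorer_store:
        func = scorer_store[test_id]
    else:
        namespace, _, name = test_id.partition(".")
        if namespace not in provider_tests:
            raise LoadTestError(f"No provider for '{namespace}'")
        if name not in provider_tests[namespace]:
            raise LoadTestError(f"Unknown test '{test_id}'")
        func = provider_tests[namespace][name]
    func.test_id = test_id
    return func


def list_test_ids() -> List[str]:
    ids = {f"{ns}.{name}" for ns, tests in provider_tests.items() for name in tests}
    return sorted(ids | set(scorer_store))


def test_custom_scorer_is_loaded():
    assert load_test("scorers.ExampleScorer") is scorer_store["scorers.ExampleScorer"]


def test_provider_fallback():
    assert load_test("builtin.ExampleTest").test_id == "builtin.ExampleTest"


def test_missing_provider_raises():
    try:
        load_test("missing.SomeTest")
    except LoadTestError:
        return
    raise AssertionError("expected LoadTestError")


def test_listing_is_complete_and_sorted():
    ids = list_test_ids()
    assert ids == sorted(ids)
    assert {"builtin.ExampleTest", "scorers.ExampleScorer"} <= set(ids)
```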

@AnilSorathiya added the enhancement label (New feature or request) on Jan 15, 2026
@AnilSorathiya merged commit 1294af4 into main on Jan 16, 2026
17 of 18 checks passed
@AnilSorathiya deleted the anilsorathiya/sc-14010/create-and-add-scorer-store-in-the-list-tests branch on January 16, 2026 11:29

Labels

  • chore: Chore tasks that aren't bugs or new features
  • enhancement: New feature or request

3 participants