
Conversation

@dbermuehler commented Dec 17, 2025

Description

Related Issues

#74

Documentation PR

Type of Change

New feature

Testing

How have you tested the change? Verify that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli

  • I ran hatch run prepare

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@dbermuehler changed the title from "feat: Optional Case specific Goal for GoalSuccessRateEvaluator #74" to "feat: Optional Case specific Goal for GoalSuccessRateEvaluator" on Dec 17, 2025

@strands-agent left a comment

Review: ✅ LGTM with Minor Suggestions

Great feature addition! This allows users to provide case-specific goals for the GoalSuccessRateEvaluator, which is very useful for more targeted evaluations.

What's Good:

  1. Clean Implementation: The goal is read from evaluation_case.metadata.get("goal") - elegant and backward-compatible (see the sketch after this list)
  2. New Prompt Template (v1): Good versioning approach with a new v1 template that explicitly documents the optional goal
  3. Sensible Default: Changed default version to "v1" - users get the new feature by default
  4. Refactored _format_prompt: Now takes the full evaluation case instead of just session_input, allowing access to metadata
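
As an illustration for item 1, here is a minimal sketch of the goal-resolution pattern described above. The _format_prompt body, the default-goal string, and the template.format call are assumptions for illustration, not the PR's actual code:

    def _format_prompt(self, evaluation_case) -> str:
        # Case-specific goal when provided; falling back keeps the
        # pre-existing behavior, so existing cases work unchanged.
        goal = (evaluation_case.metadata or {}).get("goal")
        template = get_template(self.version)  # default version is now "v1"
        return template.format(
            session_input=evaluation_case.input,
            goal=goal or "Infer the user's goal from the session input.",
        )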

Questions/Suggestions:

  1. Missing __init__.py update? - Does src/strands_evals/evaluators/prompt_templates/goal_success_rate/__init__.py need to be updated to include goal_success_rate_v1? Without this, the get_template("v1") call won't find the new template.
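
     A hypothetical sketch of that registry update (module and attribute names are illustrative; the package's real layout may differ):

    # prompt_templates/goal_success_rate/__init__.py (illustrative)
    from . import goal_success_rate_v0, goal_success_rate_v1

    _TEMPLATES = {
        "v0": goal_success_rate_v0.TEMPLATE,
        "v1": goal_success_rate_v1.TEMPLATE,
    }

    def get_template(version: str) -> str:
        # Fail loudly on unknown versions rather than raising a bare KeyError.
        if version not in _TEMPLATES:
            raise ValueError(f"Unknown goal_success_rate template version: {version!r}")
        return _TEMPLATES[version]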

  2. Tests: The checklist indicates no tests were added. Consider adding a test case like:

    def test_goal_success_rate_evaluator_with_custom_goal():
        evaluator = GoalSuccessRateEvaluator()
        case = EvaluationData(
            input="What's the weather?",
            output="It's sunny!",
            metadata={"goal": "Get accurate weather information"}
        )
        # Verify the case-specific goal lands in the rendered prompt
        # (assuming _format_prompt returns the formatted template text)
        prompt = evaluator._format_prompt(case)
        assert "Get accurate weather information" in prompt
  3. Documentation: It would be helpful to add a brief example in the feature request issue (#74) showing how to use the new metadata field; for instance, something like the sketch below.
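
For example, a usage snippet along these lines (import paths assumed):

    from strands_evals import EvaluationData  # import path assumed
    from strands_evals.evaluators import GoalSuccessRateEvaluator

    evaluator = GoalSuccessRateEvaluator()  # v1 template by default
    case = EvaluationData(
        input="Book a table for two at 7pm.",
        output="Your table for two is confirmed for 7pm.",
        # Optional: a case-specific goal; omit it to keep the old behavior.
        metadata={"goal": "Complete the reservation as requested"},
    )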

Non-Blocking:

  • Consider running hatch run prepare to ensure all checks pass (checklist item)

Overall this is a clean, useful feature. Happy to approve once the above items are addressed! 🦆


🤖 This is an experimental AI agent response from the Strands team, powered by Strands Agents. We're exploring how AI agents can help with community support and development. Your feedback helps us improve! If you'd prefer human assistance, please let us know.
