@poshinchen
Contributor

Description

  1. Add ResponseRelevanceEvaluator to assess how relevant the LLM response is to the question, that is, how focused the response is on the question asked.
  2. Implement 5-level scoring system (Not At All, Not Generally, Neutral/Mixed, Generally Yes, Completely Yes)
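For illustration only, the five labels above could be mapped to numeric scores along these lines. This is a minimal sketch, not the code in this PR: the names `RelevanceLevel`, `ASSUMED_SCORES`, and `score_for_label` are hypothetical, and the evenly spaced values in [0, 1] are an assumption, since the description does not state the actual numbers.

```python
from enum import Enum


class RelevanceLevel(Enum):
    """The five labels from the description, ordered from lowest to highest."""

    NOT_AT_ALL = "Not At All"
    NOT_GENERALLY = "Not Generally"
    NEUTRAL_MIXED = "Neutral/Mixed"
    GENERALLY_YES = "Generally Yes"
    COMPLETELY_YES = "Completely Yes"


# Assumption: evenly spaced scores in [0, 1]; the PR does not specify the real values.
ASSUMED_SCORES = {
    level: index / (len(RelevanceLevel) - 1)
    for index, level in enumerate(RelevanceLevel)
}


def score_for_label(label: str) -> float:
    """Map a judge-produced label to its assumed score; unknown labels raise ValueError."""
    return ASSUMED_SCORES[RelevanceLevel(label.strip())]
```

Keeping the labels in an enum means an unexpected label from the judge model fails loudly instead of silently scoring as zero.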

Related Issues

#100

Documentation PR

WIP

Type of Change

  1. New feature ("ResponseRelevanceEvaluator")

Notes

  • Still to do: run evaluations via the agentcore online evaluator and compare the scores.

Testing

How have you tested the change? Verify that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli

  • I ran hatch run prepare

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@poshinchen poshinchen deployed to auto-approve January 30, 2026 16:02 — with GitHub Actions Active
@poshinchen poshinchen changed the title from "(WIP) feat: added ResponseRelevanceEvaluator" to "feat: added ResponseRelevanceEvaluator" Jan 30, 2026
        )
        return [result]

    def _get_last_turn(self, evaluation_case: EvaluationData[InputT, OutputT]) -> TraceLevelInput:
Contributor Author

Some of the extraction logic here duplicates what other evaluators already do. The follow-up is to move it to a better shared location. The Evaluator base class is not the best option right now because it also contains the implementation for non-trace-based evaluators. This is not a blocker, though.
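One possible shape for that follow-up, sketched under assumptions: a module-level helper that trace-based evaluators call instead of each carrying its own copy. Every name here (`Turn`, `extract_last_turn`, `RelevanceLikeEvaluator`) is a hypothetical stand-in, not a type from this repository, and the real `_get_last_turn` works on `EvaluationData` and returns a `TraceLevelInput`.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Turn:
    """Hypothetical stand-in for one user/agent exchange taken from a trace."""

    user_prompt: str
    agent_response: str


def extract_last_turn(turns: Sequence[Turn]) -> Turn:
    """Shared helper: return the final turn, or raise if the trace has none."""
    if not turns:
        raise ValueError("cannot evaluate a trace with no turns")
    return turns[-1]


class RelevanceLikeEvaluator:
    """Hypothetical trace-based evaluator that delegates instead of duplicating."""

    def _get_last_turn(self, turns: Sequence[Turn]) -> Turn:
        # Delegating keeps the extraction rule in one place for all evaluators.
        return extract_last_turn(turns)
```

A free function avoids touching the Evaluator base class, so evaluators that are not trace-based do not inherit logic they cannot use.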
