@jonburchel
Contributor

@jonburchel jonburchel commented Jan 27, 2026

This pull request refactors the evaluation script for the Modern Workplace Assistant sample, replacing the custom, manual evaluation workflow with a cloud-based evaluation that uses the Azure AI Projects SDK and built-in evaluators. It also pins the SDK versions in requirements.txt to match the new workflow.

Key changes include:

Migration to Cloud Evaluation Workflow:

  • The evaluate.py script is completely rewritten to use Azure AI Projects and OpenAI cloud evaluation APIs, replacing all custom validation, test data loading, and result calculation logic with standardized, cloud-based evaluation flows.

Simplification and Modernization:

  • Removed all legacy code for loading test questions, validating responses, batch evaluation, and manual result reporting. The script now creates agents, defines evaluation criteria, runs cloud-based evaluations, and retrieves results programmatically.
  • Updated the script to use environment variables for configuration and added clear usage instructions.
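For reviewers who want a picture of the environment-variable configuration mentioned above, here is a minimal sketch of the pattern. The variable names `PROJECT_ENDPOINT` and `MODEL_DEPLOYMENT_NAME` are assumptions for illustration only; this description does not list the names evaluate.py actually reads.

```python
import os

# Hypothetical variable names; the PR description does not list the real ones.
REQUIRED_VARS = ("PROJECT_ENDPOINT", "MODEL_DEPLOYMENT_NAME")


def load_config(env=None):
    """Return the required settings, failing fast if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Checking every variable up front means a misconfigured run stops before any cloud call is made, and the error names everything that is missing in one message.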

Requirements Update:

  • Updated requirements.txt to pin azure-ai-projects to version 1.0.0 and azure-ai-agents to 1.2.0b6, ensuring compatibility with the new evaluation workflow.
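To confirm that a local environment matches these pins before running the script, something like the following could be used. It is a sketch, not part of this PR: the helper name `find_mismatches` is hypothetical, and the only pins it checks are the two stated above.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins as stated in this PR description.
PINS = {"azure-ai-projects": "1.0.0", "azure-ai-agents": "1.2.0b6"}


def find_mismatches(pins, get_version=version):
    """Return {package: installed version or None} for every pin not satisfied."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = installed
    return mismatches


if __name__ == "__main__":
    for name, installed in find_mismatches(PINS).items():
        print(f"{name}: expected {PINS[name]}, found {installed or 'not installed'}")
```

The `get_version` parameter exists only so the check can be exercised without the Azure packages installed.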

@jonburchel jonburchel requested a review from a team as a code owner January 27, 2026 18:16
@sdgilley
Contributor

⚠️ Manual review required

This PR requires manual review due to the following issues:

  • Modified file has deleted cells: samples/python/enterprise-agent-tutorial/1-idea-to-prototype/evaluate.py

Please review and address these issues before merging.
