2026 01 27 fix tutorial cloud evaluations implementation #500

jonburchel · 2026-01-27T18:16:39Z

This pull request rewrites the evaluation script for the Modern Workplace Assistant sample, moving it from a custom, manual evaluation workflow to a cloud-based evaluation that uses the Azure AI Projects SDK and built-in evaluators. The update also pins the SDK versions in requirements.txt.

Key changes include:

Migration to Cloud Evaluation Workflow:

The evaluate.py script is completely rewritten to use Azure AI Projects and OpenAI cloud evaluation APIs, replacing all custom validation, test data loading, and result calculation logic with standardized, cloud-based evaluation flows.

Simplification and Modernization:

Removed all legacy code for loading test questions, validating responses, batch evaluation, and manual result reporting. The script now programmatically creates agents, defines evaluation criteria, runs cloud-based evaluations, and retrieves results.
Updated the script to use environment variables for configuration and added clear usage instructions.
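For reviewers, here is a minimal sketch of what environment-variable configuration with a fail-fast check could look like. It is not taken from evaluate.py: the variable names `PROJECT_ENDPOINT` and `MODEL_DEPLOYMENT_NAME`, the `EvalConfig` class, and the `load_config` helper are all hypothetical, and no Azure SDK calls are made.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalConfig:
    """Settings an evaluation script might need (hypothetical fields)."""
    project_endpoint: str
    model_deployment: str


# Maps config fields to environment variable names.
# Both variable names are assumptions, not confirmed by this PR.
REQUIRED_VARS = {
    "project_endpoint": "PROJECT_ENDPOINT",
    "model_deployment": "MODEL_DEPLOYMENT_NAME",
}


def load_config(env=None):
    """Build an EvalConfig from a mapping, defaulting to os.environ.

    Raises RuntimeError naming every missing or empty variable, so the
    script stops before any cloud call is attempted.
    """
    env = os.environ if env is None else env
    missing = [var for var in REQUIRED_VARS.values() if not env.get(var)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return EvalConfig(**{field: env[var] for field, var in REQUIRED_VARS.items()})
```

Accepting the mapping as a parameter keeps the helper testable without touching the real process environment.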

Requirements Update:

Updated requirements.txt to pin azure-ai-projects to version 1.0.0 and azure-ai-agents to 1.2.0b6, ensuring compatibility with the new evaluation workflow.
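One way to confirm that an environment matches these pins is sketched below. It assumes requirements.txt uses `==` pins, which this page does not show. The helpers `parse_pins` and `find_mismatches` are hypothetical names, and only the standard-library `importlib.metadata` module is used.

```python
from importlib import metadata

# Versions quoted in the PR description.
EXPECTED_PINS = {"azure-ai-projects": "1.0.0", "azure-ai-agents": "1.2.0b6"}


def parse_pins(requirements_text):
    """Return {package: version} for every `name==version` line.

    Assumes simple `==` pins; comments and unpinned lines are skipped.
    """
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(expected, lookup=metadata.version):
    """Compare expected pins with installed versions.

    Returns {package: (wanted, found)}; found is None if not installed.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

Calling `find_mismatches(EXPECTED_PINS)` in the tutorial's virtual environment would return an empty dict when both packages are installed at the pinned versions. Note that the `b6` suffix marks 1.2.0b6 as a pre-release under Python's version scheme.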

sdgilley · 2026-01-28T12:50:02Z

⚠️ Manual review required

This PR requires manual review due to the following issues:

Modified file has deleted cells: samples/python/enterprise-agent-tutorial/1-idea-to-prototype/evaluate.py

Please review and address these issues before merging.

jonburchel added 2 commits January 21, 2026 13:33

Fixes requirements.txt

09c17a3

Fixes evaluation implementation

4260550

jonburchel requested a review from a team as a code owner January 27, 2026 18:16
