Merged
23 changes: 23 additions & 0 deletions .github/workflows/ci.yaml
@@ -0,0 +1,23 @@
name: CI

on:
  push:
    branches: [main, add_ci]

  pull_request:
    branches: [main, add_ci]

  workflow_dispatch:

jobs:
  pre-commit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6

      - name: "Set up Python"
        uses: actions/setup-python@v6
        with:
          python-version-file: ".python-version"

      - uses: pre-commit/action@v3.0.1
14 changes: 14 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,14 @@
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v6.0.0
    hooks:
      - id: end-of-file-fixer
      - id: trailing-whitespace
      - id: check-added-large-files # Default is 500kB

  # Ensure uv.lock file is up to date with pyproject.toml
  - repo: https://github.com/astral-sh/uv-pre-commit
    # uv version.
    rev: 0.9.27
    hooks:
      - id: uv-lock
1 change: 1 addition & 0 deletions .python-version
@@ -0,0 +1 @@
3.10
14 changes: 12 additions & 2 deletions README.md
@@ -128,13 +128,13 @@ If you find MAESTRO or its dataset useful in your research, please consider citi

```
@misc{maestro,
- title={MAESTRO: Multi-Agent Evaluation Suite for Testing, Reliability, and Observability},
+ title={MAESTRO: Multi-Agent Evaluation Suite for Testing, Reliability, and Observability},
author={Tie Ma and Yixi Chen and Vaastav Anand and Alessandro Cornacchia and Amândio R. Faustino and Guanheng Liu and Shan Zhang and Hongbin Luo and Suhaib A. Fahmy and Zafar A. Qazi and Marco Canini},
year={2026},
eprint={2601.00481},
archivePrefix={arXiv},
primaryClass={cs.NI},
- url={https://arxiv.org/abs/2601.00481},
+ url={https://arxiv.org/abs/2601.00481},
}
```

@@ -158,3 +158,13 @@ If you find MAESTRO or its dataset useful in your research, please consider citi
10. [Testing and Enhancing Multi-Agent Systems for Robust Code Generation](https://arxiv.org/html/2510.10460)
11. [Evaluating Variance in Visual Question Answering Benchmarks](https://arxiv.org/html/2508.02645)
-->


## Developing MAESTRO
```bash
git clone git@github.com:sands-lab/maestro.git
cd maestro
uv sync
# Install pre-commit hooks
uv run -- pre-commit install
```
2 changes: 1 addition & 1 deletion examples/.gitignore
@@ -1,4 +1,4 @@
deprecated/
agntcy/

- langgraph_flat/
+ langgraph_flat/
2 changes: 1 addition & 1 deletion examples/adk/brand-search-optimization/README.md
@@ -219,4 +219,4 @@ This agent sample is provided for illustrative purposes only and is not intended

This sample has not been rigorously tested, may contain bugs or limitations, and does not include features or optimizations typically required for a production environment (e.g., robust error handling, security measures, scalability, performance considerations, comprehensive logging, or advanced configuration options).

- Users are solely responsible for any further development, testing, security hardening, and deployment of agents based on this sample. We recommend thorough review, testing, and the implementation of appropriate safeguards before using any derived agent in a live or critical system.
+ Users are solely responsible for any further development, testing, security hardening, and deployment of agents based on this sample. We recommend thorough review, testing, and the implementation of appropriate safeguards before using any derived agent in a live or critical system.
@@ -22,7 +22,7 @@
6. Transfer to root_agent

You are helpful keyword finding agent for a brand name.
- Your primary function is to find keywords shoppers would type in when trying to find for the products from the brand user provided.
+ Your primary function is to find keywords shoppers would type in when trying to find for the products from the brand user provided.

<Tool Calling>
- call `get_product_details_for_brand` tool to find product from a brand
@@ -31,7 +31,7 @@
- if the user says google shopping, visit this website link is https://www.google.com/search?hl=en&q=<keyword> and click on "shopping" tab
</Navigation & Searching>

- <Gather Information>
+ <Gather Information>
- getting titles of the top 3 products by analyzing the webpage
- Do not make up 3 products
- Show title of the products in a markdown format
@@ -40,7 +40,7 @@
<Key Constraints>
- Continue until you believe the title, description and attribute information is gathered
- Do not make up title, description and attribute information
- - If you can not find the information, convery this information to the user
+ - If you can not find the information, convery this information to the user
</Key Constraints>

Please follow these steps to accomplish the task at hand: