feat(benchmark): AIPerf run script #1501
base: develop
Conversation
Greptile Overview

Greptile Summary: This PR adds AIPerf benchmarking support to NeMo Guardrails with a well-structured command-line tool. The implementation includes YAML-based configuration, parameter sweep capabilities, and comprehensive test coverage.

Key Changes:
Issues Identified:

Confidence Score: 3/5
Important Files Changed (File Analysis)
Sequence Diagram

```mermaid
sequenceDiagram
participant User
participant CLI
participant AIPerfRunner
participant ConfigValidator
participant ServiceChecker
participant AIPerf
User->>CLI: nemoguardrails aiperf run --config-file config.yaml
CLI->>AIPerfRunner: Initialize with config path
AIPerfRunner->>ConfigValidator: Load and validate YAML
ConfigValidator->>ConfigValidator: Validate with Pydantic models
ConfigValidator-->>AIPerfRunner: Return AIPerfConfig
AIPerfRunner->>ServiceChecker: _check_service()
ServiceChecker->>ServiceChecker: GET /v1/models with API key
ServiceChecker-->>AIPerfRunner: Service available
alt Single Benchmark
AIPerfRunner->>AIPerfRunner: _build_command()
AIPerfRunner->>AIPerfRunner: _create_output_dir()
AIPerfRunner->>AIPerfRunner: _save_run_metadata()
AIPerfRunner->>AIPerf: subprocess.run(aiperf command)
AIPerf-->>AIPerfRunner: Benchmark results
AIPerfRunner->>AIPerfRunner: _save_subprocess_result_json()
else Batch Benchmarks with Sweeps
AIPerfRunner->>AIPerfRunner: _get_sweep_combinations()
loop For each sweep combination
AIPerfRunner->>AIPerfRunner: _build_command(sweep_params)
AIPerfRunner->>AIPerfRunner: _create_output_dir(sweep_params)
AIPerfRunner->>AIPerfRunner: _save_run_metadata()
AIPerfRunner->>AIPerf: subprocess.run(aiperf command)
AIPerf-->>AIPerfRunner: Benchmark results
AIPerfRunner->>AIPerfRunner: _save_subprocess_result_json()
end
end
AIPerfRunner-->>CLI: Return exit code
CLI-->>User: Display summary and exit
```
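For orientation, here is a minimal Python sketch of the single-benchmark path in the diagram above. The method names are taken from the diagram; the bodies, signatures, paths, and `aiperf` arguments are assumptions for illustration, not the PR's actual implementation.

```python
# Illustrative sketch of the single-benchmark flow from the sequence diagram.
# Method names mirror the diagram; everything else is assumed for illustration.
import json
import subprocess
from pathlib import Path


def run_single_benchmark(config: dict) -> int:
    # Stand-in for _create_output_dir(): one directory per run.
    output_dir = Path("aiperf_results/run_001")
    output_dir.mkdir(parents=True, exist_ok=True)

    # Stand-in for _save_run_metadata(): record what was run next to the results.
    (output_dir / "run_metadata.json").write_text(json.dumps(config, indent=2))

    # Stand-in for _build_command() followed by subprocess.run(aiperf command).
    command = ["aiperf", "--help"]  # placeholder arguments; the real command is built from the config
    result = subprocess.run(command, capture_output=True, text=True)

    # Stand-in for _save_subprocess_result_json().
    (output_dir / "subprocess_result.json").write_text(
        json.dumps({"returncode": result.returncode, "stdout": result.stdout}, indent=2)
    )
    return result.returncode
```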
10 files reviewed, 3 comments
nemoguardrails/benchmark/aiperf/aiperf_configs/single_concurrency.yaml (outdated, resolved)
9 files reviewed, no comments
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Documentation preview
10 files reviewed, no comments
10 files reviewed, no comments
10 files reviewed, 5 comments
Note: I added the API key towards the end of development to make testing against NVCF functions more convenient. I need to wrap this in a Pydantic `SecretStr` or something similar to prevent it from being logged.
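For reference, a minimal sketch of what masking the key with Pydantic's `SecretStr` could look like; the model and field names below are illustrative, not the PR's actual schema:

```python
# Sketch: masking an API key with pydantic.SecretStr so it is not printed in logs.
# The EndpointAuth model and api_key field are illustrative names.
from pydantic import BaseModel, SecretStr


class EndpointAuth(BaseModel):
    api_key: SecretStr


auth = EndpointAuth(api_key="nvapi-example-key")
print(auth)                             # api_key=SecretStr('**********')
print(auth.api_key.get_secret_value())  # the key is only revealed when explicitly requested
```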
12 files reviewed, no comments
12 files reviewed, no comments
@tgasser-nv I noticed that the scope of this change is quite broad. It also introduces OpenAI-compatible endpoints on the server (at least for `/chat/completions` and `/models`), which is a major change. Given that, I think it might be better to wait until #1340 is finalized and merged. What do you think?
11 files reviewed, 1 comment
I reverted the OpenAI-compatible endpoints change; I added that by mistake. This isn't blocked by #1340.
11 files reviewed, no comments
> To run a single benchmark with fixed parameters, use the `single_concurrency.yaml` configuration:
>
> ```bash
> poetry run nemoguardrails aiperf run --config-file nemoguardrails/benchmark/aiperf/aiperf_configs/single_concurrency.yaml
> ```
It seems that the optional sections 3, 4, and 5 in Prerequisites are required to run this successfully. Also, one needs a license for https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct/.
> `from nemoguardrails.benchmark.aiperf.aiperf_models import AIPerfConfig`
>
> `# Set up logging`
> ```python
> for combination in itertools.product(*param_values):
>     combinations.append(dict(zip(param_names, combination)))
>
> return combinations
> ```
It builds the entire list in memory. For large sweeps (e.g., 10 params × 10 values = 10B combinations), this will OOM. Better to use a generator or, if it makes sense, add validation for reasonable sweep sizes.
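For illustration, a generator-based version of the quoted loop might look like the sketch below: the same `itertools.product` logic, but yielding one combination at a time instead of materializing the whole list. The function name and signature are assumptions, not code from the PR.

```python
# Sketch: lazily yield sweep combinations instead of building the full list in memory.
# Variable names mirror the quoted snippet; the function itself is illustrative.
import itertools
from typing import Any, Dict, Iterator, List


def iter_sweep_combinations(sweep: Dict[str, List[Any]]) -> Iterator[Dict[str, Any]]:
    param_names = list(sweep.keys())
    param_values = list(sweep.values())
    for combination in itertools.product(*param_values):
        yield dict(zip(param_names, combination))
```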
I added a limit of 100 combinations to avoid refactoring the rest of the code around generators.
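A cap like that can be checked before building anything by multiplying the number of values per swept parameter. Here is a sketch of such a check, with an assumed limit constant and error message (not the PR's exact code):

```python
# Sketch: reject oversized sweeps up front; the constant and message are illustrative.
import math
from typing import Any, Dict, List

MAX_SWEEP_COMBINATIONS = 100


def check_sweep_size(sweep: Dict[str, List[Any]]) -> None:
    total = math.prod(len(values) for values in sweep.values())
    if total > MAX_SWEEP_COMBINATIONS:
        raise ValueError(
            f"Sweep would produce {total} combinations; "
            f"the limit is {MAX_SWEEP_COMBINATIONS}."
        )
```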
Description
AIPerf (GitHub, Docs) is NVIDIA's latest benchmarking tool for LLMs. It supports any OpenAI-compatible inference service, generates synthetic data loads, runs benchmarks, and produces all the metrics needed for comparison.
This PR adds support for running AIPerf benchmarks from config files that control the model under test, the benchmark duration, and sweep parameters used to create a batch of regression runs.
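For context, a hypothetical sketch of what such a config schema could look like with Pydantic; the model and field names below are illustrative assumptions, not the PR's actual `AIPerfConfig` definition:

```python
# Hypothetical config schema along the lines described above; all names are
# illustrative and do not reflect the PR's actual AIPerfConfig model.
from typing import Dict, List, Optional

from pydantic import BaseModel


class SweepSpec(BaseModel):
    # Maps a benchmark parameter (e.g. "concurrency") to the list of values to sweep over.
    parameters: Dict[str, List[int]]


class BenchmarkConfig(BaseModel):
    model: str                         # model under test
    endpoint_url: str                  # OpenAI-compatible endpoint to benchmark
    benchmark_duration_seconds: int    # how long each run lasts
    sweep: Optional[SweepSpec] = None  # optional parameter sweep to create a batch of runs
```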
Test Plan
Pre-requisites
See README.md for instructions on creating accounts and keys, installing dependencies, and running benchmarks.
Running a single test
Pre-commit tests
Unit-tests
Chat server
Related Issue(s)
Checklist