Conversation


@jasonqinzhou jasonqinzhou commented Nov 6, 2025

Overview:

feat: DynamoPlanner profiler to use hf_id for AIConfigurator 0.4.0 #4167

Details:

  1. Use aic_hf_id for better compatibility, since most users identify a model by its HuggingFace ID.
  2. If aic_backend_version is empty, fall back to the latest version of the backend supported by AIConfigurator.

Where should the reviewer start?

benchmarks/profiler/profile_sla.py


copy-pr-bot bot commented Nov 9, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@jasonqinzhou jasonqinzhou changed the title [WIP] feat: DynamoPlanner to adapt to AIConfigurator 0.4.0 feat: DynamoPlanner profiler to use hf_id for AIConfigurator 0.4.0 Nov 9, 2025
@github-actions github-actions bot added the feat label Nov 9, 2025
@jasonqinzhou jasonqinzhou marked this pull request as ready for review November 9, 2025 12:27
@jasonqinzhou jasonqinzhou requested review from a team as code owners November 9, 2025 12:27
jasonqinzhou and others added 3 commits November 9, 2025 04:29
Signed-off-by: Jason Zhou <jasonzho@jasonzho-mlt.client.nvidia.com>
Signed-off-by: Jason Zhou <jasonzho@jasonzho-mlt.client.nvidia.com>
Signed-off-by: Jason Zhou <jasonzho@jasonzho-mlt.client.nvidia.com>

coderabbitai bot commented Nov 9, 2025

Walkthrough

This pull request systematically refactors the codebase to replace aic_model_name with aic_hf_id, aligning model identifiers with HuggingFace format (e.g., "Qwen/Qwen3-32B" instead of "QWEN3_32B"). The aiconfigurator dependency is upgraded via version bump and commit hash. Automatic version detection is added to the estimator when unspecified.

Changes

  • Dependency & Attribution Updates (ATTRIBUTIONS-Python.md, benchmarks/pyproject.toml): Updated the aiconfigurator package version from 0.2.0 to 0.4.0 and pinned the commit hash to 5554d2eb.
  • Configuration Files, YAML (benchmarks/profiler/deploy/profile_sla_aic_dgdr.yaml, deploy/cloud/operator/config/samples/nvidia.com_v1alpha1_dynamographdeploymentrequest.yaml, docs/planner/sla_planner_quickstart.md): Renamed the profiling configuration field from aic_model_name to aic_hf_id and updated the model identifier value format from "QWEN3_32B" to "Qwen/Qwen3-32B".
  • Core Implementation – Model Identifier Refactor (benchmarks/profiler/utils/estimate_perf.py, benchmarks/profiler/utils/profiler_argparse.py, benchmarks/profiler/profile_sla.py): Renamed the constructor parameter aic_model_name → aic_hf_id throughout; added automatic fetching of the latest perf database version when no version is specified; updated validation logic and model retrieval calls.
  • Test Files – Model Identifier Refactor (tests/profiler/test_profile_sla_aiconfigurator.py, tests/profiler/test_profile_sla_dryrun.py, deploy/cloud/operator/internal/controller/dynamographdeploymentrequest_controller_test.go): Updated test fixtures and parameterized tests to use aic_hf_id instead of aic_model_name; changed model values to HuggingFace format; updated backend version handling.
  • Documentation (docs/benchmarks/sla_driven_profiling.md): Updated profiling configuration examples to use aic_hf_id instead of aic_model_name; removed the "Model name mapping examples" section; refined backend version guidance.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Areas requiring extra attention:

  • benchmarks/profiler/utils/estimate_perf.py – Automatic version fetching logic when version parameter is falsy; verify that the fallback behavior works correctly and doesn't mask missing version specifications.
  • Model identifier format consistency – Ensure all config files and code paths correctly reference the new HuggingFace identifier format across YAML configurations, Python calls, and tests.
  • Constructor call sites – Confirm that all instantiations of AIConfiguratorPerfEstimator pass the renamed hf_id parameter correctly (especially in benchmarks/profiler/profile_sla.py and test files).

Poem

🐰 The fields have been renamed with HuggingFace in mind,
From QWEN3_32B to Qwen/Qwen3, a model more refined,
Configuration swept and tests updated with care,
Each aic_hf_id now floating through the air! ✨

Pre-merge checks

❌ Failed checks (1 inconclusive)

  • Description check — ❓ Inconclusive. The PR description provides an overview and details but uses placeholder text for related issues (#xxx) and lacks sufficient depth. Resolution: replace the placeholder 'closes GitHub issue: #xxx' with the actual GitHub issue number, and provide more detailed justification for the compatibility changes and version selection logic.

✅ Passed checks (2 passed)

  • Title check — ✅ Passed. The title clearly and specifically describes the main change: updating the profiler to use hf_id for AIConfigurator 0.4.0, which aligns with the systematic replacement of aic_model_name with aic_hf_id throughout the codebase.
  • Docstring Coverage — ✅ Passed. No functions found in the changed files to evaluate docstring coverage; the check was skipped.


@coderabbitai coderabbitai bot left a comment
Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/profiler/test_profile_sla_aiconfigurator.py (1)

63-69: Clarify whether aic_backend_version can be None.

The fixture at line 54 sets aic_backend_version = None to test auto-detection, but line 63 includes "aic_backend_version" in the missing args test, which expects a ValueError when it's None. This appears contradictory.

If the auto-detection feature allows None as a valid value (as suggested by the fixture and the parametrize case at line 98), then aic_backend_version should be removed from the missing_arg parametrize list. Otherwise, the fixture should use a non-None default.
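The suggested resolution — keeping aic_hf_id required while treating aic_backend_version as optional so that None triggers auto-detection — could look like the sketch below. The validate_aic_args helper and the two tuples are hypothetical, illustrating the split rather than the repository's actual validation code.

```python
# Hypothetical sketch of the required/optional split the comment asks for:
# aic_hf_id stays required; aic_backend_version may be None (auto-detect).
REQUIRED_AIC_ARGS = ("aic_hf_id", "aic_backend")
OPTIONAL_AIC_ARGS = ("aic_backend_version",)  # None is a valid value

def validate_aic_args(args: dict) -> None:
    """Raise ValueError if any required AI Configurator arg is missing."""
    missing = [name for name in REQUIRED_AIC_ARGS if args.get(name) is None]
    if missing:
        raise ValueError(f"Missing required AI Configurator args: {missing}")

# Passes: a None backend version is allowed, it just enables auto-detection.
validate_aic_args({"aic_hf_id": "Qwen/Qwen3-32B",
                   "aic_backend": "trtllm",
                   "aic_backend_version": None})
```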

🧹 Nitpick comments (1)
deploy/cloud/operator/config/samples/nvidia.com_v1alpha1_dynamographdeploymentrequest.yaml (1)

57-57: Update inline comment to reflect the field rename semantics.

The field has been renamed from aic_model_name to aic_hf_id to align with AIConfigurator 0.4.0, but the inline comment still references "Model name". Consider updating the comment to "HuggingFace model ID for AI Configurator" for clarity and consistency with the new field's purpose.

Apply this change:

-        aic_hf_id: Qwen/Qwen3-0.6B  # Model name for AI Configurator
+        aic_hf_id: Qwen/Qwen3-0.6B  # HuggingFace model ID for AI Configurator
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 09b26bf and 4bde8f7.

📒 Files selected for processing (12)
  • ATTRIBUTIONS-Python.md (1 hunks)
  • benchmarks/profiler/deploy/profile_sla_aic_dgdr.yaml (1 hunks)
  • benchmarks/profiler/profile_sla.py (1 hunks)
  • benchmarks/profiler/utils/estimate_perf.py (2 hunks)
  • benchmarks/profiler/utils/profiler_argparse.py (2 hunks)
  • benchmarks/pyproject.toml (1 hunks)
  • deploy/cloud/operator/config/samples/nvidia.com_v1alpha1_dynamographdeploymentrequest.yaml (1 hunks)
  • deploy/cloud/operator/internal/controller/dynamographdeploymentrequest_controller_test.go (2 hunks)
  • docs/benchmarks/sla_driven_profiling.md (1 hunks)
  • docs/planner/sla_planner_quickstart.md (1 hunks)
  • tests/profiler/test_profile_sla_aiconfigurator.py (3 hunks)
  • tests/profiler/test_profile_sla_dryrun.py (7 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-04T19:03:06.643Z
Learnt from: biswapanda
Repo: ai-dynamo/dynamo PR: 2872
File: examples/multimodal/deploy/agg_qwen.yaml:53-60
Timestamp: 2025-09-04T19:03:06.643Z
Learning: In the dynamo repository, Kubernetes Custom Resources use `gpu: "1"` format for GPU resource limits and requests, not the standard Kubernetes `nvidia.com/gpu: 1` format. This applies to DynamoGraphDeployment resources and other dynamo CRs.

Applied to files:

  • deploy/cloud/operator/config/samples/nvidia.com_v1alpha1_dynamographdeploymentrequest.yaml
🧬 Code graph analysis (2)
benchmarks/profiler/profile_sla.py (1)
benchmarks/profiler/utils/estimate_perf.py (1)
  • AIConfiguratorPerfEstimator (29-233)
tests/profiler/test_profile_sla_aiconfigurator.py (1)
tests/profiler/test_profile_sla_dryrun.py (1)
  • trtllm_args (129-162)
🪛 Ruff (0.14.3)
benchmarks/profiler/profile_sla.py

153-155: Abstract raise to an inner function

(TRY301)


153-155: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (16)
ATTRIBUTIONS-Python.md (1)

444-444: Version bump correctly reflects aiconfigurator dependency upgrade.

The update from aiconfigurator (0.2.0) to (0.4.0) aligns with the broader PR refactoring that renames model identifiers. This is a documentation-only change with no functional impact.

Please verify that this version bump is consistent with the dependency declaration in benchmarks/pyproject.toml or other relevant dependency files to ensure no version mismatches exist across the codebase.

benchmarks/pyproject.toml (1)

43-43: LGTM! Dependency updated for AIConfigurator 0.4.0.

The aiconfigurator dependency is updated to the commit that supports the new aic_hf_id parameter.

deploy/cloud/operator/internal/controller/dynamographdeploymentrequest_controller_test.go (1)

353-353: LGTM! Test fixtures updated for HuggingFace model identifiers.

The AI Configurator test fixtures correctly reflect the field rename (aic_model_name → aic_hf_id) and value format change (internal QWEN3_32B → HuggingFace Qwen/Qwen3-32B).

Also applies to: 1063-1063

docs/planner/sla_planner_quickstart.md (1)

233-233: LGTM! Documentation updated for HuggingFace model identifiers.

The quickstart guide correctly documents the new aic_hf_id field with HuggingFace model identifier format.

docs/benchmarks/sla_driven_profiling.md (2)

302-302: LGTM! Documentation updated for HuggingFace model identifiers.

The field name is correctly updated from aic_model_name to aic_hf_id to reflect the new HuggingFace-based model identification.


303-303: Change verified as correct.

AIConfigurator v0.4.x officially supports TensorRT-LLM versions 0.20.0 and 1.0.0rc3 only. The removal of 1.0.0rc6 from line 303's documentation comment is accurate and reflects the actual supported versions. This is a documentation correction, not a breaking change.

tests/profiler/test_profile_sla_dryrun.py (1)

70-70: LGTM! Test fixtures consistently updated.

All dry-run test fixtures correctly renamed the field from aic_model_name to aic_hf_id.

Also applies to: 106-106, 156-156, 199-199, 265-265, 331-331, 397-397

benchmarks/profiler/deploy/profile_sla_aic_dgdr.yaml (1)

22-22: LGTM! Sample DGDR configuration updated.

The AI Configurator profiling sample correctly uses aic_hf_id with the HuggingFace model identifier format (Qwen/Qwen3-32B).

benchmarks/profiler/utils/profiler_argparse.py (1)

284-287: LGTM! CLI argument updated for HuggingFace model identifiers.

The argument rename from --aic-model-name to --aic-hf-id correctly reflects the use of HuggingFace model identifiers. The updated help text provides clear examples (e.g., Qwen/Qwen3-32B, meta-llama/Llama-3.1-405B).

Note: This is a breaking change. Users with existing scripts using --aic-model-name will need to update to --aic-hf-id. Consider documenting this in release notes or migration guides.
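The renamed flag can be sketched with argparse. The help text is paraphrased from the review; the flag name and example model IDs come from the PR, while the rest of the parser setup is illustrative.

```python
import argparse

# Sketch of the renamed CLI flag (--aic-model-name -> --aic-hf-id).
parser = argparse.ArgumentParser()
parser.add_argument(
    "--aic-hf-id",
    help="HuggingFace model ID for AI Configurator, "
         "e.g. Qwen/Qwen3-32B or meta-llama/Llama-3.1-405B",
)
args = parser.parse_args(["--aic-hf-id", "Qwen/Qwen3-32B"])
print(args.aic_hf_id)  # -> Qwen/Qwen3-32B
```

Scripts still passing --aic-model-name will now fail argument parsing, which is the breaking change the review calls out.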

benchmarks/profiler/profile_sla.py (1)

152-169: LGTM! Validation and usage updated for HuggingFace model identifiers.

The validation logic correctly requires --aic-hf-id when using AI Configurator, and the parameter is properly passed to AIConfiguratorPerfEstimator. Error messages are clear and helpful.

benchmarks/profiler/utils/estimate_perf.py (3)

39-39: LGTM: Parameter rename aligns with HuggingFace conventions.

The rename from model_name to hf_id with the inline example improves clarity. The AI summary indicates related files have been updated accordingly.


64-73: LGTM: Consistent usage of hf_id.

The attribute assignment and API call correctly use the renamed hf_id parameter.


47-51: Verify error handling for auto-detection.

The auto-detection of the latest database version lacks explicit error handling. If get_latest_database_version raises an exception (e.g., network error, invalid configuration), it will propagate uncaught. If it returns None, the error will only be caught downstream at line 57-60 with a generic "Database not found" message. Consider wrapping the version fetch in a try-except block with a more informative error message, or verify with the aiconfigurator maintainers that exceptions from get_latest_database_version are not expected.

tests/profiler/test_profile_sla_aiconfigurator.py (3)

52-54: LGTM: Fixture updated to test auto-detection feature.

The rename to aic_hf_id aligns with the API changes, and setting aic_backend_version = None exercises the new auto-detection logic introduced in the estimator.


94-109: LGTM: Comprehensive test coverage for the new API.

The parametrize decorators test multiple backend versions (including None for auto-detection) and HuggingFace model IDs, providing good coverage of the refactored functionality.


110-117: LGTM: Test function correctly uses the renamed parameter.

The function signature and body are properly updated to use hf_model_id and assign it to aic_hf_id.

@jasonqinzhou jasonqinzhou requested a review from a team as a code owner November 10, 2025 07:10
@pull-request-size pull-request-size bot added size/L and removed size/M labels Nov 10, 2025
@jasonqinzhou jasonqinzhou enabled auto-merge (squash) November 10, 2025 07:11
Signed-off-by: Jason Zhou <jasonzho@nvidia.com>
Signed-off-by: Jason Zhou <jasonzho@nvidia.com>
@jasonqinzhou jasonqinzhou removed the request for review from tmonty12 November 10, 2025 19:32
@jasonqinzhou jasonqinzhou enabled auto-merge (squash) November 10, 2025 19:33
@dagil-nvidia (Contributor)

/ok to test b403025

@jasonqinzhou (Contributor, Author)

/ok to test b403025

@jasonqinzhou (Contributor, Author)

/ok to test b403025

@jasonqinzhou (Contributor, Author)

/ok to test


copy-pr-bot bot commented Nov 10, 2025

/ok to test

@jasonqinzhou, there was an error processing your request: E1

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/1/

@jasonqinzhou (Contributor, Author)

/ok to test 016d301

@jasonqinzhou jasonqinzhou merged commit 4398637 into main Nov 10, 2025
42 of 47 checks passed
@jasonqinzhou jasonqinzhou deleted the jasonzho/aic_0.4.0 branch November 10, 2025 23:17
jasonqinzhou added a commit that referenced this pull request Nov 12, 2025
…4167)

Signed-off-by: Jason Zhou <jasonzho@jasonzho-mlt.client.nvidia.com>
Signed-off-by: Jason Zhou <jasonzho@nvidia.com>
Co-authored-by: Jason Zhou <jasonzho@jasonzho-mlt.client.nvidia.com>
daiyaanarfeen pushed a commit that referenced this pull request Nov 14, 2025
…4167)

Signed-off-by: Jason Zhou <jasonzho@jasonzho-mlt.client.nvidia.com>
Signed-off-by: Jason Zhou <jasonzho@nvidia.com>
Co-authored-by: Jason Zhou <jasonzho@jasonzho-mlt.client.nvidia.com>
Signed-off-by: Daiyaan <darfeen@nvidia.com>
7 participants