Conversation


@jasonqinzhou jasonqinzhou commented Nov 6, 2025

Overview:

feat: DynamoPlanner profiler to use hf_id for AIConfigurator 0.4.0 #4167

Details:

  1. Use aic_hf_id for better compatibility, since most users identify a model by its HuggingFace ID.
  2. If aic_backend_version is empty, fall back to the latest version of the backend supported by AIConfigurator.

Where should the reviewer start?

benchmarks/profiler/profile_sla.py


copy-pr-bot bot commented Nov 9, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@jasonqinzhou jasonqinzhou changed the title [WIP] feat: DynamoPlanner to adapt to AIConfigurator 0.4.0 feat: DynamoPlanner profiler to use hf_id for AIConfigurator 0.4.0 Nov 9, 2025
@github-actions github-actions bot added the feat label Nov 9, 2025
@jasonqinzhou jasonqinzhou marked this pull request as ready for review November 9, 2025 12:27
@jasonqinzhou jasonqinzhou requested review from a team as code owners November 9, 2025 12:27
jasonqinzhou and others added 3 commits November 9, 2025 04:29
Signed-off-by: Jason Zhou <jasonzho@jasonzho-mlt.client.nvidia.com>
Signed-off-by: Jason Zhou <jasonzho@jasonzho-mlt.client.nvidia.com>
Signed-off-by: Jason Zhou <jasonzho@jasonzho-mlt.client.nvidia.com>

coderabbitai bot commented Nov 9, 2025

Walkthrough

This pull request systematically refactors the codebase to replace aic_model_name with aic_hf_id, aligning model identifiers with HuggingFace format (e.g., "Qwen/Qwen3-32B" instead of "QWEN3_32B"). The aiconfigurator dependency is upgraded via version bump and commit hash. Automatic version detection is added to the estimator when unspecified.

Changes

  • Dependency & Attribution Updates (ATTRIBUTIONS-Python.md, benchmarks/pyproject.toml): Updated the aiconfigurator package version from 0.2.0 to 0.4.0 and pinned the commit hash to 5554d2eb.
  • Configuration Files, YAML (benchmarks/profiler/deploy/profile_sla_aic_dgdr.yaml, deploy/cloud/operator/config/samples/nvidia.com_v1alpha1_dynamographdeploymentrequest.yaml, docs/planner/sla_planner_quickstart.md): Renamed the profiling configuration field from aic_model_name to aic_hf_id and updated the model identifier value format from "QWEN3_32B" to "Qwen/Qwen3-32B".
  • Core Implementation – Model Identifier Refactor (benchmarks/profiler/utils/estimate_perf.py, benchmarks/profiler/utils/profiler_argparse.py, benchmarks/profiler/profile_sla.py): Renamed the constructor parameter aic_model_name → aic_hf_id throughout; added automatic fetching of the latest perf database version when no version is specified; updated validation logic and model retrieval calls.
  • Test Files – Model Identifier Refactor (tests/profiler/test_profile_sla_aiconfigurator.py, tests/profiler/test_profile_sla_dryrun.py, deploy/cloud/operator/internal/controller/dynamographdeploymentrequest_controller_test.go): Updated test fixtures and parameterized tests to use aic_hf_id instead of aic_model_name; changed model values to HuggingFace format; updated backend version handling.
  • Documentation (docs/benchmarks/sla_driven_profiling.md): Updated profiling configuration examples to use aic_hf_id instead of aic_model_name; removed the "Model name mapping examples" section; refined backend version guidance.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Areas requiring extra attention:

  • benchmarks/profiler/utils/estimate_perf.py – Automatic version fetching logic when version parameter is falsy; verify that the fallback behavior works correctly and doesn't mask missing version specifications.
  • Model identifier format consistency – Ensure all config files and code paths correctly reference the new HuggingFace identifier format across YAML configurations, Python calls, and tests.
  • Constructor call sites – Confirm that all instantiations of AIConfiguratorPerfEstimator pass the renamed hf_id parameter correctly (especially in benchmarks/profiler/profile_sla.py and test files).

Poem

🐰 The fields have been renamed with HuggingFace in mind,
From QWEN3_32B to Qwen/Qwen3, a model more refined,
Configuration swept and tests updated with care,
Each aic_hf_id now floating through the air! ✨

Pre-merge checks

❌ Failed checks (1 inconclusive)

  • Description check — ❓ Inconclusive. The PR description provides an overview and details but uses placeholder text for related issues (#xxx) and lacks sufficient depth. Resolution: replace the placeholder 'closes GitHub issue: #xxx' with the actual GitHub issue number, and provide more detailed justification for the compatibility changes and version selection logic.

✅ Passed checks (2 passed)

  • Title check — ✅ Passed. The title clearly and specifically describes the main change: updating the profiler to use hf_id for AIConfigurator 0.4.0, which aligns with the systematic replacement of aic_model_name with aic_hf_id throughout the codebase.
  • Docstring Coverage — ✅ Passed. No functions found in the changed files to evaluate docstring coverage; the check was skipped.


@coderabbitai coderabbitai bot left a comment
Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/profiler/test_profile_sla_aiconfigurator.py (1)

63-69: Clarify whether aic_backend_version can be None.

The fixture at line 54 sets aic_backend_version = None to test auto-detection, but line 63 includes "aic_backend_version" in the missing args test, which expects a ValueError when it's None. This appears contradictory.

If the auto-detection feature allows None as a valid value (as suggested by the fixture and the parametrize case at line 98), then aic_backend_version should be removed from the missing_arg parametrize list. Otherwise, the fixture should use a non-None default.
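The suggested resolution — keeping aic_hf_id required while treating aic_backend_version as optional so that None triggers auto-detection — could look like the sketch below. The validate_aic_args helper and the two tuples are hypothetical, illustrating the split rather than the repository's actual validation code.

```python
# Hypothetical sketch of the required/optional split the comment asks for:
# aic_hf_id stays required; aic_backend_version may be None (auto-detect).
REQUIRED_AIC_ARGS = ("aic_hf_id", "aic_backend")
OPTIONAL_AIC_ARGS = ("aic_backend_version",)  # None is a valid value

def validate_aic_args(args: dict) -> None:
    """Raise ValueError if any required AI Configurator arg is missing."""
    missing = [name for name in REQUIRED_AIC_ARGS if args.get(name) is None]
    if missing:
        raise ValueError(f"Missing required AI Configurator args: {missing}")

# Passes: a None backend version is allowed, it just enables auto-detection.
validate_aic_args({"aic_hf_id": "Qwen/Qwen3-32B",
                   "aic_backend": "trtllm",
                   "aic_backend_version": None})
```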

🧹 Nitpick comments (1)
deploy/cloud/operator/config/samples/nvidia.com_v1alpha1_dynamographdeploymentrequest.yaml (1)

57-57: Update inline comment to reflect the field rename semantics.

The field has been renamed from aic_model_name to aic_hf_id to align with AIConfigurator 0.4.0, but the inline comment still references "Model name". Consider updating the comment to "HuggingFace model ID for AI Configurator" for clarity and consistency with the new field's purpose.

Apply this change:

-        aic_hf_id: Qwen/Qwen3-0.6B  # Model name for AI Configurator
+        aic_hf_id: Qwen/Qwen3-0.6B  # HuggingFace model ID for AI Configurator
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 09b26bf and 4bde8f7.

📒 Files selected for processing (12)
  • ATTRIBUTIONS-Python.md (1 hunks)
  • benchmarks/profiler/deploy/profile_sla_aic_dgdr.yaml (1 hunks)
  • benchmarks/profiler/profile_sla.py (1 hunks)
  • benchmarks/profiler/utils/estimate_perf.py (2 hunks)
  • benchmarks/profiler/utils/profiler_argparse.py (2 hunks)
  • benchmarks/pyproject.toml (1 hunks)
  • deploy/cloud/operator/config/samples/nvidia.com_v1alpha1_dynamographdeploymentrequest.yaml (1 hunks)
  • deploy/cloud/operator/internal/controller/dynamographdeploymentrequest_controller_test.go (2 hunks)
  • docs/benchmarks/sla_driven_profiling.md (1 hunks)
  • docs/planner/sla_planner_quickstart.md (1 hunks)
  • tests/profiler/test_profile_sla_aiconfigurator.py (3 hunks)
  • tests/profiler/test_profile_sla_dryrun.py (7 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-04T19:03:06.643Z
Learnt from: biswapanda
Repo: ai-dynamo/dynamo PR: 2872
File: examples/multimodal/deploy/agg_qwen.yaml:53-60
Timestamp: 2025-09-04T19:03:06.643Z
Learning: In the dynamo repository, Kubernetes Custom Resources use `gpu: "1"` format for GPU resource limits and requests, not the standard Kubernetes `nvidia.com/gpu: 1` format. This applies to DynamoGraphDeployment resources and other dynamo CRs.

Applied to files:

  • deploy/cloud/operator/config/samples/nvidia.com_v1alpha1_dynamographdeploymentrequest.yaml
🧬 Code graph analysis (2)
benchmarks/profiler/profile_sla.py (1)
benchmarks/profiler/utils/estimate_perf.py (1)
  • AIConfiguratorPerfEstimator (29-233)
tests/profiler/test_profile_sla_aiconfigurator.py (1)
tests/profiler/test_profile_sla_dryrun.py (1)
  • trtllm_args (129-162)
🪛 Ruff (0.14.3)
benchmarks/profiler/profile_sla.py

153-155: Abstract raise to an inner function

(TRY301)


153-155: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (16)
ATTRIBUTIONS-Python.md (1)

444-444: Version bump correctly reflects aiconfigurator dependency upgrade.

The update from aiconfigurator (0.2.0) to (0.4.0) aligns with the broader PR refactoring that renames model identifiers. This is a documentation-only change with no functional impact.

Please verify that this version bump is consistent with the dependency declaration in benchmarks/pyproject.toml or other relevant dependency files to ensure no version mismatches exist across the codebase.

benchmarks/pyproject.toml (1)

43-43: LGTM! Dependency updated for AIConfigurator 0.4.0.

The aiconfigurator dependency is updated to the commit that supports the new aic_hf_id parameter.

deploy/cloud/operator/internal/controller/dynamographdeploymentrequest_controller_test.go (1)

353-353: LGTM! Test fixtures updated for HuggingFace model identifiers.

The AI Configurator test fixtures correctly reflect the field rename (aic_model_name → aic_hf_id) and value format change (internal QWEN3_32B → HuggingFace Qwen/Qwen3-32B).

Also applies to: 1063-1063

docs/planner/sla_planner_quickstart.md (1)

233-233: LGTM! Documentation updated for HuggingFace model identifiers.

The quickstart guide correctly documents the new aic_hf_id field with HuggingFace model identifier format.

docs/benchmarks/sla_driven_profiling.md (2)

302-302: LGTM! Documentation updated for HuggingFace model identifiers.

The field name is correctly updated from aic_model_name to aic_hf_id to reflect the new HuggingFace-based model identification.


303-303: Change verified as correct.

AIConfigurator v0.4.x officially supports TensorRT-LLM versions 0.20.0 and 1.0.0rc3 only. The removal of 1.0.0rc6 from line 303's documentation comment is accurate and reflects the actual supported versions. This is a documentation correction, not a breaking change.

tests/profiler/test_profile_sla_dryrun.py (1)

70-70: LGTM! Test fixtures consistently updated.

All dry-run test fixtures correctly renamed the field from aic_model_name to aic_hf_id.

Also applies to: 106-106, 156-156, 199-199, 265-265, 331-331, 397-397

benchmarks/profiler/deploy/profile_sla_aic_dgdr.yaml (1)

22-22: LGTM! Sample DGDR configuration updated.

The AI Configurator profiling sample correctly uses aic_hf_id with the HuggingFace model identifier format (Qwen/Qwen3-32B).

benchmarks/profiler/utils/profiler_argparse.py (1)

284-287: LGTM! CLI argument updated for HuggingFace model identifiers.

The argument rename from --aic-model-name to --aic-hf-id correctly reflects the use of HuggingFace model identifiers. The updated help text provides clear examples (e.g., Qwen/Qwen3-32B, meta-llama/Llama-3.1-405B).

Note: This is a breaking change. Users with existing scripts using --aic-model-name will need to update to --aic-hf-id. Consider documenting this in release notes or migration guides.
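The renamed flag can be sketched with argparse. The help text is paraphrased from the review; the flag name and example model IDs come from the PR, while the rest of the parser setup is illustrative.

```python
import argparse

# Sketch of the renamed CLI flag (--aic-model-name -> --aic-hf-id).
parser = argparse.ArgumentParser()
parser.add_argument(
    "--aic-hf-id",
    help="HuggingFace model ID for AI Configurator, "
         "e.g. Qwen/Qwen3-32B or meta-llama/Llama-3.1-405B",
)
args = parser.parse_args(["--aic-hf-id", "Qwen/Qwen3-32B"])
print(args.aic_hf_id)  # -> Qwen/Qwen3-32B
```

Scripts still passing --aic-model-name will now fail argument parsing, which is the breaking change the review calls out.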

benchmarks/profiler/profile_sla.py (1)

152-169: LGTM! Validation and usage updated for HuggingFace model identifiers.

The validation logic correctly requires --aic-hf-id when using AI Configurator, and the parameter is properly passed to AIConfiguratorPerfEstimator. Error messages are clear and helpful.

benchmarks/profiler/utils/estimate_perf.py (3)

39-39: LGTM: Parameter rename aligns with HuggingFace conventions.

The rename from model_name to hf_id with the inline example improves clarity. The AI summary indicates related files have been updated accordingly.


64-73: LGTM: Consistent usage of hf_id.

The attribute assignment and API call correctly use the renamed hf_id parameter.


47-51: Verify error handling for auto-detection.

The auto-detection of the latest database version lacks explicit error handling. If get_latest_database_version raises an exception (e.g., network error, invalid configuration), it will propagate uncaught. If it returns None, the error will only be caught downstream at line 57-60 with a generic "Database not found" message. Consider wrapping the version fetch in a try-except block with a more informative error message, or verify with the aiconfigurator maintainers that exceptions from get_latest_database_version are not expected.

tests/profiler/test_profile_sla_aiconfigurator.py (3)

52-54: LGTM: Fixture updated to test auto-detection feature.

The rename to aic_hf_id aligns with the API changes, and setting aic_backend_version = None exercises the new auto-detection logic introduced in the estimator.


94-109: LGTM: Comprehensive test coverage for the new API.

The parametrize decorators test multiple backend versions (including None for auto-detection) and HuggingFace model IDs, providing good coverage of the refactored functionality.


110-117: LGTM: Test function correctly uses the renamed parameter.

The function signature and body are properly updated to use hf_model_id and assign it to aic_hf_id.

@jasonqinzhou jasonqinzhou requested a review from a team as a code owner November 10, 2025 07:10
@pull-request-size pull-request-size bot added size/L and removed size/M labels Nov 10, 2025
@jasonqinzhou jasonqinzhou enabled auto-merge (squash) November 10, 2025 07:11
Signed-off-by: Jason Zhou <jasonzho@nvidia.com>
Signed-off-by: Jason Zhou <jasonzho@nvidia.com>
@jasonqinzhou jasonqinzhou removed the request for review from tmonty12 November 10, 2025 19:32
@jasonqinzhou jasonqinzhou enabled auto-merge (squash) November 10, 2025 19:33
@dagil-nvidia (Contributor)

/ok to test b403025

@jasonqinzhou (Contributor, Author)

/ok to test b403025

@jasonqinzhou (Contributor, Author)

/ok to test b403025

@jasonqinzhou (Contributor, Author)

/ok to test


copy-pr-bot bot commented Nov 10, 2025

/ok to test

@jasonqinzhou, there was an error processing your request: E1

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/1/

@jasonqinzhou (Contributor, Author)

/ok to test 016d301

@jasonqinzhou jasonqinzhou merged commit 4398637 into main Nov 10, 2025
42 of 47 checks passed
@jasonqinzhou jasonqinzhou deleted the jasonzho/aic_0.4.0 branch November 10, 2025 23:17
jasonqinzhou added a commit that referenced this pull request Nov 12, 2025
…4167)

Signed-off-by: Jason Zhou <jasonzho@jasonzho-mlt.client.nvidia.com>
Signed-off-by: Jason Zhou <jasonzho@nvidia.com>
Co-authored-by: Jason Zhou <jasonzho@jasonzho-mlt.client.nvidia.com>
daiyaanarfeen pushed a commit that referenced this pull request Nov 14, 2025
…4167)

Signed-off-by: Jason Zhou <jasonzho@jasonzho-mlt.client.nvidia.com>
Signed-off-by: Jason Zhou <jasonzho@nvidia.com>
Co-authored-by: Jason Zhou <jasonzho@jasonzho-mlt.client.nvidia.com>
Signed-off-by: Daiyaan <darfeen@nvidia.com>
7 participants