feat: DynamoPlanner profiler to use hf_id for AIConfigurator 0.4.0 #4167
Conversation
Signed-off-by: Jason Zhou <jasonzho@jasonzho-mlt.client.nvidia.com>
Force-pushed from bb605c1 to 4bde8f7
Walkthrough
This pull request systematically refactors the codebase to replace the internal `aic_model_name` model identifier with the HuggingFace-based `aic_hf_id` for AIConfigurator 0.4.0.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks
❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/profiler/test_profile_sla_aiconfigurator.py (1)
63-69: Clarify whether `aic_backend_version` can be None.
The fixture at line 54 sets `aic_backend_version = None` to test auto-detection, but line 63 includes `"aic_backend_version"` in the missing-args test, which expects a ValueError when it is None. This appears contradictory. If the auto-detection feature allows `None` as a valid value (as suggested by the fixture and the parametrize case at line 98), then `aic_backend_version` should be removed from the `missing_arg` parametrize list. Otherwise, the fixture should use a non-None default.
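One way to resolve the contradiction is to drop `aic_backend_version` from the missing-args parametrization so that `None` is only ever treated as "auto-detect". The sketch below is hypothetical: `validate_args` and `REQUIRED_ARGS` are illustrative stand-ins, not the actual names in the test file or profiler.

```python
import pytest

# Hypothetical sketch: once None means "auto-detect the backend version",
# only truly required arguments belong in the missing-args list.
REQUIRED_ARGS = ["aic_hf_id", "aic_system"]  # "aic_backend_version" removed

def validate_args(args: dict) -> None:
    # Minimal stand-in for the profiler's validation logic.
    for name in REQUIRED_ARGS:
        if args.get(name) is None:
            raise ValueError(f"--{name.replace('_', '-')} is required")

@pytest.mark.parametrize("missing_arg", REQUIRED_ARGS)
def test_missing_required_arg_raises(missing_arg):
    args = {
        "aic_hf_id": "Qwen/Qwen3-32B",
        "aic_system": "h200_sxm",
        "aic_backend_version": None,  # valid: exercises auto-detection
    }
    args[missing_arg] = None
    with pytest.raises(ValueError):
        validate_args(args)
```

With this split, the fixture can keep `aic_backend_version = None` without contradicting the missing-args test.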
🧹 Nitpick comments (1)
deploy/cloud/operator/config/samples/nvidia.com_v1alpha1_dynamographdeploymentrequest.yaml (1)
57-57: Update inline comment to reflect the field rename semantics.
The field has been renamed from `aic_model_name` to `aic_hf_id` to align with AIConfigurator 0.4.0, but the inline comment still references "Model name". Consider updating the comment to "HuggingFace model ID for AI Configurator" for clarity and consistency with the new field's purpose. Apply this change:

```diff
- aic_hf_id: Qwen/Qwen3-0.6B # Model name for AI Configurator
+ aic_hf_id: Qwen/Qwen3-0.6B # HuggingFace model ID for AI Configurator
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
- ATTRIBUTIONS-Python.md (1 hunks)
- benchmarks/profiler/deploy/profile_sla_aic_dgdr.yaml (1 hunks)
- benchmarks/profiler/profile_sla.py (1 hunks)
- benchmarks/profiler/utils/estimate_perf.py (2 hunks)
- benchmarks/profiler/utils/profiler_argparse.py (2 hunks)
- benchmarks/pyproject.toml (1 hunks)
- deploy/cloud/operator/config/samples/nvidia.com_v1alpha1_dynamographdeploymentrequest.yaml (1 hunks)
- deploy/cloud/operator/internal/controller/dynamographdeploymentrequest_controller_test.go (2 hunks)
- docs/benchmarks/sla_driven_profiling.md (1 hunks)
- docs/planner/sla_planner_quickstart.md (1 hunks)
- tests/profiler/test_profile_sla_aiconfigurator.py (3 hunks)
- tests/profiler/test_profile_sla_dryrun.py (7 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-04T19:03:06.643Z
Learnt from: biswapanda
Repo: ai-dynamo/dynamo PR: 2872
File: examples/multimodal/deploy/agg_qwen.yaml:53-60
Timestamp: 2025-09-04T19:03:06.643Z
Learning: In the dynamo repository, Kubernetes Custom Resources use `gpu: "1"` format for GPU resource limits and requests, not the standard Kubernetes `nvidia.com/gpu: 1` format. This applies to DynamoGraphDeployment resources and other dynamo CRs.
Applied to files:
deploy/cloud/operator/config/samples/nvidia.com_v1alpha1_dynamographdeploymentrequest.yaml
🧬 Code graph analysis (2)
benchmarks/profiler/profile_sla.py (1)
- benchmarks/profiler/utils/estimate_perf.py (1): `AIConfiguratorPerfEstimator` (29-233)
tests/profiler/test_profile_sla_aiconfigurator.py (1)
- tests/profiler/test_profile_sla_dryrun.py (1): `trtllm_args` (129-162)
🪛 Ruff (0.14.3)
benchmarks/profiler/profile_sla.py
153-155: Abstract raise to an inner function
(TRY301)
153-155: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build and Test - dynamo
🔇 Additional comments (16)
ATTRIBUTIONS-Python.md (1)
444-444: Version bump correctly reflects aiconfigurator dependency upgrade.
The update from aiconfigurator (0.2.0) to (0.4.0) aligns with the broader PR refactoring that renames model identifiers. This is a documentation-only change with no functional impact. Please verify that this version bump is consistent with the dependency declaration in `benchmarks/pyproject.toml` or other relevant dependency files to ensure no version mismatches exist across the codebase.

benchmarks/pyproject.toml (1)
43-43: LGTM! Dependency updated for AIConfigurator 0.4.0.
The aiconfigurator dependency is updated to the commit that supports the new `aic_hf_id` parameter.

deploy/cloud/operator/internal/controller/dynamographdeploymentrequest_controller_test.go (1)
353-353: LGTM! Test fixtures updated for HuggingFace model identifiers.
The AI Configurator test fixtures correctly reflect the field rename (`aic_model_name` → `aic_hf_id`) and value format change (internal `QWEN3_32B` → HuggingFace `Qwen/Qwen3-32B`). Also applies to: 1063-1063
docs/planner/sla_planner_quickstart.md (1)
233-233: LGTM! Documentation updated for HuggingFace model identifiers.
The quickstart guide correctly documents the new `aic_hf_id` field with HuggingFace model identifier format.

docs/benchmarks/sla_driven_profiling.md (2)
302-302: LGTM! Documentation updated for HuggingFace model identifiers.
The field name is correctly updated from `aic_model_name` to `aic_hf_id` to reflect the new HuggingFace-based model identification.
303-303: Change verified as correct.
AIConfigurator v0.4.x officially supports TensorRT-LLM versions 0.20.0 and 1.0.0rc3 only. The removal of 1.0.0rc6 from line 303's documentation comment is accurate and reflects the actual supported versions. This is a documentation correction, not a breaking change.
tests/profiler/test_profile_sla_dryrun.py (1)
70-70: LGTM! Test fixtures consistently updated.
All dry-run test fixtures correctly renamed the field from `aic_model_name` to `aic_hf_id`. Also applies to: 106-106, 156-156, 199-199, 265-265, 331-331, 397-397

benchmarks/profiler/deploy/profile_sla_aic_dgdr.yaml (1)
22-22: LGTM! Sample DGDR configuration updated.
The AI Configurator profiling sample correctly uses `aic_hf_id` with the HuggingFace model identifier format (`Qwen/Qwen3-32B`).
benchmarks/profiler/utils/profiler_argparse.py (1)
284-287: LGTM! CLI argument updated for HuggingFace model identifiers.
The argument rename from `--aic-model-name` to `--aic-hf-id` correctly reflects the use of HuggingFace model identifiers. The updated help text provides clear examples (e.g., `Qwen/Qwen3-32B`, `meta-llama/Llama-3.1-405B`).
Note: This is a breaking change. Users with existing scripts using `--aic-model-name` will need to update to `--aic-hf-id`. Consider documenting this in release notes or migration guides.
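The shape of the breaking change can be sketched with argparse. This is a hypothetical stand-in, not the actual definition in `profiler_argparse.py`, which may use different defaults and help text:

```python
import argparse

# Hypothetical sketch of the renamed flag described above.
parser = argparse.ArgumentParser(prog="profile_sla")
parser.add_argument(
    "--aic-hf-id",
    dest="aic_hf_id",
    default=None,
    help="HuggingFace model ID for AI Configurator "
         "(e.g. Qwen/Qwen3-32B, meta-llama/Llama-3.1-405B)",
)

# Old scripts passing --aic-model-name will now fail with an
# "unrecognized arguments" error and must switch to --aic-hf-id.
args = parser.parse_args(["--aic-hf-id", "Qwen/Qwen3-32B"])
print(args.aic_hf_id)
```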
benchmarks/profiler/profile_sla.py (1)
152-169: LGTM! Validation and usage updated for HuggingFace model identifiers.
The validation logic correctly requires `--aic-hf-id` when using AI Configurator, and the parameter is properly passed to `AIConfiguratorPerfEstimator`. Error messages are clear and helpful.
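The validation pattern described above can be sketched as follows. Names here (`build_estimator`, the stand-in estimator class, the argument keys) are illustrative assumptions, not the exact code in `profile_sla.py`:

```python
# Stand-in for the real AIConfiguratorPerfEstimator in estimate_perf.py.
class AIConfiguratorPerfEstimator:
    def __init__(self, hf_id, system, backend, backend_version=None):
        self.hf_id = hf_id
        self.system = system
        self.backend = backend
        self.backend_version = backend_version  # None -> auto-detect

def build_estimator(args: dict):
    # Require --aic-hf-id only when the AI Configurator path is taken.
    if args.get("use_ai_configurator"):
        if not args.get("aic_hf_id"):
            raise ValueError(
                "--aic-hf-id is required when --use-ai-configurator is set"
            )
        return AIConfiguratorPerfEstimator(
            hf_id=args["aic_hf_id"],
            system=args.get("aic_system"),
            backend=args.get("backend"),
            backend_version=args.get("aic_backend_version"),
        )
    return None

est = build_estimator({
    "use_ai_configurator": True,
    "aic_hf_id": "Qwen/Qwen3-32B",
    "aic_system": "h200_sxm",
    "backend": "trtllm",
})
```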
benchmarks/profiler/utils/estimate_perf.py (3)
39-39: LGTM: Parameter rename aligns with HuggingFace conventions.
The rename from `model_name` to `hf_id` with the inline example improves clarity. The AI summary indicates related files have been updated accordingly.
64-73: LGTM: Consistent usage of hf_id.
The attribute assignment and API call correctly use the renamed `hf_id` parameter.
47-51: Verify error handling for auto-detection.
The auto-detection of the latest database version lacks explicit error handling. If `get_latest_database_version` raises an exception (e.g., network error, invalid configuration), it will propagate uncaught. If it returns `None`, the error will only be caught downstream at lines 57-60 with a generic "Database not found" message. Consider wrapping the version fetch in a try-except block with a more informative error message, or verify with the aiconfigurator maintainers that exceptions from `get_latest_database_version` are not expected.
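The suggested defensive wrapper could look roughly like this. `fetch_latest` is a placeholder for the real `get_latest_database_version` helper, whose actual signature and failure modes should be confirmed with the aiconfigurator maintainers:

```python
def resolve_backend_version(backend: str, system: str, fetch_latest):
    """Resolve the aiconfigurator database version, surfacing clear errors."""
    try:
        version = fetch_latest(backend, system)
    except Exception as exc:
        # Network errors or invalid configuration would otherwise propagate
        # uncaught with no context about what was being looked up.
        raise RuntimeError(
            f"Failed to auto-detect latest aiconfigurator database version "
            f"for backend={backend!r}, system={system!r}"
        ) from exc
    if version is None:
        raise ValueError(
            f"No aiconfigurator database found for backend={backend!r}, "
            f"system={system!r}; pass --aic-backend-version explicitly"
        )
    return version
```

This replaces the generic downstream "Database not found" failure with a message that names the backend and system being queried.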
tests/profiler/test_profile_sla_aiconfigurator.py (3)
52-54: LGTM: Fixture updated to test auto-detection feature.
The rename to `aic_hf_id` aligns with the API changes, and setting `aic_backend_version = None` exercises the new auto-detection logic introduced in the estimator.
94-109: LGTM: Comprehensive test coverage for the new API.
The parametrize decorators test multiple backend versions (including `None` for auto-detection) and HuggingFace model IDs, providing good coverage of the refactored functionality.
110-117: LGTM: Test function correctly uses the renamed parameter.
The function signature and body are properly updated to use `hf_model_id` and assign it to `aic_hf_id`.
Signed-off-by: Jason Zhou <jasonzho@nvidia.com>
/ok to test b403025
/ok to test
@jasonqinzhou, there was an error processing your request: See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/1/
/ok to test 016d301
…4167)
Signed-off-by: Jason Zhou <jasonzho@jasonzho-mlt.client.nvidia.com>
Signed-off-by: Jason Zhou <jasonzho@nvidia.com>
Co-authored-by: Jason Zhou <jasonzho@jasonzho-mlt.client.nvidia.com>
Signed-off-by: Daiyaan <darfeen@nvidia.com>
Overview:
feat: DynamoPlanner profiler to use hf_id for AIConfigurator 0.4.0 #4167
Details:
Where should the reviewer start?
benchmarks/profiler/profile_sla.py