feat: DynamoPlanner profiler to use hf_id for AIConfigurator 0.4.0 (#4167)
Signed-off-by: Jason Zhou <jasonzho@jasonzho-mlt.client.nvidia.com>
Signed-off-by: Jason Zhou <jasonzho@nvidia.com>
Co-authored-by: Jason Zhou <jasonzho@jasonzho-mlt.client.nvidia.com>
Signed-off-by: Daiyaan <darfeen@nvidia.com>
  decode_interpolation_granularity: Int (how many samples to benchmark to interpolate ITL under different active kv cache size and decode context length, default: 6)
  use_ai_configurator: Boolean (use ai-configurator to estimate benchmarking results instead of running actual deployment, default: False)
  aic_system: String (target system for use with aiconfigurator, default: None)
- aic_model_name: String (aiconfigurator name of the target model, default: None)
+ aic_hf_id: String (aiconfigurator huggingface id of the target model, default: None)
  aic_backend: String (aiconfigurator backend of the target model, if not provided, will use args.backend, default: "")
  aic_backend_version: String (specify backend version when using aiconfigurator to estimate perf, default: None)
  dry_run: Boolean (dry run the profile job, default: False)
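
To illustrate how the renamed option fits with the other aiconfigurator settings, here is a minimal Python sketch. It is not the profiler's actual code: `AicArgs` and `resolve_aic_settings` are hypothetical names, the rule that `aic_system` and `aic_hf_id` must be set when `use_ai_configurator` is true is an assumption, and the system and model values in the usage note are placeholders. Only the field names, their defaults, and the fallback of `aic_backend` to `args.backend` come from the documented options above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AicArgs:
    """Hypothetical container mirroring the documented profiler options."""
    backend: str                                # stands in for args.backend
    use_ai_configurator: bool = False           # documented default: False
    aic_system: Optional[str] = None            # documented default: None
    aic_hf_id: Optional[str] = None             # replaces aic_model_name
    aic_backend: str = ""                       # documented default: ""
    aic_backend_version: Optional[str] = None   # documented default: None


def resolve_aic_settings(args: AicArgs) -> Optional[dict]:
    """Return the effective aiconfigurator settings, or None when disabled."""
    if not args.use_ai_configurator:
        return None

    # Assumption: estimation needs both a target system and a model id.
    missing = [name for name in ("aic_system", "aic_hf_id")
               if not getattr(args, name)]
    if missing:
        raise ValueError("use_ai_configurator requires: " + ", ".join(missing))

    return {
        "system": args.aic_system,
        "hf_id": args.aic_hf_id,
        # Documented behavior: fall back to args.backend when aic_backend is empty.
        "backend": args.aic_backend or args.backend,
        "backend_version": args.aic_backend_version,
    }
```

With placeholder values such as `AicArgs(backend="vllm", use_ai_configurator=True, aic_system="h200_sxm", aic_hf_id="Qwen/Qwen3-32B")`, the sketch returns a dictionary whose `backend` entry falls back to `"vllm"`, since `aic_backend` was left empty.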