Description
Component
Setup/Standup
Desired use case or feature
The problem
Too many 'hidden' default values for successful model deployment.
Currently, the default deployed model is facebook/opt-125m. For this small model, the existing default configuration (for example, the number of GPUs per decode pod) is just enough for a successful deployment.
However, when switching to a larger model (such as Qwen/Qwen3-32B), the same default configuration will cause the stack deployment to fail.
It can be argued that adjusting these variables is the user's responsibility. However, shipping recommended (or default) configurations for larger models would save future developers significant time, and the specific values are arguably trivial to provide.
Proposed solution
There are several possible paths to a solution. The main reason I created this issue is to discuss whether we want to introduce such a feature, and how.
The most straightforward suggestion would be to create a few config files (extending env.sh) with recommended values for several model examples (ranging from small to large models).
Besides that, I think it would be beneficial to start breaking env.sh down into a more modular structure before it grows too large, and this could be the first step toward that goal.
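To make the discussion concrete, here is a rough sketch of how per-model preset files could layer on top of env.sh. All names here (MODEL_NAME, DECODE_GPUS_PER_POD, PRESET_DIR, the presets/ directory, and both helper functions) are hypothetical and not taken from the current env.sh; this is only one possible shape, not a proposal for the final layout.

```shell
#!/usr/bin/env bash
# Sketch only: MODEL_NAME, DECODE_GPUS_PER_POD, and PRESET_DIR are hypothetical
# names used for illustration, not variables confirmed to exist in env.sh.

# Defaults, roughly as env.sh might define them today (sized for the small model).
MODEL_NAME="${MODEL_NAME:-facebook/opt-125m}"
DECODE_GPUS_PER_POD="${DECODE_GPUS_PER_POD:-1}"
PRESET_DIR="${PRESET_DIR:-./presets}"

# Map a model name to a preset file path,
# e.g. Qwen/Qwen3-32B becomes $PRESET_DIR/qwen_qwen3-32b.sh
preset_file_for() {
  local slug
  slug="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr '/' '_')"
  printf '%s/%s.sh\n' "$PRESET_DIR" "$slug"
}

# Source the preset for the given model if one exists; otherwise keep the defaults.
load_model_preset() {
  local file
  file="$(preset_file_for "$1")"
  if [ -f "$file" ]; then
    # shellcheck disable=SC1090
    . "$file"
  else
    echo "No preset for $1; using env.sh defaults" >&2
  fi
}
```

With something like this, env.sh would end with a call to load_model_preset "$MODEL_NAME", and each preset file would contain only the variables that differ from the defaults for that model (for example, a larger GPU count per decode pod). The actual recommended values would still need to be agreed on per model.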
I would love to hear your take on this :)
Thanks!
Alternatives
No response
Additional context or screenshots
No response