Commit 01c6cba
Add tp-size and pp-size variations and update TensorRT-LLM version for GPT-J
Fixes #671
This PR fixes both issues reported in #671:
1. Missing tp-size/pp-size variations
2. nvidia-ammo installation failure in Docker
## Changes
### Fix 1: Add tp-size and pp-size variations
- Added tp-size.# and pp-size.# variation definitions
- Set default tp-size.1 and pp-size.1 for pytorch,nvidia variation
- Added MLC_NVIDIA_TP_SIZE and MLC_NVIDIA_PP_SIZE to new_env_keys
This resolves the error: "no scripts were found with tags:
get,ml-model,gptj,_nvidia,_fp8,_tp-size.2"
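The variation entries described above might look roughly like the following in the script's meta YAML. This is a hedged sketch: the key names `MLC_NVIDIA_TP_SIZE` and `MLC_NVIDIA_PP_SIZE` come from this PR, but the exact structure (wildcard `#` variations, `default_variations`) is assumed to mirror the llama2 script's conventions rather than copied from the diff:

```yaml
variations:
  tp-size.#:
    env:
      MLC_NVIDIA_TP_SIZE: "#"   # "#" is substituted with the requested value, e.g. _tp-size.2
  pp-size.#:
    env:
      MLC_NVIDIA_PP_SIZE: "#"
  pytorch,nvidia:
    default_variations:
      tp-size: tp-size.1        # defaults so existing invocations keep working
      pp-size: pp-size.1

new_env_keys:
  - MLC_NVIDIA_TP_SIZE
  - MLC_NVIDIA_PP_SIZE
```

With these definitions in place, a tag string such as `get,ml-model,gptj,_nvidia,_fp8,_tp-size.2` should resolve instead of failing with "no scripts were found".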
### Fix 2: Update TensorRT-LLM to v5.0
- Updated TensorRT-LLM SHA from 0ab9d17 (Feb 2024) to 2ea17cd (v5.0)
- Added required submodules list to match llama2 implementation
- Removed _lfs tag as it's not needed with newer version
This resolves the nvidia-ammo "RuntimeError: Bad params" installation
failure that occurred with the older TensorRT-LLM version.
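The version bump might be expressed in the meta YAML along these lines. This is a hedged sketch only: the SHAs are taken from this PR, but the environment key names and the checkout mechanism are assumptions based on how similar CM scripts pin git revisions, and the submodule list (copied from the llama2 script per the PR) is not reproduced here:

```yaml
env:
  # was 0ab9d17 (Feb 2024); 2ea17cd corresponds to TensorRT-LLM v5.0
  MLC_GIT_CHECKOUT_SHA: "2ea17cd"
  # submodule list mirrors the llama2 implementation (entries omitted here);
  # the _lfs tag is dropped because the newer version no longer needs it
```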
## Testing
- Validated YAML syntax
- Verified changes match llama2 script patterns
- Confirmed the TensorRT-LLM version matches the llama2 script (v5.0)

1 parent cd96a8d
1 file changed: +13 −1 lines