Commit 2eaa3a5
Feature: OpenAI-compatible endpoint for text generation (#1395)
* Instructions using openAI style remote endpoint
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* Readme for openai style remote endpoint
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* Adding remote textgen service, openai standard
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* Code and test for openai style endpoint
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Clarified instructions in README_endpoint_openai.md
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Commented out stop_containers at beginning.
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* Add a little code comment for clarity
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix the curl to text gen service s it doesn't need a key
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Modify unit test since vLLM 0.8.3 changed docker files path
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Cleaned up comments
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* Adding a suitable vllm block-size for cpu
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* Allow text-generation service.py to work with openai compatible endpoints that do not allow null or None as input e.g. openrouter
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* Updated README fixed small typos and make it easier to paste example curl
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Updated test_llms_textgen_endpoit_openai.sh
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Uncomment build_vllm_image
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* Fix the WORKPATH
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* Generalize OpeaTextGenService to be usable with other open ai compatible endpoints in addition to tgi and vllm
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* Add testing for both openai api chat completion and regular completions
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* Generalize the OpeaTextGenService so it can be used for openai like APIs beyond TGI and vLLM eg openrouter.ai
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Added logging import
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* Go back to relative path for ChatTemplate
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* Fixed two argument error and omit language arg for chatcompletions
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix unit tests
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Revert and simplify
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix stri interp bug
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* More logger fstring to fix
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Revert to old unit test.
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* To fix the test test_llms_text-generation_service_vllm_on_intel_hpu.sh The path of docker files used to build image from vllm-fork changed recently
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* Pin supported version of transformers 4.45.2 for gaudi 1.20.1 and use separate requirements_hpu.txt for building Dockerfile.intel_hpu_phi4
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* Update llama-index-core requirements to align with recent PRs
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* Revert back path to Dockerfile.hpu
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* Pin version range of numpy to be compatible with transformers and torch
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Added logging if vllm-gaudi-server fails
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* Seeing if omitting transformers and numpy will help hpu CI unit tests by not overwriting dependencies from the Gaudi container
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* Add more logging ot text-generation_service_vllm_on_intel_hpu and pin transformers and numpy
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* Refactored ALLOWED_CHATCOMPLETION_ARGS and ALLOWED_COMPLETION_ARGS
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Trying depedencies that are known to work with Gaudi 1.20.1
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* Revert back to main hpu test and text gen hpu Dockerfile
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Signed-off-by: Ed Lee <16417837+edlee123@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: xiguiw <111278656+xiguiw@users.noreply.github.com>
Co-authored-by: Liang Lv <liang1.lv@intel.com>
Co-authored-by: Abolfazl Shahbazi <12436063+ashahba@users.noreply.github.com>
Co-authored-by: Rachel R <rroumeliotis@gmail.com>1 parent 3240c96 commit 2eaa3a5
File tree
5 files changed
+348
-56
lines changed- comps
- cores/proto
- llms
- deployment/docker_compose
- src/text-generation
- integrations
- tests/llms
5 files changed
+348
-56
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1016 | 1016 | | |
1017 | 1017 | | |
1018 | 1018 | | |
| 1019 | + | |
| 1020 | + | |
| 1021 | + | |
| 1022 | + | |
| 1023 | + | |
| 1024 | + | |
| 1025 | + | |
| 1026 | + | |
| 1027 | + | |
| 1028 | + | |
| 1029 | + | |
| 1030 | + | |
| 1031 | + | |
| 1032 | + | |
| 1033 | + | |
| 1034 | + | |
| 1035 | + | |
| 1036 | + | |
| 1037 | + | |
| 1038 | + | |
| 1039 | + | |
| 1040 | + | |
| 1041 | + | |
| 1042 | + | |
| 1043 | + | |
| 1044 | + | |
| 1045 | + | |
| 1046 | + | |
| 1047 | + | |
| 1048 | + | |
| 1049 | + | |
| 1050 | + | |
| 1051 | + | |
| 1052 | + | |
| 1053 | + | |
| 1054 | + | |
| 1055 | + | |
1019 | 1056 | | |
1020 | 1057 | | |
Lines changed: 9 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
118 | 118 | | |
119 | 119 | | |
120 | 120 | | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
121 | 130 | | |
122 | 131 | | |
123 | 132 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
2 | 2 | | |
3 | 3 | | |
4 | 4 | | |
| 5 | + | |
5 | 6 | | |
| 7 | + | |
6 | 8 | | |
7 | 9 | | |
8 | 10 | | |
| |||
11 | 13 | | |
12 | 14 | | |
13 | 15 | | |
14 | | - | |
| 16 | + | |
15 | 17 | | |
16 | 18 | | |
17 | 19 | | |
18 | 20 | | |
19 | | - | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
20 | 28 | | |
21 | 29 | | |
22 | 30 | | |
| |||
96 | 104 | | |
97 | 105 | | |
98 | 106 | | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
99 | 114 | | |
100 | | - | |
101 | | - | |
| 115 | + | |
102 | 116 | | |
103 | 117 | | |
104 | 118 | | |
105 | | - | |
106 | | - | |
| 119 | + | |
107 | 120 | | |
| 121 | + | |
108 | 122 | | |
109 | | - | |
110 | | - | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
111 | 126 | | |
112 | | - | |
113 | | - | |
114 | | - | |
115 | | - | |
| 127 | + | |
116 | 128 | | |
117 | 129 | | |
118 | | - | |
119 | | - | |
| 130 | + | |
120 | 131 | | |
121 | 132 | | |
122 | 133 | | |
123 | 134 | | |
124 | 135 | | |
125 | 136 | | |
126 | 137 | | |
127 | | - | |
128 | | - | |
| 138 | + | |
| 139 | + | |
129 | 140 | | |
130 | 141 | | |
131 | 142 | | |
| |||
145 | 156 | | |
146 | 157 | | |
147 | 158 | | |
148 | | - | |
149 | | - | |
| 159 | + | |
150 | 160 | | |
151 | 161 | | |
152 | 162 | | |
| |||
179 | 189 | | |
180 | 190 | | |
181 | 191 | | |
182 | | - | |
183 | | - | |
| 192 | + | |
184 | 193 | | |
185 | 194 | | |
186 | 195 | | |
| |||
200 | 209 | | |
201 | 210 | | |
202 | 211 | | |
203 | | - | |
204 | | - | |
205 | | - | |
206 | | - | |
207 | | - | |
208 | | - | |
209 | | - | |
210 | | - | |
211 | | - | |
212 | | - | |
213 | | - | |
214 | | - | |
215 | | - | |
216 | | - | |
217 | | - | |
218 | | - | |
| 212 | + | |
| 213 | + | |
| 214 | + | |
| 215 | + | |
| 216 | + | |
219 | 217 | | |
220 | 218 | | |
221 | 219 | | |
| |||
226 | 224 | | |
227 | 225 | | |
228 | 226 | | |
229 | | - | |
230 | | - | |
231 | | - | |
232 | | - | |
233 | | - | |
234 | | - | |
235 | | - | |
236 | | - | |
237 | | - | |
238 | | - | |
239 | | - | |
240 | | - | |
241 | | - | |
242 | | - | |
243 | | - | |
244 | | - | |
| 227 | + | |
| 228 | + | |
| 229 | + | |
| 230 | + | |
245 | 231 | | |
246 | 232 | | |
247 | 233 | | |
| |||
251 | 237 | | |
252 | 238 | | |
253 | 239 | | |
254 | | - | |
255 | | - | |
| 240 | + | |
256 | 241 | | |
257 | 242 | | |
258 | 243 | | |
259 | 244 | | |
260 | 245 | | |
261 | 246 | | |
262 | 247 | | |
263 | | - | |
264 | | - | |
| 248 | + | |
265 | 249 | | |
| 250 | + | |
| 251 | + | |
| 252 | + | |
| 253 | + | |
| 254 | + | |
| 255 | + | |
| 256 | + | |
| 257 | + | |
| 258 | + | |
| 259 | + | |
| 260 | + | |
| 261 | + | |
| 262 | + | |
| 263 | + | |
| 264 | + | |
| 265 | + | |
0 commit comments