[CI] Renovation of nightly wheel build & generation #29690
# detect if python3.10+ is available
has_new_python=$($PYTHON -c "print(1 if __import__('sys').version_info >= (3,10) else 0)")
if [[ "$has_new_python" -eq 0 ]]; then
    # use new python from docker
    docker pull python:3-slim
    PYTHON="docker run --rm -v $(pwd):/app -w /app python:3-slim python3"
fi
just assert python >= 3.10?
This will not work on our build agents, which only have python 3.8 installed. We have to provide some shim.
@khluu can we directly upgrade our build agent?
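The shim debated in this thread comes down to a version check with a container fallback. A minimal Python sketch of that decision, assuming the same `python:3-slim` image and `/app` mount as the snippet above; `pick_python_command` is a hypothetical helper for illustration, not code from the PR:

```python
import sys


def pick_python_command(version_info=sys.version_info, workdir="."):
    """Return the argv prefix used to run scripts that need Python 3.10+.

    Hypothetical helper: it mirrors the shell shim above and is not part of the PR.
    """
    if tuple(version_info[:2]) >= (3, 10):
        # The host interpreter is new enough, so use it directly.
        return [sys.executable or "python3"]
    # Otherwise fall back to a containerized interpreter, mounting workdir at /app.
    return [
        "docker", "run", "--rm",
        "-v", f"{workdir}:/app",
        "-w", "/app",
        "python:3-slim", "python3",
    ]
```

Returning an argument list instead of a single string sidesteps word-splitting issues if the command is later passed to `subprocess.run`.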
setup.py
Outdated
# try to fetch the wheel metadata from the nightly wheel repo
variant = "cu" + envs.VLLM_MAIN_CUDA_VERSION.replace(".", "")
we can introduce a new env var like VLLM_PRECOMPILED_VARIANT (empty by default) to explicitly select cu129/cu130, and potentially extend it to cpu in the future. no need to hard-code the variant or couple it with VLLM_MAIN_CUDA_VERSION
Yes, if the indices work fine, then we can remove the "CUDA only" assertion from setup.py and download variant-specific wheels from the nightly index.
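The suggestion in this thread could look roughly like the following. This is only a sketch: `VLLM_PRECOMPILED_VARIANT` is the name proposed above, while `resolve_variant` and the rule that an empty value falls back to the CUDA-derived variant are assumptions, not the merged implementation.

```python
import os


def resolve_variant(main_cuda_version, environ=os.environ):
    """Pick the wheel variant (e.g. "cu129") used to look up precompiled wheels.

    Hypothetical helper. VLLM_PRECOMPILED_VARIANT is the env var proposed in
    the review thread; treating an unset or empty value as "derive the variant
    from the main CUDA version" is an assumption of this sketch.
    """
    explicit = environ.get("VLLM_PRECOMPILED_VARIANT", "").strip()
    if explicit:
        return explicit
    # Same derivation as the snippet under review: "12.9" -> "cu129".
    return "cu" + main_cuda_version.replace(".", "")
```

Passing the environment as a parameter keeps the function easy to test without mutating `os.environ`.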
youkaichao
left a comment
thanks for the great effort! LGTM overall. left two comments.
there's also a failing test here: https://buildkite.com/vllm/ci/builds/41163/steps/canvas?jid=019ad91a-3bdd-4e5a-930f-00f9582e79a6#019ad91a-3bdd-4e5a-930f-00f9582e79a6/7-7331
Please see my PR description :-):
So it will probably be automatically fixed once we have one commit building this new type of index.
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
…download precompiled wheels Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
…ult variant is found Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
…load Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
…piled wheels Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
…ace hardcoding Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Documentation preview: https://vllm--29690.org.readthedocs.build/en/29690/
as discussed, we need to merge this PR to fix the python-only installation test.
…LLM_MAIN_CUDA_VERSION to 12.9) Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com> Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>
Purpose
vllm's nightly build needs to be renovated due to the following reasons:
- … /<commit>/, /nightly/, /<version>/ directory
- … 1.0.0.dev wheel for the sole purpose of finding a precompiled wheel
In this PR, I have renovated the whole process. This includes:
- … (generate-nightly-index.py) to elegantly handle the generation of indices and an extra metadata.json. It supports auto-creation of sub-indices for different variants (with automatic detection). Please read the comments in the code for a detailed explanation.
- … upload-wheels.sh to upload one binary wheel only once after each successful build of any wheel. It will call generate-nightly-index.py to generate the index for all currently present wheels in the directory, and copy the indices to all necessary locations (e.g. /<commit>/, /nightly/ if it is on the master branch, and /<version>/ if it is not a dev version).
- … 1.0.0.dev wheels are uploaded to S3.
- … manylinux1 and manylinux2014 are corrected with manylinux_2_31, which reflects the glibc version of vllm's building image (ubuntu-20.04)
- setup.py is changed accordingly to download the metadata.json to find the actual wheel path, not using the hardcoded 1.0.0.dev anymore.
More nits:
- VLLM_MAIN_CUDA_VERSION is bumped to 12.9 to avoid confusion.
Test Plan
It's all CI changes. Let's test it by CI.
Test Result
- release-pipeline has passed.
- test-pipeline will probably fail on python_only_compile.sh, which tests VLLM_USE_PRECOMPILED with build name nightly. However, before this PR is merged to main, no new indices and metadata will be uploaded to /nightly/.
Essential Elements of an Effective PR Description Checklist
- … supported_models.md and examples for a new model.
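The copy rule stated under Purpose (indices go to /<commit>/, to /nightly/ when building the master branch, and to /<version>/ when the version is not a dev version) can be sketched as a small pure function. `index_destinations` is a hypothetical helper, and detecting a dev version by the substring ".dev" is an assumption; the actual upload-wheels.sh may decide this differently.

```python
def index_destinations(commit, branch, version, main_branch="master"):
    """List the directories a freshly generated index would be copied to.

    Hypothetical helper following the rule in the PR description. Treating any
    version string that contains ".dev" as a dev version is an assumption.
    """
    destinations = [f"/{commit}/"]  # every build gets a per-commit index
    if branch == main_branch:
        # Builds of the main development branch also refresh the nightly index.
        destinations.append("/nightly/")
    if ".dev" not in version:
        # Non-dev (release) versions additionally get a per-version index.
        destinations.append(f"/{version}/")
    return destinations
```

The branch name is a parameter because the description refers to both master and main; the commit hashes and version strings passed in are whatever the CI job supplies.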