[CI] Renovation of nightly wheel build & generation #29690

Harry-Chen · 2025-11-28T16:46:07Z

Purpose

vllm's nightly build needs to be renovated for the following reasons:

Links to aarch64 wheels are completely broken.
Binary wheels are duplicated everywhere, up to 6 (six) times:
- across the /<commit>/, /nightly/, and /<version>/ directories
- once under its own version, and once more as a hardcoded 1.0.0.dev wheel that exists solely so a precompiled wheel can be found
Only one of the variants (cu129, cu130, cpu) has an index; the others are ignored entirely.

In this PR, I have renovated the whole process. This includes:

Rewrite a Python script (generate-nightly-index.py) to elegantly handle the generation of indices and an extra metadata.json. It supports auto-creation of sub-indices for different variants (with automatic detection). Please read the comments in the code for a detailed explanation.
Rewrite upload-wheels.sh to upload each binary wheel exactly once, after that wheel builds successfully. It then calls generate-nightly-index.py to generate the index for all wheels currently present in the directory, and copies the indices to all necessary locations (e.g. /<commit>/, /nightly/ if it is on the main branch, and /<version>/ if it is not a dev version).
- breaking change: hardcoded 1.0.0.dev wheels are no longer uploaded to S3.
- nits: the wrongly marked manylinux1 and manylinux2014 tags are replaced with manylinux_2_31, which reflects the glibc version of vllm's build image (ubuntu-20.04)
The logic in setup.py is changed accordingly: it downloads metadata.json to find the actual wheel path instead of relying on the hardcoded 1.0.0.dev wheel.
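As a rough illustration of the index generation described above, the sketch below groups wheel filenames by variant and renders one simple HTML index page per variant, plus a metadata list. It is not the code from generate-nightly-index.py. The function names, the "default" bucket, the `../` link prefix, the metadata layout, and the assumption that the variant appears as a dot-separated segment of the wheel's local version (for example `+cu129`) are all hypothetical.

```python
import html
import json
import re
from collections import defaultdict
from urllib.parse import quote


def detect_variant(wheel_name: str, default: str = "default") -> str:
    """Guess the variant from the local version segment (assumed convention)."""
    # Wheel filenames are "<name>-<version>-<python>-<abi>-<platform>.whl".
    version = wheel_name.split("-")[1]
    _, _, local = version.partition("+")
    for segment in local.split("."):
        if re.fullmatch(r"cu\d+|cpu", segment):
            return segment
    return default


def render_index(wheel_names, href_prefix: str = "../") -> str:
    """Render a minimal HTML page with one link per wheel."""
    links = "\n".join(
        # quote() percent-encodes "+" so the link survives URL decoding.
        f'<a href="{html.escape(href_prefix + quote(name))}">{html.escape(name)}</a><br/>'
        for name in sorted(wheel_names)
    )
    return f"<!DOCTYPE html>\n<html><body>\n{links}\n</body></html>\n"


def build_indices(wheel_names):
    """Return ({variant: index_html}, metadata_list) for the given wheels."""
    by_variant = defaultdict(list)
    for name in wheel_names:
        by_variant[detect_variant(name)].append(name)
    indices = {variant: render_index(names) for variant, names in by_variant.items()}
    # Hypothetical metadata layout; the real metadata.json may differ.
    metadata = [
        {"filename": name, "variant": detect_variant(name)}
        for name in sorted(wheel_names)
    ]
    return indices, metadata


if __name__ == "__main__":
    example = [
        "vllm-0.11.2.dev10+g1a2b3c.cu129-cp38-abi3-manylinux_2_31_x86_64.whl",
        "vllm-0.11.2.dev10+g1a2b3c.cu130-cp38-abi3-manylinux_2_31_aarch64.whl",
    ]
    indices, metadata = build_indices(example)
    print(sorted(indices))
    print(json.dumps(metadata, indent=2))
```

Because each wheel is only linked from the index pages, the same uploaded file can back several locations without being copied, which is the deduplication this PR is after.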

More nits:

CUDA 12.8 build is removed from CI, as per the discussion with @youkaichao.
VLLM_MAIN_CUDA_VERSION is bumped to 12.9 to avoid confusion.

Test Plan

These are all CI changes, so CI itself is the test.

Test Result

release-pipeline has passed.
test-pipeline will probably fail on python_only_compile.sh, which tests VLLM_USE_PRECOMPILED with build name nightly. However, before this PR is merged to main, no new indices and metadata will be uploaded to /nightly/.

youkaichao · 2025-12-01T08:56:23Z

.buildkite/scripts/upload-wheels.sh

+# detect if python3.10+ is available
+has_new_python=$($PYTHON -c "print(1 if __import__('sys').version_info >= (3,10) else 0)")
+if [[ "$has_new_python" -eq 0 ]]; then
+    # use new python from docker
+    docker pull python:3-slim
+    PYTHON="docker run --rm -v $(pwd):/app -w /app python:3-slim python3"
+fi


just assert python >= 3.10?

This will not work on our build agents, which only have python 3.8 installed. We have to provide some shim.

@khluu can we directly upgrade our build agent?
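For comparison, the "just assert" alternative raised above could look like the following minimal sketch. The helper names are hypothetical and this is not part of the PR. As noted in the reply, a guard like this only fails fast; it does not help on agents that ship Python 3.8, which is why the diff falls back to a Docker image instead.

```python
import sys

MIN_PYTHON = (3, 10)


def python_is_new_enough(version_info=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """Return True if the interpreter's major.minor is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


def require_python(version_info=sys.version_info, minimum=MIN_PYTHON) -> None:
    """Exit with a clear message when the interpreter is too old."""
    if not python_is_new_enough(version_info, minimum):
        found = ".".join(str(part) for part in version_info[:2])
        needed = ".".join(str(part) for part in minimum)
        raise SystemExit(f"Python {needed}+ is required, found {found}")
```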

youkaichao · 2025-12-01T09:10:34Z

setup.py

+        # try to fetch the wheel metadata from the nightly wheel repo
+        variant = "cu" + envs.VLLM_MAIN_CUDA_VERSION.replace(".", "")


we can introduce a new env var like VLLM_PRECOMPILED_VARIANT to explicitly set the variant to cu129/cu130, potentially extended to cpu in the future (and empty by default). it doesn't need to be hard-coded and coupled with VLLM_MAIN_CUDA_VERSION

Yes, if the indices work fine, then we can remove the "CUDA only" assertion from setup.py and download variant-specific wheels from the nightly index.
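A minimal sketch of how the proposed override might be resolved, assuming an empty or unset VLLM_PRECOMPILED_VARIANT falls back to the value derived from the main CUDA version, as in the diff line above. The function name and its parameters are hypothetical, and VLLM_PRECOMPILED_VARIANT is only a suggestion in this thread, not an existing setting.

```python
import os


def resolve_variant(env=None, main_cuda_version: str = "12.9") -> str:
    """Pick the wheel variant: explicit override first, CUDA-derived default otherwise."""
    env = os.environ if env is None else env
    # Proposed (not yet existing) override, empty by default.
    explicit = env.get("VLLM_PRECOMPILED_VARIANT", "").strip()
    if explicit:
        return explicit
    # Fallback mirrors the diff: "12.9" becomes "cu129".
    return "cu" + main_cuda_version.replace(".", "")
```

Passing the environment as a mapping keeps the lookup easy to exercise without mutating `os.environ`.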

youkaichao

thanks for the great effort! LGTM overall. left two comments.

there's also a failing test here: https://buildkite.com/vllm/ci/builds/41163/steps/canvas?jid=019ad91a-3bdd-4e5a-930f-00f9582e79a6#019ad91a-3bdd-4e5a-930f-00f9582e79a6/7-7331

Harry-Chen · 2025-12-01T09:16:38Z

there's also a failing test here: https://buildkite.com/vllm/ci/builds/41163/steps/canvas?jid=019ad91a-3bdd-4e5a-930f-00f9582e79a6#019ad91a-3bdd-4e5a-930f-00f9582e79a6/7-7331

Please see my PR description :-):

test-pipeline will probably fail on python_only_compile.sh, which tests VLLM_USE_PRECOMPILED with build name nightly. However, before this PR is merged to main, no new indices and metadata will be uploaded to /nightly/.

So it will probably be automatically fixed once we have one commit building this new type of index.