Conversation

@Harry-Chen
Contributor

@Harry-Chen Harry-Chen commented Nov 28, 2025

Purpose

vLLM's nightly build needs to be overhauled for the following reasons:

  • Links to aarch64 wheels are completely broken.
  • Binary wheels are duplicated everywhere, up to 6 (six) times:
    • once in each of the /<commit>/, /nightly/, and /<version>/ directories
    • in each directory, once under its own version and once as a hardcoded 1.0.0.dev wheel whose sole purpose is finding a precompiled wheel
  • Only one of the variants (cu129, cu130, cpu) has an index; the others are ignored entirely.

In this PR, I have overhauled the whole process. This includes:

  • Rewrote a Python script (generate-nightly-index.py) that handles generating the indices and an extra metadata.json. It detects the variants automatically and creates a sub-index for each of them. Please read the comments in the code for a detailed explanation.
  • Rewrote upload-wheels.sh so that each binary wheel is uploaded only once, after each successful build of any wheel. It calls generate-nightly-index.py to generate the index for all wheels currently present in the directory, then copies the indices to all necessary locations (e.g. /<commit>/, /nightly/ if it is on the main branch, and /<version>/ if it is not a dev version).
    • breaking change: hardcoded 1.0.0.dev wheels are no longer uploaded to S3.
    • nit: the incorrect manylinux1 and manylinux2014 tags are replaced with manylinux_2_31, which reflects the glibc version of vllm's build image (ubuntu-20.04)
  • The logic in setup.py is changed accordingly: it downloads metadata.json to find the actual wheel path instead of relying on the hardcoded 1.0.0.dev.
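
For readers who want a feel for the variant detection and sub-index generation described above, here is a minimal sketch. It is not the code in generate-nightly-index.py. It assumes (hypothetically) that a variant shows up as a local-version segment such as cu130 or cpu in the wheel filename, and that wheels without such a segment belong to a "default" group. The function names and the base_url parameter are made up for illustration.

```python
import re
from collections import defaultdict
from urllib.parse import quote

# Assumption: a variant is a local-version segment like "cu130" or "cpu".
VARIANT_RE = re.compile(r"^(cu\d+|cpu)$")


def detect_variant(wheel_name: str) -> str:
    """Return the variant encoded in the wheel's local version, or "default"."""
    # Wheel names look like: name-version[-build]-python-abi-platform.whl
    version = wheel_name[: -len(".whl")].split("-")[1]
    # The local version is everything after "+", split into dot-separated segments.
    _, _, local = version.partition("+")
    for segment in local.split("."):
        if VARIANT_RE.match(segment):
            return segment
    return "default"


def group_by_variant(wheel_names):
    """Group wheel filenames by detected variant."""
    groups = defaultdict(list)
    for name in sorted(wheel_names):
        groups[detect_variant(name)].append(name)
    return dict(groups)


def render_index(wheel_names, base_url: str) -> str:
    """Render a minimal HTML index page with one link per wheel."""
    # "+" must be percent-encoded in the URL; quote() turns it into %2B.
    links = "\n".join(
        f'<a href="{base_url}/{quote(name)}">{name}</a><br/>'
        for name in wheel_names
    )
    return f"<!DOCTYPE html>\n<html><body>\n{links}\n</body></html>\n"
```

With this shape, one index page per key of group_by_variant(...) would give each variant its own sub-index while every wheel file is stored only once.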

More nits:

  • The CUDA 12.8 build is removed from CI, as discussed with @youkaichao.
  • VLLM_MAIN_CUDA_VERSION is bumped to 12.9 to avoid confusion.

Test Plan

These are all CI changes, so they are tested by running CI.

Test Result

  • release-pipeline has passed.
  • test-pipeline will probably fail on python_only_compile.sh, which tests VLLM_USE_PRECOMPILED with build name nightly. However, before this PR is merged to main, no new indices and metadata will be uploaded to /nightly/.

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@mergify mergify bot added the ci/build label Nov 28, 2025
@Harry-Chen Harry-Chen force-pushed the nightly-wheel-reno branch 4 times, most recently from 1ca284e to 82e406e Compare November 29, 2025 03:24
@Harry-Chen Harry-Chen marked this pull request as ready for review November 29, 2025 05:34
@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.

Comment on lines +14 to +20
# detect if python3.10+ is available
has_new_python=$($PYTHON -c "print(1 if __import__('sys').version_info >= (3,10) else 0)")
if [[ "$has_new_python" -eq 0 ]]; then
# use new python from docker
docker pull python:3-slim
PYTHON="docker run --rm -v $(pwd):/app -w /app python:3-slim python3"
fi
Member

just assert python >= 3.10?

Contributor Author

This will not work on our build agents, which only have python 3.8 installed. We have to provide some shim.

Member

@khluu can we directly upgrade our build agent?

setup.py Outdated
Comment on lines 675 to 676
# try to fetch the wheel metadata from the nightly wheel repo
variant = "cu" + envs.VLLM_MAIN_CUDA_VERSION.replace(".", "")
Member

we can introduce a new env var like VLLM_PRECOMPILED_VARIANT to explicitly set to cu129/cu130, or potentially extended for cpu in the future (and empty by default). no need to be hard-coded and coupled with VLLM_MAIN_CUDA_VERSION

Contributor Author

Yes, if the indices work fine, then we can remove the "CUDA only" assertion from setup.py and download variant-specific wheels from the nightly index.
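
The suggestion in this thread could look roughly like the sketch below. It is an illustration only, not what setup.py does: VLLM_PRECOMPILED_VARIANT is the variable name proposed above, and falling back to the value derived from VLLM_MAIN_CUDA_VERSION when it is empty is an assumption made here so that the current behavior is preserved. The function name resolve_variant is hypothetical.

```python
import os


def resolve_variant(main_cuda_version: str, env=None) -> str:
    """Pick the precompiled-wheel variant to download.

    An explicit VLLM_PRECOMPILED_VARIANT (e.g. "cu130" or "cpu") wins.
    Otherwise fall back to "cu" + the main CUDA version without dots
    (this fallback is an assumption of this sketch).
    """
    env = os.environ if env is None else env
    explicit = env.get("VLLM_PRECOMPILED_VARIANT", "").strip()
    if explicit:
        return explicit
    return "cu" + main_cuda_version.replace(".", "")
```

Passing env explicitly keeps the function easy to test; in real use it would read os.environ.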

Member

@youkaichao youkaichao left a comment

thanks for the great effort! LGTM overall. left two comments.

there's also a failing test here: https://buildkite.com/vllm/ci/builds/41163/steps/canvas?jid=019ad91a-3bdd-4e5a-930f-00f9582e79a6#019ad91a-3bdd-4e5a-930f-00f9582e79a6/7-7331

@Harry-Chen
Contributor Author

there's also a failing test here: https://buildkite.com/vllm/ci/builds/41163/steps/canvas?jid=019ad91a-3bdd-4e5a-930f-00f9582e79a6#019ad91a-3bdd-4e5a-930f-00f9582e79a6/7-7331

Please see my PR description :-):

  • test-pipeline will probably fail on python_only_compile.sh, which tests VLLM_USE_PRECOMPILED with build name nightly. However, before this PR is merged to main, no new indices and metadata will be uploaded to /nightly/.

So it will probably be automatically fixed once we have one commit building this new type of index.

Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
…download precompiled wheels

Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
…ult variant is found

Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
…load

Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
…piled wheels

Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
…ace hardcoding

Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
@mergify

mergify bot commented Dec 1, 2025

Documentation preview: https://vllm--29690.org.readthedocs.build/en/29690/

@mergify mergify bot added the documentation Improvements or additions to documentation label Dec 1, 2025
@mergify mergify bot added the nvidia label Dec 1, 2025
@github-project-automation github-project-automation bot moved this to In review in NVIDIA Dec 1, 2025
@youkaichao
Member

as discussed, we need to merge this PR to fix the python-only installation test.

@youkaichao youkaichao merged commit 36db0a3 into vllm-project:main Dec 1, 2025
91 of 93 checks passed
@github-project-automation github-project-automation bot moved this from In review to Done in NVIDIA Dec 1, 2025
kitaekatt pushed a commit to kitaekatt/vllm that referenced this pull request Dec 1, 2025
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
khluu added a commit that referenced this pull request Dec 1, 2025
simon-mo pushed a commit that referenced this pull request Dec 1, 2025
Harry-Chen added a commit to Harry-Chen/vllm that referenced this pull request Dec 2, 2025
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Harry-Chen added a commit to Harry-Chen/vllm that referenced this pull request Dec 2, 2025
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Harry-Chen added a commit to Harry-Chen/vllm that referenced this pull request Dec 2, 2025
…LLM_MAIN_CUDA_VERSION to 12.9)

Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
amd-hhashemi pushed a commit to amd-hhashemi/vllm that referenced this pull request Dec 2, 2025
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>
Labels

ci/build documentation Improvements or additions to documentation nvidia

Projects

Status: Done

2 participants