@bigPYJ1151
Member

@bigPYJ1151 bigPYJ1151 commented Nov 17, 2025

Purpose

  • Add an x86 CPU wheel release pipeline; for now, each wheel is ~30 MB
  • Separate CPU wheels (both x86 and Arm) into vllm-wheels/cpu

After #29838 was merged, we only need to add an x86 CPU release pipeline.

With this change, CPU wheels can be installed via:

  • Regular release version:
uv pip install vllm=="$VERSION+cpu" --extra-index-url "https://wheels.vllm.ai/$VERSION"
  • Specific commits:
uv pip install vllm=="$DEV_VERSION" --extra-index-url "https://wheels.vllm.ai/$COMMIT_ID"

Test Plan

Verified the scripts locally.

Test Result



@bigPYJ1151 bigPYJ1151 requested a review from khluu November 17, 2025 09:43
@bigPYJ1151 bigPYJ1151 added the x86-cpu, cpu, and aarch64-cpu labels and removed the nvidia label Nov 17, 2025
@mergify mergify bot added the nvidia label Nov 17, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a new CI pipeline for building and releasing x86 CPU wheels and correctly separates them into a vllm-wheels/cpu path in the S3 bucket. The changes are well-structured and mostly correct. However, I found a significant issue in the upload-wheels.sh script where it copies an incorrect index file for CPU wheel builds. This could lead to installation problems for users. I've provided a specific comment with a suggested fix to address this.

aws s3 cp "s3://vllm-wheels/nightly/index.html" "s3://vllm-wheels/$BUILDKITE_COMMIT/index.html"
# also upload cpu wheels as is available on both x86 and arm64
aws s3 cp index.html "s3://$ROOT_PATH/$BUILDKITE_COMMIT/vllm/index.html"
aws s3 cp "s3://vllm-wheels/nightly/index.html" "s3://$ROOT_PATH/$BUILDKITE_COMMIT/index.html"
Contributor


high

There is an issue with this aws s3 cp command. For CPU wheel builds, $ROOT_PATH will be vllm-wheels/cpu, but the source S3 path is hardcoded to s3://vllm-wheels/nightly/index.html, which is the index for CUDA wheels. This will result in copying the CUDA wheel index into the CPU wheel commit-specific directory, leading to incorrect package resolution for users.

The source path should also use the $ROOT_PATH variable to ensure the correct index is copied for CPU builds. Additionally, it's good practice to handle the case where the nightly index might not exist yet (e.g., for the first build of a day) to prevent the step from failing.

Suggested change
- aws s3 cp "s3://vllm-wheels/nightly/index.html" "s3://$ROOT_PATH/$BUILDKITE_COMMIT/index.html"
+ aws s3 cp "s3://$ROOT_PATH/nightly/index.html" "s3://$ROOT_PATH/$BUILDKITE_COMMIT/index.html" || true
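
A minimal dry-run sketch of the suggestion above, so the resulting paths can be inspected without AWS credentials. The helper name `index_copy_cmd` and the commit id `abc123` are hypothetical, and the sketch assumes a `nightly/index.html` key exists under `$ROOT_PATH`, as the suggestion implies; the real upload-wheels.sh may differ.

```shell
# Dry-run sketch: print the copy command instead of running it.
# index_copy_cmd is a hypothetical helper, not part of upload-wheels.sh.
index_copy_cmd() {
    # $1: root path, e.g. "vllm-wheels" or "vllm-wheels/cpu"
    # $2: commit id (the value of $BUILDKITE_COMMIT)
    # Source and destination share the same root, so a CPU build
    # does not pick up the CUDA index.
    printf 'aws s3 cp "s3://%s/nightly/index.html" "s3://%s/%s/index.html"\n' \
        "$1" "$1" "$2"
}

index_copy_cmd "vllm-wheels/cpu" "abc123"
```

Appending `|| true` to the real command, as suggested, keeps the step from failing when the nightly index is missing, but it also hides other failures such as permission errors.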

@bigPYJ1151 bigPYJ1151 removed this from NVIDIA Nov 17, 2025
@bigPYJ1151
Member Author

Hi @khluu Could you please help to check this PR? Thanks!

@khluu
Collaborator

khluu commented Nov 19, 2025

The changes on release-pipeline.yaml look good to me. Triggered a release run to verify: https://buildkite.com/vllm/release/builds/10316
For the changes to nightly index, @youkaichao can you help review it?

@bigPYJ1151
Member Author

Thanks a lot @khluu
The job is finished, and I have verified the index. Now the dev wheel can be installed via

uv pip install vllm==0.11.2.dev20+gef967c682.cpu --extra-index-url=https://wheels.vllm.ai/cpu/ef967c682b9720859f60c55bd317edddc7db928a

The CPU wheel index is valid. For CUDA, there should be no change.

ROOT_PATH="vllm-wheels"

if [[ $version == *cpu* ]]; then
ROOT_PATH="$ROOT_PATH/cpu"
Member


I've been thinking about a refactor recently. We should add directories inside s3://vllm-wheels/$BUILDKITE_COMMIT/, e.g. s3://vllm-wheels/$BUILDKITE_COMMIT/cpu/.

Member Author

@bigPYJ1151 bigPYJ1151 Nov 25, 2025


Updated.
Now the CPU wheels and index are uploaded to s3://vllm-wheels/$BUILDKITE_COMMIT/cpu/ and s3://vllm-wheels/$BUILDKITE_COMMIT/cpu/vllm, respectively.
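
The updated layout could be sketched as a small path helper. `wheel_prefix` is a hypothetical name; the `*cpu*` version match mirrors the check in the snippet under review, written here as a `case` statement, and the logic in the merged script may differ.

```shell
# Hypothetical helper: choose the S3 prefix for a wheel from its version.
# CPU wheels go under <commit>/cpu/, as described in the comment above;
# everything else stays directly under <commit>/.
wheel_prefix() {
    # $1: wheel version string
    # $2: commit id (the value of $BUILDKITE_COMMIT)
    case "$1" in
        *cpu*) printf 's3://vllm-wheels/%s/cpu/\n' "$2" ;;
        *)     printf 's3://vllm-wheels/%s/\n' "$2" ;;
    esac
}

wheel_prefix "0.11.2.dev20+gef967c682.cpu" "ef967c682b9720859f60c55bd317edddc7db928a"
wheel_prefix "0.11.2.dev20+gef967c682" "ef967c682b9720859f60c55bd317edddc7db928a"
```

Keeping the platform directory inside the commit directory means every artifact for a given commit shares one root, which is the direction proposed in the review comment.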

Signed-off-by: jiang1.li <jiang1.li@intel.com>
@bigPYJ1151 bigPYJ1151 changed the title from "[CI/Build] Add x86 CPU wheel release pipeline and seperate CPU wheels with CUDA wheels" to "[CI/Build] Add x86 CPU wheel release pipeline" Dec 2, 2025

Labels

aarch64-cpu, ci/build, cpu, nvidia, x86-cpu
