Commit 3108287

doc: update instructions on installing from renovated nightly index
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
1 parent fcf323d

2 files changed (+27, −37 lines)

docs/getting_started/installation/cpu.md

Lines changed: 4 additions & 1 deletion
@@ -46,7 +46,10 @@ vLLM is a Python library that supports the following CPU variants. Select your C
 ### Pre-built wheels
 
-Currently, there are no pre-built CPU wheels.
+Please refer to the instructions for [pre-built wheels on GPU](./gpu.md#pre-built-wheels).
+
+When specifying the index URL, please make sure to use the `cpu` variant subdirectory.
+For example, the nightly build index is: `https://wheels.vllm.ai/nightly/cpu/`.
 
 ### Build wheel from source

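For illustration, a minimal sketch of the workflow these new lines describe, combining the `uv` command from the GPU page with the `cpu` variant index named above:

```bash
# Sketch: install a nightly CPU build of vLLM from the cpu variant subdirectory.
uv pip install -U vllm --extra-index-url https://wheels.vllm.ai/nightly/cpu/
```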
docs/getting_started/installation/gpu.cuda.inc.md

Lines changed: 23 additions & 36 deletions
@@ -26,43 +26,50 @@ uv pip install vllm --torch-backend=auto
 ??? console "pip"
     ```bash
-    # Install vLLM with CUDA 12.8.
-    pip install vllm --extra-index-url https://download.pytorch.org/whl/cu128
+    # Install vLLM with CUDA 12.9.
+    pip install vllm --extra-index-url https://download.pytorch.org/whl/cu129
     ```
 
-We recommend leveraging `uv` to [automatically select the appropriate PyTorch index at runtime](https://docs.astral.sh/uv/guides/integration/pytorch/#automatic-backend-selection) by inspecting the installed CUDA driver version via `--torch-backend=auto` (or `UV_TORCH_BACKEND=auto`). To select a specific backend (e.g., `cu126`), set `--torch-backend=cu126` (or `UV_TORCH_BACKEND=cu126`). If this doesn't work, try running `uv self update` to update `uv` first.
+We recommend leveraging `uv` to [automatically select the appropriate PyTorch index at runtime](https://docs.astral.sh/uv/guides/integration/pytorch/#automatic-backend-selection) by inspecting the installed CUDA driver version via `--torch-backend=auto` (or `UV_TORCH_BACKEND=auto`). To select a specific backend (e.g., `cu128`), set `--torch-backend=cu128` (or `UV_TORCH_BACKEND=cu128`). If this doesn't work, try running `uv self update` to update `uv` first.
 
 !!! note
     NVIDIA Blackwell GPUs (B200, GB200) require a minimum of CUDA 12.8, so make sure you are installing PyTorch wheels with at least that version. PyTorch itself offers a [dedicated interface](https://pytorch.org/get-started/locally/) to determine the appropriate pip command to run for a given target configuration.
 
-As of now, vLLM's binaries are compiled with CUDA 12.8 and public PyTorch release versions by default. We also provide vLLM binaries compiled with CUDA 12.6, 11.8, and public PyTorch release versions:
+As of now, vLLM's binaries are compiled with CUDA 12.9 and public PyTorch release versions by default. We also provide vLLM binaries compiled with CUDA 12.8, 13.0, and public PyTorch release versions:
 
 ```bash
-# Install vLLM with a specific CUDA version (e.g., 11.8 or 12.6).
+# Install vLLM with a specific CUDA version (e.g., 13.0).
 export VLLM_VERSION=$(curl -s https://api.github.com/repos/vllm-project/vllm/releases/latest | jq -r .tag_name | sed 's/^v//')
-export CUDA_VERSION=118 # or 126
-uv pip install https://github.com/vllm-project/vllm/releases/download/v${VLLM_VERSION}/vllm-${VLLM_VERSION}+cu${CUDA_VERSION}-cp38-abi3-manylinux1_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu${CUDA_VERSION}
+export CUDA_VERSION=130 # or another provided version
+uv pip install https://github.com/vllm-project/vllm/releases/download/v${VLLM_VERSION}/vllm-${VLLM_VERSION}+cu${CUDA_VERSION}-cp38-abi3-manylinux_2_31_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu${CUDA_VERSION}
 ```
 
 #### Install the latest code
 
-LLM inference is a fast-evolving field, and the latest code may contain bug fixes, performance improvements, and new features that are not released yet. To allow users to try the latest code without waiting for the next release, vLLM provides wheels for Linux running on an x86 platform with CUDA 12 for every commit since `v0.5.3`.
+LLM inference is a fast-evolving field, and the latest code may contain bug fixes, performance improvements, and new features that are not released yet. To allow users to try the latest code without waiting for the next release, vLLM provides wheels for every commit since `v0.5.3` on <https://wheels.vllm.ai/nightly>. There are multiple indices that can be used:
+
+* `https://wheels.vllm.ai/nightly`: the default variant (CUDA, with the version specified in `VLLM_MAIN_CUDA_VERSION`; currently CUDA 12.9), built from the latest commit on the `main` branch.
+* `https://wheels.vllm.ai/nightly/<variant>`: all other variants; currently this includes `cu130` and `cpu`. The default variant (`cu129`) also has its own subdirectory for consistency.
+
+To install from the nightly index, run:
 
 ```bash
 uv pip install -U vllm \
     --torch-backend=auto \
-    --extra-index-url https://wheels.vllm.ai/nightly
+    --extra-index-url https://wheels.vllm.ai/nightly # add the variant subdirectory here if needed
 ```
 
-??? console "pip"
+!!! warning "`pip` caveat"
+
+    Using `pip` to install from nightly indices is _not supported_, because `pip` combines packages from `--extra-index-url` and the default index and chooses only the latest version, which makes it difficult to install a development version older than the released version. In contrast, `uv` gives the extra index [higher priority than the default index](https://docs.astral.sh/uv/pip/compatibility/#packages-that-exist-on-multiple-indexes).
+
+    If you insist on using `pip`, you have to specify the full URL of the wheel file (which can be obtained from the web page).
+
     ```bash
-    pip install -U vllm \
-        --pre \
-        --extra-index-url https://wheels.vllm.ai/nightly
+    pip install -U https://wheels.vllm.ai/nightly/vllm-0.11.2.dev399%2Bg3c7461c18-cp38-abi3-manylinux_2_31_x86_64.whl # current nightly build (the filename will change!)
+    pip install -U https://wheels.vllm.ai/${VLLM_COMMIT}/vllm-0.11.2.dev399%2Bg3c7461c18-cp38-abi3-manylinux_2_31_x86_64.whl # wheel from a specific commit
     ```
 
-    `--pre` is required for `pip` to consider pre-released versions.
-
 ##### Install specific revisions
 
 If you want to access the wheels for previous commits (e.g. to bisect the behavior change, performance regression), you can specify the commit hash in the URL:
@@ -71,29 +78,9 @@ If you want to access the wheels for previous commits (e.g. to bisect the behavi
 export VLLM_COMMIT=72d9c316d3f6ede485146fe5aabd4e61dbc59069 # use full commit hash from the main branch
 uv pip install vllm \
     --torch-backend=auto \
-    --extra-index-url https://wheels.vllm.ai/${VLLM_COMMIT}
+    --extra-index-url https://wheels.vllm.ai/${VLLM_COMMIT} # add the variant subdirectory here if needed
 ```
 
-The `uv` approach works for vLLM `v0.6.6` and later and offers an easy-to-remember command. A unique feature of `uv` is that packages in `--extra-index-url` have [higher priority than the default index](https://docs.astral.sh/uv/pip/compatibility/#packages-that-exist-on-multiple-indexes). If the latest public release is `v0.6.6.post1`, `uv`'s behavior allows installing a commit before `v0.6.6.post1` by specifying the `--extra-index-url`. In contrast, `pip` combines packages from `--extra-index-url` and the default index, choosing only the latest version, which makes it difficult to install a development version prior to the released version.
-
-??? note "pip"
-    If you want to access the wheels for previous commits (e.g. to bisect the behavior change,
-    performance regression), due to the limitation of `pip`, you have to specify the full URL of the
-    wheel file by embedding the commit hash in the URL:
-
-    ```bash
-    export VLLM_COMMIT=33f460b17a54acb3b6cc0b03f4a17876cff5eafd # use full commit hash from the main branch
-    pip install https://wheels.vllm.ai/${VLLM_COMMIT}/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl
-    ```
-
-    Note that the wheels are built with Python 3.8 ABI (see [PEP
-    425](https://peps.python.org/pep-0425/) for more details about ABI), so **they are compatible
-    with Python 3.8 and later**. The version string in the wheel file name (`1.0.0.dev`) is just a
-    placeholder to have a unified URL for the wheels, the actual versions of wheels are contained in
-    the wheel metadata (the wheels listed in the extra index url have correct versions). Although we
-    don't support Python 3.8 any more (because PyTorch 2.5 dropped support for Python 3.8), the
-    wheels are still built with Python 3.8 ABI to keep the same wheel name as before.
-
 # --8<-- [end:pre-built-wheels]
 # --8<-- [start:build-wheel-from-source]

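For illustration, a minimal sketch of how the variant subdirectories mentioned in the comments above would be used (the `cu130` variant and the commit hash are taken from this diff; any other listed variant works the same way):

```bash
# Sketch: nightly wheels for a non-default variant, using its subdirectory as the index.
uv pip install -U vllm \
    --torch-backend=auto \
    --extra-index-url https://wheels.vllm.ai/nightly/cu130

# Sketch: per-commit wheels follow the same layout.
export VLLM_COMMIT=72d9c316d3f6ede485146fe5aabd4e61dbc59069 # full commit hash from the main branch
uv pip install vllm \
    --torch-backend=auto \
    --extra-index-url https://wheels.vllm.ai/${VLLM_COMMIT}/cu130
```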