An installation for RTX6000 Blackwell Edition #29
Hi,
I just want to share my environment setup for the RTX6000 Blackwell Edition here, which might also help others run on Blackwell GPUs.
Here is the list of package versions:
Package Version
---------------------------------- -------------
absl-py 2.4.0
accelerate 1.12.0
aiohappyeyeballs 2.6.1
aiohttp 3.13.3
aiohttp-cors 0.8.1
aiosignal 1.4.0
annotated-doc 0.0.4
annotated-types 0.7.0
antlr4-python3-runtime 4.9.3
anyio 4.12.1
astor 0.8.1
attrs 26.1.0
blake3 1.0.8
cachetools 7.0.5
cbor2 5.9.0
certifi 2026.2.25
cffi 2.0.0
cfgv 3.5.0
charset-normalizer 3.4.6
click 8.3.1
cloudpickle 3.1.2
codetiming 1.4.0
colorful 0.5.8
compressed-tensors 0.10.2
cryptography 46.0.5
cuda-pathfinder 1.4.3
cupy-cuda12x 14.0.1
datasets 4.4.2
depyf 0.19.0
dill 0.4.0
diskcache 5.6.3
distlib 0.4.0
distro 1.9.0
dnspython 2.8.0
einops 0.8.2
email-validator 2.3.0
fastapi 0.127.0
fastapi-cli 0.0.24
fastapi-cloud-cli 0.15.0
fastar 0.9.0
filelock 3.25.2
flash_attn 2.8.3
frozenlist 1.8.0
fsspec 2025.10.0
gguf 0.18.0
gitdb 4.0.12
GitPython 3.1.46
google-api-core 2.30.0
google-auth 2.49.1
googleapis-common-protos 1.73.0
grpcio 1.78.0
h11 0.16.0
hf-xet 1.4.2
httpcore 1.0.9
httptools 0.7.1
httpx 0.28.1
huggingface_hub 0.36.2
hydra-core 1.3.2
identify 2.6.18
idna 3.11
importlib_metadata 8.7.1
interegular 0.3.3
Jinja2 3.1.6
jiter 0.13.0
jsonschema 4.26.0
jsonschema-specifications 2025.9.1
lark 1.2.2
latex2sympy2 1.5.4
latex2sympy2_extended 1.10.2
liger_kernel 0.7.0
llguidance 0.7.30
llvmlite 0.44.0
lm-format-enforcer 0.10.12
Markdown 3.10.2
markdown-it-py 4.0.0
MarkupSafe 3.0.2
math-verify 0.8.0
mdurl 0.1.2
mistral_common 1.10.0
mpmath 1.3.0
msgpack 1.1.2
msgspec 0.20.0
multidict 6.7.1
multiprocess 0.70.18
networkx 3.6.1
ninja 1.13.0
nodeenv 1.10.0
numba 0.61.2
numpy 1.26.4
nvidia-cublas-cu12 12.8.4.1
nvidia-cuda-cupti-cu12 12.8.90
nvidia-cuda-nvrtc-cu12 12.8.93
nvidia-cuda-runtime-cu12 12.8.90
nvidia-cudnn-cu12 9.10.2.21
nvidia-cufft-cu12 11.3.3.83
nvidia-cufile-cu12 1.13.1.3
nvidia-curand-cu12 10.3.9.90
nvidia-cusolver-cu12 11.7.3.90
nvidia-cusparse-cu12 12.5.8.93
nvidia-cusparselt-cu12 0.7.1
nvidia-nccl-cu12 2.27.3
nvidia-nvjitlink-cu12 12.8.93
nvidia-nvtx-cu12 12.8.90
omegaconf 2.3.0
openai 2.29.0
openai-harmony 0.0.8
opencensus 0.11.4
opencensus-context 0.1.3
opencv-python-headless 4.13.0.92
opentelemetry-api 1.40.0
opentelemetry-exporter-prometheus 0.61b0
opentelemetry-proto 1.40.0
opentelemetry-sdk 1.40.0
opentelemetry-semantic-conventions 0.61b0
orjson 3.11.7
outlines_core 0.2.10
packaging 26.0
pandas 2.3.3
partial-json-parser 0.2.1.1.post7
peft 0.18.0
pillow 12.0.0
pip 26.0.1
platformdirs 4.9.4
pre_commit 4.5.1
prometheus_client 0.24.1
prometheus-fastapi-instrumentator 7.1.0
propcache 0.4.1
proto-plus 1.27.1
protobuf 6.33.6
psutil 7.2.2
py-cpuinfo 9.0.0
py-spy 0.4.1
pyarrow 22.0.0
pyasn1 0.6.3
pyasn1_modules 0.4.2
pybase64 1.4.3
pybind11 3.0.1
pycountry 26.2.16
pycparser 3.0
pydantic 2.12.5
pydantic_core 2.41.5
pydantic-extra-types 2.11.1
pydantic-settings 2.13.1
Pygments 2.19.2
pylatexenc 2.10
python-dateutil 2.9.0.post0
python-discovery 1.2.0
python-dotenv 1.2.2
python-json-logger 4.0.0
python-multipart 0.0.22
pytz 2026.1.post1
pyvers 0.1.0
PyYAML 6.0.3
pyzmq 27.1.0
ray 2.53.0
referencing 0.37.0
regex 2026.2.28
requests 2.32.5
rich 14.3.3
rich-toolkit 0.19.7
rignore 0.7.6
rpds-py 0.30.0
safetensors 0.7.0
scipy 1.17.1
sentencepiece 0.2.1
sentry-sdk 2.55.0
setproctitle 1.3.7
setuptools 82.0.1
setuptools-scm 9.2.2
sglang 0.5.2
shellingham 1.5.4
six 1.17.0
smart_open 7.5.1
smmap 5.0.3
sniffio 1.3.1
soundfile 0.13.1
soxr 1.0.0
starlette 0.50.0
sympy 1.14.0
tensorboard 2.20.0
tensorboard-data-server 0.7.2
tensordict 0.10.0
tiktoken 0.12.0
tokenizers 0.22.2
torch 2.8.0+cu128
torchaudio 2.8.0+cu128
torchdata 0.11.0
torchvision 0.23.0+cu128
tqdm 4.67.3
transformers 4.57.1
triton 3.4.0
typer 0.24.1
typing_extensions 4.15.0
typing-inspection 0.4.2
tzdata 2025.3
urllib3 2.6.3
uvicorn 0.40.0
uvloop 0.22.1
verl 0.7.0.dev0
virtualenv 21.2.0
vllm 0.11.0
wandb 0.23.1
watchfiles 1.1.1
websockets 16.0
Werkzeug 3.1.6
wheel 0.46.3
word2number 1.1
wrapt 2.1.2
xformers 0.0.31
xgrammar 0.1.21
xxhash 3.6.0
yarl 1.23.0
zipp 3.23.0
I'm running on a SLURM node with 4 RTX6000 GPUs. For flash-attn, I use these commands to install:
module load cuda/12.8.1 cudnn/12.8-v9.10.0 gcc/11.2.0
FLASH_ATTENTION_FORCE_BUILD=TRUE MAX_JOBS=8 pip install -v --no-build-isolation --no-cache-dir --no-binary flash-attn "flash-attn==2.8.3"
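A hedged sketch of how I'd verify the source build afterwards — the `TORCH_CUDA_ARCH_LIST` value is my assumption (the Blackwell RTX6000 reports compute capability 12.0 / sm_120; recent cu128 torch wheels may already include it, so the export may be unnecessary):

```shell
# Assumption: set before the pip build so nvcc targets Blackwell (sm_120)
export TORCH_CUDA_ARCH_LIST="12.0"

# Sanity checks after the install above finishes: the extension should
# import cleanly and report the pinned version
python -c "import flash_attn; print(flash_attn.__version__)"
python -c "import torch; print(torch.cuda.get_device_capability(0))"
```

If the import fails with an undefined-symbol error, the build most likely picked up a CUDA toolchain different from the one torch was compiled against.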
For vLLM, I use this command to install (if I remember correctly):
pip install vllm --no-deps
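The point of `--no-deps` is that pip won't pull in vLLM's own pinned torch and overwrite the cu128 build already in the environment. A hedged sketch of the same step with a version pin (matching the list above) and a post-install consistency check:

```shell
# Pin to the version from the package list; --no-deps leaves the existing
# torch 2.8.0+cu128 stack untouched
pip install "vllm==0.11.0" --no-deps

# pip check reports any dependency conflicts --no-deps may have left behind
pip check
python -c "import vllm; print(vllm.__version__)"
```

`pip check` will usually flag a few harmless mismatches after a `--no-deps` install; the import succeeding is the more meaningful signal.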
When running the training script, I pass these configurations to avoid OOM error:
actor_rollout_ref.actor.fsdp_config.model_dtype=bfloat16 \
actor_rollout_ref.ref.fsdp_config.model_dtype=bfloat16 \
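For context, these overrides keep the actor and reference model weights in bf16 rather than fp32 under FSDP, roughly halving their memory footprint. A hypothetical launch sketch — the entrypoint and any other flags are placeholders, only the two overrides are from this post:

```shell
# Placeholder launch command; only the two model_dtype overrides are real
python -m verl.trainer.main_ppo \
    actor_rollout_ref.actor.fsdp_config.model_dtype=bfloat16 \
    actor_rollout_ref.ref.fsdp_config.model_dtype=bfloat16
```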
I'm still reproducing the experiments, but I hope this helps others.