ventus-pytorch is a Ventus-enabled PyTorch fork for running small-model inference on the Ventus software stack. This repository includes the PyTorch-side integration, Ventus kernel assets, and a minimal QEMU runtime image for reproducible execution.
Do not commit the runtime image into git. Publish it through GitHub Releases instead.
Recommended release assets:
- `ventus-runtime/ventus-runtime.qcow2`
- `ventus-runtime/Makefile`
Users should download the runtime image from GitHub Releases and place it at `ventus-runtime/ventus-runtime.qcow2` so that `make qemu` can find it directly.
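A quick sanity check that the image sits where `make qemu` expects it (a sketch; the release download itself must be done manually from the Releases page):

```shell
# Check the expected location before running make qemu.
if [ -f ventus-runtime/ventus-runtime.qcow2 ]; then
    echo "runtime image found"
else
    echo "runtime image missing: download it from GitHub Releases" >&2
fi
```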
The qcow2 image is prepared as a minimal runtime VM. On boot it:
- logs in automatically on the serial console as `root`
- enters `/opt/ventus`
- sources `./env.sh`
For stable execution, allocate at least:
- 10 CPU cores
- 8G memory
rtlsim was built from a Verilator configuration using 8 threads, so undersized CPU allocation is not recommended.
The provided Makefile defaults are higher than the minimum.
From `ventus-runtime/`:

```shell
make qemu
```

Or specify resources explicitly:

```shell
make qemu VM_CPUS=10 VM_MEMORY=8G
```

After boot, the shell is already in `/opt/ventus` with `env.sh` loaded.
The runtime image keeps the execution environment small:
- `/opt/ventus/install/`
- `/opt/ventus/ventus-pytorch/torch/`
- `/opt/ventus/models/gpt2/run_generate.py`
- `/opt/ventus/models/pythia-410m-deduped/run_generate.py`
- `/opt/ventus/models/qwen2.5-0.5b-instruct/run_generate.py`
Model weights are intentionally not bundled into the qcow2 image. Before
running a model, download its Hugging Face checkpoint files into the matching
directory under /opt/ventus/models/.
Examples:
- `/opt/ventus/models/gpt2/`
- `/opt/ventus/models/pythia-410m-deduped/`
- `/opt/ventus/models/qwen2.5-0.5b-instruct/`

Each directory should contain the model files plus the packaged `run_generate.py`.
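Before launching, it can help to confirm a model directory actually contains weights alongside the packaged script. A minimal sketch (the exact required file names depend on the checkpoint; `config.json` here is an assumption, only `run_generate.py` is guaranteed by the layout above):

```python
from pathlib import Path

def missing_model_files(model_dir, required=("run_generate.py", "config.json")):
    """Return the names from `required` that are absent from model_dir."""
    d = Path(model_dir)
    return [name for name in required if not (d / name).exists()]

# Example usage against the layout described above:
# missing = missing_model_files("/opt/ventus/models/gpt2")
# if missing:
#     raise SystemExit(f"download the checkpoint first; missing: {missing}")
```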
Native Ventus inference has been validated in three precisions:
- GPT-2: TF32
- Pythia-410m-deduped: FP16
- Qwen2.5-0.5B-Instruct: BF16
Current reference results:
- GPT-2 remains the minimal text-generation smoke test.
- Qwen2.5-0.5B-Instruct on `rtlsim` generated one token from "Hello, my name is" and produced "Hello, my name is Alex" in about 7:00:56.18.
- Pythia-410m-deduped on `rtlsim` generated one token from "Hello, my name is" and produced "Hello, my name is John" in about 4:54:40.50.
Pythia-410m-deduped currently runs in FP16, but stability is not yet sufficient for KV-cache-based decoding; keep `use_cache=False` for this model. With the KV cache enabled, small numerical errors can accumulate and propagate into NaN values.
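The failure mode can be illustrated with a toy float16 computation (an illustrative sketch, not the actual Ventus kernel path): once an intermediate value overflows the float16 range it becomes inf, and inf minus inf yields NaN, which then propagates through every later step.

```python
import numpy as np

# float16 tops out at ~65504, so a modestly large intermediate overflows.
x = np.float16(60000.0)
y = x * np.float16(2.0)          # overflows to inf
print(np.isinf(y))               # True

# inf - inf is NaN, and NaN propagates through all later arithmetic.
z = y - y
print(np.isnan(z))               # True
print(np.isnan(z + np.float16(1.0)))  # True
```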
Inside the VM:

```shell
export VENTUS_BACKEND=spike
python models/gpt2/run_generate.py
python models/pythia-410m-deduped/run_generate.py
python models/qwen2.5-0.5b-instruct/run_generate.py
```

For rtlsim, switch the backend before launching:

```shell
export VENTUS_BACKEND=rtlsim
python models/qwen2.5-0.5b-instruct/run_generate.py
```
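The backend switch above is just an environment variable read. A minimal sketch of the convention (the actual lookup inside ventus-pytorch may differ; the function name here is hypothetical):

```python
import os

def ventus_backend(default="spike"):
    """Return the requested Ventus backend, validating against known values.

    Sketch of the VENTUS_BACKEND convention described above.
    """
    backend = os.environ.get("VENTUS_BACKEND", default)
    if backend not in ("spike", "rtlsim"):
        raise ValueError(f"unknown VENTUS_BACKEND: {backend!r}")
    return backend

os.environ["VENTUS_BACKEND"] = "rtlsim"
print(ventus_backend())  # rtlsim
```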