-
-
Notifications
You must be signed in to change notification settings - Fork 1
Open
Description
As hoped, xllamacpp binaries now available for non-CUDA GPUs, providing a more modern alternative to GPT4all.
The relevant indexes are:
| Device Type | xllamacpp Build | Index |
|---|---|---|
| NVIDIA | CUDA | https://xorbitsai.github.io/xllamacpp/whl/cu128 |
| AMD | Vulkan* | https://xorbitsai.github.io/xllamacpp/whl/vulkan |
| Intel (XPU) | Vulkan | https://xorbitsai.github.io/xllamacpp/whl/vulkan |
| Apple Silicon | Metal | Default (PyPI) |
* ROCm builds are available for Linux but are not as performant (see xllamacpp#61 (comment))
XLllamaCPP inference is normally handled by Xorbits Inference. However, this is a very dependency-heavy package and as a result you may wish to consider utilizing the basic test code as a starter template. Note that as of 0.2.6 xllamacpp also supports structured JSON outputs.
Update 12/1: support for structured JSON outputs
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels