inference-hive is a toolkit for running distributed LLM inference on SLURM clusters. Configure a few cluster, inference-server, and data settings, and scale your inference workload across thousands of GPUs.
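The toolkit's own configuration format isn't shown here, but the general shape of a distributed inference workload on SLURM can be sketched as a batch script: one inference server per node, with each task handling its own shard of the data. The model name, flags, and the `run_inference.py` client below are illustrative placeholders, assuming vLLM's standard CLI.

```shell
#!/bin/bash
#SBATCH --job-name=llm-inference
#SBATCH --nodes=4
#SBATCH --gpus-per-node=8
#SBATCH --time=04:00:00

# One vLLM server per node; each node uses all 8 local GPUs via tensor parallelism.
srun --ntasks-per-node=1 bash -c '
  vllm serve meta-llama/Llama-3.1-8B-Instruct \
    --tensor-parallel-size 8 \
    --port 8000 &
  # Each task processes its own shard of the input data against its local server.
  # (run_inference.py and the sharding flags are hypothetical placeholders.)
  python run_inference.py --shard "$SLURM_PROCID" --num-shards "$SLURM_NNODES"
'
```

A toolkit like this typically wraps exactly this pattern: it generates the sbatch script, health-checks the servers, and shards the dataset across tasks so you only supply the cluster and data settings.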
Topics: chat, data, scale, offline, cluster, completion, slurm, inference, distributed, openai, synthetic, large, huggingface, llm, vllm, sglang, eurohpc, openeurollm
Updated Mar 10, 2026 · Python