Cheesebrain is a high-performance C/C++ runtime for Large Language Model (LLM) inference. It is designed to be small, fast, and self-contained so you can run modern GGUF models on laptops, workstations, and servers with minimal setup.
Cheesebrain is a standalone project rather than a derivative of cheese.cpp. It focuses on:
- Portability – a single codebase that runs well on Linux, macOS, and Windows.
- Performance – tight low-level code with aggressive quantization support and hardware-aware kernels.
- Practical tooling – CLI, HTTP server, Web UI, quantization tools, and model conversion utilities.
- C / C++ implementation for easy integration into existing systems.
- Hardware-optimized backends:
  - Apple Silicon (NEON, Accelerate, Metal)
  - x86 (SSE/AVX/AVX2/AVX-512/AMX where available)
  - Optional GPU backends (CUDA / Metal / others, depending on build flags)
- Quantization-aware: supports multiple GGUF quantization schemes to reduce memory and improve throughput.
- Rich tooling:
  - `cheese-cli` for interactive and scripted use.
  - `cheese-server` for an OpenAI-compatible HTTP API (with optional Web UI).
  - Quantization, benchmarking, and conversion helpers under `tools/`.
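As a sketch of how the quantization helpers might be driven (the tool name `cheese-quantize`, its argument order, and the `Q4_K_M` scheme label are illustrative assumptions, not confirmed by this README — check `tools/` for the actual helpers):

```shell
# Hypothetical sketch: re-quantize an FP16 GGUF model down to a 4-bit scheme
# to reduce memory use. Tool name and arguments are illustrative only.
./build/bin/cheese-quantize ./models/model-f16.gguf ./models/model-q4.gguf Q4_K_M
```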
From the repository root:
```shell
cmake -B build
cmake --build build --config Release
```

This produces binaries under `build/bin/`.
Assuming you have a GGUF model at `./models/model.gguf`:

```shell
# Chat from the terminal
./build/bin/cheese-cli -m ./models/model.gguf -cnv

# Start an OpenAI-compatible HTTP server (add --webui for the Web UI)
./build/bin/cheese-server -m ./models/model.gguf --port 8080
```

The server exposes `/v1/chat/completions`, `/v1/completions`, `/v1/embeddings`, and related endpoints.
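Once the server is up, the chat endpoint can be exercised with `curl`. A minimal sketch, assuming the request body follows the usual OpenAI chat-completions shape (the `model` value here is illustrative):

```shell
# Query the OpenAI-compatible chat endpoint on the port chosen above.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "model.gguf",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'
```

The same request shape works against `/v1/completions` and `/v1/embeddings` with the fields those endpoints expect.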
See the in-repo docs for details.
To get the best performance on your machine:
- Build with an appropriate backend (BLAS for CPU, CUDA/Metal/SYCL where applicable).
- Tune runtime flags:
  - Threads: `-t N`
  - GPU layers: `-ngl N`
  - Batch size / ubatch: `-ub N`
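Putting the tuning flags together, a combined invocation might look like this (the specific values are illustrative and should be tuned to your hardware):

```shell
# -t 8    : use 8 CPU threads
# -ngl 32 : offload 32 layers to the GPU (GPU-enabled builds only)
# -ub 512 : micro-batch size
./build/bin/cheese-cli -m ./models/model.gguf -t 8 -ngl 32 -ub 512
```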
Cheesebrain aims to be a pragmatic, low-friction way to run GGUF models locally while remaining small and hackable.
