There are simultaneously too many and not enough GGUF converters in the world.
- llama.cpp under the hood - so that part works
- Download - automatically download models and their auxiliary files from HuggingFace
- Convert - safetensors and PyTorch models to GGUF format
- Quantize - to multiple formats at once
- Cross-platform - works on Windows and Linux (and probably Mac but untested)
- Easy - auto-installs an environment + llama.cpp + CPU binaries for quantizing
- Flexible - can use any local llama.cpp repo or binary installation for quantizing
- Minimal mess - virtual environment prevents conflicts with your python setup
- Single or split files mode - Generate single or split files for intermediates and quants
- Split/Merge Shards - Split, merge, or resplit GGUF and safetensors files with custom shard sizes
- Importance Matrix - Generate or reuse imatrix files for better low-bit quantization (IQ2, IQ3)
- Imatrix Statistics - Analyze importance matrix files to view statistics
- Custom intermediates - Use existing GGUF files as intermediates for quantization
- Enhanced dtype detection - Detects model precision (BF16, F16, etc.) from configs and safetensors headers
- Model quirks detection - Handles Mistral format, pre-quantized models, and architecture-specific flags
- Vision/Multimodal models - Automatic detection and two-step conversion (text model + mmproj-*.gguf)
- Sentence-transformers - Auto-detect and include dense modules for embedding models
- VRAM Calculator - Estimate VRAM usage and recommended GPU layers (-ngl) for GGUF models
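To illustrate the idea behind the VRAM calculator (this is a simplified back-of-the-envelope sketch, not YaGGUF's actual implementation), a rough estimate is the GGUF file size plus the KV cache, where the KV cache grows with layer count, embedding dimension, and context length:

```python
def estimate_vram_gb(gguf_size_bytes: int, n_layers: int, n_embd: int,
                     n_ctx: int = 4096, kv_bytes: int = 2) -> float:
    """Rough VRAM estimate in GiB: model weights + KV cache.

    KV cache = 2 (K and V) * layers * context length * embedding dim
               * bytes per element (2 for F16).
    Assumes full multi-head attention; GQA models need less.
    """
    kv_cache = 2 * n_layers * n_ctx * n_embd * kv_bytes
    return (gguf_size_bytes + kv_cache) / 1024**3

# Example: a ~4.1 GB Q4_K_M 7B model (32 layers, 4096-dim) at 4k context
print(round(estimate_vram_gb(4_100_000_000, 32, 4096), 2))  # → 5.82
```

Real usage also needs headroom for activations and the compute buffer, which is why a tool-assisted `-ngl` recommendation is more reliable than this kind of hand calculation.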
All quantization types from llama.cpp are supported. Choose based on your size/quality tradeoff:
| Type | Size | Quality | Category | Imatrix | Notes |
|---|---|---|---|---|---|
| F32 | Largest | Original | Unquantized | - | Full 32-bit precision |
| F16 | Large | Near-original | Unquantized | - | Half precision |
| BF16 | Large | Near-original | Unquantized | - | Brain float 16-bit |
| Q8_0 | Very Large | Excellent | Legacy | - | Near-original quality |
| Q5_1, Q5_0 | Medium | Good | Legacy | - | Legacy 5-bit |
| Q4_1, Q4_0 | Small | Fair | Legacy | - | Legacy 4-bit |
| Q6_K | Large | Very High | K-Quant | Suggested | Near-F16 quality |
| Q5_K_M | Medium | Better | K-Quant | Suggested | Higher quality |
| Q5_K_S | Medium | Better | K-Quant | Suggested | 5-bit K small |
| Q4_K_M | Small | Good | K-Quant | Suggested | 4-bit K medium |
| Q4_K_S | Small | Good | K-Quant | Suggested | 4-bit K small |
| Q3_K_L | Very Small | Fair | K-Quant | Recommended | 3-bit K large |
| Q3_K_M | Very Small | Fair | K-Quant | Recommended | 3-bit K medium |
| Q3_K_S | Very Small | Fair | K-Quant | Recommended | 3-bit K small |
| Q2_K | Tiny | Minimal | K-Quant | Recommended | 2-bit K |
| Q2_K_S | Tiny | Minimal | K-Quant | Recommended | 2-bit K small |
| IQ4_NL | Small | Good | I-Quant | Recommended | 4-bit non-linear |
| IQ4_XS | Small | Good | I-Quant | Recommended | 4-bit extra-small |
| IQ3_M | Very Small | Fair | I-Quant | Recommended | 3-bit medium |
| IQ3_S | Very Small | Fair+ | I-Quant | Recommended | 3.4-bit small |
| IQ3_XS | Very Small | Fair | I-Quant | Required | 3-bit extra-small |
| IQ3_XXS | Very Small | Fair | I-Quant | Required | 3-bit extra-extra-small |
| IQ2_M | Tiny | Minimal | I-Quant | Required | 2-bit medium |
| IQ2_S | Tiny | Minimal | I-Quant | Required | 2-bit small |
| IQ2_XS | Tiny | Minimal | I-Quant | Required | 2-bit extra-small |
| IQ2_XXS | Tiny | Minimal | I-Quant | Required | 2-bit extra-extra-small |
| IQ1_M | Extreme | Poor | I-Quant | Required | 1-bit medium |
| IQ1_S | Extreme | Poor | I-Quant | Required | 1-bit small |
Quick Guide:
- Bigger is better (more precision)
- For best quality use F16 or Q8_0
- For decent quality use Q6_K or Q5_K_M
- For medium quality use Q4_K_M
- For smallest size use IQ3_M or IQ2_M with importance matrix
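To put the size column in concrete terms: file size scales roughly with parameter count times bits per weight. A small sketch (the bits-per-weight figures below are approximations of llama.cpp's reported averages, not exact values):

```python
# Approximate average bits per weight for a few common quant types
BPW = {"F16": 16.0, "Q8_0": 8.5, "Q6_K": 6.56, "Q5_K_M": 5.69,
       "Q4_K_M": 4.85, "Q3_K_M": 3.91, "IQ3_M": 3.66, "IQ2_M": 2.7}

def approx_size_gb(n_params: float, quant: str) -> float:
    """Estimated GGUF file size in GiB for a model with n_params weights."""
    return n_params * BPW[quant] / 8 / 1024**3

# A 7B-parameter model at a few quantization levels
for q in ("F16", "Q8_0", "Q4_K_M", "IQ2_M"):
    print(f"{q:>7}: ~{approx_size_gb(7e9, q):.1f} GiB")
```

Actual files are slightly larger because some tensors (embeddings, output layer) are kept at higher precision.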
Windows:

```shell
# Clone the repository
git clone https://github.com/usrname0/YaGGUF.git
cd YaGGUF

# Run the launcher script for Windows (runs a setup script if no venv detected):
.\run_gui.bat
```

Linux:

```shell
# If you want to select folders via the GUI, install tkinter (optional):
sudo apt install python3-tk       # Ubuntu/Debian
sudo dnf install python3-tkinter  # Fedora/RHEL
sudo pacman -S tk                 # Arch

# Clone the repository
git clone https://github.com/usrname0/YaGGUF.git
cd YaGGUF

# Run the launcher script for Linux (runs a setup script if no venv detected):
./run_gui.sh
```

After setup, launch the GUI the same way:
- Windows: double-click `run_gui.bat`
- Linux: run `./run_gui.sh` from a terminal
The GUI opens automatically in your browser on a free port, e.g. http://localhost:8501
MIT License - see LICENSE file for details
- llama.cpp - GGUF format and conversion/quantization tools
- HuggingFace - Model hosting and transformers library
- Streamlit - Pythonic data apps
