Force 100% GPU usage for local AI models.
Read GPU_GUIDE_FOR_GITHUB.md for complete setup instructions.
Two methods:
- Ollama (5 minutes, easy)
- Direct CUDA (15 minutes, advanced)
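The Ollama route can be sketched in a few commands (the model name `llama3.2` below is an example, not something this README prescribes; Ollama uses the GPU automatically when CUDA drivers are present):

```shell
# Install Ollama (official install script for Linux)
curl -fsSL https://ollama.com/install.sh | sh

# Pull and run a model -- swap in whichever model you want
ollama pull llama3.2
ollama run llama3.2 "Why is the sky blue?"

# List running models; the PROCESSOR column should read "100% GPU"
ollama ps
```

See GPU_GUIDE_FOR_GITHUB.md for the full walkthrough, including the direct CUDA method.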
- RTX 5090: 196 tokens/s
- Works with RTX 2000/3000/4000/5000 series
- 10x faster than CPU
- NVIDIA GPU with CUDA
- 8GB+ VRAM
- CUDA Toolkit 12.0+
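As a rough sanity check on the 8GB figure (an estimate, not a number from the guide): a quantized model's weight footprint is approximately parameter count × bits per weight / 8, plus overhead for the KV cache and runtime.

```shell
# Rough VRAM estimate for quantized model weights (hypothetical sizes)
# weights_gb = params_in_billions * bits_per_weight / 8
awk 'BEGIN {
  params_b = 7    # 7B-parameter model
  bits     = 4    # 4-bit quantization (e.g. Q4)
  printf "weights: %.1f GB\n", params_b * bits / 8
}'
```

A 7B model at 4-bit needs roughly 3.5 GB for weights alone, so 8 GB of VRAM leaves headroom for the KV cache and the desktop.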
- `GPU_GUIDE_FOR_GITHUB.md` - Main guide (8,500+ words)
- `LICENSE_GPU_GUIDE` - MIT License
- `CONTRIBUTING_GPU_GUIDE.md` - How to contribute
When working, you'll see:
- `nvidia-smi` shows 90-100% GPU usage
- 100+ tokens/second
- Instant responses
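To confirm the GPU is actually doing the work, utilization can be watched from another terminal while a prompt is running (these are standard `nvidia-smi` options):

```shell
# Refresh the full nvidia-smi view every second
watch -n 1 nvidia-smi

# Or poll just utilization and VRAM once per second
nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv -l 1
```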
- Stuck? Check the troubleshooting section in `GPU_GUIDE_FOR_GITHUB.md`
- Issues? Open a GitHub issue
The guide also includes decision trees, validation commands, and fixes for common user mistakes.
See GPU_GUIDE_FOR_GITHUB.md → "For AI Assistants & Automated Systems"
Made to democratize AI for everyone.
Let's make local AI accessible. 🚀