GPU-Only Mode Guide

Force 100% GPU usage for local AI models: proven methods for both Ollama and direct CUDA, with troubleshooting and AI-friendly guides.

Quick Start

Read GPU_GUIDE_FOR_GITHUB.md for complete setup instructions.

Two methods:

  • Ollama (5 minutes, easy)
  • Direct CUDA (15 minutes, advanced)
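As a sketch of the Ollama route (the full steps are in GPU_GUIDE_FOR_GITHUB.md; the model name `llama3` and the layer count here are illustrative), a Modelfile can force every layer onto the GPU via Ollama's `num_gpu` parameter:

```shell
# Sketch of the Ollama method (model name is illustrative; see the main guide).
# num_gpu sets how many layers are offloaded to the GPU; an oversized value
# (999) forces all layers off the CPU.
cat > Modelfile <<'EOF'
FROM llama3
PARAMETER num_gpu 999
EOF

# Build and run the GPU-forced variant (skipped here if ollama is not installed).
if command -v ollama >/dev/null 2>&1; then
  ollama create llama3-gpu -f Modelfile
  ollama run llama3-gpu "Hello"
fi
```

The same `num_gpu` override can also be passed per-request through Ollama's API instead of baking it into a Modelfile.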

Proven Results

  • RTX 5090: 196 tokens/s
  • Works with RTX 2000/3000/4000/5000 series
  • Roughly 10x faster than CPU-only inference

Requirements

  • NVIDIA GPU with CUDA
  • 8GB+ VRAM
  • CUDA Toolkit 12.0+
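A quick way to check these requirements before starting (a sketch assuming the standard `nvidia-smi` and `nvcc` command-line tools; it only reports what is present or missing):

```shell
# Driver + GPU check: prints the GPU name and total VRAM if the driver is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
else
  echo "MISSING: NVIDIA driver (nvidia-smi not found)"
fi

# CUDA Toolkit check: want release 12.0 or newer.
if command -v nvcc >/dev/null 2>&1; then
  nvcc --version | grep -i release
else
  echo "MISSING: CUDA Toolkit (nvcc not found)"
fi
```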

Files

  • GPU_GUIDE_FOR_GITHUB.md - Main guide (8,500+ words)
  • LICENSE_GPU_GUIDE - MIT License
  • CONTRIBUTING_GPU_GUIDE.md - How to contribute

Success

When GPU-only mode is working, you'll see:

  • nvidia-smi reports 90-100% GPU utilization during generation
  • 100+ tokens/second (model- and GPU-dependent)
  • Near-instant responses
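To verify the signs above, a one-shot `nvidia-smi` query (run in a second terminal while a prompt is generating; the NVIDIA driver must be installed) shows utilization and VRAM directly, and tokens/second is just generated tokens divided by elapsed seconds (the counts below are illustrative):

```shell
# One-shot utilization snapshot; expect 90-100% while a model is generating.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv
else
  echo "nvidia-smi not found: install the NVIDIA driver"
fi

# Tokens/second from a run's stats (example numbers, not a benchmark):
awk -v tokens=512 -v seconds=4 'BEGIN { printf "%.1f tokens/s\n", tokens/seconds }'
```

Repeat the snapshot a few times (or use `nvidia-smi -l 1` for continuous polling) to confirm utilization stays high for the whole generation, not just at the start.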

Help

  • Stuck? Check GPU_GUIDE_FOR_GITHUB.md troubleshooting section
  • Issues? Open a GitHub issue

For AI Assistants

This guide includes decision trees, validation commands, and common user mistakes.

See GPU_GUIDE_FOR_GITHUB.md → "For AI Assistants & Automated Systems"


Made to democratize AI for everyone.

Let's make local AI accessible. 🚀
