🧠 LLM CPU Flask API

A lightweight local API for running a pre-trained large language model (LLM) on a CPU-only Windows machine, built with Flask, Hugging Face Transformers, and PyTorch.

This project shows how to deploy a pre-trained model such as mistralai/Mistral-7B-Instruct-v0.1 in a resource-constrained environment, making it ideal for prototyping or learning.
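
On CPU-only hardware the model is loaded through the Transformers pipeline API and runs entirely in system RAM (a 7B-parameter model needs tens of gigabytes in float32), so a smaller instruct model is a practical stand-in while experimenting. A minimal loading sketch, illustrative only and not the repository's actual code:

import torch
from transformers import pipeline

MODEL_ID = "mistralai/Mistral-7B-Instruct-v0.1"  # swap in a smaller model if RAM is tight

generator = pipeline(
    "text-generation",
    model=MODEL_ID,
    device=-1,                  # -1 selects the CPU
    torch_dtype=torch.float32,  # CPU inference runs in full precision
)

print(generator("Explain what a REST API is.", max_new_tokens=64)[0]["generated_text"])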


🚀 Features

  • Runs on CPU (no GPU required)
  • Flask REST API for easy integration (see the minimal endpoint sketch after this list)
  • Uses Hugging Face transformers + pipeline
  • Clean project structure
  • Good starting point for local GenAI apps
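
In a setup like this, the Flask layer is typically a thin wrapper around that pipeline. A minimal sketch is shown below; the endpoint path, request fields, and port are illustrative assumptions, not the repository's actual interface:

from flask import Flask, request, jsonify
from transformers import pipeline

app = Flask(__name__)

# Load the model once at startup; on a CPU this can take several minutes.
generator = pipeline("text-generation", model="mistralai/Mistral-7B-Instruct-v0.1", device=-1)

@app.route("/generate", methods=["POST"])  # hypothetical endpoint name
def generate():
    data = request.get_json(force=True)
    prompt = data.get("prompt", "")
    result = generator(prompt, max_new_tokens=128)
    return jsonify({"response": result[0]["generated_text"]})

if __name__ == "__main__":
    app.run(host="127.0.0.1", port=5000)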

🛠️ Tech Stack

  • Python 3.8+
  • Flask
  • Hugging Face Transformers
  • PyTorch (CPU)
  • Mistral-7B-Instruct or similar open LLM

📦 Installation

1. Clone the repo

git clone https://github.com/scouring/LLM.git
cd LLM
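
2. Install dependencies

The repository's exact requirements file isn't reproduced here; installing the packages named under Tech Stack (using the CPU-only PyTorch wheel from pytorch.org on Windows) is assumed to be sufficient:

pip install flask transformers torch

3. Run the API

The entry-point file name below is an assumption; substitute the actual script name from the repository:

python app.py

A prompt can then be sent to the (hypothetical) /generate endpoint sketched above:

curl -X POST http://127.0.0.1:5000/generate -H "Content-Type: application/json" -d "{\"prompt\": \"Hello\"}"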
