A lightweight local API for running a pre-trained large language model (LLM) on a Windows machine without a GPU, built with Flask, Hugging Face Transformers, and PyTorch.
This project shows how to deploy a pre-trained model like mistralai/Mistral-7B-Instruct-v0.1 in a resource-constrained environment, ideal for prototyping or learning.
- Runs on CPU (no GPU required)
- Flask REST API for easy integration
- Uses Hugging Face `transformers` and its `pipeline` API (see the sketch below)
- Clean project structure
- Good starting point for local GenAI apps
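
As a rough idea of how these pieces fit together, here is a minimal sketch of such a server. It is illustrative only: the file name (`app.py`), endpoint path (`/generate`), and generation settings are assumptions, not the repository's actual code.

```python
# app.py (illustrative name) - a minimal sketch under the assumptions above,
# not the repository's actual code. Loads an instruction-tuned model on CPU
# via the Transformers pipeline and exposes it through one Flask endpoint.
from flask import Flask, jsonify, request
from transformers import pipeline

app = Flask(__name__)

# device=-1 forces CPU inference. Loading a 7B-parameter model this way needs
# a lot of RAM and time; a smaller model is handy for first experiments.
generator = pipeline(
    "text-generation",
    model="mistralai/Mistral-7B-Instruct-v0.1",
    device=-1,
)

@app.route("/generate", methods=["POST"])
def generate():
    data = request.get_json(force=True)
    prompt = data.get("prompt", "")
    if not prompt:
        return jsonify({"error": "missing 'prompt'"}), 400
    outputs = generator(prompt, max_new_tokens=128, do_sample=True)
    return jsonify({"response": outputs[0]["generated_text"]})

if __name__ == "__main__":
    app.run(host="127.0.0.1", port=5000)
```

On CPU, generation with a 7B model is slow, so keep test prompts and `max_new_tokens` small while experimenting.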
Requirements:

- Python 3.8+
- Flask
- Hugging Face Transformers
- PyTorch (CPU)
- Mistral-7B-Instruct or similar open LLM
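
A quick sanity check that the stack above is installed and that inference will run on the CPU (purely illustrative, not part of the project):

```python
# Illustrative environment check: confirms the packages above are installed
# and that PyTorch sees no GPU, i.e. inference will run on the CPU.
from importlib.metadata import version

import torch

for pkg in ("flask", "transformers", "torch"):
    print(pkg, version(pkg))

print("CUDA available:", torch.cuda.is_available())  # expected: False on a CPU-only setup
```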
To get started, clone the repository:

```bash
git clone https://github.com/scouring/LLM.git
cd LLM
```
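
With the dependencies installed and the server running (the exact commands depend on the repository layout), the API can be called from any HTTP client. A hypothetical example using Python's `requests` library (not listed in the requirements above), reusing the endpoint and port assumed in the sketch earlier:

```python
# Hypothetical client call; the endpoint path and port are assumptions carried
# over from the server sketch above, not documented parts of this project.
import requests

resp = requests.post(
    "http://127.0.0.1:5000/generate",
    json={"prompt": "Explain what a REST API is in one sentence."},
    timeout=600,  # CPU inference on a 7B model can take a while
)
resp.raise_for_status()
print(resp.json()["response"])
```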