A minimal, local-first desktop chat interface for Ollama models — built with Electron. All inference runs on your machine with no data leaving your device.
- Streams responses in real time with a live typing cursor
- Auto-discovers all locally installed Ollama models
- Full multi-turn conversation memory
- Markdown rendering — code blocks, tables, bold, lists, and more
- Copy button on every assistant message
- New Chat / Clear conversation
- Connection status indicator
- Dark theme
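The real-time streaming above relies on Ollama's `/api/chat` endpoint, which returns newline-delimited JSON where each line carries a token in `message.content` until a final line with `done: true`. A minimal sketch of how a renderer could consume it (the function names `extractTokens` and `streamChat` are illustrative, not the app's actual code, and the sketch assumes network chunks align with line boundaries):

```javascript
// Pure helper: extract the text appended by one network chunk of
// newline-delimited JSON from Ollama's /api/chat stream.
function extractTokens(chunkText) {
  let text = '';
  for (const line of chunkText.split('\n')) {
    if (!line.trim()) continue; // skip blank trailing lines
    const obj = JSON.parse(line);
    if (obj.message && obj.message.content) text += obj.message.content;
  }
  return text;
}

// Usage sketch against a running Ollama instance (Node 18+ / browser fetch):
async function streamChat(model, messages, onToken) {
  const res = await fetch('http://localhost:11434/api/chat', {
    method: 'POST',
    body: JSON.stringify({ model, messages, stream: true }),
  });
  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    onToken(extractTokens(decoder.decode(value, { stream: true })));
  }
}

module.exports = { extractTokens, streamChat };
```

A production version would additionally buffer partial lines that straddle chunk boundaries before calling `JSON.parse`.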
You need three things installed before running the app.
Download from nodejs.org or install via your package manager:
```shell
# Ubuntu / Debian
sudo apt install nodejs npm

# macOS (Homebrew)
brew install node

# Check your version
node --version
```

Download from ollama.com or install via the official script:

```shell
curl -fsSL https://ollama.com/install.sh | sh
```

Pull a model before launching the app. Some good options depending on your hardware:
```shell
# Fast and lightweight (~2 GB)
ollama pull llama3.2

# Strong general-purpose model (~5 GB)
ollama pull llama3.1:8b

# High quality, needs ~16 GB RAM
ollama pull mistral:7b
```

On a machine with 32 GB of RAM you can comfortably run 7B–13B parameter models. Run `ollama list` to see what you have installed.
```shell
# 1. Clone the repository
git clone https://github.com/your-username/ollama-chat.git
cd ollama-chat

# 2. Install dependencies
npm install
```

On most systems Ollama starts automatically as a background service after installation. Verify it is up:
```shell
curl http://localhost:11434
# Expected response: Ollama is running
```

If you see `address already in use` when trying to run `ollama serve`, that is fine: the service is already running and you can skip to Step 2.
If Ollama is not running, start it manually:

```shell
ollama serve
```

Then check which models are available:

```shell
ollama list
```

If the list is empty, pull a model before launching the app:
```shell
# Fast and lightweight (~2 GB), a good starting point
ollama pull llama3.2

# Better reasoning, still fast (~5 GB)
ollama pull llama3.1:8b
```

Wait for the download to complete. You can pull as many models as you like and switch between them inside the app.
Launch the app:

```shell
npm start
```

The app will connect to Ollama at `http://localhost:11434` and populate the model selector with everything returned by `ollama list`.
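Populating the model selector maps onto Ollama's `GET /api/tags` endpoint, which returns a JSON payload of the form `{"models": [{"name": "llama3.2:latest", ...}, ...]}`. A sketch of how the renderer might do this (the helpers `modelNames` and `loadModels` are illustrative assumptions, not necessarily how `renderer.js` is written):

```javascript
// Pure helper: map Ollama's /api/tags payload to a list of model names.
function modelNames(tagsPayload) {
  return (tagsPayload.models || []).map((m) => m.name);
}

// Usage sketch: fill a <select> element with the installed models.
async function loadModels(selectEl) {
  const res = await fetch('http://localhost:11434/api/tags');
  const names = modelNames(await res.json());
  selectEl.innerHTML = names
    .map((n) => `<option value="${n}">${n}</option>`)
    .join('');
}

module.exports = { modelNames, loadModels };
```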
Opens the app with DevTools enabled:

```shell
npm run dev
```

Project structure:

```
ollama-chat/
├── main.js           # Electron main process: window creation, app lifecycle
├── preload.js        # Context bridge: exposes markdown parser to renderer
├── package.json
└── src/
    ├── index.html    # UI layout
    ├── styles.css    # Dark theme styles
    └── renderer.js   # Ollama API calls, streaming, UI logic
```
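The context bridge in `preload.js` can be sketched as follows. Everything here is an assumption about the shape of that file: the object name `bridgeApi`, the `renderMarkdown` helper, and the toy regex-based markdown subset are all illustrative, not the app's actual code.

```javascript
// preload.js sketch: define the bridge API as a plain object so it can
// be inspected independently, then expose it to the renderer.
const bridgeApi = {
  // Hypothetical helper: render a toy subset of markdown (bold only)
  // in the privileged preload context. A real app would call a proper
  // markdown library here instead.
  renderMarkdown: (text) =>
    text.replace(/\*\*(.+?)\*\*/g, '<strong>$1</strong>'),
};

let contextBridge;
try {
  ({ contextBridge } = require('electron'));
} catch {
  contextBridge = null; // not running inside Electron (e.g. plain Node)
}
if (contextBridge) {
  // Renderer code can then call window.api.renderMarkdown(...).
  contextBridge.exposeInMainWorld('api', bridgeApi);
}

module.exports = { bridgeApi };
```

With `contextIsolation` enabled (the Electron default), this bridge is the only way the sandboxed renderer can reach privileged code.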
**"Ollama offline" in the status bar**
Make sure `ollama serve` is running. Restart it and reload the app (Ctrl+R).

**"No models installed"**
Pull at least one model: `ollama pull llama3.2`

**Blank window or crash on startup**
Run `npm run dev` to open DevTools and check the console for errors.

**Slow responses**
Response speed depends entirely on your hardware and the model size. Smaller models (3B–7B) will be significantly faster on CPU.