This setup provides a capable local LLM that's accessible via terminal, web UI, and remote clients across your network.
```bash
# Clone and setup
git clone git@github.com:oliveiraigorm/local-llm.git
cd local-llm

# Install dependencies and Ollama
./install.sh

# Start with network access
./setup-network.sh

# Install and start Web UI
./webui.sh install
./webui.sh start

# Test remote connection
./remote-client.sh chat
```

```bash
# Stop any running Ollama instance
pkill ollama

# Start Ollama with external access
OLLAMA_HOST=0.0.0.0:11434 ollama serve &

# Find your Tailscale IP
tailscale ip -4

# From another device on Tailscale network
curl http://YOUR_TAILSCALE_IP:11434/api/tags
```

Coding models:

- `deepseek-coder:6.7b` - Excellent for programming, lightweight (4GB RAM)
- `qwen2.5-coder:7b` - Great coding performance, recent model
- `codegemma:7b` - Google's coding model, good balance

General-purpose models:

- `llama3.2:3b` - Fast, lightweight (2GB RAM)
- `qwen2.5:3b` - Excellent small model
- `gemma2:9b` - Good performance, moderate resources
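When scripting model selection, the RAM figures above can drive a small helper that picks a model tier for the memory you have available. `pick_model` is a hypothetical function, not part of the repo's scripts; the thresholds come from the rough footprints listed above:

```shell
# Hypothetical helper: choose a model tag from available RAM in GB,
# using the approximate footprints above (7B ~5GB, 6.7B ~4GB, 3B ~2GB).
pick_model() {
  ram_gb=$1
  if [ "$ram_gb" -ge 5 ]; then
    echo "qwen2.5-coder:7b"
  elif [ "$ram_gb" -ge 4 ]; then
    echo "deepseek-coder:6.7b"
  else
    echo "llama3.2:3b"
  fi
}

pick_model 8   # prints qwen2.5-coder:7b
```

You could then pull the result with `ollama pull "$(pick_model 8)"`.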
```bash
# Using the manager script
./llm.sh start
./llm.sh chat deepseek-coder:6.7b

# Direct Ollama usage
ollama run deepseek-coder:6.7b "Explain this code: <paste code>"
```

```bash
# Connect from another machine
./remote-client.sh 192.168.1.100:11434 chat
./remote-client.sh server.local ask "Explain quantum computing"

# Test connection
./remote-client.sh your-server-ip test
```

```bash
# Start web interface
./webui.sh start --port 8080

# Access at http://localhost:8080
# Network access via Tailscale IP: http://your-tailscale-ip:8080
```

```bash
# Make API requests
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-coder:6.7b",
  "prompt": "Write a Python hello world",
  "stream": false
}'
```

Claude Code works natively with Ollama:
```bash
# Claude Code will detect Ollama automatically
# No additional configuration needed
```

```bash
# Automated installation
./webui.sh install
./webui.sh start

# Custom port/host
./webui.sh start --port 3000 --host 0.0.0.0

# Manage service
./webui.sh status
./webui.sh stop
./webui.sh restart
```

Create `~/Library/LaunchAgents/com.ollama.plist` for auto-start:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.ollama</string>
  <key>ProgramArguments</key>
  <array>
    <string>/usr/local/bin/ollama</string>
    <string>serve</string>
  </array>
  <key>EnvironmentVariables</key>
  <dict>
    <key>OLLAMA_HOST</key>
    <string>0.0.0.0:11434</string>
  </dict>
  <key>RunAtLoad</key>
  <true/>
  <key>KeepAlive</key>
  <true/>
</dict>
</plist>
```

Load with:

```bash
launchctl load ~/Library/LaunchAgents/com.ollama.plist
```

- Models run efficiently on Apple Silicon
- Use quantized models (the `:6.7b` and `:3b` variants)
- Monitor memory usage with Activity Monitor
- DeepSeek Coder 6.7B: ~4GB RAM
- Qwen2.5 7B: ~5GB RAM
- Llama 3.2 3B: ~2GB RAM
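As an aside to the launchd agent above: if you also want launchd to capture Ollama's output, the agent accepts the standard `StandardOutPath`/`StandardErrorPath` keys. This is a sketch; the log paths are just illustrative:

```xml
<!-- Add inside the top-level <dict> of com.ollama.plist -->
<key>StandardOutPath</key>
<string>/tmp/ollama.log</string>
<key>StandardErrorPath</key>
<string>/tmp/ollama.err</string>
```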
The remote client (`remote-client.sh`):

- Connect to Ollama from any machine on your network
- Interactive chat mode with command history
- Model selection and management
- Connection testing and status checking
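The examples in this guide address a server either as a bare host (`server.local`) or as `host:port` (`192.168.1.100:11434`). A small helper can show that convention, defaulting to Ollama's standard port; `normalize_target` is hypothetical, not a function from the repo's scripts:

```shell
# Hypothetical helper: normalize a target to host:port,
# defaulting to Ollama's standard port 11434 when none is given.
normalize_target() {
  case "$1" in
    *:*) echo "$1" ;;          # port given explicitly, keep it
    *)   echo "$1:11434" ;;    # bare host: append Ollama's default port
  esac
}

normalize_target 192.168.1.100       # prints 192.168.1.100:11434
normalize_target server.local:8080   # prints server.local:8080
```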
The web UI (`webui.sh`):

- Browser-based interface for chat and model management
- Supports multiple simultaneous users
- File upload and conversation history
- Mobile-friendly responsive design
Network access options:

- Tailscale: Secure VPN access across devices
- Local Network: Direct IP access on same network
- Firewall: Configure port 11434 (Ollama) and 8080 (Web UI)
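On macOS, one way to scope that firewall rule is with `pf`: allow only your local subnet to reach the Ollama and Web UI ports. This is a sketch, assuming pf is enabled via `pfctl`; the subnet is illustrative:

```
# /etc/pf.conf addition: only the local subnet may reach Ollama and the Web UI
pass in proto tcp from 192.168.1.0/24 to any port 11434
pass in proto tcp from 192.168.1.0/24 to any port 8080
```

Reload the ruleset with `sudo pfctl -f /etc/pf.conf` after editing.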
- Ollama with `0.0.0.0:11434` accepts connections from any device on your network
- Tailscale provides an encrypted VPN, keeping traffic secure
- Consider firewall rules if you need additional security
- Web UI runs on port 8080 by default; change it with the `--port` option
- `install.sh` - One-click dependency installation
- `setup-network.sh` - Configure Ollama for network access
- `llm.sh` - Local LLM management (start/stop/chat/models)
- `remote-client.sh` - Remote terminal client for other machines
- `webui.sh` - Web UI installation and management