jasielmacedo (Owner) commented Nov 6, 2025

Description

This PR implements complete local AI model integration for Browser-LLM using Ollama. Users can now download, manage, and chat with local LLMs directly within the browser, with full support for tool calling, vision models, and context-aware interactions.

Type of Change

  • New feature (non-breaking change that adds functionality)
  • Code quality improvement (refactoring, formatting, etc.)

Changes Made

Core Ollama Integration

  • Ollama Service Management (src/main/services/ollama.ts)
    • Automatic process lifecycle management (start, stop, restart, force kill)
    • Process monitoring with PID, memory, CPU, and uptime tracking
    • Fixed race conditions in health checks with concurrency guards (see the sketch after this list)
    • Proper cleanup of orphan processes on app start
    • Support for both text and vision models (Qwen-VL spawns multiple worker processes)
    • Advanced streaming parser with three strategies for handling different model output formats, especially Qwen's tiny chunks (sketched under Streaming & Parsing below)
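
A minimal sketch of the health-check concurrency guard, assuming the service is probed over Ollama's standard `/api/tags` endpoint; the names (`checkHealth`, `inFlight`) are illustrative, not the exact ones in src/main/services/ollama.ts:

```typescript
// Sketch of a health-check concurrency guard (assumed names; not the
// exact implementation in src/main/services/ollama.ts).
let inFlight: Promise<boolean> | null = null;

async function checkHealth(baseUrl = "http://127.0.0.1:11434"): Promise<boolean> {
  // Reuse the pending check instead of starting a second, racing one.
  if (inFlight) return inFlight;
  inFlight = (async () => {
    try {
      const res = await fetch(`${baseUrl}/api/tags`, { signal: AbortSignal.timeout(2000) });
      return res.ok;
    } catch {
      return false; // unreachable or timed out -> treat as unhealthy
    } finally {
      inFlight = null; // allow the next caller to start a fresh check
    }
  })();
  return inFlight;
}
```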

AI Tool Calling System

  • Tool Framework (src/shared/tools.ts)
    • search_history - Search browsing history
    • get_bookmarks - Access saved bookmarks
    • analyze_page_content - Extract and analyze webpage content
    • capture_screenshot - Take screenshots for vision models
    • get_page_metadata - Retrieve page metadata
    • web_search - Perform Google searches
    • Automatic tool result handling and error management (a registry sketch follows this list)
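
A rough sketch of what a registry like src/shared/tools.ts could look like; the `ToolDefinition` shape and `runTool` helper are assumptions, with only `search_history` filled in:

```typescript
// Sketch of a tool registry shape (hypothetical names; the real
// definitions live in src/shared/tools.ts).
interface ToolDefinition {
  name: string;
  description: string;
  parameters: Record<string, unknown>; // JSON Schema for the arguments
  execute: (args: Record<string, unknown>) => Promise<unknown>;
}

const tools: Record<string, ToolDefinition> = {
  search_history: {
    name: "search_history",
    description: "Search browsing history for matching entries",
    parameters: {
      type: "object",
      properties: { query: { type: "string" } },
      required: ["query"],
    },
    execute: async (args) => {
      /* query the history store here */
      return { results: [], query: args.query };
    },
  },
  // get_bookmarks, analyze_page_content, capture_screenshot, ... follow the same shape
};

// Uniform result/error handling so one failing tool cannot break the chat loop.
async function runTool(name: string, args: Record<string, unknown>) {
  const tool = tools[name];
  if (!tool) return { error: `Unknown tool: ${name}` };
  try {
    return { result: await tool.execute(args) };
  } catch (err) {
    return { error: err instanceof Error ? err.message : String(err) };
  }
}
```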

Context Management

  • Intelligent Context System (src/shared/contextManager.ts)
    • Dynamic token estimation and budget management (sketched after this list)
    • Automatic content optimization based on model type and available tokens
    • Support for page content, screenshots, browsing history, and bookmarks
    • Vision model detection and screenshot handling
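
A minimal sketch of the budgeting idea, assuming the common ~4-characters-per-token heuristic; the helper names and trimming order are illustrative, not the exact logic in src/shared/contextManager.ts:

```typescript
// Rough token estimate: ~4 characters per token is a common heuristic.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

interface ContextPieces {
  pageContent?: string;
  history?: string;
  bookmarks?: string;
}

// Trim optional context so the prompt stays inside the model's window.
function fitContext(pieces: ContextPieces, budgetTokens: number): string {
  const ordered: [string, string][] = [
    ["Page content", pieces.pageContent ?? ""],
    ["History", pieces.history ?? ""],
    ["Bookmarks", pieces.bookmarks ?? ""],
  ];
  let remaining = budgetTokens;
  const parts: string[] = [];
  for (const [label, text] of ordered) {
    if (!text || remaining <= 0) continue;
    const maxChars = remaining * 4;
    const clipped = text.length > maxChars ? text.slice(0, maxChars) : text;
    remaining -= estimateTokens(clipped);
    parts.push(`${label}:\n${clipped}`);
  }
  return parts.join("\n\n");
}
```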

Model Management UI

  • Enhanced Model Manager (src/renderer/components/Models/)
    • Service status monitoring with expandable process details
    • Download progress tracking with real-time updates
    • Model installation and deletion
    • Expanded model registry with 15+ pre-configured models
    • Service control buttons (restart, stop, force kill)

Chat Interface

  • Advanced Chat System (src/renderer/store/chat.ts, src/renderer/components/Chat/ChatSidebar.tsx)
    • Streaming responses with token-by-token display
    • Planning Mode toggle for agentic tool use
    • Page Context toggle for automatic context injection
    • Tool execution visualization (shows tool calls and results)
    • Chain-of-Thought Display - Collapsible "AI Reasoning Process" section for Qwen models
    • Performance metrics (TTFT, total time)
    • Context info badges (page content, screenshot, history, bookmarks, token count)
    • Cancel generation support

System Prompt Configuration

  • System Prompt Settings (src/renderer/components/Settings/SystemPromptSettings.tsx)
    • Comprehensive base system prompt (463 words)
    • Users can add custom instructions on top of the base prompt (additions never replace it)
    • User information field for personalization
    • Custom instructions for preferences
    • Automatic date/time injection

Browser Integration

  • Navigation Bar Enhancements (src/renderer/components/Browser/NavigationBar.tsx)
    • "Ask AI about this page" quick actions
    • "Explain selected text" context menu
    • "Summarize page" functionality
    • Fixed duplicate key warning in suggestions dropdown

IPC Communication

  • Secure IPC Handlers (src/main/ipc/handlers.ts, src/main/preload.ts)
    • Channel whitelisting for security (see the preload sketch after this list)
    • Handlers for model management, tool execution, settings
    • Support for streaming events: ollama:chatToken, ollama:reasoning, ollama:toolCalls
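
A minimal preload sketch of channel whitelisting; the streaming event names come from this PR, while the invoke channel names and the exposed `api` surface are assumptions:

```typescript
// Sketch of channel whitelisting in a preload script (event names from the
// PR; invoke channels and the "api" surface are assumptions).
import { contextBridge, ipcRenderer, IpcRendererEvent } from "electron";

const ALLOWED_INVOKE = new Set(["ollama:chat", "ollama:listModels"]);
const ALLOWED_EVENTS = new Set(["ollama:chatToken", "ollama:reasoning", "ollama:toolCalls"]);

contextBridge.exposeInMainWorld("api", {
  invoke: (channel: string, ...args: unknown[]) => {
    if (!ALLOWED_INVOKE.has(channel)) {
      throw new Error(`Blocked IPC channel: ${channel}`);
    }
    return ipcRenderer.invoke(channel, ...args);
  },
  on: (channel: string, listener: (...args: unknown[]) => void) => {
    if (!ALLOWED_EVENTS.has(channel)) {
      throw new Error(`Blocked IPC event: ${channel}`);
    }
    const wrapped = (_e: IpcRendererEvent, ...args: unknown[]) => listener(...args);
    ipcRenderer.on(channel, wrapped);
    // Return an unsubscribe so the renderer can clean up its listeners.
    return () => ipcRenderer.removeListener(channel, wrapped);
  },
});
```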

Process Management Fixes

  • Multi-Process Handling
    • Changed taskkill /PID to taskkill /IM ollama.exe so every instance is killed, since vision models spawn detached workers (see the sketch after this list)
    • Added process finding by name when PID reference is lost
    • Proper cleanup on app close
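
A sketch of the kill-by-image-name fix; the taskkill command mirrors the change described above, while the wrapper function and the POSIX fallback are assumptions:

```typescript
// Sketch of the Windows kill-by-image-name fix (command from the PR; the
// surrounding wrapper is an assumption).
import { exec } from "node:child_process";
import { promisify } from "node:util";

const execAsync = promisify(exec);

async function killAllOllama(): Promise<void> {
  if (process.platform === "win32") {
    // /IM matches every ollama.exe instance, including detached vision
    // workers whose PIDs were never recorded; /F forces, /T kills children.
    await execAsync("taskkill /IM ollama.exe /F /T").catch(() => {});
  } else {
    // POSIX fallback: also match by process name.
    await execAsync("pkill -f ollama").catch(() => {});
  }
}
```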

Streaming & Parsing

  • Qwen Model Support
    • Strategy 1: Parse complete JSON for small chunks (<500 bytes)
    • Strategy 2: Split on }{ for concatenated JSON
    • Strategy 3: Line-based parsing with newlines
    • Separate handling for thinking field (chain-of-thought reasoning)
    • Stream monitoring with diagnostic logging (all three strategies are sketched below)
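
A combined sketch of the three strategies plus the thinking/content split; the `OllamaChunk` shape and helper names are assumptions:

```typescript
// Sketch of the layered chunk parser (strategy order from the PR text;
// the chunk shape and helper names are assumptions).
interface OllamaChunk {
  message?: { content?: string; thinking?: string };
  done?: boolean;
}

const tryParse = (s: string): OllamaChunk[] => {
  const t = s.trim();
  if (!t) return [];
  try {
    return [JSON.parse(t) as OllamaChunk];
  } catch {
    return [];
  }
};

function parseChunks(raw: string): OllamaChunk[] {
  // Strategy 1: Qwen-style tiny chunks (<500 bytes) are usually one complete object.
  if (raw.length < 500) {
    const direct = tryParse(raw);
    if (direct.length) return direct;
  }
  // Strategy 2: concatenated objects like {...}{...} -- re-insert a seam, then split.
  if (/\}\s*\{/.test(raw)) {
    return raw.replace(/\}\s*\{/g, "}\n{").split("\n").flatMap(tryParse);
  }
  // Strategy 3: classic newline-delimited JSON.
  return raw.split("\n").flatMap(tryParse);
}

// Thinking tokens are routed separately from visible content so chain-of-thought
// reasoning lands in the collapsible UI section rather than the main response.
function routeChunk(chunk: OllamaChunk, emit: (channel: string, text: string) => void) {
  if (chunk.message?.thinking) emit("ollama:reasoning", chunk.message.thinking);
  if (chunk.message?.content) emit("ollama:chatToken", chunk.message.content);
}
```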

State Management

  • Fixed Tool Execution Flow
    • Dynamic message ID tracking for multi-turn tool conversations (sketched after this list)
    • Proper streaming state reset after tool follow-up requests
    • Token and thinking listeners correctly target current message
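
A minimal sketch of the fix, assuming a module-level `currentMessageId` that stream listeners read at event time; the real store in src/renderer/store/chat.ts is more involved:

```typescript
// Sketch of dynamic message-ID tracking (store shape assumed).
let currentMessageId: string | null = null;

function beginAssistantMessage(id: string) {
  // Each model turn (including tool follow-ups) gets a fresh target message.
  currentMessageId = id;
}

function onToken(token: string, appendToMessage: (id: string, t: string) => void) {
  // Read the ID at event time, not at subscription time, so tokens from a
  // tool follow-up land in the new message rather than the stale one.
  if (currentMessageId) appendToMessage(currentMessageId, token);
}
```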

UI Improvements

  • Downloads status bar component with progress tracking
  • Model capabilities badges (vision, tool calling)
  • Collapsible thinking display with character count
  • Performance timing indicators
  • Context information badges

Testing

  • Tested locally in development mode
  • Tested production build
  • Manually tested affected features
    • Model download and installation (Llama 3.2, Qwen2.5-VL, DeepSeek, etc.)
    • Chat with streaming responses
    • Tool calling (analyze_page_content, search_history, web_search)
    • Vision model with screenshot analysis
    • Planning Mode with multi-turn tool conversations
    • System prompt customization
    • Service monitoring and control
    • Process cleanup on app close

Checklist

  • My code follows the project's code style (ESLint and Prettier)
  • I have performed a self-review of my code
  • I have commented my code where necessary
  • My changes generate no new warnings or errors
  • I have tested my changes locally

Additional Notes

Key Technical Decisions

  1. Process Management Strategy: Using taskkill /IM instead of /PID ensures all Ollama processes are terminated, including detached workers spawned by vision models.

  2. Streaming Parser: Implemented three parsing strategies to handle different model output formats. Qwen models send tiny chunks (~140 bytes) that don't trigger newline-based parsers, requiring complete JSON parsing.

  3. Thinking Token Separation: Qwen models send internal reasoning in a thinking field before actual content. This is now captured separately and displayed in a collapsible UI section, preventing it from cluttering the main response.

  4. Dynamic Message ID Tracking: Tool execution creates follow-up requests that need separate messages. Listeners now track currentMessageId dynamically to append tokens to the correct message.

  5. System Prompt Architecture: The base prompt is always present, with clear instructions about context usage and tool calling. User additions are appended rather than replacing the base, preventing users from accidentally breaking the AI (assembly sketched below).
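
A minimal sketch of that append-only assembly; the helper name and section labels are assumptions:

```typescript
// Sketch of append-only prompt assembly (assumed helper; the actual base
// prompt lives in SystemPromptSettings.tsx).
function buildSystemPrompt(base: string, userInfo?: string, custom?: string): string {
  const parts = [base]; // base instructions are always present
  if (userInfo?.trim()) parts.push(`About the user:\n${userInfo.trim()}`);
  if (custom?.trim()) parts.push(`Additional instructions:\n${custom.trim()}`);
  parts.push(`Current date/time: ${new Date().toISOString()}`); // automatic injection
  return parts.join("\n\n");
}
```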

Performance Metrics

  • Model Support: 15+ models in registry (Llama 3.2, Qwen2.5-VL, DeepSeek R1, Phi-4, etc.)
  • Tool Count: 6 available tools for AI agents
  • Code Changes: 5,847 insertions across 30 files
  • Token Estimation: Automatic context budget management with vision/text model optimization

Known Limitations

  • Ollama service must be bundled with the app or installed separately
  • Vision models require more memory (they spawn three worker processes)
  • Streaming performance depends on model size and hardware

Future Enhancements

  • Model performance benchmarking
  • Custom tool creation interface
  • Multi-modal input (audio, video)
  • Model fine-tuning support

jasielmacedo merged commit af330c3 into main on Nov 6, 2025 (1 check passed) and deleted the local-model-integration branch on November 6, 2025 at 01:47.