An AI-powered intelligent browser built with Next.js and Electron. It features multi-modal AI task execution, scheduled tasks, social media integration, and advanced file management, with support for multiple AI providers.
Tech stack:
- Frontend: Next.js 15 + React 19
- Desktop: Electron 33
- UI: Ant Design + Tailwind CSS
- State Management: Zustand
- Storage: IndexedDB (via electron-store)
- AI Agent: @jarvis-agent (based on Eko, a production-ready agent framework)
- Build Tools: Vite + TypeScript
Node version: 20.19.3
Before running the application, you need to configure API keys:
# Copy configuration template
cp .env.template .env.local
# Edit .env.local and fill in your API keys
# Supported: DEEPSEEK_API_KEY, QWEN_API_KEY, GOOGLE_API_KEY, ANTHROPIC_API_KEY, OPENROUTER_API_KEY
For detailed configuration instructions, see CONFIGURATION.md.
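To illustrate how the key names above relate to providers, here is a minimal sketch that reports which providers have a usable key. The `PROVIDER_KEYS` map and the `configuredProviders` helper are hypothetical and not taken from the project's code; only the environment variable names come from this README. The function takes a plain object, so in Node you could pass `process.env`.

```typescript
// Hypothetical helper: maps each provider to the API-key variable listed above.
type Env = Record<string, string | undefined>;

const PROVIDER_KEYS = {
  deepseek: "DEEPSEEK_API_KEY",
  qwen: "QWEN_API_KEY",
  google: "GOOGLE_API_KEY",
  anthropic: "ANTHROPIC_API_KEY",
  openrouter: "OPENROUTER_API_KEY",
} as const;

type Provider = keyof typeof PROVIDER_KEYS;

// Returns the providers whose key is present and non-blank in `env`.
function configuredProviders(env: Env): Provider[] {
  return (Object.keys(PROVIDER_KEYS) as Provider[]).filter((provider) => {
    const value = env[PROVIDER_KEYS[provider]];
    return value !== undefined && value.trim() !== "";
  });
}

// Example with a stand-in environment object (placeholder values only).
const example = configuredProviders({
  DEEPSEEK_API_KEY: "sk-example",
  QWEN_API_KEY: "   ", // blank values count as not configured
});
console.log(example); // [ 'deepseek' ]
```

A check like this can run at startup to warn when no provider is configured, instead of failing later on the first request.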
First, run the development server:
# Install dependencies
pnpm install
# Build desktop application client for macOS
pnpm run build:deps
# Build desktop application client for Windows
pnpm run build:deps:win
# Start web development server
pnpm run next
# Start desktop application
pnpm run electron
To build the desktop application for distribution:
# Configure production API keys
# Edit .env.production file with your actual API keys
# Build the application for macOS
pnpm run build
# Build the application for Windows
pnpm run build:win
The built application will include your API configuration, so end users don't need to configure anything.
- Multiple AI Providers: Support for DeepSeek, Qwen, Google Gemini, Anthropic Claude, and OpenRouter
- UI Configuration: Configure AI models and API keys directly in the app, no file editing required
- Agent Configuration: Customize AI agent behavior with custom prompts and manage MCP tools
- Toolbox: Centralized hub for system features including agent configuration, scheduled tasks, and more
- AI-Powered Browser: Intelligent browser with automated task execution
- Multi-Modal AI: Vision and text processing capabilities
- Scheduled Tasks: Create and manage automated recurring tasks
- Speech & TTS: Voice recognition and text-to-speech integration
- File Management: Advanced file operations and management
v0.0.1 - v0.0.4: Core Functionality
- AI-powered browser with automated task execution
- Multiple AI provider support (DeepSeek, Qwen, Google Gemini, Claude, OpenRouter)
- Multi-modal AI capabilities (vision and text processing)
- Scheduled tasks system with custom intervals
- File management capabilities
- UI configuration for API keys and models
v0.0.5 - v0.0.7: UI/UX Enhancements
- Purple theme redesign with improved UI/UX
- Agent Configuration system (custom prompts, MCP tools management)
- Toolbox page (centralized feature hub)
- Internationalization support (English/Chinese)
- WebGL animated background with gradient fallback
- Improved modal sizes and layout optimization
v0.0.8 - v0.0.10: Advanced Features
- Human interaction support (AI can ask questions during execution)
- Task continuation with file attachment management
- Atomic fragment-based history playback with typewriter effects
- Advanced playback controls (play/pause/restart/speed adjustment)
- Context restoration and session management
- Optimized auto-scroll behavior for messages
- Enhanced message display and rendering
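The playback entries above (fragment-based history, typewriter effect, speed adjustment) could be modeled roughly as follows. This is an illustrative sketch only: `Fragment`, `buildTypewriterSchedule`, and the default delay are assumptions, not the project's actual implementation.

```typescript
// Hypothetical sketch: none of these names come from the project's code.
interface Fragment {
  text: string; // characters revealed in this step
  atMs: number; // time offset (ms) at which the fragment appears
}

// Splits `message` into fixed-size fragments and assigns each a reveal time.
// `speed` is a playback multiplier: 2 plays twice as fast, 0.5 half as fast.
function buildTypewriterSchedule(
  message: string,
  speed = 1,
  charsPerFragment = 1,
  baseDelayMs = 30,
): Fragment[] {
  if (speed <= 0 || !Number.isInteger(charsPerFragment) || charsPerFragment < 1) {
    throw new RangeError("speed must be > 0 and charsPerFragment a positive integer");
  }
  const chars = Array.from(message); // iterate by code point, not UTF-16 unit
  const stepMs = baseDelayMs / speed;
  const fragments: Fragment[] = [];
  for (let i = 0; i < chars.length; i += charsPerFragment) {
    fragments.push({
      text: chars.slice(i, i + charsPerFragment).join(""),
      atMs: Math.round((i / charsPerFragment) * stepMs),
    });
  }
  return fragments;
}
```

Precomputing the schedule keeps pause, restart, and speed changes simple: a player only needs to track the current fragment index and rebuild the timings when the speed changes.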
Phase 1: Enhanced User Experience
- Voice input support (speech-to-text integration)
- Theme customization system (multiple color schemes)
- Dark/Light mode toggle
- Enhanced accessibility features
Phase 2: Workflow Enhancement
- Workflow configuration export/import functionality
- Refactored scheduled task steps based on workflow configuration
- Visual workflow editor with drag-and-drop interface
- Step management (reorder, add, remove, edit workflow steps)
- Workflow templates and presets
Phase 3: Plugin Ecosystem
- MCP plugin marketplace
- Community plugin sharing platform
- Plugin version management system
- One-click plugin installation and updates
- Plugin development toolkit and documentation
Phase 4: Advanced Capabilities
- Multi-tab browser support
- Collaborative task execution
- Cloud sync for tasks and configurations
- Mobile companion app
- Performance optimization and caching improvements
Input tasks and let AI execute automatically.
Left: AI thinking and execution steps. Right: Real-time browser operation preview.
Create scheduled tasks with custom intervals and execution steps.
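As a rough illustration of a task with a custom interval and ordered execution steps, the sketch below computes when such a task should next run. The `ScheduledTask` shape and the `nextRunAt` function are hypothetical; the app's real data model may differ.

```typescript
// Hypothetical data model for illustration; not the app's actual schema.
interface ScheduledTask {
  name: string;
  intervalMs: number; // custom interval between runs
  steps: string[];    // execution steps, in order
  lastRunAt: number;  // epoch milliseconds of the previous run
}

// Returns the first scheduled time strictly after `now`.
// Runs missed while the app was closed are skipped rather than replayed.
function nextRunAt(task: ScheduledTask, now: number): number {
  if (task.intervalMs <= 0) {
    throw new RangeError("intervalMs must be positive");
  }
  const intervalsPassed = Math.max(
    0,
    Math.floor((now - task.lastRunAt) / task.intervalMs),
  );
  return task.lastRunAt + (intervalsPassed + 1) * task.intervalMs;
}

const dailyReport: ScheduledTask = {
  name: "Daily report",
  intervalMs: 24 * 60 * 60 * 1000,
  steps: ["Open dashboard", "Export summary"],
  lastRunAt: 0,
};
console.log(nextRunAt(dailyReport, 1000)); // 86400000
```

Skipping missed runs is one possible policy; a scheduler could instead run once immediately on startup to catch up.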
View past tasks with search and playback capabilities.
Centralized hub for accessing all system features and configurations.
Customize AI agent behavior with custom prompts and manage MCP tools for enhanced capabilities.
- DeepSeek: deepseek-chat, deepseek-reasoner
- Qwen (Alibaba Cloud): qwen-max, qwen-plus, qwen-vl-max
- Google Gemini: gemini-1.5-flash, gemini-2.0-flash, gemini-1.5-pro, and more
- Anthropic Claude: claude-3.7-sonnet, claude-3.5-sonnet, claude-3-opus, and more
- OpenRouter: Multiple providers (Claude, GPT, Gemini, Mistral, Cohere, etc.)
- Configuration Guide: Detailed API key setup instructions
Special thanks to Eko, a production-ready agent framework that powers the AI capabilities of this project.
⭐ If you find this project helpful, please consider giving it a star! Your support helps us grow and improve.
- Report issues on GitHub Issues
- Join discussions and share feedback
- Contribute to make AI Browser better
Please ensure all API keys are properly configured in development environment files only. Never commit actual API keys to the repository.
This project is licensed under the MIT License - see the LICENSE file for details.






