SAGE (Smart AI General-purpose Engine) is an intelligent desktop assistant that combines voice commands, AI orchestration, and automation to help you control your computer hands-free.
Features • Demo • Installation • Usage • API Keys
📹 Demo video coming soon!
| Feature | Description |
|---|---|
| Spotify Control | Play songs, skip tracks, control playback |
| Content Generation | Create documents, emails, and invitations with AI |
| Screen Analysis | AI vision to understand what's on screen |
| Task Recording | Record and replay mouse/keyboard actions |
| Auto Tool Generation | Creates new automation tools on demand |
┌─────────────────────────────────────────────────────────────────┐
│                        SAGE Architecture                        │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌──────────┐    ┌──────────────┐    ┌──────────────────────┐   │
│  │  Voice   │───▶│ Orchestrator │───▶│        Tools         │   │
│  │  Input   │    │  (Groq AI)   │    │                      │   │
│  └──────────┘    └──────────────┘    │  • System Control    │   │
│       │                 │            │  • Communication     │   │
│       ▼                 ▼            │  • Productivity      │   │
│  ┌──────────┐    ┌──────────────┐    │  • Media Control     │   │
│  │   Text   │    │     Code     │    │  • AI Tools          │   │
│  │  Input   │    │  Generator   │    └──────────────────────┘   │
│  └──────────┘    │ (OpenRouter) │               │               │
│                  └──────────────┘               ▼               │
│                                          ┌──────────────┐       │
│                                          │   Response   │       │
│                                          │  (TTS + UI)  │       │
│                                          └──────────────┘       │
└─────────────────────────────────────────────────────────────────┘
- Python 3.10+
- Windows 10/11
- Microphone (for voice commands)
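The first two prerequisites can be checked from Python before installing anything. The following is a minimal sketch, not part of SAGE itself; the function name `check_prerequisites` is hypothetical, and it only tests for Windows in general, not specifically for versions 10 or 11.

```python
import platform
import sys


def check_prerequisites(version_info=None, system=None):
    """Return a list of human-readable problems; an empty list means OK."""
    version_info = sys.version_info if version_info is None else version_info
    system = platform.system() if system is None else system

    problems = []
    # The prerequisites ask for Python 3.10 or newer.
    if tuple(version_info[:2]) < (3, 10):
        problems.append(
            f"Python 3.10+ required, found {version_info[0]}.{version_info[1]}"
        )
    # The prerequisites list Windows 10/11 as the supported platform.
    if system != "Windows":
        problems.append(f"Windows 10/11 required, found {system}")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```

Passing explicit arguments, for example `check_prerequisites((3, 9), "Linux")`, makes it easy to see what the check reports on an unsupported setup.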
# 1. Clone the repository
git clone https://github.com/yourusername/sage.git
cd sage
# 2. Create virtual environment
python -m venv venv
venv\Scripts\activate
# 3. Install dependencies
pip install -r requirements.txt
# 4. Configure API keys (see below)
copy .env.example .env
# Edit .env with your API keys
# 5. Run SAGE
python main.py

SAGE requires API keys to function. Every service listed below offers a free tier, though OpenRouter's is limited.
Required:

| Service | Purpose | Get Key | Free Tier |
|---|---|---|---|
| Groq | Main AI (Llama 3.3 70B) | console.groq.com | Yes |
| Picovoice | Wake word detection | console.picovoice.ai | Yes |

Optional:

| Service | Purpose | Get Key | Free Tier |
|---|---|---|---|
| OpenRouter | Code generation, screen analysis | openrouter.ai | Limited |
| Gemini | Fallback AI provider | aistudio.google.com | Yes |
- Copy the example file:

copy .env.example .env

- Edit `.env` and add your keys:

GROQ_API_KEY=your_groq_key_here
PICOVOICE_ACCESS_KEY=your_picovoice_key_here
OPENROUTER_API_KEY=your_openrouter_key_here # Optional
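A quick way to catch a missing or unedited key before launching is a small validation step. The sketch below is an illustration only: how SAGE actually loads keys (presumably in `config/api_keys.py`) is not shown here, the function name `missing_keys` is hypothetical, and treating the two keys not marked optional as required is an assumption. It expects the `.env` values to already be in the environment, for instance after calling `load_dotenv()` from `python-dotenv`, which is listed in the requirements.

```python
import os

# Key names as they appear in the .env example; treating these two as
# required is an assumption based on OpenRouter being the only one
# marked optional.
REQUIRED_KEYS = ("GROQ_API_KEY", "PICOVOICE_ACCESS_KEY")


def missing_keys(env=None):
    """Return the required key names that are unset, blank, or placeholders."""
    env = os.environ if env is None else env
    missing = []
    for name in REQUIRED_KEYS:
        value = env.get(name, "").strip()
        # The example file ships placeholders such as "your_groq_key_here".
        if not value or (value.startswith("your_") and value.endswith("_here")):
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in missing_keys():
        print(f"Missing or placeholder value for {name}")
```

Accepting any mapping instead of reading `os.environ` directly keeps the check easy to test without touching real credentials.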
python main.py

Say "Hey SAGE" followed by your command:
| Category | Example Commands |
|---|---|
| Apps | "Open Chrome", "Close Notepad", "Open Spotify" |
| System | "Set volume to 50", "Lock the screen", "What time is it" |
| Email | "Send email to manager about sick leave" |
| WhatsApp | "Send WhatsApp to John saying hello" |
| Meetings | "Schedule meeting with Sarah tomorrow at 3 PM" |
| Music | "Play Shape of You on Spotify", "Next song" |
| Math | "What is 25 times 4", "Calculate 100 divided by 7" |
| Search | "Search downloads for PDF files" |
| Content | "Write a birthday invitation for Saturday" |
You can also type commands directly in the input box at the bottom of the UI.
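To make the idea of turning a spoken or typed command into a structured request more concrete, here is a small rule-based sketch. It is not how SAGE works internally: SAGE routes commands through an LLM orchestrator, and its intent handling in `core/intent_parser.py` is not shown here. The intent names and patterns are hypothetical and cover only three of the example commands.

```python
import re

# Hypothetical patterns for three of the example commands.
PATTERNS = [
    ("open_app", re.compile(r"^open (?P<app>.+)$", re.IGNORECASE)),
    ("set_volume", re.compile(r"^set volume to (?P<level>\d{1,3})$", re.IGNORECASE)),
    ("multiply", re.compile(r"^what is (?P<a>\d+) times (?P<b>\d+)$", re.IGNORECASE)),
]


def parse_command(text):
    """Map a command string to (intent, arguments), or ("unknown", {})."""
    # Drop surrounding whitespace and trailing punctuation from dictation.
    cleaned = text.strip().rstrip("?.!")
    for intent, pattern in PATTERNS:
        match = pattern.match(cleaned)
        if match:
            return intent, match.groupdict()
    return "unknown", {}
```

Fixed patterns like these break down quickly on free-form phrasing such as "Send email to manager about sick leave", which is one reason an assistant of this kind would hand the interpretation to a language model instead.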
| Component | Model | Provider | Purpose |
|---|---|---|---|
| Orchestrator | Llama 3.3 70B | Groq | Task planning & execution |
| Code Generator | Qwen 2.5 Coder 32B | OpenRouter | Auto-generate tools |
| Screen Analyzer | Qwen 2.5 VL 72B | OpenRouter | Vision analysis |
| Content Generator | Llama 3.3 70B | Groq | Documents & emails |
sage/
├── main.py                  # Entry point
├── config/                  # Configuration
│   ├── settings.py          # Settings management
│   └── api_keys.py          # API key handling
├── core/                    # Core AI logic
│   ├── orchestrator.py      # Main AI orchestrator
│   ├── task_executor.py     # Task execution
│   ├── code_generator.py    # Auto tool generation
│   └── intent_parser.py     # Intent classification
├── tools/                   # All automation tools
│   ├── system/              # System control
│   ├── productivity/        # Productivity tools
│   ├── communication/       # Email, WhatsApp
│   ├── media/               # Spotify control
│   └── ai/                  # AI-powered tools
├── voice/                   # Voice modules
│   ├── wake_word.py         # Wake word detection
│   ├── speech_to_text.py    # Speech recognition
│   └── tts.py               # Text-to-speech
├── ui/                      # User interface
│   └── particle_window.py   # Main GUI
├── data/                    # Data files
│   └── contacts.json        # Contact database
├── tests/                   # Test files
└── examples/                # Demo scripts
# Run all tests
python -m pytest tests/
# Run a specific test
python tests/test_all_functionalities.py

To add a new tool:

- Create a new file in `tools/<category>/`
- Define your function with a proper docstring
- Register it in `core/orchestrator.py`
See CONTRIBUTING.md for detailed guidelines.
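The steps for adding a tool (create a file, define a documented function, register it) might look roughly like the sketch below. Everything in it is an assumption for illustration: the real registration mechanism in `core/orchestrator.py` is not documented here, and `TOOL_REGISTRY`, `register_tool`, `run_tool`, and `count_words` are hypothetical names.

```python
from typing import Callable, Dict

# Hypothetical registry; the real registration API in core/orchestrator.py
# is not shown in this README and may look different.
TOOL_REGISTRY: Dict[str, Callable[..., str]] = {}


def register_tool(func):
    """Decorator that stores a tool under its function name."""
    # A docstring is required, mirroring the "proper docstring" guideline.
    if not func.__doc__:
        raise ValueError(f"tool {func.__name__!r} needs a docstring")
    TOOL_REGISTRY[func.__name__] = func
    return func


# Would live in a file such as tools/productivity/word_count.py (example name).
@register_tool
def count_words(text: str) -> str:
    """Count the words in a piece of text and report the total."""
    return f"{len(text.split())} words"


def run_tool(name: str, **kwargs) -> str:
    """Look up a registered tool by name and call it with keyword arguments."""
    return TOOL_REGISTRY[name](**kwargs)
```

A docstring matters in a design like this because an LLM-driven orchestrator typically needs a plain-language description of each tool to decide when to call it.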
groq>=0.4.0
requests>=2.31.0
python-dotenv>=1.0.0
pvporcupine>=3.0.0
speechrecognition>=3.10.0
pyttsx3>=2.90
pyaudio>=0.2.13
pyautogui>=0.9.54
pyperclip>=1.8.2
pynput>=1.7.6
Pillow>=10.0.0
Contributions are welcome! Please read CONTRIBUTING.md for guidelines.
- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Groq - Fast AI inference
- Picovoice - Wake word detection
- OpenRouter - AI model routing
- PyAutoGUI - Desktop automation
**Made by Parth**
⭐ Star this repo if you find it useful!