Open-source desktop AI agent with screen awareness and GUI control. macOS first.
Built on the pi-mono agent engine, which provides the TUI, agent loop, coding tools, session persistence, compaction, multi-provider support, and model switching for free.
- Coding tools — read, write, edit, bash, grep, find, ls (from pi-mono)
- Screenshot + VLM analysis — captures screen, analyzes with vision model
- Desktop automation — click, type, keyboard shortcuts, open/activate apps
- Web search — Tavily-powered web search
- Calendar — create events and list calendars via macOS Calendar.app
- Session persistence — resume conversations across restarts
- Multi-provider — Anthropic, OpenAI, Google, Groq, Mistral, DeepSeek, xAI
- Interactive TUI — full terminal UI with model switching (Ctrl+P) and thinking levels
# Clone
git clone https://github.com/FoundDream/PiDesk.git
cd PiDesk
# Install dependencies (requires Bun)
bun install
# For click actions
brew install cliclick
# Configure
cp .env.example .env
# Edit .env with your API keys

# Interactive mode
bun run dev
# With initial message
bun run dev "take a screenshot and describe what you see"
# Quick commands
bun run dev "open Calculator"
bun run dev "search for TypeScript 5.0 features"
bun run dev "create a calendar event tomorrow at 3pm for team meeting"

Create a .env file:
# Required: at least one LLM provider
ANTHROPIC_API_KEY=sk-ant-...

PiDesk adds 4 custom tools on top of pi-mono's agent engine:
pi-mono agent engine
├── TUI (InteractiveMode)
├── Agent loop + compaction
├── Coding tools (read/write/edit/bash/grep/find/ls)
├── Session persistence
└── Multi-provider + model switching
PiDesk custom tools
├── screenshot — screencapture + VLM analysis
├── desktop_action — cliclick + AppleScript GUI control
├── web_search — Tavily API
└── calendar — macOS Calendar.app via AppleScript
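
The tree above says desktop_action drives the GUI through cliclick and AppleScript, and screenshot wraps screencapture. The following is a minimal sketch of how such actions could be mapped to command lines. The `DesktopAction` type and the `buildCommand` and `screenshotCommand` helpers are hypothetical names chosen for illustration, not PiDesk's actual tool schema. The cliclick verbs used (`c:` click, `t:` type, `kp:` key press), `osascript -e`, and `screencapture -x` are standard options of those CLIs.

```typescript
// Hypothetical action shape; PiDesk's real tool schema may differ.
type DesktopAction =
  | { kind: "click"; x: number; y: number }
  | { kind: "type"; text: string }
  | { kind: "key"; key: string } // a cliclick key name such as "return"
  | { kind: "activate"; app: string };

// Build the argv for one action without running it, so the mapping
// stays a pure function that is easy to inspect and test.
function buildCommand(action: DesktopAction): string[] {
  switch (action.kind) {
    case "click":
      // cliclick expects integer screen coordinates: c:x,y
      return ["cliclick", `c:${Math.round(action.x)},${Math.round(action.y)}`];
    case "type":
      return ["cliclick", `t:${action.text}`];
    case "key":
      return ["cliclick", `kp:${action.key}`];
    case "activate": {
      // Escape backslashes and double quotes for the AppleScript string literal.
      const name = action.app.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
      return ["osascript", "-e", `tell application "${name}" to activate`];
    }
  }
}

// screencapture -x captures the screen silently (no shutter sound).
function screenshotCommand(path: string): string[] {
  return ["screencapture", "-x", path];
}
```

Under Bun, an argv array like this could be passed to `Bun.spawn`. Passing arguments as an array rather than a single shell string avoids shell quoting problems when the typed text contains spaces or quotes.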
- macOS (for screencapture, AppleScript, cliclick)
- Bun runtime
- cliclick (brew install cliclick) for click actions
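
The requirements above could be verified at startup with a small preflight check. This is a sketch only: `Environment` and `missingRequirements` are hypothetical names, and the source does not say whether PiDesk performs such a check. The platform and binary lookup are injected so the function can be exercised on any machine.

```typescript
// Hypothetical preflight helper; not part of PiDesk's documented API.
interface Environment {
  platform: string; // e.g. process.platform
  hasBinary: (name: string) => boolean; // e.g. (n) => Bun.which(n) !== null
}

// Return a human-readable list of unmet requirements (empty when all are met).
function missingRequirements(env: Environment): string[] {
  const missing: string[] = [];
  if (env.platform !== "darwin") {
    missing.push("macOS (screencapture and AppleScript are macOS-only)");
  }
  if (!env.hasBinary("cliclick")) {
    missing.push("cliclick (install with: brew install cliclick)");
  }
  return missing;
}
```

When running under Bun, the check could be called as `missingRequirements({ platform: process.platform, hasBinary: (n) => Bun.which(n) !== null })`.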
MIT