Open-Source Intelligent Command Layer
What if your computer could actually hear you, see what you see, and speak back?
Your voice. Your computer. Your rules.
One AI system that replaces the 15 apps you click through every day — drafts emails, runs research, writes code, controls your lights, reads your screen. 50+ skills. 12 autonomous agents. Zero tabs.
opencodec.org · AVA Digital LLC · Get Started · Support · Enterprise
You spend 4+ hours a day switching tabs, copy-pasting, formatting emails, checking calendars, re-reading Slack threads, Googling the same things, clicking through the same 15 steps to get one task done.
AI chatbots don't fix this. They're another tab. You still copy, paste, switch, click.
| Product | One-liner | |
|---|---|---|
| 1 | CODEC Core | 50+ voice skills, keyboard shortcuts, direct chat replies — your Mac's AI command layer |
| 2 | CODEC Dictate | Hold a key, speak, release — text appears in any app instantly |
| 3 | CODEC Instant | Select text, right-click — AI rewrites, translates, replies in-place. Instant. |
| 4 | CODEC Chat | 250K-context chat with file uploads, vision, web search, and 12 autonomous agent crews |
| 5 | CODEC Vibe | AI coding IDE — describe what you want, watch it build, live preview in browser |
| 6 | CODEC Voice | Call your AI like a phone call — speak to interrupt, live webcam feed, it acts while you talk |
| 7 | CODEC Overview | Your AI dashboard — every tool, every agent, one screen |
Reply to any message without typing. Slack pings. WhatsApp buzzes. iMessage lights up. You don't switch apps, don't open keyboards. Press a key, say "reply saying I'll be there in 10, casual tone" — CODEC reads the conversation, drafts the reply, and pastes it directly into the chat input. You hit send. Done.
Keyboard shortcuts that replace entire workflows. Hold F18 → speak → release. Your voice becomes a command. Double-tap * → CODEC screenshots your screen and tells you what it sees. Select any text → right-click → CODEC Instant rewrites it, translates it, proofreads it, explains it. No app to open. No tab to switch to. It's already there.
Your screen is the context. CODEC reads what's in front of you — your IDE, your browser, your email. "What's wrong with this code?" It sees the error. "Summarize this article" — it reads the page. "Fill in this form" — it types into the fields. No copy-paste. No explaining what you're looking at.
Live webcam, not just screenshots. Click the camera icon on any page — a draggable picture-in-picture feed appears in the corner. Point it at your whiteboard, your front door, a document on your desk. Hit the snapshot button and CODEC analyzes what it sees. The feed stays open while you keep working.
AI agents that work while you don't. Not chat responses — full autonomous workflows. "Deep research AI in healthcare" → 8 agents fan out, run 20+ searches, write a 10,000-word report with images, deliver it to your Google Docs. Schedule any crew to run on repeat — morning briefings, competitor analysis, inbox triage — all on cron.
Nothing leaves your machine. Run Qwen, Llama, or Mistral locally. Conversations stored in local SQLite. No cloud. No telemetry. No analytics. End-to-end encrypted between your browser and server (AES-256-GCM + ECDH key exchange). Your data is yours. Period.
| Instead of this... | With CODEC |
|---|---|
| Open Gmail → scan 47 emails → draft 3 replies → format → send | "Check my email, flag anything urgent, draft replies" |
| Open Google Docs → blank page → research → write 10 pages → images → format | "Deep research AI in healthcare, save to Docs" — 10,000-word report delivered |
| Slack notification → switch app → read thread → type reply → proofread → send | Press key → "Reply saying I'll review it tonight" → pasted into Slack |
| Copy error → browser → paste → Stack Overflow → try fix → repeat | "Read my screen and fix this error" — never leaves your editor |
| Open competitor site → notes → pricing → write analysis | "Run competitor analysis on [company]" — SWOT delivered to Docs |
| Open 6 tabs → read news → check industry → take notes | Runs automatically at 8am — briefing waiting in your Drive |
| Select text → copy → open translator → paste → copy result → paste back | Right-click → CODEC Translate — replaced in-place, one click |
| Write LinkedIn post → rewrite 4 times → check tone → proofread | "Write a LinkedIn post about [topic]" — polished, ready to post |

Chat — ask anything, drag & drop files, full conversation history

Deep Chat — upload files, select agents, get structured analysis

Voice Call — real-time conversation with live transcript

Vibe Code — describe what you want, get working code with live preview

Deep Research — multi-agent reports delivered to Google Docs

Scheduled automations — morning briefings, competitor analysis, on cron
More screenshots

Settings — LLM, TTS, STT, hotkeys, wake word configuration

Touch ID + PIN + 2FA authentication
git clone https://github.com/AVADSA25/codec.git
cd codec
./install.sh # one-line setup wizard
python3 codec.py # start CODECOr step by step:
pip3 install -r requirements.txt
python3 setup_codec.py # guided 9-step configuration
python3 codec.pyRequires macOS. Python 3.10+. Linux support planned.
You speak → Whisper STT → intent dispatch → skill / agent crew → action on your Mac
Triggers:
| Input | What happens |
|---|---|
| Hold F18, speak, release | Voice command — say it and it's done |
| Double-tap F18 | PTT Lock — hands-free recording, tap again to stop |
| F16 / F9 | Type a command instead of speaking |
Double-tap * * |
Screenshot + AI reads your screen |
Double-tap + + |
Analyze document in clipboard |
| Camera icon (any page) | Live webcam PIP — drag around, snapshot anytime |
| Select text → right-click | 8 AI services in context menu |
Grouped by what they do, not marketing categories:
Your day: Google Calendar, Gmail, Google Tasks, Google Keep, daily briefing, timer, pomodoro Your files: Google Drive, Google Docs, Google Sheets, Google Slides, file search, clipboard Your browser: open sites, search, read pages, fill forms, extract data, scroll, manage tabs, automate morning routines Your writing: draft emails, proofread, elevate, translate, explain, reply, LinkedIn posts Research: web search, URL summarizer, deep research (10,000-word multi-agent reports), competitor analysis Your Mac: process manager, network info, brightness, screenshot OCR, terminal commands, AX bridge (click any button in any app) Coding: Vibe IDE with Monaco editor, live preview, inspect mode, Skill Forge (auto-generate plugins) Smart home: Philips Hue lights — on/off, brightness, colors, scenes, room targeting Meta: memory search, skill marketplace (install/publish), scheduler (cron agents)
Not single prompts — full multi-step AI workflows that run autonomously:
| Crew | What you get |
|---|---|
| Deep Research | 10,000-word report with images → Google Docs |
| Daily Briefing | Morning industry news + your calendar → Google Docs |
| Competitor Analysis | SWOT + competitive positioning → Google Docs |
| Trip Planner | Full itinerary with hotels, flights, activities → Google Docs |
| Email Handler | Triage inbox, draft replies, summarize threads |
| Social Media | Platform-specific posts for Twitter, LinkedIn, Instagram |
| Code Review | Bug hunt + security audit + clean code suggestions |
| Data Analysis | Gather data, find trends, write insights report |
| Content Writer | Blog posts, articles, marketing copy |
| Meeting Summarizer | Extract action items from transcripts |
| Invoice Generator | Create and send professional invoices |
| Custom Agent | Build your own — define role, tools, task |
Schedule any crew: "Run competitor analysis every Monday at 9am"
Select text anywhere → right-click:
| Service | Result |
|---|---|
| Proofread | Grammar, spelling, clarity — fixed and replaced |
| Elevate | Rewritten at executive level |
| Translate | Translated to English (or configured language) |
| Explain | Plain-English explanation |
| Reply | Smart reply with optional :tone syntax |
| Prompt | Optimized as an LLM prompt |
| Read Aloud | Spoken via Kokoro TTS |
| Save | Saved to Google Keep or local notes |
CODEC exposes tools as an MCP server. Any MCP-compatible client can invoke CODEC skills directly:
{
"mcpServers": {
"codec": {
"command": "python3",
"args": ["/path/to/codec-repo/codec_mcp.py"]
}
}
}Then in Claude Desktop: "Use CODEC to check my calendar for tomorrow."
Skills opt-in to MCP exposure with SKILL_MCP_EXPOSE = True.
This isn't a marketing section. It's the architecture.
Your data never leaves. CODEC runs on your machine. Conversations, files, calendar data, memory — stored locally in SQLite. No cloud sync. No analytics endpoint. No telemetry. Check the source.
Run any LLM locally. Qwen, Llama, Mistral, Gemma — via MLX, Ollama, or LM Studio. Zero API calls if you want. Or use cloud APIs (OpenAI, Claude, Gemini) — your choice.
5-layer security stack for remote access. Cloudflare Zero Trust tunnel (or Tailscale VPN — no domain needed) → PIN → Touch ID biometrics → TOTP 2FA (Google Authenticator / Authy) → E2E encryption (AES-256-GCM + ECDH P-256 key exchange). Every request between browser and server is encrypted end-to-end on top of TLS.
Command safety. Dangerous command blocklist. Subprocess isolation with resource limits (512MB RAM, 120s CPU). Review-and-approve gate before any script runs. LLM-generated skills require human review.
Memory is yours. Full-text search (SQLite FTS5) across every conversation — but only on your machine. Parameterized queries prevent injection. No external memory service.
| Layer | Protection |
|---|---|
| Network | Cloudflare Zero Trust tunnel or Tailscale VPN, CORS restricted origins |
| Encryption | AES-256-GCM + ECDH P-256 key exchange, per-session keys |
| Auth | Touch ID + PIN + TOTP 2FA, timing-safe token comparison |
| Sessions | SameSite=Strict, CSRF tokens, conditional Secure flag |
| Execution | Subprocess isolation, resource limits, command blocklist |
| Skills | Blocked imports, human review gate, SHA-256 marketplace verification |
| Data | Local SQLite, parameterized queries, FTS5 sanitization |
| Model | How to run |
|---|---|
| Qwen 3.5 35B (recommended) | mlx-lm.server --model mlx-community/Qwen3.5-35B-A3B-4bit |
| Llama 3.3 70B | mlx-lm.server --model mlx-community/Llama-3.3-70B-Instruct-4bit |
| Mistral 24B | mlx-lm.server --model mlx-community/Mistral-Small-3.1-24B-Instruct-2503-4bit |
| Gemma 3 27B | mlx-lm.server --model mlx-community/gemma-3-27b-it-4bit |
| GPT-4o (cloud) | "llm_url": "https://api.openai.com/v1" |
| Claude (cloud) | OpenAI-compatible proxy |
| Ollama (any model) | "llm_url": "http://localhost:11434/v1" |
Configure in ~/.codec/config.json:
{
"llm_url": "http://localhost:8081/v1",
"model": "mlx-community/Qwen3.5-35B-A3B-4bit"
}Extended keyboard (F13-F18):
| Key | Action |
|---|---|
| F13 | Toggle CODEC ON/OFF |
| F18 (hold) | Record voice → release to send |
| F18 (double-tap) | PTT Lock — hands-free recording |
| F16 | Text input dialog |
* * |
Screenshot + AI analysis |
+ + |
Document mode |
Laptop (F1-F12): F5 = toggle, F8 = voice, F9 = text input
Custom shortcuts in ~/.codec/config.json. Restart after changes: pm2 restart open-codec
Keys don't work
- Laptop? Run
python3 setup_codec.py→ select "Laptop / Compact" in Step 4 - macOS stealing F-keys? System Settings → Keyboard → "Use F1, F2, etc. as standard function keys"
- After config change:
pm2 restart open-codec
Wake word doesn't trigger
- Check Whisper:
pm2 logs whisper-stt --lines 5 --nostream - Check mic permission: System Settings → Privacy → Microphone
- Say "Hey CODEC" clearly — 3 distinct syllables
- 4-layer noise gate handles most backgrounds, but loud music near the mic can interfere
No voice output
- Check Kokoro TTS:
curl http://localhost:8085/v1/models - Fallback:
"tts_engine": "say"in config.json (macOS built-in) - Disable:
"tts_engine": "none"
Dashboard not loading
- Check:
curl http://localhost:8090/ - Restart:
pm2 restart codec-dashboard - Remote via Cloudflare:
pm2 logs cloudflared --lines 3 --nostream - Remote via Tailscale: install Tailscale on your Mac and phone — access CODEC at
http://100.x.x.x:8090with no domain or tunnel setup needed
Skills not loading
- Check:
pm2 logs open-codec --lines 20 --nostream | grep -i skill - Count:
ls ~/.codec/skills/*.py | wc -l
Agents failing
- First run takes 2-5 min — multi-step research
- Check:
pm2 logs codec-dashboard --lines 30 --nostream | grep Agents - Agents run as background jobs — no Cloudflare timeout
codec.py — Entry point
codec_config.py — Configuration + transcript cleaning
codec_keyboard.py — Keyboard listener, PTT lock, wake word
codec_dispatch.py — Skill matching and dispatch
codec_agent.py — LLM session builder
codec_agents.py — Multi-agent crew framework (12 crews)
codec_voice.py — WebSocket voice pipeline
codec_voice.html — Voice call UI
codec_dashboard.py — Web API + dashboard (60+ endpoints)
codec_dashboard.html — Dashboard UI
codec_chat.html — Chat UI
codec_vibe.html — Vibe Code IDE
codec_auth.html — Authentication (Touch ID + PIN + TOTP 2FA)
codec_textassist.py — 8 right-click services
codec_search.py — DuckDuckGo + Serper search
codec_mcp.py — MCP server
codec_memory.py — FTS5 memory search
codec_heartbeat.py — Health monitoring + task auto-execution
codec_scheduler.py — Cron-like agent scheduling
codec_marketplace.py — Skill marketplace CLI
ax_bridge/ — Swift AX accessibility bridge
swift-overlay/ — SwiftUI status bar app
skills/ — 50+ built-in skills
tests/ — 212+ pytest tests
install.sh — One-line installer
setup_codec.py — Setup wizard (9 steps)
- Linux support
- Windows via WSL
- Multi-machine sync (skills + memory across devices)
- iOS app (dictation + remote dashboard)
- Streaming voice responses (first token plays while rest generates)
- Multi-LLM routing (fast model for simple, strong model for complex)
All skill contributions welcome. 50+ built-in, marketplace growing.
git clone https://github.com/AVADSA25/codec.git
cd codec && ./install.sh
python3 -m pytest # all tests must passSee CONTRIBUTING.md.
If CODEC saves you time:
- Star this repo
- Donate via PayPal — ava.dsa25@proton.me
- Enterprise setup: avadigital.ai
Need CODEC configured for your business, integrated with your tools, or deployed across a team?
Contact AVA Digital for professional setup and custom skill development.
Built by AVA Digital LLC · MIT License



