OpenJarvis

⚡ JARVIS: Just A Rather Very Intelligent System

"Sometimes you gotta run before you can walk." - Tony Stark



     ██╗ █████╗ ██████╗ ██╗   ██╗██╗███████╗
     ██║██╔══██╗██╔══██╗██║   ██║██║██╔════╝
     ██║███████║██████╔╝██║   ██║██║███████╗
██   ██║██╔══██║██╔══██╗╚██╗ ██╔╝██║╚════██║
╚█████╔╝██║  ██║██║  ██║ ╚████╔╝ ██║███████║
 ╚════╝ ╚═╝  ╚═╝╚═╝  ╚═╝  ╚═══╝  ╚═╝╚══════╝

  Just A Rather Very Intelligent System
  v1.0 - Stark Industries, R&D Division

OpenJarvis is a modular, open-source AI assistant backend built for people who think a chatbot is beneath them. It listens to your voice, sees your screen, knows your context, reacts to real-world events, and runs autonomous operators, all while you sip your scotch and work on the suit.


The Arc Reactor: Core Architecture

                        ┌─────────────────────────────────┐
                        │          YOU  (Stark)           │
                        └──────────────┬──────────────────┘
                                       │  voice / text / screen
                        ┌──────────────▼──────────────────┐
                        │         JARVIS  CORE            │
                        │                                 │
                  ┌─────┤  Intelligence   Engine          ├─────┐
                  │     │  Agents         Memory          │     │
                  │     │  Learning       EventBus        │     │
                  │     └──────────────┬──────────────────┘     │
                  │                    │                        │
         ┌────────▼──────┐   ┌─────────▼──────────┐  ┌──────────▼─────┐
         │  Voice Loop   │   │  Operators / Cron  │  │   Channels     │
         │  Wake Word    │   │  Event Triggers    │  │   Telegram     │
         │  STT -> TTS   │   │  File/HTTP/Metric  │  │   Discord      │
         └───────────────┘   └────────────────────┘  └────────────────┘

Five primitives (Intelligence, Engine, Agents, Memory, Learning) compose into anything from a voice-controlled desktop companion to a fleet of autonomous operators monitoring your infrastructure.


Suit Features

🎤 Voice Loop: "Jarvis, fire up the Mark V"

Full always-on voice pipeline. Say "Jarvis" → it wakes, listens, thinks, speaks back.

jarvis listen                                       # wake-word mode
jarvis listen --no-wake-word                        # always listening
jarvis listen --once                                # one shot, exits cleanly
jarvis listen --screenshot --screenshot-ocr         # Jarvis sees your screen too

Pipeline: Mic → Energy VAD → STT (Whisper/Deepgram) → Wake Word → Agent → TTS (Kokoro/OpenAI) → Playback
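For a feel of what the Energy VAD stage does, here is a minimal sketch, not the project's actual code. It assumes 16-bit PCM frames arrive as sequences of integers; `capture_utterance`, `threshold`, and `max_utterance_ms` are hypothetical names and values. Only `silence_timeout_ms` comes from the config shown later in this README. The hard cap is one possible answer to the "no max utterance duration" item in Known Issues.

```python
import math
from typing import Iterable, List, Sequence


def rms(frame: Sequence[int]) -> float:
    """Root-mean-square energy of one frame of PCM samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def capture_utterance(
    frames: Iterable[Sequence[int]],
    frame_ms: int = 30,
    threshold: float = 500.0,        # hypothetical energy threshold
    silence_timeout_ms: int = 1500,  # mirrors silence_timeout_ms in config.toml
    max_utterance_ms: int = 15000,   # hypothetical hard cap
) -> List[Sequence[int]]:
    """Collect frames from the first loud frame until trailing silence or the cap."""
    utterance: List[Sequence[int]] = []
    silent_ms = 0
    for frame in frames:
        loud = rms(frame) >= threshold
        if not utterance and not loud:
            continue  # still waiting for speech to start
        utterance.append(frame)
        silent_ms = 0 if loud else silent_ms + frame_ms
        if silent_ms >= silence_timeout_ms:
            break  # the speaker stopped talking
        if len(utterance) * frame_ms >= max_utterance_ms:
            break  # hard cap so constant noise cannot record forever
    return utterance
```

The returned frames include the trailing silence, which a real pipeline would probably trim before handing the audio to STT.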

👁️ Screen Awareness: "Enhance. Enhance."

Jarvis can see your displays. Capture full screen or a region, extract text with OCR, feed it to any LLM.

jarvis ask "what is this error?" --screenshot
jarvis ask "summarize the document on screen" --screenshot --screenshot-ocr
jarvis ask "describe the left monitor" --screenshot --screenshot-region 0,0,1920,1080
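Under the hood, a region capture with OCR could look roughly like the sketch below. It assumes the four numbers passed to `--screenshot-region` mean left, top, width, height; the README does not say, so treat that as a guess. `parse_region` and `capture_text` are hypothetical helpers, not OpenJarvis functions. The `mss`, Pillow, and `pytesseract` calls are the libraries' own.

```python
from typing import Dict


def parse_region(spec: str) -> Dict[str, int]:
    """Turn "left,top,width,height" into the dict shape that mss grab() accepts."""
    parts = [int(p.strip()) for p in spec.split(",")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 comma-separated integers, got {spec!r}")
    left, top, width, height = parts
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return {"left": left, "top": top, "width": width, "height": height}


def capture_text(spec: str) -> str:
    """Grab a screen region and OCR it.

    Needs mss, Pillow, pytesseract, and the Tesseract binary on PATH.
    """
    # Imported lazily so parse_region stays usable without the optional packages.
    import mss
    import pytesseract
    from PIL import Image

    with mss.mss() as sct:
        shot = sct.grab(parse_region(spec))
    image = Image.frombytes("RGB", shot.size, shot.rgb)
    return pytesseract.image_to_string(image)
```

Calling `capture_text("0,0,1920,1080")` would return whatever text Tesseract finds in that rectangle, ready to be placed into an LLM prompt.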

🧠 Personal Context: "I'm always online, sir"

A structured, living profile that Jarvis always knows: your identity, contacts, active projects, preferences.

jarvis profile import                           # first-run wizard
jarvis profile show
jarvis profile set name "Tony Stark"
jarvis profile prefer "never send emails without my go-ahead"
jarvis profile contact add "Pepper" --role ceo --note "handles everything"
jarvis profile project add "Mark VIII" --status active --desc "repulsor upgrade"

⚡ Event-Driven Operators: "Alert protocol 7"

Autonomous agents that wake up and act when things happen in the real world, not just on a timer.

[[operator.event_triggers]]
type    = "file"
path    = "~/inbox"
pattern = "*.pdf"
events  = ["created"]

[[operator.event_triggers]]
type      = "system_metric"
metric    = "cpu_percent"
threshold = 85.0
operator  = ">"

[[operator.event_triggers]]
type           = "http_poll"
url            = "https://status.openai.com"
fire_on_change = true

[[operator.event_triggers]]
type         = "bus_event"
event_type   = "channel_message_received"
filter_key   = "channel"
filter_value = "telegram"

Four trigger types: file changes, system metrics (CPU/RAM/disk), HTTP content changes, internal event bus.
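To make the four trigger types concrete, here is a rough sketch of how one trigger table from the TOML above might be checked against an observed event. The trigger keys are the ones in the example config. The `event` dictionaries and the `should_fire` function are hypothetical and only illustrate the matching logic; the real watcher code may be organised quite differently.

```python
import fnmatch
import operator
from typing import Any, Dict

# Comparison strings as they might appear in a trigger's `operator` key.
COMPARATORS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


def should_fire(trigger: Dict[str, Any], event: Dict[str, Any]) -> bool:
    """Decide whether one observed event satisfies one trigger definition."""
    kind = trigger["type"]
    if kind == "file":
        return (event["event"] in trigger["events"]
                and fnmatch.fnmatch(event["filename"], trigger["pattern"]))
    if kind == "system_metric":
        compare = COMPARATORS[trigger["operator"]]
        return (event["metric"] == trigger["metric"]
                and compare(event["value"], trigger["threshold"]))
    if kind == "http_poll":
        # fire_on_change: fire only when the body differs from the previous poll
        return bool(trigger.get("fire_on_change", False)
                    and event["body_hash"] != event["previous_hash"])
    if kind == "bus_event":
        return (event["event_type"] == trigger["event_type"]
                and event.get(trigger["filter_key"]) == trigger["filter_value"])
    raise ValueError(f"unknown trigger type: {kind}")
```

With the CPU trigger from the config, a reading of 91.2 fires and a reading of exactly 85.0 does not, because the configured operator is a strict `>`.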

🤖 9 Agent Types: "Deploy the drones"

From a simple one-shot responder to a full ReAct loop with tool use:

Agent         What it does
simple        Direct Q&A: fast, no tools
orchestrator  Breaks tasks into subtasks, delegates
native_react  ReAct loop with tool calls
operative     Operator-grade autonomous executor
critic        Self-critiques and revises output
planner       Long-horizon planning
summarizer    Distils and compresses
multimodal    Vision + text
code          Code generation and execution
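For readers new to the pattern, the ReAct loop behind an agent like native_react can be sketched in a few lines. This is a generic illustration of the technique, not the OpenJarvis implementation: `react_loop`, the `Model` callable shape, and `scripted_model` (a stand-in for a real LLM) are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# A "model" here is any callable that maps the transcript so far to either
# ("call", tool_name, tool_input) or ("final", answer, "").
Model = Callable[[List[str]], Tuple[str, str, str]]


def react_loop(model: Model, tools: Dict[str, Callable[[str], str]],
               question: str, max_steps: int = 5) -> str:
    """Alternate between model decisions and tool observations until an answer appears."""
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        kind, name_or_answer, tool_input = model(transcript)
        if kind == "final":
            return name_or_answer
        tool = tools.get(name_or_answer)
        observation = tool(tool_input) if tool else f"unknown tool: {name_or_answer}"
        transcript.append(f"Action: {name_or_answer}({tool_input})")
        transcript.append(f"Observation: {observation}")
    return "stopped: step limit reached"


def scripted_model(transcript: List[str]) -> Tuple[str, str, str]:
    """Stand-in for an LLM: look something up once, then answer from the observation."""
    if not any(line.startswith("Observation:") for line in transcript):
        return ("call", "lookup", "Mark VIII status")
    return ("final", transcript[-1].removeprefix("Observation: "), "")
```

Running `react_loop(scripted_model, {"lookup": lambda q: "active"}, "What is the status of Mark VIII?")` takes one tool step and then returns the observation as the answer. The step limit is what keeps a confused model from looping forever.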

💾 Multi-Layer Memory

SQLite (default) + FAISS vector search + BM25 + ColBERT reranking. Context is automatically injected into every query, so Jarvis remembers.
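Of the layers listed, BM25 is the easiest to show in isolation. The sketch below is a textbook Okapi BM25 scorer with naive whitespace tokenisation, using the common defaults `k1 = 1.5` and `b = 0.75`. It is only meant to illustrate the keyword-ranking step; how OpenJarvis actually combines BM25 with FAISS and ColBERT is not described here, and `bm25_scores` is a hypothetical name.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: str, docs: List[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score each document against the query with Okapi BM25."""
    if not docs:
        return []
    tokenised = [doc.lower().split() for doc in docs]
    avg_len = sum(len(tokens) for tokens in tokenised) / len(tokenised)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    scores = []
    for tokens in tokenised:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (len(docs) - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Documents that share no terms with the query score exactly zero, which is why a vector search layer is a useful complement: it can still surface memories that are phrased differently.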

🔌 20+ Channel Integrations

Telegram, Discord, Slack, Gmail, Twitter/X, Reddit, Twilio, and more.

jarvis channel add telegram --token YOUR_BOT_TOKEN
jarvis channel add discord --token YOUR_BOT_TOKEN

Installation

# Clone the suit
git clone https://github.com/akhilyad/__OpenJarvis.git
cd __OpenJarvis

# Sync with uv (recommended)
uv sync

# Or pip
pip install -e .

Voice pipeline

uv sync --extra voice --extra speech
# Optional: better VAD
uv sync --extra voice-vad
# Optional: Kokoro local TTS
pip install kokoro

Screen awareness

pip install mss Pillow          # capture
pip install pytesseract         # OCR (also needs Tesseract binary)

Event-driven operators

uv sync --extra operators-events   # psutil + watchdog

First Boot

jarvis init                 # initialise ~/.openjarvis/
jarvis profile import       # tell Jarvis who you are
jarvis doctor               # health check
jarvis ask "hello, Jarvis"  # first contact

Quick Commands

# Ask
jarvis ask "what is the weather in Kolkata?"
jarvis ask "draft a reply to this email" --screenshot --screenshot-ocr
jarvis chat                                          # interactive session

# Voice
jarvis listen                                        # always-on voice loop
jarvis listen --no-wake-word --once                  # one command, done

# Agents + Tools
jarvis ask "search and summarise AI news" --agent orchestrator --tools web_search
jarvis ask "write and run this script"   --agent code --tools shell_exec

# Memory
jarvis memory search "project deadline"
jarvis memory add "Mark VIII repulsor upgrade due 2026-05-01"

# Operators
jarvis operators list
jarvis operators activate inbox-monitor
jarvis operators run-once inbox-monitor

# Profile
jarvis profile show
jarvis profile prefer "always use bullet points"

# System
jarvis doctor                    # health check
jarvis model list                # available models
jarvis serve                     # start REST API server

Configuration

Config lives at ~/.openjarvis/config.toml:

[intelligence]
default_model    = "gpt-4o"
preferred_engine = "openai"

[speech]
backend          = "faster_whisper"
wake_word        = "jarvis"
vad_engine       = "energy"
tts_backend      = "kokoro"
silence_timeout_ms = 1500

[memory]
default_backend = "sqlite"

[telemetry]
enabled = true

Optional Extras

Extra             What you get
voice             sounddevice + soundfile (mic + playback)
voice-vad         webrtcvad (better speech detection in noise)
voice-wakeword    openwakeword (hot-word model, no STT in hot path)
bundle-voice      voice + speech + kokoro TTS
screen            mss + Pillow (screen capture + resize)
screen-ocr        + pytesseract (text extraction from screen)
operators-events  psutil + watchdog (event-driven operators)
memory-faiss      FAISS vector search
inference-cloud   OpenAI + Anthropic
inference-mlx     Apple MLX (macOS only)
inference-vllm    vLLM (GPU server)

# Full suit: everything
uv sync --extra bundle-voice --extra screen --extra operators-events --extra memory-faiss

Roadmap: The Suit's Still Being Built

  • Feature 1: Voice Loop (jarvis listen)
  • Feature 2: Event-Driven Operators
  • Feature 3: Personal Context Layer
  • Feature 4: Screen Awareness
  • Feature 5: HUD / Heads-Up Display
  • Feature 6: Home Automation Bridge
  • Feature 7: Minions Protocol (multi-agent swarm)
  • Feature 8: Agent-to-Agent (A2A) communication
  • Feature 9: Self-Improving Prompts
  • Feature 10: Mobile Companion App

Known Issues

"Fortunately, I am Iron Man." And even I have a punch list.

#  Severity  Issue
1  CRITICAL  screen_capture not auto-registered in ToolRegistry
2  HIGH      voice/loop.py imports from CLI layer (arch violation)
3  HIGH      _can_fire() race condition in watcher threads
4  HIGH      VAD has no max utterance duration limit
5  MEDIUM    profile/store.py reads file without explicit UTF-8 encoding
6  MEDIUM    Screen resize silently skipped if Pillow absent
All six are slated to be fixed in the next sprint.
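For issue 3, one common way to close this kind of race is to do the "has the cooldown elapsed?" check and the "record that we fired" update under a single lock. The sketch below assumes `_can_fire()` is a cooldown check shared by several watcher threads, which is a guess from its name. `CooldownGate` and `try_fire` are hypothetical and not part of the codebase.

```python
import threading
import time
from typing import Callable, Optional


class CooldownGate:
    """Allow at most one firing per cooldown window, even with many watcher threads."""

    def __init__(self, cooldown_s: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._cooldown_s = cooldown_s
        self._clock = clock
        self._last_fired: Optional[float] = None
        self._lock = threading.Lock()

    def try_fire(self) -> bool:
        """Check and claim the slot atomically; True means the caller may fire."""
        with self._lock:
            now = self._clock()
            if self._last_fired is not None and now - self._last_fired < self._cooldown_s:
                return False
            # Claimed inside the lock, so a second thread cannot pass the check above.
            self._last_fired = now
            return True
```

The point of the design is that checking and claiming are one step. If they are separate calls, two threads can both see "allowed" before either records the firing, and the operator runs twice.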


Contributing

Pull requests welcome. If you break the suit, fix the suit.

  1. Fork
  2. git checkout -b feature/repulsor-upgrade
  3. git commit -m 'add repulsor upgrade'
  4. git push origin feature/repulsor-upgrade
  5. Open a PR

License

MIT. "I prefer to think of it as liberating."


Built with arc reactor energy.

"Jarvis, sometimes I think you're the only one who gets me."
