IronMic

Fully-local voice transcription and text-to-speech.
Speak, transcribe, polish, listen back — all on-device. Zero network. Zero telemetry.

Download · Features · Quick Start · Security · Audit · MIT License

IronMic captures your voice, transcribes it with Whisper, and copies the raw transcript to your clipboard or routes it to the active page — dictate, search, listen, notes, or timeline. Optionally polish text with a local LLM on-demand, or hear it read back with a built-in TTS engine. Everything runs entirely on your machine — no audio or text ever leaves the device.

Features

Core Dictation

Voice-to-clipboard — Press a global hotkey, speak, press again. Raw transcript lands in your clipboard instantly, ready to paste anywhere.
Page-aware recording — Voice input routes to the active page: dictate page inserts into the editor, search page fills the search bar, listen page queues for TTS playback, notes page appends to the active note.
Progressive feedback — See your transcript appear in the timeline immediately after recording stops, with a live progress indicator. No more waiting for the full pipeline to complete.
Whisper large-v3-turbo — State-of-the-art local speech recognition with GPU acceleration (Metal on macOS).
On-demand LLM text cleanup — Click "Polish now" on any timeline entry to clean up text with a local Mistral 7B model. Removes filler words, fixes grammar, preserves your meaning. Raw text is always shown first — you control when cleanup runs.
Custom dictionary — Add domain-specific terms, names, and jargon to improve transcription accuracy.

Text-to-Speech Read-Back

Kokoro 82M TTS — Hear your dictations read back through a local neural voice engine.
15 English voices — American and British accents, male and female. Preview and switch in Settings.
Speed control — 0.5x to 2.0x playback speed.
Word-level highlighting — Words highlight in sync as they're spoken (karaoke-style).
Auto read-back — Optionally read text aloud automatically after dictation completes.

AI Assistant

Built-in AI chat — Wrapper around GitHub Copilot CLI and Claude Code CLI.
Context-aware — Ask questions, refine text, brainstorm — powered by your existing AI subscriptions.
Streaming responses — Real-time token-by-token output.
Privacy-first — The AI feature is off by default. When enabled, it uses your own CLI tools and credentials.

On-Device Machine Learning ^{NEW in 1.1.0}

IronMic includes 5 TensorFlow.js-powered ML features that learn from your usage patterns — entirely on-device.

Voice Activity Detection — Filters silence and noise before Whisper, making transcription faster. Configurable sensitivity.
Turn Detection — Detects when you're done speaking. Three modes: push-to-talk, auto-detect (default 3s silence), and always-listening. Enables hands-free conversation loops with the AI assistant.
Voice Commands — Say "search for meeting notes", "open analytics", or "summarize today" — IronMic classifies the intent and executes the action. LLM fallback for complex commands. Voice correction: "no, I meant...".
Context-Aware Routing — Voice input automatically routes to dictation, AI conversation, or command classification based on your active screen. No manual mode switching.
Ambient Meeting Mode — Leave IronMic running during a meeting. It transcribes with speaker awareness, detects when the meeting ends, and generates a summary with action items.
Semantic Search — Search by meaning, not just keywords. "Find everything about the auth redesign" returns semantically related entries across all content.
Smart Notifications — Learns which notifications matter to you and ranks them by predicted importance. Starts with heuristics, improves over time.
Workflow Discovery — Detects repeating action patterns (e.g., "You check GitHub PRs, then update Jira, then dictate a status update every Thursday at 2 PM") and suggests automations.

All ML models are under 50MB total. All training data stays in SQLite on your machine. Toggle each feature independently in Settings > Voice AI.

Organization & Search

Unified search — Search across all content types from one place: dictation entries (FTS5), meeting notes, AI chat conversations, and written notes. Filter by category, see highlighted matches.
Timeline view — Scrollable card feed of all dictations, newest first. Entries show raw text by default with on-demand polishing.
Full-text search — Instant search across all transcriptions (SQLite FTS5) and meeting sessions (name, summary, transcript, action items).
Semantic search — AI-powered meaning-based search across all content (TF.js Universal Sentence Encoder).
Tags — Categorize entries with custom tags.
Pin & archive — Pin important entries to the top, archive old ones.
Raw vs. polished toggle — Switch between original transcript and LLM-cleaned version on any entry.
Auto-cleanup — Configure automatic deletion of entries older than N days.

Analytics ^{NEW in 1.0.15}

Dashboard — Daily word counts, recording time, words per minute, streaks.
Topic classification — LLM-powered topic tagging with trend charts.
Vocabulary richness — Type-Token Ratio and unique word tracking.
Productivity comparison — Period-over-period metrics.

Model Management

In-app downloads — Download Whisper, LLM, and TTS models directly from Settings.
Manual model import — If downloads are blocked by a corporate proxy, download models in your browser and import them into IronMic with a file picker. The app validates and copies them to the right location.
HTTP proxy support — Configure a proxy in Settings > Security for corporate networks (HTTP/HTTPS/SOCKS5).
Multiple Whisper sizes — Switch between tiny, base, small, medium, and large models.
GPU acceleration — Detect and enable Metal (macOS) or CUDA for faster inference.
Progress tracking — Download progress bars with percentage indicators.

Design & Customization

Dark / Light / System theme — Full theme support with auto mode that follows your OS preference.
Configurable hotkey — Visual key recorder with conflict detection.
Rich text editor — Bold, italic, headings, lists, blockquotes, code, highlights, and more.
Enterprise design system — Clean, professional UI built on Inter font with the IronMic blue accent.

Architecture

Electron UI ← IPC (contextBridge) → Rust Core (napi-rs)
  ├── React 18 + Zustand                ├── Audio capture (cpal)
  ├── TF.js ML Web Worker               ├── Whisper.cpp (speech-to-text)
  │   ├── VAD (Silero)                   ├── llama.cpp (text cleanup)
  │   ├── Intent classifier              ├── Kokoro ONNX (text-to-speech)
  │   ├── Semantic search (USE)          ├── Audio playback (cpal)
  │   ├── Notification ranker            ├── SQLite (storage + FTS5)
  │   └── Workflow predictor             └── Clipboard (arboard)
  └── Web Audio API (AudioWorklet)

Layer	Tech	Purpose
UI	Electron + React 18 + Tailwind CSS + Zustand	Desktop shell, component UI, state management
Editor	TipTap (ProseMirror)	Rich text editing
Bridge	napi-rs (N-API)	Typed Rust ↔ Node.js communication
Audio	cpal + Web Audio API	Cross-platform mic input, real-time VAD frames
STT	whisper-rs (whisper.cpp)	Local speech-to-text
LLM	llama-cpp-rs (llama.cpp)	Local text cleanup
TTS	ort (ONNX Runtime) + Kokoro 82M	Local neural text-to-speech
ML	TensorFlow.js (Web Worker)	VAD, intent, search, notifications, workflows
Storage	rusqlite (SQLite + FTS5)	Entries, settings, ML data, embeddings
Clipboard	arboard	Cross-platform clipboard

Prerequisites

Rust stable (1.88+) + cargo
Node.js 20+ + npm
CMake (for whisper.cpp / llama.cpp compilation)
espeak-ng (for TTS phonemization): brew install espeak-ng
Platform-specific:
- macOS: Xcode Command Line Tools
- Windows: Visual Studio Build Tools (C++ workload)
- Linux: build-essential, libasound2-dev, libsqlite3-dev

Download

Pre-built packages are available on the Releases page.

Platform	Artifact
macOS (Apple Silicon)	`.dmg`
Windows	`.exe` installer
Linux	`.AppImage`

macOS note: The app is not code-signed. After downloading, run this in Terminal before opening:

xattr -cr ~/Downloads/IronMic-*.dmg

Then open the .dmg and drag IronMic to Applications. On first launch you may need to right-click → Open → Open.

After installing, open IronMic and go to Settings > Models to download the speech recognition model (~1.5 GB). The app will walk you through setup on first launch.

Models run entirely on your machine. Nothing is sent externally.

Quick Start (from source)

# Clone
git clone https://github.com/greenpioneersolutions/IronMic.git
cd IronMic

# One-command dev mode (builds Rust, installs deps, launches everything)
./scripts/dev.sh

The dev script:

Builds the Rust native addon with Metal GPU + TTS support
Installs npm dependencies (if needed)
Compiles Electron main & preload TypeScript
Launches Vite dev server + Electron concurrently

Manual Setup

# Build Rust core
cd rust-core
cargo build --release --features metal,tts
cp target/release/libironmic_core.dylib ironmic-core.node
cd ..

# Install and run frontend
cd electron-app
npm install
npx concurrently "npx vite" "sleep 3 && npx electron ."

Download Models

Models are downloaded through the Settings UI inside the app:

Whisper large-v3-turbo (~1.5 GB) — speech recognition
Mistral 7B Instruct Q4 (~4.4 GB) — text cleanup
Kokoro 82M (~163 MB + ~7.5 MB voices) — text-to-speech

Running Tests

# Rust tests (225 tests)
cd rust-core
cargo test --no-default-features

# Frontend build verification
cd electron-app
npx vite build

Privacy & Security

These are hard architectural constraints, not policies:

No network calls. All outbound requests blocked. Model downloads are the only exception, triggered explicitly by you.
Audio never hits disk. Mic input lives in memory only. Buffers explicitly zeroed after use.
No telemetry. No analytics, crash reporting, or usage tracking.
Local-only storage. Single SQLite file in your app data directory.
Sandboxed renderer. contextIsolation: true, nodeIntegration: false, sandbox: true.
Model download integrity. SHA-256 verified, HTTPS-only, domain-validated.

For the full security model, threat analysis, and configuration guide, see SECURITY.md.

Project Structure

IronMic/
├── rust-core/                 # Rust native addon (napi-rs)
│   ├── src/
│   │   ├── audio/             # Mic capture + resampling
│   │   ├── transcription/     # Whisper integration + dictionary
│   │   ├── llm/               # LLM text cleanup + chat + prompts
│   │   ├── tts/               # Kokoro TTS + playback + timestamps
│   │   ├── storage/           # SQLite CRUD + FTS5 + migrations (v3)
│   │   │   ├── entries.rs     # Dictation entry CRUD
│   │   │   ├── analytics.rs   # Analytics snapshots + topics
│   │   │   ├── notifications.rs # ML notification system
│   │   │   ├── actions.rs     # Action log for workflow discovery
│   │   │   ├── workflows.rs   # Discovered workflow patterns
│   │   │   ├── embeddings.rs  # Semantic search embeddings
│   │   │   ├── ml_models.rs   # ML model weight persistence
│   │   │   ├── vad.rs         # VAD training samples
│   │   │   ├── intents.rs     # Intent + voice routing logs
│   │   │   └── meetings.rs    # Meeting session management
│   │   ├── clipboard/         # Clipboard management
│   │   ├── hotkey/            # Pipeline state machine
│   │   ├── error.rs           # Unified error types
│   │   └── lib.rs             # N-API exports (~100 functions)
│   └── models/                # Downloaded model weights (gitignored)
├── electron-app/              # Electron + React frontend
│   └── src/
│       ├── main/              # Electron main process + IPC handlers
│       │   └── ai/            # AI chat adapter (Copilot/Claude CLI)
│       ├── preload/           # Typed contextBridge API
│       └── renderer/          # React UI
│           ├── components/    # 30+ components
│           ├── stores/        # Zustand state (11 stores)
│           ├── services/tfjs/ # TensorFlow.js ML services
│           │   ├── VADService.ts         # Voice activity detection
│           │   ├── TurnDetector.ts       # End-of-turn detection
│           │   ├── IntentClassifier.ts   # Voice command classification
│           │   ├── VoiceRouter.ts        # Context-aware routing
│           │   ├── MeetingDetector.ts    # Ambient meeting mode
│           │   ├── SemanticSearch.ts     # USE embeddings + similarity
│           │   ├── NotificationRanker.ts # ML notification ranking
│           │   ├── WorkflowMiner.ts      # Action pattern mining
│           │   ├── AudioBridge.ts        # Web Audio API integration
│           │   └── TFJSRuntime.ts        # TF.js initialization
│           ├── workers/       # ML Web Worker (TF.js inference)
│           └── hooks/         # Custom hooks (bridge, theme, search)
├── scripts/
│   └── dev.sh                 # One-command dev launcher
└── CLAUDE.md                  # Architecture documentation

Feature Flags (Rust)

The Rust core uses Cargo feature flags to conditionally compile heavy dependencies:

Feature	Dependencies	Purpose
`napi-export` (default)	—	Enables N-API function exports
`whisper`	whisper-rs	Speech-to-text
`metal`	whisper + whisper-rs/metal	GPU acceleration on macOS
`llm`	llama_cpp_rs	LLM text cleanup
`tts`	ort, ndarray	Kokoro TTS inference

Tests run with --no-default-features to avoid N-API linker dependencies.

Security Audit

We publish a comprehensive, code-referenced self-audit that verifies every security claim we make. It points to exact files, line numbers, and code snippets — so you can confirm everything yourself.

Read the full audit

The audit covers: network isolation, audio zero-on-drop, model download integrity (SHA-256 + HTTPS + domain validation), HuggingFace source verification, Electron sandbox configuration, IPC validation, environment variable scoping, SQL injection protection, XSS prevention, and more.

We encourage you to verify our claims. If you don't trust our self-audit, please engage an independent security professional to review the codebase. We welcome it.

License

MIT — see LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 42 Commits
.claude		.claude
.github/workflows		.github/workflows
assets		assets
electron-app		electron-app
ideas		ideas
rust-core		rust-core
scripts		scripts
.gitignore		.gitignore
AUDIT.md		AUDIT.md
CHANGELOG.md		CHANGELOG.md
CLAUDE.md		CLAUDE.md
ENTERPRISE.md		ENTERPRISE.md
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

IronMic

Features

Core Dictation

Text-to-Speech Read-Back

AI Assistant

On-Device Machine Learning ^{NEW in 1.1.0}

Organization & Search

Analytics ^{NEW in 1.0.15}

Model Management

Design & Customization

Architecture

Prerequisites

Download

Quick Start (from source)

Manual Setup

Download Models

Running Tests

Privacy & Security

Project Structure

Feature Flags (Rust)

Security Audit

License

About

Uh oh!

Releases 35

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

IronMic

Features

Core Dictation

Text-to-Speech Read-Back

AI Assistant

On-Device Machine Learning NEW in 1.1.0

Organization & Search

Analytics NEW in 1.0.15

Model Management

Design & Customization

Architecture

Prerequisites

Download

Quick Start (from source)

Manual Setup

Download Models

Running Tests

Privacy & Security

Project Structure

Feature Flags (Rust)

Security Audit

License

About

Resources

License

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 35

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

On-Device Machine Learning ^{NEW in 1.1.0}

Analytics ^{NEW in 1.0.15}

Packages