TranscribeAI is a cross-platform desktop app that uses large language models (LLMs) to transcribe audio files and scanned images or pages. It features:

- LLM-powered transcription (e.g. Google Gemini via google-generativeai)
- Resumable image workflows (skips already-done files, cleans up partial outputs)
- Real-time logs & progress in the UI
- Persistent settings (stores your API key with electron-store)
- Drag-resizable, searchable sidebar for managing transcripts
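
To make the "resumable image workflow" idea concrete, here is a minimal sketch of how skipping finished files and flagging partial outputs could work. It is not taken from the TranscribeAI source: the `.txt` output extension, the `.partial` suffix for in-progress files, and the names `planResume` and `ResumePlan` are all assumptions made for illustration.

```typescript
// Hypothetical convention: "page1.jpg" produces "page1.txt", and an
// interrupted run may leave "page1.txt.partial" behind.
interface ResumePlan {
  pending: string[];       // inputs that still need transcription
  stalePartials: string[]; // leftover partial outputs to delete before resuming
}

// Strip the last extension from a file name ("page1.jpg" -> "page1").
function stem(fileName: string): string {
  const dot = fileName.lastIndexOf(".");
  return dot > 0 ? fileName.slice(0, dot) : fileName;
}

// Compare the input folder listing with the output folder listing.
function planResume(inputs: string[], existingOutputs: string[]): ResumePlan {
  // Only complete ".txt" files count as done; ".partial" files do not.
  const done = new Set(
    existingOutputs.filter((f) => f.endsWith(".txt")).map(stem)
  );
  const stalePartials = existingOutputs.filter((f) => f.endsWith(".partial"));
  const pending = inputs.filter((f) => !done.has(stem(f)));
  return { pending, stalePartials };
}
```

With inputs `a.jpg`, `b.jpg`, `c.png` and an output folder containing `a.txt` and `b.txt.partial`, this plan would retry `b.jpg` and `c.png` and mark `b.txt.partial` for cleanup. Keeping the planning step free of file I/O makes it easy to test separately from the code that reads folders and deletes files.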

Note: A “headless” version (no UI) is also available and can be integrated into your own system; see https://github.com/Minitex/TranscribeAI.

Technologies

- Electron: Main process for file I/O and IPC
- React + TypeScript: Renderer UI, bundled with Vite for fast HMR
- Vite: Modern build tool for instant feedback and optimized production builds
- google-generativeai SDK: Interfaces with LLMs for high-quality transcription
- electron-store: Simple JSON storage for your Gemini API key
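
As a rough illustration of the persistent-settings piece, the sketch below saves and loads API keys through a small key-value interface. It is an assumption-laden example, not the app's actual code: `SettingsBackend`, `MemoryBackend`, `saveApiKey`, `loadApiKey`, and the setting names `geminiApiKey` and `mistralApiKey` are hypothetical. In an Electron main process, an electron-store `Store` instance (which provides `get` and `set`) could play the role of the backend; the in-memory class here only exists so the example runs outside Electron.

```typescript
// Minimal key-value shape used by this sketch (hypothetical interface).
interface SettingsBackend {
  get(key: string): unknown;
  set(key: string, value: unknown): void;
}

// In-memory stand-in for a persistent store; illustrative only.
class MemoryBackend implements SettingsBackend {
  private data = new Map<string, unknown>();
  get(key: string): unknown {
    return this.data.get(key);
  }
  set(key: string, value: unknown): void {
    this.data.set(key, value);
  }
}

type Provider = "gemini" | "mistral";

// Hypothetical setting names; the app's real keys are not documented here.
const KEY_NAMES: Record<Provider, string> = {
  gemini: "geminiApiKey",
  mistral: "mistralApiKey",
};

// Trim pasted input and refuse to store an empty key.
function saveApiKey(backend: SettingsBackend, provider: Provider, apiKey: string): void {
  const trimmed = apiKey.trim();
  if (trimmed.length === 0) {
    throw new Error(`Empty ${provider} API key`);
  }
  backend.set(KEY_NAMES[provider], trimmed);
}

// Return the stored key, or undefined if none has been saved yet.
function loadApiKey(backend: SettingsBackend, provider: Provider): string | undefined {
  const value = backend.get(KEY_NAMES[provider]);
  return typeof value === "string" ? value : undefined;
}
```

Hiding the store behind a narrow interface keeps the validation logic testable without launching Electron.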

Setup & Installation

1. Obtain your API keys
   - Gemini (required for audio transcription and Gemini OCR): Sign in to Google AI Studio, create a key in the API Key Console, and copy the generated key.
   - Mistral (only needed for Mistral OCR): Create an API key in your Mistral account by following the Mistral quickstart at docs.mistral.ai/getting-started/quickstart.
2. Download TranscribeAI
   - Go to the TranscribeAI Releases page.
   - Choose the installer for your OS:
     - macOS: .dmg
     - Windows: .exe
     - Linux: .AppImage or .tar.gz
   - Download and run the installer. Because TranscribeAI is an open-source project and does not ship with a paid code-signing certificate, you may see a security warning the first time you run it:
     - macOS Gatekeeper (“Unidentified Developer” or “damaged and can’t be opened”): open System Preferences → Security & Privacy (System Settings → Privacy & Security on recent macOS versions), then click Open Anyway next to the TranscribeAI entry.
     - Windows SmartScreen (“Windows protected your PC”): click More info, then Run anyway.
   - Follow the installer prompts to complete installation.
3. Configure your API key(s)
   - Launch TranscribeAI.
   - Click the gear icon in the top-right corner to open Settings.
   - Paste your Google Gemini API key into the “API Key” field.
   - If you plan to use Mistral OCR, paste your Mistral API key into the “Mistral API Key” field.
   - Click Save.
4. Run your first transcription
   - Click the file picker button to select an audio file or image folder, then choose the output folder for your transcripts.
   - Click Transcribe to begin transcription.
   - Monitor progress and logs in real time.

Using Mistral OCR (including batch mode)

1. In Settings, enter your Mistral API key (keep your Gemini key in place if you also use Gemini features).
2. In the Image tab, pick the mistral-ocr-latest model.
3. Select an input folder of images/PDFs and an output folder.
4. Enable Batch mode to process files in batches, and adjust Batch size with the +/− controls (default 50, range 10–500) to balance throughput against request size.
5. Start transcription. The app calls Mistral’s batch OCR and writes outputs to your chosen folder.

Note: single-image Mistral OCR calls are free; batch OCR requires a paid Mistral account.
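
The batch-size setting can be pictured as a clamp followed by a simple split of the file list. The sketch below is only an illustration under that assumption: the function names `clampBatchSize` and `toBatches` are hypothetical, and it does not show how the app actually submits batches to Mistral. Only the default (50) and the allowed range (10–500) come from the description above.

```typescript
const DEFAULT_BATCH_SIZE = 50;
const MIN_BATCH_SIZE = 10;
const MAX_BATCH_SIZE = 500;

// Fall back to the default when no usable value is given, then clamp to 10..500.
function clampBatchSize(requested?: number): number {
  if (requested === undefined || !Number.isFinite(requested)) {
    return DEFAULT_BATCH_SIZE;
  }
  return Math.min(MAX_BATCH_SIZE, Math.max(MIN_BATCH_SIZE, Math.floor(requested)));
}

// Split a list of files into consecutive batches of at most the clamped size.
function toBatches<T>(items: T[], requested?: number): T[][] {
  const size = clampBatchSize(requested);
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

For example, 120 files at the default size would be split into batches of 50, 50, and 20. A larger batch size means fewer requests but more work per request, which is the throughput versus request-size trade-off the setting exposes.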

Updating to a New Version

- macOS: Download the latest .dmg from Releases, open it, and drag the app to Applications. Choose “Replace” if prompted. If Gatekeeper blocks the app, right-click it and choose Open once.
- Windows: Download the new installer .exe from Releases and run it; it overwrites the existing install. Tip: after first launch, right-click the TranscribeAI icon on the taskbar → Pin to taskbar so it’s easy to find next time.
- Linux (AppImage): Download the new *.AppImage, make it executable with chmod +x if needed, and replace your old AppImage file.
- No uninstall needed: Install over the existing version; your data and settings remain.

Screenshots

[Images: transcribeAIMainView.jpeg (main view), transcribeAISettingsView.jpeg (settings view)]
