VibeType

VibeType is a voice-driven coding and text manipulation assistant designed for seamless integration into your workflow. It uses local, privacy-focused tools for transcription (Whisper) and AI processing (Ollama) to turn your voice into code, commands, and corrected text.

Core Philosophy

The goal of VibeType is to provide a powerful, hands-free interface for interacting with your computer, with a strong emphasis on:

Local First: Whenever possible, processing happens locally on your machine. Your voice and data never leave your computer unless you explicitly configure an external API.
Flexibility: A powerful "AI Toolkit" allows you to switch between different AI-powered tasks on the fly.
Pluggable Providers: VibeType is designed to be extensible. You can choose between different AI and Text-to-Speech (TTS) providers to suit your needs, whether you prefer local models for privacy or powerful external APIs for quality.
Customization: Hotkeys, AI models, and system prompts are all fully customizable to fit your workflow.

Key Features

VibeType is more than just a dictation tool. It's a suite of voice-powered utilities. For a full list of features and their hotkeys, please see the Features Document.

Standard Dictation: Quickly transcribe your speech into any text field.
Multi-Provider AI Processing: Use a local LLM (via Ollama) or a powerful external API (like Cohere) to process your speech for tasks like code generation, rephrasing, or command execution.
Clipboard Processing: Apply AI transformations (like summarization or correction) to any text on your clipboard using your selected AI provider.
Advanced Multi-Language TTS: Get audible feedback using the built-in Windows voice, an external API (like OpenAI), or the powerful, local Kokoro TTS engine. Kokoro TTS features:
- Automatic Language Detection: Speak in multiple languages and have the TTS engine automatically switch voices.
- Wide Language Support: High-quality voices for English, Japanese, Spanish, French, Chinese, and more.
- Voice Blending: Create unique, custom voices by blending existing ones.
Customizable AI Modes: Switch between different AI "personalities" (like Assistant, Corrector, or Summarizer) instantly.

MCP Server & GUI

A lightweight local MCP server provides simple HTTP endpoints to enqueue speech for sequential playback (no overlapping).

Read: docs/MCP.md
Settings → 🛠️ MCP tab includes:
- Start/Stop/Restart MCP
- Live log panel + Clear Logs
- Ping MCP /health
- One-click Test Speak (queues a short hello)
- Auto-start MCP on launch

Getting Started

Ensure you have a local Ollama server running if you wish to use it.
Install the required Python dependencies from requirements.txt.
Run python VibeType.py.
Configure your hotkeys, AI provider, and TTS provider from the Settings window.

Troubleshooting

Multi-Language TTS Issues

Kokoro TTS relies on a few external dependencies for its powerful multi-language capabilities. If you encounter issues with certain languages, please check the following:

Japanese or Chinese: If you encounter an error in the logs related to MeCab, fugashi, or a RuntimeError mentioning unidic or mecabrc not found, the required dictionary was not installed correctly. To fix this, run the following command in your terminal:
```
python -m unidic download
```
Other Languages (French, Spanish, etc.): Many other languages are supported via espeak-ng. If you have issues with these languages, ensure you have espeak-ng installed and accessible in your system's PATH.

Voice Agent Integration (MCP Speak)

All agent responses must use the MCP speak function for voice feedback, in addition to any text output.
The MCP speak tool is required for every agent reply, ensuring accessibility and hands-free operation.
If the MCP or TTS is unavailable, log the failure and continue, but always attempt to speak first.

Name		Name	Last commit message	Last commit date
Latest commit History 21 Commits
MCP		MCP
api		api
assets/screenshots		assets/screenshots
config		config
core		core
dev		dev
docs		docs
gui		gui
kokoro_tts		kokoro_tts
piper_tts		piper_tts
tests		tests
.gitignore		.gitignore
IDEAS_AND_PLANS.md		IDEAS_AND_PLANS.md
README.md		README.md
Rules.md		Rules.md
VibeType.py		VibeType.py
agent.md		agent.md
check_cuda.py		check_cuda.py
full_requirements.txt		full_requirements.txt
gpu-test.py		gpu-test.py
requirements-py311.txt		requirements-py311.txt
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

VibeType

Core Philosophy

Key Features

MCP Server & GUI

Getting Started

Troubleshooting

Multi-Language TTS Issues

Voice Agent Integration (MCP Speak)

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

VibeType

Core Philosophy

Key Features

MCP Server & GUI

Getting Started

Troubleshooting

Multi-Language TTS Issues

Voice Agent Integration (MCP Speak)

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages