Professional text-to-speech and voice input tools for Linux systems. Multi-engine TTS, voice recording, and compatibility across major distributions.
curl -fsSL https://raw.githubusercontent.com/pablopda/linux-speech-tools/main/installer.sh | bash

- Edge TTS: High-quality cloud-based synthesis with 22-country LATAM regional voice support
- Kokoro TTS: Offline neural voice synthesis
- Festival TTS: Local fallback engine
- Graceful fallbacks: Automatic engine switching for maximum reliability
- Toggle recording: Press once to start, again to stop (default mode)
- Speech-to-text: Powered by OpenAI Whisper for accurate transcription
- Auto-clipboard: Transcription automatically copied to clipboard
- GNOME integration: Global hotkey (Ctrl+Alt+V) for system-wide voice input
- Smart detection: Terminal vs GUI application handling
- Continuous playback: Eliminates gaps between audio chunks
- Professional quality: Broadcast-level smooth TTS streaming
- Smart concatenation: Uses ffmpeg/sox for seamless audio joining
- Multiple modes: Continuous, buffered, and original streaming options
- Drop-in replacement: Enhanced versions of existing commands
- Desktop media controls: Play/pause/stop from notification panel
- Real-time progress: Visual progress tracking for reading sessions
- Native integration: Professional media player experience for TTS
- Document information: Display source title and reading status
- Notification controls: Never lose control of long reading sessions
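
The "graceful fallbacks" item above can be pictured as trying each engine in turn until one succeeds. The following is a minimal sketch of that idea only: the helper name `speak_with_fallback` and the engine order are assumptions made for illustration, not the project's actual implementation.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the project's source): try each engine
# command in order and stop at the first one that is installed and
# exits successfully.
speak_with_fallback() {
    local text="$1"
    shift
    local engine
    for engine in "$@"; do
        # Skip engines that are not available in this shell.
        command -v "$engine" >/dev/null 2>&1 || continue
        if "$engine" "$text"; then
            return 0
        fi
        echo "warning: $engine failed, trying next engine" >&2
    done
    echo "error: no speech engine succeeded" >&2
    return 1
}
```

For example, `speak_with_fallback "Hello" say say-local` would try the cloud-backed command first and the local one second, assuming both accept the text as their only argument.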
- say: Text-to-speech with file output support
- say-local: Local TTS using Festival/Kokoro
- say-read: Read URLs, PDFs, and documents with TTS
- say-read-es: Spanish language content reader
- talk2claude: Voice input with transcription
- Ubuntu 20.04, 22.04
- Debian 11, 12
- Fedora 38, 39
- Automatic dependency detection and installation
- XDG-compliant configuration management
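
XDG-compliant configuration usually means honoring `XDG_CONFIG_HOME` and falling back to `~/.config` when it is unset. The sketch below shows one way a script might locate and read the `speech-tools/config` file described in this README. The helper names `config_path` and `config_get` are hypothetical, and the simple `KEY=value` parsing is an assumption about the file format.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not taken from the project's source.

# Resolve the config file path per the XDG Base Directory convention:
# use $XDG_CONFIG_HOME if set and non-empty, otherwise ~/.config.
config_path() {
    printf '%s\n' "${XDG_CONFIG_HOME:-$HOME/.config}/speech-tools/config"
}

# Print the value of KEY from a KEY=value file, or the given default if
# the file or the key is missing. Lines starting with '#' never match.
config_get() {
    local key="$1" default="$2" file value=""
    file="$(config_path)"
    if [ -r "$file" ]; then
        # If the key appears more than once, the last assignment wins.
        value="$(sed -n "s/^${key}=//p" "$file" | tail -n 1)"
    fi
    printf '%s\n' "${value:-$default}"
}
```

A call such as `config_get EDGE_VOICE en-US-EmmaMultilingualNeural` prints the configured voice, or the default when no config file exists.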
# Simple speech
say "Hello from Linux Speech Tools!"
# Spanish voice
say -v es-ES-AlvaroNeural "¡Hola mundo!"
# Save to file
say -o greeting.mp3 "Welcome to our application"
# Show available options
say --help

GNOME Integration (Recommended):
# Install GNOME integration
./install-gnome-integration.sh
# Use system-wide hotkey: Ctrl+Alt+V
# Press once → Start recording
# Press again → Stop and transcribe

Command Line:
# Toggle mode (default)
./toggle-speech.sh toggle # Start/stop recording
./toggle-speech.sh start # Start only
./toggle-speech.sh stop # Stop only
# Fixed duration mode
./simple-speech.sh 5 # 5-second recording
# Original talk2claude (advanced)
talk2claude # 8-second recording
talk2claude start # Background recording
talk2claude stop # Stop and transcribe

🎵 Enhanced: Continuous Streaming (NEW)
# Smooth, gap-free audio streaming
./say-read-continuous https://example.com/article
# Professional-quality playback for long content
./say-read-smooth --buffered https://en.wikipedia.org/wiki/Linux
# Interactive demo showing improvement
./demo-audio-streaming.sh

🎮 GNOME Media Controls (LATEST)
# Reading with desktop media controls
./say-read-gnome https://www.bbc.com/news/technology
# Control playback from notification panel:
# ⏸️ Pause - Click to pause reading
# ▶️ Resume - Click to resume reading
# ⏹️ Stop - Click to stop completely
# Setup GNOME integration (first time)
./say-read-gnome --setup
# Interactive demo and testing
./demo-gnome-media-integration.sh

Standard Reading
# Read web articles
say-read https://example.com/article
# Read PDF documents
say-read document.pdf
# Read with Spanish voice
say-read-es https://elpais.com/tecnologia/

curl -fsSL https://raw.githubusercontent.com/pablopda/linux-speech-tools/main/installer.sh | bash

git clone https://github.com/pablopda/linux-speech-tools.git
cd linux-speech-tools
./installer.sh

Download packages from Releases:
Ubuntu/Debian:
wget https://github.com/pablopda/linux-speech-tools/releases/download/v1.0.0/linux-speech-tools_1.0.0.deb
sudo dpkg -i linux-speech-tools_1.0.0.deb

Fedora/RHEL:
wget https://github.com/pablopda/linux-speech-tools/releases/download/v1.0.0/linux-speech-tools-1.0.0-1.noarch.rpm
sudo rpm -i linux-speech-tools-1.0.0-1.noarch.rpm

Create ~/.config/speech-tools/config:
# Default voice for Edge TTS
EDGE_VOICE=en-US-EmmaMultilingualNeural
# Voice input settings
ASR_LANG=en
WHISPER_MODEL=large-v3

# List Edge TTS voices
edge-tts --list-voices | grep -E "(Male|Female)"
# Test different voices
say -v en-GB-SoniaNeural "British English"
say -v es-MX-DaliaNeural "Mexican Spanish"
say -v pt-BR-AntonioNeural "Brazilian Portuguese"

# Test audio output
say "Audio test"
# Check audio devices
pactl list short sinks
# Install audio dependencies
sudo apt install pulseaudio-utils # Ubuntu/Debian
sudo dnf install pulseaudio-utils # Fedora

# Install Python dependencies manually
pip3 install edge-tts pyaudio speechrecognition
# Install system dependencies
sudo apt install python3-pip ffmpeg espeak-ng portaudio19-dev # Ubuntu/Debian
sudo dnf install python3-pip ffmpeg espeak-ng portaudio-devel # Fedora

# Make scripts executable
chmod +x ~/.local/bin/{say,say-local,talk2claude}
# Add to PATH if needed
echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc

# Run full test suite
python3 tests/test_speech_tools.py
# Quick validation
./scripts/quick-release-check.sh
# Comprehensive validation
./scripts/pre-release-check.sh

# Patch release (1.0.0 -> 1.0.1)
./release.sh patch
# Minor release (1.0.0 -> 1.1.0)
./release.sh minor
# Preview release
./release.sh patch --dry-run

We welcome contributions! Please see our Contributing Guide for details.
git clone https://github.com/pablopda/linux-speech-tools.git
cd linux-speech-tools
# Install development dependencies
./installer.sh
# Run tests
python3 tests/test_speech_tools.py
# Submit changes
git checkout -b feature/your-feature
# Make changes
./scripts/quick-release-check.sh
git commit -m "Add your feature"
git push origin feature/your-feature
# Create pull request

- OS: Linux (Ubuntu 20.04+, Debian 11+, Fedora 38+)
- Python: 3.7+
- Audio: PulseAudio or ALSA
- Network: Internet connection for Edge TTS
- System packages: python3-pip, ffmpeg, espeak-ng, and portaudio19-dev (Ubuntu/Debian) or portaudio-devel (Fedora)
All dependencies are automatically installed by the installer script.
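
If the installer cannot be used, a quick way to see what is still missing is to probe for each required command on `PATH`. This is a minimal sketch under stated assumptions: the helper name `missing_commands` is hypothetical, and the command names passed to it are guesses at the binaries the packages above provide.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the names of commands that are not on PATH,
# one per line, and return non-zero if any are missing.
missing_commands() {
    local cmd rc=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
            rc=1
        fi
    done
    return "$rc"
}
```

For instance, `missing_commands python3 pip3 ffmpeg espeak-ng` prints nothing and returns zero when all four commands are installed.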
- Installation Guide
- API Documentation (coming soon)
- Voice Configuration Guide (coming soon)
- Troubleshooting Guide (coming soon)
- ✅ Production Ready: Comprehensive testing across multiple distributions
- ✅ Actively Maintained: Regular updates and improvements
- ✅ Community Driven: Open to contributions and feature requests
- ✅ Professional Quality: Enterprise-grade CI/CD and release automation
- Repository: https://github.com/pablopda/linux-speech-tools
- Releases: https://github.com/pablopda/linux-speech-tools/releases
- Issues: https://github.com/pablopda/linux-speech-tools/issues
- Discussions: https://github.com/pablopda/linux-speech-tools/discussions
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI Whisper for speech recognition
- Microsoft Edge TTS for cloud synthesis
- Kokoro ONNX for offline synthesis
- Festival Speech Synthesis System
- The open-source Linux community
Made with ❤️ for the Linux community
Professional speech tools that just work. 🐧🎙️