Add Voice Command Interface to CompUse #1

Draft
codegen-sh[bot] wants to merge 1 commit into main from
gen/64eaad53-6aa9-4507-929d-134e1708ebb1

codegen-sh bot commented on Mar 18, 2025

This PR adds a voice command interface to CompUse, allowing users to control their computer using voice commands in addition to text input.

Features Added

  • Voice Recognition: Uses the SpeechRecognition library with Google's speech recognition API
  • Wake Word Detection: Configurable wake word (default: "computer") to activate voice listening
  • Push-to-Talk Mode: Optional push-to-talk mode with a customizable hotkey (default: Ctrl+Space)
  • Audio Feedback: Text-to-speech responses when commands are recognized
  • Voice Command CLI: Extends the existing CLI with voice command support
  • Configuration Options: Command-line options to customize voice recognition behavior
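To illustrate how the wake word and the Google recognizer could fit together, here is a minimal sketch. It is not the code from voice_tools.py: the function names `extract_command` and `listen_once` are hypothetical, the wake word is assumed to be a single word, and `listen_once` assumes the SpeechRecognition package plus a working microphone backend (PyAudio) are installed.

```python
from typing import Optional


def extract_command(transcript: str, wake_word: str = "computer") -> Optional[str]:
    """Return the text spoken after the wake word, or None if it was not heard.

    Hypothetical helper: matching is case-insensitive and ignores
    punctuation attached to the wake word (e.g. "Computer,").
    """
    words = transcript.strip().split()
    wake = wake_word.lower()
    for i, word in enumerate(words):
        if word.lower().strip(".,!?") == wake:
            # Everything after the wake word is treated as the command.
            return " ".join(words[i + 1:])
    return None


def listen_once() -> Optional[str]:
    """Capture one utterance from the microphone and transcribe it.

    Assumption: uses the SpeechRecognition library with Google's
    web speech API, as the feature list describes.
    """
    import speech_recognition as sr  # imported lazily; needs PyAudio for Microphone

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    try:
        return recognizer.recognize_google(audio)
    except (sr.UnknownValueError, sr.RequestError):
        # Speech was unintelligible or the API could not be reached.
        return None
```

With this shape, a listening loop would call `listen_once()`, pass the transcript to `extract_command`, and forward any non-empty result to the existing text command handler. Keeping the wake-word check as a pure function makes it testable without audio hardware.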

Implementation Details

  • Added voice_tools.py with voice recognition and processing classes
  • Created voice_cli.py that extends the existing CLI with voice capabilities
  • Updated requirements.txt to include necessary dependencies
  • Updated README.md with voice command documentation

How to Use

Run the voice-enabled CLI:

python voice_cli.py

With custom options:

python voice_cli.py --wake-word "assistant" --push-to-talk --verbose
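The options shown above could be parsed along these lines. This is a sketch using the standard-library argparse module, not the parser from voice_cli.py; only the flag names and the default wake word come from this description, while the help strings and the `build_parser` name are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the voice options named in this PR (hypothetical layout)."""
    parser = argparse.ArgumentParser(
        prog="voice_cli.py",
        description="Voice-enabled CLI (illustrative sketch).",
    )
    parser.add_argument(
        "--wake-word",
        default="computer",
        help="word that activates voice listening",
    )
    parser.add_argument(
        "--push-to-talk",
        action="store_true",
        help="listen only while the hotkey is held instead of using the wake word",
    )
    parser.add_argument(
        "--verbose",
        action="store_true",
        help="print extra diagnostic output",
    )
    return parser


# Example: parse the same options as the command line shown above.
args = build_parser().parse_args(["--wake-word", "assistant", "--push-to-talk", "--verbose"])
print(args.wake_word, args.push_to_talk, args.verbose)
```

argparse converts `--wake-word` to the attribute `wake_word` and `--push-to-talk` to `push_to_talk`, so the rest of the program can read the settings as plain attributes.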

Dependencies Added

  • SpeechRecognition: For voice recognition
  • pyttsx3: For text-to-speech feedback
  • keyboard: For push-to-talk hotkey support

This implementation completes the "Voice (TODO)" item from the features list in the README.
