Buddy is an AI-powered virtual assistant that integrates gesture-based mouse and keyboard control with voice commands to streamline user interaction. It can perform tasks like web searches, application management, WhatsApp automation, email composition, music control, and AI-based code writing and image generation.

Rathore-Rajpal/HeyBuddy

🤖 Buddy - AI Virtual Assistant

Your Intelligent Desktop Companion with Gesture Control & Voice Commands

Features • Quick Start • Demo • Documentation • Contributing


📖 About

Buddy is an all-in-one AI virtual assistant that revolutionizes how you interact with your computer. Control your mouse and keyboard with hand gestures, execute commands with your voice, and automate tasks with AI-powered features.

Author: Rajpal Singh Rathore

Features

🎯 Core Components

  1. Virtual Mouse - Hand gesture-based mouse control
  2. Virtual Keyboard - On-screen keyboard with gesture typing
  3. Voice Assistant "Buddy" - AI-powered voice commands

🤖 Assistant Capabilities

  • ✅ Face authentication
  • ✅ Voice command recognition
  • ✅ Spotify integration (play music, search artists)
  • ✅ YouTube control (play videos, search)
  • ✅ WhatsApp automation (messages, calls)
  • ✅ Phone integration (calls, SMS via Phone Link)
  • ✅ Email composition (Gmail)
  • ✅ Web search (Google, product search on 20+ sites)
  • ✅ Note taking (Sticky Notes, file-based)
  • ✅ Reminders (Windows Task Scheduler)
  • ✅ Screenshots
  • ✅ AI chatbot (HuggingChat)
  • ✅ Code generation
  • ✅ Image generation (Stable Diffusion)
  • ✅ Google Maps routes

🖱️ Virtual Mouse Gestures

| Gesture | Action |
| --- | --- |
| Index finger movement | Move cursor |
| Index finger bent + middle straight | Left click |
| Middle finger bent + index straight | Right click |
| Both fingers bent (thumb far) | Double click |
| Both fingers bent (thumb close) | Screenshot |
| Thumb + index touching + move up/down | Scroll |
| Thumb + pinky touching | Drawing mode |
| Thumb + ring finger (1 sec) | Close window |
| Thumb + middle finger (1 sec) | Minimize window |
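
As a rough illustration of how the click and screenshot rows above could translate into code, the sketch below maps finger states to actions. It is not taken from virtualMouse.py: the function names (`joint_angle`, `is_bent`, `classify_gesture`), the 90-degree bend threshold, and the action labels are all assumptions, and the 2D landmark points are expected to come from a hand tracker such as MediaPipe.

```python
import math

BENT_ANGLE_DEG = 90.0  # assumed threshold; tune for your camera and hand


def joint_angle(a, b, c):
    """Return the angle at point b (in degrees) formed by the points a-b-c."""
    ang = math.degrees(
        math.atan2(c[1] - b[1], c[0] - b[0]) - math.atan2(a[1] - b[1], a[0] - b[0])
    )
    ang = abs(ang)
    return 360.0 - ang if ang > 180.0 else ang


def is_bent(base, middle_joint, tip):
    """Treat a finger as bent when the angle at its middle joint is small."""
    return joint_angle(base, middle_joint, tip) < BENT_ANGLE_DEG


def classify_gesture(index_bent, middle_bent, thumb_close):
    """Map finger states to the click/screenshot rows of the gesture table."""
    if index_bent and middle_bent:
        return "screenshot" if thumb_close else "double_click"
    if index_bent:
        return "left_click"
    if middle_bent:
        return "right_click"
    return "move_cursor"  # assumed default when neither finger is bent
```

Scroll, drawing mode, and the timed close/minimize gestures would need fingertip-distance checks and a hold timer, which this sketch leaves out.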

🚀 Quick Start

Prerequisites

  • Windows 10/11
  • Python 3.8 or higher
  • Webcam
  • Microphone
  • Internet connection

Installation

  1. Clone or download this repository, then change into the project folder (the example below assumes it is at C:\VirtualMouseProject)

    cd C:\VirtualMouseProject
    
  2. Run setup script

    setup.bat
    

    This will:

    • Create virtual environment
    • Install all dependencies
    • Verify installation
  3. Configure API keys (optional). Create a .env file in the project root:

    CLIENT_ID=your_spotify_client_id
    CLIENT_SECRET=your_spotify_client_secret
    HuggingFaceApiKey=your_huggingface_api_key
    
  4. Test components

    python test_components.py
    
  5. Launch the assistant

    start.bat
    

    Or directly:

    python run.py
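
If features that need API keys fail after installation, a quick check of the .env file from step 3 can help. The sketch below is a standard-library illustration only; the helper names `load_env_file` and `missing_keys` are hypothetical, and the project itself may load these values differently (for example with a dotenv package).

```python
import os


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env file into a dict (missing file -> {})."""
    values = {}
    if not os.path.exists(path):
        return values
    with open(path, encoding="utf-8") as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks, comments, and malformed lines
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=("CLIENT_ID", "CLIENT_SECRET", "HuggingFaceApiKey")):
    """Return the required key names that are absent or empty."""
    return [k for k in required if not values.get(k)]


if __name__ == "__main__":
    print("Missing keys:", missing_keys(load_env_file()))
```

Run it from the project root; an empty list means all three keys from step 3 are present.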
    

📖 Usage

Starting the Assistant

  1. Run start.bat
  2. Complete face authentication when prompted
  3. Wait for "Ready to help" confirmation
  4. Use voice commands or click the mic button
  5. Press Alt + J for quick voice activation

Voice Command Examples

  • "Open YouTube"
  • "Play Despacito on YouTube"
  • "Search for laptop on Amazon"
  • "Send a message to [contact name] on WhatsApp"
  • "Set a reminder for tomorrow at 3 PM to call mom"
  • "Take a screenshot"
  • "Generate an image of a sunset over mountains"
  • "Write a code to sort a list in Python"
  • "What's the weather like?"

Launching Virtual Mouse/Keyboard

  • Voice: "Start virtual mouse" / "Start virtual keyboard"
  • Or use the Flask API endpoints (if app.py is running)

🛠️ Troubleshooting

Camera not working

  • Check if camera is being used by another application
  • Grant camera permissions to Python

Voice recognition not responding

  • Check microphone permissions
  • Ensure you have an internet connection (recognition uses the Google Speech API)
  • Adjust r.pause_threshold in commands.py if needed
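
For context on the last point, the sketch below shows where a pause threshold sits in a typical listening loop. It assumes commands.py is built on the SpeechRecognition package, which the `r.pause_threshold` name and the Google Speech API suggest but this README does not confirm; the function name `listen_once` and the default values are illustrative.

```python
def listen_once(pause_threshold=1.0, timeout=10, phrase_time_limit=6):
    """Capture one phrase from the default microphone and return it as text."""
    # Imported lazily so this module still loads when the package is missing.
    # Third-party: pip install SpeechRecognition (Microphone also needs PyAudio).
    import speech_recognition as sr

    r = sr.Recognizer()
    # Seconds of silence that mark the end of a phrase; raise it if the
    # assistant cuts you off mid-sentence, lower it for snappier responses.
    r.pause_threshold = pause_threshold
    try:
        with sr.Microphone() as source:
            r.adjust_for_ambient_noise(source, duration=1)
            audio = r.listen(source, timeout=timeout, phrase_time_limit=phrase_time_limit)
        return r.recognize_google(audio)  # needs an internet connection
    except (sr.WaitTimeoutError, sr.UnknownValueError, sr.RequestError):
        return ""  # nothing heard, speech not understood, or API unreachable
```

An empty string from every call usually points to a microphone permission or connectivity problem instead of the threshold.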

Face authentication fails

  • Ensure good lighting
  • Capture face samples with assist/Engine/auth/sample.py
  • Then run assist/Engine/auth/trainer.py to generate trainer.yml

Module not found errors

  • Activate virtual environment: envjarvis\Scripts\activate
  • Reinstall: pip install -r requirements.txt

Spotify not working

  • Check that CLIENT_ID and CLIENT_SECRET are set in the .env file in the project root

📁 Project Structure

VirtualMouseProject/
├── run.py              # Main launcher (multiprocessing)
├── main.py             # Assistant initialization
├── app.py              # Flask API server
├── virtualMouse.py     # Gesture-based mouse
├── virtual_ketboard.py # Gesture-based keyboard
├── requirements.txt    # Dependencies
├── setup.bat           # Installation script
├── start.bat           # Launch script
├── test_components.py  # Component testing
├── assist/
│   ├── Engine/
│   │   ├── commands.py      # Command handler
│   │   ├── features.py      # Feature implementations
│   │   ├── config.py        # Configuration
│   │   ├── db.py            # Database operations
│   │   ├── spotify.py       # Spotify integration
│   │   ├── auth/            # Face authentication
│   │   ├── ImageBot/        # Image generation UI
│   │   └── CodingBuddy/     # Code assistant UI
│   └── www/                 # Web interface
│       ├── index.html
│       ├── main.js
│       └── style.css
└── envjarvis/          # Virtual environment

🎥 Demo

Coming Soon: Full demo video showcasing all features

Screenshots

🎭 Face Authentication

Secure login with facial recognition

🖱️ Virtual Mouse Control

Control cursor with hand gestures

🗣️ Voice Assistant Interface

Beautiful web-based UI with voice commands

🎨 AI Image Generation

Create images from text prompts

Quick Feature Preview

✨ Say "Hey Buddy" to activate
🎵 "Play [song name] on Spotify"
📧 "Send email to [contact]"
🌐 "Search Google for [query]"
🎨 "Generate image of [description]"
💻 "Write code to [task]"

🔧 Development

Adding New Voice Commands

Edit assist/Engine/commands.py and add a branch for the new command to the allCommands() function.
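
To give a feel for what adding a command involves, here is a minimal keyword-routing sketch. It does not reproduce the real allCommands() function; `COMMAND_ROUTES`, `route_command`, the action labels, and the "chat" fallback are hypothetical names chosen for illustration.

```python
COMMAND_ROUTES = [
    # (required keywords, action label); checked in order, first match wins
    (("screenshot",), "take_screenshot"),
    (("play", "youtube"), "play_youtube"),
    (("play", "spotify"), "play_spotify"),
    (("message", "whatsapp"), "send_whatsapp_message"),
    (("reminder",), "set_reminder"),
]


def route_command(query):
    """Return the action label for a spoken query, or 'chat' as a fallback."""
    text = query.lower()
    for keywords, action in COMMAND_ROUTES:
        if all(word in text for word in keywords):
            return action
    return "chat"  # assumed fallback: hand the query to the AI chatbot


if __name__ == "__main__":
    print(route_command("Play Despacito on YouTube"))
```

In a table-driven layout like this, a new voice command is one extra tuple plus its handler; in an if/elif chain the same rule applies, with more specific phrases placed before general ones.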

Adding New Contacts

Use the web UI contact form or edit the database directly.

Training Face Recognition

Run the easy setup script:

.\setup_face_auth.bat

Or manually:

  1. Run python assist/Engine/auth/sample.py to capture face samples
  2. Run python assist/Engine/auth/trainer.py to train the model

🤝 Contributing

Contributions are welcome! Here's how:

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Development Setup

See DEPLOYMENT.md for the complete deployment guide.


📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


🙏 Acknowledgments

  • OpenCV for computer vision capabilities
  • MediaPipe for hand tracking
  • Eel for Python-JavaScript bridge
  • HuggingFace for AI models
  • All open-source contributors

📧 Contact

Rajpal Singh Rathore


⭐ Star this repo if you find it useful!

Made with ❤️ by Rajpal Singh Rathore

📝 License

This project is open source and available for educational purposes.

🤝 Contributing

Contributions, issues, and feature requests are welcome!

⚠️ Important Notes

  • Some features require API keys (Spotify, HuggingFace)
  • WhatsApp automation may require the WhatsApp Desktop app
  • Phone features require the Windows Phone Link app
  • The face authentication model must be trained on your own face before first use

📞 Support

For issues or questions, please create an issue in the repository.
