A.D.A. is an advanced, real-time digital assistant built with Google's Gemini-live-2.5-flash-preview model. It features a responsive graphical user interface (GUI) using PySide6, real-time audio communication, and the ability to process live video from either a webcam or a screen share. A.D.A. is equipped with powerful tools for searching, code execution, and managing your local file system.
FOR FULL VIDEO TUTORIAL: https://www.youtube.com/watch?v=aooylKf-PeA
- 🗣️ Real-time Conversation: Seamless, low-latency voice-to-voice interaction powered by Google Gemini and ElevenLabs TTS.
- 👀 Live Visual Input: A.D.A. can see what you see, with the ability to switch between a live webcam feed and a screen share. This allows it to answer questions about on-screen content, debug code visually, or provide guidance as you work.
- 🛠️ Integrated Tooling: The assistant can perform a variety of actions by invoking powerful tools, including:
  - Google Search: For real-time information retrieval.
  - Code Execution: To run and debug Python code.
  - File System Management: Create, edit, read, and list files and folders on your computer.
  - System Actions: Open applications and websites.
- 🎨 Dynamic UI: A responsive and visually appealing GUI built with PySide6, featuring a 3D animated avatar that pulses when the assistant is speaking.
- 💻 Cross-Platform: Designed to work on Windows, macOS, and Linux.
Follow these steps to get A.D.A. up and running on your local machine.
Before you begin, ensure you have the following installed:
- Python 3.9+
- Git: Download Git
- Gemini API Key: Get your key from Google AI Studio.
- ElevenLabs API Key: Get your key from the ElevenLabs website (affiliate link; using it helps me out).
Clone this project's repository from GitHub:
git clone https://github.com/your-username/your-repo-name.git
cd your-repo-name
It's highly recommended to use a virtual environment to manage dependencies cleanly.
On Windows:
python -m venv venv
venv\Scripts\activate
On macOS/Linux:
python -m venv venv
source venv/bin/activate
With your virtual environment active, install all the required Python packages with a single command:
pip install google-genai python-dotenv elevenlabs PySide6 opencv-python Pillow numpy websockets pyaudio
Note: On some systems, PyAudio can be tricky to install. If you encounter issues, you may need to install system-level development libraries first (e.g., portaudio). Please refer to the PyAudio documentation for platform-specific instructions.
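After installing, it can help to confirm that every package from the command above is actually importable before launching the app. The following is a minimal sketch and not part of the project: the helper name `missing_packages` and the `REQUIRED` mapping are hypothetical, and the mapping simply pairs each pip name with the module name it is normally imported under.

```python
import importlib.util

# pip package name -> import name (the two often differ).
REQUIRED = {
    "google-genai": "google.genai",
    "python-dotenv": "dotenv",
    "elevenlabs": "elevenlabs",
    "PySide6": "PySide6",
    "opencv-python": "cv2",
    "Pillow": "PIL",
    "numpy": "numpy",
    "websockets": "websockets",
    "pyaudio": "pyaudio",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose modules cannot be located."""
    missing = []
    for pip_name, module_name in required.items():
        try:
            found = importlib.util.find_spec(module_name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. "google") is absent.
            found = False
        if not found:
            missing.append(pip_name)
    return missing


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

If `pyaudio` shows up in the output, that usually points back to the PortAudio note above rather than to a problem with the other packages.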
Create a file named .env in the project's root directory to store your API keys securely.
Add your API keys to the .env file:
GEMINI_API_KEY="YOUR_GEMINI_API_KEY_HERE"
ELEVENLABS_API_KEY="YOUR_ELEVENLABS_API_KEY_HERE"
Important: Do not share or commit your .env file to GitHub. The project's .gitignore file is configured to ignore it.
Ensure your virtual environment is active, then run the main Python script:
python ada.py
You can specify the initial video mode when launching the application:
- --mode camera: Starts with the webcam feed active.
- --mode screen: Starts with screen sharing active.
- --mode none: Starts without a video feed (default).
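For readers curious how a flag like this is typically wired up, here is a minimal sketch using the standard library's `argparse`. It is an illustration under assumptions, not the code in ada.py: the function name `parse_args` is hypothetical, and only the three mode values and the `none` default come from the list above.

```python
import argparse


def parse_args(argv=None):
    """Parse a --mode flag with the three values listed above."""
    parser = argparse.ArgumentParser(description="Launch A.D.A.")
    parser.add_argument(
        "--mode",
        choices=["camera", "screen", "none"],
        default="none",  # no video feed unless one is requested
        help="initial video source (default: none)",
    )
    return parser.parse_args(argv)
```

With this sketch, `parse_args(["--mode", "camera"]).mode` evaluates to `"camera"`, and an unsupported value makes `argparse` exit with a usage message.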
Example:
python ada.py --mode camera
- Voice: The application listens in real-time. Simply speak to the assistant to begin a conversation.
- Text: Use the input box to type commands or questions.
- Video Mode Buttons: Use the "WEBCAM", "SCREEN", and "OFFLINE" buttons on the right panel to change the visual input source.
A.D.A. can answer questions, run code, manage files, open applications, and analyze content on your screen.
If you encounter any issues, here are some common problems and their solutions:
- Symptom: The application closes immediately after starting, with an error message like Error: GEMINI_API_KEY not found.
- Solution:
  - Check .env file location: Ensure your .env file is in the root directory of the project, alongside ada.py.
  - Verify Key Names: Make sure the variable names in your .env file are exactly GEMINI_API_KEY and ELEVENLABS_API_KEY.
  - Check Key Values: Confirm that you have correctly pasted your API keys without any extra spaces or characters.
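The three checks above can be automated with a small standalone script. This is a rough sketch that uses only the standard library and a deliberately simple `KEY=value` parser; it is not how ada.py loads its keys, and the name `check_env_file` is hypothetical.

```python
from pathlib import Path

EXPECTED_KEYS = ("GEMINI_API_KEY", "ELEVENLABS_API_KEY")


def check_env_file(path=".env", expected=EXPECTED_KEYS):
    """Return a list of human-readable problems found in the .env file."""
    env_path = Path(path)
    if not env_path.is_file():
        return [f"{env_path} not found; it should sit next to ada.py"]

    # Collect KEY=value pairs, skipping blanks and comments.
    values = {}
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        values[name.strip()] = value.strip()

    problems = []
    for key in expected:
        if key not in values:
            problems.append(f"{key} is missing (check the exact spelling)")
            continue
        value = values[key]
        # Remove one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        if not value:
            problems.append(f"{key} is empty")
        elif any(ch.isspace() for ch in value):
            problems.append(f"{key} contains whitespace")
    return problems
```

Run from the project root, `check_env_file()` returns an empty list when both keys are present and free of stray whitespace. It cannot tell whether a key is actually valid; only the API can confirm that.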
- Symptom: A.D.A. does not respond to your voice commands.
- Solution:
  - Grant Permissions: Your operating system may be blocking microphone access.
    - Windows: Go to Settings > Privacy & security > Microphone and ensure "Let desktop apps access your microphone" is enabled.
    - macOS: Go to System Settings > Privacy & Security > Microphone and make sure your terminal or code editor has permission.
  - Set Default Device: The application uses your system's default input device. Check your OS sound settings to ensure the correct microphone is selected as the default.
  - PyAudio Installation: If you see errors related to PyAudio or PortAudio on startup, you may need to reinstall it or install its system dependencies as mentioned in the setup guide.
- Symptom: The video panel on the right is black when "WEBCAM" mode is active.
- Solution:
  - Grant Permissions: Just like the microphone, your OS may be blocking camera access. Check your system's privacy settings for the camera.
  - Camera In Use: Make sure no other application (like Zoom, Teams, OBS, etc.) is currently using your webcam.
  - Correct Device: The script defaults to the first available camera (index 0). If you have multiple cameras, you may need to adjust the cv2.VideoCapture(0) line in ada.py to use a different index (e.g., cv2.VideoCapture(1)).
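If you are unsure which index your camera uses, a short probe can list the indices that open successfully. The sketch below is an assumption-laden helper, not part of ada.py: `find_working_cameras` and its `open_capture` parameter are hypothetical names, and the default upper bound of 4 is arbitrary. It relies only on `cv2.VideoCapture`, `isOpened()`, and `release()` from opencv-python.

```python
def find_working_cameras(max_index=4, open_capture=None):
    """Return the indices in range(max_index) whose capture device opens.

    open_capture is a hypothetical hook for substituting a fake device;
    by default the function uses cv2.VideoCapture from opencv-python.
    """
    if open_capture is None:
        import cv2  # imported lazily so the helper loads without OpenCV

        open_capture = cv2.VideoCapture

    working = []
    for index in range(max_index):
        capture = open_capture(index)
        try:
            if capture.isOpened():
                working.append(index)
        finally:
            # Always free the device so other apps can use it.
            capture.release()
    return working
```

Calling `find_working_cameras()` with the webcam plugged in and no other app using it returns the indices worth trying in the `cv2.VideoCapture(...)` line. An index that opens may still belong to a virtual camera, so check the resulting feed.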