🌱 SYLVA-navigator

Smart Yield-Logic Vision Agent (I created this project to enter the Google Gemini Live Agent Challenge hackathon.)

Demo Video

SYLVA-navigator Demo

Overview

SYLVA is an autonomous interactive agent developed for the Gemini Live Agent Challenge. Departing from traditional DOM-based automation, SYLVA leverages Gemini's multimodal vision to understand screen layouts and infer user intent via its proprietary Yield-Logic. This system acts as a high-precision guide for complex enterprise interfaces, balancing technical reliability with innovative user experience.

Project Category

UI Navigator: Autonomous navigation and user guidance through visual analysis.

Key Features

Yield-Logic: An advanced algorithm that evaluates the functional value (yield) of UI elements based on visual pixel data.
Real-time Intent Inference: Predictive analysis of cursor velocity and dwell time to anticipate user goals.
Enterprise-grade Security: Secure data paths between local environments and Google Cloud via ADC and Service Account roles.
Visual Reasoning Log: Real-time visualization of the Gemini model's reasoning, rendered directly onto the target browser.
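
The two signals named under Real-time Intent Inference, cursor velocity and dwell time, can be illustrated with a short sketch. This is not SYLVA's actual implementation: `CursorSample`, `mean_velocity`, `dwell_time`, and the 10-pixel radius are hypothetical names and values chosen for illustration, and the real logic is described in GEMINI.md.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class CursorSample:
    """One cursor observation (hypothetical type): time in seconds, position in pixels."""
    t: float
    x: float
    y: float


def mean_velocity(samples: list[CursorSample]) -> float:
    """Average cursor speed in pixels per second over the sample window."""
    if len(samples) < 2:
        return 0.0
    # Sum the straight-line distance between consecutive samples.
    distance = sum(
        hypot(b.x - a.x, b.y - a.y) for a, b in zip(samples, samples[1:])
    )
    elapsed = samples[-1].t - samples[0].t
    return distance / elapsed if elapsed > 0 else 0.0


def dwell_time(samples: list[CursorSample], radius: float = 10.0) -> float:
    """Seconds the cursor has stayed within `radius` pixels of its latest position."""
    if not samples:
        return 0.0
    last = samples[-1]
    start = last.t
    # Walk backwards in time until the cursor was outside the radius.
    for s in reversed(samples):
        if hypot(s.x - last.x, s.y - last.y) > radius:
            break
        start = s.t
    return last.t - start
```

A low velocity combined with a long dwell time near one element would plausibly suggest the user is considering that element, while a high velocity suggests the cursor is in transit. How SYLVA weighs these signals is not specified here.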

Tech Stack

AI Model: Gemini 2.5 Flash Lite (Vertex AI)
Backend: FastAPI (Python 3.11+)
UI: Integrated Driver Overlay (HTML/CSS/JS)
Automation: Playwright
Infrastructure: Google Cloud Run, ADC

🛠 Setup and Installation

1. Prerequisites

Python 3.11 or higher.
Google Cloud SDK, installed and configured.
Chrome/Chromium (managed via Playwright).
Google Cloud project with the Vertex AI API enabled.

2. Environment Variables

SYLVA requires specific environment variables for cloud integration. Refer to .env.example for the template.

Bridge (Cloud):

GCP_PROJECT_ID: Your Google Cloud Project ID.
GCP_LOCATION: The Google Cloud region for Vertex AI (e.g., us-central1).

Driver (Local):

BRIDGE_URL: The URL of your deployed Cloud Run service.

# Example setup
cp .env.example .env
# Edit .env with your specific project details
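
A quick way to catch a missing variable before starting either component is a small check like the one below. It is a sketch only: `missing_vars` and the two tuples are hypothetical helpers, not part of the repository, and it assumes the values from .env have already been loaded into the process environment.

```python
import os

# Variable names taken from the lists above.
REQUIRED_BRIDGE = ("GCP_PROJECT_ID", "GCP_LOCATION")
REQUIRED_DRIVER = ("BRIDGE_URL",)


def missing_vars(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    for label, required in (("Bridge", REQUIRED_BRIDGE), ("Driver", REQUIRED_DRIVER)):
        missing = missing_vars(required)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{label}: {status}")
```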

3. Installation

Bridge (Cloud Core):

cd bridge
pip install -r requirements.txt
cd ..

Driver (Local Sentinel):

cd driver
pip install -r requirements.txt
playwright install chromium
cd ..

🚀 How to Run

1. Deploy the Bridge (Cloud)

Ensure you are authenticated with Application Default Credentials (ADC):

gcloud auth application-default login
chmod +x deploy.sh
./deploy.sh
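
If the deploy script fails with a credentials error, it can help to confirm that an ADC file actually exists. The sketch below only looks in the usual places: the `GOOGLE_APPLICATION_CREDENTIALS` override, then the default file that `gcloud auth application-default login` writes. `adc_candidate_path` is a hypothetical helper, and it assumes the default gcloud configuration directory has not been relocated.

```python
import os
from pathlib import Path

ADC_FILENAME = "application_default_credentials.json"


def adc_candidate_path(env=None) -> Path:
    """Return the path where ADC credentials are most likely stored."""
    env = os.environ if env is None else env
    # An explicit credentials file takes precedence over the gcloud default.
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        return Path(explicit)
    if os.name == "nt":
        # Windows: %APPDATA%\gcloud\application_default_credentials.json
        return Path(env.get("APPDATA", "")) / "gcloud" / ADC_FILENAME
    # Linux/macOS: ~/.config/gcloud/application_default_credentials.json
    return Path.home() / ".config" / "gcloud" / ADC_FILENAME


if __name__ == "__main__":
    path = adc_candidate_path()
    print(f"{path}: {'found' if path.is_file() else 'not found'}")
```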

2. Run the Driver (Local)

Once the bridge is deployed, update the BRIDGE_URL in your .env and run:

python driver/main.py --headed

🎯 Sample Input & Test Scenarios

To verify SYLVA's autonomous navigation, you can use the following test cases:

Wikipedia Knowledge Fetch: "Search for 'Gemini (chatbot)' on Wikipedia and find the 'History' section."
- Initial URL: https://www.wikipedia.org
Google Maps Location Search: "Search for 'Googleplex' on Google Maps and find the 'Directions' button."
- Initial URL: https://www.google.com/maps

To change the agent's response language, modify the AGENT_LANGUAGE variable in the .env file.
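
For repeated manual testing, the two scenarios can be kept as data so they are easy to list or extend. This is a minimal sketch: the `Scenario` type and `SCENARIOS` list are hypothetical and not part of the driver, and how an instruction is passed to the driver is not shown because it is not documented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """A manual test case (hypothetical type): where to start and what to ask."""
    name: str
    start_url: str
    instruction: str


SCENARIOS = [
    Scenario(
        name="Wikipedia Knowledge Fetch",
        start_url="https://www.wikipedia.org",
        instruction="Search for 'Gemini (chatbot)' on Wikipedia and find the 'History' section.",
    ),
    Scenario(
        name="Google Maps Location Search",
        start_url="https://www.google.com/maps",
        instruction="Search for 'Googleplex' on Google Maps and find the 'Directions' button.",
    ),
]

if __name__ == "__main__":
    for s in SCENARIOS:
        print(f"{s.name}\n  start: {s.start_url}\n  ask:   {s.instruction}")
```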

📑 Documentation

GEMINI.md: Technical architecture and Yield-Logic details.
Architecture Details: Visual system diagram.
Medium Blog Post: Detailed development journey.
