Spatial Engine AI 💡 v1.0.0

DeepTech Autonomous Agent for Optical Physics & Energy Optimization

Powered by Gemini 3 Pro & Google GenAI SDK

Spatial Engine is a multimodal AI agent designed to act as a Senior Optical Physicist. Unlike standard chatbots, it combines Generative AI's vision capabilities with a deterministic physics engine to audit rooms, calculate lighting deficits, and project energy ROI.

🎥 Demo

🚀 Key Features

1. The Physics Core (Deterministic)

The agent does not "guess" math. It delegates calculations to a rigorous Python engine.

Illuminance Calculation: Uses the Inverse Square Law ($E = I/d^2$, with $E$ in lux, $I$ in candela, and $d$ in meters) and Beam Angle geometry to calculate Lux levels at specific points.
Health Compliance (ISO/SanPiN): Automatically checks if lighting levels meet health standards for offices (500 Lux), living rooms, etc., and warns of safety deficits.
Unit Tested: All physics formulas are covered by unittest test cases to guard against regressions.
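
As a rough illustration of the calculation described above, the sketch below converts a lamp's luminous flux to on-axis intensity using the beam angle, applies the inverse square law, and compares the result with a target. The function names and the uniform-cone idealization are assumptions made for this example, and the 500 Lux default is the office figure quoted above; this is not the project's actual physics_engine.py.

```python
import math


def beam_solid_angle(beam_angle_deg: float) -> float:
    """Solid angle (steradians) of a cone with the given full beam angle."""
    half_angle = math.radians(beam_angle_deg) / 2.0
    return 2.0 * math.pi * (1.0 - math.cos(half_angle))


def luminous_intensity(lumens: float, beam_angle_deg: float) -> float:
    """Approximate on-axis intensity (candela), assuming the flux is spread
    uniformly over the beam cone. Real lamps deviate from this idealization."""
    return lumens / beam_solid_angle(beam_angle_deg)


def illuminance(intensity_cd: float, distance_m: float) -> float:
    """Inverse square law, E = I / d^2, for a surface facing the source."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return intensity_cd / distance_m ** 2


def meets_target(lux: float, target_lux: float = 500.0) -> bool:
    """Pass/fail check against a target illuminance in lux."""
    return lux >= target_lux


if __name__ == "__main__":
    intensity = luminous_intensity(lumens=800, beam_angle_deg=60)
    lux = illuminance(intensity, distance_m=2.0)
    print(f"{lux:.0f} lux, office target met: {meets_target(lux)}")
```

Keeping each step as a small pure function makes it straightforward to cover with unittest, which matches the testing approach described above.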

2. The Market & Economic Engine (Real-Time)

The agent connects physics to the real economy.

Live Market Search: Finds real-world products (prices, specs) and local electricity rates (USD/kWh) via Google Search.
ROI & Energy Calculator: Computes financial savings (USD) and CO2 reduction when switching lighting technologies (e.g., Incandescent to LED).
Search Verification: "Trust but Verify" logic. The agent reads product specs to ensure a lamp is truly "dimmable" or "smart" before recommending it.
Fallback Resilience: Continues working offline using averaged market data if the internet connection fails.

3. The Vision System (Multimodal)

The agent can "see" and audit a room from a single photograph using Gemini Vision.

3x3 Grid Analysis: Mentally divides the image into sectors to pinpoint features (e.g., "Window in Sector 3").
Material Detection: Analyzes wall textures (Concrete vs. Paint) to estimate Albedo (reflection coefficients).
Shadow Detection: Identifies under-lit zones requiring optimization.
Scale Estimation: Uses Reference Object Inference (e.g., comparing room width to standard door frames) to estimate floor area without user input.
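
Two of the ideas above, the 3x3 grid and scale estimation from a reference object, reduce to simple geometry. The sketch below is one possible reading: sectors are numbered 1 to 9 in row-major order from the top-left (an assumption; the project may number them differently), and the 0.8 m door width is an illustrative default, since real door widths vary by region. The scale helper ignores perspective, so it is only reasonable when the door and the measured span sit at a similar depth in the photo.

```python
ASSUMED_DOOR_WIDTH_M = 0.8  # illustrative default, not a value from the project


def grid_sector(x: float, y: float, width: int, height: int) -> int:
    """Return the 3x3 sector (1-9, row-major from the top-left) containing pixel (x, y)."""
    if not (0 <= x < width and 0 <= y < height):
        raise ValueError("point lies outside the image")
    col = int(3 * x / width)
    row = int(3 * y / height)
    return row * 3 + col + 1


def metres_per_pixel(door_width_px: float, door_width_m: float = ASSUMED_DOOR_WIDTH_M) -> float:
    """Image scale derived from a reference object of known real-world width."""
    if door_width_px <= 0:
        raise ValueError("reference width in pixels must be positive")
    return door_width_m / door_width_px


def estimate_room_width(room_width_px: float, door_width_px: float) -> float:
    """Estimate room width in meters by comparing it with a door in the same image."""
    return room_width_px * metres_per_pixel(door_width_px)
```

Under this numbering, a feature in the top-right corner of the photo falls in Sector 3.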

4. Spatial State Memory (Stateful)

The agent possesses a "Short-term Memory" via the SpatialState class.

Persistence: It remembers room geometry and light sources across multiple reasoning steps.
Layering: Can combine visual data (from a photo) with technical data (from a PDF) into a single simulation model.
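
A minimal sketch of what such a state object might look like is shown below. The fields and methods are hypothetical stand-ins chosen for this example; the real SpatialState class in my_agent/spatial_state.py may be structured quite differently.

```python
from dataclasses import dataclass, field


@dataclass
class LightSource:
    name: str
    lumens: float
    watts: float


@dataclass
class SpatialState:
    """Hypothetical stand-in for the project's SpatialState class."""

    room: dict[str, float] = field(default_factory=dict)
    lights: list[LightSource] = field(default_factory=list)

    def update_room(self, **measurements: float) -> None:
        """Layer new measurements over existing ones; later sources win on conflicts."""
        self.room.update(measurements)

    def add_light(self, light: LightSource) -> None:
        """Record a light source so later reasoning steps can reuse it."""
        self.lights.append(light)

    def total_lumens(self) -> float:
        return sum(light.lumens for light in self.lights)


if __name__ == "__main__":
    state = SpatialState()
    state.update_room(width_m=4.0, albedo=0.5)          # e.g. estimated from a photo
    state.add_light(LightSource("catalog LED", 800, 9))  # e.g. read from a PDF datasheet
    print(state.room, state.total_lumens())
```

Holding both layers in one object is what lets a later step simulate a catalog lamp inside the room seen in the photo.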

5. Technical Document Parsing

PDF Analysis: Capable of reading datasheets and blueprints to extract technical specifications (Lumens, Watts, CRI).
Simulation: Can "virtually install" a lamp found in a catalog into the scanned room to predict the final Lux level.
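
One simple way to approximate a "virtual install" is the lumen method, which estimates room-average illuminance from total luminous flux and floor area. The README does not say which method the project uses, so treat the sketch below as an assumption; the utilization and maintenance factors are illustrative defaults, and the function names are hypothetical.

```python
def average_lux(
    total_lumens: float,
    floor_area_m2: float,
    utilization_factor: float = 0.5,
    maintenance_factor: float = 0.8,
) -> float:
    """Lumen method: E_avg = flux * UF * MF / area (room average, not a point value)."""
    if floor_area_m2 <= 0:
        raise ValueError("floor area must be positive")
    return total_lumens * utilization_factor * maintenance_factor / floor_area_m2


def lux_after_install(
    existing_lumens: float,
    lamp_lumens: float,
    lamp_count: int,
    floor_area_m2: float,
) -> float:
    """Predicted average Lux after adding lamp_count catalog lamps to the room."""
    return average_lux(existing_lumens + lamp_lumens * lamp_count, floor_area_m2)
```

Here `lamp_lumens` would come from the parsed datasheet and `floor_area_m2` from the scanned room.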

6. Standards & Compatibility (Expert System)

The agent acts as a certified engineer, not just a salesperson.

Knowledge Base (RAG): Consults internal standards (Zigbee, Matter, Philips Hue) to ensure hardware compatibility.
Config Generator: Automatically generates JSON configuration files for Home Assistant/HomeKit based on the designed lighting scenes.
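
To give a feel for the Config Generator, the sketch below emits a JSON document for the three scenes named in the roadmap (Focus, Relax, Movie). The schema, the preset values, and the function name are all made up for illustration and have not been checked against the formats Home Assistant or HomeKit actually accept.

```python
import json

# Illustrative presets; the brightness and color temperature values are made up.
SCENE_PRESETS = {
    "Focus": {"brightness_pct": 100, "color_temp_kelvin": 5000},
    "Relax": {"brightness_pct": 40, "color_temp_kelvin": 2700},
    "Movie": {"brightness_pct": 10, "color_temp_kelvin": 2200},
}


def build_scene_config(entity_ids: list[str]) -> str:
    """Return a JSON string applying each scene preset to every listed light."""
    scenes = [
        {
            "name": name,
            "entities": {eid: {"state": "on", **preset} for eid in entity_ids},
        }
        for name, preset in SCENE_PRESETS.items()
    ]
    return json.dumps({"scenes": scenes}, indent=2)


if __name__ == "__main__":
    print(build_scene_config(["light.desk", "light.ceiling"]))
```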

7. Agentic Workflow

Tool Use: Autonomous Function Calling (The agent decides when to calculate, when to search, and when to read standards).
Streaming CLI: Real-time "Thinking" logs showing Tool Calls and arguments in the terminal.
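
The tool-use loop can be pictured with a small, SDK-independent dispatcher like the one below. It only illustrates the pattern: the model's decision is represented as a plain dict, and the registry, decorator, and log format are hypothetical rather than taken from the project or from the Google GenAI SDK.

```python
import json
from typing import Any, Callable

# Registry mapping tool names to plain Python callables.
TOOLS: dict[str, Callable[..., Any]] = {}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function under its own name so it can be called by name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def calculate_illuminance(intensity_cd: float, distance_m: float) -> float:
    """Example tool: inverse square law, E = I / d^2."""
    return intensity_cd / distance_m ** 2


def dispatch(call: dict[str, Any]) -> Any:
    """Run one tool call of the form {"name": ..., "args": {...}} and log it."""
    name, args = call["name"], call.get("args", {})
    print(f"[tool call] {name}({json.dumps(args)})")
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    result = TOOLS[name](**args)
    print(f"[tool result] {result}")
    return result


if __name__ == "__main__":
    dispatch({"name": "calculate_illuminance", "args": {"intensity_cd": 100, "distance_m": 2}})
```

Printing the call name and arguments before executing is a simple way to get the kind of terminal trace the Streaming CLI item describes.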

🛠️ Tech Stack

| Component | Technologies |
|---|---|
| Frontend | React 19, Vite, TailwindCSS, TypeScript |
| Backend | Python 3.12, FastAPI, Uvicorn |
| AI Core | Google GenAI SDK, Gemini 3 Pro, Gemini Live API |
| Infrastructure | Docker, Google Cloud Run, uv (package manager) |

8. New in v1.0.0

Gemini Live API: Real-time multimodal interaction.
Live Persona: Customized voice and personality for the Live API.
Cloud Run Ready: Fully configured for serverless deployment on Google Cloud.

🏗️ Architecture

For detailed documentation on system design and data flows, see ARCHITECTURE.md.

graph TD
    User[User] -->|Interactions| FE["Frontend (React/Vite)"]
    
    subgraph "Client Side"
        FE -->|REST| GAI[Google Gemini API]
        FE -->|WebSocket/RTP| Live[Gemini Multimodal Live API]
    end
    
    subgraph "Server Side (Python/FastAPI)"
        FE -->|HTTP Requests| BE[Backend API]
        
        BE -->|Calculations| PE[Physics Engine]
        BE -->|Market/Search| MA[Market Agent]
        BE -->|Generation| RG[Report Generator]
        
        subgraph "Agent Core"
            AC[Agent Runtime] -->|Tools| PE
            AC -->|Tools| MA
            AC -->|Tools| KB["Knowledge Base (RAG)"]
            AC -->|State| SS[Spatial State]
        end
        
        BE -->|Invokes| AC
    end
    
    MA -->|Search| Web[Google Search]
    RG -->|Outputs| PDF["PDF/HTML Reports"]

🛠️ Project Structure

spatial-engine/
├── backend/                # FastAPI Backend
│   ├── main.py             # API Entry Points
│   ├── report_generator.py # HTML Report Logic
│   └── pdf_generator.py    # PDF Export Logic
├── frontend/               # React Frontend (Vite)
│   └── src/
│       ├── components/     # UI Components (VisionAudit, EconomicEngine, etc.)
│       └── App.tsx         # Main UI Layout
├── my_agent/               # The AI Core
│   ├── agent.py            # The "Brain"
│   ├── market_agent.py     # The "Hands"
│   ├── physics_engine.py   # The "Core"
│   └── spatial_state.py    # The "Memory"
├── data/
│   └── smart_home_standards.md # RAG Knowledge Base
├── tests/                  # Unit Tests
├── .env                    # Configuration
├── pyproject.toml          # Python Dependencies
└── README.md               # Documentation

⚡ Quick Start

Prerequisites

Python 3.12+
uv (modern Python package manager)
Node.js and npm (for the frontend)
Google Gemini API Key

Installation

Clone & Sync:

git clone https://github.com/vero-code/spatial-engine.git
cd spatial-engine
uv sync

Configure Environment:

Create a .env file:

# for backend
GOOGLE_API_KEY=your_gemini_key_here

# for frontend
VITE_GEMINI_API_KEY=your_gemini_key_here

Run the Agent:

# Start the Backend
uv run uvicorn backend.main:app --reload

# Install frontend dependencies (first run only), then start the Frontend (in a new terminal)
npm install --prefix frontend
npm run dev --prefix frontend

Run Tests:

# Verify physics engine integrity
uv run python -m unittest discover tests

🗺️ Roadmap Status

🚩 Sprint 1: The Core (Completed)

Status: Fully Operational. 100% Test Coverage.

Infrastructure: Environment setup (uv), Project structure, Basic ADK integration.
Physics Engine: Deterministic calculations for Illuminance ($E = I/d^2$) and Energy ROI.
Reliability: Pydantic typing for tools, unittest suite coverage, Chain of Thought logging.
Persona: Senior Optical Engineer system prompt configuration.

👁️ Sprint 2: The Vision (Completed)

Status: Implemented. Agent "sees" geometry and materials, "reads", and "remembers".

Multimodality: Binary File Handler for image uploads.
Visual Analysis: 3x3 Grid decomposition, Shadow Detection, Material/Albedo identification.
Spatial Reasoning: Scale estimation via Reference Object Inference (no user input needed).
Advanced Features: PDF Parser for blueprints, Persistent Spatial State class.

🛒 Sprint 3: The Market & Intelligence (Completed)

Status: Implemented. Connecting Physics to Economics, Standards & Safety.

Market Agent: Multi-threaded Google Search for products and electricity rates.
Search Verification: Agent verifies technical specs (e.g., is_dimmable, protocol) before recommending to ensure compatibility.
Health Checks: ISO/SanPiN compliance tool (Pass/Fail verdicts for Lux levels).
Smart Standards (RAG): Knowledge Base for Zigbee/Matter/Hue compatibility.
Config Generator: JSON output for Home Assistant scenes (Focus/Relax/Movie).
Robustness: Fallback Mode logic for offline operation.

🎨 Sprint 4: The Interface (Completed)

Status: Fully Operational. Generative UI and Reporting live.

Visualization: Heatmaps for Vision Audit and Physics Engine.
Reporting: HTML and PDF report generation.
Generative UI: Interactive React Frontend with Budget Slider and real-time updates.

🏆 Sprint 5: The Pitch (Completed)

Goal: Polish and Submission.

Gemini Live API: Real-time active reasoning.
Documentation: Architecture diagrams, Demo video script, Final submission text.

🔮 Future Roadmap

Optimization: Latency reduction, Error handling, End-to-End testing.
Hardware: Gemini 3 reasoning with Nano Banana Pro.
Synthesized Video: Gemini Live API to synthesize live video for real-time recommendations.
Voice Chat: Bi-directional voice recognition in the chat interface.

Built for the Gemini 3 Hackathon.
