
Spatial Engine AI: A DeepTech Agent that sees like a designer and calculates like a physicist. Powered by Gemini 3 Pro & Vision to optimize energy efficiency in built environments.


vero-code/spatial-engine


Spatial Engine AI 💡 v1.0.0

DeepTech Autonomous Agent for Optical Physics & Energy Optimization. Powered by Gemini 3 Pro & Google GenAI SDK.

Python FastAPI React Vite Gemini Antigravity License

Spatial Engine is a multimodal AI agent designed to act as a Senior Optical Physicist. Unlike standard chatbots, it combines Generative AI's vision capabilities with a deterministic physics engine to audit rooms, calculate lighting deficits, and project energy ROI.

🎥 Demo

Spatial Engine AI Demo


🚀 Key Features

1. The Physics Core (Deterministic)

The agent does not "guess" math. It delegates calculations to a rigorous Python engine.

  • Illuminance Calculation: Uses the Inverse Square Law ($E=I/d^2$) and Beam Angle geometry to calculate exact Lux levels at specific points.
  • Health Compliance (ISO/SanPiN): Automatically checks if lighting levels meet health standards for offices (500 Lux), living rooms, etc., and warns of safety deficits.
  • Unit Tested: All physics formulas are covered by unittest cases, so regressions in the math surface as test failures.
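
As a rough illustration of the calculation described above, the sketch below derives luminous intensity from a lamp's lumen output and beam angle, then applies $E = I/d^2$ on the beam axis. The function names, the uniform-cone assumption, and the example lamp (800 lm, 120° beam) are illustrative assumptions, not the code in physics_engine.py; only the 500 Lux office target comes from this README.

```python
import math


def luminous_intensity_cd(lumens: float, beam_angle_deg: float) -> float:
    """Average intensity (candela) of a lamp emitting uniformly into a cone."""
    if not 0 < beam_angle_deg <= 360:
        raise ValueError("beam angle must be in (0, 360] degrees")
    half_angle = math.radians(beam_angle_deg) / 2
    # Solid angle of a cone with the given full apex angle, in steradians.
    solid_angle_sr = 2 * math.pi * (1 - math.cos(half_angle))
    return lumens / solid_angle_sr


def illuminance_lux(intensity_cd: float, distance_m: float) -> float:
    """Inverse square law, E = I / d^2, for a point on the beam axis."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return intensity_cd / distance_m ** 2


def meets_target(lux: float, target_lux: float = 500.0) -> bool:
    """Pass/fail check against a target such as the 500 Lux office level."""
    return lux >= target_lux


# Example lamp (assumed values): 800 lm, 120 degree beam, desk 2 m below.
intensity = luminous_intensity_cd(lumens=800, beam_angle_deg=120)
lux_at_desk = illuminance_lux(intensity, distance_m=2.0)
print(f"{intensity:.1f} cd, {lux_at_desk:.1f} lx, office OK: {meets_target(lux_at_desk)}")
```

Keeping the three steps as separate pure functions makes each one easy to unit test in isolation. Off-axis points and reflected light would need extra terms that this sketch leaves out.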

Physics Engine

2. The Market & Economic Engine (Real-Time)

The agent connects physics to the real economy.

  • Live Market Search: Finds real-world products (prices, specs) and local electricity rates (USD/kWh) via Google Search.
  • ROI & Energy Calculator: Computes financial savings (USD) and CO2 reduction when switching lighting technologies (e.g., Incandescent to LED).
  • Search Verification: "Trust but Verify" logic. The agent reads product specs to ensure a lamp is truly "dimmable" or "smart" before recommending it.
  • Fallback Resilience: Continues working offline using averaged market data if the internet connection fails.
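
The savings calculation and the offline fallback can be sketched together as follows. The fallback rate, the CO2 factor, and all function names are hypothetical placeholders chosen for illustration; the README does not state the averaged values the project actually uses.

```python
from typing import Callable, Dict, Tuple

# Hypothetical fallback averages, NOT values taken from the project.
FALLBACK_RATE_USD_PER_KWH = 0.15
FALLBACK_CO2_KG_PER_KWH = 0.4


def resolve_rate(fetch_live_rate: Callable[[], float]) -> Tuple[float, bool]:
    """Return (rate, used_fallback); fall back when the live lookup fails."""
    try:
        return fetch_live_rate(), False
    except OSError:  # ConnectionError and TimeoutError are OSError subclasses
        return FALLBACK_RATE_USD_PER_KWH, True


def annual_savings(
    old_watts: float,
    new_watts: float,
    hours_per_day: float,
    rate_usd_per_kwh: float,
    co2_kg_per_kwh: float = FALLBACK_CO2_KG_PER_KWH,
) -> Dict[str, float]:
    """Yearly energy, cost, and CO2 saved by swapping one lamp for another."""
    kwh_saved = (old_watts - new_watts) / 1000 * hours_per_day * 365
    return {
        "kwh": kwh_saved,
        "usd": kwh_saved * rate_usd_per_kwh,
        "co2_kg": kwh_saved * co2_kg_per_kwh,
    }


def offline_lookup() -> float:
    """Stand-in for a live electricity-rate search with no network."""
    raise ConnectionError("no network")


rate, used_fallback = resolve_rate(offline_lookup)
savings = annual_savings(old_watts=60, new_watts=9, hours_per_day=4, rate_usd_per_kwh=rate)
print(used_fallback, savings)
```

Returning the `used_fallback` flag alongside the rate lets a report state whether its figures rest on live or averaged data.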

Economic Engine

3. The Vision System (Multimodal)

The agent can "see" and audit a room from a single photograph using Gemini Vision.

  • 3x3 Grid Analysis: Mentally divides the image into sectors to pinpoint features (e.g., "Window in Sector 3").
  • Material Detection: Analyzes wall textures (Concrete vs. Paint) to estimate Albedo (reflection coefficients).
  • Shadow Detection: Identifies under-lit zones requiring optimization.
  • Scale Estimation: Uses Reference Object Inference (e.g., comparing room width to standard door frames) to estimate floor area without user input.
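
Two of the ideas above reduce to simple arithmetic once the model has located features in the image. The sketch below maps a pixel to a 3x3 sector and scales a room width from a reference door. The row-major numbering (sector 1 top-left, sector 9 bottom-right), the 0.9 m door width, and the function names are assumptions made for illustration; in the project itself the grid reasoning is done by the vision model.

```python
def sector_for_point(x: int, y: int, width: int, height: int, grid: int = 3) -> int:
    """Return the 1-based, row-major sector that contains pixel (x, y)."""
    if not (0 <= x < width and 0 <= y < height):
        raise ValueError("point lies outside the image")
    col = x * grid // width
    row = y * grid // height
    return row * grid + col + 1


def estimate_width_m(room_px: float, door_px: float, door_width_m: float = 0.9) -> float:
    """Scale a pixel measurement using a door of assumed real-world width.

    Ignores perspective, so it only holds when the door and the measured
    span lie at a similar depth in the photo.
    """
    if door_px <= 0:
        raise ValueError("reference width must be positive")
    return room_px * door_width_m / door_px


# A window near the top-right corner of a 1920x1080 photo.
window_sector = sector_for_point(1700, 100, width=1920, height=1080)
room_width_m = estimate_width_m(room_px=900, door_px=180)
print(window_sector, round(room_width_m, 2))
```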

Vision Audit

4. Spatial State Memory (Stateful)

The agent possesses a "Short-term Memory" via the SpatialState class.

  • Persistence: It remembers room geometry and light sources across multiple reasoning steps.
  • Layering: Can combine visual data (from a photo) with technical data (from a PDF) into a single simulation model.
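
A minimal sketch of such a state object is shown below, assuming a plain dataclass. The field names, the `LightSource` helper, and the methods are hypothetical; the real SpatialState in spatial_state.py may be structured quite differently.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LightSource:
    name: str
    lumens: float
    watts: float
    origin: str  # where the data came from, e.g. "photo" or "pdf"


@dataclass
class SpatialState:
    """Hypothetical stand-in for the project's SpatialState class."""

    width_m: Optional[float] = None
    length_m: Optional[float] = None
    lights: List[LightSource] = field(default_factory=list)

    def update_geometry(self, width_m: float, length_m: float) -> None:
        self.width_m, self.length_m = width_m, length_m

    def add_light(self, light: LightSource) -> None:
        self.lights.append(light)

    @property
    def floor_area_m2(self) -> Optional[float]:
        if self.width_m is None or self.length_m is None:
            return None
        return self.width_m * self.length_m

    @property
    def total_lumens(self) -> float:
        return sum(light.lumens for light in self.lights)


state = SpatialState()
state.update_geometry(width_m=4.0, length_m=5.0)                 # from the photo audit
state.add_light(LightSource("ceiling lamp", 600, 60, "photo"))   # seen in the photo
state.add_light(LightSource("catalog LED", 806, 9, "pdf"))       # read from a datasheet
print(state.floor_area_m2, state.total_lumens)
```

Tagging each light with its `origin` keeps the photo layer and the PDF layer distinguishable after they are merged into one model.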

5. Technical Document Parsing

  • PDF Analysis: Capable of reading datasheets and blueprints to extract technical specifications (Lumens, Watts, CRI).
  • Simulation: Can "virtually install" a lamp found in a catalog into the scanned room to predict the final Lux level.
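
The README does not say how specifications are pulled out of a PDF, and the model most likely reads the document directly. Purely to illustrate the target structure (Lumens, Watts, CRI), the sketch below applies regular expressions to text that has already been extracted; the patterns, the function name, and the sample line are all assumptions.

```python
import re
from typing import Dict, Optional

# Illustrative patterns only; real datasheets vary far more than this.
SPEC_PATTERNS = {
    "lumens": re.compile(r"(\d+(?:\.\d+)?)\s*lm\b", re.IGNORECASE),
    "watts": re.compile(r"(\d+(?:\.\d+)?)\s*W\b"),
    "cri": re.compile(r"\bCRI\D{0,5}(\d{2,3})", re.IGNORECASE),
}


def extract_specs(text: str) -> Dict[str, Optional[float]]:
    """Return the first match for each spec, or None when it is absent."""
    specs: Dict[str, Optional[float]] = {}
    for key, pattern in SPEC_PATTERNS.items():
        match = pattern.search(text)
        specs[key] = float(match.group(1)) if match else None
    return specs


sample = "LED bulb E27, 806 lm, 9 W, CRI >80, 2700 K"
specs = extract_specs(sample)
print(specs)
```

Missing values come back as `None` rather than a guessed default, so a later simulation step can refuse to run on incomplete data.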

6. Standards & Compatibility (Expert System)

The agent acts as a certified engineer, not just a salesperson.

  • Knowledge Base (RAG): Consults internal standards (Zigbee, Matter, Philips Hue) to ensure hardware compatibility.
  • Config Generator: Automatically generates JSON configuration files for Home Assistant/HomeKit based on the designed lighting scenes.
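
A config generator of this kind could look roughly like the sketch below. The output shape is loosely modelled on Home Assistant's scene format (a scene name plus a mapping of entity IDs to target states), but the exact schema the project emits is not documented here, and the brightness and colour-temperature presets are hypothetical. The scene names Focus, Relax, and Movie come from the roadmap in this README.

```python
import json
from typing import Dict, List

# Hypothetical presets; tune per room and per lamp.
SCENE_PRESETS: Dict[str, Dict[str, int]] = {
    "Focus": {"brightness": 255, "color_temp_kelvin": 5000},
    "Relax": {"brightness": 120, "color_temp_kelvin": 2700},
    "Movie": {"brightness": 40, "color_temp_kelvin": 2200},
}


def build_scenes(entity_ids: List[str]) -> List[dict]:
    """Create one scene per preset, applying it to every listed light."""
    return [
        {
            "name": name,
            "entities": {eid: {"state": "on", **settings} for eid in entity_ids},
        }
        for name, settings in SCENE_PRESETS.items()
    ]


scenes = build_scenes(["light.desk", "light.ceiling"])
config_json = json.dumps({"scene": scenes}, indent=2)
print(config_json)
```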

7. Agentic Workflow

  • Tool Use: Autonomous Function Calling (The agent decides when to calculate, when to search, and when to read standards).
  • Streaming CLI: Real-time "Thinking" logs showing Tool Calls and arguments in the terminal.
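
The tool-use loop can be pictured as a registry plus a dispatcher. In the project, the function calls come from the Google GenAI SDK; the sketch below does not use the SDK and instead simulates a call as a plain dict, so the `tool` decorator, `dispatch`, and the dict shape are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict

TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a plain Python function under its own name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def calculate_illuminance(intensity_cd: float, distance_m: float) -> float:
    return intensity_cd / distance_m ** 2


def dispatch(call: Dict[str, Any]) -> Dict[str, Any]:
    """Log a simulated tool call, run it, and return a result or an error."""
    name, args = call["name"], call.get("args", {})
    print(f"[tool call] {name}({args})")  # mimics the streaming CLI log
    if name not in TOOLS:
        return {"name": name, "error": "unknown tool"}
    return {"name": name, "result": TOOLS[name](**args)}


ok = dispatch({"name": "calculate_illuminance", "args": {"intensity_cd": 400.0, "distance_m": 2.0}})
missing = dispatch({"name": "order_pizza"})
print(ok, missing)
```

Returning an error dict for unknown tools, rather than raising, gives the model a message it can react to on the next reasoning step.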

🛠️ Tech Stack

| Component      | Technologies                                      |
|----------------|---------------------------------------------------|
| Frontend       | React 19, Vite, TailwindCSS, TypeScript           |
| Backend        | Python 3.12, FastAPI, Uvicorn                     |
| AI Core        | Google GenAI SDK, Gemini 3.0 Pro, Gemini Live API |
| Infrastructure | Docker, Google Cloud Run, UV (Package Manager)    |

8. New in v1.0.0

  • Gemini Live API: Real-time multimodal interaction.
  • Live Persona: Customized voice and personality for the Live API.
  • Cloud Run Ready: Fully configured for serverless deployment on Google Cloud.

🏗️ Architecture

For detailed documentation on system design and data flows, see ARCHITECTURE.md.

graph TD
    User[User] -->|Interactions| FE["Frontend (React/Vite)"]
    
    subgraph "Client Side"
        FE -->|REST| GAI[Google Gemini API]
        FE -->|WebSocket/RTP| Live[Gemini Multimodal Live API]
    end
    
    subgraph "Server Side (Python/FastAPI)"
        FE -->|HTTP Requests| BE[Backend API]
        
        BE -->|Calculations| PE[Physics Engine]
        BE -->|Market/Search| MA[Market Agent]
        BE -->|Generation| RG[Report Generator]
        
        subgraph "Agent Core"
            AC[Agent Runtime] -->|Tools| PE
            AC -->|Tools| MA
            AC -->|Tools| KB["Knowledge Base (RAG)"]
            AC -->|State| SS[Spatial State]
        end
        
        BE -->|Invokes| AC
    end
    
    MA -->|Search| Web[Google Search]
    RG -->|Outputs| PDF["PDF/HTML Reports"]

🛠️ Project Structure

spatial-engine/
├── backend/                # FastAPI Backend
│   ├── main.py             # API Entry Points
│   ├── report_generator.py # HTML Report Logic
│   └── pdf_generator.py    # PDF Export Logic
├── frontend/               # React Frontend (Vite)
│   ├── src/
│   │   ├── components/     # UI Components (VisionAudit, EconomicEngine, etc.)
│   │   └── App.tsx         # Main UI Layout
├── my_agent/               # The AI Core
│   ├── agent.py            # The "Brain"
│   ├── market_agent.py     # The "Hands"
│   ├── physics_engine.py   # The "Core"
│   └── spatial_state.py    # The "Memory"
├── data/
│   └── smart_home_standards.md # RAG Knowledge Base
├── tests/                  # Unit Tests
├── .env                    # Configuration
├── pyproject.toml          # Python Dependencies
└── README.md               # Documentation

⚡ Quick Start

Prerequisites

  • Python 3.12+

  • uv (modern Python package manager)

  • Google Gemini API Key

Installation

  1. Clone & Sync:

    git clone https://github.com/vero-code/spatial-engine.git
    cd spatial-engine
    uv sync
    
  2. Configure Environment:

    Create a .env file:

    # for backend
    GOOGLE_API_KEY=your_gemini_key_here
    
    # for frontend
    VITE_GEMINI_API_KEY=your_gemini_key_here
    
  3. Run the Agent:

    # Start the Backend
    uv run uvicorn backend.main:app --reload
    
    # Start the Frontend (in a new terminal)
    npm run dev --prefix frontend
    
  4. Run Tests:

    # Verify physics engine integrity
    uv run python -m unittest discover tests
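
For readers unfamiliar with unittest, a physics test might look like the sketch below. The function under test is a local stand-in written for this example, not an import from physics_engine.py, and the test names are hypothetical. The suite is run explicitly here so the snippet works as a standalone script; a file under tests/ would normally omit the last two lines and rely on `unittest discover`.

```python
import unittest


def illuminance_lux(intensity_cd: float, distance_m: float) -> float:
    """Local stand-in for the engine's inverse square law function."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return intensity_cd / distance_m ** 2


class IlluminanceTests(unittest.TestCase):
    def test_doubling_distance_quarters_illuminance(self):
        near = illuminance_lux(400.0, 1.0)
        far = illuminance_lux(400.0, 2.0)
        self.assertAlmostEqual(far, near / 4)

    def test_rejects_non_positive_distance(self):
        with self.assertRaises(ValueError):
            illuminance_lux(400.0, 0.0)


# Run the suite explicitly so the snippet also works outside test discovery.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(IlluminanceTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```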
    

🗺️ Roadmap Status

🚩 Sprint 1: The Core (Completed)

Status: Fully Operational. 100% Test Coverage.

  • Infrastructure: Environment setup (uv), Project structure, Basic ADK integration.
  • Physics Engine: Deterministic calculations for Illuminance ($E = I/d^2$) and Energy ROI.
  • Reliability: Pydantic typing for tools, unittest suite coverage, Chain of Thought logging.
  • Persona: Senior Optical Engineer system prompt configuration.

👁️ Sprint 2: The Vision (Completed)

Status: Implemented. Agent "sees" geometry and materials, "reads", and "remembers".

  • Multimodality: Binary File Handler for image uploads.
  • Visual Analysis: 3x3 Grid decomposition, Shadow Detection, Material/Albedo identification.
  • Spatial Reasoning: Scale estimation via Reference Object Inference (no user input needed).
  • Advanced Features: PDF Parser for blueprints, Persistent Spatial State class.

🛒 Sprint 3: The Market & Intelligence (Completed)

Status: Implemented. Connecting Physics to Economics, Standards & Safety.

  • Market Agent: Multi-threaded Google Search for products and electricity rates.
  • Search Verification: Agent verifies technical specs (e.g., is_dimmable, protocol) before recommending to ensure compatibility.
  • Health Checks: ISO/SanPiN compliance tool (Pass/Fail verdicts for Lux levels).
  • Smart Standards (RAG): Knowledge Base for Zigbee/Matter/Hue compatibility.
  • Config Generator: JSON output for Home Assistant scenes (Focus/Relax/Movie).
  • Robustness: Fallback Mode logic for offline operation.
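
The multi-threaded lookup and the spec verification step could be combined roughly as below. The `ProductSpec` fields mirror the `is_dimmable` and `protocol` examples named above; the catalog, the `fetch_spec` stub (which stands in for a real web search), and the product names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ProductSpec:
    name: str
    is_dimmable: bool
    protocol: str
    price_usd: float


# Made-up products standing in for real search results.
CATALOG = {
    "bulb-a": ProductSpec("Bulb A", True, "zigbee", 12.0),
    "bulb-b": ProductSpec("Bulb B", False, "zigbee", 6.5),
    "bulb-c": ProductSpec("Bulb C", True, "wifi", 9.0),
}


def fetch_spec(query: str) -> ProductSpec:
    """Stub for a network lookup; a real version would query the web."""
    return CATALOG[query]


def verified_candidates(
    queries: List[str],
    allowed_protocols: Iterable[str],
    need_dimmable: bool = True,
) -> List[ProductSpec]:
    """Fetch specs concurrently, then keep only products that pass the checks."""
    allowed = set(allowed_protocols)
    with ThreadPoolExecutor(max_workers=4) as pool:
        specs = list(pool.map(fetch_spec, queries))  # map preserves input order
    return [
        spec
        for spec in specs
        if spec.protocol in allowed and (spec.is_dimmable or not need_dimmable)
    ]


picks = verified_candidates(["bulb-a", "bulb-b", "bulb-c"], allowed_protocols={"zigbee", "matter"})
print([p.name for p in picks])
```

Threads suit this step because the real lookups would be I/O-bound; filtering happens only after all results are in, so a slow query cannot skip verification.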

🎨 Sprint 4: The Interface (Completed)

Status: Fully Operational. Generative UI and Reporting live.

  • Visualization: Heatmaps for Vision Audit and Physics Engine.
  • Reporting: HTML and PDF report generation.
  • Generative UI: Interactive React Frontend with Budget Slider and real-time updates.

🏆 Sprint 5: The Pitch (Completed)

Goal: Polish and Submission.

  • Gemini Live API: Real-time active reasoning.
  • Documentation: Architecture diagrams, Demo video script, Final submission text.

🔮 Future Roadmap

  • Optimization: Latency reduction, Error handling, End-to-End testing.
  • Hardware: Gemini 3 reasoning with Nano Banana Pro.
  • Synthesized Video: Gemini Live API to synthesize live video for real-time recommendations.
  • Voice Chat: Bi-directional voice recognition in the chat interface.

Built for the Gemini 3 Hackathon.
