# EdgeTune

A self-tuning, local-first AI video analytics runtime that adapts to your hardware in real time.



## What is EdgeTune?

EdgeTune is a privacy-focused video analytics system that runs entirely on your local machine: no cloud, no API keys required. It pairs YOLOv8 object detection with a finite-state autopilot that continuously monitors your GPU/CPU and dynamically adjusts inference parameters (precision, resolution, frame skipping, model variant) to squeeze the best possible performance from your hardware.

An optional LLM analyst (via local Ollama or Google Gemini) explains every autopilot decision in plain language, so you always understand why the system made a specific optimization.


## ✨ Key Features

| Feature | Description |
|---|---|
| 🔒 Fully Local | Video never leaves your machine. All inference and analysis run on local hardware. |
| ⚙️ Self-Tuning Autopilot | A 4-state FSM (Stable → Soft → Balanced → Aggressive) monitors GPU utilisation, FPS drops, and VRAM pressure, then auto-tunes inference parameters with hysteresis and cooldown to prevent oscillation. |
| 🧠 Dual-Brain Architecture | Fast Brain: YOLOv8 for real-time detection. Slow Brain: local LLM for semantic explanations of system decisions. |
| 📊 Real-Time Dashboard | Live GPU, VRAM, FPS, and latency charts streamed over a WebSocket. |
| 🎛️ Hot-Reconfigurable | Switch models (YOLOv8n/s/m), change autopilot mode (Speed / Balanced / Accuracy), or upload custom `.pt` models, all without restarting. |
| 📦 3 Models Included | YOLOv8n (nano), YOLOv8s (small), and YOLOv8m (medium) are bundled out of the box; no separate downloads needed. |
| 🎥 Flexible Input | Webcam feed or uploaded video files with full playback controls (pause, seek, speed). |
| 📥 Export & Reporting | Download a CSV report of hardware telemetry, autopilot decisions, and LLM explanations. |
| 🖥️ Hardware-Aware | Auto-detects NVIDIA GPUs via pynvml, reads VRAM and compute capability, and classifies hardware into performance tiers (Low / Mid / High / CPU-only). Falls back gracefully to CPU. |
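
The hardware-aware profiling in the last row can be pictured with a short sketch. This is an illustrative approximation, assuming pynvml is installed; the function name and tier cutoffs are hypothetical, not the actual `hardware_profiler.py` logic:

```python
# Illustrative GPU-tier detection in the spirit of hardware_profiler.py.
# The 6 GB / 12 GB cutoffs below are assumptions, not EdgeTune's real values.
import pynvml

def detect_tier():
    try:
        pynvml.nvmlInit()
        handle = pynvml.nvmlDeviceGetHandleByIndex(0)
        name = pynvml.nvmlDeviceGetName(handle)
        if isinstance(name, bytes):        # older pynvml versions return bytes
            name = name.decode()
        vram_gb = pynvml.nvmlDeviceGetMemoryInfo(handle).total / 1024**3
    except pynvml.NVMLError:
        return "CPU-only", None            # no NVIDIA GPU: graceful CPU fallback
    if vram_gb >= 12:
        return "High", name
    if vram_gb >= 6:
        return "Mid", name
    return "Low", name
```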

πŸ—οΈ Architecture

```
┌──────────────────────────────────────────────────────────────────────────┐
│  Frontend  (Next.js 16 · React 19 · TypeScript · Tailwind CSS)           │
│                                                                          │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌───────────┐ ┌──────────────┐   │
│  │Video Feed│ │GPU Chart │ │FPS Graph │ │VRAM Chart │ │Autopilot Log │   │
│  └──────────┘ └──────────┘ └──────────┘ └───────────┘ └──────────────┘   │
│  ┌───────────────┐ ┌──────────────┐ ┌────────────┐ ┌────────────────┐    │
│  │Source Selector│ │Model Selector│ │LLM Feed    │ │Analysis Export │    │
│  └───────────────┘ └──────────────┘ └────────────┘ └────────────────┘    │
│                                                                          │
│                        useWebSocket (custom hook)                        │
│                              ▲  WebSocket                                │
└──────────────────────────────┼───────────────────────────────────────────┘
                               │
┌──────────────────────────────┼───────────────────────────────────────────┐
│  Backend  (Python · FastAPI · Uvicorn)                                   │
│                              │                                           │
│  ┌────────────────┐    ┌─────┴──────────────┐    ┌───────────────────┐   │
│  │ REST API       │    │ WebSocket Handler  │    │ Pipeline          │   │
│  │ /api/health    │    │ /ws                │    │ Orchestrator      │   │
│  │ /api/hardware  │    │ · telemetry        │    │                   │   │
│  │ /api/inference │    │ · decisions        │    │  VideoSource      │   │
│  │ /api/source    │    │ · llm_explanation  │    │       ↓           │   │
│  │ /api/models    │    │ · video_frame      │    │  InferenceEngine  │   │
│  └────────────────┘    │ · source_progress  │    │       ↓           │   │
│                        └────────────────────┘    │  TelemetryMonitor │   │
│                                                  │       ↓           │   │
│  ┌─────────────────────┐                         │  Autopilot FSM    │   │
│  │ Hardware Profiler   │                         │       ↓           │   │
│  │ GPU/CPU detection   │                         │  LLM Analyst      │   │
│  │ VRAM / Tier / FP16  │                         │  (Ollama/Gemini)  │   │
│  └─────────────────────┘                         └───────────────────┘   │
└──────────────────────────────────────────────────────────────────────────┘
```

### Data Flow

1. **VideoSource** captures frames from a webcam or uploaded file.
2. **InferenceEngine** runs YOLOv8 detection with the current parameter set.
3. **TelemetryMonitor** samples GPU utilisation, VRAM, FPS, and latency at 500 ms intervals.
4. **AutopilotController** evaluates the telemetry snapshot against mode-specific thresholds and transitions the FSM, applying parameter changes (precision, resolution, frame skip, model swap); see the sketch after this list.
5. **LLMAnalyst** (optional) explains each state transition in 1–3 sentences via Ollama or Gemini.
6. **WebSocket Handler** broadcasts annotated video frames, telemetry, decisions, and explanations to the dashboard.
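
To make step 4 concrete, here is a minimal sketch of an escalation FSM with hysteresis and cooldown. All names and thresholds are illustrative assumptions, not EdgeTune's actual `autopilot_controller.py`:

```python
# Hypothetical autopilot FSM: escalate under load, relax only once well below
# the escalation threshold (hysteresis), and never change state during cooldown.
import time
from dataclasses import dataclass

STATES = ["stable", "soft", "balanced", "aggressive"]

@dataclass
class Telemetry:
    gpu_util: float        # percent
    fps: float
    vram_used_frac: float  # 0.0 - 1.0

class AutopilotFSM:
    def __init__(self, escalate_gpu=90.0, deescalate_gpu=70.0,
                 target_fps=24.0, cooldown_s=5.0):
        self.state = 0                        # index into STATES
        self.escalate_gpu = escalate_gpu      # above this: escalate
        self.deescalate_gpu = deescalate_gpu  # hysteresis: relax only below this
        self.target_fps = target_fps
        self.cooldown_s = cooldown_s
        self.last_change = 0.0

    def evaluate(self, t: Telemetry):
        now = time.monotonic()
        if now - self.last_change < self.cooldown_s:
            return None                       # cooldown suppresses oscillation
        under_pressure = (
            t.gpu_util > self.escalate_gpu
            or t.fps < self.target_fps
            or t.vram_used_frac > 0.9
        )
        relaxed = t.gpu_util < self.deescalate_gpu and t.fps >= self.target_fps
        if under_pressure and self.state < len(STATES) - 1:
            self.state += 1
        elif relaxed and self.state > 0:
            self.state -= 1
        else:
            return None
        self.last_change = now
        return STATES[self.state]             # caller applies the matching params
```

The two-threshold band (escalate above 90 %, relax only below 70 %) combined with the cooldown timer is what keeps a controller like this from flip-flopping between adjacent states.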

## 📂 Project Structure

```
EdgeTune/
├── backend/
│   ├── main.py                     # FastAPI entrypoint & pipeline orchestrator
│   ├── requirements.txt
│   ├── config/
│   │   └── settings.yaml           # All runtime configuration
│   ├── core/
│   │   ├── inference_engine.py     # YOLO wrapper with hot-reconfiguration
│   │   ├── autopilot_controller.py # 4-state FSM optimisation engine
│   │   ├── telemetry_monitor.py    # GPU/CPU/FPS sampling
│   │   ├── hardware_profiler.py    # GPU detection & tier classification
│   │   └── video_source.py         # Camera & file input with playback
│   ├── llm/
│   │   ├── analyst.py              # LLM decision explainer (Ollama/Gemini)
│   │   └── discovery.py            # Auto-detect available Ollama models
│   └── api/
│       ├── routes.py               # REST endpoints
│       └── websocket.py            # WebSocket manager & broadcast helpers
│
├── frontend/
│   ├── package.json
│   └── src/
│       ├── app/
│       │   ├── layout.tsx
│       │   └── page.tsx            # Main dashboard page
│       ├── components/
│       │   ├── video-feed.tsx      # Live annotated video stream
│       │   ├── gpu-chart.tsx       # GPU utilisation chart
│       │   ├── vram-chart.tsx      # VRAM usage chart
│       │   ├── fps-graph.tsx       # FPS over time graph
│       │   ├── autopilot-timeline.tsx  # Autopilot decision log
│       │   ├── llm-feed.tsx        # LLM explanation feed
│       │   ├── source-selector.tsx # Webcam / file input picker
│       │   ├── model-selector.tsx  # YOLO model switcher + upload
│       │   ├── mode-selector.tsx   # Speed / Balanced / Accuracy toggle
│       │   ├── playback-controls.tsx   # Video seek, pause, speed
│       │   ├── hardware-info.tsx   # GPU/CPU hardware card
│       │   ├── analysis-export.tsx # CSV download
│       │   └── connection-status.tsx   # WebSocket status indicator
│       ├── hooks/
│       │   └── useWebSocket.ts     # WebSocket client with auto-reconnect
│       └── lib/
│           ├── api.ts              # REST API client
│           └── types.ts            # Shared TypeScript interfaces
│
├── .gitignore
└── README.md
```

## Getting Started

### Prerequisites

| Requirement | Notes |
|---|---|
| Python 3.10+ | Backend runtime |
| Node.js 18+ | Frontend build |
| NVIDIA GPU | Recommended; falls back to CPU automatically |
| Ollama | Optional; only needed for local LLM explanations |

### 1 · Backend

```bash
cd backend

# Create and activate a virtual environment
python -m venv .venv
# Windows
.venv\Scripts\activate
# macOS/Linux
source .venv/bin/activate

# Install dependencies (includes PyTorch with CUDA via Ultralytics)
pip install -r requirements.txt

# Start the API server
python main.py
```

The API will be available at `http://localhost:8000`; interactive API docs are served at `/docs`.

### 2 · Frontend

```bash
cd frontend

npm install
npm run dev
```

Open `http://localhost:3000` in your browser.

### 3 · LLM (Optional)

If you want AI-powered explanations of autopilot decisions:

```bash
# Install Ollama from https://ollama.com
ollama pull phi3:mini     # lightweight, fast
# or: ollama pull llama3 / mistral
```

EdgeTune auto-discovers available Ollama models at startup. You can also configure Gemini in `settings.yaml` by setting a `GEMINI_API_KEY`.
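
To verify the LLM path before starting the backend, you can hit Ollama's model-listing endpoint (`/api/tags`) directly. This is a convenience sketch, not part of EdgeTune itself:

```python
# Check that a local Ollama server is reachable and list its pulled models.
import requests

try:
    resp = requests.get("http://localhost:11434/api/tags", timeout=2).json()
    print("Ollama models:", [m["name"] for m in resp.get("models", [])])
except requests.RequestException:
    print("Ollama not reachable; LLM explanations will be unavailable.")
```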


## 🖥️ Usage

1. Open the dashboard at `http://localhost:3000`.
2. Pick a video source: webcam or an uploaded video file.
3. (Optional) Select a YOLO model or upload a custom `.pt` file.
4. Choose an autopilot mode: Speed, Balanced, or Accuracy.
5. Click **Start Inference**.
6. Watch the dashboard in real time:
   - **Video Feed**: annotated detections overlaid on the live stream.
   - **Performance Cards**: GPU %, FPS, VRAM, and latency at a glance.
   - **Charts**: GPU utilisation, VRAM, and FPS history graphs.
   - **Autopilot Timeline**: every FSM transition with its reason and applied parameters.
   - **LLM Insights**: plain-language explanations of optimisation decisions.
7. When finished, click **Export Analysis** to download a CSV.

## 📸 Screenshots

### Dashboard with Real-Time Detection

The main dashboard displays the live video stream with YOLOv8 object detection overlays, showing confidence scores for detected objects. The interface features real-time performance metrics and a three-panel layout: autopilot decisions on the left, advisor explanations in the center, and analysis data on the right.

*(Screenshot: EdgeTune dashboard, live detection feed)*

### Performance Monitoring & Analytics

Real-time performance charts track GPU utilisation, VRAM usage, latency, and FPS. The system displays historical trends with interactive graphs and provides detailed telemetry alongside autopilot state transitions and LLM-powered explanations of optimisation decisions.

*(Screenshot: EdgeTune performance metrics)*

### Accuracy Mode with Dense Detection

In Accuracy mode, EdgeTune maximises detection coverage, identifying multiple object classes simultaneously across the entire frame while maintaining high precision on complex street scenes.

*(Screenshot: EdgeTune Accuracy mode, comprehensive detection)*


βš™οΈ Configuration

All settings live in `backend/config/settings.yaml`.

**Key configuration options**

| Section | Key | Default | Description |
|---|---|---|---|
| `source` | `type` | `camera` | `camera` or `file` |
| `source` | `processing_mode` | `paced` | `paced` (real-time) or `benchmark` (max speed) |
| `inference` | `model_variant` | `yolov8n` | `yolov8n`, `yolov8s`, or `yolov8m` |
| `inference` | `device` | `auto` | `auto`, `cuda:0`, or `cpu` |
| `inference` | `backend` | `pytorch` | `pytorch`, `onnx`, or `tensorrt` |
| `autopilot` | `mode` | `balanced` | `speed`, `balanced`, or `accuracy` |
| `autopilot` | `escalate_gpu_threshold` | `90` | GPU % that triggers escalation |
| `autopilot` | `cooldown_seconds` | `5.0` | Minimum seconds between state changes |
| `llm` | `provider` | `ollama` | `ollama` or `gemini` |
| `llm` | `enabled` | `true` | Toggle LLM explanations on/off |
| `server` | `port` | `8000` | Backend API port |
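
For orientation, a minimal `settings.yaml` matching the defaults above might look like this. The nesting is inferred from the Section/Key columns; the shipped file may group keys differently:

```yaml
source:
  type: camera             # camera | file
  processing_mode: paced   # paced | benchmark
inference:
  model_variant: yolov8n   # yolov8n | yolov8s | yolov8m
  device: auto             # auto | cuda:0 | cpu
  backend: pytorch         # pytorch | onnx | tensorrt
autopilot:
  mode: balanced           # speed | balanced | accuracy
  escalate_gpu_threshold: 90
  cooldown_seconds: 5.0
llm:
  provider: ollama         # ollama | gemini
  enabled: true
server:
  port: 8000
```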

## API Reference

### REST Endpoints

| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/health` | System health check (GPU, inference, LLM status) |
| GET | `/api/hardware` | Detected hardware profile |
| GET | `/api/telemetry` | Latest telemetry snapshot |
| GET | `/api/telemetry/history?n=60` | Rolling telemetry history |
| GET | `/api/autopilot` | Current autopilot state and mode |
| PUT | `/api/autopilot/mode` | Change autopilot mode |
| POST | `/api/inference/start` | Start the inference pipeline |
| POST | `/api/inference/stop` | Stop the inference pipeline |
| POST | `/api/source/upload` | Upload a video file |
| GET | `/api/source/files` | List available source files |
| GET | `/api/source/info` | Current source metadata |
| POST | `/api/source/playback` | Playback control (pause, seek, speed) |
| GET | `/api/models` | List available YOLO models |
| POST | `/api/models/upload` | Upload a custom `.pt` model |
| POST | `/api/models/switch` | Hot-swap the active model |
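
A minimal control-plane session using these endpoints might look like the following Python sketch. Response shapes are assumptions; `/docs` has the authoritative schemas:

```python
# Drive the pipeline over REST: check health, start inference, sample
# telemetry history, then stop. Requires the backend on localhost:8000.
import requests

BASE = "http://localhost:8000"

print(requests.get(f"{BASE}/api/health").json())    # GPU / inference / LLM status
print(requests.get(f"{BASE}/api/hardware").json())  # detected hardware profile

requests.post(f"{BASE}/api/inference/start")        # start the pipeline
history = requests.get(f"{BASE}/api/telemetry/history", params={"n": 60}).json()
print(f"collected {len(history)} telemetry samples")
requests.post(f"{BASE}/api/inference/stop")         # stop the pipeline
```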

### WebSocket (`/ws`)

The single WebSocket connection streams these message types:

| Type | Payload | Direction |
|---|---|---|
| `telemetry` | GPU %, FPS, VRAM, latency, CPU % | Server → Client |
| `autopilot_decision` | State transition, action, reason, params | Server → Client |
| `llm_explanation` | Plain-text explanation of a decision | Server → Client |
| `video_frame` | Base64-encoded JPEG frame | Server → Client |
| `source_progress` | File progress, frame number, paused state | Server → Client |
| `status` | Toast notifications (errors, info) | Server → Client |
| `ping` / `pong` | Keep-alive heartbeat | Both |
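
For a headless consumer, here is a sketch using the `websockets` package. The message envelope (`{"type": ..., ...}`) and field names are assumptions based on the table above; only the `type` values are documented:

```python
# Subscribe to the /ws stream and print telemetry and autopilot decisions.
# Field names like "gpu_util" and "reason" are illustrative guesses.
import asyncio
import json

import websockets

async def listen():
    async with websockets.connect("ws://localhost:8000/ws") as ws:
        async for raw in ws:
            msg = json.loads(raw)
            if msg.get("type") == "telemetry":
                print("GPU:", msg.get("gpu_util"), "FPS:", msg.get("fps"))
            elif msg.get("type") == "autopilot_decision":
                print("Autopilot:", msg.get("reason"))

asyncio.run(listen())
```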

## 🧩 Tech Stack

| Layer | Technology |
|---|---|
| Backend | Python 3.10+, FastAPI, Uvicorn, Ultralytics YOLOv8, PyTorch, OpenCV |
| Frontend | Next.js 16, React 19, TypeScript, Tailwind CSS 4 |
| Communication | WebSocket (real-time) + REST (control plane) |
| GPU Monitoring | pynvml, psutil |
| LLM Integration | Ollama (local) · Google Gemini (optional cloud) |
| Configuration | YAML (`settings.yaml`) |

## 🤝 Contributing

Contributions are welcome! Please fork the repository and submit a pull request.

## 📄 License

MIT License; see `LICENSE` for details.
