# EdgeTune

A self-tuning, local-first AI video analytics runtime that adapts to your hardware in real time.
EdgeTune is a privacy-focused video analytics system that runs entirely on your local machine: no cloud, no API keys required. It pairs YOLOv8 object detection with a finite-state autopilot that continuously monitors your GPU/CPU and dynamically adjusts inference parameters (precision, resolution, frame skipping, model variant) to squeeze the best possible performance from your hardware.
An optional LLM analyst (via local Ollama or Google Gemini) explains every autopilot decision in plain language, so you always understand why the system made a specific optimisation.
## Features

| Feature | Description |
|---|---|
| Fully Local | Video never leaves your machine. All inference and analysis run on local hardware. |
| Self-Tuning Autopilot | A 4-state FSM (Stable → Soft → Balanced → Aggressive) monitors GPU utilisation, FPS drops, and VRAM pressure, then auto-tunes inference parameters with hysteresis and cooldown to prevent oscillation. |
| Dual-Brain Architecture | Fast Brain: YOLOv8 for real-time detection. Slow Brain: a local LLM for semantic explanations of system decisions. |
| Real-Time Dashboard | Live GPU, VRAM, FPS, and latency charts streamed over WebSockets. |
| Hot-Reconfigurable | Switch models (YOLOv8n/s/m), change autopilot mode (Speed / Balanced / Accuracy), or upload custom `.pt` models, all without restarting. |
| 3 Models Included | YOLOv8n (nano), YOLOv8s (small), and YOLOv8m (medium) are bundled out of the box; no separate downloads needed. |
| Flexible Input | Webcam feed or uploaded video files with full playback controls (pause, seek, speed). |
| Export & Reporting | Download a CSV report of hardware telemetry, autopilot decisions, and LLM explanations. |
| Hardware-Aware | Auto-detects NVIDIA GPUs via pynvml, reads VRAM and compute capability, and classifies hardware into performance tiers (Low / Mid / High / CPU-only). Falls back gracefully to CPU. |
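To make the hardware-aware behaviour concrete, here is a minimal sketch of GPU detection and tier classification with pynvml. The tier cut-offs and the returned fields are illustrative assumptions, not the actual logic in `hardware_profiler.py`:

```python
# Sketch of GPU detection and tier classification via pynvml.
# Tier thresholds below are hypothetical, not EdgeTune's real cut-offs.
import pynvml

def profile_hardware() -> dict:
    """Classify the machine into a rough performance tier."""
    try:
        pynvml.nvmlInit()
    except pynvml.NVMLError:
        return {"device": "cpu", "tier": "CPU-only"}   # graceful CPU fallback
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)
    vram_gb = pynvml.nvmlDeviceGetMemoryInfo(handle).total / 1024**3
    major, minor = pynvml.nvmlDeviceGetCudaComputeCapability(handle)
    tier = "High" if vram_gb >= 12 else "Mid" if vram_gb >= 6 else "Low"
    return {
        "device": "cuda:0",
        "name": str(pynvml.nvmlDeviceGetName(handle)),
        "vram_gb": round(vram_gb, 1),
        "compute_capability": f"{major}.{minor}",
        "tier": tier,
    }
```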
## Architecture

```
┌──────────────────────────────────────────────────────────────────────────┐
│ Frontend (Next.js 16 · React 19 · TypeScript · Tailwind CSS)             │
│                                                                          │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌───────────┐ ┌──────────────┐    │
│ │Video Feed│ │GPU Chart │ │FPS Graph │ │VRAM Chart │ │Autopilot Log │    │
│ └──────────┘ └──────────┘ └──────────┘ └───────────┘ └──────────────┘    │
│ ┌───────────────┐ ┌──────────────┐ ┌────────────┐ ┌────────────────┐     │
│ │Source Selector│ │Model Selector│ │LLM Feed    │ │Analysis Export │     │
│ └───────────────┘ └──────────────┘ └────────────┘ └────────────────┘     │
│                                                                          │
│                        useWebSocket (custom hook)                        │
│                                ▲ WebSocket                               │
└────────────────────────────────┼─────────────────────────────────────────┘
                                 │
┌────────────────────────────────┼─────────────────────────────────────────┐
│ Backend (Python · FastAPI · Uvicorn)                                     │
│                                │                                         │
│ ┌────────────────┐  ┌──────────┴────────┐  ┌───────────────────┐         │
│ │ REST API       │  │ WebSocket Handler │  │ Pipeline          │         │
│ │ /api/health    │  │ /ws               │  │ Orchestrator      │         │
│ │ /api/hardware  │  │ · telemetry       │  │                   │         │
│ │ /api/inference │  │ · decisions       │  │ VideoSource       │         │
│ │ /api/source    │  │ · llm_explanation │  │   │               │         │
│ │ /api/models    │  │ · video_frame     │  │ InferenceEngine   │         │
│ └────────────────┘  │ · source_progress │  │   │               │         │
│                     └───────────────────┘  │ TelemetryMonitor  │         │
│                                            │   │               │         │
│ ┌────────────────────┐                     │ Autopilot FSM     │         │
│ │ Hardware Profiler  │                     │   │               │         │
│ │ GPU/CPU detection  │                     │ LLM Analyst       │         │
│ │ VRAM / Tier / FP16 │                     │ (Ollama/Gemini)   │         │
│ └────────────────────┘                     └───────────────────┘         │
└──────────────────────────────────────────────────────────────────────────┘
```
The backend pipeline wires these components together:

- **VideoSource** captures frames from a webcam or an uploaded file.
- **InferenceEngine** runs YOLOv8 detection with the current parameter set.
- **TelemetryMonitor** samples GPU utilisation, VRAM, FPS, and latency at 500 ms intervals.
- **AutopilotController** evaluates each telemetry snapshot against mode-specific thresholds and transitions the FSM, applying parameter changes (precision, resolution, frame skip, model swap); see the sketch below.
- **LLMAnalyst** (optional) explains each state transition in 1–3 sentences via Ollama or Gemini.
- **WebSocket Handler** broadcasts annotated video frames, telemetry, decisions, and explanations to the dashboard.
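To make the control loop concrete, here is a hypothetical sketch of the escalate/de-escalate step with cooldown and hysteresis. The state names follow the FSM above, but the thresholds and parameter sets are illustrative, not EdgeTune's internals:

```python
# Hypothetical sketch of the autopilot's transition step; the real
# AutopilotController's thresholds and actions differ in detail.
import time

STATES = ["stable", "soft", "balanced", "aggressive"]
ACTIONS = {  # parameter set applied on entering each state (illustrative)
    "stable":     {"precision": "fp32", "imgsz": 640, "frame_skip": 0},
    "soft":       {"precision": "fp16", "imgsz": 640, "frame_skip": 0},
    "balanced":   {"precision": "fp16", "imgsz": 480, "frame_skip": 1},
    "aggressive": {"precision": "fp16", "imgsz": 320, "frame_skip": 2, "model": "yolov8n"},
}

class AutopilotSketch:
    def __init__(self, escalate_gpu=90.0, deescalate_gpu=70.0, cooldown_s=5.0):
        self.state_idx = 0
        self.escalate_gpu = escalate_gpu      # pressure threshold
        self.deescalate_gpu = deescalate_gpu  # lower release threshold
        self.cooldown_s = cooldown_s
        self.last_change = 0.0

    def step(self, gpu_util: float):
        """Return the new parameter set if a transition fires, else None."""
        if time.monotonic() - self.last_change < self.cooldown_s:
            return None  # cooldown: let the last change settle first
        if gpu_util >= self.escalate_gpu and self.state_idx < len(STATES) - 1:
            self.state_idx += 1   # escalate: trade quality for headroom
        elif gpu_util <= self.deescalate_gpu and self.state_idx > 0:
            self.state_idx -= 1   # recover: restore quality
        else:
            return None  # inside the hysteresis band: hold state
        self.last_change = time.monotonic()
        return ACTIONS[STATES[self.state_idx]]
```

Keeping the de-escalation threshold below the escalation threshold creates the hysteresis band that, together with the cooldown, prevents the oscillation mentioned in the feature table.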
## Project Structure

```
EdgeTune/
├── backend/
│   ├── main.py                       # FastAPI entrypoint & pipeline orchestrator
│   ├── requirements.txt
│   ├── config/
│   │   └── settings.yaml             # All runtime configuration
│   ├── core/
│   │   ├── inference_engine.py       # YOLO wrapper with hot-reconfiguration
│   │   ├── autopilot_controller.py   # 4-state FSM optimisation engine
│   │   ├── telemetry_monitor.py      # GPU/CPU/FPS sampling
│   │   ├── hardware_profiler.py      # GPU detection & tier classification
│   │   └── video_source.py           # Camera & file input with playback
│   ├── llm/
│   │   ├── analyst.py                # LLM decision explainer (Ollama/Gemini)
│   │   └── discovery.py              # Auto-detect available Ollama models
│   └── api/
│       ├── routes.py                 # REST endpoints
│       └── websocket.py              # WebSocket manager & broadcast helpers
│
├── frontend/
│   ├── package.json
│   └── src/
│       ├── app/
│       │   ├── layout.tsx
│       │   └── page.tsx              # Main dashboard page
│       ├── components/
│       │   ├── video-feed.tsx        # Live annotated video stream
│       │   ├── gpu-chart.tsx         # GPU utilisation chart
│       │   ├── vram-chart.tsx        # VRAM usage chart
│       │   ├── fps-graph.tsx         # FPS over time graph
│       │   ├── autopilot-timeline.tsx # Autopilot decision log
│       │   ├── llm-feed.tsx          # LLM explanation feed
│       │   ├── source-selector.tsx   # Webcam / file input picker
│       │   ├── model-selector.tsx    # YOLO model switcher + upload
│       │   ├── mode-selector.tsx     # Speed / Balanced / Accuracy toggle
│       │   ├── playback-controls.tsx # Video seek, pause, speed
│       │   ├── hardware-info.tsx     # GPU/CPU hardware card
│       │   ├── analysis-export.tsx   # CSV download
│       │   └── connection-status.tsx # WebSocket status indicator
│       ├── hooks/
│       │   └── useWebSocket.ts       # WebSocket client with auto-reconnect
│       └── lib/
│           ├── api.ts                # REST API client
│           └── types.ts              # Shared TypeScript interfaces
│
├── .gitignore
└── README.md
```
## Requirements

| Requirement | Notes |
|---|---|
| Python 3.10+ | Backend runtime |
| Node.js 18+ | Frontend build |
| NVIDIA GPU | Recommended; falls back to CPU automatically |
| Ollama | Optional; only needed for local LLM explanations |
## Quick Start

### Backend

```bash
cd backend

# Create and activate a virtual environment
python -m venv .venv
# Windows
.venv\Scripts\activate
# macOS/Linux
source .venv/bin/activate

# Install dependencies (includes PyTorch with CUDA via Ultralytics)
pip install -r requirements.txt

# Start the API server
python main.py
```

The API will be available at http://localhost:8000; interactive API docs are served at `/docs`.
### Frontend

```bash
cd frontend
npm install
npm run dev
```

Open http://localhost:3000 in your browser.
### Optional: LLM Explanations

If you want AI-powered explanations of autopilot decisions:

```bash
# Install Ollama from https://ollama.com
ollama pull phi3:mini   # lightweight, fast
# or: ollama pull llama3 / mistral
```

EdgeTune auto-discovers available Ollama models at startup. You can also configure Gemini in `settings.yaml` by setting a `GEMINI_API_KEY`.
## Usage

1. Open the dashboard at http://localhost:3000.
2. Pick a video source: webcam or an uploaded video file.
3. (Optional) Select a YOLO model or upload a custom `.pt` file.
4. Choose an autopilot mode: Speed, Balanced, or Accuracy.
5. Click **Start Inference**.
6. Watch the dashboard in real time:
   - **Video Feed**: annotated detections overlaid on the live stream.
   - **Performance Cards**: GPU %, FPS, VRAM, and latency at a glance.
   - **Charts**: GPU utilisation, VRAM, and FPS history graphs.
   - **Autopilot Timeline**: every FSM transition with its reason and applied parameters.
   - **LLM Insights**: plain-language explanations of optimisation decisions.
7. When finished, click **Export Analysis** to download a CSV report.
The main dashboard shows the live video stream with YOLOv8 detection overlays and per-object confidence scores, plus real-time performance metrics in a three-panel layout: autopilot decisions on the left, advisor explanations in the center, and analysis data on the right.

Real-time charts track GPU utilisation, VRAM usage, latency, and FPS, with interactive history graphs and detailed telemetry shown alongside autopilot state transitions and LLM-powered explanations of optimisation decisions.

In Accuracy mode, EdgeTune maximises detection coverage, identifying multiple object classes simultaneously across the full frame while maintaining high precision on complex street scenes.
## Configuration

All settings live in `backend/config/settings.yaml`. Key options:
| Section | Key | Default | Description |
|---|---|---|---|
| `source` | `type` | `camera` | `camera` or `file` |
| `source` | `processing_mode` | `paced` | `paced` (real-time) or `benchmark` (max speed) |
| `inference` | `model_variant` | `yolov8n` | `yolov8n`, `yolov8s`, or `yolov8m` |
| `inference` | `device` | `auto` | `auto`, `cuda:0`, or `cpu` |
| `inference` | `backend` | `pytorch` | `pytorch`, `onnx`, or `tensorrt` |
| `autopilot` | `mode` | `balanced` | `speed`, `balanced`, or `accuracy` |
| `autopilot` | `escalate_gpu_threshold` | `90` | GPU % that triggers escalation |
| `autopilot` | `cooldown_seconds` | `5.0` | Minimum seconds between state changes |
| `llm` | `provider` | `ollama` | `ollama` or `gemini` |
| `llm` | `enabled` | `true` | Toggle LLM explanations on/off |
| `server` | `port` | `8000` | Backend API port |
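For reference, a `settings.yaml` built only from the documented keys might look like this; the exact nesting is an assumption based on the Section/Key pairs above:

```yaml
# Example settings.yaml; nesting inferred from the Section/Key pairs above.
source:
  type: camera              # camera | file
  processing_mode: paced    # paced (real-time) | benchmark (max speed)

inference:
  model_variant: yolov8n    # yolov8n | yolov8s | yolov8m
  device: auto              # auto | cuda:0 | cpu
  backend: pytorch          # pytorch | onnx | tensorrt

autopilot:
  mode: balanced            # speed | balanced | accuracy
  escalate_gpu_threshold: 90
  cooldown_seconds: 5.0

llm:
  provider: ollama          # ollama | gemini
  enabled: true

server:
  port: 8000
```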
## REST API

| Method | Endpoint | Description |
|---|---|---|
| `GET` | `/api/health` | System health check (GPU, inference, LLM status) |
| `GET` | `/api/hardware` | Detected hardware profile |
| `GET` | `/api/telemetry` | Latest telemetry snapshot |
| `GET` | `/api/telemetry/history?n=60` | Rolling telemetry history |
| `GET` | `/api/autopilot` | Current autopilot state and mode |
| `PUT` | `/api/autopilot/mode` | Change autopilot mode |
| `POST` | `/api/inference/start` | Start the inference pipeline |
| `POST` | `/api/inference/stop` | Stop the inference pipeline |
| `POST` | `/api/source/upload` | Upload a video file |
| `GET` | `/api/source/files` | List available source files |
| `GET` | `/api/source/info` | Current source metadata |
| `POST` | `/api/source/playback` | Playback control (pause, seek, speed) |
| `GET` | `/api/models` | List available YOLO models |
| `POST` | `/api/models/upload` | Upload a custom `.pt` model |
| `POST` | `/api/models/switch` | Hot-swap the active model |
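As a quick smoke test, the control plane can be driven from Python. The request payload fields below are assumptions; the authoritative schemas live in the interactive docs at `/docs`:

```python
# Minimal control-plane smoke test (uses the third-party requests package).
import requests  # pip install requests

BASE = "http://localhost:8000"

print(requests.get(f"{BASE}/api/health").json())    # GPU / inference / LLM status
print(requests.get(f"{BASE}/api/hardware").json())  # detected hardware profile

# The field name "mode" is an assumption; check /docs for the real schema.
requests.put(f"{BASE}/api/autopilot/mode", json={"mode": "speed"})
requests.post(f"{BASE}/api/inference/start")

history = requests.get(f"{BASE}/api/telemetry/history", params={"n": 60}).json()
print(f"received {len(history)} telemetry samples")  # assumes a JSON list

requests.post(f"{BASE}/api/inference/stop")
```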
## WebSocket Messages

The single WebSocket connection streams these message types:
| Type | Payload | Direction |
|---|---|---|
| `telemetry` | GPU %, FPS, VRAM, latency, CPU % | Server → Client |
| `autopilot_decision` | State transition, action, reason, params | Server → Client |
| `llm_explanation` | Plain-text explanation of a decision | Server → Client |
| `video_frame` | Base64-encoded JPEG frame | Server → Client |
| `source_progress` | File progress, frame number, paused state | Server → Client |
| `status` | Toast notifications (errors, info) | Server → Client |
| `ping` / `pong` | Keep-alive heartbeat | Both |
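A minimal listener, assuming each message is a JSON object whose `type` field matches the table above (uses the third-party `websockets` package):

```python
# Subscribe to the dashboard stream and print non-frame messages.
import asyncio
import json

import websockets  # pip install websockets

async def listen() -> None:
    async with websockets.connect("ws://localhost:8000/ws") as ws:
        async for raw in ws:
            msg = json.loads(raw)
            kind = msg.get("type")
            if kind == "video_frame":
                continue  # base64 JPEG payloads are large; skip printing
            if kind in ("telemetry", "autopilot_decision", "llm_explanation"):
                print(kind, msg)

asyncio.run(listen())
```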
## Tech Stack

| Layer | Technology |
|---|---|
| Backend | Python 3.10+, FastAPI, Uvicorn, Ultralytics YOLOv8, PyTorch, OpenCV |
| Frontend | Next.js 16, React 19, TypeScript, Tailwind CSS 4 |
| Communication | WebSocket (real-time) + REST (control plane) |
| GPU Monitoring | pynvml, psutil |
| LLM Integration | Ollama (local) · Google Gemini (optional cloud) |
| Configuration | YAML (settings.yaml) |
## Contributing

Contributions are welcome! Please fork the repository and submit a pull request.
## License

MIT License; see `LICENSE` for details.


