Satellite Structure Finder 🛰️

title	Satellite Structure Finder
emoji	🛰️
colorFrom	blue
colorTo	green
sdk	docker
app_port	7860

🚀 Try the Live Demo on Hugging Face Spaces

Interactive structure detection in satellite imagery using DINOv3 vision transformers and similarity search. Click on examples of the structures you want to find, and the system highlights similar regions across the entire map by comparing DINOv3 patch embeddings.

Features

🎯 Point-and-click similarity search: Mark positive/negative examples directly on the map
🗺️ Multi-city support: Vienna and Graz satellite imagery included
🔍 Deep zoom viewer: Explore high-resolution satellite imagery with OpenSeadragon
🧠 DINOv3 vision transformer: State-of-the-art self-supervised feature extraction
⚡ Pre-computed embeddings: Instant search results (no runtime extraction delay)
📊 Interactive heatmap overlay: Visualize similarity scores across the map
🎨 DINO grid overlay: See the 16×16 pixel patch boundaries used by the model
📷 Download functionality: Export your current view with overlays

Demo

Finding Cars in Vienna


1. Select Examples: Click on cars you want to find
2. View DINO Grid: See the 16×16px patches
3. Find Similar: Heatmap highlights matches

Finding Pools in Graz


1. Select Examples: Click on swimming pools
2. Find Similar: Algorithm finds all pools

How It Works

Architecture

Satellite Image → DINOv3 Backbone → Dense Embeddings → Similarity Search → Heatmap Overlay
                   (frozen)          (16×16 patches)     (cosine sim)      (Gradio UI)

Algorithm

Pre-computation: Satellite images are divided into 16×16 pixel patches and encoded using DINOv3 to produce a dense grid of 384-dimensional embeddings
User interaction: User clicks on structures they want to find (positive examples) and structures to avoid (negative examples)
Query construction: The embeddings of the positive patches are averaged to create a query vector; if negative examples are given, their mean embedding is subtracted with a weight of 0.5
Similarity search: Cosine similarity is computed between the query and all patch embeddings
Visualization: Similarity scores are displayed as a red heatmap overlay on the map
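
The query construction and similarity search steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the project's actual implementation: the function name `similarity_map`, its parameters, and the use of (row, col) grid cells as input are all hypothetical, and the toy array stands in for real DINOv3 embeddings.

```python
import numpy as np

def similarity_map(embeddings, pos_cells, neg_cells=(), neg_weight=0.5):
    """Return an (H, W) map of cosine similarities to the query vector.

    embeddings: (H, W, D) array of patch embeddings.
    pos_cells / neg_cells: lists of (row, col) grid cells picked by the user.
    """
    h, w, d = embeddings.shape
    flat = embeddings.reshape(-1, d).astype(np.float32)

    # Average the positive examples into a single query vector.
    query = np.mean([embeddings[r, c] for r, c in pos_cells], axis=0)
    # Subtract the weighted mean of the negative examples, if any.
    if len(neg_cells) > 0:
        neg_mean = np.mean([embeddings[r, c] for r, c in neg_cells], axis=0)
        query = query - neg_weight * neg_mean

    # Cosine similarity is the dot product of L2-normalised vectors.
    eps = 1e-8  # guards against division by zero
    flat_unit = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + eps)
    query_unit = query / (np.linalg.norm(query) + eps)
    return (flat_unit @ query_unit).reshape(h, w)

# Toy 4x5 grid of 8-dimensional embeddings (random stand-in data).
rng = np.random.default_rng(0)
toy = rng.normal(size=(4, 5, 8)).astype(np.float32)
sim = similarity_map(toy, pos_cells=[(1, 2)], neg_cells=[(3, 0)])
print(sim.shape)
```

With a single positive example and no negatives, the selected cell scores approximately 1 because it is compared with itself.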

DINOv3: Self-Supervised Vision Transformer

This project uses DINOv3 (Siméoni et al., 2025), a state-of-the-art self-supervised vision transformer from Meta AI Research. Specifically, we use the satellite variant vit_large_patch16_dinov3.sat493m, which was trained on the SAT-493M dataset of roughly 493 million satellite images.

Key advantages for satellite imagery:

Self-supervised learning: Trained without labels, learns general visual features
Dense features: Produces embeddings for every 16×16 pixel patch (not just image-level)
Satellite adaptation: Trained on aerial/satellite imagery for domain-specific features
Strong performance: Excellent at capturing structural patterns (buildings, roads, pools, etc.)

Model details:

Architecture: Vision Transformer (ViT-L/16)
Input resolution: 224×224 pixels
Patch size: 16×16 pixels
Embedding dimension: 1024 → 384 (after projection)
Parameters: ~304M (backbone)
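
The model details list a 1024 → 384 projection but do not say how it is done. One common way to reduce feature dimensionality is PCA, sketched below purely as an assumption; the project may use a different method. The helper names `fit_pca` and `apply_pca` are hypothetical, and random data stands in for real backbone features.

```python
import numpy as np

def fit_pca(features, out_dim):
    """Fit a PCA projection on (N, D) features; returns (mean, components)."""
    mean = features.mean(axis=0)
    # Rows of vt are the principal directions, sorted by explained variance.
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:out_dim]

def apply_pca(features, mean, components):
    """Project (N, D) features down to (N, out_dim)."""
    return (features - mean) @ components.T

# Random stand-in for 1024-dimensional ViT-L patch features (assumption).
rng = np.random.default_rng(0)
backbone_features = rng.normal(size=(500, 1024)).astype(np.float32)

mean, components = fit_pca(backbone_features, out_dim=384)
reduced = apply_pca(backbone_features, mean, components)
print(reduced.shape)
```

Fitting the projection once and reusing `mean` and `components` for every image keeps embeddings from different tiles in the same 384-dimensional space.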

Data Sources

Vienna Orthofoto 2024

Source: Stadt Wien Open Government Data
License: CC BY 4.0
Attribution: "Datenquelle: Stadt Wien – data.wien.gv.at"
Resolution: ~15cm native, ~2.5m/pixel at our zoom level
Coverage: Central Vienna area (5888×7168 pixels)
WMTS URL: https://mapsneu.wien.gv.at/wmtsneu/1.0.0/WMTSCapabilities.xml

Graz Orthofoto 2024

Source: Stadt Graz Open Government Data
License: CC BY 4.0
Attribution: "Datenquelle: Stadt Graz – data.graz.gv.at"
Resolution: ~15cm native, ~2.5m/pixel at our zoom level
Coverage: Central Graz area (6656×6912 pixels)

Both datasets are publicly available orthophotos (aerial photographs corrected for topographic relief and camera tilt).
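
Given the 16×16 pixel patch size, the image dimensions listed above determine the size of each embedding grid. The short sketch below works through that arithmetic; the helper name `patch_grid` is illustrative only.

```python
PATCH_SIZE = 16  # pixels per patch side, as stated in this README

def patch_grid(width_px, height_px, patch=PATCH_SIZE):
    """Return (grid_w, grid_h): the number of whole patches along each axis."""
    return width_px // patch, height_px // patch

vienna_grid = patch_grid(5888, 7168)
graz_grid = patch_grid(6656, 6912)
print("Vienna:", vienna_grid)
print("Graz:", graz_grid)
```

The Vienna result agrees with the grid_w=368 and grid_h=448 values used in the Technical Details section.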

Quick Start

Installation

# Clone repository
git clone https://github.com/yourusername/sat-finder.git
cd sat-finder

# Install dependencies
pip install -e ".[dev]"

# Download satellite data and pre-compute embeddings
make download-vienna-stitch    # Download Vienna orthofoto (~16MB)
make download-graz-stitch      # Download Graz orthofoto (~18MB)
make precompute-embeddings     # Generate DINOv3 embeddings (~627MB)

Run the App

# Start Gradio app (production mode)
make app

# Or use development mode with auto-reload
make app-dev

Open your browser to http://localhost:7860

Usage

Select a city from the dropdown (Vienna or Graz)
Choose point type: Positive (+) or Negative (-)
Click on structures in the map to mark examples
- Green markers = positive examples (what you want to find)
- Red markers = negative examples (what to avoid)
Click "Find Similar" to compute similarity
Similar areas will be highlighted in the red heatmap
Toggle overlays:
- Show Heatmap: Display/hide similarity scores
- Show DINO Grid: Visualize 16×16 patch boundaries
Adjust Heatmap Opacity for better visualization
Use "Download View" to save current view as PNG
Click "Clear Points" to reset and start over

Development

Project Structure

sat-finder/
├── app.py                      # Main entry point (FastAPI + Gradio)
├── src/satfinder/              # Source code
│   ├── api.py                  # FastAPI application factory
│   ├── config.py               # Configuration constants
│   ├── similarity.py           # Similarity search engine
│   ├── state.py                # Gradio state management
│   └── ui/                     # Gradio UI components
│       ├── controls.py         # Sidebar controls
│       ├── layout.py           # Main Blocks layout
│       └── viewer.py           # OpenSeadragon viewer
├── static/                     # Static assets
│   ├── viewer.html             # OpenSeadragon viewer iframe
│   ├── js/                     # JavaScript libraries
│   ├── tiles/                  # DeepZoom tiles (Vienna)
│   └── tiles_graz/             # DeepZoom tiles (Graz)
├── assets/                     # Pre-computed data
│   ├── vienna_embeddings.npz   # Vienna DINOv3 features
│   ├── graz_embeddings.npz     # Graz DINOv3 features
│   ├── vienna.jpg              # Vienna source image
│   └── graz.jpg                # Graz source image
├── scripts/                    # Data preparation scripts
│   ├── download_vienna_data.py
│   ├── download_graz_data.py
│   └── precompute_embeddings.py
└── docs/                       # Documentation & figures
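
The embeddings in assets/ are stored as NumPy .npz archives. The README does not document the array names inside them, so the sketch below writes and reads a toy archive using a hypothetical key, "embeddings", to show how such a file can be inspected. Check `archive.files` on the real files to see the actual keys.

```python
import os
import tempfile
import numpy as np

# Hypothetical layout: a single array stored under the key "embeddings".
grid = np.zeros((4, 5, 384), dtype=np.float16)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_embeddings.npz")
    np.savez_compressed(path, embeddings=grid)

    with np.load(path) as archive:
        print(archive.files)            # names of the arrays in the archive
        loaded = archive["embeddings"]  # reads the array into memory

grid_h, grid_w, dim = loaded.shape
print(grid_h, grid_w, dim)
```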

Key Commands

make help                  # Show all available commands
make app                   # Start Gradio app (production)
make app-dev               # Start Gradio app (dev mode with auto-reload)
make tiles                 # Generate DeepZoom tiles from images
make precompute-embeddings # Pre-compute DINOv3 embeddings
make tests                 # Run tests
make lint                  # Run linter
make format                # Format code

DevContainer

The project includes a complete VS Code DevContainer setup with GPU support:

# All dependencies are automatically installed on container creation:
# - Python packages (from pyproject.toml)
# - OpenSeadragon JavaScript library
# - Jupyter kernel setup
# - Pre-commit hooks

See .devcontainer/post-create.sh for details.

Technical Details

DINOv3 Feature Extraction

from satfinder.similarity import load_embeddings, compute_similarity

# Load pre-computed embeddings (cached)
embeddings = load_embeddings("vienna")  # Shape: (448, 368, 384), i.e. (grid_h, grid_w, dim)
# A grid of 448 rows × 368 columns, one 384-dimensional embedding per patch

# Compute similarity between query points and all patches
pos_points = [(100, 200), (150, 250)]  # (x, y) coordinates
neg_points = [(300, 400)]

heatmap = compute_similarity(
    embeddings,
    pos_points=pos_points,
    neg_points=neg_points,
    img_w=5888,
    img_h=7168,
    grid_w=368,
    grid_h=448
)
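
The call above takes points in image pixel coordinates together with the image and grid sizes, which suggests that each point is mapped to the patch that contains it. A minimal sketch of such a mapping follows; `pixel_to_cell` is a hypothetical helper, not a function from satfinder, and the real code may handle this differently.

```python
def pixel_to_cell(x, y, img_w, img_h, grid_w, grid_h):
    """Map an (x, y) pixel position to the (row, col) of the patch containing it."""
    col = int(x * grid_w / img_w)
    row = int(y * grid_h / img_h)
    # Clamp so clicks on the right or bottom edge stay inside the grid.
    col = min(max(col, 0), grid_w - 1)
    row = min(max(row, 0), grid_h - 1)
    return row, col

pos_points = [(100, 200), (150, 250)]  # same (x, y) points as in the example above
cells = [pixel_to_cell(x, y, 5888, 7168, 368, 448) for x, y in pos_points]
print(cells)
```

Returning (row, col) rather than (x, y) matches the (grid_h, grid_w, dim) layout of the embedding array, so a cell can index it directly.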

Similarity Scoring

The similarity score for each patch is computed as:

query = mean(positive_embeddings) - 0.5 × mean(negative_embeddings)
similarity = cosine_similarity(query, patch_embedding)

Where cosine similarity is:

cos_sim(A, B) = (A · B) / (||A|| × ||B||)

Scores range from -1 (vectors point in opposite directions) to +1 (vectors point in the same direction). The heatmap visualizes scores above a threshold (typically 0.5).
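
One way to turn thresholded scores into a red overlay is to map them to the alpha channel of an RGBA image. The sketch below is an assumption about how this could be done; the function name `to_heatmap_rgba` and the linear rescaling are illustrative, and the app's actual rendering may differ.

```python
import numpy as np

def to_heatmap_rgba(sim, threshold=0.5):
    """Turn an (H, W) similarity map into an (H, W, 4) uint8 RGBA overlay."""
    # Rescale [threshold, 1] to [0, 1]; scores below the threshold become 0.
    strength = np.clip((sim - threshold) / (1.0 - threshold), 0.0, 1.0)
    rgba = np.zeros(sim.shape + (4,), dtype=np.uint8)
    rgba[..., 0] = 255  # pure red colour
    rgba[..., 3] = np.round(strength * 255).astype(np.uint8)  # alpha encodes similarity
    return rgba

sim = np.array([[0.2, 0.5], [0.75, 1.0]])
overlay = to_heatmap_rgba(sim)
print(overlay[..., 3])
```

Patches at or below the threshold end up fully transparent, so only strong matches tint the map.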

Performance

Embedding pre-computation: ~5-10 minutes per city (one-time cost)
Search latency: <100ms for similarity computation (CPU)
Memory usage: ~2GB RAM (embeddings + model)
Model size: ~1.2GB (DINOv3 ViT-L/16)
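
To check the search latency figure on your own hardware, a rough timing sketch such as the following can help. It uses random data in place of real embeddings and a hypothetical helper, `time_search`; the measured time depends on the machine and the NumPy build.

```python
import time
import numpy as np

def time_search(grid_h, grid_w, dim=384, seed=0):
    """Score every patch against one query and return (scores, elapsed_ms)."""
    rng = np.random.default_rng(seed)
    patches = rng.standard_normal((grid_h * grid_w, dim), dtype=np.float32)
    # Normalise once up front so each search is a single matrix-vector product.
    patches /= np.linalg.norm(patches, axis=1, keepdims=True)
    query = patches[0]

    start = time.perf_counter()
    scores = patches @ query
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return scores.reshape(grid_h, grid_w), elapsed_ms

# Quarter-scale grid keeps the demo light; use (448, 368) for a Vienna-sized map.
scores, elapsed_ms = time_search(112, 92)
print(f"{scores.size} patches scored in {elapsed_ms:.2f} ms")
```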

Limitations

Fixed patch size (16×16 pixels) may miss very small or very large structures
Performance depends on visual similarity (struggles with abstract patterns)
Pre-computed embeddings are city-specific (requires re-computation for new images)
Requires significant disk space for embeddings (~627MB per city)

Citation

If you use this project, please cite:

@misc{simeoni_dinov3_2025,
	title = {{DINOv3}},
	url = {http://arxiv.org/abs/2508.10104},
	doi = {10.48550/arXiv.2508.10104},
	urldate = {2025-09-25},
	publisher = {arXiv},
	author = {Siméoni, Oriane and Vo, Huy V. and Seitzer, Maximilian and Baldassarre, Federico and Oquab, Maxime and Jose, Cijo and Khalidov, Vasil and Szafraniec, Marc and Yi, Seungeun and Ramamonjisoa, Michaël and Massa, Francisco and Haziza, Daniel and Wehrstedt, Luca and Wang, Jianyuan and Darcet, Timothée and Moutakanni, Théo and Sentana, Leonel and Roberts, Claire and Vedaldi, Andrea and Tolan, Jamie and Brandt, John and Couprie, Camille and Mairal, Julien and Jégou, Hervé and Labatut, Patrick and Bojanowski, Piotr},
	month = aug,
	year = {2025},
}

For the satellite-adapted DINOv3 variant, see: https://huggingface.co/timm/vit_large_patch16_dinov3.sat493m

License

MIT License - see LICENSE for details.

Data sources (Vienna and Graz orthophotos) are licensed under CC BY 4.0 by their respective cities.

Acknowledgments

DINOv2/DINOv3: Meta AI Research (Siméoni et al., 2025)
Satellite DINOv3: timm library and Ross Wightman
Vienna Orthofoto: Stadt Wien Open Government Data
Graz Orthofoto: Stadt Graz Open Government Data
OpenSeadragon: Deep zoom viewer for high-resolution imagery
Gradio: Interactive web interface framework

Contact

For questions or issues, please open an issue on GitHub.
