An AI-powered powerhouse for extracting, analyzing, and filtering high-quality frames from video.
Designed for content creators, dataset builders (LoRA/Dreambooth), and researchers. This tool bridges the gap between raw video footage and curated, high-quality image datasets using state-of-the-art AI.
Traditional frame extraction is noisy. Subject Frame Extractor uses advanced segmentation and quality heuristics to ensure you only keep the frames that matter.
- Intelligent Extraction: Beyond simple intervals—use scene detection and keyframe awareness.
- Multi-Class Tracking: Automatically find and track any of 80 COCO objects (people, cars, animals, etc.) using YOLO26 and SAM3.
- Scene-Level Deduplication: Automated extraction of the single best frame per shot.
- Quality First: Filter by sharpness, contrast, and perceptual quality (NIQE).
- Face Matching: Find every frame of a specific person using InsightFace.
- Extraction Strategies: Keyframes, fixed intervals, scene-based, or every Nth frame.
- YouTube Integration: Direct URL processing with resolution control.
- Scene Intelligence: Automatically segments video into shots to optimize analysis.
- SAM 3 Integration: Precise subject segmentation and tracking across scenes.
- Open-Vocabulary Detection: Describe what you want to find (e.g., "a golden retriever") and let the AI find it.
- Face Analysis: Similarity matching, blink detection, and head pose estimation (yaw/pitch/roll).
- Perceptual Metrics: Real-time quality scoring to surface the "best" frames automatically.
- Interactive Sliders: Filter thousands of frames in real-time based on AI-calculated metrics.
- Smart Deduplication: Uses pHash and LPIPS to remove near-identical frames.
- AR-Aware Cropping: Export subject-centered crops in 1:1, 9:16, 16:9, or custom ratios.
- RAW Support: Extract high-resolution embedded previews from RAW files (CR2, NEF, ARW, DNG, ORF, etc.) using ExifTool. No demosaicing required for ultra-fast ingestion.
- Quality Culling: AI-powered scoring for focus, composition, and technical quality.
- Sharpness: Laplacian-variance edge detection to identify in-focus shots.
- Naturalness (NIQE): Perceptual quality score that measures how "natural" an image looks without needing a reference.
- Information (Entropy): Measures the complexity/detail density of the image.
- Face Prominence: Uses InsightFace to detect faces and score them based on confidence and size.
- Lightroom/C1 Interop: Export internal scores as 1-5 star ratings directly to non-destructive XMP sidecars.
- Segmentation: Segment Anything Model 3 (SAM 3)
- Face Analysis: InsightFace
- UI Framework: Gradio 6.x
- Data Science: PyTorch, NumPy, OpenCV, Pydantic
- Media Handling: FFmpeg, yt-dlp
- Database: SQLite (for lightning-fast metadata filtering)
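Storing per-frame metrics in SQLite is what makes slider-based filtering fast: a threshold change becomes a single indexed query instead of a scan over image files. A minimal sketch with a hypothetical schema (the real table layout is internal to the app):

```python
import sqlite3

# Hypothetical schema for illustration; lower NIQE = more "natural".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE frames (path TEXT, sharpness REAL, niqe REAL)")
conn.executemany("INSERT INTO frames VALUES (?, ?, ?)", [
    ("f1.jpg", 120.5, 3.2),
    ("f2.jpg", 15.0, 7.9),
    ("f3.jpg", 300.1, 2.1),
])

# A "slider" filter: keep sharp, natural-looking frames, best first.
keep = conn.execute(
    "SELECT path FROM frames WHERE sharpness >= ? AND niqe <= ? "
    "ORDER BY sharpness DESC",
    (100.0, 5.0),
).fetchall()
# keep → [('f3.jpg',), ('f1.jpg',)]
```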
- Python 3.10+ (3.12 recommended)
- FFmpeg installed and in your system PATH.
- CUDA-capable GPU (highly recommended; ~8GB VRAM for SAM 3).
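A quick environment sanity check can be scripted before installing; a minimal sketch (the `check_prereqs` helper is illustrative, not shipped with the tool):

```python
import shutil
import sys

def check_prereqs() -> list[str]:
    """Return a list of human-readable problems; empty means ready to go."""
    problems = []
    if sys.version_info < (3, 10):
        problems.append("Python 3.10+ required")
    if shutil.which("ffmpeg") is None:
        problems.append("FFmpeg not found on PATH")
    return problems
```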
We highly recommend uv for its speed and reliability.
1. **Clone with Submodules**

   ```bash
   git clone --recursive https://github.com/tazztone/subject-frame-extractor.git
   cd subject-frame-extractor
   ```

   Note: Use `git submodule update --init --recursive` if you have already cloned the repository.

2. **Sync Environment**

   ```bash
   uv sync
   ```

3. **Launch**

   ```bash
   uv run python app.py
   ```

   Alternatively, on Linux:

   ```bash
   ./scripts/linux_run_app.sh
   ```

   Access the UI at http://127.0.0.1:7860.

Manual installation with pip:

```bash
python -m venv venv
. venv/bin/activate             # Linux/Mac
# .\venv\Scripts\activate.ps1   # Windows (PowerShell)
pip install -r requirements.txt
pip install -e SAM3_repo
```
The application provides a powerful CLI for automated extraction, analysis, and headless operation. Always use `uv run` to ensure the correct environment.
Extract thumbnails and detect scenes from a video:
```bash
uv run python cli.py extract --video path/to/video.mp4 --output ./results --nth-frame 10
```

- Caching: Subsequent runs with identical settings are skipped automatically using fingerprints.
- Force: Use `--force` to re-extract even if a fingerprint match is found.
- Clean: Use `--clean` to delete the output directory before starting.
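The fingerprint-based caching above can be sketched as hashing the video's identity together with the extraction settings, so a repeat run with identical inputs is detected and skipped (an illustrative sketch; the tool's actual fingerprint format may differ):

```python
import hashlib
import json
from pathlib import Path

def extraction_fingerprint(video: str, settings: dict) -> str:
    """Illustrative: stable hash over video identity + extraction settings."""
    p = Path(video)
    identity = {
        "name": p.name,
        "size": p.stat().st_size if p.exists() else None,
        "settings": settings,
    }
    # sort_keys makes the JSON (and therefore the hash) deterministic
    return hashlib.sha256(
        json.dumps(identity, sort_keys=True).encode()
    ).hexdigest()
```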
Run the full AI pipeline (seeding, tracking, metrics) on an existing extraction:
```bash
uv run python cli.py analyze --session ./results --video path/to/video.mp4 --face-ref person.png --resume
```

- Resume: Use `--resume` to skip already completed scenes (uses `progress.json`).
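Resume logic of this kind boils down to a set difference against `progress.json`; a minimal illustrative sketch (the real file layout and helper names may differ):

```python
import json
from pathlib import Path

def scenes_to_process(session_dir: Path, all_scene_ids: list[int]) -> list[int]:
    """Illustrative resume helper: skip scene IDs already marked completed."""
    progress_file = session_dir / "progress.json"
    done: set[int] = set()
    if progress_file.exists():
        data = json.loads(progress_file.read_text())
        done = set(data.get("completed_scenes", []))
    return [s for s in all_scene_ids if s not in done]
```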
Run extraction and analysis in one command:
```bash
uv run python cli.py full --video video.mp4 --output ./results --face-ref person.png
```

Check the progress and metadata of a session:

```bash
uv run python cli.py status --session ./results
```

Process image folders and sync ratings to sidecars:
```bash
# 1. Ingest folder (crawls images, extracts RAW previews)
uv run python cli.py photo ingest --folder /path/to/raws --output ./photo_session

# 2. Score photos (sharpness, naturalness, face prominence, etc.)
uv run python cli.py photo score --session ./photo_session

# 3. Export XMP sidecars (ratings and labels compatible with Lightroom)
uv run python cli.py photo export --session ./photo_session
```

- Source: Upload a video or paste a YouTube URL. Choose your extraction resolution.
- Extract: Run the extraction. The tool identifies scenes and generates thumbnails.
- Define Subject:
- Hybrid Seeding: Combine face reference with text descriptions and YOLO mask prompts for robust initialization.
- By Face: Upload a reference photo for similarity matching.
- By Text: Enter a description (e.g., "cat", "person in red").
- Auto: Let the AI select the most prominent subject.
- Analyze: Review "Scene Seeds". Run Propagation to track subjects through the video.
- Filter: Use sliders in the Metrics & Filtering tab to curate your dataset.
- Export: Select your crop settings and aspect ratio, then hit Export.
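The XMP export step maps internal scores onto the standard 1-5 star scale in non-destructive sidecars; a minimal illustrative sketch (the helper names and the score-to-star mapping are assumptions, not the tool's actual code, though `xmp:Rating` and `xmp:Label` are the standard Adobe XMP fields):

```python
from pathlib import Path

XMP_TEMPLATE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:xmp="http://ns.adobe.com/xap/1.0/"
    xmp:Rating="{rating}" xmp:Label="{label}"/>
 </rdf:RDF>
</x:xmpmeta>
"""

def score_to_stars(score: float, lo: float = 0.0, hi: float = 1.0) -> int:
    """Map a normalized quality score onto the 1-5 star scale."""
    t = (score - lo) / (hi - lo)
    return max(1, min(5, 1 + round(t * 4)))

def write_sidecar(image_path: Path, rating: int, label: str = "") -> Path:
    """Write an .xmp sidecar next to the image (non-destructive)."""
    sidecar = image_path.with_suffix(".xmp")
    sidecar.write_text(XMP_TEMPLATE.format(rating=rating, label=label))
    return sidecar
```

Because the rating lives in a sidecar rather than the image file itself, Lightroom and Capture One pick it up on import without the original ever being modified.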
For detailed information on architecture, critical rules (Agent Memory), development workflows, and testing, refer to `AGENTS.md`.
See `core/config.py` for the full schema.
| Category | Key Fields | Default |
|---|---|---|
| Paths | `logs_dir`, `models_dir`, `downloads_dir` | `logs`, `models`, `downloads` |
| Models | `face_model_name`, `tracker_model_name` | `buffalo_l`, `sam3` |
| Performance | `analysis_default_workers`, `cache_size` | `4`, `200` |
| Quality | `quality_weights_*` | (variable weights) |
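Since the stack uses Pydantic, the schema in `core/config.py` plausibly looks something like this sketch (field names and defaults are taken from the table above; the class name and validators are assumptions):

```python
from pydantic import BaseModel, Field

class AppConfig(BaseModel):
    """Illustrative sketch only; see core/config.py for the real schema."""
    # Paths
    logs_dir: str = "logs"
    models_dir: str = "models"
    downloads_dir: str = "downloads"
    # Models
    face_model_name: str = "buffalo_l"
    tracker_model_name: str = "sam3"
    # Performance
    analysis_default_workers: int = Field(default=4, ge=1)
    cache_size: int = 200
```

A Pydantic model gives the config layer free validation: constructing `AppConfig(analysis_default_workers=0)` would raise instead of silently running with a broken worker pool.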
MIT License. See LICENSE for details.