Turn a folder of photos and videos into a polished highlight reel — with one command.
AI plans the edit, FFmpeg renders locally at full resolution. Your raw media never
leaves your machine — only compressed thumbnails and preview clips are sent to Gemini
for planning. Rendering happens entirely on your GPU at 4K60 if you want.
About $0.25/run with --model fast, $0.50 with balanced.
- AI plans, local renders — Gemini only sees 400px thumbnails and 480p 1 fps preview clips. Your original 4K photos and videos stay local. FFmpeg renders the final output from source files at any resolution you choose.
- Sees and hears everything — despite the compression, Gemini sees every photo and watches every video clip with audio. It selects by visual and aural judgment, not metadata.
- Per-segment AI music — Lyria RealTime generates mood-matched background tracks, crossfaded into one composite. Dynamic ducking around speech via sidechaincompress.
- Beat-synced transitions — cuts snap to music beats via BPM detection. Speech segments are preserved without snapping.
- GPU-accelerated — NVENC (Linux/Windows) and VideoToolbox (macOS) for encoding and decoding. Automatic fallback to CPU.
- Rich terminal UI — live progress panel with per-stage status, sub-stage bars, cost tracking, and a summary table on completion.
- Iterate fast — thumbnails and previews are cached. Re-planning is a single Gemini call, and re-rendering at a different resolution needs no API call at all.
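The beat-synced transitions above amount to snapping each non-speech cut time onto the music's beat grid. A minimal illustrative sketch, assuming a fixed BPM and a simple nearest-beat rule (not the project's actual code):

```python
def snap_to_beats(cut_times, bpm, offset=0.0, keep=None):
    """Snap each cut time (seconds) to the nearest beat of a fixed-BPM grid.

    `keep` is an optional set of indices (e.g. speech segments) left unsnapped.
    """
    beat = 60.0 / bpm                     # seconds per beat
    keep = keep or set()
    snapped = []
    for i, t in enumerate(cut_times):
        if i in keep:
            snapped.append(t)             # preserve speech cuts as-is
        else:
            n = round((t - offset) / beat)
            snapped.append(offset + n * beat)
    return snapped
```

At 120 BPM (one beat every 0.5 s), a cut at 1.1 s snaps to 1.0 s unless its index is in `keep`.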
Prerequisites: Python 3.11+, FFmpeg, Gemini API key

```bash
git clone https://github.com/Guoyuer/reelsmith.git && cd reelsmith
python -m venv venv && source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -e .
cp .env.example .env  # then add your GEMINI_API_KEY
```

Run the full pipeline:
```bash
reelsmith full -n my-trip -p ./photos --duration 60 --model balanced -r 1080p30
```

That's it. Output lands in workspace/runs/my-trip/output/.
```bash
# 1. Fast draft
reelsmith full -n trip -p ./photos --duration 120 --model fast -r 720p30

# 2. Re-plan with tweaks
reelsmith plan -n trip --duration 90 --model balanced --style cinematic \
  --focus "street food close-ups; temple serenity"

# 3. Final render
reelsmith assemble -n trip -r 4k60
```

| Command | What it does |
|---|---|
| `reelsmith full` | End-to-end: prepare → plan → music → assemble |
| `reelsmith prepare` | Scan media folder + generate thumbnails and previews |
| `reelsmith plan` | Re-plan with Gemini (reuses cached media) |
| `reelsmith assemble` | Re-render from existing EDL |
| `reelsmith config` | Show saved run config |
| `reelsmith workspace` | Disk usage and cleanup |
| Flag | Required | Description |
|---|---|---|
| `-n` / `--name` | yes | Run name (isolates workspace) |
| `-p` / `--path` | yes | Path to photos/videos folder |
| `--duration` | yes | Target length in seconds |
| `--model` | yes | fast ($0.24), balanced ($0.48), quality ($1.92) |
| `-r` / `--resolution` | yes | 4k60, 1080p30, 720p30, or WxHxFPS |
| `--style` | no | upbeat (default), cinematic, reflective, energetic |
| `--trip-type` | no | general (default), family, solo, food, adventure, architecture. Recommended — improves narrative quality |
| `--focus` | no | Creative focus: "family joy; exotic street markets" |
| `--instruct` | no | Free-form Gemini instructions: "no text overlays" |
| `--lang` | no | en (default), cn, both — for titles and overlays |
| `--music` | no | auto (default), none, or /path/to/track.mp3 |
Run reelsmith full --help for all options.
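As the table notes, `-r` accepts either a preset or an explicit WxHxFPS spec. A hypothetical parser illustrating both forms (the preset-to-dimensions mapping is assumed from standard video sizes, not taken from the project's code):

```python
def parse_resolution(spec):
    """Parse a -r value: a preset like '1080p30'/'4k60' or explicit 'WxHxFPS'."""
    presets = {
        "4k60": (3840, 2160, 60),    # assumed preset dimensions
        "1080p30": (1920, 1080, 30),
        "720p30": (1280, 720, 30),
    }
    if spec in presets:
        return presets[spec]
    w, h, fps = spec.split("x")      # explicit form, e.g. '2560x1440x24'
    return int(w), int(h), int(fps)
```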
See docs/architecture.md for the full data flow diagram — inputs, caches, EDL, and render artifact paths across all 4 stages.
```
prepare ──────▸ plan ─────────▸ generate_music ──▸ assemble
├─ scan folder  └─ Gemini sees  └─ Lyria music     ├─ per-segment FFmpeg render
├─ ffprobe         all photos +    per segment     ├─ TS concat (no re-encode)
├─ thumbs          watches                         ├─ beat sync + music ducking
└─ preview         videos                          └─ validation (6 checks)
   clips
```
Plan stage — Gemini receives photo thumbnails inline + a concatenated video preview (480p, with audio) via Files API. One API call returns a structured EDL (JSON) with narrative arc, item selection, trim points, transitions, effects, text overlays, and music moods. Postprocessing validates paths, clamps trim points, and deduplicates.
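The trim-clamping step described above might look roughly like the following. Field names (`start`, `end`, `duration`) are illustrative, not the real EDL schema:

```python
def clamp_trims(segments):
    """Clamp each segment's trim window to [0, source duration], drop empties."""
    cleaned = []
    for seg in segments:
        start = max(0.0, seg["start"])
        end = min(seg["duration"], seg["end"])
        if end > start:                  # drop zero/negative-length trims
            cleaned.append({**seg, "start": start, "end": end})
    return cleaned
```

A segment trimmed to (-1, 99) on a 10-second clip becomes (0, 10); a zero-length trim is removed entirely.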
Assemble stage — Each segment rendered as a single FFmpeg filter_complex_script.
Photos get cosine-eased Ken Burns effects with blurred background fill. Videos are
trimmed and speed-ramped per the EDL. Segments concatenated via TS demuxer (no
re-encode), then music mixed with sidechaincompress ducking (500ms release).
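Cosine easing, mentioned above for the Ken Burns effect, maps linear progress t ∈ [0, 1] onto a smooth ease-in-out curve. A minimal sketch of the interpolation (the zoom range and the actual FFmpeg filter expression are assumptions, not the project's code):

```python
import math

def cosine_ease(t):
    """Map linear progress t in [0, 1] to a smooth ease-in-out curve."""
    return (1.0 - math.cos(math.pi * t)) / 2.0

def ken_burns_zoom(t, z_start=1.0, z_end=1.2):
    """Zoom factor at progress t, interpolated with cosine easing."""
    return z_start + (z_end - z_start) * cosine_ease(t)
```

The curve starts and ends with zero velocity, which is what makes the pan/zoom feel gentle rather than linear.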
| | macOS | Linux | Windows |
|---|---|---|---|
| GPU encode | VideoToolbox | NVENC | NVENC |
| GPU decode | VideoToolbox | CUDA | CUDA |
| HEIC photos | native | pillow-heif | pillow-heif |
| CPU fallback | libx264 | libx264 | libx264 |
No local AI models needed — all inference runs via Gemini API.
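Encoder selection with CPU fallback, as in the table above, can be sketched by probing `ffmpeg -encoders` for the platform's preferred GPU encoder. The encoder names are standard FFmpeg identifiers; the selection logic itself is an illustrative assumption:

```python
import platform
import subprocess

def pick_encoder():
    """Pick a GPU H.264 encoder for the current OS, falling back to libx264."""
    preferred = {"Darwin": "h264_videotoolbox",
                 "Linux": "h264_nvenc",
                 "Windows": "h264_nvenc"}.get(platform.system())
    try:
        out = subprocess.run(["ffmpeg", "-hide_banner", "-encoders"],
                             capture_output=True, text=True, check=True).stdout
        if preferred and preferred in out:
            return preferred
    except (OSError, subprocess.CalledProcessError):
        pass                              # ffmpeg missing or probe failed
    return "libx264"                      # CPU fallback
```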
```bash
pip install -e ".[dev]"
pytest                   # unit tests (default)
pytest -m integration    # FFmpeg integration tests
```

Pre-commit hooks: ruff check --fix, ruff format, pytest.
