AI Video Reframe

Batch convert landscape to portrait videos using AI.

Forked from https://github.com/Sagar-lab03/AI-Content-Aware-Video-Cropping

Intro

This software detects motion in the video and crops the content dynamically, keeping the main subject in the frame. There are many settings, but the main ones are:

Speed vs Quality: Choose how many frames the AI analyses. The more frames it analyses, the better the tracking, but the longer processing takes.
Aspect Ratio: Specify the aspect ratio (width/height) of the reframed output.
FFMPEG: Used to keep the audio and to run custom operations such as trimming the footage.
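
To put rough numbers on the speed vs quality trade-off, the sketch below estimates how many frames go through AI detection for a given clip. It assumes detection runs on every Nth frame, as the Parameters table describes `--skip_frames`, and it treats values below 1 as "every frame". Both are assumptions: check main.py for the exact behaviour. The function name `detection_frames` is hypothetical.

```python
import math

def detection_frames(total_frames, skip_frames):
    """Estimate how many frames are analysed if detection runs on every Nth frame."""
    # Assumption: a value below 1 means "analyse every frame".
    step = max(1, skip_frames)
    # Frames 0, step, 2*step, ... are analysed.
    return math.ceil(total_frames / step)

# A 60-second clip at 30 fps has 1800 frames.
print(detection_frames(1800, 30))  # 60
print(detection_frames(1800, 3))   # 600
print(detection_frames(1800, 0))   # 1800
```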

Requirements

Python 3
FFMPEG

The scripts are Linux/macOS shell scripts, but the tool can easily be run on Windows if needed: read the run.sh file and run its commands by hand.

Installation

Clone this repository:

git clone https://github.com/IORoot/AI_Video_Reframe

Set up a virtual environment:

cd AI_Video_Reframe
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt
# or individually:
pip install ultralytics opencv-python numpy tqdm

Run a single file

To accept the defaults and convert landscape 16:9 videos to 3:4 videos, do this:

./run.sh /Location/of/MP4/file.mp4
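
For a sense of what the 3:4 default means in pixels, the arithmetic can be sketched as below. This is an illustration only, not the code from crop_calculator.py. The function name `crop_size` is hypothetical, and rounding each dimension down to an even number is an assumption (many video encoders expect even dimensions).

```python
def crop_size(src_width, src_height, target_ratio):
    """Return the largest (width, height) crop whose width/height matches target_ratio."""
    # First try to keep the full source height.
    width = int(src_height * target_ratio)
    height = src_height
    if width > src_width:
        # Target is wider than the source: keep the full width instead.
        width = src_width
        height = int(src_width / target_ratio)
    # Assumption: round each dimension down to an even number.
    width -= width % 2
    height -= height % 2
    return width, height

print(crop_size(1920, 1080, 0.75))    # (810, 1080)
print(crop_size(1920, 1080, 0.5625))  # (606, 1080)
```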

Changing the settings

The main line to run the code is:

python main.py --input "${FILENAME}" --output "${PROCESSED_FILENAME}" --model_size m --skip_frames 3 --smoothing_window 30 --conf_threshold 0.5 --use_saliency --max_workers 6 --target_ratio 0.75

Parameters

Parameter	Description	Default
`--input`	Path to input video file	Required
`--output`	Path to output video file	Required
`--target_ratio`	Target aspect ratio (width/height)	0.5625 (9:16)
`--model_size`	YOLOv8 model size (n, s, m, l, x)	n (nano)
`--skip_frames`	Process every Nth frame for detection	10
`--smoothing_window`	Number of frames for temporal smoothing	30
`--conf_threshold`	Confidence threshold for detections	0.5
`--use_saliency`	Enable saliency detection	False
`--max_workers`	Maximum number of worker threads	4
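
To illustrate what `--smoothing_window` controls, here is a minimal sketch of temporal smoothing as a trailing moving average over the crop centre's x position. It is an assumption for illustration only: the method actually used in smoothing.py may differ, and `smooth_positions` is a hypothetical name.

```python
from collections import deque

def smooth_positions(positions, window):
    """Trailing moving average: each output averages up to `window` recent inputs."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)  # oldest value drops out automatically
    smoothed = []
    for x in positions:
        recent.append(x)
        smoothed.append(sum(recent) / len(recent))
    return smoothed

# A single-frame jump to x=400 is damped instead of jerking the crop.
raw = [100, 100, 400, 100, 100]
print(smooth_positions(raw, 3))  # [100.0, 100.0, 200.0, 200.0, 200.0]
```

A larger window gives steadier framing but makes the crop slower to follow a subject that really does move.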

Model Size Selection Guide

Model Size	Flag	Description	Use Case
Nano (n)	`--model_size n`	Smallest and fastest model	Testing or low-power devices
Small (s)	`--model_size s`	Good balance for mobile devices	Mobile applications
Medium (m)	`--model_size m`	Balanced model	General purpose detection
Large (l)	`--model_size l`	Higher accuracy, slower speed	When accuracy is more important
XLarge (x)	`--model_size x`	Highest accuracy, slowest speed	When maximum accuracy is required

Scenarios:

I've found that if you want highly accurate (but very slow) video reframing, you need to use the following flags:

--skip_frames 0   # Use AI to detect movement on EVERY Frame
--model_size x    # Use biggest AI Model
--max_workers 8   # Or the number of CPU cores you have

Good tracking, although fast movements might not be caught:

--skip_frames 3   # Skip 3 frames, and then use AI on 1. Repeat.
--model_size m    # use the medium AI model
--max_workers 6   # 75% of All cores

Fast, but inaccurate tracking - good for low movement or interview videos:

--skip_frames 30  # on a 30fps video, use 1 frame per second.
--model_size s    # use a small AI Model
--max_workers 6   # 75% of cores

There are many more settings, and the Python code can be changed to make further alterations.
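
The `--max_workers` values in the scenarios above follow a rough rule of about 75% of the available cores. A small sketch of that calculation is shown below. The helper name `suggest_workers` and the 0.75 default are illustrative assumptions, not part of this project.

```python
import os

def suggest_workers(fraction=0.75, cores=None):
    """Suggest a worker count as a fraction of CPU cores (never less than 1)."""
    if cores is None:
        # os.cpu_count() can return None, so fall back to 1.
        cores = os.cpu_count() or 1
    return max(1, int(cores * fraction))

print(suggest_workers(cores=8))       # 6
print(suggest_workers(1.0, cores=8))  # 8
```

The result can then be passed as `--max_workers` when calling main.py.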

Batch Runs

Use like so:

./run_batch.sh /folder/with/videos/in/

This will find all mp4 files in any subdirectory within that folder. It will then create a new file called run_all_found_files.sh which lists every file and the run command against each one.

I prefer this method to simply looping over each file because you can open run_all_found_files.sh and check how far the batch has got. It also allows you to cancel the process at any time and start again without a problem (it will skip any videos already done).

Once all videos are converted, the run_all_found_files.sh file is removed.
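
The batch approach described above can be sketched in Python as follows. This is not the contents of run_batch.sh, only an illustration of the idea: find every mp4 file, skip those that already have an output, and write one run command per remaining file. The output subfolder name `processed` and the function name `build_batch_script` are hypothetical, as is the "output exists" check used to decide what to skip.

```python
from pathlib import Path
import shlex

def build_batch_script(root, script_name="run_all_found_files.sh",
                       out_dir_name="processed"):
    """Write a shell script with one ./run.sh line per unconverted mp4 under root."""
    root = Path(root)
    lines = ["#!/bin/bash"]
    for video in sorted(root.rglob("*.mp4")):
        # Ignore files that are themselves reframed outputs.
        if video.parent.name == out_dir_name:
            continue
        # Assumption: a video counts as done if its output file already exists.
        if (video.parent / out_dir_name / video.name).exists():
            continue
        lines.append(f"./run.sh {shlex.quote(str(video))}")
    script = root / script_name
    script.write_text("\n".join(lines) + "\n")
    return script

# Usage (hypothetical path):
# build_batch_script("/folder/with/videos/in/")
```

Because the generated script is plain text, it can be opened at any time to see which files are still queued.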

Output

The reframed videos will be in a subfolder within the directory of the found video file.

FFMPEG is used to copy the audio from the original video to the reframed version, since the main Python code does not handle audio.
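
One common way to do this audio copy with FFMPEG is sketched below. It takes the video stream from the reframed file and the audio stream from the original, without re-encoding either. The exact flags used by run.sh are not shown here, so treat this as an assumption rather than the project's command. The function names are hypothetical.

```python
import subprocess

def build_audio_copy_command(reframed, original, output):
    """Build an ffmpeg command that muxes reframed video with the original audio."""
    return [
        "ffmpeg", "-y",          # -y: overwrite the output if it exists
        "-i", str(reframed),     # input 0: reframed video (no audio)
        "-i", str(original),     # input 1: original video (audio source)
        "-map", "0:v:0",         # take video from input 0
        "-map", "1:a:0?",        # take audio from input 1; '?' tolerates no audio
        "-c", "copy",            # copy streams without re-encoding
        "-shortest",             # stop at the end of the shorter stream
        str(output),
    ]

def copy_audio(reframed, original, output):
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_audio_copy_command(reframed, original, output), check=True)

print(" ".join(build_audio_copy_command("reframed.mp4", "original.mp4", "final.mp4")))
```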

Credit

This code was originally from https://github.com/Sagar-lab03/AI-Content-Aware-Video-Cropping and all the AI work is theirs. I've slightly adapted it to include the bash scripts and FFMPEG bits for my own usage.
