Copilot AI commented Dec 21, 2025

Implements automated workflow for ripping DVDs, Blu-Ray, and UltraHD discs using makemkvcon, with intelligent content detection, track selection, online disc identification, metadata-based file naming, and secure API key management via environment variables.

Core Implementation

  • rip_disc.py (900+ lines): Main Python script handling disc scanning, content identification, track selection, encoding, and file naming

    • Parses makemkvcon output to extract disc structure and track metadata
    • Distinguishes movies (45+ min) from TV episodes (20-60 min) using duration thresholds
    • Selects highest resolution video, English audio (stereo + surround), and English subtitles
    • Copies streams without re-encoding to preserve quality
    • YAML configuration support with command-line overrides
    • Online disc identification via MusicBrainz
    • Online metadata lookup via TMDB (The Movie Database)
    • Secure API key management via environment variables
  • Shell wrappers: rip_movie.sh, rip_tv.sh - Simplified interfaces for common use cases

  • Batch scripts: batch_rip_movies.sh, batch_rip_tv_season.sh - Process multiple discs sequentially
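The duration-based selection described above could look roughly like the sketch below. It is an illustration only, not the code in rip_disc.py: the function name select_titles and the rule that the longest qualifying title is the main feature are assumptions. The threshold values are the ones given in this PR (2700 s and 1200 s in the configuration, and the 3600 s maximum episode length mentioned in the review).

```python
from typing import Dict, List

# Thresholds in seconds, as stated in this PR.
MIN_MOVIE_DURATION_SECONDS = 2700    # 45 minutes
MIN_EPISODE_DURATION_SECONDS = 1200  # 20 minutes
MAX_EPISODE_DURATION_SECONDS = 3600  # 60 minutes


def select_titles(durations: Dict[int, int], mode: str) -> List[int]:
    """Return the title IDs to rip.

    `durations` maps a title ID to its length in seconds.
    (Hypothetical helper; the real script parses makemkvcon output.)
    """
    if mode == "movie":
        candidates = {
            title: secs
            for title, secs in durations.items()
            if secs >= MIN_MOVIE_DURATION_SECONDS
        }
        # Assumed rule: the longest qualifying title is the main feature.
        # An empty candidate set returns [] instead of raising ValueError.
        return [max(candidates, key=candidates.get)] if candidates else []
    if mode == "tv":
        return sorted(
            title
            for title, secs in durations.items()
            if MIN_EPISODE_DURATION_SECONDS <= secs <= MAX_EPISODE_DURATION_SECONDS
        )
    raise ValueError(f"unknown mode: {mode!r}")
```

Note that a 50-minute title satisfies both rules, so the mode (movie or TV) has to come from the caller rather than from the durations alone.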

Online Identification Features

OnlineDiscIdentifier Class:

  • Disc ID calculation for MusicBrainz disc lookups
  • TMDB movie metadata search with year filtering
  • TMDB TV series metadata search with accurate naming
  • Automatic metadata retrieval: titles, years, genres, ratings, plot summaries, IMDB/TMDB IDs
  • Safe date parsing with validation
  • Graceful fallback to disc title if online services unavailable

Benefits:

  • Accurate titles with proper capitalization (e.g., "inception" → "Inception (2010).mkv")
  • Correct release years even when not specified
  • Rich metadata stored (genres, ratings, descriptions)
  • Professional file naming using official titles

Configuration

YAML-based configuration with sensible defaults:

disc:
  default_path: "disc:0"
  default_temp_dir: "/tmp/makemkv"

encoding:
  include_english_subtitles: true
  include_stereo_audio: true
  include_surround_audio: true
  min_movie_duration_seconds: 2700
  min_episode_duration_seconds: 1200

metadata:
  lookup_enabled: true
  # API key via environment variable recommended (see below)
  tmdb_api_key: ""

API Key Configuration (Secure Method)

Recommended: Environment Variable

# Linux/macOS - Add to ~/.bashrc or ~/.zshrc for persistence
export TMDB_API_KEY="your_api_key_here"

# Windows (Command Prompt)
set TMDB_API_KEY=your_api_key_here

# Windows (PowerShell)
$env:TMDB_API_KEY="your_api_key_here"

Alternative: Config File (Less Secure)

metadata:
  tmdb_api_key: "your_api_key_here"

The environment variable takes precedence over the config file. Using environment variables prevents accidental commits of API keys to version control.
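A minimal sketch of that precedence rule, assuming the config has already been loaded into a dict shaped like the YAML above. The function name resolve_tmdb_api_key and the warning text are hypothetical and are not taken from rip_disc.py.

```python
import logging
import os
from typing import Mapping, Optional

logger = logging.getLogger(__name__)


def resolve_tmdb_api_key(
    config: Mapping,
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Prefer TMDB_API_KEY from the environment, then fall back to the config."""
    env_key = environ.get("TMDB_API_KEY", "").strip()
    if env_key:
        return env_key

    # `metadata` may be missing or empty in the loaded config.
    metadata = config.get("metadata") or {}
    file_key = str(metadata.get("tmdb_api_key") or "").strip()
    if file_key:
        # Assumed wording; the PR only says a migration warning is shown.
        logger.warning(
            "TMDB API key read from config file; prefer the TMDB_API_KEY "
            "environment variable"
        )
        return file_key
    return None
```

Passing the environment as a parameter keeps the lookup easy to test without touching the real process environment.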

Usage

# Set API key via environment variable (recommended)
export TMDB_API_KEY="your_api_key_here"

# Single movie with online metadata lookup
./rip_movie.sh --title "inception" --output ~/Movies
# Output: Inception (2010).mkv (with correct formatting from TMDB)

# TV series disc with online lookup
./rip_tv.sh --title "breaking bad" --season 1 --output ~/TV
# Output: Breaking Bad - S01E01.mkv (properly formatted)

# With configuration file
./rip_disc.py --config config.yaml --output ~/Movies --title "Movie"

Security & Quality

  • Secure API key management: Environment variables preferred over config files
  • Cross-platform dependency checking via shutil.which()
  • Array-based command execution (no eval)
  • Module-level constants for duration thresholds
  • Proper filename sanitization (Windows/Linux reserved + control chars)
  • Empty title list handling to prevent ValueError
  • Safe date parsing with validation for TMDB responses
  • Proper error handling for network requests
  • Backward compatible: Config file API keys still supported with migration warnings
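To illustrate the filename sanitization item, here is one possible approach. It is a sketch under assumptions: the name sanitize_filename, the "_" replacement character, and the "untitled" fallback are hypothetical, and the actual rules in rip_disc.py may differ.

```python
import re

# Characters that are invalid on Windows, plus "/" and ASCII control characters.
_INVALID_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')

# Device names that Windows reserves regardless of extension.
_WINDOWS_RESERVED = {
    "CON", "PRN", "AUX", "NUL",
    *(f"COM{i}" for i in range(1, 10)),
    *(f"LPT{i}" for i in range(1, 10)),
}


def sanitize_filename(name: str, replacement: str = "_") -> str:
    """Return a filename that is safe on both Windows and Linux."""
    cleaned = _INVALID_CHARS.sub(replacement, name)
    # Windows rejects names that end in a dot or a space.
    cleaned = cleaned.strip(" .")
    if not cleaned:
        return "untitled"
    # "CON.mkv" is still reserved, so check the part before the first dot.
    stem = cleaned.split(".", 1)[0]
    if stem.upper() in _WINDOWS_RESERVED:
        cleaned = replacement + cleaned
    return cleaned
```

For example, a TMDB title containing a colon would have the colon replaced, while a name such as "Inception (2010).mkv" passes through unchanged.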

Dependencies

  • MakeMKV (makemkvcon)
  • FFmpeg (ffmpeg, ffprobe)
  • Python 3.7+ with PyYAML, requests, and tmdbsimple
  • TMDB API key (free, optional but recommended for metadata lookup)
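
The external tools listed above can be checked up front with shutil.which(), as the Security & Quality section mentions. The following is a small sketch of that idea; the function name missing_tools and the injectable `which` parameter are assumptions made for illustration.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Executables named in the Dependencies list.
REQUIRED_TOOLS = ("makemkvcon", "ffmpeg", "ffprobe")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        raise SystemExit(f"Missing required tools: {', '.join(missing)}")
```

Because shutil.which() resolves executables the same way on Linux, macOS, and Windows, no platform-specific lookup is needed.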
Original prompt

Generate scripts to use makemkvcon to rip DVDs, Blu-Ray discs and/or Blu-Ray UltraHD discs. It should determine which tracks are the main movie if the disc is a movie disc or find all the episodes if it is for a TV series. It should then re-encode the files to MKV at the highest resolution available and include English subtitles if they exist. Audio tracks should include the stereo and surround sound if available. It should attempt to look up the media files' metadata online and rename them.

Copilot AI and others added 5 commits December 21, 2025 05:56
Co-authored-by: mapitman <4205286+mapitman@users.noreply.github.com>
…aceholders

Co-authored-by: mapitman <4205286+mapitman@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add scripts for ripping DVDs and Blu-Ray discs with metadata" to "Add makemkvcon disc ripping and encoding automation with YAML configuration" Dec 21, 2025
Copilot AI requested a review from mapitman December 21, 2025 06:13
mapitman (Owner) commented

@copilot how about disc and track identification using an online database?

Copilot AI and others added 2 commits December 21, 2025 06:24
Co-authored-by: mapitman <4205286+mapitman@users.noreply.github.com>

Copilot AI commented Dec 21, 2025

@copilot how about disc and track identification using an online database?

Implemented online disc and track identification! The solution includes:

Disc Identification:

  • MusicBrainz disc ID calculation and lookup
  • Works with disc devices to calculate unique disc IDs

Track/Metadata Identification:

  • TMDB (The Movie Database) integration for movies and TV series
  • Automatic metadata retrieval: titles, years, genres, ratings, plot summaries
  • Intelligent search with year filtering for better accuracy

Usage:

  1. Get a free TMDB API key from https://www.themoviedb.org/settings/api
  2. Add to config.yaml:
metadata:
  lookup_enabled: true
  tmdb_api_key: "your_key"
  3. Run normally - metadata is fetched automatically:
./rip_disc.py --config config.yaml --title "inception" --output ~/Movies
# Output: Inception (2010).mkv (with correct capitalization from TMDB)

Benefits:

  • Accurate titles with proper formatting
  • Correct years even if you don't specify them
  • Rich metadata stored (genres, ratings, IMDB IDs)
  • Graceful fallback to disc title if lookup fails

Commits: 5d8830f, 13ef8ea

Copilot AI changed the title from "Add makemkvcon disc ripping and encoding automation with YAML configuration" to "Add makemkvcon disc ripping and encoding automation with YAML configuration and online metadata lookup" Dec 21, 2025
mapitman (Owner) commented

@copilot Read any secrets like API KEYs from an environment variable instead of a config file

Copilot AI and others added 2 commits December 21, 2025 06:37
Co-authored-by: mapitman <4205286+mapitman@users.noreply.github.com>

Copilot AI commented Dec 21, 2025

@copilot Read any secrets like API KEYs from an environment variable instead of a config file

Implemented! API keys are now read from environment variables for better security.

Changes:

  • TMDB API key read from TMDB_API_KEY environment variable (preferred)
  • Falls back to config file for backward compatibility
  • Warning displayed when using config file to encourage migration

Usage:

# Linux/macOS
export TMDB_API_KEY="your_api_key_here"

# Windows CMD
set TMDB_API_KEY=your_api_key_here

# Windows PowerShell
$env:TMDB_API_KEY="your_api_key_here"

Benefits:

  • API keys won't be accidentally committed to version control
  • Follows security best practices
  • Works across multiple projects without duplicating keys
  • Existing config files still work (no breaking changes)

Commits: 9689d6e, 2117c3c

Copilot AI changed the title from "Add makemkvcon disc ripping and encoding automation with YAML configuration and online metadata lookup" to "Add makemkvcon disc ripping and encoding automation with YAML configuration, online metadata lookup, and secure API key management" Dec 21, 2025
…l Ctrl+C exit
- Removed terminal clear at startup to avoid wiping the screen
- Suppressed progress bar init log for cleaner output
- Handle KeyboardInterrupt: print friendly message and exit 130
…e
- Refresh batch scripts for movies and TV
- Update README usage and notes
- Adjust requirements.txt for Rich integration
- Add Makefile targets for rip workflows
- Exclude run log from commits
mapitman requested a review from Copilot December 21, 2025 12:04
Copilot AI left a comment

Pull request overview

This PR implements a comprehensive automated workflow for ripping and encoding DVDs, Blu-Ray, and UltraHD discs using MakeMKV. The implementation includes intelligent content detection, online metadata lookup via TMDB, and secure API key management through environment variables.

Key changes:

  • Main Python script with 1400+ lines handling disc scanning, track selection, encoding, and metadata-based file naming
  • Online disc identification using MusicBrainz and TMDB for accurate titles and metadata
  • Secure API key management with environment variable support prioritized over config files
  • Shell wrappers and batch processing scripts for streamlined workflows

Reviewed changes

Copilot reviewed 9 out of 11 changed files in this pull request and generated 35 comments.

| File | Description |
|------|-------------|
| rip_disc.py | Core implementation with disc scanning, content identification, track selection, encoding, online metadata lookup, and secure API key handling |
| rip_movie.sh | Shell wrapper for single movie disc ripping with argument parsing |
| rip_tv.sh | Shell wrapper for TV series disc ripping with season/episode handling |
| batch_rip_movies.sh | Batch processing script for multiple movie discs |
| batch_rip_tv_season.sh | Batch processing script for TV season discs |
| requirements.txt | Python dependencies including PyYAML, requests, tmdbsimple, discid, and rich |
| config.example.yaml | Example configuration with metadata lookup settings and API key documentation |
| Makefile | Build automation for virtual environment and dependency management |
| README.md | Comprehensive documentation with usage examples and troubleshooting |
| EXAMPLES.md | Detailed usage examples for various scenarios |
| .gitignore | Git ignore rules including config files with secrets |


if dur > 0:
    est_title_bytes = int(dur * 2.5 * 1024 * 1024)
    # Use tqdm-aware logging to avoid interfering with the bar line
    write_log(f"📊 Estimated title {title_id} size (approx.): {self.human_bytes(est_title_bytes)}", 'info')
Copilot AI Dec 21, 2025

The write_log function is called on line 780 but defined later on line 784. This will cause a NameError because the function doesn't exist at the point it's called. Move the function definition before its first use (before line 773).

Copilot uses AI. Check for mistakes.
except Exception:
    pass
# Helper to print logs using Rich console
def write_log(msg: str, level: str = 'info'):
Copilot AI Dec 21, 2025

The write_log function references the live variable, which is only defined inside the if RICH_AVAILABLE: block (line 726). When Rich is not available, calling this function will raise a NameError. Consider adding a check for RICH_AVAILABLE inside the function, or initialize live to None before the if block and handle the None case explicitly.

Suggested change
def write_log(msg: str, level: str = 'info'):
    # If Rich is not available, always fall back to the standard logger
    if not RICH_AVAILABLE:
        if level == 'info':
            logger.info(msg)
        elif level == 'warning':
            logger.warning(msg)
        elif level == 'error':
            logger.error(msg)
        else:
            logger.debug(msg)
        return

Comment on lines +677 to +717
def _get_rows_cols():
    try:
        fd = os.open('/dev/tty', os.O_RDONLY)
        try:
            buf = struct.pack('hhhh', 0, 0, 0, 0)
            res = fcntl.ioctl(fd, termios.TIOCGWINSZ, buf)
            rows, cols, _, _ = struct.unpack('hhhh', res)
            if rows and cols:
                return int(rows), int(cols)
        finally:
            os.close(fd)
    except Exception:
        pass
    try:
        env_cols = int(os.environ.get('COLUMNS', '0'))
        env_rows = int(os.environ.get('LINES', '0'))
        if env_cols > 0 and env_rows > 0:
            return env_rows, env_cols
    except Exception:
        pass
    try:
        out = subprocess.run('stty size < /dev/tty', shell=True, capture_output=True, text=True)
        if out.returncode == 0 and out.stdout.strip():
            r, c = out.stdout.strip().split()
            return int(r), int(c)
    except Exception:
        pass
    return 24, 80

# Create Rich console for all output, write to /dev/tty to bypass tee
try:
    rows, cols = _get_rows_cols()
except Exception:
    rows, cols = (24, 80)

try:
    tty_file = open('/dev/tty', 'w')
    console = Console(file=tty_file, force_terminal=True, width=cols)
except Exception:
    console = Console(force_terminal=True, width=cols)
    tty_file = None
Copilot AI Dec 21, 2025

The code attempts to open /dev/tty which is Unix-specific and won't exist on Windows systems. While the exception is caught, this makes the code non-functional on Windows. Consider adding a platform check and using Windows equivalents (like CON) or disabling Rich progress bars gracefully on Windows.

Suggested change
if os.name != 'nt':
    def _get_rows_cols():
        try:
            fd = os.open('/dev/tty', os.O_RDONLY)
            try:
                buf = struct.pack('hhhh', 0, 0, 0, 0)
                res = fcntl.ioctl(fd, termios.TIOCGWINSZ, buf)
                rows, cols, _, _ = struct.unpack('hhhh', res)
                if rows and cols:
                    return int(rows), int(cols)
            finally:
                os.close(fd)
        except Exception:
            pass
        try:
            env_cols = int(os.environ.get('COLUMNS', '0'))
            env_rows = int(os.environ.get('LINES', '0'))
            if env_cols > 0 and env_rows > 0:
                return env_rows, env_cols
        except Exception:
            pass
        try:
            out = subprocess.run('stty size < /dev/tty', shell=True, capture_output=True, text=True)
            if out.returncode == 0 and out.stdout.strip():
                r, c = out.stdout.strip().split()
                return int(r), int(c)
        except Exception:
            pass
        return 24, 80
else:
    # On Windows, avoid using /dev/tty and rely on standard terminal size detection.
    import shutil
    def _get_rows_cols():
        try:
            size = shutil.get_terminal_size(fallback=(80, 24))
            # shutil returns (cols, rows)
            return int(size.lines), int(size.columns)
        except Exception:
            return 24, 80
# Create Rich console for all output, write to /dev/tty to bypass tee where supported
try:
    rows, cols = _get_rows_cols()
except Exception:
    rows, cols = (24, 80)
tty_file = None
if os.name != 'nt':
    try:
        tty_file = open('/dev/tty', 'w')
        console = Console(file=tty_file, force_terminal=True, width=cols)
    except Exception:
        console = Console(force_terminal=True, width=cols)
        tty_file = None
else:
    # On Windows, do not attempt to open /dev/tty; let Rich handle the console.
    console = Console(force_terminal=True, width=cols)

- Typically this is the main feature film

**For TV Series:**
- Finds all titles between 20-50 minutes
Copilot AI Dec 21, 2025

The documentation states TV episodes are "20-50 minutes" but the code constant MAX_EPISODE_DURATION_SECONDS = 3600 is 60 minutes, and the comment on line 594 also says "20-60 minutes each". Update the documentation to match the actual implementation (20-60 minutes).

Suggested change
- Finds all titles between 20-60 minutes

Comment on lines +947 to +948
if tty_file:
    tty_file.close()
Copilot AI Dec 21, 2025

The variable tty_file is referenced on line 947 but may be undefined when RICH_AVAILABLE is False. The variable is only initialized inside the if RICH_AVAILABLE: block (lines 712-717), so accessing it outside that block when Rich is not available will raise a NameError. Initialize tty_file = None before the if RICH_AVAILABLE: block to fix this issue.
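
The fix described in this comment amounts to binding the name on every code path before any branch runs. A minimal sketch of the pattern follows; the function pick_output and its parameters are hypothetical stand-ins for the surrounding rip logic, not code from rip_disc.py.

```python
import os
from typing import Optional, TextIO


def pick_output(rich_available: bool, tty_path: str = "/dev/tty") -> str:
    """Report which output target would be used, cleaning up safely."""
    # Bind the name before any branch so the cleanup below never hits NameError.
    tty_file: Optional[TextIO] = None
    try:
        if rich_available:
            try:
                tty_file = open(tty_path, "w")
            except OSError:
                # No controlling terminal (or not a Unix system): use stdout.
                tty_file = None
        return "tty" if tty_file is not None else "stdout"
    finally:
        # Safe on every path because tty_file always exists here.
        if tty_file is not None:
            tty_file.close()


if __name__ == "__main__":
    print(pick_output(rich_available=False))              # Rich missing
    print(pick_output(rich_available=True, tty_path=os.devnull))
```

With the initialization hoisted out of the conditional block, the cleanup code no longer depends on whether Rich was importable.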

    est_title_bytes = int(dur * 2.5 * 1024 * 1024)
    # Use tqdm-aware logging to avoid interfering with the bar line
    write_log(f"📊 Estimated title {title_id} size (approx.): {self.human_bytes(est_title_bytes)}", 'info')
except Exception:
Copilot AI Dec 21, 2025

'except' clause does nothing but pass and there is no explanatory comment.

Comment on lines +940 to +941
except Exception:
    pass
Copilot AI Dec 21, 2025

'except' clause does nothing but pass and there is no explanatory comment.

Suggested change
except Exception as exc:
    logging.getLogger(__name__).debug(
        "Non-fatal error while updating rip progress UI",
        exc_info=exc,
    )

Comment on lines +953 to +954
except Exception:
    pass
Copilot AI Dec 21, 2025

'except' clause does nothing but pass and there is no explanatory comment.

Suggested change
except Exception as e:
    logger.warning("Failed to restore previous SIGWINCH handler: %s", e)

Comment on lines +1235 to +1236
except Exception:
    pass
Copilot AI Dec 21, 2025

'except' clause does nothing but pass and there is no explanatory comment.

Suggested change
except Exception as e:
    logger.debug(f"Failed to extract title from filename {file_path}: {e}")

    last_pct_logged = pct_int
except Exception:
    logger.debug(f"Could not parse progress line: {line}")
    pass
Copilot AI Dec 21, 2025

Unnecessary 'pass' statement.

Suggested change (delete this line)
    pass

…mmand handling
- Use context manager + run_command_with_progress()
- Keep ffmpeg binary in command; strip -progress pipe:1
- Show percent updates via Rich bar writing to /dev/tty
- Add package to requirements.txt
- Change WARNING level display to 'WARN' for cleaner output
- Upgrade missing library logs to warning level for expected dependencies (tmdbsimple, ffmpeg-progress-yield)
- Remove success log for Rich library (only warn when missing)
- Track actual CINFO source (CINFO:0 vs CINFO:1) and report in disc type detection logs
- Add TV-specific Blu-ray bitrate fallback (4.3 MB/s) for more accurate TV episode estimates
- Add warning when TV mode detects few episodes but finds a long title (suggests movie disc)
- Add --disc-type option to rip_movie.sh and rip_tv.sh for manual override
- Add --year option to rip_tv.sh for TV special identification
- Update README.md with disc type override documentation and examples
- Document size estimation behavior and fallback rates
- Remove 'make shell' target; 'make activate' now launches an activated shell
- Add 'make rip-tv' target alongside 'make rip-movie'
- Set bash as recipe shell for consistency (SHELL := /bin/bash)
- Document TMDB_API_KEY and OMDB_API_KEY environment variables
- Clarify OMDB is primary metadata source, TMDB is fallback
- Add Mermaid diagrams: workflow, architecture, and activation sequence
- Update all examples to use Make targets
- Remove Windows-specific documentation (PowerShell, cmd)
- Improve diagram readability with explicit dark text colors
- Update EXAMPLES.md with Make target usage examples
- Reorder help targets by logical workflow: Setup → Activation → Usage → Cleanup
- Clarify that 'make all' and 'make install' are equivalent
- Note that 'make venv' creates environment only (without dependencies)
- Add practical examples for rip-movie and rip-tv with OUTPUT and EXTRA_ARGS
- Improve readability with section headers and spacing
rip_disc.py: Suppress verbose MakeMKV scan output after first title, add concise progress logs (per-title captions shown once, include position in progress bar), and show percentage updates as titles rip. Report found title count upfront. Add informative logs when OMDB/TMDB lookups succeed; warn when falling back to disc title. Support starting episode numbering via new --episode-start CLI flag and propagate through process_disc() and rename flow.

rip_tv.sh: Add --episode-start option, propagate to Python script, and update help text for clarity.

README.md: Enable Mermaid theme initialization for activation sequence diagram for better visual consistency.
mapitman (Owner) commented

I decided to do this in .NET instead.

mapitman closed this Jan 10, 2026