helplanes/audio_chunker

Audio Chunker Tool

A Python CLI tool to split audio files into chunks based on silence detection. Perfect for preparing datasets for machine learning (TTS, ASR) or simply breaking down long recordings.

Features

  • Smart Splitting: Uses pydub's split_on_silence to find natural breaks in audio.
  • Configurable: Adjust silence threshold, minimum silence length, and padding.
  • Robust: Handles various audio formats (wav, mp3, flac, etc.) supported by ffmpeg.
  • User-Friendly: Interactive CLI with progress bars and colorful output.
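
To make the three tuning knobs concrete, here is a simplified, pure-Python illustration of silence-based splitting. It is not pydub's implementation and not the code in chunker.py: the function name find_chunks and the "one loudness reading per millisecond" input are assumptions made for the sketch, and only the parameter names and defaults mirror the options documented in this README.

```python
from typing import List, Tuple


def find_chunks(
    levels_dbfs: List[float],
    min_silence_len: int = 1000,
    silence_thresh: float = -40.0,
    keep_silence: int = 200,
) -> List[Tuple[int, int]]:
    """Return (start_ms, end_ms) ranges of non-silent audio.

    levels_dbfs holds one loudness reading per millisecond, which is a
    simplification chosen for this illustration (hypothetical input format).
    """
    total = len(levels_dbfs)

    # Step 1: collect runs of quiet samples lasting at least min_silence_len.
    silences = []
    run_start = None
    for i, level in enumerate(levels_dbfs + [0.0]):  # sentinel closes a trailing run
        quiet = i < total and level < silence_thresh
        if quiet and run_start is None:
            run_start = i
        elif not quiet and run_start is not None:
            if i - run_start >= min_silence_len:
                silences.append((run_start, i))
            run_start = None

    # Step 2: chunks are the gaps between those silences, padded on each
    # side by keep_silence and clamped to the bounds of the recording.
    chunks = []
    cursor = 0
    for s_start, s_end in silences + [(total, total)]:
        if s_start > cursor:
            chunks.append(
                (max(0, cursor - keep_silence), min(total, s_start + keep_silence))
            )
        cursor = s_end
    return chunks
```

Pauses shorter than min_silence_len stay inside a chunk rather than splitting it, which is why raising that value tends to produce fewer, longer chunks.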

Prerequisites

  1. Python 3.6+
  2. FFmpeg: This tool requires FFmpeg for audio processing.
    • macOS: brew install ffmpeg
    • Ubuntu/Debian: sudo apt install ffmpeg
    • Windows: Download FFmpeg and add the folder containing ffmpeg.exe to PATH.
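
To confirm that FFmpeg is reachable before running the tool, a small check like the following may help. This is a sketch using only the Python standard library; the helper name ffmpeg_available is hypothetical and is not part of chunker.py.

```python
import shutil


def ffmpeg_available(executable: str = "ffmpeg") -> bool:
    """Return True if the given executable can be found on PATH."""
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("FFmpeg found on PATH.")
    else:
        print("FFmpeg not found; install it before running the chunker.")
```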

Installation

  1. Clone this repository:

    git clone <your-repo-url>
    cd audio_chunker
  2. Install Python dependencies:

    pip install -r requirements.txt

Usage

If you installed the dependencies into a virtual environment, activate it first (the command below assumes it is named venv):

source venv/bin/activate

Then run the script:

python chunker.py /path/to/your/audio_file.wav

Options

| Option | Flag | Default | Description |
| --- | --- | --- | --- |
| Output Directory | -o, --output-dir | chunks | Folder to save the output files. |
| Min Silence Len | -m, --min-silence-len | 1000 | Minimum length of silence (ms) to split on. |
| Silence Threshold | -t, --silence-thresh | -40 | Threshold (dBFS) below which audio is considered silence. |
| Keep Silence | -k, --keep-silence | 200 | Silence (ms) to keep at start/end of chunks. |
| Prefix | -p, --prefix | chunk | Filename prefix for chunks (e.g., chunk_001.wav). |
| Format | -f, --format | wav | Output format (wav, mp3, etc.). |
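
For readers who want to see how these flags fit together, the sketch below declares them with argparse from the standard library. It is an assumption about how such a CLI could be wired, not the actual parser in chunker.py: the positional name input_file, the help strings, and the int types are illustrative choices, while the flags and defaults come from the table above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="chunker.py", description="Split an audio file on silence."
    )
    # Positional argument name is hypothetical; the README only shows a path.
    parser.add_argument("input_file", help="Path to the audio file to split.")
    parser.add_argument("-o", "--output-dir", default="chunks",
                        help="Folder to save the output files.")
    parser.add_argument("-m", "--min-silence-len", type=int, default=1000,
                        help="Minimum length of silence (ms) to split on.")
    parser.add_argument("-t", "--silence-thresh", type=int, default=-40,
                        help="Threshold (dBFS) to consider as silence.")
    parser.add_argument("-k", "--keep-silence", type=int, default=200,
                        help="Silence (ms) to keep at start/end of chunks.")
    parser.add_argument("-p", "--prefix", default="chunk",
                        help="Filename prefix for chunks.")
    parser.add_argument("-f", "--format", default="wav",
                        help="Output format (wav, mp3, etc.).")
    return parser


# Parse the same arguments used in the second example of this README.
args = build_parser().parse_args(
    ["podcast.wav", "-m", "500", "-t", "-50", "-o", "output_folder"]
)
print(args.min_silence_len, args.silence_thresh, args.output_dir)
```

Options that are not passed keep the defaults listed in the table, so in this call keep_silence, prefix, and format remain 200, chunk, and wav.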

Examples

Basic split:

python chunker.py interview.mp3

Custom settings (split on pauses as short as 500 ms, count only audio below -50 dBFS as silence, and write chunks to output_folder):

python chunker.py podcast.wav -m 500 -t -50 -o output_folder
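
When choosing a threshold, it can help to see what a dBFS value means as a fraction of full-scale amplitude. The conversion below uses the standard amplitude relation (ratio = 10^(dBFS/20)); the helper name dbfs_to_amplitude_ratio is hypothetical and is not part of the tool.

```python
def dbfs_to_amplitude_ratio(dbfs: float) -> float:
    """Convert a dBFS level to a linear amplitude ratio relative to full scale."""
    return 10 ** (dbfs / 20)


# Compare the default threshold with the stricter one from the example above.
for thresh in (-40, -50):
    ratio = dbfs_to_amplitude_ratio(thresh)
    print(f"{thresh} dBFS is {ratio:.4%} of full scale")
```

At -40 dBFS the cutoff sits at about 1% of full-scale amplitude, while -50 dBFS sits at roughly 0.32%, so the lower setting treats less of the recording as silence.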
