Song Extractor

This directory contains a Python script (songSplitter.py) to automatically transcribe an audio file, split it into individual word segments, and generate a JSON mapping file. This is useful for creating datasets for projects that need audio corresponding to specific words.

This script was used for an April Fools' Day event in which Twitch chat could play a song word by word by typing messages.

Requirements

Python 3.x
OpenAI Whisper: For audio transcription.
pydub: For audio manipulation (splitting).
ffmpeg: Required by pydub for handling various audio formats (like MP3). Ensure ffmpeg is installed and accessible in your system's PATH.

You can install the Python libraries using pip:

pip install -U openai-whisper pydub
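Since pydub depends on an ffmpeg executable being reachable through PATH, it can help to check for it before running the script. The following is a minimal sketch using only the standard library; the helper name find_tool is hypothetical and not part of songSplitter.py.

```python
import shutil


def find_tool(name="ffmpeg"):
    """Return the full path of an executable found on PATH, or None if it is missing."""
    return shutil.which(name)


ffmpeg_path = find_tool("ffmpeg")
if ffmpeg_path is None:
    print("ffmpeg was not found on PATH; pydub will likely fail on MP3 files")
else:
    print(f"ffmpeg found at {ffmpeg_path}")
```

If the check reports that ffmpeg is missing, install it with your system's package manager and re-run the check in a new shell.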

Usage

The songSplitter.py script takes the path to an audio file as input and performs the transcription and splitting process.

python extractor/songSplitter.py <audio_file_path> [options]
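The transcription and splitting process described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual contents of songSplitter.py: it assumes Whisper is asked for word-level timestamps (word_timestamps=True) and that pydub slices the audio by millisecond ranges. The helper names normalize_word, build_mapping and split_song are hypothetical.

```python
import json
import re
from pathlib import Path


def normalize_word(raw):
    """Lowercase a word token and strip surrounding whitespace and punctuation."""
    return re.sub(r"^[^\w']+|[^\w']+$", "", raw.strip().lower())


def build_mapping(words, output_dir="output"):
    """Turn word timings into (mapping entries, millisecond ranges).

    `words` is a list of dicts with "word", "start" and "end" (in seconds),
    the shape Whisper produces when word_timestamps=True.
    """
    entries, ranges = [], []
    for item in words:
        text = normalize_word(item["word"])
        if not text:
            continue  # skip tokens that were only punctuation
        filename = f"{len(entries):03d}.mp3"
        entries.append({"word": text, "sound": f"{output_dir}/{filename}"})
        ranges.append((round(item["start"] * 1000), round(item["end"] * 1000)))
    return entries, ranges


def split_song(audio_path, output_dir="output", json_path="splitsong.json",
               model_name="medium"):
    """Transcribe, split and write the mapping (needs openai-whisper, pydub, ffmpeg)."""
    import whisper  # imported lazily so the helpers above work without it
    from pydub import AudioSegment

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path, word_timestamps=True)
    words = [w for seg in result["segments"] for w in seg.get("words", [])]
    entries, ranges = build_mapping(words, output_dir)

    Path(output_dir).mkdir(parents=True, exist_ok=True)
    audio = AudioSegment.from_file(audio_path)
    for entry, (start_ms, end_ms) in zip(entries, ranges):
        # pydub AudioSegment objects are sliced in milliseconds
        audio[start_ms:end_ms].export(entry["sound"], format="mp3")

    with open(json_path, "w", encoding="utf-8") as fh:
        json.dump(entries, fh, indent=2)
    return entries
```

Keeping the timing and naming logic in build_mapping, separate from the Whisper and pydub calls, makes it possible to check the mapping on hand-written timings without loading a model.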

Arguments

<audio_file_path>: (Required) Path to the input audio file (e.g., rickroll.mp3).
-o, --output_dir: Directory to save the segmented audio files (defaults to output).
-j, --json_path: Path to save the output JSON mapping file (defaults to splitsong.json).
-m, --model: Whisper model name to use for transcription (e.g., tiny, base, small, medium, large). Defaults to medium. Larger models are more accurate but require more resources (VRAM/RAM) and time.
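For reference, a command-line interface with these arguments and defaults could be declared with argparse along the following lines. This is a sketch reconstructed from the argument list above, not the script's actual parser; the function name build_parser and the help strings are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented arguments and defaults."""
    parser = argparse.ArgumentParser(
        description="Transcribe an audio file and split it into per-word segments."
    )
    parser.add_argument("audio_file_path",
                        help="Path to the input audio file (e.g., rickroll.mp3)")
    parser.add_argument("-o", "--output_dir", default="output",
                        help="Directory to save the segmented audio files")
    parser.add_argument("-j", "--json_path", default="splitsong.json",
                        help="Path to save the output JSON mapping file")
    parser.add_argument("-m", "--model", default="medium",
                        help="Whisper model name (tiny, base, small, medium, large)")
    return parser


# Example: parse a command line without running anything else
args = build_parser().parse_args(["rickroll.mp3", "-m", "base"])
print(args.audio_file_path, args.output_dir, args.json_path, args.model)
```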

Example

Suppose the Rick Roll song is saved as rickroll.mp3 in the parent directory of the extractor directory. To process it using the base model, save the segments in a directory named rickroll_words, and write the mapping to rickroll_map.json:

python extractor/songSplitter.py ../rickroll.mp3 -m base -o rickroll_words -j rickroll_map.json

Output

The script will generate:

Segmented Audio Files: Inside the specified output directory (output or --output_dir), you will find numerous small MP3 files (e.g., 000.mp3, 001.mp3, 002.mp3, ...), each corresponding to a word detected in the original audio.
JSON Mapping File: A JSON file (splitsong.json or --json_path) containing a list of objects, where each object maps a detected word (lowercase) to its corresponding audio segment file path.

Example splitsong.json structure:

[
  {
    "word": "we're",
    "sound": "output/000.mp3"
  },
  {
    "word": "no",
    "sound": "output/001.mp3"
  },
  {
    "word": "strangers",
    "sound": "output/002.mp3"
  },
  {
    "word": "to",
    "sound": "output/003.mp3"
  },
  // ... more words
]
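A consumer of this file, such as a chat bot that plays one clip per typed word, might load the mapping and look up words as sketched below. The function names index_words, load_word_index and sounds_for_message are hypothetical, and the choice to use the first clip for a repeated word is an assumption made for illustration. Note that the `// ... more words` line in the example above is an elision marker: json.load does not accept comments, so a file passed to this code must be plain JSON.

```python
import json


def index_words(entries):
    """Group sound paths by word, keeping clips in their original order."""
    index = {}
    for entry in entries:
        index.setdefault(entry["word"], []).append(entry["sound"])
    return index


def load_word_index(json_path="splitsong.json"):
    """Read the mapping file and build the word index."""
    with open(json_path, encoding="utf-8") as fh:
        return index_words(json.load(fh))


def sounds_for_message(message, index):
    """Return the first clip for each known word in a message, skipping unknown words."""
    return [index[word][0] for word in message.lower().split() if word in index]


# Example with an in-memory mapping shaped like the structure above
example_index = index_words([
    {"word": "no", "sound": "output/001.mp3"},
    {"word": "strangers", "sound": "output/002.mp3"},
])
print(sounds_for_message("No strangers here", example_index))
```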
