Seonghyeon Go* · Yumin Kim*
MIPPIA Inc.
The SMP (Segment-based Music Plagiarism) dataset contains music plagiarism detection pairs with temporal segment annotations. Each row represents a pair of songs with identified similar segments.

| Column | Description |
|---|---|
| `ori_title` | Title of the original song |
| `comp_title` | Title of the comparison song |
| `ori_link` | YouTube link to the original song |
| `comp_link` | YouTube link to the comparison song |
| `relation` | Relationship type (`plag` for plagiarism) |
| `ori_times` | List of start times (in seconds) of similar segments in the original song |
| `comp_times` | List of start times (in seconds) of similar segments in the comparison song |
| `pair_number` | Unique identifier for song pairs |
| `acoustic_idx` | Unique identifier for segment pairs |

- Time annotations: JSON-formatted lists containing start times of similar segments
- Temporal alignment: `ori_times` and `comp_times` correspond to matching similar segments between songs
- Segment duration: Each segment represents a temporally coherent musical phrase or motif
- Total pairs: Multiple song pairs with plagiarism relationships
- Temporal annotations: Precise start times for similar musical segments
- Multi-language: Includes both English and Korean songs
Our code and demo website are licensed under a GPL License.