Variation-Transformer

This repository contains the source code to reproduce the results of the paper: Gao et al., Variation Transformer: New datasets, models, and comparative evaluation for symbolic music variation generation, in ISMIR 2024.

Demo page

https://variation-transformer.glitch.me or https://chenyugao-cs.github.io/variation-transformer-demo/

All the materials used in our listening study are also available on the demo page.

Trained models and datasets

https://github.com/ChenyuGAO-CS/Variation-Transformer-Data-and-Model

Dependencies

pip install -r requirements.txt

If you are interested in testing/training the fast-Transformer model, you will also need to install:

pip install pytorch-fast-transformers==0.4.0

Reproducing Results

To generate variations using the models we trained, download the corresponding models and datasets from the repository listed under "Trained models and datasets", and store them in the corresponding folders (e.g., we store models in the './trained_models/' folder and datasets in the './dataset/' folder).

The examples below use a model trained on the POP909-TVar dataset to generate a variation.

Change --lm to try a model trained on the VGMIDI-TVar dataset, and change --input to use a different theme as input.

1. Generate a variation by using the Variation Transformer

Run the script gen_VaTr_var_user_input.py, located in the workspace folder, from the repository root:

$ python3 workspace/gen_VaTr_var_user_input.py --lm trained_models/VaTr_pop909_epoch_10.pth \
                           --seq_len 1025 \
                           --n_bars 16 \
                           --p 0.9 \
                           --save_to out/VaTr_variation1.mid \
                           --input dataset/POP909-TVar/test/052_B_0.mid
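
The --p 0.9 option is not explained in this README. Assuming it sets a nucleus (top-p) sampling threshold, which is a guess and not confirmed here, the filtering step could look roughly like the sketch below. The function names and the toy distribution are hypothetical and are not taken from the repository's scripts.

```python
import random

def top_p_filter(probs, p=0.9):
    """Keep the smallest set of most likely tokens whose total mass reaches p.

    Returns {token_id: renormalized_probability}.
    """
    # Rank token ids from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for token_id in ranked:
        kept.append(token_id)
        mass += probs[token_id]
        if mass >= p:
            break
    # Renormalize so the kept probabilities sum to 1.
    return {token_id: probs[token_id] / mass for token_id in kept}

def sample_top_p(probs, p=0.9, rng=random):
    """Draw one token id from the filtered distribution."""
    filtered = top_p_filter(probs, p)
    token_ids = list(filtered)
    weights = [filtered[t] for t in token_ids]
    return rng.choices(token_ids, weights=weights, k=1)[0]

# Toy next-token distribution over four made-up token ids.
toy_probs = [0.5, 0.3, 0.15, 0.05]
nucleus = top_p_filter(toy_probs, p=0.9)
```

Under this reading, a lower threshold restricts sampling to fewer high-probability tokens, and a threshold of 1.0 samples from the full distribution.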

2. Generate a variation by using the Music Transformer

Run the script gen_MuTr_var_user_input.py, located in the workspace folder, from the repository root:

$ python3 workspace/gen_MuTr_var_user_input.py --lm trained_models/MuTr_pop909_epoch_10.pth \
                           --seq_len 1025 \
                           --n_bars 16 \
                           --p 0.9 \
                           --save_to out/MuTr_variation1.mid \
                           --input dataset/POP909-TVar/test/052_B_0.mid

3. Generate a variation by using the fast-Transformer

Run the script gen_FaTr_var_user_input.py, located in the workspace folder, from the repository root:

$ python3 workspace/gen_FaTr_var_user_input.py --lm trained_models/FaTr_pop909_epoch_10.pth \
                           --seq_len 1025 \
                           --n_bars 16 \
                           --p 0.9 \
                           --save_to out/FaTr_variation1.mid \
                           --input dataset/POP909-TVar/test/052_B_0.mid

4. Variation Markov

Please follow the guidance on this page to use Variation Markov.

Model Training

1. Download the datasets

Please download POP909-TVar or VGMIDI-TVar from the repository listed under "Trained models and datasets".

2. Data preprocessing for variation generation:

The pre-processing step consists of augmenting the data, encoding it with REMI and compiling the encoded pieces as a numpy array. Please find scripts for data pre-processing in the 'dataset' folder.

Go to the 'dataset' folder:

$ cd dataset

Each theme-variation pair is stored as one row of the numpy array, with the token '520' separating the theme sequence from the variation sequence.
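
The row layout described above can be sketched as follows. This is a minimal illustration, assuming the separator value 520 never occurs as a regular token and ignoring any padding the real scripts may apply. The helper names and token ids are made up for the example.

```python
SEP_TOKEN = 520  # separator between theme and variation, as stated in this README

def join_pair(theme_tokens, variation_tokens):
    """Build one row: theme tokens, the separator, then variation tokens."""
    return list(theme_tokens) + [SEP_TOKEN] + list(variation_tokens)

def split_pair(row):
    """Recover (theme, variation) by splitting at the first separator."""
    sep_index = row.index(SEP_TOKEN)
    return row[:sep_index], row[sep_index + 1:]

theme = [1, 17, 42]          # made-up token ids
variation = [1, 19, 42, 7]   # made-up token ids
row = join_pair(theme, variation)
recovered_theme, recovered_variation = split_pair(row)
```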

2.1. Data Augmentation

All pieces were augmented by (a) transposing to every key, and (b) increasing and decreasing the tempo by 10%, as described by Oore et al. (2017) and Ferreira et al. (2022).

$ python3 augment.py --path_indir POP909-TVar --path_outdir 909_augmented
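
As a rough illustration of the two augmentations, the sketch below works on a toy piece instead of MIDI files. It assumes that "every key" means the 12 semitone shifts from -5 to +6 and that the original tempo is kept alongside the two scaled ones; neither detail is confirmed by this README, and augment.py may behave differently. The data layout and function name are hypothetical.

```python
def augment(piece, shifts=range(-5, 7), tempo_factors=(0.9, 1.0, 1.1)):
    """Return transposed and tempo-scaled copies of a toy piece.

    piece: {"tempo_bpm": float, "notes": [(pitch, start_beat, duration_beats), ...]}
    """
    versions = []
    for shift in shifts:
        notes = [(pitch + shift, start, dur) for pitch, start, dur in piece["notes"]]
        # Skip transpositions that would leave the MIDI pitch range 0..127.
        if any(not 0 <= pitch <= 127 for pitch, _, _ in notes):
            continue
        for factor in tempo_factors:
            versions.append({
                "shift": shift,
                "tempo_bpm": piece["tempo_bpm"] * factor,
                "notes": notes,
            })
    return versions

piece = {"tempo_bpm": 120.0, "notes": [(60, 0.0, 1.0), (64, 1.0, 1.0), (67, 2.0, 2.0)]}
augmented = augment(piece)
```

With these assumed settings, each input piece yields 12 transpositions times 3 tempi.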

2.2. REMI Encoding

We encoded all pieces using REMI (Huang and Yang, 2020).

$ python3 encoder.py --path_indir 909_augmented --path_outdir 909_encoded
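
For readers unfamiliar with REMI, the sketch below shows the general idea of a bar-and-position event stream. It is a simplified, REMI-like illustration only: it assumes a 16-step grid per bar, omits Tempo and Chord events, and uses made-up event names. The actual vocabulary and token ids produced by encoder.py are not shown in this README.

```python
POSITIONS_PER_BAR = 16  # assumed sixteenth-note grid in 4/4

def encode_remi_like(notes):
    """Turn notes into a flat list of REMI-like event strings.

    notes: list of (pitch, onset_step, duration_steps, velocity),
    with onsets and durations measured in grid steps.
    """
    events = []
    current_bar = -1
    # Process notes in time order, lowest pitch first on ties.
    for pitch, onset, duration, velocity in sorted(notes, key=lambda n: (n[1], n[0])):
        bar, position = divmod(onset, POSITIONS_PER_BAR)
        # Emit one Bar event for every bar line crossed, including empty bars.
        while current_bar < bar:
            current_bar += 1
            events.append("Bar")
        events.append(f"Position_{position + 1}/{POSITIONS_PER_BAR}")
        events.append(f"Velocity_{velocity}")
        events.append(f"Pitch_{pitch}")
        events.append(f"Duration_{duration}")
    return events

notes = [(60, 0, 4, 80), (64, 4, 4, 80), (67, 16, 8, 90)]
events = encode_remi_like(notes)
```

A real encoder would then map each event string to an integer id from a fixed vocabulary.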

2.3. Compile pieces into a numpy array

Then, compile the theme-and-variation pairs for model training. The token '520' will be used to separate the theme sequence and the variation sequence.

Each piece whose name starts with "songNum_phraseNum_0" will be used as a theme, which will be combined with all other pieces with the same song number, phrase number, key signature, and tempo.
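
The pairing rule above can be sketched as follows. The sketch assumes names of the form songNum_phraseNum_index, with index 0 marking the theme, and leaves out the key-signature and tempo matching, since the naming of augmented files is not described here. The function name and file names are illustrative only.

```python
from collections import defaultdict

def pair_themes_with_variations(names):
    """Pair each theme (index 0) with the other pieces of the same song and phrase."""
    groups = defaultdict(dict)
    for name in names:
        song, phrase, index = name.split("_")[:3]
        groups[(song, phrase)][int(index)] = name
    pairs = []
    for key in sorted(groups):
        members = groups[key]
        if 0 not in members:
            continue  # no theme available for this phrase
        theme = members[0]
        for index in sorted(members):
            if index != 0:
                pairs.append((theme, members[index]))
    return pairs

names = ["052_B_0", "052_B_1", "052_B_2", "053_A_0", "053_A_1", "054_C_1"]
pairs = pair_themes_with_variations(names)
```

In this toy input, "054_C_1" produces no pair because its phrase has no piece with index 0.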

Run the script below to compile pieces from the POP909-TVar dataset in a numpy array:

$ python3 compile_for_var_gen_909.py \
          --path_train_indir 909_encoded/train \
          --path_test_indir 909_encoded/test \
          --path_outdir 909_compiled \
          --max_len 512 \
          --task language_modeling

Run the script below to compile pieces from the VGMIDI-TVar dataset in a numpy array:

$ python3 compile_for_var_gen_vgmidi.py \
          --path_train_indir vgmidi_encoded/train \
          --path_test_indir vgmidi_encoded/test \
          --path_outdir vgmidi_compiled \
          --max_len 512 \
          --task language_modeling

3. Model training

Use our training scripts (train_VaTr.py, train_MuTr.py, train_FaTr.py) to train variation generation models.

An example usage:

$ python3 workspace/train_VaTr.py \
          --train dataset/909_compiled/language_modeling_train.npz \
          --test dataset/909_compiled/language_modeling_test.npz \
          --seq_len 1025 --batch_size 16 \
          --save_to trained_models/VaTr/VaTr_epoch_{}.pth \
          --epochs 16
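
Two details of the command above can be made concrete with a small sketch. It assumes that the '{}' in --save_to is filled with the epoch number, counted from 1, and that --seq_len 1025 corresponds to a theme of up to 512 tokens, one separator token, and a variation of up to 512 tokens. Both readings are inferred from the values in this README and are not confirmed by it; the helper name is hypothetical.

```python
def checkpoint_paths(template, epochs):
    """Expand a '{}' path template into one checkpoint path per epoch."""
    return [template.format(epoch) for epoch in range(1, epochs + 1)]

paths = checkpoint_paths("trained_models/VaTr/VaTr_epoch_{}.pth", epochs=16)

# Assumed relation between --max_len (compile step) and --seq_len (training).
MAX_LEN = 512
SEQ_LEN = MAX_LEN + 1 + MAX_LEN  # theme + separator + variation
```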

Theme-and-variation extraction

Please follow the steps on this page if you are interested in running our theme-and-variation extraction algorithms on your datasets.

Citing this Work

If you use our method or datasets in your research, please cite:

@inproceedings{gao2024variation,
  title={{Variation Transformer}: New datasets, models, and comparative evaluation for symbolic music variation generation},
  author={Gao, Chenyu and Reuben, Federico and Collins, Tom},
  booktitle={the 25th International Society for Music Information Retrieval Conference},
  year={2024}
}
