ViTex: Visual Texture Control for Multi-track Symbolic Music Generation via Discrete Diffusion Models
This repository contains the official implementation of ViTex, a discrete diffusion-based model for controllable multi-track symbolic music generation with visual texture conditioning.
We recommend using Python 3.12.
Install the dependencies with:

```bash
pip install torch tensorboard librosa muspy accelerate pydub
```

Our processed dataset and pretrained checkpoint are available here: 👉 Google Drive Folder
Contents:
- `d3pm.ckpt`: pretrained discrete diffusion model weights
- `all_pkl.tar.gz`: training set
- `pkl_test.tar.gz`: test set
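Assuming the downloads sit in your current directory, the archives can be extracted with:

```bash
tar -xzf pkl_test.tar.gz   # test set (needed for inference)
tar -xzf all_pkl.tar.gz    # training set (only needed for training)
```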
- Download and extract the checkpoint and test set.
- Modify the file paths in `utils/inference_utils.py`, specifically in the functions `get_model()` and `get_dataset()`. Update these paths to match where you extracted the files (a placeholder sketch follows this list).
- Run the provided inference script:

  ```bash
  bash run.sh
  ```

  This script randomly samples chord progressions and ViTex conditions from the test set, and calls `pipeline.py` to generate new music examples.
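For the second step, the edit amounts to pointing the two loader functions at your local copies. The constants and function bodies below are placeholders, not the repository's actual code:

```python
# utils/inference_utils.py -- placeholder sketch; the real function
# bodies in the repository differ.
import torch

CKPT_PATH = "/path/to/d3pm.ckpt"   # update to your extracted checkpoint
TEST_DIR = "/path/to/pkl_test"     # update to your extracted test set

def get_model():
    # Load the pretrained discrete diffusion weights from the local path.
    state = torch.load(CKPT_PATH, map_location="cpu")
    ...

def get_dataset():
    # Point the dataset loader at the extracted .pkl test set.
    ...
```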
You can modify `pipeline.py` or the command-line arguments to adjust configurations such as:
- Conditional vs. unconditional generation
- Guidance scales for each condition
- Inpainting or continuation modes
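For example, a guided conditional run might look like the following; these flag names are hypothetical, so check the argument parser in `pipeline.py` for the actual options:

```bash
# Hypothetical flags -- consult pipeline.py for the real argument names.
python pipeline.py \
    --conditional \
    --chord_guidance 2.0 \
    --texture_guidance 1.5 \
    --mode inpaint
```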
If you wish to use our preprocessed dataset, simply extract the provided archive.
To train on your own MIDI collection, follow these steps inside the `data_preprocess` folder:
- Filter invalid MIDI files:

  ```bash
  python filter.py
  ```

  This filters out non-4/4 tracks, extreme BPM values, missing drum tracks, etc. (See the `process_midi()` function for the detailed filtering rules; a sketch of this style of check follows the list.)

- Normalize tempo and instrumentation:

  ```bash
  python normalize.py
  ```

  This step standardizes the BPM to 120 and maps instruments to a predefined set of 12 categories.
- Preprocess multi-track data:

  ```bash
  python preprocess_multi.py
  ```

  Extracts chord and instrumentation information and saves them as `.pkl` files.

- Split into training and test sets:

  ```bash
  python split.py
  ```

  Splits the preprocessed data at the song level.
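To give a sense of what the filtering step checks, here is a minimal sketch of `process_midi()`-style rules using `muspy` (one of the listed dependencies); the BPM thresholds are illustrative, not the repository's actual values:

```python
# Sketch of the kind of rules process_midi() applies; the real
# implementation in data_preprocess/filter.py may differ in detail.
import muspy

def passes_filters(path: str) -> bool:
    music = muspy.read_midi(path)
    # Keep only 4/4 pieces.
    if any(ts.numerator != 4 or ts.denominator != 4
           for ts in music.time_signatures):
        return False
    # Reject extreme tempi (thresholds here are illustrative).
    if any(not 40 <= t.qpm <= 200 for t in music.tempos):
        return False
    # Require at least one drum track.
    if not any(track.is_drum for track in music.tracks):
        return False
    return True
```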
After these steps, you’ll have a directory containing `.pkl` files ready for training.
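For convenience, the four steps can be chained in one shell session, assuming each script reads the previous step's output from its default location:

```bash
# Run the full preprocessing pipeline in order (check each script for
# its expected input and output directories).
cd data_preprocess
python filter.py
python normalize.py
python preprocess_multi.py
python split.py
```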
Set the path to your dataset in the `training_config` section of `train.py`, then launch training:
```bash
accelerate launch train.py
```

If you are using multiple GPUs, run `accelerate config` to match your hardware setup before launching, as in the example below.
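`accelerate config` and the `--multi_gpu`/`--num_processes` options are standard 🤗 Accelerate CLI features; the GPU count below is just an example:

```bash
# One-time interactive setup of the accelerate configuration:
accelerate config

# Or specify the topology directly at launch time (example: 4 GPUs):
accelerate launch --multi_gpu --num_processes 4 train.py
```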