Code for "Reverse the auditory processing pathway: Coarse-to-fine audio reconstruction from human brain activity"
Brain2Sound Dataset: https://github.com/KamitaniLab/SoundReconstruction
Brain2Music Dataset: https://openneuro.org/datasets/ds003720
Brain2Speech Dataset: https://openneuro.org/datasets/ds003020/versions/1.1.1
Download pretrained.pth from AudioMAE.
Download audioldm2-full.pth and audioldm2-speech-gigaspeech.pth from AudioLDM2.
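As an optional sanity check, the snippet below confirms that the downloaded checkpoints deserialize. The file locations are an assumption; point the paths at wherever you saved them.

import torch

# Assumed locations: repository root. Adjust the paths if the files live elsewhere.
for name in ["pretrained.pth", "audioldm2-full.pth", "audioldm2-speech-gigaspeech.pth"]:
    ckpt = torch.load(name, map_location="cpu")
    keys = list(ckpt)[:5] if isinstance(ckpt, dict) else type(ckpt).__name__
    print(name, "->", keys)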
Follow the steps below to set up the virtual environment.
Create and activate the environment:
conda create -n c2f_ldm python=3.10
conda activate c2f_ldm

Then install the dependencies:
pip install -r requirements.txt
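To verify the environment before moving on, a quick import check (this assumes torch is among the pinned requirements):

import torch

print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())

First, extract the semantic features of the ground truth: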
python semantic_decoding/extract_gt_feat.py -d brain2sound
python semantic_decoding/extract_gt_feat.py -d brain2music
python semantic_decoding/extract_gt_feat.py -d brain2speech

Next, perform L2-regularized linear regression to decode these semantic features from brain activity (a minimal sketch follows the commands below):
python semantic_decoding/sound_decoding.py
python semantic_decoding/music_decoding.py
python semantic_decoding/speech_decoding.py
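The decoding scripts above implement this regression. For orientation, here is a minimal self-contained sketch of L2-regularized (ridge) regression; the scikit-learn API, the array shapes, and the alpha value are illustrative assumptions, not necessarily what the scripts use:

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X_train = rng.standard_normal((200, 5000))  # stand-in fMRI features (trials x voxels)
Y_train = rng.standard_normal((200, 768))   # stand-in semantic features (trials x dims)
X_test = rng.standard_normal((50, 5000))

model = Ridge(alpha=100.0)  # placeholder penalty; the scripts pick their own regularization
model.fit(X_train, Y_train)
Y_pred = model.predict(X_test)  # decoded semantic features, shape (50, 768)

To train the acoustic decoder, specify the subject ID in the configuration file and then run: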
CUDA_VISIBLE_DEVICES=1 python acoustic_decoding/train_AcousticDecoder.py -c configs/brain2sound.yaml
CUDA_VISIBLE_DEVICES=1 python acoustic_decoding/train_AcousticDecoder.py -c configs/brain2music.yaml
CUDA_VISIBLE_DEVICES=1 python acoustic_decoding/train_AcousticDecoder.py -c configs/brain2speech.yaml

To train the latent diffusion model for reconstruction, specify the subject ID and the checkpoint path of the pretrained AcousticDecoder in the configuration file (a hedged example of editing these fields follows the commands below) and then run:
CUDA_VISIBLE_DEVICES=1 python reconstruction/train_LDM.py -c configs/brain2sound.yaml --reload_from_ckpt audioldm2-full
CUDA_VISIBLE_DEVICES=1 python reconstruction/train_LDM.py -c configs/brain2music.yaml --reload_from_ckpt audioldm2-full
CUDA_VISIBLE_DEVICES=1 python reconstruction/train_LDM.py -c configs/brain2speech.yaml --reload_from_ckpt audioldm2-speech-gigaspeech
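For reference, one way to set the two fields programmatically. The key names subject_id and acoustic_decoder_ckpt, the subject ID, and the checkpoint path are all hypothetical; check configs/*.yaml for the names the repository actually uses.

import yaml  # PyYAML

with open("configs/brain2sound.yaml") as f:
    cfg = yaml.safe_load(f)

cfg["subject_id"] = "sub-01"  # hypothetical key and subject ID
cfg["acoustic_decoder_ckpt"] = "checkpoints/acoustic_decoder.ckpt"  # hypothetical key and path

with open("configs/brain2sound.yaml", "w") as f:
    yaml.safe_dump(cfg, f)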