
# QuantDisBrain

Code for *Quantization and Disentanglement for Cross-Modal Alignment in Neural Speech Reconstruction from Brain Activity*.

## 1  Download

### 1.1  Datasets

- [Gwilliams et al. Dataset](https://osf.io/ag3kj)
- [Armeni et al. Dataset](https://data.ru.nl/collections/di/dccn/DSC_3011085.05_995)
- [GigaSpeech Dataset (XS)](https://github.com/SpeechColab/GigaSpeech)

### 1.2  Checkpoints

- Download `ns3_facodec_encoder.bin` and `ns3_facodec_decoder.bin` from FACodec.
- Download `pretrained.pth` from AudioMAE.
- Download `audioldm2-speech-gigaspeech.pth` from AudioLDM2.
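Before training, it can help to verify that all four weight files are in place. The sketch below assumes they are collected in a local `checkpoints/` directory; that directory name is our assumption for illustration, not a layout documented by the repo.

```python
from pathlib import Path

# The four pretrained weight files listed above.
EXPECTED = [
    "ns3_facodec_encoder.bin",
    "ns3_facodec_decoder.bin",
    "pretrained.pth",
    "audioldm2-speech-gigaspeech.pth",
]

def missing_checkpoints(ckpt_dir="checkpoints"):
    """Return the expected checkpoint files not yet present in ckpt_dir."""
    root = Path(ckpt_dir)
    return [name for name in EXPECTED if not (root / name).is_file()]

if __name__ == "__main__":
    for name in missing_checkpoints():
        print(f"missing: {name}")
```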

## 2  Environment

Follow the steps below to set up the virtual environment.

Create and activate the environment:

```shell
conda create -n QDBrain python=3.10
conda activate QDBrain
```

Install the dependencies:

```shell
pip install -r requirements.txt
```

## 3  Training

### 3.1  Stage 1

```shell
CUDA_VISIBLE_DEVICES=1 python train_proj.py
```

### 3.2  Stage 2

```shell
CUDA_VISIBLE_DEVICES=1 python train_disentangle.py
```
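The two stages can also be chained from one driver script. The sketch below simply mirrors the commands above; running Stage 2 only after Stage 1 exits cleanly is our assumption about the intended order, not something the repo enforces.

```python
import os
import subprocess
import sys

def run_stage(script, gpu="1"):
    """Run one training-stage script with CUDA_VISIBLE_DEVICES set, as in the commands above."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
    return subprocess.run([sys.executable, script], env=env).returncode

if __name__ == "__main__":
    # Stage 2 (disentanglement) starts only if Stage 1 (projection) succeeded.
    for script in ("train_proj.py", "train_disentangle.py"):
        if run_stage(script) != 0:
            sys.exit(1)
```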

