ECE6123_Project_VoiceEncoder

This repo is the course project for ECE6123 Machine Learning. See the PDF file for the full report.

Model, Data Preprocessing and Training

This code is the encoder part of the Real-Time-Voice-Cloning project, including its preprocessing and training scripts. Using these scripts, we trained our own encoder model, saved at './encoder/saved_models/my_run.pt', for 173,001 steps. Its performance can be visualized with a UMAP projection of 10 speakers (10 utterances each).
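As a rough illustration of how such a projection could be produced, here is a minimal sketch. It is not the code used in this repo: the function names `make_synthetic_embeddings` and `project_umap` are hypothetical, the embeddings are synthetic stand-ins for real encoder outputs, and the embedding size of 256 is an assumption. The umap-learn and NumPy packages are imported lazily inside `project_umap`, so the data preparation can be run without them.

```python
import math
import random


def make_synthetic_embeddings(n_speakers=10, n_utterances=10, dim=256, seed=0):
    """Return (embeddings, labels) with one noisy cluster per speaker.

    Synthetic stand-in for real encoder outputs; dim=256 is an assumption.
    """
    rng = random.Random(seed)
    embeddings, labels = [], []
    for speaker in range(n_speakers):
        # One random centroid per speaker.
        centroid = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for _ in range(n_utterances):
            # Each utterance is the centroid plus small noise, L2-normalized.
            vec = [c + rng.gauss(0.0, 0.1) for c in centroid]
            norm = math.sqrt(sum(v * v for v in vec))
            embeddings.append([v / norm for v in vec])
            labels.append(speaker)
    return embeddings, labels


def project_umap(embeddings, seed=0):
    """Project embeddings to 2-D with umap-learn (requires umap-learn, numpy)."""
    import numpy as np
    import umap

    reducer = umap.UMAP(n_components=2, random_state=seed)
    return reducer.fit_transform(np.asarray(embeddings))
```

Calling `project_umap(embeddings)` on the output of `make_synthetic_embeddings()` returns one 2-D point per utterance, which can then be scatter-plotted and colored by `labels`. With a well-trained encoder, utterances from the same speaker would be expected to form tight, well-separated clusters.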

GUI

We also wrote our own GUI tool, gui.py, to visualize the model's output. The GUI lets you load or record an utterance and shows the preprocessing steps and the model's output embedding. It also plots the UMAP projection of previous inputs.

