sararahman1729/Audio-Classification-based-Research

Audio-Classification-based-Research

This project evaluates multiple audio classification architectures on the BanglaSER dataset to develop an effective Bangla Speech Emotion Recognition (SER) model. Several deep learning models (CNN, Bi-LSTM, CNN-BiLSTM, and pretrained YAMNet-based models) were implemented on extracted acoustic features such as MFCCs and Mel-spectrograms. In addition, a Support Vector Machine (SVM) classifier was trained on handcrafted features as a classical baseline. Hyperparameter optimization techniques (Grid Search, Random Search, and Bayesian Optimization), data augmentation, and different loss functions were explored to improve performance. Experimental results show that the CNN-based and Bi-LSTM architectures benefit from careful tuning, while the optimized SVM achieved the highest accuracy of all models tested, highlighting the importance of feature engineering and model selection for Bangla SER.
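To illustrate the difference between two of the hyperparameter optimization strategies mentioned above, here is a minimal, standard-library-only sketch of Grid Search and Random Search. It is not the code used in this repository: the search space values, the `toy_score` function (a stand-in for cross-validated accuracy), and the helper names `grid_search` and `random_search` are all assumptions made for illustration. Bayesian Optimization is not shown.

```python
import itertools
import math
import random

# Hypothetical search space for an RBF-kernel SVM (values are illustrative only).
SEARCH_SPACE = {
    "C": [0.1, 1.0, 10.0, 100.0],
    "gamma": [0.001, 0.01, 0.1],
}


def toy_score(params):
    """Stand-in for validation accuracy; replace with real cross-validation.

    The score peaks at C=10, gamma=0.01 purely for demonstration.
    """
    c_penalty = abs(math.log10(params["C"]) - 1.0)
    gamma_penalty = abs(math.log10(params["gamma"]) + 2.0)
    return 1.0 - 0.05 * c_penalty - 0.05 * gamma_penalty


def grid_search(space, score_fn):
    """Evaluate every combination in the space and return the best one."""
    keys = sorted(space)
    candidates = [
        dict(zip(keys, values))
        for values in itertools.product(*(space[k] for k in keys))
    ]
    return max(candidates, key=score_fn)


def random_search(space, score_fn, n_iter, seed=0):
    """Evaluate n_iter randomly sampled combinations and return the best one."""
    rng = random.Random(seed)
    candidates = [
        {key: rng.choice(values) for key, values in space.items()}
        for _ in range(n_iter)
    ]
    return max(candidates, key=score_fn)


if __name__ == "__main__":
    print("grid search best:  ", grid_search(SEARCH_SPACE, toy_score))
    print("random search best:", random_search(SEARCH_SPACE, toy_score, n_iter=5))
```

Grid Search evaluates all 12 combinations here, so it always finds the best point in the grid. Random Search evaluates only `n_iter` samples and may miss it, but its cost does not grow with the size of the grid, which is why it is often preferred for larger search spaces.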

About

The purpose of this research is to evaluate different audio classification models, together with several audio feature extraction techniques, on the BanglaSER dataset in order to develop a Bangla Speech Emotion Recognition model architecture.
