This project evaluates multiple audio classification architectures on the BanglaSER dataset to develop an effective Bangla Speech Emotion Recognition (SER) model. Several deep learning models, including a CNN, a Bi-LSTM, a hybrid CNN–BiLSTM, and pretrained YAMNet-based models, were implemented on extracted acoustic features such as MFCCs and Mel-spectrograms. In addition, a Support Vector Machine (SVM) classifier trained on handcrafted features served as a classical baseline. Hyperparameter optimization techniques (Grid Search, Random Search, and Bayesian Optimization), data augmentation, and alternative loss functions were explored to improve performance. Experimental results show that the CNN-based and Bi-LSTM architectures benefit from careful tuning, while the optimized SVM achieves the highest accuracy, highlighting the importance of feature engineering and model selection for Bangla SER.
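As a concrete illustration of the SVM baseline with grid-searched hyperparameters, the sketch below trains a scikit-learn `SVC` inside a `GridSearchCV` over `C` and `gamma`. The feature matrix here is random data standing in for per-utterance handcrafted features (e.g. mean MFCC vectors), and the five-class label set mirrors the emotion categories in BanglaSER; the feature dimensionality, grid values, and split sizes are illustrative assumptions, not the project's actual settings.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical stand-in for handcrafted per-utterance features
# (e.g. 40-dim mean MFCC vectors); labels mimic 5 emotion classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 40))
y = rng.integers(0, 5, size=300)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# SVM baseline: standardize features, then grid-search C and gamma
# (one of the tuning strategies mentioned above).
pipe = make_pipeline(StandardScaler(), SVC())
param_grid = {"svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.01, 0.001]}
grid = GridSearchCV(pipe, param_grid, cv=3)
grid.fit(X_tr, y_tr)

print("best params:", grid.best_params_)
print("test accuracy:", grid.score(X_te, y_te))
```

In practice the random `X` would be replaced by features extracted from the BanglaSER recordings (for example, MFCCs averaged over time with a library such as librosa), with the same pipeline left unchanged.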