This repository focuses on Sentiment Analysis using a pretrained BERT model. The project uses PyTorch and Hugging Face Transformers to classify textual remarks into emotional states. The dataset is an annotated CSV file containing labeled sentiment data.
- The dataset consists of textual remarks labeled with different emotional states.
- It has been preprocessed to be suitable for the BERT model.
- Data is split into training and validation sets for model evaluation.
- Pretrained Model: BERT (Bidirectional Encoder Representations from Transformers)
- Frameworks Used:
  - PyTorch for deep learning computations
  - Hugging Face Transformers for leveraging the pretrained BERT model
The project follows a structured workflow:
- Overview of sentiment analysis and the role of transformers like BERT.
- Understanding dataset distribution and cleaning text data.
- Tokenization and handling missing values.
- Splitting dataset into training and validation sets.
- Ensuring balanced class distribution.
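The split with a balanced class distribution can be sketched with scikit-learn's stratified `train_test_split`. The `remark` and `label` column names below are assumptions about the CSV's schema, and the toy data is illustrative only:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy stand-in for the annotated CSV; "remark" and "label" column
# names are assumptions about the dataset's schema.
df = pd.DataFrame({
    "remark": ["great movie", "terrible film", "loved it", "awful plot",
               "fantastic cast", "boring story", "superb", "dreadful"],
    "label":  [1, 0, 1, 0, 1, 0, 1, 0],
})

# stratify=df["label"] keeps the class ratio identical in both splits.
train_df, val_df = train_test_split(
    df, test_size=0.25, stratify=df["label"], random_state=42
)
print(train_df["label"].value_counts().to_dict())
print(val_df["label"].value_counts().to_dict())
```

Stratification matters most when some emotional states are rare: a purely random split could leave a minority class absent from the validation set.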
- Utilizing BERT tokenizer to encode the textual data.
- Padding and truncation to maintain consistency in input sizes.
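Encoding with the BERT tokenizer can be sketched as below. To stay self-contained, this builds a toy WordPiece vocabulary file; the project itself would instead load the full vocabulary with `BertTokenizer.from_pretrained("bert-base-uncased")`:

```python
import os
import tempfile
from transformers import BertTokenizer

# Toy vocabulary so the example runs offline; the real project loads
# the full bert-base-uncased vocabulary via from_pretrained().
vocab = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]",
         "the", "movie", "was", "great", "bad"]
with tempfile.NamedTemporaryFile("w", delete=False, suffix=".txt") as f:
    f.write("\n".join(vocab))
    vocab_path = f.name

tokenizer = BertTokenizer(vocab_file=vocab_path, do_lower_case=True)

# padding/truncation give every sequence the same length (max_length=8),
# so the batch can be stacked into one tensor.
encoded = tokenizer(
    ["the movie was great", "bad"],
    padding="max_length", truncation=True, max_length=8,
    return_tensors="pt",
)
print(encoded["input_ids"].shape)       # both sequences padded to length 8
print(encoded["attention_mask"][1])     # 1 for real tokens, 0 for padding
os.unlink(vocab_path)
```

The attention mask tells BERT which positions are padding so they are ignored during self-attention.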
- Initializing the bert-base-uncased model.
- Fine-tuning BERT for sentiment classification.
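Model setup can be sketched as follows. For illustration this instantiates a miniature, randomly initialized BERT from a config so the snippet runs without downloading weights; the project itself would call `BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=...)` to load the pretrained encoder:

```python
import torch
from transformers import BertConfig, BertForSequenceClassification

# Miniature config so the sketch runs instantly; the real project loads
# pretrained weights with from_pretrained("bert-base-uncased") instead.
config = BertConfig(
    vocab_size=100, hidden_size=32, num_hidden_layers=2,
    num_attention_heads=2, intermediate_size=64,
    num_labels=3,  # illustrative; the actual number of emotional states is dataset-specific
)
model = BertForSequenceClassification(config)

# A forward pass returns per-class logits of shape (batch, num_labels).
input_ids = torch.randint(0, 100, (2, 8))
attention_mask = torch.ones_like(input_ids)
out = model(input_ids=input_ids, attention_mask=attention_mask)
print(out.logits.shape)  # torch.Size([2, 3])
```

Fine-tuning means updating all of these weights (encoder plus the new classification head) on the labeled remarks, rather than training only the head.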
- Converting datasets into PyTorch DataLoaders for efficient batch processing.
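Wrapping the encoded tensors in DataLoaders can be sketched with plain PyTorch; the tensors below are dummy stand-ins for the tokenizer's real outputs:

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Dummy stand-ins for the tokenizer outputs and labels.
input_ids = torch.randint(0, 30522, (100, 32))   # 100 samples, seq len 32
attention_mask = torch.ones(100, 32, dtype=torch.long)
labels = torch.randint(0, 3, (100,))

dataset = TensorDataset(input_ids, attention_mask, labels)

# shuffle=True for training batches; a validation loader would not shuffle.
train_loader = DataLoader(dataset, batch_size=16, shuffle=True)

batch_ids, batch_mask, batch_labels = next(iter(train_loader))
print(batch_ids.shape)  # torch.Size([16, 32])
```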
- Using the AdamW optimizer to fine-tune BERT.
- Implementing a learning-rate scheduler to stabilize training.
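The optimizer and a linear warmup-then-decay schedule can be sketched as follows; a dummy linear layer stands in for BERT, and the hyperparameters are illustrative:

```python
import torch
from transformers import get_linear_schedule_with_warmup

model = torch.nn.Linear(10, 3)  # stand-in for the BERT classifier

# AdamW decouples weight decay from the adaptive gradient update.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# Linear warmup for the first steps, then linear decay to zero.
num_training_steps = 100
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=10, num_training_steps=num_training_steps
)

for _ in range(num_training_steps):
    optimizer.step()    # gradients omitted in this sketch
    scheduler.step()    # called once per optimizer step
print(scheduler.get_last_lr())  # decayed to 0.0 by the final step
```

Warmup avoids large, destabilizing updates to the pretrained weights in the first few batches, while the decay shrinks the step size as training converges.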
- Accuracy, Precision, Recall, and F1-score for evaluation.
- Utilizing confusion matrix to assess model predictions.
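These metrics can be sketched with scikit-learn; the labels and predictions below are toy values for a three-class problem:

```python
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_recall_fscore_support)

# Toy ground truth and predictions for a 3-class problem.
y_true = [0, 1, 2, 2, 1, 0, 2, 1]
y_pred = [0, 1, 2, 1, 1, 0, 2, 2]

acc = accuracy_score(y_true, y_pred)
prec, rec, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0
)
# Rows are true classes, columns are predicted classes; the diagonal
# counts correct predictions.
cm = confusion_matrix(y_true, y_pred)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")
print(cm)
```

Macro averaging weights every class equally, which is informative when the emotional states are imbalanced; the confusion matrix shows which states get mistaken for each other.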
- Implementing training and validation loops.
- Monitoring loss and accuracy over epochs.
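The loop structure can be sketched with a toy model; real code would iterate over the BERT DataLoader batches instead of the dummy tensors used here:

```python
import torch
from torch import nn
from torch.utils.data import TensorDataset, DataLoader

# Toy dataset and model standing in for the encoded text and BERT.
X, y = torch.randn(64, 10), torch.randint(0, 3, (64,))
train_loader = DataLoader(TensorDataset(X, y), batch_size=16, shuffle=True)
val_loader = DataLoader(TensorDataset(X, y), batch_size=16)

model = nn.Linear(10, 3)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(3):
    model.train()
    for xb, yb in train_loader:        # training loop: forward, backward, step
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()

    model.eval()
    correct, total, val_loss = 0, 0, 0.0
    with torch.no_grad():              # validation loop: no gradient tracking
        for xb, yb in val_loader:
            logits = model(xb)
            val_loss += loss_fn(logits, yb).item() * len(yb)
            correct += (logits.argmax(dim=1) == yb).sum().item()
            total += len(yb)
    print(f"epoch {epoch}: val_loss={val_loss / total:.3f} acc={correct / total:.3f}")
```

Printing validation loss and accuracy once per epoch, as above, is what makes divergence or overfitting visible early.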
- Loading the trained model.
- Running sentiment predictions on new data.
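Inference can be sketched as follows. A miniature, randomly initialized model stands in for the saved fine-tuned checkpoint, which real code would restore with `from_pretrained` on the saved directory (or `torch.load` for a raw state dict):

```python
import torch
from transformers import BertConfig, BertForSequenceClassification

# Miniature random model as an offline stand-in; real code would load
# the fine-tuned checkpoint instead of building one from scratch.
config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=2,
                    num_attention_heads=2, intermediate_size=64, num_labels=3)
model = BertForSequenceClassification(config)
model.eval()  # disable dropout for deterministic predictions

# Dummy encoded input standing in for tokenizer output on new remarks.
input_ids = torch.randint(0, 100, (2, 8))
attention_mask = torch.ones_like(input_ids)

with torch.no_grad():
    logits = model(input_ids=input_ids, attention_mask=attention_mask).logits
predictions = logits.argmax(dim=1)  # predicted class index per remark
print(predictions)
```

Mapping each predicted index back to its emotional-state name would use the same label encoding established during preprocessing.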
Based on the Coursera guided project Sentiment Analysis with BERT (https://www.coursera.org/projects/sentiment-analysis-bert).
- Hugging Face Transformers for providing the pretrained BERT model.
- PyTorch for deep learning utilities.
- Jupyter Notebook for interactive model training and evaluation.