
The project provides a GUI to make it more interactive and easier to use.
The project covers core NLP concepts and the pre-processing steps applied before training a Logistic Regression model.
Dataset used: Kaggle Twitter Sentiment Analysis, which consists of 4 columns: Tweet ID, Entity, Text and Sentiment.
Visualization: a word cloud generated for the user-selected sentiment.
Preprocessing: retaining only alphanumeric characters, converting everything to lower case, removing stopwords (e.g. 'the', 'is'), and lemmatizing (reducing words to their root form).
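
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `preprocess` function, the tiny `STOPWORDS` set and the `lemmatize` parameter are all assumptions made for the example. Lemmatization is left as a pluggable function that defaults to doing nothing, so the sketch runs with the standard library only.

```python
import re

# Small illustrative stopword list (an assumption); a fuller list would be used in practice.
STOPWORDS = {"the", "is", "a", "an", "and", "of", "to", "in"}

def preprocess(text, lemmatize=lambda word: word):
    """Clean one tweet: keep alphanumerics, lowercase, drop stopwords, lemmatize."""
    # Replace every run of non-alphanumeric characters with a single space.
    text = re.sub(r"[^A-Za-z0-9]+", " ", text)
    # Lowercase before the stopword check so 'The' and 'the' are treated alike.
    tokens = text.lower().split()
    tokens = [t for t in tokens if t not in STOPWORDS]
    # Apply the (hypothetical, pluggable) lemmatizer to each remaining token.
    return " ".join(lemmatize(t) for t in tokens)

print(preprocess("The game is AMAZING!!! #hyped"))  # prints "game amazing hyped"
```

A real lemmatizer could be passed in through the `lemmatize` argument, for example the `lemmatize` method of NLTK's `WordNetLemmatizer`, assuming NLTK and its WordNet data are installed.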
Text-to-vector conversion is done using Bag of Words or TF-IDF (user-selected).
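
As a rough sketch of how a user-selected vectorizer could feed a Logistic Regression model, the example below uses scikit-learn's `CountVectorizer` (Bag of Words) and `TfidfVectorizer`. Using scikit-learn is an assumption here, and the `make_vectorizer` helper, the `"bow"`/`"tfidf"` option names and the toy texts are made up for illustration. They are not taken from the repository.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def make_vectorizer(method):
    """Return a frequency-based vectorizer for the user's choice ("bow" or "tfidf")."""
    if method == "bow":
        return CountVectorizer()
    if method == "tfidf":
        return TfidfVectorizer()
    raise ValueError(f"unknown vectorization method: {method}")

# Toy, already-preprocessed texts and labels (invented for this example).
texts = ["love game", "great update love", "worst patch", "hate bug worst"]
labels = ["Positive", "Positive", "Negative", "Negative"]

vectorizer = make_vectorizer("tfidf")
X = vectorizer.fit_transform(texts)  # sparse matrix: one row per text, one column per word
model = LogisticRegression().fit(X, labels)

# New text must be transformed with the same fitted vectorizer before prediction.
prediction = model.predict(vectorizer.transform(["love update"]))
print(X.shape, prediction[0])
```

Switching the argument to `"bow"` swaps in raw word counts without changing the rest of the pipeline, which is one way the two representations could be compared on the same metrics.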
Main.py - GUI components
Train.py - Data loading, preprocessing, vectorization and model training
Predict.py - Prediction on user input
vectorizer.py - Frequency-based vector embeddings (BoW, TF-IDF)
model.py - Initialization of the different models

Next steps:
Compare the metrics when Word2Vec is used. Word2Vec reportedly does better at capturing the sentiment in the text.
Hyperparameter tuning.
Use prediction-based vector embeddings (Word2Vec, GloVe, etc.). This might require changes to a few parts of the approach.
BERT and LSTM implementations.

--Screenshot of the GUI

⭐ Star this repo if you find it helpful!

Made with ❤️ by Vivek Padayattil
