vivupadi/Sentiment_analysis


  1. The project provides a GUI for easier, interactive use.
  2. The project explores NLP concepts and the preprocessing required before training a Logistic Regression model.
  3. Dataset: Kaggle Twitter Sentiment Analysis, which has 4 columns: Tweet ID, Entity, Text, and Sentiment.
  4. Visualization: a word cloud for the user-selected sentiment.
  5. Preprocessing: retaining only alphanumeric characters, converting everything to lowercase, removing stopwords (e.g. 'the', 'is'), and lemmatizing (reducing words to their base form).
  6. Text-to-vector conversion uses Bag of Words or TF-IDF (user-selected).
  7. Main.py - GUI code
  8. Train.py - data loading, preprocessing, vectorization, and model training
  9. Predict.py - prediction on user input
  10. vectorizer.py - frequency-based vector embeddings (BoW, TF-IDF)
  11. model.py - initialization of the different models

Next steps:

  12. Compare the metrics when Word2Vec is used. Word2Vec reportedly captures the sentiment in text better.
  13. Hyperparameter tuning
  14. Use prediction-based vector embeddings (Word2Vec, GloVe, etc.). This might require changing a few parts of the pipeline.
  15. BERT and LSTM implementations
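
The preprocessing and vectorization steps listed above can be sketched with the standard library alone. This is a minimal illustration, not the code in Train.py or vectorizer.py: the function names `preprocess` and `vectorize`, the tiny stop-word set, and the sample tweets are all hypothetical. Lemmatization is left out because it needs a dictionary resource, and the TF-IDF variant shown is one common smoothed form (raw count times `ln((1 + N) / (1 + df)) + 1`, without normalization), which may differ from what the project uses.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word set (assumption; a real project would use a fuller list).
STOPWORDS = {"the", "is", "a", "an", "and", "to", "of", "this", "it"}


def preprocess(text):
    """Keep alphanumeric characters, lower-case, and drop stop words.

    Lemmatization is intentionally omitted in this sketch.
    """
    cleaned = re.sub(r"[^A-Za-z0-9\s]", " ", text).lower()
    return [tok for tok in cleaned.split() if tok not in STOPWORDS]


def vectorize(docs, method="bow"):
    """Return (vocabulary, vectors) using raw counts ("bow") or TF-IDF ("tfidf")."""
    tokenized = [preprocess(doc) for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    counts = [Counter(toks) for toks in tokenized]

    if method == "bow":
        # Bag of Words: one count per vocabulary word, per document.
        return vocab, [[c[w] for w in vocab] for c in counts]
    if method != "tfidf":
        raise ValueError("method must be 'bow' or 'tfidf'")

    # Document frequency: in how many documents each word appears.
    n_docs = len(docs)
    df = {w: sum(1 for c in counts if w in c) for w in vocab}
    # Smoothed inverse document frequency (assumed variant, no normalization).
    idf = {w: math.log((1 + n_docs) / (1 + df[w])) + 1 for w in vocab}
    return vocab, [[c[w] * idf[w] for w in vocab] for c in counts]


# Hypothetical sample input, not taken from the Kaggle dataset.
tweets = ["The game is GREAT!!!", "This game is terrible..."]
vocab, bow = vectorize(tweets, "bow")
_, tfidf = vectorize(tweets, "tfidf")
```

With these two sample tweets, a word that appears in both ("game") keeps a weight equal to its count under TF-IDF, while words unique to one tweet are weighted up. Either matrix could then be passed to a classifier such as Logistic Regression.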

[Screenshot of the GUI]

⭐ Star this repo if you find it helpful!

Made with ❤️ by Vivek Padayattil
