An LSTM-based sentiment classifier trained on a Twitter dataset with sentiment labels.
Sentiment analysis involves determining the sentiment or emotion expressed in a piece of text. In this project, we use TensorFlow and LSTM to predict the sentiment (positive or negative) of tweets from the Twitter dataset.
The dataset consists of tweets with sentiment labels. Half of the tweets are labeled positive and the other half negative. The dataset is sourced from [https://tianchi.aliyun.com/dataset/35761].
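The 50/50 label split can be checked before training. The snippet below is a minimal sketch, not part of the project's scripts: the function name `label_counts`, the assumption that the label sits in the second column, and the four sample rows are all hypothetical stand-ins, since the actual column layout of train.csv is not described here.

```python
import csv
import io
from collections import Counter


def label_counts(lines, label_index=1):
    """Count rows per sentiment label.

    `lines` is any iterable of CSV lines (an open file works).
    `label_index` is the ASSUMED position of the label column;
    adjust it to match the real layout of train.csv.
    If the file has a header row, it is counted like any other row.
    """
    counts = Counter()
    for row in csv.reader(lines):
        if len(row) > label_index:
            counts[row[label_index].strip()] += 1
    return counts


# Tiny made-up sample standing in for train.csv (not real data).
sample = io.StringIO("1,1,great day\n2,0,so tired\n3,1,love it\n4,0,bad luck\n")
counts = label_counts(sample)
print(counts)
```

For the real file, something like `label_counts(open("train.csv", newline="", encoding="utf-8"))` should return two roughly equal counts if the split described above holds.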
- Download the dataset to the root directory and rename it train.csv; the file must be in CSV format. Download glove.6B.zip (http://nlp.stanford.edu/data/glove.6B.zip) and extract glove.6B.200d.txt to ./dataset/
- Change to the code directory and run preprocess.py: python preprocess.py. The CSV will be processed and saved as train-processed.csv
- Run stats.py: python stats.py. It generates several files.
- Run lstm.py: python lstm.py
- The model will be saved in ./models/
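
The steps above rely on the GloVe vectors placed in ./dataset/. As a rough illustration of how such a file is commonly turned into an embedding matrix, here is a minimal sketch. It is not taken from lstm.py: the helper names `load_glove` and `build_embedding_matrix`, the toy vocabulary, and the 3-dimensional toy vectors are all assumptions made for the example (the real file has 200 dimensions per word).

```python
import io

import numpy as np


def load_glove(lines):
    """Parse GloVe text lines ("word v1 v2 ...") into a dict of vectors."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = np.asarray(parts[1:], dtype="float32")
    return vectors


def build_embedding_matrix(word_index, vectors, dim):
    """Build a (vocab_size + 1, dim) matrix of word vectors.

    Assumes `word_index` maps words to contiguous ids 1..N, with row 0
    reserved for padding. Words missing from GloVe keep a zero row.
    """
    matrix = np.zeros((len(word_index) + 1, dim), dtype="float32")
    for word, idx in word_index.items():
        vec = vectors.get(word)
        if vec is not None and vec.shape[0] == dim:
            matrix[idx] = vec
    return matrix


# Toy stand-in for glove.6B.200d.txt (which would use dim=200).
toy_glove = io.StringIO("good 0.1 0.2 0.3\nbad -0.1 -0.2 -0.3\n")
vectors = load_glove(toy_glove)
word_index = {"good": 1, "bad": 2, "meh": 3}  # hypothetical vocabulary
matrix = build_embedding_matrix(word_index, vectors, dim=3)
print(matrix.shape)  # (4, 3)
```

With the real file, `load_glove(open("./dataset/glove.6B.200d.txt", encoding="utf-8"))` and `dim=200` would apply. A matrix like this can then be handed to a Keras `Embedding` layer, for example through `embeddings_initializer=tf.keras.initializers.Constant(matrix)`; whether lstm.py does exactly this is not stated here.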
Tensorflow==2.4.0
python==3.8
numpy
scikit-learn
scipy
nltk
keras