This project analyzes YouTube comments and predicts sentiment as Positive, Negative, or Neutral using a TF-IDF vectorizer and a machine learning classifier.
This repository includes all required items:
- Python code files: app.py, train_model.py
- Dataset files: All_Comments_Final.csv, Aggregated_Metrics_By_Video.csv, Aggregated_Metrics_By_Country_And_Subscriber_Status.csv, Video_Performance_Over_Time.csv
- Trained model files: sentiment_model.pkl, tfidf_vectorizer.pkl
- GUI application code: app.py (Streamlit web app)
- Colab notebook: Youtube_Channel_Analysis_NLP_Project_C_final.ipynb
- app.py: Streamlit GUI for live sentiment prediction
- train_model.py: Model training script and pickle export
- sentiment_model.pkl: Trained sentiment model
- tfidf_vectorizer.pkl: Trained TF-IDF vectorizer
- All_Comments_Final.csv: Main comment dataset
- Youtube_Channel_Analysis_NLP_Project_C_final.ipynb: Analysis notebook (Colab compatible)
- Youtube_Channel_Analysis_NLP_Project_C_final.pdf: Project report/exported notebook
- Create and activate a virtual environment.
- Install dependencies:
  pip install streamlit scikit-learn pandas textblob
- Run the app:
  python -m streamlit run app.py
- Open the local URL shown in the terminal (usually http://localhost:8501).
If model files are missing or corrupted, retrain with:
  python train_model.py
This regenerates sentiment_model.pkl and tfidf_vectorizer.pkl.
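
For orientation, here is a minimal sketch of what a TF-IDF plus classifier training step with pickle export can look like. It is not the contents of train_model.py: the choice of LogisticRegression, the toy comments and labels, and the function names train, save, and predict are all assumptions made for illustration. Only the output filenames come from this repository.

```python
import pickle
from pathlib import Path

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy data for illustration only (assumption); a real run would use
# labelled comments such as those in All_Comments_Final.csv.
comments = [
    "I love this video, great explanation",
    "Amazing content, thank you so much",
    "This was terrible and a waste of time",
    "Awful audio, I hate the editing",
    "The video is about data science",
    "Uploaded on Tuesday at noon",
]
labels = ["Positive", "Positive", "Negative", "Negative", "Neutral", "Neutral"]


def train(texts, targets):
    """Fit a TF-IDF vectorizer and a classifier on labelled comments."""
    vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
    features = vectorizer.fit_transform(texts)
    # LogisticRegression is an assumed classifier, not necessarily the one used here.
    model = LogisticRegression(max_iter=1000)
    model.fit(features, targets)
    return model, vectorizer


def save(model, vectorizer, out_dir="."):
    """Write both artifacts under the filenames the app expects."""
    out = Path(out_dir)
    with open(out / "sentiment_model.pkl", "wb") as f:
        pickle.dump(model, f)
    with open(out / "tfidf_vectorizer.pkl", "wb") as f:
        pickle.dump(vectorizer, f)


def predict(text, model, vectorizer):
    """Vectorize one comment with the fitted vectorizer and return its label."""
    return model.predict(vectorizer.transform([text]))[0]


if __name__ == "__main__":
    model, vectorizer = train(comments, labels)
    save(model, vectorizer)
    print(predict("Thank you, I love it", model, vectorizer))
```

The vectorizer is saved alongside the model because new comments must be transformed with the same vocabulary and weights that were fitted during training.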
You can open Youtube_Channel_Analysis_NLP_Project_C_final.ipynb directly in Google Colab by uploading it to Colab or selecting it from your GitHub repository once pushed.
- The Streamlit app loads model files from the repository root.
- If pickle loading errors appear, run train_model.py to regenerate model artifacts.
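
Before retraining, it can help to confirm which artifact is actually the problem. The helper below is a small sketch using only the standard library; check_artifacts and MODEL_FILES are hypothetical names, not part of app.py. It assumes both files sit in the repository root, as noted above. Only unpickle files you trust, since loading a pickle can execute arbitrary code.

```python
import pickle
from pathlib import Path

# Filenames taken from this repository; the helper itself is illustrative.
MODEL_FILES = ("sentiment_model.pkl", "tfidf_vectorizer.pkl")


def check_artifacts(root=".", names=MODEL_FILES):
    """Return {filename: status}, where status is 'ok', 'missing', or 'corrupted'."""
    report = {}
    for name in names:
        path = Path(root) / name
        if not path.is_file():
            report[name] = "missing"
            continue
        try:
            with open(path, "rb") as f:
                pickle.load(f)
        except Exception:
            # Any failure to unpickle (truncated file, version mismatch,
            # missing class) is treated as a reason to retrain.
            report[name] = "corrupted"
        else:
            report[name] = "ok"
    return report


if __name__ == "__main__":
    for name, status in check_artifacts().items():
        print(f"{name}: {status}")
```

If either file is reported as missing or corrupted, running train_model.py should replace both artifacts.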