Because the internet needed another fake news detector—but this one’s actually good.
Welcome to the Fake News Detector, the overly concerned Python project that’s here to help you separate fact from fiction. Whether you’re tired of miracle mango diets or yet another “Elon buys Mars” headline, this machine learning system will gladly rain on their parade.
Watch the magic happen:
Click the thumbnail above to see a full walkthrough of the Fake News Detector in action!
- Text preprocessing that’s cleaner than your browser history.
- Dual-model detection: Support Vector Machine (the nerd who always gets it right) and Passive Aggressive Classifier (the rebel who thrives on confrontation).
- Data visualization for all you graph lovers.
- Confusion matrices that look exactly like how you feel reading fake news.
- Custom article predictions so you can play detective at home.
- Modern Web UI (Streamlit): Because CLI is so 2010.
This project uses:
- TF-IDF Vectorizer to turn words into meaningful numbers.
- Passive Aggressive Classifier for its “I don’t care but I actually do” approach.
- SVM because, well, it works. Period.
- Preprocessing includes stemming, stopword removal, regex cleansing, and sarcasm filters (ok, not really).
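A minimal sketch of that stack using scikit-learn, with toy headlines standing in for the real dataset (the example texts and labels below are illustrative, not the project's actual data):

```python
# TF-IDF features feeding both linear classifiers used by the project.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import PassiveAggressiveClassifier
from sklearn.svm import LinearSVC

texts = [
    "scientists publish peer reviewed climate study",
    "government releases official economic report",
    "miracle mango diet cures everything overnight",
    "aliens secretly run the stock market, insider claims",
]
labels = ["REAL", "REAL", "FAKE", "FAKE"]

# Turn words into meaningful numbers: a sparse TF-IDF matrix.
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(texts)

pac = PassiveAggressiveClassifier(max_iter=1000, random_state=42).fit(X, labels)
svm = LinearSVC().fit(X, labels)

# New articles must pass through the SAME fitted vectorizer.
sample = vectorizer.transform(["miracle diet cures the stock market"])
print("PAC:", pac.predict(sample)[0], "| SVM:", svm.predict(sample)[0])
```

Both models are linear, which is why TF-IDF features work well for them: fake/real signal in text is largely carried by word choice.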
Install dependencies using:
```bash
pip install -r requirements.txt
```

Or manually, if you like to suffer.
Download the dataset from my Drive: `WELFake_Dataset.csv`
- Clone this amazing repository.
- Drop your dataset into the root folder (we expect columns named `text` and `label`).
- Adjust the `DATA_PATH` in `fake_news_detector.py` if you like living dangerously.
- Run the Python script:

```bash
python fake_news_detector.py
```

Sit back and enjoy as the program judges your articles more critically than your relatives at a wedding.
- Make sure you’ve installed all requirements (see above).
- Double-click `run_app.bat` or run:

```bash
streamlit run app.py
```

- Your browser will open. If it doesn't, open http://localhost:8501 yourself. (We believe in you.)
- Use the beautiful UI to train models, test articles, and see analytics. No command line required!
```
=== Fake News Detection System ===
Loading dataset...
Training models...
Evaluating performance...
Plotting confusion matrix...
Regretting reading the news...
```

Input: "BREAKING: Scientists say chocolate is the new kale!"

Output:

```
PAC Prediction: FAKE
SVM Prediction: FAKE
Consensus: FAKE
Confidence: 98.76% – So yeah, nice try.
```
The system uses two complementary models:
- Passive Aggressive Classifier: Excellent for online learning and large datasets
- Support Vector Machine: Robust linear classifier for text data
Both models are evaluated using:
- Accuracy Score
- F1 Score
- Confusion Matrix
- Classification Report
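A hedged sketch of that evaluation step with scikit-learn's metrics (the `y_true`/`y_pred` arrays here are made-up stand-ins for real model output):

```python
# Evaluate predictions with all four metrics listed above.
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix, f1_score)

y_true = ["FAKE", "REAL", "FAKE", "REAL", "FAKE"]
y_pred = ["FAKE", "REAL", "REAL", "REAL", "FAKE"]  # one FAKE slipped through

print("Accuracy:", accuracy_score(y_true, y_pred))
print("F1 (FAKE as positive class):",
      f1_score(y_true, y_pred, pos_label="FAKE"))
# Rows are true labels, columns are predicted labels.
print(confusion_matrix(y_true, y_pred, labels=["FAKE", "REAL"]))
print(classification_report(y_true, y_pred))
```

Fixing the label order in `confusion_matrix` (and the `pos_label` in `f1_score`) matters with string labels; otherwise scikit-learn orders classes alphabetically and you may misread which corner holds the false negatives.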
1. Data Preprocessing:
   - Remove HTML tags, URLs, and punctuation
   - Tokenization and normalization
   - Stopword removal and stemming
   - TF-IDF vectorization
2. Model Training:
   - Train both PAC and SVM models
   - 70/30 train-test split with stratification
3. Prediction:
   - Process input article through preprocessing pipeline
   - Generate predictions from both models
   - Provide consensus prediction with confidence metrics
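The three stages above can be sketched end to end. This is a simplified stand-in, not the project's actual code: the `clean()` helper uses plain regexes in place of the full stemming pipeline, the toy dataset is invented, and the consensus rule (agree or report "UNCERTAIN") is one reasonable choice among several:

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import PassiveAggressiveClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

def clean(text):
    """Stage 1: strip HTML tags, URLs, and punctuation; lowercase."""
    text = re.sub(r"<[^>]+>", " ", text)       # HTML tags
    text = re.sub(r"https?://\S+", " ", text)  # URLs
    text = re.sub(r"[^a-zA-Z\s]", " ", text)   # punctuation and digits
    return text.lower()

texts = [
    "scientists publish peer reviewed climate study",
    "central bank releases quarterly inflation figures",
    "city council approves new transit budget",
    "university researchers report vaccine trial results",
    "court rules on data privacy case",
    "miracle mango diet cures everything overnight",
    "aliens secretly control the stock market",
    "celebrity clone spotted at secret moon base",
    "drinking bleach reverses aging, anonymous source says",
    "one weird trick doctors hate melts fat instantly",
]
labels = ["REAL"] * 5 + ["FAKE"] * 5

# Stage 2: 70/30 split, stratified so both classes appear in each half.
X_train, X_test, y_train, y_test = train_test_split(
    [clean(t) for t in texts], labels,
    test_size=0.3, stratify=labels, random_state=42)

vec = TfidfVectorizer(stop_words="english")
Xtr = vec.fit_transform(X_train)
pac = PassiveAggressiveClassifier(max_iter=1000, random_state=42).fit(Xtr, y_train)
svm = LinearSVC().fit(Xtr, y_train)

def predict(article):
    """Stage 3: preprocess, predict with both models, report consensus."""
    x = vec.transform([clean(article)])
    p, s = pac.predict(x)[0], svm.predict(x)[0]
    consensus = p if p == s else "UNCERTAIN"
    return p, s, consensus

print(predict("<p>BREAKING: miracle diet melts fat overnight!</p>"))
```

Note that the input article flows through the same `clean()` and the same fitted vectorizer as the training data; skipping either step is the classic way to get garbage predictions at inference time.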
This project demonstrates:
- End-to-end ML pipeline development
- NLP preprocessing techniques
- Model evaluation and comparison
- Social impact AI applications
- Clean, documented code practices
- Integration with pre-trained embeddings (BERT, Word2Vec)
- Real-time news scraping and classification
- API endpoint development
- Enhanced feature engineering with metadata
- Deep learning models (LSTM, Transformer)
- Achieve >85% accuracy on test dataset
- Clear documentation and reproducible results
- Professional presentation for portfolio showcase
- Practical application for social good
Made with coffee, code, and a sprinkle of existential dread by Devansh Singh.
Feel free to connect with me on dksdevansh@gmail.com if you’re into cool projects, sarcastic readmes, or you just want to say hi.
You can do whatever you want with this code. Just don’t make it tell people that pineapple belongs on pizza. That would be crossing the line.
No fake news was emotionally harmed during the making of this project. But a few poorly written headlines were judged. Harshly.