harilexm/hatespeech-detection

🛡️ Hate Speech Detection in Roman Urdu

Problem Statement

Hate speech on social media is a real problem, especially in Pakistan, where people write in Roman Urdu (Urdu written in English letters). Platforms like Facebook, YouTube, and Twitter are full of toxic comments, slurs, and harassment in Roman Urdu, yet there are almost no tools to detect it. Existing hate speech detection systems are built for English and fail completely on Roman Urdu, the language millions of Pakistani users actually type in every single day. On top of that, Roman Urdu has no fixed spelling rules: one toxic word can be written ten different ways, and people mix Urdu, English, and Punjabi in a single sentence. There was no ready-made dataset, no pre-trained model, nothing. So we built one from scratch.

  • No standard spelling — same word spelled multiple ways (khabees, khabis, khabeees)
  • Code-mixing everywhere — Urdu, English, Punjabi, all in one comment
  • Context matters — some words are toxic only in certain contexts
  • Zero pre-built tools — no off-the-shelf models, tokenizers, or labeled datasets exist for Roman Urdu
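The spelling-variation problem can be partially tamed with simple character normalization. This is an illustrative sketch, not code from this repo: it collapses elongated letter runs, so stretched spellings like "khabeees" map back to "khabees". Genuinely different spellings (e.g. "khabis") are not unified by this trick and still need lexicon-level handling.

```python
import re

def normalize_variants(word: str) -> str:
    """Lowercase and collapse runs of 3+ repeated letters down to 2,
    so elongated spellings converge on a common form."""
    word = word.lower()
    return re.sub(r"(.)\1{2,}", r"\1\1", word)

print(normalize_variants("khabeees"))   # -> khabees
print(normalize_variants("KHABEES"))    # -> khabees
print(normalize_variants("khabis"))     # -> khabis (a distinct spelling survives)
```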

What Our System Does

We built a complete ML pipeline that takes a Roman Urdu comment and classifies it as Toxic (1) or Not Toxic (0) — everything from scratch.

  • Scraped real comments from Pakistani subreddits
  • Batch-labeled 20,622 comments with LLMs (Gemini 3.0, Claude Opus 4.6), recording the toxic keyword behind each label
  • Built a 1,609-word toxic lexicon from the labeled data
  • Full NLP pipeline tuned for Roman Urdu
  • Trained 5 ML models — best: SVM at 95.66% accuracy

Setup

1. Clone the repo

```bash
git clone https://github.com/your-username/hatespeech-detection.git
cd hatespeech-detection
```

2. Create & activate a virtual environment

```bash
python -m venv venv
```

Windows:

```bash
venv\Scripts\activate
```

Mac/Linux:

```bash
source venv/bin/activate
```

3. Install dependencies

```bash
pip install -r requirements.txt
```

4. Run the notebook

```bash
jupyter notebook Model.ipynb
```

Dataset

Custom dataset — commentLabel.csv — scraped from Pakistani subreddits and labeled via AI in batches. 3 columns: text, label (0/1), keyword (the toxic word that triggered the label).

| Detail | Value |
| --- | --- |
| Total comments | 20,622 |
| After dedup | 16,501 |
| Not Toxic | 11,867 (71.92%) |
| Toxic | 4,634 (28.08%) |
| Unique toxic keywords | 1,609 |
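The dedup step in the table can be sketched with pandas. The rows below are invented placeholders in the shape of commentLabel.csv (columns text / label / keyword), not actual data from the file:

```python
import pandas as pd

# Tiny stand-in for commentLabel.csv (real file: 20,622 rows).
df = pd.DataFrame({
    "text":    ["tum bohat ache ho", "kia khabees banda hai", "kia khabees banda hai"],
    "label":   [0, 1, 1],
    "keyword": [None, "khabees", "khabees"],
})

# Drop exact duplicate comments, as in the EDA step.
df = df.drop_duplicates(subset="text").reset_index(drop=True)

print(len(df))                                  # 2 rows remain after dedup
print(df["label"].value_counts().to_dict())     # {0: 1, 1: 1}
```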

Pipeline

Everything runs in Model.ipynb:

  1. EDA — checked shape, nulls, duplicates (4,121 removed), label distribution
  2. Preprocessing — lowercasing, removed URLs/emojis/punctuation/numbers, normalized repeated chars, removed English + Roman Urdu stopwords
  3. Lexicon Feature — built a 1,609-word toxic lexicon, counted toxic keyword hits per comment
  4. TF-IDF Features — char n-grams (2–5, 30k) + word n-grams (1–2, 15k) + lexicon count = 45,001 features
  5. Split + SMOTE — 70/30 stratified split, SMOTE on train set → balanced to 8,306 per class
  6. Training — 5 models: XGBoost, Naive Bayes, Logistic Regression, Random Forest, SVM

Results

| Model | Accuracy | F1 (Toxic) | Precision | Recall |
| --- | --- | --- | --- | --- |
| SVM | 95.66% | 0.92 | 0.93 | 0.91 |
| XGBoost | 95.11% | 0.91 | 0.90 | 0.93 |
| Logistic Regression | 95.07% | 0.91 | 0.89 | 0.93 |
| Multinomial NB | 92.39% | 0.86 | 0.88 | 0.84 |
| Random Forest | 90.93% | 0.81 | 0.96 | 0.70 |

Best: SVM — 95.66% accuracy, toxic-class F1 of 0.9219. Manual sanity test: 10/10 correct.
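The per-class columns in the table are standard sklearn metrics computed for the toxic class (label 1). A minimal illustration with invented predictions, just to show how the numbers are derived:

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Invented labels and predictions — not model output from this project.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

acc = accuracy_score(y_true, y_pred)
# Precision / recall / F1 for the toxic class only, as in the table.
prec, rec, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, labels=[1], average=None)

print(f"accuracy={acc:.2f} precision={prec[0]:.2f} "
      f"recall={rec[0]:.2f} f1={f1[0]:.2f}")
# accuracy=0.75 precision=0.75 recall=0.75 f1=0.75
```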

About

ML-based hate speech detection system for Roman Urdu — built from scratch with a custom 20K+ labeled dataset, NLP pipeline, and 5 classifiers. Best model (SVM) achieves 95.66% accuracy.
