GitHub - samarthyaveer/spam-ham-classifier: This email spam classification system uses three distinct machine learning models—Naive Bayes, Support Vector Machine (SVM), and Random Forest—to classify emails as either "spam" [0] or "ham" (non-spam) [1].

Spam/Ham Email Classifier

This project provides a machine learning-based solution for classifying emails as Spam or Ham. It uses three models (Naive Bayes, Support Vector Machine (SVM), and Random Forest) to predict the class of an email and displays the results interactively through a Gradio interface. The application is deployed on Hugging Face, so it can be tried without a local installation.

Features

- Multi-Model Predictions: Uses Naive Bayes, SVM, and Random Forest classifiers for predictions.
- Interactive Interface: Accepts a single email text or batch input through CSV files.
- Preprocessing: Includes text preprocessing (stopword removal, lemmatization) and TF-IDF vectorization.
- Deployment: Deployed on Hugging Face with a user-friendly interface for classification tasks.

Getting Started

Follow the instructions below to set up the project locally or use the deployed version on Hugging Face.

Usage

Single Email Classification

1. Enter email content in the Text Input field.
2. Click Submit to view the classification result for each model.

Batch Email Classification

1. Upload a CSV file containing a column with email texts.
2. Ensure the column name is recognized or renamed automatically.
3. Click Submit to receive predictions for all emails, including a majority vote result.

Deployed Version

Access the live demo on Hugging Face: https://hf.co/spaces/samarthyaveer/spam-ham-classifier
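The batch flow described under Usage depends on locating the column that holds the email text. The following is a minimal sketch of that step using only the Python standard library. The candidate column names in `CANDIDATE_COLUMNS` and the helper names `find_text_column` and `read_emails` are hypothetical and chosen for illustration; the names that `spamham.py` actually recognizes are not listed in this README.

```python
import csv
import io

# Hypothetical list of accepted header names (an assumption, not taken from spamham.py).
CANDIDATE_COLUMNS = ("text", "email", "message", "content", "body")


def find_text_column(fieldnames):
    """Return the header that most likely holds the email text."""
    # Compare case-insensitively, but return the header exactly as written in the file.
    normalized = {name.strip().lower(): name for name in fieldnames}
    for candidate in CANDIDATE_COLUMNS:
        if candidate in normalized:
            return normalized[candidate]
    # With a single column there is no ambiguity, so accept it under any name.
    if len(fieldnames) == 1:
        return fieldnames[0]
    raise ValueError(f"No recognizable text column in {list(fieldnames)}")


def read_emails(csv_file):
    """Read all non-empty email texts from an open CSV file object."""
    reader = csv.DictReader(csv_file)
    if not reader.fieldnames:
        raise ValueError("The CSV file has no header row")
    column = find_text_column(reader.fieldnames)
    return [row[column] for row in reader if row[column]]


if __name__ == "__main__":
    sample = io.StringIO("id,Message\n1,Hello there\n2,Win cash now\n")
    print(read_emails(sample))
```

Matching on a lowercased header keeps the lookup tolerant of files that use `Message`, `TEXT`, or similar variants, while a file with several unrecognized columns fails loudly instead of classifying the wrong field.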

How It Works

Preprocessing Pipeline

1. Convert text to lowercase.
2. Remove special characters and digits.
3. Tokenize and filter stopwords using NLTK.
4. Lemmatize tokens to reduce words to their base form.
5. Transform preprocessed text using TF-IDF vectorization.

Models

- Naive Bayes: Probabilistic model based on word frequency.
- Support Vector Machines (SVM): Linear classifier optimized for high-dimensional spaces.
- Random Forest: Ensemble model aggregating decisions from multiple decision trees.

Output

- Displays predictions from all three models.
- Provides a final classification based on majority voting across the models.

Acknowledgments

Libraries Used: NLTK, Scikit-learn, Pandas, NumPy, Gradio
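The pipeline, models, and majority vote described above can be sketched end to end as follows. This is an illustration under stated assumptions, not the code in `spamham.py`: the stopword set is a small stand-in for NLTK's English list, lemmatization is omitted so the example runs without downloading NLTK data, `LinearSVC` and `MultinomialNB` are assumed as the SVM and Naive Bayes variants, and the training emails are made up for the demo.

```python
import re
from collections import Counter

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

# Small stand-in stopword set (assumption); the project filters with NLTK's list.
STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on",
             "for", "you", "your", "at", "we", "it"}

# Made-up toy data, for illustration only.
TRAIN_TEXTS = [
    "Win a FREE prize now, click here!!!",
    "Cheap loans, claim your cash reward today",
    "Urgent: you won $1000, claim the prize",
    "Meeting moved to 3pm, see the agenda attached",
    "Can we review the project report tomorrow?",
    "Lunch on Friday with the team",
]
TRAIN_LABELS = ["spam", "spam", "spam", "ham", "ham", "ham"]


def preprocess(text):
    """Lowercase, drop special characters and digits, filter stopwords."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    # Lemmatization would follow here in the real pipeline; omitted in this sketch.
    return " ".join(tok for tok in text.split() if tok not in STOPWORDS)


def train_models(texts, labels):
    """Fit one TF-IDF vectorizer and the three classifiers on the same features."""
    vectorizer = TfidfVectorizer()
    features = vectorizer.fit_transform([preprocess(t) for t in texts])
    models = {
        "naive_bayes": MultinomialNB(),
        "svm": LinearSVC(),
        "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    }
    for model in models.values():
        model.fit(features, labels)
    return vectorizer, models


def majority_vote(predictions):
    """Return the label predicted by the most models."""
    return Counter(predictions).most_common(1)[0][0]


def classify(text, vectorizer, models):
    """Return each model's prediction plus the majority-vote result."""
    features = vectorizer.transform([preprocess(text)])
    result = {name: str(m.predict(features)[0]) for name, m in models.items()}
    result["majority_vote"] = majority_vote(list(result.values()))
    return result


if __name__ == "__main__":
    vectorizer, models = train_models(TRAIN_TEXTS, TRAIN_LABELS)
    print(classify("Claim your free cash prize now", vectorizer, models))
```

With three voters and two classes a tie cannot occur, so the majority vote always resolves to a single label.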

Folders and files (4 commits)

- README.md
- requirements.txt
- spamham.py
