🔍 AuthentiText: Detecting AI-Generated Text Using ML & Feature Engineering

With the rise of large language models (LLMs), distinguishing AI-generated text from human writing has become critical, especially in academia, journalism, and other settings where content authenticity matters. AuthentiText is a machine learning project that addresses this challenge with a classification pipeline that labels text as AI-generated or human-written.

We experimented with multiple approaches:

✅ Baseline models like logistic regression, which struggled with the complexity of the data.
🧠 Transformer models (BERT) and Random Forest ensembles, which showed promise but lacked robustness.
🧪 Named entity recognition and high-impact feature engineering, with which we reached ~97% accuracy using a Random Forest model enriched with linguistic and semantic features such as sentiment polarity, vocabulary richness, named entities, and readability metrics.
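
To make the feature-engineering idea concrete, here is a minimal sketch of how a few of the linguistic features mentioned above (vocabulary richness and readability) could be computed. It uses only the Python standard library. The function names `count_syllables` and `extract_features`, the feature names, and the vowel-group syllable heuristic are assumptions made for illustration; they are not taken from `text_processing_functions.py`, and the project's actual feature set may differ.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count runs of consecutive vowels.

    This is a heuristic assumed for this sketch, not a dictionary lookup.
    """
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def extract_features(text: str) -> dict:
    """Compute a few handcrafted linguistic features for one document."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)

    # Guard against empty input so the ratios below never divide by zero.
    if not words or not sentences:
        return {
            "type_token_ratio": 0.0,
            "avg_sentence_length": 0.0,
            "avg_word_length": 0.0,
            "flesch_reading_ease": 0.0,
        }

    n_words = len(words)
    n_sentences = len(sentences)
    n_syllables = sum(count_syllables(w) for w in words)
    unique_words = {w.lower() for w in words}

    return {
        # Vocabulary richness: distinct words divided by total words.
        "type_token_ratio": len(unique_words) / n_words,
        "avg_sentence_length": n_words / n_sentences,
        "avg_word_length": sum(len(w) for w in words) / n_words,
        # Standard Flesch Reading Ease formula, fed by the heuristic syllable count.
        "flesch_reading_ease": 206.835
        - 1.015 * (n_words / n_sentences)
        - 84.6 * (n_syllables / n_words),
    }


if __name__ == "__main__":
    print(extract_features("The cat sat. The cat ran."))
```

Feature dictionaries like these could be turned into numeric rows and passed to a classifier such as scikit-learn's `RandomForestClassifier`, possibly alongside TF-IDF features. Sentiment polarity and named-entity counts would need additional NLP libraries and are left out of this sketch.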

To make the system accessible, we also built a Streamlit web app that highlights the words that most influenced each prediction and provides insights into linguistic patterns, helping users understand why a piece of text is classified a certain way.
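
One way such word highlighting could work is sketched below. It assumes a per-word impact score is already available as a dictionary with lowercase keys (for example, derived from model feature importances). How the real app computes impact is not described here, so `highlight_top_words`, `word_scores`, and `top_k` are hypothetical names used only for illustration.

```python
import re


def highlight_top_words(text: str, word_scores: dict, top_k: int = 3) -> str:
    """Wrap the top_k highest-scoring words that occur in the text in ** markers.

    word_scores maps lowercase words to an impact score (an assumed input).
    """
    present = {w.lower() for w in re.findall(r"[A-Za-z']+", text)}

    # Rank only the scored words that actually appear in this text.
    ranked = sorted(
        (w for w in word_scores if w in present),
        key=lambda w: word_scores[w],
        reverse=True,
    )
    top = set(ranked[:top_k])

    def mark(match: re.Match) -> str:
        word = match.group(0)
        return f"**{word}**" if word.lower() in top else word

    # Rewrite word by word so punctuation and spacing are preserved.
    return re.sub(r"[A-Za-z']+", mark, text)


if __name__ == "__main__":
    scores = {"moreover": 0.9, "robust": 0.5, "clear": 0.1}
    print(highlight_top_words("Moreover, the results are robust and clear.", scores, top_k=2))
```

In a Streamlit app, the returned string could be rendered with `st.markdown`, which displays the `**...**` spans in bold.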

🖥️ Live Demo: https://appentitext-nffbbtxqlncjazrkh7d53p.streamlit.app/

Repository files:

AuthentiText_project_report.pdf
FullData_RFPipeline (3).ipynb
README.md
app.py
bert200-with-features (1).ipynb
prediction.py
random_forest_model.pkl
requirements.txt
scaler.pkl
text_processing_functions.py
tfidf_vectorizer.pkl

Repository: rayapudisaiakhil/AuthentiText
