This repository contains the source code for detecting whether an image is AI-generated or human-generated. The project includes several architectures that extract and classify image features using deep learning and machine learning techniques. Our primary baseline is a custom XGBoost pipeline, demonstrated in our Jupyter notebook.
- Project Overview
- Dataset
- Setup and Installation
- How to Run the Project
- Models and Architectures
- Preprocessing
- Repository Structure
## Project Overview

The goal of this project is to distinguish between real, human-generated images and AI-generated images using a binary classification approach. The detection is based on a feature extraction step followed by classification. Our experiments include:
- Deep Learning Models:
  Convolutional Neural Networks (CNNs), Transformers, and hybrid models. Detailed architectures can be found in the `src/models/architectures` folder.
- XGBoost as Baseline:
  A custom XGBoost pipeline, using ResNet-50 for feature extraction and PCA for dimensionality reduction, serves as our base model for comparison against the deep learning methods.
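As a sketch of the baseline's dimensionality-reduction step (the component count and feature sizes here are illustrative assumptions, not values taken from the notebook), 2048-dimensional ResNet-50 embeddings can be reduced with PCA before an XGBoost classifier is fit:

```python
import numpy as np

# Stand-in for ResNet-50 penultimate-layer features:
# 200 images x 2048-dim embeddings (random here, for illustration only).
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 2048))

def pca_reduce(X, n_components):
    """Project X onto its top principal components via SVD."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T

reduced = pca_reduce(features, n_components=50)
print(reduced.shape)  # (200, 50)

# The baseline would then fit XGBoost on the reduced features, e.g.:
# xgboost.XGBClassifier().fit(reduced, labels)
```

The notebook's actual pipeline may use a different PCA implementation and component count; this only shows the shape of the computation.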
## Dataset

We use the Detect AI vs Human-Generated Images Dataset from Kaggle:
- Image Count: ~78,000 images (balanced between AI-generated and human-generated images)
- Resolution: 768x512 pixels
- Content: Images include a diverse range such as human faces, art, buildings, food, and plants.
*(Sample images: AI-generated vs. human-generated.)*
Download and extract the dataset into the `dataset/` folder.
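One plausible layout after extraction (the subfolder names below are an assumption for illustration; match them to however the notebooks actually read the data):

```
dataset/
├── train/
│   ├── ai/      # AI-generated images
│   └── human/   # human-generated images
└── test/
    ├── ai/
    └── human/
```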
## Setup and Installation

- **Clone the repository**

  ```bash
  git clone https://github.com/IamShrijan/GeneratedImageDetector.git
  cd GeneratedImageDetector
  ```

- **Create a virtual environment** (`conda` recommended)

  ```bash
  conda create --name human_ai python=3.10
  conda activate human_ai
  ```

- **Install dependencies**

  ```bash
  pip install -r requirements.txt
  ```
You're ready to go!
## How to Run the Project

- **Load trained models**
  Load the trained models to avoid training them from scratch. The models can be found here. Download all of the models and place them in `trained_models/` for smooth functioning.

- **Run notebooks**
  The current implementation includes five models: XGBoost, ResNet, Deep CNN, Hybrid Classifier, and Vision Transformer. The training code for each model is in `notebooks/`. To test a model, run its entire notebook, skipping the training cell to avoid long training times.
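Before launching the notebooks, it can help to verify that every expected checkpoint is present in `trained_models/`. A small sketch of such a check (the model names in `EXPECTED` are hypothetical; adjust them to the actual downloaded filenames):

```python
from pathlib import Path

# Hypothetical checkpoint names -- not the repository's actual filenames.
EXPECTED = ["xgboost", "resnet", "deep_cnn", "hybrid", "vit"]

def missing_checkpoints(model_dir, expected=EXPECTED):
    """Return the expected model names with no matching file in model_dir."""
    model_dir = Path(model_dir)
    present = {p.stem.lower() for p in model_dir.glob("*") if p.is_file()}
    return [name for name in expected if name not in present]

# Example: missing_checkpoints("trained_models/") -> [] when all are in place.
```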
## Models and Architectures

The repository offers multiple architectures to experiment with on this detection task:

- **XGBoost Model:** Our base model, a custom XGBoost pipeline, located in `xgboost_classifer.ipynb`.
- **CNN Model:** Implemented in `cnn_models.py` to extract deep features from images.
- **Hybrid Classifier:** Combines CNN layers with traditional feature extraction in `hybrid.py`.
- **ResNet:** A residual network architecture found in `resNet.py`.
- **Vision Transformer:** Leverages transformer attention mechanisms in `vit.py`.
## Preprocessing

Before feeding the images to our models, we apply several preprocessing steps:

- **Padding/Cropping:** Ensures images have a consistent size.
- **Normalization:** Pixel values are scaled to [0, 1] or [-1, 1] to improve model training.
- **Shuffling:** Randomizes the order of image pairs to prevent positional bias.
- **Noise Addition:** Noise is added to simulate realistic image conditions, particularly for human-generated images.
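A minimal NumPy sketch of these steps (the target size, noise level, and the [0, 1] scaling choice are illustrative assumptions; the actual pipeline lives in the repository's preprocessing code):

```python
import numpy as np

rng = np.random.default_rng(42)

def preprocess(img, target_hw=(512, 768)):
    """Center on the top-left: crop if larger, zero-pad if smaller,
    then scale uint8 pixels to [0, 1]."""
    th, tw = target_hw
    img = img[:th, :tw]                       # crop any excess
    pad_h = th - img.shape[0]
    pad_w = tw - img.shape[1]
    img = np.pad(img, ((0, pad_h), (0, pad_w), (0, 0)))  # pad any shortfall
    return img.astype(np.float32) / 255.0     # normalize to [0, 1]

def add_noise(img, sigma=0.02):
    """Add Gaussian noise to simulate realistic sensor conditions."""
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

# Demo on a random uint8 "image" smaller than the target size.
raw = rng.integers(0, 256, size=(480, 640, 3), dtype=np.uint8)
out = add_noise(preprocess(raw))
print(out.shape)  # (512, 768, 3)

# Shuffling: randomize sample order to avoid positional bias.
indices = rng.permutation(10)
```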

