# Visual Generative AI Application Portfolio

This repository contains a three-part series of projects developed for the Specialist Diploma in Applied Generative AI (SDGAI). The collection explores advanced Generative Adversarial Networks (GANs) and computer vision techniques for object counting.
## 📁 Project Structure

### Part 1: Unconditional WGAN-GP

- Generates images across 10 distinct classes using a Wasserstein GAN with Gradient Penalty.
- Focuses on learning the overall distribution of a dataset without explicit labels during training.

### Part 2: Conditional WGAN-GP

- Implements a conditional GAN to synthesize specific hand-sign letters.
- Features targeted fine-tuning to improve the visual fidelity of "blurry" or low-quality classes.

### Part 3: Adaptive Object Counting

- A smart, queryable system that counts objects using natural language prompts (e.g., "count vehicles").
- Utilizes pre-trained models such as YOLOv8 and Grounding DINO + SAM for zero-shot detection and counting.
```
.
├── ASG_Part1_NathanOngKeeWee.ipynb
├── ASG_Part2_NathanOngKeeWee.ipynb
├── ASG_Part3_NathanOngKeeWee.ipynb
├── requirements.txt
├── README.md
├── Dataset/
│   ├── A/   # Example class folder
│   ├── B/
│   ├── ...  # Continues for all 24 static ASL letters (A-Y, excluding J and Z)
│   └── Y/
└── outputs/
```
## 📂 Data Setup

To run the GAN notebooks (Parts 1 & 2), organize your data as follows:

1. Create a folder named `Dataset/` in the root directory.
2. Inside `Dataset/`, create subfolders named `A`, `B`, `C`, etc. (24 classes in total, excluding J and Z).
3. Place the respective ASL alphabet images into these folders.
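As a sanity check before training, the expected class layout can be verified programmatically. The sketch below is not part of the notebooks; it simply encodes the 24 static ASL letters (A-Y, excluding J and Z, which require motion) and reports any missing class folders:

```python
import string
from pathlib import Path

# The 24 static ASL letters: A-Y, excluding J and Z (both require motion).
EXPECTED_CLASSES = [c for c in string.ascii_uppercase if c not in ("J", "Z")]

def missing_class_folders(root="Dataset"):
    """Return the expected class subfolders that are absent under `root`."""
    root = Path(root)
    return [c for c in EXPECTED_CLASSES if not (root / c).is_dir()]
```

Running `missing_class_folders()` from the repository root should return an empty list once the dataset is in place.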
## 🛠️ Setup Instructions

To run these notebooks, you will need a Python environment (3.10+ recommended) and a CUDA-capable GPU for efficient GAN training.

1. **Clone the repository**

   ```bash
   git clone https://github.com/YourUsername/YourRepoName.git
   cd YourRepoName
   ```

2. **Install dependencies**

   The projects rely on PyTorch, Torchvision, and Ultralytics (for YOLOv8). Install the required libraries with:

   ```bash
   pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
   pip install tqdm numpy scipy matplotlib pillow ultralytics
   ```

3. **Prepare the dataset**

   - Parts 1 & 2: place your image data in a folder named `./Dataset`, with one subdirectory per class (e.g., `./Dataset/A/`, `./Dataset/B/`, etc.).
   - Part 3: requires the `ultralytics` package for YOLOv8 weights, and optionally the Grounding DINO weights if using advanced adaptive counting.
## 🚀 Module Highlights

### Part 1: Unconditional Image Generation

The goal is to generate a grid of 10 representative images, one for each class, from a GAN trained unconditionally.

**Challenge:** Because the GAN is unconditional, it does not guarantee that every class will be generated. A separate classifier is therefore used to sort generated images into predicted class folders for selection.
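The selection step can be sketched as follows. Here `classify` is a hypothetical stand-in for the separate classifier, assumed to return a predicted class index and a confidence score per sample; the notebook's actual implementation may differ:

```python
def select_per_class(samples, classify):
    """For each predicted class, keep the sample the classifier scores highest.

    `classify(sample)` is assumed to return (class_index, confidence).
    Classes the unconditional GAN never produced are simply absent from the
    result, which is exactly the failure mode described above.
    """
    best = {}
    for sample in samples:
        cls, conf = classify(sample)
        if cls not in best or conf > best[cls][1]:
            best[cls] = (sample, conf)
    return {cls: sample for cls, (sample, _) in best.items()}
```

Any class missing from the returned dict signals that more samples need to be generated before the 10-image grid can be assembled.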
### Part 2: Controlled Letter Synthesis

This module uses labels during training to allow generation of specific letters.

- **Technique:** Enforces a Lipschitz constraint via a Gradient Penalty (GP) for stable training.
- **Evaluation:** Includes FID (Fréchet Inception Distance) scores and manual checkpoint selection based on visual clarity.
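The gradient penalty term can be sketched in PyTorch as below. This is a minimal illustration of the standard WGAN-GP formulation (penalizing the critic's gradient norm at points interpolated between real and fake batches), not the notebook's exact code; `D` denotes the critic:

```python
import torch

def gradient_penalty(D, real, fake, device="cpu"):
    """Standard WGAN-GP penalty: ((||grad D(x_hat)||_2 - 1) ** 2).mean()."""
    batch = real.size(0)
    # Random interpolation point between each real/fake pair
    eps = torch.rand(batch, 1, 1, 1, device=device)
    interp = (eps * real + (1 - eps) * fake).requires_grad_(True)
    scores = D(interp)
    grads = torch.autograd.grad(
        outputs=scores, inputs=interp,
        grad_outputs=torch.ones_like(scores),
        create_graph=True, retain_graph=True,
    )[0]
    grads = grads.view(batch, -1)
    return ((grads.norm(2, dim=1) - 1) ** 2).mean()
```

During training, this term is scaled by a coefficient (commonly 10) and added to the critic loss.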
### Part 3: Natural Language Object Counting

This module provides a Gradio-based interface where users can upload an image and type what they want to count.

- **Logic:** The system filters YOLOv8 detections based on the user's text prompt and returns an annotated image with the final count.
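The filtering logic can be sketched as below. The alias table and detection format are illustrative assumptions (YOLOv8 actually returns `Results` objects; here each detection is simplified to a dict carrying its predicted COCO class name):

```python
# Hypothetical alias table mapping free-text prompts to COCO class names;
# the notebook's actual mapping may differ.
PROMPT_ALIASES = {
    "vehicles": {"car", "truck", "bus", "motorcycle", "bicycle"},
    "people": {"person"},
}

def count_matching(detections, prompt):
    """Filter detections by a text prompt and return (matches, count).

    Each detection is assumed to be a dict whose "name" key holds the
    predicted COCO class name.
    """
    key = prompt.lower().strip()
    # Fall back to treating the prompt itself as a class name
    wanted = PROMPT_ALIASES.get(key, {key})
    matches = [d for d in detections if d["name"] in wanted]
    return matches, len(matches)
```

The returned matches are what the interface would annotate on the image, with the count displayed alongside.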
## 👤 Author

Nathan Ong Kee Wee

Developed as part of the Specialist Diploma in Applied Generative AI (SDGAI).