This project is an advanced AI vision tool that leverages GroundingDINO for open-vocabulary object detection. Unlike traditional models restricted to fixed class lists, it allows users to input natural language prompts (e.g., “dog, laptop, artwork”) and dynamically detect those objects in any image.
Detected regions are highlighted with bounding boxes, annotated, and automatically saved for review. The pipeline is built with PyTorch and OpenCV, supporting multiple model backbones (SwinT, SwinB, SwinL) to balance accuracy and performance. Outputs are neatly organized into a dedicated folder, and detection thresholds can be tuned to trade off precision against recall.
With organized input/output handling and customizable detection thresholds, the system provides a flexible solution for research, dataset generation, and real-world image analysis across virtually unlimited object categories.
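Under the hood, the processing flow is: load a GroundingDINO backbone, preprocess the input image, run prompt-conditioned detection, draw labelled boxes, and write the annotated copy. A minimal sketch of that flow using GroundingDINO's `groundingdino.util.inference` helpers is shown below; the config filename, image path, prompt, and threshold values are placeholders, and the actual code in `main.py` may be organized differently.

```python
import os
import cv2
from groundingdino.util.inference import load_model, load_image, predict, annotate

# Placeholder paths: point these at your local config file and downloaded weights.
model = load_model(
    "groundingdino/config/GroundingDINO_SwinB_cfg.py",
    "weights/groundingdino_swinb_cogcoor.pth",
    device="cpu",
)

# Load and preprocess an input image from the images/ folder.
image_source, image = load_image("images/bedroom.jpg")

# Detect everything named in the comma-separated prompt.
boxes, logits, phrases = predict(
    model=model,
    image=image,
    caption="bed, plant",
    box_threshold=0.35,
    text_threshold=0.25,
    device="cpu",
)

# Draw labelled bounding boxes and save the annotated copy.
os.makedirs("images/outputs", exist_ok=True)
annotated = annotate(image_source=image_source, boxes=boxes, logits=logits, phrases=phrases)
cv2.imwrite("images/outputs/bedroom_output.jpg", annotated)
```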
- Built with PyTorch + OpenCV – reliable, widely supported libraries
- Detect objects in any image with text prompts
- No fixed class list, meaning it detects anything you can describe
- Supports multiple backbones:
  - SwinT (Tiny) → fast and lightweight
  - SwinB (Base) → balanced
  - SwinL (Large) → most accurate
- Results saved automatically as `[filename]_output.jpg` into `images/outputs/`
Example prompt: `bed, plant`
If you run into any problems, scroll down to ⚠️ Notes.
- Clone this repository:
  - `git clone https://github.com/Filip-2002/AI-prompt-image-analyser.git`
  - `cd AI-prompt-image-analyser`
- Create and activate a virtual environment:
  - Windows: `python -m venv .venv`, then `.\.venv\Scripts\activate`
  - macOS/Linux: `python3 -m venv .venv`, then `source .venv/bin/activate`
- Install dependencies: `pip install -r requirements.txt`
- Download model weights:
  - The default backbone is SwinB (Base). Download the weight file and place it inside the `weights/` folder: https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha2/groundingdino_swinb_cogcoor.pth
  - If you have trouble running the Base model, you can download SwinT instead (smallest, least accurate, fastest).
  - If your machine is powerful and can handle a bigger model, you can download SwinL (largest, most accurate, slowest).
  - See ⚠️ Notes for the other weight download links.
- Run: `python main.py`
- Place your images in the `images/` folder. Outputs will be saved automatically in `images/outputs/` (the output naming is sketched below).
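The output naming convention can be sketched as follows, assuming `main.py` simply walks the `images/` folder; the `IMAGES_DIR`/`OUTPUT_DIR` names and the glob pattern are illustrative, not the project's actual code:

```python
from pathlib import Path

IMAGES_DIR = Path("images")            # input folder shipped with the repo
OUTPUT_DIR = IMAGES_DIR / "outputs"    # annotated results land here
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)

for image_path in IMAGES_DIR.glob("*.jpg"):
    # Detection and annotation for image_path would happen here.
    out_path = OUTPUT_DIR / f"{image_path.stem}_output.jpg"
    print(f"{image_path.name} -> {out_path}")
```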
## ⚠️ Notes

- If doing multiple prompts for one image, separate them with commas (e.g., `dog, window, tree, door`).
- To switch between backbones (SwinT/B/L), update `CONFIG_PATH` and `WEIGHTS_PATH` in `main.py` (see the configuration sketch after this list).
- Download pretrained weights from https://huggingface.co/ShilongLiu/GroundingDINO/tree/main and move them into the `weights/` folder:
  - SwinT (Tiny) - `groundingdino_swint_ogc.pth`
  - SwinB (Base) - `groundingdino_swinb_cogcoor.pth`
  - SwinL (Large) - `groundingdino_swinl_cogcoor.pth`
- By default the project runs on CPU. If you have a CUDA-enabled GPU with PyTorch installed, you can set `device = "cuda"` in `main.py` for faster performance.
- You can adjust `box_threshold` and `text_threshold` in `main.py` to control detection accuracy:
  - Lower values - more detections, but more false positives.
  - Higher values - fewer detections, but more precise.
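Putting the notes above together, the tunable settings near the top of `main.py` might look like the sketch below; `CONFIG_PATH`, `WEIGHTS_PATH`, `device`, `box_threshold`, and `text_threshold` are the names referenced above, while the config filenames and numeric values are assumptions to adjust for your own setup:

```python
import torch

# Backbone selection: CONFIG_PATH and WEIGHTS_PATH must be a matching pair.
# The config filenames are the ones bundled with GroundingDINO and may differ
# in your install; the .pth files are the ones placed in weights/.
CONFIG_PATH = "groundingdino/config/GroundingDINO_SwinB_cfg.py"   # e.g. GroundingDINO_SwinT_OGC.py for SwinT
WEIGHTS_PATH = "weights/groundingdino_swinb_cogcoor.pth"          # e.g. weights/groundingdino_swint_ogc.pth for SwinT

# Use the GPU when a CUDA-enabled PyTorch build can see one, otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Detection thresholds: lower values -> more detections but more false positives,
# higher values -> fewer, more precise detections.
box_threshold = 0.35
text_threshold = 0.25
```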