An object detection system that identifies and classifies welding joints and weld-related regions in photographs using YOLOv8, with a web application interface for easy use.
Ideal target (5 joint types): Butt, T-Joint, Lap, Corner, Edge. In practice, public datasets often provide seam or defect/quality labels. This repo supports:
- Training on merged Roboflow datasets (9 classes: Good/Bad Welding, Crack, Porosity, Spatters, seam, defect, flame, etc.).
- Optional joint-type classifier so that when the detector only has a “seam” class, the app can show Butt / T-Joint / Lap.
Reference (5 joint types):
| ID | Joint Type | Description |
|---|---|---|
| 0 | Butt Joint | Two pieces end-to-end in the same plane |
| 1 | T-Joint | Perpendicular pieces forming a T shape |
| 2 | Lap Joint | Overlapping pieces welded at edges |
| 3 | Corner Joint | Pieces meeting at a corner (L shape) |
| 4 | Edge Joint | Parallel edges aligned and welded |
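The five-class reference above can be kept as a small lookup table in code. A sketch only: the authoritative class names and IDs for a trained model come from its dataset.yaml, which may differ from this ideal mapping:

```python
# Reference mapping for the 5 ideal joint types (IDs as in the table above).
# Note: the class list actually used for training is defined in dataset.yaml
# and may differ (e.g. the 9-class merged dataset).
JOINT_TYPES = {
    0: "Butt Joint",
    1: "T-Joint",
    2: "Lap Joint",
    3: "Corner Joint",
    4: "Edge Joint",
}


def joint_name(class_id: int) -> str:
    """Return the joint-type name for a class ID, or 'Unknown' if unmapped."""
    return JOINT_TYPES.get(class_id, "Unknown")
```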
Documentation: See docs/PROJECT_SUMMARY.md for a full description of everything implemented (data, training, web app) and docs/README.md for the docs index.
```
Joint_Classifier/
├── data/
│   ├── raw/                      # Scraped images (by joint type)
│   ├── annotations/              # Roboflow exports (single or roboflow_all)
│   ├── yolo_dataset/             # Single-project YOLO dataset
│   └── yolo_dataset_merged/      # Merged multi-project dataset (9 classes)
│       ├── images/{train,val,test}/
│       ├── labels/{train,val,test}/
│       └── dataset.yaml
├── scraper/
│   ├── scrape_images.py          # Web image collection
│   ├── validate_images.py        # Image validation & cleanup
│   ├── download_roboflow.py      # Roboflow download (single or --all-welding)
│   ├── merge_roboflow.py         # Merge multiple Roboflow exports
│   ├── sources.py                # Search queries & sources
│   └── config.json               # Scraping configuration
├── src/
│   ├── train_yolo.py             # YOLOv8 training
│   ├── train_joint_classifier.py # Optional 3-class joint classifier
│   ├── evaluate.py               # Model evaluation
│   ├── inference.py              # Detection API (WeldingJointDetector)
│   ├── prepare_dataset.py        # Dataset preparation
│   └── annotation_helper.py      # Annotation utilities
├── models/
│   └── runs/                     # Training runs (e.g. weld_merged)
├── webapp/
│   ├── app.py                    # FastAPI app (detect + training-data browser)
│   ├── templates/
│   └── static/
├── docs/                         # Documentation (see docs/README.md)
├── requirements.txt
├── Dockerfile
├── docker-compose.yml
├── .gitignore
└── README.md
```
```bash
# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate   # Linux/Mac
# venv\Scripts\activate    # Windows

# Install packages
pip install -r requirements.txt
```

Option A: Use existing datasets (recommended first step)
See docs/EXISTING_DATASETS.md for a full list. Note: There is no large public dataset that labels the 5 joint types (Butt, T, Lap, Corner, Edge); most public sets are for defects or bead segmentation. The doc lists what exists and how to get more data.
Roboflow (YOLO-ready):
```bash
# List known welding datasets on Roboflow
python scraper/download_roboflow.py --list

# Download a dataset (get API key from https://app.roboflow.com/settings/api)
export ROBOFLOW_API_KEY=your_key
python scraper/download_roboflow.py --workspace college-izka9 --project welding-jointb --version 1 --output data/annotations/roboflow_export
python src/prepare_dataset.py --from-export data/annotations/roboflow_export
```

Roboflow – multiple welding projects (recommended for more data):
```bash
# Get your key from https://app.roboflow.com/settings/api (Account → Roboflow Keys)
# Then set it (replace YOUR_ACTUAL_KEY with the key from the dashboard):
export ROBOFLOW_API_KEY=YOUR_ACTUAL_KEY
python scraper/download_roboflow.py --all-welding
```

This downloads three projects into data/annotations/roboflow_all/:
- welding_jointb (64 images, seam)
- welding_seam (65 images, seam)
- weld_quality (3.7k images, 6 classes: Good Welding, Bad Welding, Crack, Porosity, etc.)
Then merge them into one dataset and train:

```bash
python scraper/merge_roboflow.py --input data/annotations/roboflow_all --output data/yolo_dataset_merged
python src/train_yolo.py --data data/yolo_dataset_merged/dataset.yaml
```

Note: The merge script writes an absolute path in dataset.yaml. After cloning on another machine, edit data/yolo_dataset_merged/dataset.yaml and set path to a relative path (e.g. data/yolo_dataset_merged) or the project root so training works from any checkout.
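The dataset.yaml edit described in the note can also be scripted. A minimal sketch, assuming PyYAML is installed (it ships with the Ultralytics stack) and using the standard Ultralytics `path` key; the function name `make_dataset_portable` is illustrative, not part of the repo:

```python
# Sketch: rewrite the absolute 'path' entry that merge_roboflow.py writes
# into dataset.yaml so the merged dataset works from any checkout.
# Assumes PyYAML is available; run from the project root.
from pathlib import Path

import yaml


def make_dataset_portable(yaml_file: str, rel_path: str = "data/yolo_dataset_merged") -> None:
    """Replace the absolute 'path' value with a path relative to the project root."""
    p = Path(yaml_file)
    cfg = yaml.safe_load(p.read_text())
    cfg["path"] = rel_path  # other keys (train/val/test, names) are left untouched
    p.write_text(yaml.safe_dump(cfg, sort_keys=False))
```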
Other public datasets (RIAWELC, Mendeley, IEEE WELD):
```bash
python scraper/download_public_datasets.py --instructions   # Print download links
python scraper/download_public_datasets.py --riawelc        # Clone RIAWELC (24k X-ray images)
```

Option B: Scrape from the web
```bash
# Download images for all joint types
python scraper/scrape_images.py

# Download for a specific joint type
python scraper/scrape_images.py --joint t_joint

# Limit downloads per query
python scraper/scrape_images.py --limit 30

# Validate downloaded images
python scraper/scrape_images.py --validate-only

# Check dataset summary
python scraper/scrape_images.py --summary
```

Images are saved to data/raw/<joint_type>/.
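If you want a quick per-class count without running the scraper's `--summary` flag, a few lines of standard-library Python will do. This is a sketch that only assumes the `data/raw/<joint_type>/` layout described above; `raw_summary` is an illustrative helper, not part of the repo:

```python
# Sketch: count scraped images per joint type under data/raw/.
# Assumes the scraper's layout of one subdirectory per joint type.
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}


def raw_summary(raw_dir: str = "data/raw") -> dict:
    """Return {joint_type: image_count} for each subdirectory of raw_dir."""
    root = Path(raw_dir)
    if not root.is_dir():
        return {}
    return {
        d.name: sum(1 for f in d.iterdir() if f.suffix.lower() in IMAGE_EXTS)
        for d in sorted(root.iterdir())
        if d.is_dir()
    }
```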
You need to draw bounding boxes around welding joints in each image.
Option A: Label Studio (Recommended)
```bash
pip install label-studio
label-studio start

# See setup guide:
python src/annotation_helper.py guide
```

Option B: Roboflow (Cloud-based)
- Go to roboflow.com and create a free account
- Upload images from `data/raw/`
- Annotate with bounding boxes using 5 classes
- Export in YOLOv8 format
- Place exported files in `data/annotations/`
Option C: CVAT
Use CVAT for local annotation, then export in YOLO format.
```bash
# From manually organized annotations
python src/prepare_dataset.py

# From annotation tool export
python src/prepare_dataset.py --from-export data/annotations/export

# Custom split ratios
python src/prepare_dataset.py --train 0.8 --val 0.15 --test 0.05

# Check annotation statistics
python src/annotation_helper.py stats data/annotations
```

```bash
# Default training (YOLOv8m, 100 epochs)
python src/train_yolo.py

# Custom settings
python src/train_yolo.py --model m --epochs 100 --batch 16 --device 0

# Smaller model for faster training
python src/train_yolo.py --model s --epochs 50

# CPU training (slower)
python src/train_yolo.py --model s --device cpu

# Resume interrupted training
python src/train_yolo.py --resume models/runs/welding_joint_detector/weights/last.pt
```

Model sizes:
| Size | Params | Speed | Accuracy | GPU Memory |
|---|---|---|---|---|
| n | 3.2M | Fastest | Lower | ~4GB |
| s | 11.2M | Fast | Good | ~6GB |
| m | 25.9M | Medium | Better | ~8GB |
| l | 43.7M | Slower | High | ~10GB |
| x | 68.2M | Slowest | Highest | ~12GB |
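The size/memory trade-off above can be encoded as a simple helper for scripted runs. A sketch using the approximate GPU-memory figures from the table; `pick_model_size` is an illustrative function, not part of the repo:

```python
# Sketch: choose the largest YOLOv8 size that fits in the available GPU
# memory, using the approximate figures from the table above.
def pick_model_size(gpu_mem_gb: float) -> str:
    """Return 'n'/'s'/'m'/'l'/'x' for use with train_yolo.py --model."""
    for size, mem_needed_gb in [("x", 12), ("l", 10), ("m", 8), ("s", 6), ("n", 4)]:
        if gpu_mem_gb >= mem_needed_gb:
            return size
    return "n"  # fall back to the smallest model (or train on CPU)
```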
```bash
# Evaluate on test set
python src/evaluate.py --model models/runs/welding_joint_detector/weights/best.pt

# Evaluate with visualizations
python src/evaluate.py --model best.pt --visualize data/yolo_dataset/images/test/

# Custom confidence threshold
python src/evaluate.py --model best.pt --conf 0.5
```

```bash
# Start web server
python webapp/app.py --model models/runs/welding_joint_detector/weights/best.pt

# Custom host/port
python webapp/app.py --model best.pt --host 0.0.0.0 --port 8080

# Development mode with auto-reload
python webapp/app.py --model best.pt --reload
```

Open http://localhost:8000 in your browser.
```bash
# Single image
python src/inference.py --model best.pt --source image.jpg

# Directory of images
python src/inference.py --model best.pt --source images_folder/

# Save results as JSON
python src/inference.py --model best.pt --source image.jpg --save-json
```

Detect welding joints in an uploaded image.
Parameters:
- `file` (form-data): Image file (JPEG, PNG, BMP, WebP)
- `confidence` (query, optional): Confidence threshold (0.0-1.0, default: 0.25)
- `iou` (query, optional): IoU threshold for NMS (0.0-1.0, default: 0.45)
Response:
```json
{
  "success": true,
  "num_detections": 2,
  "detections": [
    {
      "class_id": 1,
      "class_name": "T-Joint",
      "confidence": 0.92,
      "bbox": [120.5, 80.3, 450.2, 310.7],
      "bbox_normalized": [0.1875, 0.1672, 0.7034, 0.6473]
    }
  ],
  "annotated_image": "data:image/jpeg;base64,...",
  "inference_time_ms": 45.2
}
```

Health check endpoint.
List all detectable joint types.
Visit http://localhost:8000/docs for Swagger UI documentation.
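A minimal Python client for the detection endpoint might look like the sketch below. It assumes the `requests` package and a `/detect` path whose field names mirror the parameter list above; verify the actual route and parameters against the Swagger UI before relying on it. `summarize` is an illustrative helper for the response shape shown above:

```python
# Sketch of a client for the detection endpoint. Assumes the `requests`
# package and a POST /detect route -- confirm both against /docs.
import requests


def detect(image_path: str, confidence: float = 0.25, iou: float = 0.45,
           base_url: str = "http://localhost:8000") -> dict:
    """Upload an image and return the parsed JSON detection response."""
    with open(image_path, "rb") as f:
        resp = requests.post(
            f"{base_url}/detect",
            params={"confidence": confidence, "iou": iou},
            files={"file": f},
        )
    resp.raise_for_status()
    return resp.json()


def summarize(result: dict) -> list:
    """Format detections like 'T-Joint: 92%' from the response shown above."""
    return [f"{d['class_name']}: {d['confidence']:.0%}"
            for d in result.get("detections", [])]
```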
```bash
# Build and run
docker-compose up --build

# Or without compose
docker build -t welding-detector .
docker run -p 8000:8000 -v ./models:/app/models welding-detector
```

```python
from src.inference import WeldingJointDetector

# Initialize detector
detector = WeldingJointDetector("models/runs/welding_joint_detector/weights/best.pt")

# Detect joints
detections = detector.detect("image.jpg", conf_threshold=0.5)
for det in detections:
    print(f"{det['class_name']}: {det['confidence']:.0%}")

# Detect and visualize
annotated_img, detections = detector.detect_and_visualize("image.jpg")

# Save annotated image
import cv2
cv2.imwrite("result.jpg", annotated_img)
```

| Metric | Minimum | Good | Excellent |
|---|---|---|---|
| mAP50 | >0.70 | >0.85 | >0.90 |
| mAP50-95 | >0.50 | >0.65 | >0.75 |
| Precision | >0.80 | >0.85 | >0.90 |
| Recall | >0.75 | >0.80 | >0.85 |
| Inference | <500ms | <200ms | <100ms |
No GPU detected:
```bash
# Check CUDA availability
python -c "import torch; print(torch.cuda.is_available())"

# Train on CPU (slower)
python src/train_yolo.py --device cpu
```

Out of memory during training:
- Reduce batch size: `--batch 8` or `--batch 4`
- Use a smaller model: `--model s` or `--model n`
- Reduce image size: `--img-size 416`
Low accuracy:
- Collect more training images (target: 200+ per class)
- Review annotation quality
- Try a larger model
- Increase training epochs
- Ensure dataset is balanced
Web app shows "No Model":
- Train a model first
- Specify model path: `python webapp/app.py --model path/to/best.pt`
- Or set environment variable: `export MODEL_PATH=path/to/best.pt`
| Document | Description |
|---|---|
| docs/PROJECT_SUMMARY.md | Full summary of what was built — data collection (scraping, Roboflow download/merge), training (YOLO + optional joint classifier), web app, inference. Use for onboarding or before uploading to GitHub. |
| docs/README.md | Index of all documentation. |
| docs/EXISTING_DATASETS.md | Public welding datasets and download options. |
| docs/NEXT_STEPS.md | What to do after training (evaluate, run app, optional improvements). |
- Do not commit secrets. Use a `.env` file or environment variables for `ROBOFLOW_API_KEY`; `.gitignore` already excludes `.env`.
- Large assets (models, data) are not in the repo. They are listed in `.gitignore` and synced via Google Drive. See docs/DRIVE_SYNC.md for:
  - What to upload to Drive (e.g. `models/runs/`, `data/yolo_dataset_merged/`) using `scripts/drive_upload.py`
  - How to download after clone with `scripts/drive_download.py` or by setting `GDRIVE_FOLDER_ID` when starting the app
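A fail-fast check for the API key can be added to any script that talks to Roboflow. Standard library only; the variable name matches the `ROBOFLOW_API_KEY` convention used above, and `require_api_key` is an illustrative helper, not part of the repo:

```python
# Sketch: fail fast if ROBOFLOW_API_KEY is missing, instead of hitting an
# authentication error halfway through a download. Standard library only.
import os


def require_api_key(name: str = "ROBOFLOW_API_KEY") -> str:
    """Return the key from the environment, or raise with a helpful message."""
    key = os.environ.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it or put it in .env")
    return key
```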
- Full project narrative: See docs/PROJECT_SUMMARY.md for everything that was implemented and how to run it after clone.
This project is for educational and research purposes.