hmazin/Joint_Classifier

Welding Joint Detection System

An object detection system that identifies and classifies welding joints and weld-related regions in photographs using YOLOv8, with a web application interface for easy use.

Ideal target (5 joint types): Butt, T-Joint, Lap, Corner, Edge. In practice, public datasets often provide seam or defect/quality labels. This repo supports:

  • Training on merged Roboflow datasets (9 classes: Good/Bad Welding, Crack, Porosity, Spatters, seam, defect, flame, etc.).
  • An optional joint-type classifier: when the detector only outputs a generic “seam” class, the app can still label joints as Butt / T-Joint / Lap.
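The hand-off from detector to joint-type classifier can be sketched as follows. This is hypothetical glue code: the predict method on the classifier is an assumed interface, not the repo's actual API.

```python
def label_detection(det: dict, crop, joint_classifier=None) -> dict:
    """Refine a generic 'seam' detection into a joint type.

    `joint_classifier.predict(crop)` is an assumed interface returning one of
    'Butt', 'T-Joint', or 'Lap'; without a classifier, or for non-seam
    classes, the detection is returned unchanged.
    """
    if det.get("class_name") == "seam" and joint_classifier is not None:
        det = dict(det, class_name=joint_classifier.predict(crop))
    return det
```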

Reference (5 joint types):

ID  Joint Type    Description
0   Butt Joint    Two pieces end-to-end in the same plane
1   T-Joint       Perpendicular pieces forming a T shape
2   Lap Joint     Overlapping pieces welded at edges
3   Corner Joint  Pieces meeting at a corner (L shape)
4   Edge Joint    Parallel edges aligned and welded
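The reference table maps naturally onto a lookup structure. A hypothetical snippet for turning detector class IDs into display names; the authoritative class list for a trained model comes from its dataset.yaml:

```python
# Reference mapping of the 5 joint types (illustrative; a trained model's
# real class names come from dataset.yaml / model.names).
JOINT_TYPES = {
    0: ("Butt Joint", "Two pieces end-to-end in the same plane"),
    1: ("T-Joint", "Perpendicular pieces forming a T shape"),
    2: ("Lap Joint", "Overlapping pieces welded at edges"),
    3: ("Corner Joint", "Pieces meeting at a corner (L shape)"),
    4: ("Edge Joint", "Parallel edges aligned and welded"),
}

def class_name(class_id: int) -> str:
    """Return the display name for a detector class ID."""
    name, _description = JOINT_TYPES[class_id]
    return name
```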

Documentation: See docs/PROJECT_SUMMARY.md for a full description of everything implemented (data, training, web app) and docs/README.md for the docs index.


Project Structure

Joint_Classifier/
├── data/
│   ├── raw/                          # Scraped images (by joint type)
│   ├── annotations/                  # Roboflow exports (single or roboflow_all)
│   ├── yolo_dataset/                 # Single-project YOLO dataset
│   └── yolo_dataset_merged/          # Merged multi-project dataset (9 classes)
│       ├── images/{train,val,test}/
│       ├── labels/{train,val,test}/
│       └── dataset.yaml
├── scraper/
│   ├── scrape_images.py              # Web image collection
│   ├── validate_images.py            # Image validation & cleanup
│   ├── download_roboflow.py          # Roboflow download (single or --all-welding)
│   ├── merge_roboflow.py             # Merge multiple Roboflow exports
│   ├── sources.py                    # Search queries & sources
│   └── config.json                   # Scraping configuration
├── src/
│   ├── train_yolo.py                 # YOLOv8 training
│   ├── train_joint_classifier.py     # Optional 3-class joint classifier
│   ├── evaluate.py                   # Model evaluation
│   ├── inference.py                  # Detection API (WeldingJointDetector)
│   ├── prepare_dataset.py            # Dataset preparation
│   └── annotation_helper.py          # Annotation utilities
├── models/
│   └── runs/                         # Training runs (e.g. weld_merged)
├── webapp/
│   ├── app.py                        # FastAPI app (detect + training-data browser)
│   ├── templates/
│   └── static/
├── docs/                             # Documentation (see docs/README.md)
├── requirements.txt
├── Dockerfile
├── docker-compose.yml
├── .gitignore
└── README.md

Quick Start

1. Install Dependencies

# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # Linux/Mac
# venv\Scripts\activate   # Windows

# Install packages
pip install -r requirements.txt

2. Collect Images

Option A: Use existing datasets (recommended first step)

See docs/EXISTING_DATASETS.md for a full list. Note: There is no large public dataset that labels the 5 joint types (Butt, T, Lap, Corner, Edge); most public sets are for defects or bead segmentation. The doc lists what exists and how to get more data.

Roboflow (YOLO-ready):

# List known welding datasets on Roboflow
python scraper/download_roboflow.py --list

# Download a dataset (get API key from https://app.roboflow.com/settings/api)
export ROBOFLOW_API_KEY=your_key
python scraper/download_roboflow.py --workspace college-izka9 --project welding-jointb --version 1 --output data/annotations/roboflow_export
python src/prepare_dataset.py --from-export data/annotations/roboflow_export

Roboflow – multiple welding projects (recommended for more data):

# Get your key from https://app.roboflow.com/settings/api (Account → Roboflow Keys)
# Then set it (replace YOUR_ACTUAL_KEY with the key from the dashboard):
export ROBOFLOW_API_KEY=YOUR_ACTUAL_KEY
python scraper/download_roboflow.py --all-welding

This downloads three projects into data/annotations/roboflow_all/:

  • welding_jointb (64 images, seam)
  • welding_seam (65 images, seam)
  • weld_quality (3.7k images, 6 classes: Good Welding, Bad Welding, Crack, Porosity, etc.)

Then merge them into one dataset and train:

python scraper/merge_roboflow.py --input data/annotations/roboflow_all --output data/yolo_dataset_merged
python src/train_yolo.py --data data/yolo_dataset_merged/dataset.yaml

Note: The merge script writes an absolute path in dataset.yaml. After cloning on another machine, edit data/yolo_dataset_merged/dataset.yaml and set path to a relative path (e.g. data/yolo_dataset_merged) or the project root so training works from any checkout.
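That one-line fix can be scripted after each clone. A minimal sketch using plain text substitution; the default path is an assumption about your checkout layout:

```python
# Rewrite the absolute `path:` entry that the merge script wrote into
# dataset.yaml so the dataset resolves from any checkout.
from pathlib import Path

def fix_dataset_path(yaml_file: str, new_path: str = "data/yolo_dataset_merged") -> None:
    p = Path(yaml_file)
    lines = p.read_text().splitlines()
    # Replace only the top-level `path:` line; leave train/val/test keys alone.
    fixed = [f"path: {new_path}" if ln.startswith("path:") else ln for ln in lines]
    p.write_text("\n".join(fixed) + "\n")
```

Run it once per checkout, e.g. fix_dataset_path("data/yolo_dataset_merged/dataset.yaml").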

Other public datasets (RIAWELC, Mendeley, IEEE WELD):

python scraper/download_public_datasets.py --instructions   # Print download links
python scraper/download_public_datasets.py --riawelc       # Clone RIAWELC (24k X-ray images)

Option B: Scrape from the web

# Download images for all joint types
python scraper/scrape_images.py

# Download for a specific joint type
python scraper/scrape_images.py --joint t_joint

# Limit downloads per query
python scraper/scrape_images.py --limit 30

# Validate downloaded images
python scraper/scrape_images.py --validate-only

# Check dataset summary
python scraper/scrape_images.py --summary

Images are saved to data/raw/<joint_type>/.

3. Annotate Images

You need to draw bounding boxes around welding joints in each image.

Option A: Label Studio (Recommended)

pip install label-studio
label-studio start

# See setup guide:
python src/annotation_helper.py guide

Option B: Roboflow (Cloud-based)

  1. Go to roboflow.com and create a free account
  2. Upload images from data/raw/
  3. Annotate with bounding boxes using 5 classes
  4. Export in YOLOv8 format
  5. Place exported files in data/annotations/

Option C: CVAT

Use CVAT for local annotation, then export in YOLO format.

4. Prepare Dataset

# From manually organized annotations
python src/prepare_dataset.py

# From annotation tool export
python src/prepare_dataset.py --from-export data/annotations/export

# Custom split ratios
python src/prepare_dataset.py --train 0.8 --val 0.15 --test 0.05

# Check annotation statistics
python src/annotation_helper.py stats data/annotations

5. Train Model

# Default training (YOLOv8m, 100 epochs)
python src/train_yolo.py

# Custom settings
python src/train_yolo.py --model m --epochs 100 --batch 16 --device 0

# Smaller model for faster training
python src/train_yolo.py --model s --epochs 50

# CPU training (slower)
python src/train_yolo.py --model s --device cpu

# Resume interrupted training
python src/train_yolo.py --resume models/runs/welding_joint_detector/weights/last.pt

Model sizes:

Size  Params  Speed    Accuracy  GPU Memory
n     3.2M    Fastest  Lower     ~4GB
s     11.2M   Fast     Good      ~6GB
m     25.9M   Medium   Better    ~8GB
l     43.7M   Slower   High      ~10GB
x     68.2M   Slowest  Highest   ~12GB

6. Evaluate Model

# Evaluate on test set
python src/evaluate.py --model models/runs/welding_joint_detector/weights/best.pt

# Evaluate with visualizations
python src/evaluate.py --model best.pt --visualize data/yolo_dataset/images/test/

# Custom confidence threshold
python src/evaluate.py --model best.pt --conf 0.5

7. Run Web Application

# Start web server
python webapp/app.py --model models/runs/welding_joint_detector/weights/best.pt

# Custom host/port
python webapp/app.py --model best.pt --host 0.0.0.0 --port 8080

# Development mode with auto-reload
python webapp/app.py --model best.pt --reload

Open http://localhost:8000 in your browser.

8. Run Detection from Command Line

# Single image
python src/inference.py --model best.pt --source image.jpg

# Directory of images
python src/inference.py --model best.pt --source images_folder/

# Save results as JSON
python src/inference.py --model best.pt --source image.jpg --save-json

API Reference

POST /api/detect

Detect welding joints in an uploaded image.

Parameters:

  • file (form-data): Image file (JPEG, PNG, BMP, WebP)
  • confidence (query, optional): Confidence threshold (0.0-1.0, default: 0.25)
  • iou (query, optional): IoU threshold for NMS (0.0-1.0, default: 0.45)
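The iou parameter drives non-maximum suppression: when two boxes of the same class overlap by more than the threshold, only the higher-confidence one is kept. A minimal sketch of the intersection-over-union computation on [x1, y1, x2, y2] boxes:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two [x1, y1, x2, y2] pixel boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# Two heavily overlapping boxes: IoU ≈ 0.68, above the default
# threshold of 0.45, so NMS would keep only one of them.
print(iou([0, 0, 100, 100], [10, 10, 110, 110]))
```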

Response:

{
  "success": true,
  "num_detections": 2,
  "detections": [
    {
      "class_id": 1,
      "class_name": "T-Joint",
      "confidence": 0.92,
      "bbox": [120.5, 80.3, 450.2, 310.7],
      "bbox_normalized": [0.1875, 0.1672, 0.7034, 0.6473]
    }
  ],
  "annotated_image": "data:image/jpeg;base64,...",
  "inference_time_ms": 45.2
}
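A client can consume this response directly; for example, scaling bbox_normalized back to pixel coordinates for any display size. A client-side sketch using field names from the sample payload above, not code from the repo:

```python
def to_pixels(bbox_normalized, width, height):
    """Scale a normalized [x1, y1, x2, y2] box to pixel coordinates."""
    x1, y1, x2, y2 = bbox_normalized
    return [x1 * width, y1 * height, x2 * width, y2 * height]

# Using the detection from the sample response above:
det = {"class_name": "T-Joint", "confidence": 0.92,
       "bbox_normalized": [0.1875, 0.1672, 0.7034, 0.6473]}
if det["confidence"] >= 0.5:
    print(det["class_name"], to_pixels(det["bbox_normalized"], 640, 480))
```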

GET /api/health

Health check endpoint.

GET /api/classes

List all detectable joint types.

Interactive API Docs

Visit http://localhost:8000/docs for Swagger UI documentation.


Docker Deployment

# Build and run
docker-compose up --build

# Or without compose
docker build -t welding-detector .
# docker run -v needs an absolute host path
docker run -p 8000:8000 -v "$(pwd)/models:/app/models" welding-detector

Python API Usage

from src.inference import WeldingJointDetector

# Initialize detector
detector = WeldingJointDetector("models/runs/welding_joint_detector/weights/best.pt")

# Detect joints
detections = detector.detect("image.jpg", conf_threshold=0.5)
for det in detections:
    print(f"{det['class_name']}: {det['confidence']:.0%}")

# Detect and visualize
annotated_img, detections = detector.detect_and_visualize("image.jpg")

# Save annotated image
import cv2
cv2.imwrite("result.jpg", annotated_img)

Performance Targets

Metric     Minimum  Good    Excellent
mAP50      >0.70    >0.85   >0.90
mAP50-95   >0.50    >0.65   >0.75
Precision  >0.80    >0.85   >0.90
Recall     >0.75    >0.80   >0.85
Inference  <500ms   <200ms  <100ms

Troubleshooting

No GPU detected:

# Check CUDA availability
python -c "import torch; print(torch.cuda.is_available())"

# Train on CPU (slower)
python src/train_yolo.py --device cpu

Out of memory during training:

  • Reduce batch size: --batch 8 or --batch 4
  • Use a smaller model: --model s or --model n
  • Reduce image size: --img-size 416

Low accuracy:

  • Collect more training images (target: 200+ per class)
  • Review annotation quality
  • Try a larger model
  • Increase training epochs
  • Ensure dataset is balanced

Web app shows "No Model":

  • Train a model first
  • Specify model path: python webapp/app.py --model path/to/best.pt
  • Or set environment variable: export MODEL_PATH=path/to/best.pt

Documentation

  • docs/PROJECT_SUMMARY.md: Full summary of what was built, covering data collection (scraping, Roboflow download/merge), training (YOLO + optional joint classifier), the web app, and inference. Use for onboarding or before uploading to GitHub.
  • docs/README.md: Index of all documentation.
  • docs/EXISTING_DATASETS.md: Public welding datasets and download options.
  • docs/NEXT_STEPS.md: What to do after training (evaluate, run app, optional improvements).

Uploading to GitHub & Google Drive

  1. Do not commit secrets. Use a .env file or environment variables for ROBOFLOW_API_KEY; .gitignore already excludes .env.
  2. Large assets (models, data) are not in the repo. They are listed in .gitignore and synced via Google Drive. See docs/DRIVE_SYNC.md for:
    • What to upload to Drive (e.g. models/runs/, data/yolo_dataset_merged/) using scripts/drive_upload.py
    • How to download after clone with scripts/drive_download.py or by setting GDRIVE_FOLDER_ID when starting the app
  3. Full project narrative: See docs/PROJECT_SUMMARY.md for everything that was implemented and how to run it after clone.

License

This project is for educational and research purposes.
