IRFusionFormer

IRFusionFormer: Enhancing Pavement Crack Segmentation with RGB-T Fusion and Topological-Based Loss

News

2024/10/13: We release the train and evaluation code.

A Quick Overview

Framework
Visualization

Introduction

This is the official repository of IRFusionFormer. We have established a benchmark in the field of pavement crack segmentation using RGB and infrared images, aiming to provide a platform for fair and comprehensive performance comparisons of asphalt pavement crack segmentation algorithms. Currently, the benchmark includes 1 dataset, 9 algorithms, and 6 evaluation metrics, enabling convenient performance comparisons.

Additionally, we present IRFusionFormer, a hybrid encoder-decoder model specifically designed for pavement crack segmentation. This mechanism introduces the Efficient RGB-T Cross Fusion Module (EGTCF), designed to effectively integrate and capture long-range dependencies in RGB and infrared image features across multiple scales using a cross-attention mechanism. For model supervision, a topology-based loss function is combined with an auxiliary loss function derived from infrared image segmentation results, ensuring the final crack segmentation output meets topological constraints. Our framework outperforms state-of-the-art methods across all six evaluation metrics in the benchmark.

Datasets

The dataset used in this study is an open-source dataset (Github) dedicated to crack detection using Infrared Thermography (IRT), which is included in the RGB-T asphalt pavement crack segmentation benchmark. It comprises four image types: RGB images, infrared images, fused images (combined at a 50:50 ratio using IR-Fusion™ technology), and ground truth images manually annotated using Photoshop. Each category consists of 448 images, each with a resolution of 640x480 pixels. For training and evaluation purposes, the segmentation model divides the entire dataset into two subsets: 358 images for the training set and 90 images for the test set.

Models

Type	Models (Code)	Dice	IoU	Accuracy	Precision	Specificity	Recall
I	MCNet [2]	0.6852	0.5365	0.9589	0.6936	0.9785	0.6914
R	U-net [3]	0.7891	0.6517	0.9794	0.8161	0.9909	0.7639
R	UNet++ [4]	0.8048	0.6733	0.9801	0.7937	0.9887	0.8574
R	DeepLabV3 [5]	0.8338	0.7149	0.9828	0.8134	0.9896	0.8552
R	DeepCrack [6]	0.7406	0.5880	0.9787	0.6592	0.9837	0.8450
R	CrackFormer [7]	0.8489	0.7374	0.9847	0.8462	0.9918	0.8515
IR	CRM_RGBTSeg [8]	0.8450	0.7370	0.9829	0.8651	0.9921	0.8293
IR	CMNeXt [9]	0.8760	0.7794	0.9835	0.8885	0.9921	0.8639
IR	IRFusionFormer (ours)	0.9001	0.8183	0.9899	0.9001	0.9947	0.9001

Quick Start

Prepare the Dataset and Checkpoints

Read the prepare docs

Clone the Repository

https://github.com/sheauhuu/IRFusionFormer.git
cd IRFusionFormer

Installation

This codebase was developed and tested with the following packages.

OS: Ubuntu 22.04 LTS CUDA: 12.1 PyTorch: 2.1.0 Python: 3.10.14 Detectron2: 0.6

You can install the dependencies by running the following command:

pip install -r requirements.txt

!Note:

To train MCNet model, you need to install Apex first.
To trian CRM model, you need to compile CUDA kernel for MSDeformAttn. If you have trouble, refer here

cd CRM/models/mask2former/pixel_decoder/ops/
sh make.sh

Training

MCNet:

! Note: You need to install Apex first.
```
cd MCNet
python train.py -d 0
```
DeepCrack:
```
cd DeepCrack
python train.py
```

CRM_RGBTSeg:

cd CRM_RGBTSeg
python train.py \
    --config-file /path/to/config/file \
    --num-gpus 1 \
    --name IRC_CRM

CMNeXt:

cd CMNeXt
python tools/train_mm.py --cfg /path/to/config/file

IRFusionFormer and Other:

cd IRFusionFormer
python train.py --config /path/to/config/file

Evaluation

MCNet:

cd MCNet
python eval.py \
    -e MCNet.pth \
    -d 0 \
    --save_path /path/to/save

DeepCrack:
```
cd DeepCrack
python test.py
```

CRM_RGBTSeg:

cd CRM_RGBTSeg
python test.py \
	--config-file  \
    --num-gpus 1 \
	--name Eval_CRM \
	--checkpoint /path/to/checkpoint/file

CMNeXt:

cd CMNeXt
python tools/val_mm.py --cfg /path/to/config/file

IRFusionFormer and Other:

cd IRFusionFormer 
python test.py --config /path/to/config/file

References

[1] Liu, Fangyu and Liu, Jian and Wang, Linbing. "Asphalt pavement crack detection based on convolutional neural network and infrared thermography". IEEE Transactions on Intelligent Transportation Systems 23.11 (2022): 22145-22155.

[2] Xiong, Haitao, Wenjie Cai, and Qiong Liu. "MCNet: Multi-level correction network for thermal image semantic segmentation of nighttime driving scene." Infrared Physics & Technology 113 (2021): 103628.

[3] Ronneberger, Olaf, Philipp Fischer, and Thomas Brox. "U-net: Convolutional networks for biomedical image segmentation." Medical image computing and computer-assisted intervention–MICCAI 2015: 18th international conference, Munich, Germany, October 5-9, 2015, proceedings, part III 18. Springer International Publishing, 2015.

[4] Zhou, Zongwei, et al. "Unet++: Redesigning skip connections to exploit multiscale features in image segmentation." IEEE transactions on medical imaging 39.6 (2019): 1856-1867.

[5] Chen, Liang-Chieh, et al. "Encoder-decoder with atrous separable convolution for semantic image segmentation." Proceedings of the European conference on computer vision (ECCV). 2018.

[6] Zou, Qin, et al. "Deepcrack: Learning hierarchical convolutional features for crack detection." IEEE transactions on image processing 28.3 (2018): 1498-1512.

[7] Liu, Huajun, et al. "Crackformer network for pavement crack segmentation." IEEE Transactions on Intelligent Transportation Systems 24.9 (2023): 9240-9252.

[8] Shin, Ukcheol, et al. "Complementary random masking for rgb-thermal semantic segmentation." 2024 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2024.

[9] Zhang, Jiaming, et al. "CMX: Cross-modal fusion for RGB-X semantic segmentation with transformers." IEEE Transactions on intelligent transportation systems (2023).

Cite

Please cite the following paper if you use our work in your research.

@article{irfusionformer,
  title={IRFusionFormer: Enhancing Pavement Crack Segmentation with RGB-T Fusion and Topological-Based Loss},
  author={Xiao, Ruiqiang and Chen, Xiaohu},
  journal={arXiv preprint arXiv:2409.20474},
  year={2024}
}

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
CRM		CRM
DELIVER		DELIVER
DeepCrack		DeepCrack
IRFusionFormer		IRFusionFormer
MCNet		MCNet
img		img
.gitignore		.gitignore
Prepare.md		Prepare.md
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

IRFusionFormer

News

A Quick Overview

Introduction

Datasets

Models

Quick Start

References

Cite

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

IRFusionFormer

News

A Quick Overview

Introduction

Datasets

Models

Quick Start

References

Cite

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages