PV-MM3D: Point-Voxel Parallel Dual-Stream Framework with Dual-Attention Region Adaptive Fusion for Multimodal 3D Object Detection
This is the official code release of PV-MM3D. The code is mainly based on OpenPCDet, with some parts adapted from VirConv, TED, CasA, PENet, and SFD.
The detection frameworks are shown below.
conda create -n pvmm3d python=3.9
conda activate pvmm3d
pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
pip install numpy==1.19.5 protobuf==3.19.4 scikit-image==0.19.2 spconv-cu111 numba scipy pyyaml easydict fire tqdm shapely matplotlib opencv-python addict pyquaternion awscli open3d pandas future pybind11 tensorboardX tensorboard Cython prefetch-generator
Our released implementation has been tested on:
- Ubuntu 20.04
- Python 3.9.13
- PyTorch 1.8.1
- Numba 0.53.1
- Spconv 2.1.22 (installed with: pip install spconv-cu111)
- NVIDIA CUDA 11.1
- 3x NVIDIA A100 GPUs
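Before building, it can help to confirm that the active environment matches the tested configuration listed above. The following is a minimal sketch, not part of the released code: the helper names `strip_local` and `check_environment` are hypothetical, and the distribution names passed to `importlib.metadata` (`torch`, `numba`, `spconv-cu111`) are assumed to match the names used in the pip commands.

```python
import sys
from importlib import metadata

# Versions copied from the tested configuration listed in this README.
TESTED = {
    "torch": "1.8.1",
    "numba": "0.53.1",
    "spconv-cu111": "2.1.22",
}


def strip_local(version):
    """Drop a local build tag such as '+cu111' from a version string."""
    return version.split("+", 1)[0]


def check_environment(tested=TESTED, lookup=metadata.version):
    """Return {package: (expected, installed_or_None, matches)}.

    `lookup` is injectable so the check can be exercised without
    the packages being installed (hypothetical helper, for illustration).
    """
    report = {}
    for name, expected in tested.items():
        try:
            installed = lookup(name)
        except metadata.PackageNotFoundError:
            installed = None
        ok = installed is not None and strip_local(installed) == expected
        report[name] = (expected, installed, ok)
    return report


if __name__ == "__main__":
    print("python", sys.version.split()[0], "(tested: 3.9.13)")
    for name, (expected, installed, ok) in check_environment().items():
        status = "ok" if ok else "differs"
        print(f"{name}: expected {expected}, found {installed} [{status}]")
```

A mismatch does not necessarily mean the code will fail; it only indicates that the environment differs from the one the release was tested on.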
cd PV-MM3D
python setup.py develop
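After the build step, a quick check that the key modules can be located may catch a failed install early. This is a minimal sketch under stated assumptions: the module name `pcdet` is assumed because the code is based on OpenPCDet, which installs a package of that name, and may differ in this repository; `is_importable` and `missing_modules` are hypothetical helpers.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be located (without importing it)."""
    return importlib.util.find_spec(module_name) is not None


def missing_modules(names):
    """Return the subset of top-level module names that cannot be located."""
    return [name for name in names if not is_importable(name)]


if __name__ == "__main__":
    # "pcdet" is an assumption based on OpenPCDet; adjust if the package differs.
    missing = missing_modules(["torch", "spconv", "pcdet"])
    if missing:
        print("Not found:", ", ".join(missing))
    else:
        print("All expected modules were found.")
```

Locating a module does not guarantee that its compiled CUDA extensions load correctly; importing the package in an interpreter is the more thorough check.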
This code is released under the Apache 2.0 license.
@article{wang2025pv,
title={PV-MM3D: Point-voxel parallel dual-stream framework with dual-attention region adaptive fusion for multimodal 3D object detection},
author={Wang, Baotong and Xia, Chenxing and Gao, Xiuju and Yang, Yuan and Ge, Bin and Li, Kuan-Ching and Zhang, Yan},
journal={Information Fusion},
pages={103983},
year={2025},
publisher={Elsevier}
}
