
MV-SAM3D

MV-SAM3D is a multi-view 3D reconstruction framework that extends SAM 3D Objects to leverage observations from multiple viewpoints. It supports both single-object and multi-object generation, and is designed to produce more stable geometry, texture, and scene-level consistency.

Paper

Installation

Please follow the environment setup from:

Data Format

scene/
├── images/
│   ├── 0.png
│   ├── 1.png
│   └── ...
├── object_a/
│   ├── 0.png
│   ├── 1.png
│   └── ...
├── object_b/
│   └── ...
└── ...

Each object directory contains one mask per view, named to match the files in images/. Mask files are RGBA PNGs whose alpha channel marks the foreground.
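The alpha-channel convention can be read back with a few lines of NumPy and Pillow. This is an illustrative sketch, not repository code; the helper name and the 128 threshold are assumptions:

```python
import numpy as np
from PIL import Image

def load_foreground_mask(path, threshold=128):
    """Load an RGBA mask PNG and return a boolean foreground mask
    derived from its alpha channel (alpha >= threshold -> foreground)."""
    img = Image.open(path).convert("RGBA")
    alpha = np.asarray(img)[..., 3]
    return alpha >= threshold
```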

Results Comparison

Single-object

| Single-View (View 3) | Single-View (View 6) | MV-SAM3D |
| --- | --- | --- |
| input image | input image | input images |
| Single-view baseline. | Single-view baseline. | Better multi-view consistency. |

(Reconstruction images are shown in the repository.)

Multi-object

| SAM 3D (single-view) | MV-SAM3D w/o Pose Optimization | MV-SAM3D (full) |
| --- | --- | --- |
| Shape and pose are often unstable. | Multi-view improves object quality. | Improved overall scene alignment. |

Quick Start

Single-object inference

python run_inference_weighted.py \
  --input_path ./data/example \
  --mask_prompt stuffed_toy \
  --da3_output ./da3_outputs/example/da3_output.npz

Multi-object inference

python run_inference_weighted.py \
  --input_path ./data/desk_objects0 \
  --mask_prompt keyboard,speaker,mug,stuffed_toy \
  --da3_output ./da3_outputs/desk_objects0/da3_output.npz \
  --merge_da3_glb \
  --run_pose_optimization

Default Settings (No Extra Flags)

For single-object inference (run_inference_weighted.py), key defaults are:

  • Stage 1 weighting: enabled (stage1_entropy_alpha=30.0)
  • Stage 2 weighting: enabled (stage2_weight_source=entropy)
  • Stage 2 alpha defaults: stage2_entropy_alpha=30.0, stage2_visibility_alpha=30.0
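The alpha values above control how sharply per-view weights fall off with uncertainty. As a rough illustration of what an entropy-based weighting can look like (this is a hedged sketch, not the repository's actual implementation), lower-entropy views can be given exponentially larger fusion weights:

```python
import numpy as np

def entropy_weights(entropies, alpha=30.0):
    """Turn per-view entropy scores into normalized fusion weights:
    lower entropy (a more confident view) receives a higher weight."""
    e = np.asarray(entropies, dtype=np.float64)
    logits = -alpha * (e - e.min())  # shift by the minimum for numerical stability
    w = np.exp(logits)
    return w / w.sum()
```

With a large alpha such as the default 30.0, the weighting is close to a hard selection of the most confident view; a small alpha approaches uniform averaging.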

Preprocessing for a New Scene

python preprocessing/build_mvsam3d_dataset.py \
  --input data/your_scene \
  --objects keyboard,speaker,mug,stuffed_toy
python scripts/run_da3.py \
  --image_dir ./data/your_scene/images \
  --output_dir ./da3_outputs/your_scene
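Before running inference, the produced .npz can be sanity-checked. The exact keys depend on the Depth Anything 3 export, so the following is a generic inspection helper, not part of the repository:

```python
import numpy as np

def summarize_npz(path):
    """Return {array name: (shape, dtype string)} for every array in an .npz file."""
    data = np.load(path)
    return {key: (data[key].shape, str(data[key].dtype)) for key in data.files}

# Example (path from the preprocessing step above):
# summarize_npz("./da3_outputs/your_scene/da3_output.npz")
```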

Citation

@article{li2026mv,
  title={MV-SAM3D: Adaptive Multi-View Fusion for Layout-Aware 3D Generation},
  author={Li, Baicheng and Wu, Dong and Li, Jun and Zhou, Shunkai and Zeng, Zecui and Li, Lusong and Zha, Hongbin},
  journal={arXiv preprint arXiv:2603.11633},
  year={2026}
}

Acknowledgments

We thank the authors of SAM 3D Objects and Depth Anything 3 for their excellent work.

License

Please refer to LICENSE for usage terms.
