Matching and smOoth diffusioN for 3d reconsTRuction
This is an experimental project, built as the final project for the Spring 2025 CSE 559A Computer Vision course at Washington University in St. Louis. It combines MASt3R and Smooth-Diffusion to test whether a 2D image generation model can be used to improve 3D reconstruction.
It has nothing to do with Arknights (perhaps).
Image credits: Melanbread
Zheyuan Wu me@trance-0.com
From a limited set of sample 2D images, we generate a 3D scene that matches them.
For each perspective with information loss, we then render a 2D image from the incomplete 3D scene, with the missing parts masked.
We ask Smooth-Diffusion to complete the masked 2D image.
Finally, we use the completed 2D images to reconstruct the 3D scene again, obtaining the full 3D scene from limited input.
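In outline, the loop could look like the following sketch. The three helper functions are hypothetical placeholders for the MASt3R, rendering, and Smooth-Diffusion stages, not this repository's actual API.

```python
# Hypothetical sketch of the pipeline loop; the three helpers below are
# placeholders, not this repository's actual API.

def reconstruct_scene(images):
    """MASt3R stage: build a (possibly incomplete) 3D scene."""
    raise NotImplementedError

def render_view_with_mask(scene, pose):
    """Render the scene from `pose`; return the RGB image and a mask
    marking pixels with no geometry behind them."""
    raise NotImplementedError

def inpaint(image, mask):
    """Smooth-Diffusion stage: fill the masked region of `image`."""
    raise NotImplementedError

def run_pipeline(images, novel_poses):
    scene = reconstruct_scene(images)             # 1. partial scene from inputs
    completed = []
    for pose in novel_poses:                      # 2. render each lossy view
        rgb, mask = render_view_with_mask(scene, pose)
        completed.append(inpaint(rgb, mask))      # 3. complete masked regions
    return reconstruct_scene(images + completed)  # 4. rebuild the full scene
```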
- 3D scene generation from 2D images
  - 2D image pair generation with masks (see the projection sketch after this list)
- 2D image generation from 3D scene
  - In-painting 2D image with smooth-diffusion
- 3D scene reconstruction from incomplete 2D images
- Build a simple UI to show the process of the key functions
  - 3D scene generation from 2D images
  - 2D inpainting with smooth-diffusion
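To make the "image pair with masks" step concrete, here is a minimal numpy sketch of producing a rendered view plus its inpainting mask from a colored point cloud by perspective projection. It is a simplification (single-pixel splats, painter's-order occlusion), not the project's actual renderer.

```python
import numpy as np

def render_with_mask(points, colors, K, w2c, h, w):
    """Project a colored point cloud (N,3) into a camera. Pixels that
    receive no point stay True in the mask and are left for inpainting."""
    # world -> camera coordinates
    pts = (w2c[:3, :3] @ points.T + w2c[:3, 3:4]).T
    front = pts[:, 2] > 1e-6                 # keep points in front of camera
    pts, cols = pts[front], colors[front]

    # perspective projection with 3x3 intrinsics K
    uv = (K @ (pts / pts[:, 2:3]).T).T
    u, v = uv[:, 0].astype(int), uv[:, 1].astype(int)
    ok = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    u, v, z, cols = u[ok], v[ok], pts[ok, 2], cols[ok]

    image = np.zeros((h, w, 3), dtype=colors.dtype)
    mask = np.ones((h, w), dtype=bool)       # True = missing, to be inpainted
    order = np.argsort(-z)                   # far first, near points overwrite
    image[v[order], u[order]] = cols[order]
    mask[v[order], u[order]] = False
    return image, mask
```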
This project does not work as expected.
Possible reasons:
- The 2D image generation model is not good at generating in-between frames as the camera translates through the 3D scene. In the majority of cases, the completed image does not read as a plausible intermediate state of the camera's movement, even when given the correct camera pose, mask, and 2D image (see the pose-interpolation sketch after this list).
- The 3D scene reconstruction model does not handle general scenes well, especially scenes with reflective surfaces.
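For concreteness, "intermediate state of the camera's movement" refers to poses interpolated between two input views. A generic sketch of such interpolation (slerp on rotation, lerp on translation), not necessarily the project's exact scheme:

```python
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def interpolate_poses(c2w_a, c2w_b, n=5):
    """Interpolate n camera-to-world poses between two 4x4 extrinsics:
    slerp on the rotation part, linear interpolation on translation."""
    times = np.linspace(0.0, 1.0, n)
    slerp = Slerp([0.0, 1.0],
                  Rotation.from_matrix([c2w_a[:3, :3], c2w_b[:3, :3]]))
    rots = slerp(times).as_matrix()
    poses = []
    for t, R in zip(times, rots):
        c2w = np.eye(4)
        c2w[:3, :3] = R
        c2w[:3, 3] = (1 - t) * c2w_a[:3, 3] + t * c2w_b[:3, 3]
        poses.append(c2w)
    return poses
```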
Nevertheless, the pipeline itself runs end to end.
At least 8 GB of VRAM is expected for running Smooth-Diffusion.
```bash
python mon3tr/demo.py
```
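If VRAM is tight, fp16 weights plus attention slicing usually bring the diffusion stage under 8 GB. Below is a minimal diffusers-style sketch assuming a stock Stable Diffusion inpainting checkpoint; the checkpoint id, prompt, and file names are illustrative assumptions, and the actual demo may wire Smooth-Diffusion differently.

```python
import torch
from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

# Assumption: stock SD inpainting checkpoint, not this repo's pinned config.
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")
pipe.enable_attention_slicing()  # lower peak VRAM at some speed cost
# The Smooth-Diffusion smoothness LoRA would be loaded here via
# pipe.load_lora_weights(...); the weight path is omitted on purpose.

image = Image.open("render.png").convert("RGB")  # rendered view
mask = Image.open("mask.png").convert("L")       # white = region to inpaint
out = pipe(prompt="a photo of the scene", image=image,
           mask_image=mask).images[0]
out.save("completed.png")
```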

