GPOcc leverages generalizable visual geometry priors (e.g., VGGT) and represents volumetric evidence as sparse 3D Gaussians for efficient monocular 3D occupancy prediction, with a training-free incremental fusion strategy for streaming inputs.
- Accepted to CVPR 2026. Code will be released before the conference.
If you find this work useful, please consider citing:
@misc{zhou2026generalizingvisualgeometrypriors,
  title={Generalizing Visual Geometry Priors to Sparse Gaussian Occupancy Prediction},
  author={Changqing Zhou and Yueru Luo and Changhao Chen},
  year={2026},
  eprint={2602.21552},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2602.21552},
}