TL;DR: A unified framework for generalized unconstrained urban 3D occupancy prediction.
OccAny provides demo and inference code for urban 3D occupancy under unconstrained inputs. This repository currently includes two model variants:
- OccAny, based on Must3R and SAM2
- OccAny+, based on Depth Anything 3 and SAM3
The repository includes sample RGB inputs in `demo_data/input`, pretrained weights in `checkpoints/`, and visualization tools for both point clouds and voxel grids.
If you find this work or code useful, please cite the paper and consider starring the repository:
```bibtex
@inproceedings{cao2026occany,
  title={OccAny: Generalized Unconstrained Urban 3D Occupancy},
  author={Anh-Quan Cao and Tuan-Hung Vu},
  booktitle={CVPR},
  year={2026}
}
```

- Inference code for OccAny (Must3R + SAM2) and OccAny+ (DA3 + SAM3)
- Pretrained checkpoints
- Evaluation code for nuScenes and KITTI
- Dataset preparation scripts for Waymo, PandaSet, DDAD, VKitti, ONCE
- Training code for OccAny (Must3R + SAM2) and OccAny+ (DA3 + SAM3)
```bash
git clone https://github.com/valeoai/OccAny.git
cd OccAny
```

```bash
conda create -n occany python=3.12 -y
conda activate occany
python -m pip install --upgrade pip setuptools wheel ninja
```

```bash
conda install -c nvidia cuda-toolkit=12.6
pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu126
pip install xformers==0.0.29.post2
pip install -r requirements.txt
```

```bash
export CUDA_HOME=$CONDA_PREFIX
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib:$LD_LIBRARY_PATH
pip install torch-scatter --no-cache-dir --no-build-isolation
```

OccAny relies on the copies bundled in `third_party/`:
- `third_party/croco` for `croco`
- `third_party/dust3r` for `dust3r`
- `third_party/Grounded-SAM-2` for Grounded-SAM-2, `sam2`, and `groundingdino`
- `third_party/sam3` for SAM3
- `third_party/Depth-Anything-3` for Depth Anything 3
`inference.py` already prepends these paths automatically at runtime. If you want to import the vendored packages in a shell, notebook, or standalone sanity check, export them explicitly:
```bash
export PYTHONPATH="$PWD/third_party:$PWD/third_party/dust3r:$PWD/third_party/croco/models/curope:$PWD/third_party/Grounded-SAM-2:$PWD/third_party/Grounded-SAM-2/grounding_dino:$PWD/third_party/sam3:$PWD/third_party/Depth-Anything-3/src:$PYTHONPATH"
```

Avoid adding `third_party/sam2` on top of this unless you explicitly need the standalone SAM2 copy, because it exposes the same top-level module name as the copy in `third_party/Grounded-SAM-2`.
```bash
export CUDA_HOME=$CONDA_PREFIX
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib:$LD_LIBRARY_PATH
cd third_party/croco/models/curope
python setup.py install
cd ../../../..
```

This builds a `curope*.so` file next to the sources. The PYTHONPATH export above includes that directory so `models.curope` can resolve it at runtime.
The vendored `third_party/croco/models/curope/setup.py` currently targets SM 70, 80, and 90. If your GPU uses a different compute capability, update `all_cuda_archs` there before rebuilding.
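To see which compute capability your GPU reports before editing `all_cuda_archs`, you can query it through PyTorch. This is a convenience check, not part of the repository; the import is guarded so it also runs on machines without torch or a GPU:

```python
try:
    import torch
except ImportError:  # torch not installed yet
    torch = None

def arch_name(major: int, minor: int) -> str:
    """Format a compute capability as the sm_XY arch name used in nvcc gencode flags."""
    return f"sm_{major}{minor}"

if torch is not None and torch.cuda.is_available():
    cap = torch.cuda.get_device_capability(0)  # e.g. (8, 6) on an RTX 3090
    print("GPU compute capability:", cap, "->", arch_name(*cap))
else:
    print("No CUDA device visible to torch")
```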
```bash
python - <<'PY'
import sys
from pathlib import Path

repo_root = Path.cwd()
for path in reversed([
    repo_root / "third_party",
    repo_root / "third_party" / "dust3r",
    repo_root / "third_party" / "croco" / "models" / "curope",
    repo_root / "third_party" / "Grounded-SAM-2",
    repo_root / "third_party" / "Grounded-SAM-2" / "grounding_dino",
    repo_root / "third_party" / "sam3",
    repo_root / "third_party" / "Depth-Anything-3" / "src",
]):
    path_str = str(path)
    if path.exists() and path_str not in sys.path:
        sys.path.insert(0, path_str)

import torch
import sam2
import sam3
import groundingdino
import depth_anything_3
import dust3r.utils.path_to_croco  # noqa: F401
from croco.models.pos_embed import RoPE1D

print("torch:", torch.__version__)
print("cuda:", torch.version.cuda)
print("RoPE1D backend:", RoPE1D.__name__)
print("third-party imports: ok")
PY
```

Model checkpoints are hosted on Hugging Face:
Download checkpoints with:
```bash
cd OccAny
python -c "from huggingface_hub import snapshot_download; snapshot_download(repo_id='anhquancao/OccAny', repo_type='model', local_dir='.', allow_patterns='checkpoints/*')"
```

Expected files under `checkpoints/`:

- `occany_da3_gen.pth`
- `occany_da3_recon.pth`
- `occany_must3r.pth`
- `groundingdino_swinb_cogcoor.pth`
- `sam2.1_hiera_large.pt`
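A quick way to confirm the download completed is to check for the expected files. This helper is only a convenience sketch, not part of the repository:

```python
from pathlib import Path

EXPECTED_CHECKPOINTS = [
    "occany_da3_gen.pth",
    "occany_da3_recon.pth",
    "occany_must3r.pth",
    "groundingdino_swinb_cogcoor.pth",
    "sam2.1_hiera_large.pt",
]

def missing_checkpoints(ckpt_dir: str = "checkpoints") -> list[str]:
    """Return the expected checkpoint filenames not present under ckpt_dir."""
    root = Path(ckpt_dir)
    return [name for name in EXPECTED_CHECKPOINTS if not (root / name).is_file()]

if __name__ == "__main__":
    missing = missing_checkpoints()
    print("all checkpoints present" if not missing else f"missing: {missing}")
```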
After installation, the demo commands below can be run as-is. By default:

- RGB inputs are read from `./demo_data/input`
- Outputs are written to `./demo_data/output`
- The repo already includes sample input scenes such as `kitti_08_1390` and `nuscenes_scenes-0039`
OccAny+ (DA3 + SAM3):

```bash
python inference.py \
  --batch_gen_view 2 \
  --view_batch_size 2 \
  --semantic distill@SAM3 \
  --compute_segmentation_masks \
  --gen \
  -rot 30 \
  -vpi 2 \
  -fwd 5 \
  --seed_translation_distance 2 \
  --recon_conf_thres 2.0 \
  --gen_conf_thres 6.0 \
  --apply_majority_pooling \
  --model occany_da3
```

OccAny (Must3R + SAM2):

```bash
python inference.py \
  --batch_gen_view 2 \
  --view_batch_size 2 \
  --semantic distill@SAM2_large \
  --compute_segmentation_masks \
  --gen \
  -rot 30 \
  -vpi 2 \
  -fwd 5 \
  --seed_translation_distance 2 \
  --recon_conf_thres 2.0 \
  --gen_conf_thres 2.0 \
  --apply_majority_pooling \
  --model occany_must3r
```

The most commonly adjusted flags fall into three groups: common flags, semantic flags, and generation-specific flags. If you only want reconstruction output, omit `--gen` and any flag whose scope below is Generation or Generation + semantic.
| Flag | Scope | Description |
|---|---|---|
| `--model` | Common | Select the inference backbone: `occany_da3` or `occany_must3r` |
| `--input_dir` | Common | Directory containing RGB demo scene folders |
| `--output_dir` | Common | Directory where outputs are written |
| `--gen` | Common toggle | Enable novel-view generation before voxel fusion |
| `-vpi`, `--views_per_interval` | Generation | Number of generated views sampled per reconstruction view |
| `-fwd`, `--gen_forward_novel_poses_dist` | Generation | Forward offset for generated views, in meters |
| `-rot`, `--gen_rotate_novel_poses_angle` | Generation | Left/right yaw rotation applied to generated views, in degrees |
| `--num_seed_rotations` | Generation | Number of additional seed rotations used when initializing generated poses |
| `--seed_rotation_angle` | Generation | Angular spacing between seed rotations, in degrees |
| `--seed_translation_distance` | Generation | Lateral translation paired with each seed rotation, in meters |
| `--batch_gen_view` | Generation | Number of generated views processed in parallel |
| `--semantic` | Semantic | Enable semantic inference with a SAM2 or SAM3 variant |
| `--compute_segmentation_masks` | Semantic | Save segmentation masks during semantic inference |
| `--view_batch_size` | Semantic | Number of views processed together during semantic inference |
| `--recon_conf_thres` | Reconstruction | Confidence threshold used when voxelizing reconstructed points |
| `--gen_conf_thres` | Generation | Confidence threshold used when voxelizing generated points |
| `--no_semantic_from_rotated_views` | Generation + semantic | Ignore semantics from rotated generated views |
| `--only_semantic_from_recon_view` | Generation + semantic | Use semantics only from reconstruction views, even when generated views are present |
| `--gen_semantic_from_distill_sam3` | Generation + semantic | For `pretrained@SAM3`, infer generated-view semantics from distilled SAM3 features when available |
| `--apply_majority_pooling` | Post-processing | Apply 3x3x3 majority pooling to the fused voxel grid |
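For intuition, `--apply_majority_pooling` smooths the fused grid with a 3x3x3 neighborhood vote. Below is a naive NumPy sketch of that operation; the repository's actual implementation may differ (e.g. vectorized or on GPU):

```python
import numpy as np

def majority_pool_3x3x3(labels: np.ndarray) -> np.ndarray:
    """Replace each voxel's label with the most frequent label in its 3x3x3 neighborhood."""
    padded = np.pad(labels, 1, mode="edge")  # repeat border voxels so edges still get 27 votes
    out = np.empty_like(labels)
    for x in range(labels.shape[0]):
        for y in range(labels.shape[1]):
            for z in range(labels.shape[2]):
                block = padded[x:x + 3, y:y + 3, z:z + 3]
                values, counts = np.unique(block, return_counts=True)
                out[x, y, z] = values[np.argmax(counts)]
    return out
```

The practical effect is denoising: an isolated voxel surrounded by a different class gets voted over to the majority class.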
Use `vis_viser.py` to inspect the saved `pts3d_*.npy` outputs interactively:

```bash
python vis_viser.py --input_folder ./demo_data/output
```

You can point `--input_folder` either to the output root or directly to a single scene folder. In the viewer, the common dropdown options are:

- `render` for reconstruction output
- `render_gen` for generated-view output
- `render_recon_gen` for the combined output
`vis_voxel.py` renders voxel predictions to image files. Install mayavi separately if you want to use this path:

```bash
pip install mayavi
python vis_voxel.py --input_root ./demo_data/output --dataset nuscenes
```

Helpful notes:

- The script writes rendered images to `./output` by default
- If the requested `--prediction_key` is missing, it automatically falls back to the best available `render*` grid
- Use `--dataset kitti` for KITTI-style scenes and `--dataset nuscenes` for nuScenes-style surround-view scenes
- Add `--save_input_images` if you also want stacked input RGB images next to the voxel render
Each processed scene is written under `./demo_data/output/<frame_id>_<model>/`. Typical artifacts include:

- `pts3d_render.npy` for reconstruction views
- `pts3d_render_gen.npy` for generated views when `--gen` is enabled
- `pts3d_render_recon_gen.npy` for the merged point-cloud output
- `voxel_predictions.pkl` for voxelized predictions and visualization metadata
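These artifacts can be inspected programmatically. A minimal loader sketch, assuming standard NumPy and pickle serialization for the `.npy` and `.pkl` files:

```python
import pickle
from pathlib import Path

import numpy as np

def load_scene(scene_dir: str) -> dict:
    """Collect whichever point-cloud arrays and voxel predictions exist in a scene folder."""
    scene = Path(scene_dir)
    out = {}
    for name in ("pts3d_render", "pts3d_render_gen", "pts3d_render_recon_gen"):
        path = scene / f"{name}.npy"
        if path.is_file():
            out[name] = np.load(path)
    pkl = scene / "voxel_predictions.pkl"
    if pkl.is_file():
        with open(pkl, "rb") as fh:
            out["voxel_predictions"] = pickle.load(fh)
    return out

# e.g. artifacts = load_scene("demo_data/output/<frame_id>_<model>")
```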
`inference.py` currently uses an urban voxel grid tuned for the included demo scenes:

```python
voxel_size = 0.4
occ_size = [200, 200, 24]
voxel_origin = [-40.0, -40.0, -3.6]
```
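With this layout, mapping metric 3D points to voxel indices is a floor division against the grid origin. The sketch below mirrors the numbers above but is not the repository's actual voxelization code:

```python
import numpy as np

voxel_size = 0.4
occ_size = np.array([200, 200, 24])
voxel_origin = np.array([-40.0, -40.0, -3.6])

def points_to_voxels(pts: np.ndarray):
    """Map Nx3 points (meters) to integer voxel indices plus an in-bounds mask."""
    idx = np.floor((pts - voxel_origin) / voxel_size).astype(np.int64)
    valid = np.all((idx >= 0) & (idx < occ_size), axis=1)
    return idx, valid

pts = np.array([[0.0, 0.0, 0.0], [100.0, 0.0, 0.0]])
idx, valid = points_to_voxels(pts)
# the scene origin lands in the central voxel column; the 100 m point falls outside the grid
```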
If you need a different dataset convention or voxel layout, update these values in inference.py before running inference. Two common presets are:
KITTI:

```python
voxel_size = 0.2
occ_size = [256, 256, 32]
voxel_origin = np.array([0.0, -25.6, -2.0], dtype=np.float32)
```

nuScenes:

```python
voxel_size = 0.4
occ_size = [200, 200, 16]
voxel_origin = np.array([-40.0, -40.0, -1.0], dtype=np.float32)
```

This project is licensed under the Apache License 2.0; see the LICENSE file for details.
We thank the authors of these great repositories: Dust3r, Must3r, Depth-Anything-3, SAM2, SAM3, and viser.