CVPR 2026 (Oral)
FILTR predicts persistence diagrams from pretrained point-cloud encoder features in a feed-forward pass. This repository contains model code, training utilities, CUDA extensions, DONUT dataset helpers, checkpoint download helpers, and preprocessing scripts for feature and diagram generation.
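For background, the degree-0 part of a persistence diagram records the distance scales at which connected components of a point cloud merge. Under the Vietoris-Rips filtration, the finite H0 death times are exactly the edge lengths of a Euclidean minimum spanning tree of the cloud, which gives a compact reference computation in scipy. This sketch is illustrative only: it is not the repository's code, and FILTR itself predicts diagrams with a network rather than computing them.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import cdist

def h0_rips_diagram(points):
    """Finite (birth, death) pairs of the degree-0 Rips persistence diagram."""
    dists = cdist(points, points)       # dense pairwise Euclidean distances
    mst = minimum_spanning_tree(dists)  # sparse matrix holding the MST edges
    deaths = np.sort(mst.data)          # component-merge distances (n-1 of them)
    births = np.zeros_like(deaths)      # every component is born at scale 0
    return np.stack([births, deaths], axis=1)

# Two well-separated pairs of points: three finite H0 points,
# with the largest death equal to the 4.9 inter-cluster gap.
pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 0.0], [5.1, 0.0]])
diagram = h0_rips_diagram(pts)
```

(The one remaining component never dies; libraries usually report it as an extra point with death at infinity, omitted here.)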
Most urgent follow-up work:
- Publish stable pretrained checkpoints and document exact filenames/checksums (#5)
- Fix and validate the end-to-end training and validation pipeline (#14)
- Add feature extraction scripts for Point-MAE, PCP-MAE, PointGPT, point2vec, and other supported backbones (#13)
- Improve generated artifact storage management for features, diagrams, logs, and checkpoints (#7)
- Remove tracked compiled artifacts from the repository history/index (#2)
Clone with submodules:

```bash
git clone --recursive https://github.com/lmartinez2001/filtr.git
cd filtr
```

Create host directories for data and checkpoints:

```bash
mkdir -p /path/to/filtr-data
mkdir -p /path/to/filtr-ckpts
```

Edit `.devcontainer/devcontainer.json` and set the mounts for your machine:

```json
"mounts": [
  "source=/path/to/filtr-data,target=/workspaces/filtr/data,type=bind,consistency=cached",
  "source=/path/to/filtr-ckpts,target=/workspaces/filtr/ckpts,type=bind,consistency=cached"
]
```

Then run the VS Code command:

```
Dev Containers: Rebuild and Reopen in Container
```
The devcontainer runs `bash extensions/install_extensions.sh` after creation, so the CUDA extensions are built once GPU access is available.
Release images include CUDA, Python, PyTorch, Python dependencies, and the repository source under /workspaces/filtr. Runtime data, checkpoints, and experiment outputs should still be bind-mounted from the host.
Use this in `.devcontainer/devcontainer.json` instead of the build block:

```json
"image": "ghcr.io/lmartinez2001/filtr:latest"
```

Keep the same GPU arguments and mounts:

```json
"runArgs": ["--gpus", "all", "--shm-size=8g"],
"mounts": [
  "source=/path/to/filtr-data,target=/workspaces/filtr/data,type=bind,consistency=cached",
  "source=/path/to/filtr-ckpts,target=/workspaces/filtr/ckpts,type=bind,consistency=cached"
],
"postCreateCommand": "bash extensions/install_extensions.sh"
```

For plain Docker:
```bash
docker run --gpus all -it \
    --shm-size=8g \
    -v /path/to/filtr-data:/workspaces/filtr/data \
    -v /path/to/filtr-ckpts:/workspaces/filtr/ckpts \
    -v /path/to/filtr-experiments:/workspaces/filtr/experiments \
    ghcr.io/lmartinez2001/filtr:latest
```

Inside the container, run the CUDA extension installation once before training:

```bash
bash extensions/install_extensions.sh
```

Inside the container:

```bash
python3 preprocess/download_checkpoints.py ckpts
```

This downloads Point-BERT, Point-MAE, PCP-MAE, and PointGPT checkpoints into `ckpts/`. To skip Google Drive downloads:

```bash
python3 preprocess/download_checkpoints.py ckpts --skip-google-drive
```

Log in to Hugging Face before downloading DONUT:

```bash
huggingface-cli login
```

If your environment uses the newer Hugging Face CLI, use:

```bash
hf auth login
```

Then download the dataset:

```bash
python3 preprocess/datasets/get_donut.py data/donut
```

This materializes the DONUT dataset under:

```
data/donut/pcd
data/donut/obj
```
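After the download, it can be useful to verify that each point cloud has a matching mesh. The exact file layout inside `pcd/` and `obj/` is not documented here, so the sketch below stays extension-agnostic and simply assumes that corresponding samples share a file stem across the two directories (that pairing convention is an assumption, and `check_donut_layout` is not a project helper):

```python
from pathlib import Path

def check_donut_layout(root="data/donut"):
    """Pair files in pcd/ and obj/ by stem and report any mismatches."""
    root = Path(root)
    pcd_stems = {p.stem for p in (root / "pcd").iterdir() if p.is_file()}
    obj_stems = {p.stem for p in (root / "obj").iterdir() if p.is_file()}
    return {
        "paired": len(pcd_stems & obj_stems),   # samples present in both dirs
        "pcd_only": sorted(pcd_stems - obj_stems),
        "obj_only": sorted(obj_stems - pcd_stems),
    }
```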
Initialize the PointBERT submodule if needed:

```bash
git submodule update --init --recursive third_party/pointbert
```

Extract features:

```bash
python3 preprocess/features_extraction/extract_pointbert_features.py \
    --model_ckpt ckpts/Point-BERT.pth \
    --dvae_ckpt ckpts/dvae.pth \
    --pcd_dir data/donut/pcd \
    --output_dir data/donut/features/pointbert \
    --in_points 1024 \
    --out_points 2048 \
    --seed 0
```

Features are written to:

```
data/donut/features/pointbert/out_2048/in_1024
```
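To sanity-check the extracted features, a minimal loader sketch follows. It assumes one `.npy` array of shape `(out_points, feature_dim)` per cloud; that file format is an assumption about the extraction script's output, not a documented guarantee, and `load_features` is not a project helper:

```python
import numpy as np
from pathlib import Path

def load_features(feature_dir, out_points=2048):
    """Load each per-cloud feature array and check for a consistent shape."""
    feats = {}
    for path in sorted(Path(feature_dir).glob("*.npy")):
        arr = np.load(path)
        if arr.shape[0] != out_points:  # catch clouds saved with the wrong size
            raise ValueError(f"{path.name}: unexpected shape {arr.shape}")
        feats[path.stem] = arr
    return feats
```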
Alpha diagrams:

```bash
python3 preprocess/topology/compute_alpha_diagrams.py \
    --pcd_dir data/donut/pcd \
    --output_dir data/donut/diagrams_alpha \
    --rescale \
    --n_workers 8
```

Rips diagrams:
> [!CAUTION]
> Do not use more than 2 or 3 workers for Rips diagram computation, as it can exhaust memory.

```bash
python3 preprocess/topology/compute_rips_diagrams.py \
    --pcd_dir data/donut/pcd \
    --output_dir data/donut/diagrams_rips \
    --max_edge_length 2.0 \
    --max_dimension 2 \
    --n_workers 2
```

Generate split manifests:
```bash
python3 preprocess/datasets/create_splits.py \
    --diagram_dir data/donut/diagrams_alpha \
    --pcd_dir data/donut/pcd \
    --tokens_dir data/donut/features/pointbert/out_2048/in_1024 \
    --split_dir data/donut \
    --output_dir data/donut \
    --output_suffix _out2048_in1024_pbert \
    --overwrite
```

This writes config-ready manifests:

```
data/donut/train_out2048_in1024_pbert.json
data/donut/val_out2048_in1024_pbert.json
```
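The manifest schema isn't spelled out here, so a schema-agnostic sanity check is safest: walk every string value in the manifest JSON and report path-like entries that no longer exist on disk (useful after moving the data directory). The helper below is a sketch, not project code, and the `"/" in s` heuristic for spotting paths is an assumption:

```python
import json
from pathlib import Path

def missing_paths(manifest_file):
    """Collect path-like strings in a manifest JSON that do not exist on disk."""
    def strings(node):
        # Recursively yield every string value, whatever the schema is.
        if isinstance(node, str):
            yield node
        elif isinstance(node, dict):
            for v in node.values():
                yield from strings(v)
        elif isinstance(node, list):
            for v in node:
                yield from strings(v)

    data = json.loads(Path(manifest_file).read_text())
    candidates = [s for s in strings(data) if "/" in s]  # crude path heuristic
    return [s for s in candidates if not Path(s).exists()]
```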
Run the default Point-BERT feature model:

```bash
python3 train.py exp_name=donut_pbert
```

Outputs are written to:

```
experiments/donut_pbert
```

For Rips diagrams, use the Rips dataset config and matching index names:

```bash
python3 train.py exp_name=donut_pbert_rips dataset=donut_rips
```

FILTR uses the `logger` Hydra config group. The default logger writes no external logs:

```bash
python3 train.py exp_name=donut_pbert logger=default
```

To log to Weights & Biases, first link the container to your W&B account:

```bash
wandb login
```

Paste an API key from:

https://wandb.ai/authorize

Then run training with the W&B logger:
```bash
python3 train.py exp_name=donut_pbert logger=wandb logger.project=<your-wandb-project>
```

For example:

```bash
python3 train.py exp_name=donut_pbert logger=wandb logger.project=FILTR
```

If you are running on a remote machine or CI, you can also provide the key through the environment:

```bash
export WANDB_API_KEY=<your-api-key>
python3 train.py exp_name=donut_pbert logger=wandb logger.project=FILTR
```

To cite FILTR:

```bibtex
@inproceedings{Martinez2026FILTR,
  title={FILTR: Extracting Topological Features from Pretrained 3D Models},
  author={Louis Martinez and Maks Ovsjanikov},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2026}
}
```