The repository for the IMSY team participating in the Rare25 Challenge (MICCAI 2025).
The structure of this repo was based on the example challenge submission repository provided by the challenge organisers.
- First, clone the repository, create, and activate a new environment:
conda create -n rare python=3.12.0
conda activate rare
- Next, cd into the repo folder and install the requirements:
pip install -r requirements.txt
Our training scripts use wandb for logging; log in to your wandb account before running them.
The training pipeline accesses the GastroNet weights and the training dataset through Hugging Face. Please log in via `huggingface-cli login`.
The dataset can be downloaded from here or using huggingface-cli:
pip install "huggingface_hub[cli]"
huggingface-cli download TimJaspersTue/RARE25-train --repo-type dataset --local_dir <location>
Please place it in the `data/train` directory.
The data splits can be generated by running:
python data_splitting/create_splits.py
This will generate CSV files defining the different data splits in the `data/splits` directory:
- 5-fold cross-validation: `5fold_cv.csv`
- Train on Center 1, test on Center 2: `center1_train_center2_test.csv`
- Train on Center 2, test on Center 1: `center2_train_center1_test.csv`
- 5-fold cross-validation with a separate test set: `holdout_cv.csv`
Each file has the following columns:
- `image_path`: path to the image in the `train` directory.
- `sample_id`: unique image identifier (file name).
- `center`: center number.
- `class_name`: `ndbe` for non-dysplastic Barrett's Esophagus, `neo` for Neoplasia.
- `target`: class id, 0 for non-dysplastic Barrett's Esophagus, 1 for Neoplasia.
- `split`: string defining which split the image belongs to.
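As a sketch of how these split files can be consumed (assuming `pandas`; the rows and split labels below are illustrative stand-ins, not values taken from the real CSVs):

```python
import pandas as pd

# Illustrative rows mimicking the split-file schema; the real files live in data/splits/.
df = pd.DataFrame({
    "image_path": ["train/img_001.png", "train/img_002.png"],
    "sample_id": ["img_001", "img_002"],
    "center": [1, 2],
    "class_name": ["ndbe", "neo"],
    "target": [0, 1],
    "split": ["fold_0", "fold_1"],
})

# Hold out one cross-validation fold for validation, train on the rest.
val = df[df["split"] == "fold_0"]
train = df[df["split"] != "fold_0"]
```

In a real run you would load one of the generated files with `pd.read_csv` instead of building the frame by hand.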
`python -m training.train` runs our training pipeline with the parameters specified in `training/config.py`.
To train the GastroNet and DINOv3 models, you need to request access to the pre-trained weights in the respective repositories. Place the DINOv3 ViT-L weights in `resources`. When downloading the DINOv3 weights, please make sure it is the `dinov3_vitl16_pretrain_lvd1689m` version.
To reproduce the training runs used for our final challenge submission, run `./reproduce_runs.sh`. Replace the weight-path variable with the filename of the DINOv3 ViT-L weights you downloaded earlier.
If something went wrong during training and you want to re-run the experiment, remove the generated `final_runs.db` file or choose a new name for the study in `reproduce_runs.sh`.
After everything has run through successfully, the next step is recalibration:
Run `python training/recalibrate_files.py` to compute the recalibration parameters.
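The exact parameters computed by `training/recalibrate_files.py` are not described here; temperature scaling is one common recalibration choice, sketched below purely as an illustration (all values are dummies):

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Illustrative logits for a two-class problem (ndbe vs. neo).
logits = np.array([[2.0, 0.5], [0.2, 1.5]])

# Temperature scaling divides logits by a scalar T before the softmax.
# T > 1 softens over-confident predictions; T is fitted on held-out data.
T = 2.0
uncalibrated = softmax(logits)
calibrated = softmax(logits / T)
```

With T > 1 the calibrated probabilities are pulled toward uniform, which is the intended effect when a model is over-confident.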
To minimize latency and memory usage, run `python extract_lora_weights.py`, which iterates over the trained models and separates the frozen DINO base weights from the added LoRA weights (which differ between the individual models).
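The idea behind this separation can be sketched as splitting a checkpoint's state dict by key name (a simplified illustration with dummy entries; the real script's key layout may differ):

```python
# Dummy state dict standing in for a trained checkpoint; values would be tensors.
state_dict = {
    "backbone.blocks.0.attn.qkv.weight": "base",
    "backbone.blocks.0.attn.qkv.lora_A.weight": "adapter",
    "backbone.blocks.0.attn.qkv.lora_B.weight": "adapter",
    "head.weight": "base",
}

# The frozen DINO base weights are identical across the ensemble members, so they
# only need to be stored and loaded once; the small LoRA adapters are kept per model.
base_weights = {k: v for k, v in state_dict.items() if "lora" not in k}
lora_weights = {k: v for k, v in state_dict.items() if "lora" in k}
```

Storing the shared base once and only swapping the per-model adapters is what keeps the submission's memory footprint small.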
To use the DINOv3 models in the final submission docker, first clone the DINOv3 repo and then build the docker:
- Clone the DINOv3 repository into the base directory here, ideally naming it `dino_repo`. You can use the following call: `git clone https://github.com/facebookresearch/dinov3.git dino_repo`
- Copy the relevant checkpoint folders (`vitl` and `resnet`) as they are to the `resources` directory. For ResNet, go through the directories `top1` through `top4` and pull the contents of the subfolder up one level, so that the checkpoints (as well as calibration files and such) sit directly below `topX`. For more efficient loading of the weights, we recommend removing any of the `vitl` weights under `models` not containing the keyword `"lora"`. You may also remove any of the `"_lora_dino.pth"` weights that are not from `fold_0`.
- Run `./do_test_run.sh` to verify the submission runs correctly.
- Run `./do_save.sh` to save the docker image to be submitted to the challenge leaderboard.
Piotr Kalinowski, Dominik Michael, Amine Yamlahi, Berit Pauls, Lucas Luttner, Patrick Godau, Lena Maier-Hein