
[codex] Add multiscale surface training and BCE+Dice ignore-label support #799

Open
giorgioangel wants to merge 4 commits into main from codex/vesuvius-multiscale-bce-ignore2

Conversation

@giorgioangel
Member

What changed

This PR adds multiscale surface-training support for vesuvius and introduces a BCE+Dice training path that works with the existing ignore_label: 2 surface labels.

Concretely it:

  • loads dataset_config.ome_zarr_resolution into the training config manager and validates it against valid_patch_find_resolution
  • makes patch-cache generation and lookup scale-aware by including the training resolution in the cache key
  • converts training-resolution patch sizes to full-resolution patch sizes for patch finding
  • converts cached full-resolution patch coordinates back to the selected OME-Zarr training level when reading cached patches
  • adds BinaryBCEAndDiceLoss for single-channel surface training with ignore_label: 2
  • adds a GPU fit-probe helper for patch-size and batch-size experiments
  • adds 8 surface training configs covering scale 0/2, 256^3 @ bs=3 and 128^3 @ bs=28, with both MedialSurfaceRecall and BCE+Dice variants
  • adds focused regression tests for multiscale config loading, cache separation, coordinate scaling, and the new BCE+Dice loss path
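The scale-aware caching and coordinate conversion described above can be sketched roughly as follows. The function and cache-key names here are hypothetical, not the PR's actual API; the only assumption is the standard OME-Zarr convention that level n is downsampled by 2**n per axis:

```python
import hashlib


def cache_key(volume_id: str, patch_size: tuple, ome_zarr_resolution: int) -> str:
    # Including the training resolution in the key keeps scale-0 and
    # scale-2 caches separate, avoiding cross-scale cache reuse.
    raw = f"{volume_id}:{patch_size}:res{ome_zarr_resolution}"
    return hashlib.sha1(raw.encode()).hexdigest()


def to_full_res_patch_size(patch_size: tuple, ome_zarr_resolution: int) -> tuple:
    # A training-level patch covers 2**n times as many full-resolution
    # voxels per axis, so patch finding must scale the size up.
    factor = 2 ** ome_zarr_resolution
    return tuple(p * factor for p in patch_size)


def to_training_level(position: tuple, ome_zarr_resolution: int) -> tuple:
    # Convert a cached full-resolution coordinate back down to the
    # selected OME-Zarr training level when reading cached patches.
    factor = 2 ** ome_zarr_resolution
    return tuple(p // factor for p in position)
```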

Why

Two feature gaps were blocking the intended surface experiments:

  1. Multiscale training was not wired correctly end-to-end.
    Patch finding cached full-resolution coordinates, but the dataset could open lower OME-Zarr levels without converting those coordinates back to the training level. Cache keys also did not distinguish between scale 0 and scale 2.

  2. The existing nnUNet BCE+Dice path was not a good fit for the current scalar 0/1/2 surface labels.
    It expects region-based targets, while these runs need single-channel binary logits with ignore_label: 2 preserved.
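The ignore-label semantics can be illustrated with a minimal NumPy sketch. This is not the PR's BinaryBCEAndDiceLoss (which is a PyTorch module operating on logits); it only shows how voxels labeled ignore_label: 2 are excluded from both the BCE and the Dice terms:

```python
import numpy as np


def bce_dice_with_ignore(logits, targets, ignore_label=2, eps=1e-6):
    """Single-channel BCE+Dice where voxels labeled `ignore_label`
    contribute to neither term. Illustrative only; term weighting in
    the actual loss may differ."""
    mask = targets != ignore_label          # True where the label is trusted
    probs = 1.0 / (1.0 + np.exp(-logits))   # sigmoid applied to raw logits
    t = np.where(mask, targets, 0).astype(float)

    # BCE averaged over non-ignored voxels only
    bce = -(t * np.log(probs + eps) + (1 - t) * np.log(1 - probs + eps))
    bce = bce[mask].mean()

    # Soft Dice computed on the same masked voxels
    p = probs[mask]
    inter = (p * t[mask]).sum()
    dice = 1.0 - (2 * inter + eps) / (p.sum() + t[mask].sum() + eps)
    return bce + dice
```

Because the masking happens before both reductions, changing the prediction at an ignored voxel leaves the loss unchanged, which is the property the surface labels need.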

Impact

  • Surface experiments can now target both OME-Zarr scale 0 and scale 2 correctly.
  • Cache files are isolated per training scale, avoiding cross-scale cache reuse bugs.
  • Surface BCE+Dice runs can use the same ignore-label semantics as the MedialSurfaceRecall runs.
  • The added configs make the 8-run comparison matrix reproducible.

Validation

Executed and verified on the remote H100 node:

  • python3 -m py_compile for the modified Python files
  • uv run --all-extras pytest vesuvius/tests/models/test_surface_multiscale_training.py -q
  • uv run --all-extras pytest vesuvius/tests -k "ome_zarr_resolution or patch_cache or zarr_dataset or surface_multiscale_training" -q
  • fit probe results:
    • 256^3 @ bs=3 fits on a single H100
    • 128^3 @ bs=28 fits on a single H100
  • vesuvius.find_patches completed for the 8 configs, producing 4 unique cache files and 4 expected cache hits across loss variants
  • the corresponding remote training runs were launched successfully with W&B logging enabled

Notes

This PR intentionally scopes to the multiscale training fix, the BCE+Dice compatibility path, the probe helper, tests, and the new experiment configs. It does not include unrelated local or remote worktree changes.

@vercel
vercel bot commented Mar 28, 2026

1 Skipped Deployment: scrollprize-org, Preview Ignored, updated Mar 28, 2026 1:35pm (UTC)

@giorgioangel giorgioangel marked this pull request as ready for review March 28, 2026 08:35
@giorgioangel giorgioangel requested a review from jrudolph as a code owner March 28, 2026 08:35
@giorgioangel giorgioangel requested a review from bruniss March 28, 2026 08:35

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 53d8019ed2

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +338 to +340

    if coord % scale_factor != 0:
        raise ValueError(
            f"Cached full-resolution position {position} is not divisible by scale factor {scale_factor}"
        )
P1: Avoid hard-failing on non-divisible cached coordinates

This assumes every cached full-resolution coordinate is divisible by 2**ome_zarr_resolution, but find_valid_patches explicitly falls back to full-resolution scanning when the requested valid_patch_find_resolution level is unavailable, and those fallback coordinates are not guaranteed to satisfy that divisibility. In that common fallback path, dataset startup will now raise ValueError when loading cache; even when divisible by chance, scaling is still wrong if _open_zarr also fell back to level 0. The conversion should be based on the actually resolved level (or level availability should be validated up front) rather than assuming divisibility here.
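One way to implement the suggestion above (helper name hypothetical; the dataset's actual fallback logic is more involved) is to key the conversion on the level that was actually resolved and to floor-divide rather than raise:

```python
def cached_to_training_coords(position, resolved_level):
    """Convert a cached full-resolution coordinate to the OME-Zarr level
    that was actually opened. With resolved_level == 0 (the full-resolution
    fallback) this is the identity, so no divisibility issue can arise."""
    factor = 2 ** resolved_level
    # Floor-divide instead of raising: coordinates produced by the
    # full-resolution fallback scan need not be divisible by the factor.
    return tuple(int(p) // factor for p in position)
```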


    surface:
      out_channels: 1
      valid_patch_value: 1
      activation: "sigmoid"
P2: Set BCE+Dice surface configs to logit activation

BinaryBCEAndDiceLoss is logit-based (BCEWithLogitsLoss + internal sigmoid), but this config enables activation: "sigmoid". Because NetworkFromConfig applies task activations in eval mode and the validation loop computes losses under model.eval(), validation loss for these runs is computed on already-sigmoided outputs and then transformed again inside the loss, which distorts val-loss tracking and any early-stopping/checkpoint decisions tied to it. These training configs should keep activation: "none" and apply sigmoid only in inference/postprocessing.
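Applied to the quoted config, the suggested fix amounts to a one-line change (sketch; key names as in the quoted snippet):

```yaml
surface:
  out_channels: 1
  valid_patch_value: 1
  activation: "none"   # keep raw logits for BCEWithLogitsLoss; apply sigmoid only at inference
```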


@giorgioangel
Member Author

@bruniss can we merge this?
