vesuvius: add guided Dinovol modes, pixelshuffle pretrained decoder, and MedNeXt architectures#806

Draft
giorgioangel wants to merge 59 commits into main from pr/vesuvius-guided-dinovol-mednext-clean

Conversation

@giorgioangel
Member

Summary

Adds guided volumetric Dinovol segmentation features, a frozen-backbone PixelShuffle decoder, and MedNeXt v1/v2 architectures to vesuvius, while keeping training and inference routed through the existing CLI, NetworkFromConfig, and checkpoint loader paths.

Included

  • Guided Dinovol input gating, encoder-feature gating, skip concatenation, and direct TokenBook segmentation
  • Guided compile/runtime/debug improvements, including compile policy controls and W&B/debug preview support
  • Frozen pretrained_backbone protection from global InitWeights_He
  • model_config.pretrained_decoder_type: pixelshuffle_conv
  • model_config.architecture_type: mednext_v1 and mednext_v2
  • MedNeXt benchmark coverage
  • Config-plumbed AdamW epsilon support
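The AdamW epsilon plumbing could look roughly like the following, a minimal sketch assuming hypothetical config key names (`optimizer_eps`, `initial_lr`, `weight_decay` are illustrative, not the actual vesuvius config schema):

```python
# Hypothetical sketch of config-plumbed AdamW epsilon. Key names below are
# assumptions for illustration, not the real vesuvius configuration keys.
def build_adamw_kwargs(tr_config: dict) -> dict:
    """Collect AdamW keyword arguments, letting the config override eps."""
    return {
        "lr": tr_config.get("initial_lr", 1e-3),
        "weight_decay": tr_config.get("weight_decay", 1e-2),
        # PyTorch's AdamW default eps is 1e-8; expose it via config.
        "eps": float(tr_config.get("optimizer_eps", 1e-8)),
    }
```

The resulting dict would be splatted into `torch.optim.AdamW(model.parameters(), **kwargs)`.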

Final retained behavior

  • The PixelShuffle pretrained-backbone decoder does not use the later experimental input skip
  • The retained PixelShuffle structure is:
    • per-stage Conv -> PixelShuffle -> Conv -> GroupNorm -> GELU
    • final head 3x3x3 -> GroupNorm -> GELU -> 3x3x3 -> 1x1x1 logits
  • mednext_v2 is implemented as a paper-derived extension over vendored MedNeXt v1, with explicit preset selection via mednext_model_id
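The channel-to-space rearrangement behind the per-stage PixelShuffle step can be sketched as follows. This is a minimal NumPy illustration of the (N, C·r³, D, H, W) → (N, C, D·r, H·r, W·r) reshuffle only; the real decoder uses torch modules, and the surrounding Conv/GroupNorm/GELU layers are omitted:

```python
import numpy as np

def pixel_shuffle_3d(x: np.ndarray, r: int) -> np.ndarray:
    """Rearrange (N, C*r^3, D, H, W) -> (N, C, D*r, H*r, W*r),
    the volumetric analogue of torch.nn.PixelShuffle."""
    n, c, d, h, w = x.shape
    assert c % (r ** 3) == 0, "channels must be divisible by r^3"
    c_out = c // (r ** 3)
    # split the channel axis into (c_out, r, r, r) ...
    x = x.reshape(n, c_out, r, r, r, d, h, w)
    # ... then interleave each upscaling factor next to its spatial axis
    x = x.transpose(0, 1, 5, 2, 6, 3, 7, 4)  # n, c_out, d, r, h, r, w, r
    return x.reshape(n, c_out, d * r, h * r, w * r)
```

A stage's first conv would expand channels to C·r³ before this reshuffle, and the following Conv -> GroupNorm -> GELU would refine the upsampled volume.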

Validation

  • uv run --extra models --extra tests pytest tests/models/build/test_guided_network.py tests/models/build/test_mednext_shapes.py tests/models/build/test_primus_shapes.py tests/models/training/test_guided_trainer.py tests/models/training/test_mednext_trainer.py tests/models/training/test_base_trainer.py tests/models/configuration/test_config_manager.py tests/models/configuration/test_ps256_config_compat.py -q
  • Result on the clean PR branch: 139 passed

Benchmark snapshots

Guided Dinovol benchmark on the clean PR branch:

  • 32^3: baseline train step 61.21 ms; direct segmentation 11.57 ms; feature encoder 14.52 ms; skip concat 23.18 ms; input gating 62.65 ms
  • 64^3: baseline train step 16.35 ms; direct segmentation 9.68 ms; feature encoder 16.92 ms; skip concat 24.12 ms; input gating 19.32 ms

MedNeXt benchmark on the clean PR branch:

  • 128^3: UNet train step 118.62 ms; mednext_v1 B 248.90 ms; mednext_v2 L 1160.38 ms
  • 128^3: mednext_v2 L width2 forward runs but train-step OOMs; mednext_v2 B startup OOMs on the local RTX 4090
  • 192^3: only the UNet baseline remains trainable locally; the current MedNeXt variants OOM in this setup

Caveats

  • Do not treat mednext_v2 as upstream-official nnUNet code; it is a paper-derived extension over vendored MedNeXt v1
  • Remote MedNeXt training recipes are still exploratory; VRAM/stability findings are not presented as solved
  • Local working files notes.md and implementation.md were used to reconcile retained behavior vs reverted experiments, but they are outside the villa git repo and are not part of this PR

@vercel

vercel bot commented Apr 4, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment: scrollprize-org (Preview, Ignored), updated Apr 5, 2026 2:23pm

Member Author

Follow-up fix pushed in 941daa8ed:

  • preserve resolved per-target MedNeXt decoder layout in final_config so mixed shared/separate decoder checkpoints rebuild exactly
  • rebuild train.py checkpoints with enable_deep_supervision preserved in the inference loader
  • wrap DS-enabled train.py models for plain inference outputs so strict checkpoint load still works while inference receives highest-resolution logits
  • add regressions for:
    • DS-enabled MedNeXt checkpoint reload through Inferer
    • mixed shared/separate MedNeXt decoder checkpoint reload
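The DS-unwrap idea can be sketched as a thin callable wrapper (illustrative names, not the actual vesuvius classes):

```python
# Hedged sketch of wrapping a deep-supervision model for plain inference.
# A DS-enabled model returns a list of logits from high to low resolution;
# inference callers expect a single tensor.
class DeepSupervisionUnwrap:
    def __init__(self, model):
        # keep the original module intact so strict state-dict
        # loading still sees the exact training-time keys
        self.model = model

    def __call__(self, x):
        out = self.model(x)
        # DS training returns [full_res, half_res, ...];
        # hand back only the highest-resolution logits
        return out[0] if isinstance(out, (list, tuple)) else out
```

Non-DS models pass through unchanged, so the same inference path serves both checkpoint flavors.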

Validation rerun on the clean PR branch:

  • uv run --extra models --extra tests pytest tests/models/build/test_guided_network.py tests/models/build/test_mednext_shapes.py tests/models/build/test_primus_shapes.py tests/models/training/test_guided_trainer.py tests/models/training/test_mednext_trainer.py tests/models/training/test_base_trainer.py tests/models/configuration/test_config_manager.py tests/models/configuration/test_ps256_config_compat.py -q
  • Result: 141 passed

@giorgioangel
Member Author

@codex review


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 941daa8ed2


Comment on lines +980 to +984
if self.guide_freeze:
    with torch.inference_mode():
        frozen_features = self.guide_backbone(x)[0]
return frozen_features.clone()
return self.guide_backbone(x)[0]

P1: Fix non-frozen guide feature path

When guide_freeze is false, this function skips the if block so frozen_features is never assigned, but return frozen_features.clone() still runs unconditionally. That raises UnboundLocalError on the first guided forward pass and makes trainable guidance unusable. The following return self.guide_backbone(x)[0] is unreachable, so the intended non-frozen branch currently never executes.

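A minimal sketch of the corrected control flow, with `torch.inference_mode()` and `.clone()` reduced to comments so the branching runs stand-alone:

```python
# Sketch of the fix: the frozen return must live inside the if block,
# so the trainable (non-frozen) path below becomes reachable.
def guide_features(self, x):
    if self.guide_freeze:
        # real code wraps this in torch.inference_mode()
        frozen_features = self.guide_backbone(x)[0]
        return frozen_features  # real code: frozen_features.clone()
    # non-frozen path: gradients flow into the guide backbone
    return self.guide_backbone(x)[0]
```

With the original indentation, `guide_freeze=False` hit an unassigned `frozen_features`; here both branches return cleanly.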

Comment on lines 1908 to +1911
scaler.step(optimizer)
scaler.update()
optimizer_stepped = True
if should_time_optimizer:

P1: Update EMA weights after optimizer step

EMA support is wired into config/loading, but after scaler.step(optimizer) there is no call to _update_ema_model(model) (and no other call site in BaseTrainer). In runs with ema_enabled: true, the EMA copy stays at initialization and never tracks training weights, so any EMA validation/checkpoint flow silently uses stale parameters.

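A minimal sketch of the missing EMA step, using plain floats instead of tensors (the real trainer would iterate paired parameter tensors right after `scaler.step(optimizer)`):

```python
# Illustrative stand-in for the absent _update_ema_model call:
# exponential moving average of model weights, updated once per optimizer step.
def update_ema(ema_params: dict, model_params: dict, decay: float = 0.999) -> None:
    """In-place EMA update: ema <- decay * ema + (1 - decay) * current."""
    for name, value in model_params.items():
        ema_params[name] = decay * ema_params[name] + (1.0 - decay) * value
```

Without this call after each step, the EMA copy stays at its initialization, which is exactly the stale-parameter failure mode described above.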
