Draft
Commits
59 commits
ce02304
vesuvius: include nest-asyncio in models env
giorgioangel Apr 1, 2026
ddba20d
vesuvius: support local dinovol guide checkpoints
giorgioangel Apr 1, 2026
999f611
vesuvius: add input-gated volumetric guidance
giorgioangel Apr 1, 2026
0aab025
vesuvius: add optional guided supervision loss
giorgioangel Apr 1, 2026
bbc33b9
vesuvius: add guided dinovol config and benchmark
giorgioangel Apr 1, 2026
7c91bef
vesuvius: expand guided benchmark instrumentation
giorgioangel Apr 1, 2026
4becbce
vesuvius: optimize guided runtime transfers
giorgioangel Apr 1, 2026
7f23b7f
vesuvius: add tokenbook prototype cap
giorgioangel Apr 1, 2026
35ec636
vesuvius: document guided performance guidance
giorgioangel Apr 1, 2026
6e9af95
vesuvius: add ps256 config compatibility tests
giorgioangel Apr 1, 2026
e8c9bdd
vesuvius: add guided ps256 configs
giorgioangel Apr 1, 2026
9cb4040
vesuvius: add guide debug image logging
giorgioangel Apr 1, 2026
a46c4ca
vesuvius: document ps256 guide debug guidance
giorgioangel Apr 1, 2026
57fe2cf
vesuvius: stabilize guided compile startup
giorgioangel Apr 1, 2026
2aaf3d4
vesuvius: make guided loss amp safe
giorgioangel Apr 1, 2026
3693c9b
vesuvius: optimize guided ddp defaults
giorgioangel Apr 1, 2026
e4a1b9f
vesuvius: add selective guide compile modes
giorgioangel Apr 1, 2026
6b47e0b
vesuvius: log guided auxiliary losses to wandb
giorgioangel Apr 1, 2026
52c5f82
vesuvius: respect ckpt_out_base for run dirs
giorgioangel Apr 1, 2026
6a3418f
vesuvius: disable guide loss and separate guide debug logging
giorgioangel Apr 1, 2026
af31403
vesuvius: add encoder-only guide gating
giorgioangel Apr 2, 2026
d8a87ae
vesuvius: wire feature-encoder trainer logging
giorgioangel Apr 2, 2026
aec3d42
vesuvius: benchmark feature-encoder guidance
giorgioangel Apr 2, 2026
13b2074
vesuvius: add token-weighted tokenbook guidance
giorgioangel Apr 2, 2026
c708fb4
vesuvius: compile all guidance submodules
giorgioangel Apr 2, 2026
665ae2c
vesuvius: log guide previews to wandb
giorgioangel Apr 2, 2026
ba5ff41
vesuvius: make feature gating skip-only and residual
giorgioangel Apr 2, 2026
10821c7
vesuvius: smooth and pair guide preview rows
giorgioangel Apr 2, 2026
2f7dce1
vesuvius: add frozen skip concat guidance
giorgioangel Apr 2, 2026
9050061
vesuvius: pick nonzero validation debug samples
giorgioangel Apr 2, 2026
6c86b04
vesuvius: rotate validation debug previews by epoch
giorgioangel Apr 2, 2026
b641467
vesuvius: clean feature skip concat config surface
giorgioangel Apr 2, 2026
7ba08ec
vesuvius: project skip concat guides before resize
giorgioangel Apr 2, 2026
61b31a7
vesuvius: narrow skip concat projector widths
giorgioangel Apr 2, 2026
c7cd9e6
vesuvius: normalize skip concat projectors
giorgioangel Apr 2, 2026
4e8455a
vesuvius: add direct tokenbook segmentation mode
giorgioangel Apr 3, 2026
e7b8a3e
vesuvius: freeze pretrained backbones before init
giorgioangel Apr 3, 2026
94eb38e
vesuvius: add pixelshuffle decoder for frozen dino
giorgioangel Apr 3, 2026
f962184
vesuvius: refine pixelshuffle decoder head
giorgioangel Apr 3, 2026
5419e46
vesuvius: add late input skip to pixelshuffle decoder
giorgioangel Apr 3, 2026
ebdd6ba
vesuvius: simplify pixelshuffle final head
giorgioangel Apr 3, 2026
2b89b3e
vesuvius: drop pixelshuffle input skip
giorgioangel Apr 3, 2026
87fee82
vesuvius: add pixelshuffle head normalization
giorgioangel Apr 3, 2026
2769862
vesuvius: vendor mednext v1 core
giorgioangel Apr 4, 2026
8e3c306
vesuvius: integrate mednext v1 architecture
giorgioangel Apr 4, 2026
8b35508
vesuvius: benchmark mednext v1
giorgioangel Apr 4, 2026
71f3e17
vesuvius: add mednext v2 architecture
giorgioangel Apr 4, 2026
092275d
vesuvius: benchmark mednext v2 variants
giorgioangel Apr 4, 2026
7fc3eec
vesuvius: plumb adamw epsilon from config
giorgioangel Apr 4, 2026
e83aa22
vesuvius: fix clean branch benchmark and backbone packaging
giorgioangel Apr 4, 2026
941daa8
vesuvius: fix mednext checkpoint reload paths
giorgioangel Apr 4, 2026
8f6c954
vesuvius: add pixelshuffle pre-refine block
giorgioangel Apr 5, 2026
635d069
vesuvius: drop pixelshuffle conv biases before groupnorm
giorgioangel Apr 5, 2026
a3aceb4
vesuvius: split pixelshuffle pretrained decoder variants
giorgioangel Apr 5, 2026
c73787f
vesuvius: support explicit runtime volume specs
giorgioangel Apr 5, 2026
a42dcae
vesuvius: preserve explicit volume cache ids
giorgioangel Apr 5, 2026
23c0f33
vesuvius: revert bighead stage kernels to 3x3
giorgioangel Apr 5, 2026
a75c92a
vesuvius: rename and reshape pixelshuffle bighead
giorgioangel Apr 5, 2026
0701ecf
vesuvius: tune bighead final conv stack
giorgioangel Apr 5, 2026
101 changes: 101 additions & 0 deletions vesuvius/docs/guided_dinovol.md
@@ -0,0 +1,101 @@
# Guided Dinovol Training

## Environment

From `villa/vesuvius`:

```bash
uv sync --extra models --extra tests
```

## Example Guided Training

Use the example config:

```bash
uv run --extra models python -m vesuvius.models.training.train \
--config src/vesuvius/models/configuration/single_task/ps128_guided_dinovol_ink.yaml \
--input /path/to/data
```

The guided config uses the local volumetric DINO checkpoint at:

```text
/home/giorgio/Projects/dino-vesuvius/dino-checkpoints/checkpoint_step_342500.pt
```

It also enables:

```yaml
model_config:
  guide_tokenbook_tokens: 256
```

This is an opt-in speed setting for the example config only. The model default remains full-grid TokenBook prototypes when the key is omitted.
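
As a rough illustration of what the cap changes, the sketch below counts full-grid prototypes for a cubic patch and applies an optional cap. Only the key `guide_tokenbook_tokens` comes from the config above. The helper name `tokenbook_prototype_count`, the assumption of one prototype per guide token, and the guide patch size of 8 voxels per token are hypothetical and are not taken from the checkpoint or the model code.

```python
from typing import Optional


def tokenbook_prototype_count(
    patch_size: int,
    guide_patch_size: int = 8,  # ASSUMED voxels per guide token along each axis
    cap: Optional[int] = None,  # value of guide_tokenbook_tokens, or None when omitted
) -> int:
    """Return how many TokenBook prototypes a cubic patch would use (illustrative)."""
    tokens_per_axis = patch_size // guide_patch_size
    full_grid = tokens_per_axis ** 3
    # Omitting the key keeps the full grid; in this sketch a cap only reduces the count.
    return full_grid if cap is None else min(full_grid, cap)


model_config = {"guide_tokenbook_tokens": 256}
cap = model_config.get("guide_tokenbook_tokens")  # None if the key is omitted

print(tokenbook_prototype_count(128))           # full grid under the assumed token size
print(tokenbook_prototype_count(128, cap=cap))  # capped count
```

Under these assumptions a `128^3` patch would drop from 4096 full-grid prototypes to 256, which is the kind of reduction the example config opts into.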

Guided `ps256` configs are also available:

```text
src/vesuvius/models/configuration/single_task/ps256_guided_medial.yaml
src/vesuvius/models/configuration/single_task/ps256_guided_dicece.yaml
```

For large guided runs, generate patch caches before training:

```bash
uv run --extra models vesuvius.find_patches \
--config src/vesuvius/models/configuration/single_task/ps256_guided_medial.yaml
```

## Tests

Run the guided coverage:

```bash
uv run --extra models --extra tests python -m pytest \
tests/models/configuration/test_ps256_config_compat.py \
tests/models/build/test_dinovol_local_backbone.py \
tests/models/build/test_guided_network.py \
tests/models/training/test_guided_trainer.py -q
```

## Benchmark

Profile unguided vs guided input gating on the local RTX 4090:

```bash
uv run --extra models python -m vesuvius.models.benchmarks.benchmark_guided_dinovol \
--guide-checkpoint /home/giorgio/Projects/dino-vesuvius/dino-checkpoints/checkpoint_step_342500.pt \
--patch-size 64,64,64 \
--device cuda
```

For faster prototype-count sweeps without compile variants:

```bash
uv run --extra models python -m vesuvius.models.benchmarks.benchmark_guided_dinovol \
--guide-checkpoint /home/giorgio/Projects/dino-vesuvius/dino-checkpoints/checkpoint_step_342500.pt \
--patch-size 64,64,64 \
--device cuda \
--guide-tokenbook-tokens 256 \
--skip-compile-variants \
--skip-stage-breakdown
```

## Current Operational Guidance

- Large guided runs should precompute patch caches with `vesuvius.find_patches` before training.
- Guided training now supports `tr_config.compile_policy`:
  - `auto`: guided models compile the inner module, unguided models keep the legacy DDP-wrapper compile path
  - `module`: compile the inner module before DDP wrapping
  - `ddp_wrapper`: preserve the legacy `torch.compile(DDP(model))` path
  - `off`: eager mode
- The guided backbone path is explicitly excluded from compiler capture because full-token guided `ps256` hit an Inductor `BackendCompilerFailed` crash during the first compiled train step.
- Do not expose or rely on a public `channels_last_3d` toggle; the measured gain was negligible relative to plain compile.
- Prefer capped TokenBook prototypes for large training patches; the example config uses `256` for `128^3`.
- The trainer now logs the guide validation visualization twice when guidance is enabled:
  - embedded inside the composite `debug_image`
  - separately as `debug_guide_image`
- Set `tr_config.startup_timing: true` when debugging slow startup or first-step stalls. This logs dataset init, model build, compile, first batch fetch, first forward/backward, and first optimizer step timings.
- On the local RTX 4090, full `256^3` forward inference for both unguided and guided ps256 configs hit OOM, so practical timing comparisons should use smaller patches or larger-memory GPUs.
- The guide panel/render overhead is small relative to model runtime, so it is reasonable to keep both composite and separate guide-image logging enabled.
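
The `tr_config.compile_policy` values listed above can be read as a small decision table. The sketch below is one hedged way to express it. The helper names `resolve_compile_policy` and `wrap_order` are hypothetical, the trainer's actual implementation is not shown in this document, and the ordering for `off` assumes a DDP run.

```python
VALID_POLICIES = ("auto", "module", "ddp_wrapper", "off")


def resolve_compile_policy(policy: str, guided: bool) -> str:
    """Map a compile_policy value to the concrete mode it implies."""
    if policy not in VALID_POLICIES:
        raise ValueError(f"unknown compile_policy: {policy!r}")
    if policy == "auto":
        # Guided models compile the inner module; unguided models keep
        # the legacy path that compiles the DDP wrapper.
        return "module" if guided else "ddp_wrapper"
    return policy


def wrap_order(mode: str) -> list:
    """Describe the order of compile and DDP wrapping for a resolved mode."""
    if mode == "module":
        return ["compile", "ddp"]  # roughly DDP(torch.compile(model))
    if mode == "ddp_wrapper":
        return ["ddp", "compile"]  # torch.compile(DDP(model))
    return ["ddp"]                 # eager mode: no compile step (assumes DDP is in use)


for guided in (True, False):
    mode = resolve_compile_policy("auto", guided)
    print(guided, mode, wrap_order(mode))
```

Keeping the resolution step separate from the wrapping step makes it easy to log the effective mode at startup, which pairs naturally with `tr_config.startup_timing`.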
1 change: 1 addition & 0 deletions vesuvius/pyproject.toml
@@ -70,6 +70,7 @@ models = [
"typed-argument-parser>=1.11.0",
"wandb[media]>=0.22.0",
"psutil>=7.1.0",
"nest-asyncio>=1.6.0",

# Data pipeline / I/O and compression
"aiohttp>=3.12.15",
1 change: 1 addition & 0 deletions vesuvius/src/vesuvius/models/benchmarks/__init__.py
@@ -0,0 +1 @@
"""Benchmark helpers for model profiling."""