Add refactored MaskBuilder #22
Open
AdamBajger wants to merge 55 commits into main from feature/refactor-maskbuilder-for-xai
55 commits
044fa01 feat: try to migrate changes to RatioPath from RationAI masks (AdamBajger)
12aaf8f refactor: split into files (AdamBajger)
d29feb7 Update ratiopath/masks/write_big_tiff.py (vejtek)
f8002d4 fix: add numpy and jaxtyping imports to mask builder modules (AdamBajger)
dc4b7be Update ratiopath/masks/mask_builders/receptive_field_manipulation.py (AdamBajger)
f4ec8c7 Update pyproject.toml (AdamBajger)
3ded496 fix: overlap naming convention (AdamBajger)
215b962 fix: all imports correction (AdamBajger)
a4e8202 refactor: move __all__ exports to top in mask builder module __init__ (AdamBajger)
c5405cf chore: sync lockfile (AdamBajger)
87723c3 fix: typo bracket (AdamBajger)
39c34c6 chore: ruff check and format (AdamBajger)
40807ee debug: install libvips and openslide (AdamBajger)
e285b6a fix: test filenames (AdamBajger)
561cb6a fix: update geopandas dependency version and refactor GeoJSONParser l…
6531f87 fix: update lock + freeze sync (matejpekar)
c2042d0 fix: mypy (matejpekar)
fd1effb fix: unlink overlaps file in tests which was previously left linked (AdamBajger)
74d2180 Initial plan (Copilot)
f52a84e fix: correct module and class names in example code (Copilot)
d5747e3 fix: update docstring Args to match constructor signature (Copilot)
ee62864 Update tests/test_mask_builders.py (AdamBajger)
a62c3c8 Update ratiopath/masks/mask_builders/__init__.py (AdamBajger)
3d67b05 Initial plan (Copilot)
75f443a fix: correct typo in test docstring (SImple → Simple) (Copilot)
93f06dd Merge pull request #23 from RationAI/copilot/sub-pr-22 (AdamBajger)
6c7bf95 Merge pull request #24 from RationAI/copilot/sub-pr-22-again (AdamBajger)
b5a60df Apply suggestions from code review (AdamBajger)
35565d5 Update tests/test_mask_builders.py (AdamBajger)
bc94f4e Initial plan (Copilot)
b9874c2 chore: run ruff format to fix linting issues (Copilot)
c4519e3 Merge pull request #25 from RationAI/copilot/sub-pr-22-another-one (AdamBajger)
d766442 docs: fix example code (AdamBajger)
1645cd0 refactor: remove obsolete field (AdamBajger)
51921b1 chore: replace ellipsis by pass (AdamBajger)
2c15c1c Initial plan (Copilot)
cb02638 fix: correct docstring example - remove duplicate import, add numpy, … (Copilot)
736853a fix: update all mask builder docstring examples with correct API sign… (Copilot)
bb29f92 docs: add mask builders documentation and run ruff format (Copilot)
1da68db docs: validate mkdocs builds successfully (Copilot)
6b27da0 chore: add site/ to gitignore and remove from git (Copilot)
cf42a1d docs: clarify generate_tiles_from_slide is a placeholder function (Copilot)
5388463 Merge pull request #26 from RationAI/copilot/sub-pr-22-yet-again (AdamBajger)
cc6d41b fix: remove unnecessary pass to satisfylinter ruff (AdamBajger)
e207fcf Initial plan (Copilot)
45db890 chore: run ruff format to fix linting errors (Copilot)
3177f5d Merge pull request #27 from RationAI/copilot/sub-pr-22-one-more-time (AdamBajger)
af4fd45 fix: add explicit dtype parameter (AdamBajger)
3972bf9 docs: add docstrings (AdamBajger)
466dfca fix: inheritance param mismatches (AdamBajger)
5022161 fix: bump numpy version (AdamBajger)
b87ce27 fix: ruff formatting and linting (AdamBajger)
8552d3c docs: fix OpenSLide level_dimensions use in examples (AdamBajger)
aa959d4 fix: enhance memory setup in AutoScalingAveragingClippingNumpyMemMapM… (AdamBajger)
915cd41 chore: ruff format (AdamBajger)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -19,4 +19,7 @@ wheels/ | |
| .mypy_cache/ | ||
|
|
||
| # VS Code | ||
| .vscode/ | ||
| .vscode/ | ||
|
|
||
| # MkDocs | ||
| site/ | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,254 @@ | ||
| # Mask Builders | ||
|
|
||
| Mask builders are tools for assembling feature masks from neural network predictions or other tile-level data. They handle the complexity of combining overlapping tiles, scaling between coordinate spaces, and managing memory for large output masks. | ||
|
|
||
| ## Overview | ||
|
|
||
| When processing whole-slide images with neural networks, you often need to: | ||
|
|
||
| 1. Extract tiles from a slide | ||
| 2. Run inference to get predictions or features for each tile | ||
| 3. Assemble these predictions back into a full-resolution mask | ||
|
|
||
| Mask builders automate step 3, handling: | ||
|
|
||
| - **Coordinate transformation**: Converting from tile coordinates to mask coordinates | ||
| - **Overlap handling**: Averaging or taking the maximum when tiles overlap | ||
| - **Memory management**: Using in-memory arrays or memory-mapped files for large masks | ||
| - **Edge clipping**: Removing boundary artifacts from tiles | ||
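The overlap handling listed above can be sketched in plain NumPy, assuming a running sum plus a per-pixel overlap counter. The helper name `average_overlapping_tiles` and its arguments are hypothetical and are not part of ratiopath's API; the sketch only illustrates the averaging idea.

```python
import numpy as np


def average_overlapping_tiles(mask_shape, tiles, coords):
    """Average tiles into a mask (hypothetical helper, not ratiopath's API).

    mask_shape: (H, W) of the output mask.
    tiles: array of shape (B, h, w).
    coords: integer array of shape (2, B) holding ys, then xs, of tile corners.
    """
    accumulator = np.zeros(mask_shape, dtype=np.float64)
    overlap = np.zeros(mask_shape, dtype=np.int64)
    for tile, y, x in zip(tiles, coords[0], coords[1]):
        h, w = tile.shape
        accumulator[y : y + h, x : x + w] += tile
        overlap[y : y + h, x : x + w] += 1
    # Divide only where at least one tile contributed; other pixels stay 0.
    averaged = np.divide(
        accumulator, overlap, out=np.zeros_like(accumulator), where=overlap > 0
    )
    return averaged, overlap


# Two 4x4 tiles that overlap in columns 2 and 3 of a 4x6 mask.
tiles = np.stack([np.full((4, 4), 1.0), np.full((4, 4), 3.0)])
coords = np.array([[0, 0], [0, 2]])  # ys = [0, 0], xs = [0, 2]
mask, overlap = average_overlapping_tiles((4, 6), tiles, coords)
```

In the overlapping columns the result is the mean of the two tile values, and the overlap counter records how many tiles touched each pixel.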
|
|
||
| ## Available Builders | ||
|
|
||
| ### AveragingScalarUniformTiledNumpyMaskBuilder | ||
|
|
||
| ::: ratiopath.masks.mask_builders.AveragingScalarUniformTiledNumpyMaskBuilder | ||
|
|
||
| **Use case**: You have scalar predictions (e.g., class probabilities) for each tile and want to create a mask where each prediction is uniformly expanded to fill the tile's footprint. Overlapping regions are averaged. | ||
|
|
||
| **Example**: Creating a heatmap of tumor probability predictions. | ||
|
|
||
| ```python | ||
| import numpy as np | ||
| import openslide | ||
| from ratiopath.masks.mask_builders import ( | ||
| AveragingScalarUniformTiledNumpyMaskBuilder, | ||
| ) | ||
| import matplotlib.pyplot as plt | ||
|
|
||
| # Open slide and set up tiling parameters | ||
| LEVEL = 3 | ||
| tile_extents = (512, 512) | ||
| tile_strides = (256, 256) | ||
| slide = openslide.OpenSlide("path/to/slide.mrxs") | ||
| slide_extent_x, slide_extent_y = slide.level_dimensions[LEVEL] | ||
|
|
||
| # Load your model | ||
| vgg16_model = load_vgg16_model(...) # load your pretrained model here | ||
|
|
||
| # Initialize mask builder | ||
| mask_builder = AveragingScalarUniformTiledNumpyMaskBuilder( | ||
| mask_extents=(slide_extent_y, slide_extent_x), | ||
| channels=1, # for binary classification | ||
| mask_tile_extents=tile_extents, | ||
| mask_tile_strides=tile_strides, | ||
| ) | ||
|
|
||
| # Process tiles | ||
| # Note: generate_tiles_from_slide is a placeholder - you must implement your own tile extraction logic | ||
| for tiles, xs, ys in generate_tiles_from_slide( | ||
|     slide, LEVEL, tile_extents, tile_strides, batch_size=32 | ||
| ): | ||
|     # tiles has shape (B, C, H, W) | ||
|     features = vgg16_model.predict(tiles)  # features has shape (B, channels) | ||
|     # Stack ys and xs into coords_batch with shape (N, B) where N=2 (y, x dimensions) | ||
|     coords_batch = np.stack([ys, xs], axis=0) | ||
|     mask_builder.update_batch(features, coords_batch) | ||
|
|
||
| # Finalize and visualize | ||
| assembled_mask, overlap = mask_builder.finalize() | ||
| plt.imshow(assembled_mask[0], cmap="gray", interpolation="nearest") | ||
| plt.axis("off") | ||
| plt.show() | ||
| ``` | ||
|
|
||
| --- | ||
|
|
||
| ### MaxScalarUniformTiledNumpyMaskBuilder | ||
|
|
||
| ::: ratiopath.masks.mask_builders.MaxScalarUniformTiledNumpyMaskBuilder | ||
|
|
||
| **Use case**: Similar to the averaging builder, but takes the maximum value at each pixel instead of averaging. Useful when you want to preserve the strongest signal. | ||
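As a rough illustration of maximum aggregation, the sketch below keeps the element-wise maximum where two tiles overlap. It uses only NumPy and does not reflect how the builder is implemented internally; the zero-initialised mask assumes non-negative values.

```python
import numpy as np

# Zero-initialised mask: assumes all tile values are non-negative.
mask = np.zeros((4, 6), dtype=np.float32)
tiles = np.stack([np.full((4, 4), 0.2), np.full((4, 4), 0.9)]).astype(np.float32)
ys, xs = np.array([0, 0]), np.array([0, 2])

for tile, y, x in zip(tiles, ys, xs):
    h, w = tile.shape
    region = mask[y : y + h, x : x + w]  # a view into the mask
    # Keep the strongest value seen so far at every pixel.
    np.maximum(region, tile, out=region)
```

Unlike averaging, no overlap counter is needed, because the result at each pixel does not depend on how many tiles covered it.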
|
|
||
| **Example**: Creating activation maps from intermediate network layers. | ||
|
|
||
| ```python | ||
| import numpy as np | ||
| import openslide | ||
| from ratiopath.masks.mask_builders import MaxScalarUniformTiledNumpyMaskBuilder | ||
| import matplotlib.pyplot as plt | ||
| from rationai.explainability.model_probing import HookedModule | ||
|
|
||
| LEVEL = 3 | ||
| tile_extents = (512, 512) | ||
| tile_strides = (256, 256) | ||
| slide = openslide.OpenSlide("path/to/slide.mrxs") | ||
| slide_extent_x, slide_extent_y = slide.level_dimensions[LEVEL] | ||
|
|
||
| # Set up model with hooks to extract intermediate activations | ||
| vgg16_model = load_vgg16_model(...) | ||
| hooked_model = HookedModule(vgg16_model, layer_name="backbone.9") | ||
|
|
||
| mask_builder = MaxScalarUniformTiledNumpyMaskBuilder( | ||
| mask_extents=(slide_extent_y, slide_extent_x), | ||
| channels=1, | ||
| mask_tile_extents=tile_extents, | ||
| mask_tile_strides=tile_strides, | ||
| ) | ||
|
|
||
| # Note: generate_tiles_from_slide is a placeholder - you must implement your own tile extraction logic | ||
| for tiles, xs, ys in generate_tiles_from_slide( | ||
|     slide, LEVEL, tile_extents, tile_strides, batch_size=32 | ||
| ): | ||
|     # tiles has shape (B, C, H, W) | ||
|     outputs = hooked_model.predict(tiles)  # outputs are not used directly | ||
|     features = hooked_model.get_activations("backbone.9")  # shape (B, C, H, W) | ||
|     # Stack ys and xs into coords_batch with shape (N, B) where N=2 (y, x dimensions) | ||
|     coords_batch = np.stack([ys, xs], axis=0) | ||
|     mask_builder.update_batch(features, coords_batch) | ||
|
|
||
| (assembled_mask,) = mask_builder.finalize() | ||
| plt.imshow(assembled_mask[0], cmap="gray", interpolation="nearest") | ||
| plt.axis("off") | ||
| plt.show() | ||
| ``` | ||
|
|
||
| --- | ||
|
|
||
| ### AutoScalingAveragingClippingNumpyMemMapMaskBuilder2D | ||
|
|
||
| ::: ratiopath.masks.mask_builders.AutoScalingAveragingClippingNumpyMemMapMaskBuilder2D | ||
|
|
||
| **Use case**: You have high-resolution feature maps from a network and need to: | ||
| - Handle masks too large for RAM (using memory-mapped files) | ||
| - Automatically scale coordinates from input to output space | ||
| - Remove edge artifacts from tiles | ||
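The coordinate scaling and edge clipping listed above can be sketched as follows. The scaling rule (mask tile extent divided by source tile extent) is an assumption made for illustration. The text does not state the order of the four values in `clip`, so the sketch uses explicitly named variables instead.

```python
import numpy as np

source_tile_extents = np.array([512, 512])
source_tile_strides = np.array([256, 256])
mask_tile_extents = np.array([64, 64])

# Assumed scaling rule: mask resolution relative to source resolution.
scale = mask_tile_extents / source_tile_extents
mask_tile_strides = (source_tile_strides * scale).astype(int)

# Source-space tile coordinates with shape (2, B): ys first, then xs.
source_coords = np.array([[0, 256], [512, 768]])
mask_coords = (source_coords * scale[:, None]).astype(int)

# Edge clipping: drop 4 pixels from every side of a (C, H, W) tile.
clip_top, clip_bottom, clip_left, clip_right = 4, 4, 4, 4
tile = np.ones((3, 64, 64), dtype=np.float32)
clipped = tile[:, clip_top : 64 - clip_bottom, clip_left : 64 - clip_right]
# The clipped tile must be written at an offset from the tile origin.
clipped_offset = np.array([clip_top, clip_left])
```

With these numbers, a 256-pixel source stride becomes a 32-pixel mask stride, and each 64×64 tile contributes only its central 56×56 region.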
|
|
||
| **Example**: Building attention maps with edge clipping to remove boundary artifacts. | ||
|
|
||
| ```python | ||
| import numpy as np | ||
| import openslide | ||
| from ratiopath.masks.mask_builders import ( | ||
| AutoScalingAveragingClippingNumpyMemMapMaskBuilder2D, | ||
| ) | ||
| from rationai.explainability.model_probing import HookedModule | ||
| import matplotlib.pyplot as plt | ||
|
|
||
| LEVEL = 3 | ||
| tile_extents = (512, 512) | ||
| tile_strides = (256, 256) | ||
| slide = openslide.OpenSlide("path/to/slide.mrxs") | ||
| slide_extent_x, slide_extent_y = slide.level_dimensions[LEVEL] | ||
|
|
||
| vgg16_model = load_vgg16_model(...) | ||
| hooked_model = HookedModule(vgg16_model, layer_name="backbone.9") | ||
|
|
||
| # This builder handles coordinate scaling and uses memory-mapped storage | ||
| mask_builder = AutoScalingAveragingClippingNumpyMemMapMaskBuilder2D( | ||
| source_extents=(slide_extent_y, slide_extent_x), | ||
| source_tile_extents=tile_extents, | ||
| source_tile_strides=tile_strides, | ||
| mask_tile_extents=(64, 64), # output resolution per tile | ||
| channels=3, # for RGB masks | ||
| clip=(4, 4, 4, 4), # clip 4 pixels from each edge | ||
| ) | ||
|
|
||
| # Note: generate_tiles_from_slide is a placeholder - you must implement your own tile extraction logic | ||
| for tiles, xs, ys in generate_tiles_from_slide( | ||
|     slide, LEVEL, tile_extents, tile_strides, batch_size=32 | ||
| ): | ||
|     # tiles has shape (B, C, H, W) | ||
|     output = vgg16_model.predict(tiles)  # outputs are not used directly | ||
|     features = hooked_model.get_activations("backbone.9")  # shape (B, C, H, W) | ||
|     # Stack ys and xs into coords_batch with shape (N, B) where N=2 (y, x dimensions) | ||
|     coords_batch = np.stack([ys, xs], axis=0) | ||
|     mask_builder.update_batch(features, coords_batch) | ||
|
|
||
| assembled_mask, overlap = mask_builder.finalize() | ||
| plt.imshow(assembled_mask[0], cmap="gray", interpolation="nearest") | ||
| plt.axis("off") | ||
| plt.show() | ||
| ``` | ||
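For readers unfamiliar with memory-mapped storage, here is a small sketch of a disk-backed accumulator using `np.memmap`. The file name, array shape, and temporary directory are illustrative assumptions and say nothing about where or how the builder stores its own files.

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "accumulator.dat")  # hypothetical file name
    # mode="w+" creates the file; pages are loaded from disk only when touched.
    accumulator = np.memmap(path, dtype=np.float32, mode="w+", shape=(3, 1024, 2048))
    accumulator[:, 0:64, 0:64] += 1.0  # write one tile-sized region
    accumulator.flush()  # push pending changes to disk
    total = float(accumulator.sum())
    del accumulator  # release the mapping before the directory is removed
```

The array is indexed like any other NumPy array, which is why an accumulation loop can stay the same whether the mask lives in RAM or on disk.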
|
|
||
| --- | ||
|
|
||
| ### AutoScalingScalarUniformValueConstantStrideMaskBuilder | ||
|
|
||
| ::: ratiopath.masks.mask_builders.AutoScalingScalarUniformValueConstantStrideMaskBuilder | ||
|
|
||
| **Use case**: Your network outputs scalar predictions per tile, and you want each prediction to represent a fixed-size region in the output mask, automatically handling coordinate scaling. | ||
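The idea of expanding one scalar per tile to a fixed-size region can be sketched with NumPy broadcasting. This is only an illustration of the concept under assumed shapes, not the builder's actual code path.

```python
import numpy as np

predictions = np.array([[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]])  # (B, channels)
mask_tile_extents = (64, 64)

# Broadcast each per-channel scalar over the tile footprint:
# (B, C) -> (B, C, 64, 64), without copying the underlying data.
expanded = np.broadcast_to(
    predictions[:, :, None, None], predictions.shape + mask_tile_extents
)
```

Every pixel of `expanded[b, c]` equals `predictions[b, c]`, so the expanded tiles can then be accumulated like ordinary feature-map tiles.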
|
|
||
| **Example**: Creating a low-resolution classification map where each tile's prediction covers a 64×64 region. | ||
|
|
||
| ```python | ||
| import numpy as np | ||
| import openslide | ||
| from ratiopath.masks.mask_builders import ( | ||
| AutoScalingScalarUniformValueConstantStrideMaskBuilder, | ||
| ) | ||
| import matplotlib.pyplot as plt | ||
|
|
||
| LEVEL = 3 | ||
| tile_extents = (512, 512) | ||
| tile_strides = (256, 256) | ||
| slide = openslide.OpenSlide("path/to/slide.mrxs") | ||
| slide_extent_x, slide_extent_y = slide.level_dimensions[LEVEL] | ||
| classifier_model = load_classifier_model(...) | ||
|
|
||
| # Build a mask where each scalar prediction covers 64x64 pixels in output | ||
| mask_builder = AutoScalingScalarUniformValueConstantStrideMaskBuilder( | ||
| source_extents=(slide_extent_y, slide_extent_x), | ||
| source_tile_extents=tile_extents, | ||
| source_tile_strides=tile_strides, | ||
| mask_tile_extents=(64, 64), # each scalar value expands to 64x64 | ||
| channels=3, # for multi-class predictions | ||
| ) | ||
|
|
||
| # Note: generate_tiles_from_slide is a placeholder - you must implement your own tile extraction logic | ||
| for tiles, xs, ys in generate_tiles_from_slide( | ||
|     slide, LEVEL, tile_extents, tile_strides, batch_size=32 | ||
| ): | ||
|     # tiles has shape (B, C, H, W) | ||
|     predictions = classifier_model.predict(tiles)  # predictions has shape (B, channels) | ||
|     # Stack ys and xs into coords_batch with shape (N, B) where N=2 (y, x dimensions) | ||
|     coords_batch = np.stack([ys, xs], axis=0) | ||
|     mask_builder.update_batch(predictions, coords_batch) | ||
|
|
||
| assembled_mask, overlap = mask_builder.finalize() | ||
| plt.imshow(assembled_mask[0], cmap="viridis", interpolation="nearest") | ||
| plt.axis("off") | ||
| plt.show() | ||
| ``` | ||
|
|
||
| ## Choosing a Mask Builder | ||
|
|
||
| | Builder | Scalar/Feature Map | Aggregation | Memory | Auto-scaling | Edge Clipping | | ||
| |---------|-------------------|-------------|---------|--------------|---------------| | ||
| | `AveragingScalarUniformTiledNumpyMaskBuilder` | Scalar | Average | RAM | No | No | | ||
| | `MaxScalarUniformTiledNumpyMaskBuilder` | Feature Map | Max | RAM | No | No | | ||
| | `AutoScalingAveragingClippingNumpyMemMapMaskBuilder2D` | Feature Map | Average | Disk (memmap) | Yes | Yes | | ||
| | `AutoScalingScalarUniformValueConstantStrideMaskBuilder` | Scalar | Average | RAM | Yes | No | | ||
|
|
||
| ## Coordinate System Notes | ||
|
|
||
| All mask builders expect coordinates as an array of shape `(N, B)` where: | ||
| - `N` is the number of spatial dimensions (typically 2 for height and width) | ||
| - `B` is the batch size | ||
|
|
||
| When implementing your own tile extraction logic (such as the `generate_tiles_from_slide` placeholder shown in examples), you should provide `xs` and `ys` arrays representing tile coordinates. Stack them as: | ||
|
|
||
| ```python | ||
| coords_batch = np.stack([ys, xs], axis=0) # Shape: (2, B) | ||
| ``` | ||
|
|
||
| Note the order: `[ys, xs]` not `[xs, ys]`, as the first dimension represents height (y) and the second represents width (x). |
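Since `generate_tiles_from_slide` is only a placeholder, here is one possible stand-in that tiles an in-memory `(C, H, W)` array rather than a slide. The function name `generate_tiles_from_array`, its signature, the use of top-left corners as tile coordinates, and the choice to skip partial border tiles are all assumptions made for this sketch.

```python
import numpy as np


def generate_tiles_from_array(image, tile_extents, tile_strides, batch_size):
    """Yield (tiles, xs, ys) batches from a (C, H, W) array.

    Hypothetical stand-in for the `generate_tiles_from_slide` placeholder;
    it skips partial tiles at the right and bottom borders.
    """
    _, height, width = image.shape
    tile_h, tile_w = tile_extents
    stride_y, stride_x = tile_strides
    tiles, xs, ys = [], [], []
    for y in range(0, height - tile_h + 1, stride_y):
        for x in range(0, width - tile_w + 1, stride_x):
            tiles.append(image[:, y : y + tile_h, x : x + tile_w])
            xs.append(x)
            ys.append(y)
            if len(tiles) == batch_size:
                yield np.stack(tiles), np.array(xs), np.array(ys)
                tiles, xs, ys = [], [], []
    if tiles:  # flush the final, possibly smaller batch
        yield np.stack(tiles), np.array(xs), np.array(ys)


image = np.zeros((3, 1024, 1536), dtype=np.uint8)
batches = list(generate_tiles_from_array(image, (512, 512), (256, 256), batch_size=4))
tiles, xs, ys = batches[0]
coords_batch = np.stack([ys, xs], axis=0)  # shape (2, B): ys first, then xs
```

A real implementation would read regions from the slide at the chosen level instead of slicing an array, but the yielded `(tiles, xs, ys)` structure would stay the same.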
🧩 Analysis chain
🏁 Script executed:
Repository: RationAI/ratiopath
Length of output: 76
🏁 Script executed:
cat -n pyproject.toml | head -60
Repository: RationAI/ratiopath
Length of output: 1595
🌐 Web query:
jaxtyping latest stable version release documentation
💡 Result:
Latest stable jaxtyping release: 0.3.5 — uploaded to PyPI on January 5, 2026. Install: pip install jaxtyping. Documentation: https://docs.kidger.site/jaxtyping/ [1][2]
Add version constraint for jaxtyping to match project conventions.
Line 30: jaxtyping is the only dependency without a version constraint. All other dependencies specify minimum versions (e.g., numpy>=2.2.2, torch>=2.6.0). Add jaxtyping>=0.3.5 to maintain consistency and prevent unexpected breaking changes in future releases.
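A check like the one described in this comment could be automated roughly as follows. The dependency list is a hypothetical excerpt that mirrors only the three entries named in the comment, and the regular expression is a simplification that ignores extras and environment markers.

```python
import re

# Hypothetical excerpt; the real pyproject.toml lists more dependencies.
dependencies = ["numpy>=2.2.2", "torch>=2.6.0", "jaxtyping"]

# A requirement counts as constrained if it contains a version operator.
VERSION_OPERATOR = re.compile(r"(>=|<=|==|~=|!=|<|>)")

unconstrained = [dep for dep in dependencies if not VERSION_OPERATOR.search(dep)]
print(unconstrained)
```

For this excerpt the only entry reported is `jaxtyping`, which matches the reviewer's observation.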
☝
@Adames4, thanks for acknowledging! Just a friendly reminder to add the version constraint jaxtyping>=0.3.5 to maintain consistency with the other dependencies when you get a chance.