@AdamBajger AdamBajger commented Jan 9, 2026

Summary by CodeRabbit

  • New Features

    • Mask-building toolkit: averaging & max aggregation, edge clipping, autoscaling across resolutions/strides, scalar→tile expansion, RAM- or disk-backed accumulators, and BigTIFF export.
  • Tests

    • Extensive unit tests for tiling, aggregation, clipping, autoscaling, memmap persistence/cleanup; deterministic test seeding added.
  • Documentation

    • Added Mask Builders reference page with examples and navigation entry.
  • Chores

    • Project bumped to 1.1.0; added jaxtyping and updated geopandas; CI now installs OpenSlide/libvips and uses updated dependency-sync flags.
  • Stability

    • Minor GeoJSON handling tweak, small typing cleanup, and .gitignore updated.


@coderabbitai
coderabbitai bot commented Jan 9, 2026

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough


Adds a tile-based mask assembly system: MaskBuilder core, allocator mixins (ndarray/memmap), receptive-field mixins (clipping, autoscaling, scalar→tile), aggregation mixins (averaging, max), concrete builders, BigTIFF writer, tests, docs, CI tweaks, and minor dependency updates (jaxtyping added; geopandas bumped).

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Dependency & Project | pyproject.toml | Version bumped to 1.1.0; added dependency jaxtyping; updated geopandas from >=1.1.1 to >=1.1.2. |
| Core Framework | ratiopath/masks/mask_builders/mask_builder.py | New MaskBuilder base class and helper compute_acc_slices; defines allocate_accumulator, setup_memory, update_batch, update_batch_impl, finalize. |
| Aggregation Mixins | ratiopath/masks/mask_builders/aggregation.py | Added AveragingMaskBuilderMixin (accumulator + overlap_counter, averaging finalize) and MaxMaskBuilderMixin (elementwise max aggregation). |
| Storage Backends | ratiopath/masks/mask_builders/storage.py | Added NumpyMemMapMaskBuilderAllocatorMixin (temp/file-backed memmap allocation) and NumpyArrayMaskBuilderAllocatorMixin (in-memory ndarray allocation). |
| Receptive-field Utilities | ratiopath/masks/mask_builders/receptive_field_manipulation.py | Added EdgeClippingMaskBuilderMixin / EdgeClippingMaskBuilder2DMixin, AutoScalingConstantStrideMixin, and ScalarUniformTiledMaskBuilder (clipping, autoscaling/rescaling, scalar→tile expansion). |
| Public Builders & API | ratiopath/masks/mask_builders/__init__.py | Added concrete builders: AveragingScalarUniformTiledNumpyMaskBuilder, MaxScalarUniformTiledNumpyMaskBuilder, AutoScalingAveragingClippingNumpyMemMapMaskBuilder2D, AutoScalingScalarUniformValueConstantStrideMaskBuilder; updated __all__ and docstrings. |
| TIFF Utility | ratiopath/masks/write_big_tiff.py | Added write_big_tiff(image, path, mpp_x, mpp_y, tile_width=512, tile_height=512) to save pyvips Images as BigTIFF with DEFLATE, tiling, and pyramid. |
| GeoJSON Parser | ratiopath/parsers/geojson_parser.py | Simplified first-subkey handling in get_filtered_geodataframe: use column values directly (removed JSON-string parsing branch). |
| Tests & Fixtures | tests/conftest.py, tests/test_mask_builders.py | Added autouse RNG-seeding fixture and extensive tests covering averaging/max aggregation, clipping, memmap temp & persistent behavior, autoscaling, batching and finalization. |
| CI Workflows | .github/workflows/tests.yml, .../build-docs.yml, .../publish.yml | Tests workflow: added openslide & libvips install steps; changed uv sync flags from --locked to --frozen and adjusted invocations. |
| Docs / Navigation | docs/reference/masks/mask_builders.md, mkdocs.yml | New comprehensive Mask Builders documentation and mkdocs navigation entry. |
| Miscellaneous | ratiopath/tiling/overlays.py, .gitignore | Minor type-hint tweak; added site/ to .gitignore. |
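
To illustrate the averaging aggregation summarised in the table above (an accumulator plus an overlap counter, divided at finalize time), here is a minimal NumPy sketch. It is not the PR's implementation: the function name `accumulate_tiles`, the `(2, B)` coordinate layout, and the dtypes are assumptions chosen for illustration.

```python
import numpy as np


def accumulate_tiles(mask_extents, tiles, coords):
    """Average overlapping 2D tiles into a single mask (illustrative only).

    tiles:  array of shape (B, C, tile_h, tile_w)
    coords: integer array of shape (2, B) holding the top-left (row, col)
            of every tile; this layout is an assumption of the sketch.
    """
    channels = tiles.shape[1]
    tile_h, tile_w = tiles.shape[2:]
    accumulator = np.zeros((channels, *mask_extents), dtype=np.float32)
    overlap_counter = np.zeros((1, *mask_extents), dtype=np.uint16)

    for tile, (row, col) in zip(tiles, coords.T):
        rows = slice(row, row + tile_h)
        cols = slice(col, col + tile_w)
        accumulator[:, rows, cols] += tile   # sum of all contributions
        overlap_counter[:, rows, cols] += 1  # how many tiles touched each pixel

    # Divide by the number of contributions; pixels never touched stay at 0.
    averaged = accumulator / np.maximum(overlap_counter, 1)
    return averaged, overlap_counter
```

Keeping the counter separate from the accumulator means the division happens once at the end, so the order in which batches arrive does not affect the result.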

Sequence Diagram(s)

sequenceDiagram
    participant User as User
    participant Builder as MaskBuilder
    participant Alloc as AllocatorMixin
    participant RF as ReceptiveFieldMixin
    participant Agg as AggregationMixin
    participant Acc as Accumulator

    User->>Builder: __init__(mask_extents, channels, ...)
    Builder->>Alloc: allocate_accumulator(mask_extents, channels, dtype, ...)
    Alloc->>Acc: create ndarray / memmap (channels, *mask_extents)
    Acc-->>Builder: accumulator assigned

    User->>Builder: update_batch(data_batch, coords_batch)
    Builder->>RF: update_batch (clip / scale / expand)
    RF->>Agg: update_batch_impl (compute_acc_slices)
    Agg->>Acc: write/update accumulator (and overlap counter if averaging)

    User->>Builder: finalize()
    Agg-->>User: final mask (and overlap counter if present)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hop through tiles, stitch by stitch,
Clips trimmed neat, scales set to pitch,
Memmaps cradle heavy loads of light,
Means and maxima twine through night,
A rabbit seals the tiled map with delight.

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 57.50%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Add refactored MaskBuilder' directly describes the main change: introducing a refactored MaskBuilder implementation with multiple new classes and supporting infrastructure across the mask_builders module. |


@gemini-code-assist
Summary of Changes

Hello @AdamBajger, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the mask building infrastructure by introducing a highly modular and extensible MaskBuilder system. The new design leverages Python's multiple inheritance to allow for the flexible combination of various functionalities, such as different aggregation methods (averaging, max), advanced receptive field manipulations (edge clipping, automatic coordinate scaling), and optimized storage solutions (in-memory NumPy arrays or memory-mapped files for large datasets). This enhancement aims to provide a robust, efficient, and adaptable framework for assembling complex masks from tiled data, particularly beneficial for applications involving large images and neural network outputs.

Highlights

  • Modular MaskBuilder Architecture: Introduced a new MaskBuilder system using a cooperative multiple inheritance pattern, allowing flexible composition of features.
  • Aggregation Mixins: Added AveragingMaskBuilderMixin and MaxMaskBuilderMixin for different tile overlap aggregation strategies.
  • Receptive Field Manipulation: Implemented mixins for EdgeClipping and AutoScalingConstantStride to handle resolution mismatches and boundary artifacts.
  • Storage Backends: Provided NumpyArrayMaskBuilderAllocatorMixin and NumpyMemMapMaskBuilderAllocatorMixin for efficient memory management, including disk-backed storage for large masks.
  • New Utility for BigTIFF: Added write_big_tiff function for saving pyvips images in BigTIFF format, suitable for large-scale image viewing applications.
  • Comprehensive Testing: Included extensive unit tests for all new MaskBuilder components, ensuring robustness and correctness.
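
The cooperative multiple inheritance pattern named in the highlights can be sketched as follows. This is a toy model under stated assumptions, not the PR's code: every class name here (`MaskBuilderBase`, `InMemoryAllocatorMixin`, `ShiftCoordsMixin`, `MaxAggregationMixin`, `ShiftedMaxBuilder`) is hypothetical, and only the method names `allocate_accumulator`, `update_batch`, and `update_batch_impl` are taken from the summaries above.

```python
import numpy as np


class MaskBuilderBase:
    """Hypothetical stand-in for the PR's MaskBuilder base class."""

    def __init__(self, mask_extents, channels):
        self.accumulator = self.allocate_accumulator(mask_extents, channels)

    def allocate_accumulator(self, mask_extents, channels):
        raise NotImplementedError

    def update_batch(self, data_batch, coords_batch):
        # End of the cooperative chain: hand over to the aggregation step.
        self.update_batch_impl(data_batch, coords_batch)

    def update_batch_impl(self, data_batch, coords_batch):
        raise NotImplementedError


class InMemoryAllocatorMixin(MaskBuilderBase):
    """Storage mixin: decides where the accumulator lives."""

    def allocate_accumulator(self, mask_extents, channels):
        return np.zeros((channels, *mask_extents), dtype=np.float32)


class ShiftCoordsMixin(MaskBuilderBase):
    """Receptive-field style mixin: adjusts inputs, then defers via super()."""

    def __init__(self, *args, shift=0, **kwargs):
        super().__init__(*args, **kwargs)
        self.shift = shift

    def update_batch(self, data_batch, coords_batch):
        super().update_batch(data_batch, coords_batch + self.shift)


class MaxAggregationMixin(MaskBuilderBase):
    """Aggregation mixin: decides how overlapping tiles are combined."""

    def update_batch_impl(self, data_batch, coords_batch):
        tile_h, tile_w = data_batch.shape[2:]
        for tile, (row, col) in zip(data_batch, coords_batch.T):
            region = self.accumulator[:, row:row + tile_h, col:col + tile_w]
            np.maximum(region, tile, out=region)  # writes through the view


class ShiftedMaxBuilder(ShiftCoordsMixin, InMemoryAllocatorMixin, MaxAggregationMixin):
    """Concrete builder: behaviour comes entirely from the mixin order."""
```

Each mixin overrides exactly one concern and calls `super()` where it needs the rest of the chain, so swapping storage or aggregation is a matter of changing the base-class list.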


@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a well-structured, mixin-based architecture for mask builders, which is a great improvement for modularity and reusability. The separation of concerns into aggregation, storage, and receptive field manipulation is clean. However, there are several critical issues, mostly related to missing imports, that will prevent the code from running. There are also some bugs, particularly in temporary file handling for memory-mapped arrays, and several inconsistencies in docstrings and tests. Once these issues are addressed, this will be a solid contribution.

Comment on lines +38 to +45
with tempfile.NamedTemporaryFile() as temp_file:
    return np.lib.format.open_memmap(
        temp_file.name,
        mode="w+",
        shape=(channels, *mask_extents),
        dtype=dtype,
        version=(3, 0),
    )

critical

Using tempfile.NamedTemporaryFile within a with statement is incorrect here. The temporary file is deleted as soon as the with block is exited, but the returned memmap object still refers to it, making it invalid. Any attempt to use the accumulator will fail.

To fix this, the NamedTemporaryFile object must be kept alive for the lifetime of the MaskBuilder instance. You can store it as an instance attribute to prevent it from being garbage collected prematurely.

            # The temporary file object must be kept alive for the lifetime of the memmap.
            # We store it on the instance to prevent it from being garbage collected and
            # the file from being deleted when this function returns.
            if not hasattr(self, "_temp_files"):
                self._temp_files = []
            temp_file = tempfile.NamedTemporaryFile()
            self._temp_files.append(temp_file)
            return np.lib.format.open_memmap(
                temp_file.name,
                mode="w+",
                shape=(channels, *mask_extents),
                dtype=dtype,
                version=(3, 0),
            )

Contributor

Same here, looks like a critical error that was just resolved.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 14

🤖 Fix all issues with AI agents
In @pyproject.toml:
- Line 30: The dependency entry for jaxtyping is missing a version constraint;
update the dependency list by replacing the bare "jaxtyping" entry with a
minimum-version specifier "jaxtyping>=0.3.5" so it matches the project's
convention of pinning minimum versions like numpy>=2.2.2 and torch>=2.6.0.

In @ratiopath/masks/mask_builders/__init__.py:
- Around line 231-245: The override of setup_memory in the class does all
allocation itself (calls allocate_accumulator and sets self.accumulator and
self.overlap_counter) but does not call super().setup_memory(...); either add a
call to super().setup_memory(mask_extents, channels, accumulator_filepath,
overlap_counter_filepath, **kwargs) at the start (or end) to preserve
cooperative mixin behavior, or add a short docstring/comment on this
setup_memory method explaining why the base implementation is intentionally
bypassed; ensure references to allocate_accumulator, self.accumulator and
self.overlap_counter remain correct if you call super().
- Around line 30-60: Update the docstring examples to use the correct package
name "ratiopath" and call the mask builder API using update_batch(data_batch,
coords_batch) instead of update_batch(features, xs, ys); specifically, change
imports to from ratiopath.masks.mask_builders import
AveragingScalarUniformTiledNumpyMaskBuilder, ensure generate_tiles_from_slide
remains as the tile producer, and pass coords_batch as a single value (e.g., a
tuple or list containing xs and ys) when calling
mask_builder.update_batch(data_batch, coords_batch); apply the same fixes to the
other examples mentioned (lines 99-127 and 175-205) and ensure the finalize()
usage and assembled_mask/overlap variable names remain consistent.
- Around line 1-9: The module uses the jaxtyping type alias Int64 in multiple
annotations but never imports it; add "Int64" from jaxtyping to the imports and
reorder imports into stdlib, third-party, then local groups. Specifically,
update the top-level import block to include Int64 (so Int64 is available for
annotations referenced throughout the file) and ensure import ordering follows
the stdlib / third-party / local convention while keeping existing symbols like
AveragingMaskBuilderMixin, MaxMaskBuilderMixin, AccumulatorType,
AutoScalingConstantStrideMixin, EdgeClippingMaskBuilder2DMixin,
ScalarUniformTiledMaskBuilder, NumpyArrayMaskBuilderAllocatorMixin, and
NumpyMemMapMaskBuilderAllocatorMixin unchanged.

In @ratiopath/masks/mask_builders/aggregation.py:
- Around line 65-104: MaxMaskBuilderMixin must override setup_memory() to
reinitialize the allocated accumulator to the proper minimum value (not zeros)
so max-reduction with negative tiles isn't clamped to 0: call
super().setup_memory(), obtain the accumulator dtype, compute the dtype min (use
np.finfo for floats and np.iinfo for ints), and then set every element of
self.accumulator to that min (via fill or np.full/assignment) so subsequent
np.maximum calls produce correct results for negative values.
- Around line 1-5: Add the missing imports: import numpy as np (used by
np.asarray and np.maximum) and import the Int64 and Shaped type names from
jaxtyping (from jaxtyping import Int64, Shaped) so that the annotations
referencing Int64 and Shaped resolve; place them at the top of the file
alongside the existing AccumulatorType, MaskBuilder, compute_acc_slices
import.

In @ratiopath/masks/mask_builders/mask_builder.py:
- Line 2: Remove the unused import of the math module in mask_builder.py: delete
the line importing math at the top of
ratiopath/masks/mask_builders/mask_builder.py since no functions or classes in
this file reference math; ensure no other references to math exist in functions
or classes within this module (e.g., any methods in MaskBuilder or related
helpers) before removing.

In @ratiopath/masks/mask_builders/receptive_field_manipulation.py:
- Line 56: Rename the misspelled variable adjusterd_coords_batch to
adjusted_coords_batch where it is defined and used (in the expression
coords_batch + self.clip_start_indices[:, np.newaxis]) so all references in
receptive_field_manipulation.py consistently use adjusted_coords_batch; update
any subsequent usages of adjusterd_coords_batch to the corrected name.
- Around line 1-4: The import ordering is incorrect: move the standard
library/third-party import "from typing import Any" to appear before the local
module import "from ratiopath.masks.mask_builders.mask_builder import
AccumulatorType, MaskBuilder", remove the extra blank line between imports, and
ensure imports follow the standard group order (stdlib, third-party, local) to
satisfy the linter/isort rules.

In @ratiopath/masks/mask_builders/storage.py:
- Around line 1-7: Imports in storage.py are mis-ordered (the local package
import from ratiopath.masks.mask_builders.mask_builder appears before
standard-library imports), causing linter I001; reorder the imports so
standard-library modules (tempfile, pathlib.Path, typing.Any) come first, then
third-party (if any), then local package imports (the line importing
AccumulatorType and MaskBuilder), remove the extra blank line between them, and
then run the project formatter/import sorter (e.g., isort/black) to satisfy CI.
- Around line 1-7: The file is missing imports for names it uses (import numpy
as np, import numpy.typing as npt, and Int64 from jaxtyping, matching the alias
used elsewhere in the package). In addition, allocate_accumulator creates the
memmap inside a tempfile.NamedTemporaryFile context manager, which deletes the
backing file when the context exits. Change allocate_accumulator to create a
stable temporary path (e.g., tempfile.mkstemp, NamedTemporaryFile(delete=False),
or tempfile.TemporaryDirectory) and call np.lib.format.open_memmap or np.memmap
on that path. Ensure the file handle is closed, and either return the path
alongside the memmap or otherwise keep the file from being deleted on context
exit, so that the returned memmap remains valid (modify allocate_accumulator
and any use of open_memmap/np.memmap accordingly).

In @ratiopath/masks/write_big_tiff.py:
- Around line 45-46: Validate the mpp inputs before computing xres/yres: check
that mpp_x and mpp_y are provided and greater than zero (e.g., in the same
function where xres = 1000 / mpp_x and yres = 1000 / mpp_y, likely in
write_big_tiff or its caller), and if not raise a clear ValueError (or
TypeError) with a message like "mpp_x/mpp_y must be positive numbers"; this
prevents ZeroDivisionError and invalid resolutions and ensures downstream code
receives valid resolution values.

In @tests/test_mask_builders.py:
- Around line 259-299: The test test_numpy_memmap_tempfile_management currently
only checks the temp file is gone after del assembler and doesn't verify that
the temp file existed while the builder was alive; update the test to assert
that the captured path in captured_files (from intercepting_namedtempfile)
exists immediately after constructing
AutoScalingAveragingClippingNumpyMemMapMaskBuilder2D (and/or after calling
assembler.update_batch) before deleting the assembler, so you prove the
memmap/tempfile was created and lived during the builder's lifetime; if the
builder unlinks immediately on creation, adjust the expectation to reflect that
behavior and assert accordingly.
- Around line 1-12: Tests import block is unsorted causing I001 lint; run the
project's import sorter (e.g., isort/ruff -a or the configured formatter) on
tests/test_mask_builders.py to reorder and group imports: standard library
(tempfile, Path), third-party (numpy as np, pytest), then local package imports.
Also combine multiple imports from ratiopath.masks.mask_builders into
consolidated, alphabetized import statements (including
AveragingScalarUniformTiledNumpyMaskBuilder,
MaxScalarUniformTiledNumpyMaskBuilder,
AutoScalingScalarUniformValueConstantStrideMaskBuilder,
AutoScalingAveragingClippingNumpyMemMapMaskBuilder2D) so the import block
follows the project's style.
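
The MaxMaskBuilderMixin item above (initialising the accumulator to the dtype's minimum rather than zero) can be sketched in plain NumPy. The helper name `lowest_value` is hypothetical; `np.finfo` and `np.iinfo` are the standard NumPy calls for querying dtype limits.

```python
import numpy as np


def lowest_value(dtype):
    """Smallest representable value for a float or integer dtype."""
    dtype = np.dtype(dtype)
    if np.issubdtype(dtype, np.floating):
        return np.finfo(dtype).min
    if np.issubdtype(dtype, np.integer):
        return np.iinfo(dtype).min
    raise TypeError(f"unsupported accumulator dtype: {dtype}")


# With a zero-initialised accumulator, a tile of -3.0 would be clamped to 0.
accumulator = np.zeros((1, 2, 2), dtype=np.float32)
accumulator.fill(lowest_value(accumulator.dtype))

negative_tile = np.full((1, 2, 2), -3.0, dtype=np.float32)
np.maximum(accumulator, negative_tile, out=accumulator)  # now holds -3.0
```

One consequence worth noting: pixels that no tile ever covers keep the dtype minimum, so a finalize step may want to reset them to a neutral value.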
🧹 Nitpick comments (4)
ratiopath/masks/write_big_tiff.py (1)

39-43: Adjust stacklevel for more useful warning location.

The stacklevel=1 points to this function itself. Using stacklevel=2 would point to the caller, making the warning more actionable.

📍 Proposed fix
-            "Make sure the resolution is in µm/pixel not in pixels/mm.", stacklevel=1
+            "Make sure the resolution is in µm/pixel not in pixels/mm.", stacklevel=2
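
A small sketch combining this `stacklevel` suggestion with the mpp validation requested earlier in the review. The function name `mpp_to_pixels_per_mm` and the `suspicious_mpp` threshold are hypothetical; the source does not show the condition under which `write_big_tiff` emits its warning, so the check here is only a placeholder.

```python
import warnings


def mpp_to_pixels_per_mm(mpp_x, mpp_y, suspicious_mpp=100.0):
    """Convert microns-per-pixel to pixels-per-millimetre (1000 / mpp)."""
    if mpp_x <= 0 or mpp_y <= 0:
        raise ValueError(
            f"mpp_x/mpp_y must be positive numbers, got {mpp_x} and {mpp_y}"
        )
    # Placeholder condition: the real trigger in the PR is not shown.
    if mpp_x > suspicious_mpp or mpp_y > suspicious_mpp:
        # stacklevel=2 attributes the warning to the caller's line,
        # not to this helper.
        warnings.warn(
            "Make sure the resolution is in µm/pixel not in pixels/mm.",
            stacklevel=2,
        )
    return 1000 / mpp_x, 1000 / mpp_y
```

Validating before the division turns a `ZeroDivisionError` deep inside the writer into a `ValueError` that names the offending arguments.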
ratiopath/masks/mask_builders/mask_builder.py (1)

84-96: Consider documenting dtype handling in setup_memory.

The method calls allocate_accumulator without explicitly passing dtype, relying on it being provided via **kwargs or using the default in the implementation. While this works (as seen in storage.py implementations with dtype=np.float32), explicitly documenting this behavior would improve clarity.

Consider adding a note to the docstring:

def setup_memory(self, mask_extents, channels, **kwargs) -> None:
    """This method sets up memory structures needed for mask building.
    
    ...existing documentation...
    
    Note:
        The dtype for the accumulator can be specified via kwargs and will
        default to np.float32 if not provided.
    """
ratiopath/masks/mask_builders/receptive_field_manipulation.py (1)

162-199: Consider validating source_extents >= source_tile_extents.

The calculation at line 185 assumes source_extents >= source_tile_extents. If this assumption is violated, total_strides could be negative, leading to unexpected behavior. Adding an explicit check would make the constraint clear and fail fast with a helpful error message.

🛡️ Proposed validation
     ) -> None:
+        if not np.all(source_extents >= source_tile_extents):
+            raise ValueError(
+                f"source_extents must be >= source_tile_extents in all dimensions, "
+                f"got source_extents={source_extents}, source_tile_extents={source_tile_extents}"
+            )
+
         # self.source_extents = source_extents
         self.source_tile_extents = source_tile_extents
ratiopath/masks/mask_builders/__init__.py (1)

258-263: Decide whether AutoScalingScalarUniformValueConstantStrideMaskBuilder should be in __all__.
Tests import it explicitly (works), but __all__ suggests it’s not part of the intended public surface.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 55d3b3f and 12aaf8f.

📒 Files selected for processing (9)
  • pyproject.toml
  • ratiopath/masks/mask_builders/__init__.py
  • ratiopath/masks/mask_builders/aggregation.py
  • ratiopath/masks/mask_builders/mask_builder.py
  • ratiopath/masks/mask_builders/receptive_field_manipulation.py
  • ratiopath/masks/mask_builders/storage.py
  • ratiopath/masks/write_big_tiff.py
  • tests/conftest.py
  • tests/test_mask_builders.py
🧰 Additional context used
🧬 Code graph analysis (6)
tests/test_mask_builders.py (4)
ratiopath/masks/mask_builders/__init__.py (3)
  • AveragingScalarUniformTiledNumpyMaskBuilder (11-77)
  • MaxScalarUniformTiledNumpyMaskBuilder (80-145)
  • AutoScalingScalarUniformValueConstantStrideMaskBuilder (265-323)
ratiopath/masks/mask_builders/mask_builder.py (2)
  • update_batch (98-119)
  • finalize (145-158)
ratiopath/masks/mask_builders/receptive_field_manipulation.py (3)
  • update_batch (41-60)
  • update_batch (201-209)
  • update_batch (257-268)
ratiopath/masks/mask_builders/aggregation.py (2)
  • finalize (59-62)
  • finalize (103-104)
ratiopath/masks/mask_builders/mask_builder.py (4)
ratiopath/masks/mask_builders/__init__.py (1)
  • setup_memory (231-244)
ratiopath/masks/mask_builders/aggregation.py (5)
  • setup_memory (28-33)
  • update_batch_impl (35-57)
  • update_batch_impl (76-100)
  • finalize (59-62)
  • finalize (103-104)
ratiopath/masks/mask_builders/storage.py (2)
  • allocate_accumulator (29-55)
  • allocate_accumulator (65-72)
ratiopath/masks/mask_builders/receptive_field_manipulation.py (3)
  • update_batch (41-60)
  • update_batch (201-209)
  • update_batch (257-268)
ratiopath/masks/mask_builders/aggregation.py (3)
ratiopath/masks/mask_builders/mask_builder.py (6)
  • MaskBuilder (43-158)
  • compute_acc_slices (18-40)
  • setup_memory (84-96)
  • allocate_accumulator (74-82)
  • update_batch_impl (122-142)
  • finalize (145-158)
ratiopath/masks/mask_builders/__init__.py (1)
  • setup_memory (231-244)
ratiopath/masks/mask_builders/storage.py (2)
  • allocate_accumulator (29-55)
  • allocate_accumulator (65-72)
ratiopath/masks/mask_builders/storage.py (1)
ratiopath/masks/mask_builders/mask_builder.py (2)
  • MaskBuilder (43-158)
  • allocate_accumulator (74-82)
ratiopath/masks/mask_builders/__init__.py (4)
ratiopath/masks/mask_builders/aggregation.py (3)
  • AveragingMaskBuilderMixin (7-62)
  • MaxMaskBuilderMixin (65-104)
  • setup_memory (28-33)
ratiopath/masks/mask_builders/receptive_field_manipulation.py (2)
  • AutoScalingConstantStrideMixin (119-209)
  • EdgeClippingMaskBuilder2DMixin (63-116)
ratiopath/masks/mask_builders/storage.py (4)
  • NumpyArrayMaskBuilderAllocatorMixin (58-72)
  • NumpyMemMapMaskBuilderAllocatorMixin (9-55)
  • allocate_accumulator (29-55)
  • allocate_accumulator (65-72)
ratiopath/masks/mask_builders/mask_builder.py (2)
  • setup_memory (84-96)
  • allocate_accumulator (74-82)
ratiopath/masks/mask_builders/receptive_field_manipulation.py (1)
ratiopath/masks/mask_builders/mask_builder.py (2)
  • MaskBuilder (43-158)
  • update_batch (98-119)
🪛 GitHub Actions: Lint
tests/test_mask_builders.py

[warning] 1-1: I001 Import block is un-sorted or un-formatted. Organize imports.

ratiopath/masks/mask_builders/mask_builder.py

[warning] 2-2: F401 Unused import 'math'. Remove to fix lint.

ratiopath/masks/mask_builders/aggregation.py

[warning] 24-25: F821 Undefined name 'Int64' (likely missing import/alias).

ratiopath/masks/mask_builders/storage.py

[warning] 1-1: I001 Import block is un-sorted or un-formatted. Organize imports.

ratiopath/masks/mask_builders/__init__.py

[error] 65-70: Undefined name 'Int64' (F821) reported by Ruff in multiple type hints.

ratiopath/masks/mask_builders/receptive_field_manipulation.py

[warning] 1-1: I001 Import block is un-sorted or un-formatted. Organize imports.

🪛 GitHub Actions: Tests
tests/test_mask_builders.py

[error] 1-1: pytest collection failed due to missing libvips dependencies. ModuleNotFoundError: No module named '_libvips'. Additionally, OSError: cannot load library 'libvips.so.42' during import of pyvips. Process completed with exit code 2. Command: uv run pytest.

🔇 Additional comments (6)
tests/conftest.py (1)

5-8: LGTM!

The fixture correctly establishes deterministic behavior for NumPy-based tests.

ratiopath/masks/mask_builders/mask_builder.py (1)

18-40: LGTM!

The function correctly computes slice objects for accumulator indexing. The logic is clear and the implementation handles batch processing efficiently.

ratiopath/masks/mask_builders/receptive_field_manipulation.py (2)

89-116: LGTM!

The parameter parsing logic correctly handles all three clip formats (single int, tuple of 2, tuple of 4) and provides clear error messages for invalid inputs.
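
For readers unfamiliar with this kind of parameter normalisation, here is a minimal sketch of parsing the three clip formats mentioned above. The function name `parse_clip` and the meaning assigned to each tuple position (vertical/horizontal, then top/bottom/left/right) are assumptions; the PR's actual ordering is not shown in this conversation.

```python
def parse_clip(clip):
    """Normalise a clip spec to a 4-tuple (assumed: top, bottom, left, right)."""
    if isinstance(clip, int):
        values = (clip, clip, clip, clip)
    elif isinstance(clip, tuple) and len(clip) == 2:
        vertical, horizontal = clip
        values = (vertical, vertical, horizontal, horizontal)
    elif isinstance(clip, tuple) and len(clip) == 4:
        values = clip
    else:
        raise ValueError(
            f"clip must be an int, a tuple of 2, or a tuple of 4; got {clip!r}"
        )
    if any(not isinstance(v, int) or v < 0 for v in values):
        raise ValueError(f"clip values must be non-negative integers; got {clip!r}")
    return values
```

Normalising once at construction time lets the per-batch code slice tiles without re-checking the input format.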


242-268: LGTM!

The GCD-based compression strategy is elegant and correctly reduces memory usage for uniform tiles. The broadcasting logic in update_batch correctly expands scalar values into tiles.
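
One plausible reading of the GCD-based compression, offered as an assumption rather than a description of the PR's code: if every tile carries a single scalar, tile boundaries always fall on multiples of gcd(tile extent, stride), so the mask can be stored at that coarser resolution. The helper names `compressed_geometry` and `expand_scalars` are hypothetical.

```python
import numpy as np


def compressed_geometry(tile_extents, strides):
    """Divide tile extents and strides by their per-dimension GCD."""
    tile_extents = np.asarray(tile_extents, dtype=np.int64)
    strides = np.asarray(strides, dtype=np.int64)
    factor = np.gcd(tile_extents, strides)
    return factor, tile_extents // factor, strides // factor


def expand_scalars(values, compressed_tile_extents):
    """Broadcast (B, C) scalars to (B, C, *tile) as a read-only view."""
    values = np.asarray(values)
    tile_shape = tuple(int(e) for e in compressed_tile_extents)
    trailing_ones = (1,) * len(tile_shape)
    return np.broadcast_to(
        values.reshape(*values.shape, *trailing_ones),
        (*values.shape, *tile_shape),
    )
```

With 512-pixel tiles at a 256-pixel stride, for example, the factor is 256 per dimension, so each tile occupies only 2×2 cells in the compressed mask.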

tests/test_mask_builders.py (2)

14-75: Likely an incorrect or invalid review comment.


1-12: Remove incorrect CI diagnosis; address minor import and seeding issues.

The imports from ratiopath.masks.mask_builders do not trigger pyvips unconditional imports—consolidate multiple imports from the same module into a single statement (lines 7–11). Additionally, the explicit np.random.seed(0) at line 155 overrides the autouse fixture seed; either remove it to rely on the fixture or document the intentional override.

Likely an incorrect or invalid review comment.

"zarr>=3.1.1",
"geopandas>=1.1.1",
"rasterio>=1.4.3",
"jaxtyping",

@coderabbitai coderabbitai bot Jan 9, 2026

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

git ls-files pyproject.toml

Repository: RationAI/ratiopath

Length of output: 76


🏁 Script executed:

cat -n pyproject.toml | head -60

Repository: RationAI/ratiopath

Length of output: 1595


🌐 Web query:

jaxtyping latest stable version release documentation

💡 Result:

Latest stable jaxtyping release: 0.3.5 — uploaded to PyPI on January 5, 2026. Install: pip install jaxtyping. Documentation: https://docs.kidger.site/jaxtyping/ [1][2]

Sources:

  • PyPI jaxtyping release info (jaxtyping 0.3.5, uploaded Jan 5, 2026). [1]
  • jaxtyping GitHub / docs page. [2]

Add version constraint for jaxtyping to match project conventions.

Line 30: jaxtyping is the only dependency without a version constraint. All other dependencies specify minimum versions (e.g., numpy>=2.2.2, torch>=2.6.0). Add jaxtyping>=0.3.5 to maintain consistency and prevent unexpected breaking changes in future releases.

🤖 Prompt for AI Agents
In @pyproject.toml at line 30, The dependency entry for jaxtyping is missing a
version constraint; update the dependency list by replacing the bare "jaxtyping"
entry with a minimum-version specifier "jaxtyping>=0.3.5" so it matches the
project's convention of pinning minimum versions like numpy>=2.2.2 and
torch>=2.6.0.

Contributor


@Adames4, thanks for acknowledging! Just a friendly reminder to add the version constraint jaxtyping>=0.3.5 to maintain consistency with the other dependencies when you get a chance.

Contributor

Copilot AI left a comment

Pull request overview

This PR introduces a comprehensive mask building framework for assembling large masks from tiled data. The implementation provides a flexible, modular architecture using multiple inheritance patterns with mixins.

Key changes include:

  • A base MaskBuilder class with abstract interfaces for accumulator allocation, batch updates, and finalization
  • Mixin classes for different aggregation strategies (averaging, max-pooling), storage backends (numpy arrays, memory-mapped files), and preprocessing operations (edge clipping, auto-scaling)
  • Concrete builder classes combining mixins for specific use cases
  • BigTIFF export functionality via pyvips
  • Comprehensive test suite covering various parameter combinations

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 19 comments.

| File | Description |
|---|---|
| ratiopath/masks/mask_builders/mask_builder.py | Defines base abstract class with cooperative multiple inheritance pattern for mask building |
| ratiopath/masks/mask_builders/aggregation.py | Implements averaging and max aggregation mixins for overlapping tiles |
| ratiopath/masks/mask_builders/receptive_field_manipulation.py | Provides edge clipping and auto-scaling mixins for coordinate transformation |
| ratiopath/masks/mask_builders/storage.py | Defines storage allocation strategies (numpy arrays and memory-mapped files) |
| ratiopath/masks/mask_builders/__init__.py | Exports concrete builder classes combining various mixins |
| ratiopath/masks/write_big_tiff.py | Adds utility function for exporting masks as BigTIFF files with pyramids |
| tests/test_mask_builders.py | Comprehensive test suite covering various builder configurations and edge cases |
| tests/conftest.py | Adds pytest fixture for deterministic random seed management |
| pyproject.toml | Adds jaxtyping dependency for type annotations |


Comment on lines +37 to +45
    ) -> np.memmap:
        if filepath is None:
            with tempfile.NamedTemporaryFile() as temp_file:
                return np.lib.format.open_memmap(
                    temp_file.name,
                    mode="w+",
                    shape=(channels, *mask_extents),
                    dtype=dtype,
                    version=(3, 0),

Copilot AI Jan 9, 2026


Critical bug in temporary file handling. The with tempfile.NamedTemporaryFile() as temp_file: context manager will close and delete the temporary file when the context exits (line 45), but the returned memmap still references that file. This will cause the memmap to fail when accessed.

To fix this, the NamedTemporaryFile should be created with delete=False and the file management should be handled manually, or use delete_on_close=False in Python 3.12+. The file reference should be stored as an instance variable to prevent premature deletion.


Why is this resolved without any comment or fix?

AdamBajger and others added 3 commits January 9, 2026 15:57
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

🤖 Fix all issues with AI agents
In @ratiopath/masks/mask_builders/__init__.py:
- Around line 1-146: The examples in the docstrings for
AveragingScalarUniformTiledNumpyMaskBuilder and
MaxScalarUniformTiledNumpyMaskBuilder are out of sync: update imports to use
ratiopath (not rationai) and change the update_batch call to the new signature
by constructing a single coords_batch (stack ys and xs into shape (2, B) or (N,
B) as required) and call update_batch(data_batch=features,
coords_batch=coords_batch); keep the rest of the example text but replace the
old update_batch(features, xs, ys) usage and show building coords via
np.stack([ys, xs], axis=0) (or the correct (y,x) order) before calling
update_batch.
- Around line 231-256: Update the get_vips_scale_factors docstring to include a
blank line before the "Returns:" section to satisfy D411, and ensure the
returned values are plain Python floats (not NumPy scalar types) by casting the
computed scale_factors derived from self.overflow_buffered_source_extents /
self.accumulator.shape[1:] to float before returning (e.g., convert each element
of scale_factors to float and return a tuple of two floats).
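
The coords_batch construction requested for the update_batch docstrings can be sketched as follows. The coordinate values are illustrative, and the (y, x) row order is an assumption that should be checked against the implementation.

```python
import numpy as np

# Top-left coordinates for a batch of B = 3 tiles (illustrative values).
ys = np.array([0, 0, 16], dtype=np.int64)
xs = np.array([0, 16, 0], dtype=np.int64)

# Stack along a new leading axis: one row per spatial dimension, one column per tile.
# Assumption: the builder expects (y, x) order.
coords_batch = np.stack([ys, xs], axis=0)
print(coords_batch.shape)  # (2, 3), i.e. (N, B)
```

The resulting array would then be passed as `update_batch(data_batch=features, coords_batch=coords_batch)`.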

In @ratiopath/masks/mask_builders/aggregation.py:
- Around line 1-65: Remove the unused "numpy.typing as npt" import and reorder
imports to stdlib → third-party → relative (move the local ratiopath import
after third-party numpy/jaxtyping), and change
AveragingMaskBuilderMixin.finalize to avoid allocating a clipped temporary by
performing in-place guarded division using numpy.divide with out= and where=
(divide self.accumulator by self.overlap_counter in-place where overlap_counter
!= 0), then return the accumulators as before.
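
The in-place guarded division suggested for finalize can be sketched on toy arrays. The shapes are assumptions: a (C, H, W) accumulator and a (1, H, W) overlap counter that broadcasts across channels.

```python
import numpy as np

# Summed tile values, shape (C=2, H=2, W=2).
accumulator = np.array(
    [[[2.0, 6.0], [0.0, 9.0]],
     [[4.0, 2.0], [0.0, 3.0]]],
    dtype=np.float32,
)
# Number of tiles that covered each pixel; 0 marks a pixel no tile touched.
overlap_counter = np.array([[[1.0, 2.0], [0.0, 3.0]]], dtype=np.float32)

# Divide in place; positions where the counter is 0 keep their existing value.
np.divide(accumulator, overlap_counter, out=accumulator, where=overlap_counter != 0)
```

Because `out=` targets the accumulator itself, no clipped temporary copy of the counter is allocated.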

In @ratiopath/masks/mask_builders/receptive_field_manipulation.py:
- Around line 1-62: Rename the misspelled local variable adjusterd_coords_batch
to adjusted_coords_batch in EdgeClippingMaskBuilderMixin.update_batch and update
its use in the super().update_batch call; also reorder imports to follow stdlib
→ third-party → local (e.g., from typing import Any; from jaxtyping import
Shaped, Int64; import numpy as np; then from
ratiopath.masks.mask_builders.mask_builder import AccumulatorType, MaskBuilder),
removing extraneous blank lines so import groups are grouped correctly.
- Around line 213-269: Convert mask_extents, mask_tile_extents, and
mask_tile_strides to numpy arrays at the start of
ScalarUniformTiledMaskBuilder.__init__ (e.g., via np.asarray(..., dtype=int)) to
defensively handle tuple inputs and ensure subsequent array ops (np.gcd, //)
behave as expected; then proceed to compute self.compression_factors,
adjusted_mask_extents, adjusted_tile_extents and call super() as before, leaving
update_batch unchanged.
- Around line 120-211: The constructor of AutoScalingConstantStrideMixin assumes
NumPy array semantics but accepts tuple inputs; coerce source_extents,
source_tile_extents, source_tile_strides, and mask_tile_extents to NumPy arrays
(e.g. via np.asarray(..., dtype=np.int64) or np.array(..., dtype=np.int64)) at
the very start of __init__() so subsequent arithmetic (the multiplied_ check,
total_strides calc, overflow_buffered_* math, and update_batch scaling) works
reliably; update references to self.source_tile_extents and
self.mask_tile_extents after coercion (matching the pattern used in
EdgeClippingMaskBuilderMixin).

In @ratiopath/masks/mask_builders/storage.py:
- Around line 1-56: The allocate_accumulator implementation in
NumpyMemMapMaskBuilderAllocatorMixin is unsafe on Windows and imports are
misordered: replace the tempfile.NamedTemporaryFile context with an
instance-level TemporaryDirectory kept as self._memmap_tmpdir, create temp files
inside that dir with delete=False, close the file handle before calling
np.lib.format.open_memmap, and keep the backing file alive for the object's
lifetime so memmaps can be reopened (use open_memmap on the file path string);
also reorder imports so stdlib (pathlib, typing, tempfile) come first, then
third-party (numpy, numpy.typing, jaxtyping), then local
(ratiopath.masks.mask_builders.mask_builder).

In @ratiopath/masks/write_big_tiff.py:
- Around line 6-51: The write_big_tiff function must validate mpp_x and mpp_y
and pass a string filename to pyvips: check that mpp_x > 0 and mpp_y > 0 and
raise a ValueError with a clear message if not, compute xres/yres as before
(1000 / mpp_x, 1000 / mpp_y), and call image.tiffsave using str(path) instead of
the Path object; update references to path, mpp_x, mpp_y, write_big_tiff, and
image.tiffsave accordingly.
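
The mpp validation described above can be sketched in isolation. The helper name `resolution_from_mpp` is hypothetical and the pyvips call is deliberately left out; only the 1000 / mpp conversion is taken from the review text.

```python
def resolution_from_mpp(mpp_x: float, mpp_y: float) -> tuple[float, float]:
    """Hypothetical helper: convert microns-per-pixel to pixels-per-millimetre."""
    if mpp_x <= 0 or mpp_y <= 0:
        raise ValueError(
            f"mpp_x and mpp_y must be positive, got mpp_x={mpp_x}, mpp_y={mpp_y}"
        )
    # 1 mm = 1000 microns, so pixels per mm = 1000 / (microns per pixel).
    return 1000 / mpp_x, 1000 / mpp_y


xres, yres = resolution_from_mpp(0.25, 0.5)
```

Validating before the division turns a zero or negative mpp into a clear ValueError instead of a ZeroDivisionError or a meaningless resolution.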
🧹 Nitpick comments (1)
ratiopath/masks/mask_builders/aggregation.py (1)

67-106: Use in-place max to avoid per-tile temporary allocations.

np.maximum(a, b) allocates; for large tiles/batches this can add up. Use out= against the accumulator view.

Proposed fix
         for acc_slices, data in zip(
             zip(*acc_slices_all_dims, strict=True),
             data_batch,
             strict=True,
         ):
-            self.accumulator[
-                :,
-                *acc_slices,
-            ] = np.maximum(
-                self.accumulator[
-                    :,
-                    *acc_slices,
-                ],
-                data,
-            )
+            acc_view = self.accumulator[:, *acc_slices]
+            np.maximum(acc_view, data, out=acc_view)
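
The in-place variant can be checked on a small accumulator. This is a sketch with a hypothetical `update_max` function and toy shapes, not the mixin's actual method; it uses tuple indexing instead of star-unpacking inside the subscript so it also runs on Python versions before 3.11.

```python
import numpy as np


def update_max(accumulator, tile, slices):
    """Hypothetical helper: merge one tile into the accumulator with an in-place max."""
    # Basic slicing returns a view, so writing into it updates the accumulator.
    acc_view = accumulator[(slice(None), *slices)]
    np.maximum(acc_view, tile, out=acc_view)


accumulator = np.zeros((1, 4, 4), dtype=np.float32)  # (C, H, W)
first_tile = np.full((1, 2, 2), 0.5, dtype=np.float32)
second_tile = np.full((1, 2, 2), 0.25, dtype=np.float32)

update_max(accumulator, first_tile, (slice(1, 3), slice(1, 3)))
update_max(accumulator, second_tile, (slice(2, 4), slice(2, 4)))  # overlaps at (2, 2)
```

At the overlapping pixel the larger value from the first tile is kept, while pixels covered only by the second tile take its value.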
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 12aaf8f and f8002d4.

📒 Files selected for processing (5)
  • ratiopath/masks/mask_builders/__init__.py
  • ratiopath/masks/mask_builders/aggregation.py
  • ratiopath/masks/mask_builders/receptive_field_manipulation.py
  • ratiopath/masks/mask_builders/storage.py
  • ratiopath/masks/write_big_tiff.py
🧰 Additional context used
🧬 Code graph analysis (3)
ratiopath/masks/mask_builders/storage.py (1)
ratiopath/masks/mask_builders/mask_builder.py (2)
  • MaskBuilder (43-158)
  • allocate_accumulator (74-82)
ratiopath/masks/mask_builders/aggregation.py (3)
ratiopath/masks/mask_builders/mask_builder.py (6)
  • MaskBuilder (43-158)
  • compute_acc_slices (18-40)
  • setup_memory (84-96)
  • allocate_accumulator (74-82)
  • update_batch_impl (122-142)
  • finalize (145-158)
ratiopath/masks/mask_builders/__init__.py (1)
  • setup_memory (231-244)
ratiopath/masks/mask_builders/storage.py (2)
  • allocate_accumulator (30-56)
  • allocate_accumulator (66-73)
ratiopath/masks/mask_builders/receptive_field_manipulation.py (1)
ratiopath/masks/mask_builders/mask_builder.py (2)
  • MaskBuilder (43-158)
  • update_batch (98-119)
🪛 GitHub Actions: Lint
ratiopath/masks/mask_builders/storage.py

[error] 1-1: I001: Import block is un-sorted or un-formatted

ratiopath/masks/mask_builders/aggregation.py

[error] 1-1: I001: Import block is un-sorted or un-formatted


[error] 4-4: F401: numpy.typing imported but unused

ratiopath/masks/mask_builders/__init__.py

[error] 1-1: I001: Import block is un-sorted or un-formatted


[error] 251-251: D411: Missing blank line before section ("Returns")

ratiopath/masks/mask_builders/receptive_field_manipulation.py

[error] 1-1: I001: Import block is un-sorted or un-formatted

🔇 Additional comments (1)
ratiopath/masks/mask_builders/__init__.py (1)

258-324: Export list may be incomplete (intent check): AutoScalingScalarUniformValueConstantStrideMaskBuilder not in __all__.

If this class is meant to be part of the public surface (like the other new builders), add it to __all__; otherwise, consider moving it to a private module or prefixing with _ to signal non-public intent.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Fix all issues with AI agents
In @ratiopath/masks/mask_builders/receptive_field_manipulation.py:
- Line 62: There is an extra closing parenthesis causing a syntax error in the
receptive_field_manipulation module; remove the stray ")" (the unmatched
parenthesis shown at the end of the expression around line 62) so the
surrounding expression or return statement in the receptive_field_manipulation
code block compiles correctly (locate the unmatched ")" near the end of the
function or expression in mask_builders/receptive_field_manipulation.py and
delete it).

In @tests/test_mask_builders.py:
- Around line 1-11: Reorder the import block in tests/test_mask_builders.py so
it follows linting import groups and alphabetical order (standard library first:
tempfile, pathlib.Path; then third-party: numpy as np, pytest; then local
package imports from ratiopath.masks.mask_builders) — ensure the multiple class
imports from ratiopath.masks.mask_builders are grouped and alphabetized
(AutoScalingAveragingClippingNumpyMemMapMaskBuilder2D,
AutoScalingScalarUniformValueConstantStrideMaskBuilder,
AveragingScalarUniformTiledNumpyMaskBuilder,
MaxScalarUniformTiledNumpyMaskBuilder), then run your import fixer (e.g., ruff
or isort) as suggested: ruff check --select I --fix tests/test_mask_builders.py.
- Around line 317-331: The test creates an overlap counter file using
overlap_counter_filepath=filepath.with_suffix(".overlaps"+filepath.suffix) but
asserts existence of filepath.with_suffix(".overlap"+filepath.suffix) (missing
's'); update the assertion to check for
filepath.with_suffix(".overlaps"+filepath.suffix) so the expected filename
matches the one created (references: overlap_counter_filepath, filepath,
assembler.update_batch).
🧹 Nitpick comments (8)
ratiopath/masks/mask_builders/receptive_field_manipulation.py (6)

30-40: Validate clip indices to prevent invalid configurations.

The constructor doesn't validate that clip_start_indices and clip_end_indices are non-negative or that their sum doesn't exceed tile dimensions. Invalid values could cause incorrect slicing or runtime errors.

✅ Suggested validation
     def __init__(
         self,
         mask_extents: Int64[AccumulatorType, " N"],
         channels: int,
         clip_start_indices: Int64[AccumulatorType, " N"],
         clip_end_indices: Int64[AccumulatorType, " N"],
         **kwargs: Any,
     ) -> None:
         super().__init__(mask_extents, channels, **kwargs)
         self.clip_start_indices = np.asarray(clip_start_indices, dtype=np.int64)
         self.clip_end_indices = np.asarray(clip_end_indices, dtype=np.int64)
+        
+        if np.any(self.clip_start_indices < 0) or np.any(self.clip_end_indices < 0):
+            raise ValueError("Clip indices must be non-negative")
+        if len(self.clip_start_indices) != len(mask_extents) or len(self.clip_end_indices) != len(mask_extents):
+            raise ValueError(f"Clip indices must match mask dimensions: {len(mask_extents)}")

98-99: Validate dimensionality matches 2D expectation.

The class is named EdgeClippingMaskBuilder2DMixin, but line 99 applies integer clip values to all dimensions in mask_extents without verifying it's actually 2D. This could silently accept 3D+ masks with unexpected behavior.

✅ Suggested validation
     def __init__(
         self,
         mask_extents: Int64[AccumulatorType, " N"],
         channels: int,
         clip: int | tuple[int, int] | tuple[int, int, int, int] = 0,
         **kwargs: Any,
     ) -> None:
+        if len(mask_extents) != 2:
+            raise ValueError(f"EdgeClippingMaskBuilder2DMixin requires 2D mask_extents, got {len(mask_extents)} dimensions")
+        
         if isinstance(clip, int):
+            if clip < 0:
+                raise ValueError(f"Clip value must be non-negative, got {clip}")
             clip_start_indices = clip_end_indices = (clip,) * len(mask_extents)

187-188: Consider integer arithmetic to avoid floating-point precision issues.

The current approach uses float division followed by ceiling and conversion to int64. For large extent/stride values, floating-point precision could introduce errors.

♻️ Alternative using integer arithmetic
-        total_strides = (source_extents - source_tile_extents) / source_tile_strides
-        total_strides = np.ceil(total_strides).astype(np.int64)
+        # Integer ceiling division: (a + b - 1) // b
+        total_strides = (source_extents - source_tile_extents + source_tile_strides - 1) // source_tile_strides
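
A short sketch comparing the two formulations, with hypothetical function names and illustrative extents. For ordinary image sizes both agree; the integer form matters only once values exceed 2**53, the limit up to which float64 represents every integer exactly.

```python
import numpy as np


def total_strides_float(extents, tile_extents, strides):
    # Original approach: float division, ceiling, cast back to int64.
    return np.ceil((extents - tile_extents) / strides).astype(np.int64)


def total_strides_int(extents, tile_extents, strides):
    # Integer ceiling division: (a + b - 1) // b, valid for a >= 0 and b > 0.
    return (extents - tile_extents + strides - 1) // strides


extents = np.array([1000, 1024], dtype=np.int64)
tile_extents = np.array([256, 256], dtype=np.int64)
strides = np.array([128, 100], dtype=np.int64)

float_result = total_strides_float(extents, tile_extents, strides)
int_result = total_strides_int(extents, tile_extents, strides)
```

The integer version assumes the tile is no larger than the source and the stride is positive, which matches the validation suggested elsewhere in this review.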

164-176: Validate input parameters are positive and well-formed.

The constructor doesn't validate that extents and strides are positive, or that source_tile_extents <= source_extents. Invalid inputs could cause confusing errors downstream.

✅ Suggested validation
     def __init__(
         self,
         source_extents: Int64[AccumulatorType, " N"],
         source_tile_extents: Int64[AccumulatorType, " N"],
         source_tile_strides: Int64[AccumulatorType, " N"],
         mask_tile_extents: Int64[AccumulatorType, " N"],
         channels: int,
         **kwargs: Any,
     ) -> None:
+        source_extents = np.asarray(source_extents, dtype=np.int64)
+        source_tile_extents = np.asarray(source_tile_extents, dtype=np.int64)
+        source_tile_strides = np.asarray(source_tile_strides, dtype=np.int64)
+        mask_tile_extents = np.asarray(mask_tile_extents, dtype=np.int64)
+        
+        if np.any(source_extents <= 0) or np.any(source_tile_extents <= 0) or np.any(source_tile_strides <= 0) or np.any(mask_tile_extents <= 0):
+            raise ValueError("All extents and strides must be positive")
+        if np.any(source_tile_extents > source_extents):
+            raise ValueError(f"source_tile_extents {source_tile_extents} cannot exceed source_extents {source_extents}")
+        
         # self.source_extents = source_extents
-        self.source_tile_extents = source_tile_extents
-        self.mask_tile_extents = mask_tile_extents
+        self.source_tile_extents = source_tile_extents
+        self.mask_tile_extents = mask_tile_extents

265-268: Simplify tile expansion using np.tile or np.broadcast_to.

Creating a zeros array and then adding the scalar values works but is less efficient and less clear than directly expanding the values.

♻️ More efficient alternatives
     def update_batch(
         self,
         data_batch: Shaped[AccumulatorType, "B C"],
         coords_batch: Shaped[AccumulatorType, "N B"],
     ) -> None:
         """For each scalar/vector in the batch, repeat it in each dimension to form a tile, then update the mask with the tile."""
-        adjusted_tiles = np.zeros((*data_batch.shape, *self.adjusted_tile_extents))
-        adjusted_tiles += data_batch[
-            ..., *[np.newaxis] * len(self.adjusted_tile_extents)
-        ]
+        # Option 1: Using np.broadcast_to (view, no copy)
+        adjusted_tiles = np.broadcast_to(
+            data_batch[..., *[np.newaxis] * len(self.adjusted_tile_extents)],
+            (*data_batch.shape, *self.adjusted_tile_extents)
+        )
+        # Note: broadcast_to returns a view, so we need a copy if super().update_batch modifies it
+        adjusted_tiles = np.array(adjusted_tiles)
+        
+        # Option 2: Using np.tile (if repeating is intended)
+        # adjusted_tiles = np.tile(data_batch[..., *[np.newaxis] * len(self.adjusted_tile_extents)], (1, 1, *self.adjusted_tile_extents))
+        
         adjusted_coordinates = coords_batch // self.compression_factors[:, np.newaxis]
         super().update_batch(adjusted_tiles, coords_batch=adjusted_coordinates)
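
The scalar-to-tile expansion can be tried on its own. This is a sketch with toy shapes, in which `tile_extents` stands in for `self.adjusted_tile_extents`; it also shows that the broadcast keeps the input dtype.

```python
import numpy as np

data_batch = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32)  # (B=2, C=2)
tile_extents = (2, 3)  # stand-in for self.adjusted_tile_extents

# Append one singleton axis per spatial dimension, then broadcast to the tile size.
expanded = data_batch[(..., *[np.newaxis] * len(tile_extents))]
tiles_view = np.broadcast_to(expanded, (*data_batch.shape, *tile_extents))

# broadcast_to returns a read-only view; copy it if downstream code writes to the tiles.
adjusted_tiles = tiles_view.copy()
print(adjusted_tiles.shape, adjusted_tiles.dtype)  # (2, 2, 2, 3) float32
```

Unlike `np.zeros(...)` followed by `+=`, the intermediate here never defaults to float64.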

244-252: Validate tile parameters for consistency.

The constructor doesn't validate that mask_tile_extents and mask_tile_strides are positive, or that the GCD computation produces valid compression factors.

✅ Suggested validation
     def __init__(
         self,
         mask_extents: Int64[AccumulatorType, " N"],
         channels: int,
         mask_tile_extents: Int64[AccumulatorType, " N"],
         mask_tile_strides: Int64[AccumulatorType, " N"],
         **kwargs: Any,
     ) -> None:
+        mask_extents = np.asarray(mask_extents, dtype=np.int64)
+        mask_tile_extents = np.asarray(mask_tile_extents, dtype=np.int64)
+        mask_tile_strides = np.asarray(mask_tile_strides, dtype=np.int64)
+        
+        if np.any(mask_extents <= 0) or np.any(mask_tile_extents <= 0) or np.any(mask_tile_strides <= 0):
+            raise ValueError("All extents and strides must be positive")
+        if len(mask_extents) != len(mask_tile_extents) or len(mask_extents) != len(mask_tile_strides):
+            raise ValueError("mask_extents, mask_tile_extents, and mask_tile_strides must have the same dimensions")
+        
         self.compression_factors = np.gcd(mask_tile_strides, mask_tile_extents)
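
To make the role of the GCD concrete, here is a sketch with illustrative values. The coordinate division mirrors the update_batch code quoted in this review; the formula for the adjusted tile extents is an assumption about how the constructor uses the factors.

```python
import numpy as np

# Coerce inputs up front so tuples behave like arrays in the arithmetic below.
mask_tile_extents = np.asarray((32, 48), dtype=np.int64)
mask_tile_strides = np.asarray((16, 12), dtype=np.int64)

# Largest step per dimension that divides both the stride and the tile extent.
compression_factors = np.gcd(mask_tile_strides, mask_tile_extents)

# Assumption: the tile size in the compressed mask is the extent divided by the factor.
adjusted_tile_extents = mask_tile_extents // compression_factors

# Coordinates of shape (N=2, B=3) are mapped into the compressed mask the same way.
coords_batch = np.array([[0, 16, 32], [0, 12, 24]], dtype=np.int64)
adjusted_coordinates = coords_batch // compression_factors[:, np.newaxis]
```

Because every tile offset is a multiple of the stride and the factor divides the stride, the floor division loses no positional information for tiles placed on the stride grid.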
tests/test_mask_builders.py (2)

333-335: Clean up the overlap file in the test teardown.

The test only unlinks the main file but leaves the overlap counter file behind, which could clutter the test directory.

🧹 Proposed cleanup
     # Clean up
     filepath.unlink()
+    filepath.with_suffix(".overlaps"+filepath.suffix).unlink()

353-354: Remove unnecessary blank lines.

Lines 353-354 contain extra blank lines that don't improve readability.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f8002d4 and 3ded496.

📒 Files selected for processing (3)
  • pyproject.toml
  • ratiopath/masks/mask_builders/receptive_field_manipulation.py
  • tests/test_mask_builders.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • pyproject.toml
🧰 Additional context used
🪛 GitHub Actions: Lint
tests/test_mask_builders.py

[error] 1-1: I001 Import block is un-sorted or un-formatted. Organize imports.

ratiopath/masks/mask_builders/receptive_field_manipulation.py

[error] 62-62: invalid-syntax: Expected a statement.


[error] 62-62: invalid-syntax: Expected a statement.

🪛 GitHub Actions: Tests
tests/test_mask_builders.py

[error] 7-7: ImportError during test collection: ModuleNotFoundError: No module named '_libvips' followed by OSError: cannot load library 'libvips.so.42'. Ensure libvips and pyvips are properly installed in the CI environment.

🔇 Additional comments (1)
tests/test_mask_builders.py (1)

155-155: Good practice: deterministic testing with random seed.

Setting np.random.seed(0) ensures reproducible test results across runs, which is helpful for debugging and CI stability.
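
The same reproducibility can be obtained without touching global state. The sketch below swaps in NumPy's `default_rng` generator in place of `np.random.seed`; the helper name and its parameters are hypothetical.

```python
import numpy as np


def random_tile_coords(seed, num_tiles, max_coord):
    """Hypothetical helper: draw (2, num_tiles) integer coordinates from a seeded generator."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, max_coord, size=(2, num_tiles))


first = random_tile_coords(seed=0, num_tiles=4, max_coord=100)
second = random_tile_coords(seed=0, num_tiles=4, max_coord=100)
# Same seed, same draws: the two arrays are identical.
```

A local generator keeps each test's randomness independent of the order in which other tests run.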

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

🤖 Fix all issues with AI agents
In @ratiopath/masks/mask_builders/__init__.py:
- Around line 268-327: The docstring example uses the wrong class name; replace
the instantiation of AutoScalingScalarUniformTiledMaskBuilder with the actual
exported class AutoScalingScalarUniformValueConstantStrideMaskBuilder (or
vice‑versa if you intended a different public API) so the example in the
docstring of class AutoScalingScalarUniformValueConstantStrideMaskBuilder
matches the real class name and parameters (source_extents, channels,
source_tile_extents, source_tile_strides, mask_tile_extents).
- Around line 12-17: The __all__ list in this module is unsorted; sort the
entries in __all__ alphabetically to satisfy RUF022 and keep the public API
tidy—i.e., reorder the referenced symbols
(AutoScalingAveragingClippingNumpyMemMapMaskBuilder2D,
AutoScalingScalarUniformValueConstantStrideMaskBuilder,
AveragingScalarUniformTiledNumpyMaskBuilder,
MaxScalarUniformTiledNumpyMaskBuilder) into alphabetical order and update the
__all__ assignment accordingly.
- Around line 1-10: Reorder the imports in
ratiopath/masks/mask_builders/__init__.py into isort-compliant groups: place
stdlib imports first (Path, Any from pathlib and typing), then third-party
imports (Int64 from jaxtyping), then local package imports (all
ratiopath.masks.mask_builders.* symbols). Ensure a single blank line separates
each group and keep the existing imported symbols (AveragingMaskBuilderMixin,
MaxMaskBuilderMixin, AccumulatorType, AutoScalingConstantStrideMixin,
EdgeClippingMaskBuilder2DMixin, ScalarUniformTiledMaskBuilder,
NumpyArrayMaskBuilderAllocatorMixin, NumpyMemMapMaskBuilderAllocatorMixin)
unchanged.
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3ded496 and c5405cf.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (1)
  • ratiopath/masks/mask_builders/__init__.py
🧰 Additional context used
🪛 GitHub Actions: Lint
ratiopath/masks/mask_builders/__init__.py

[error] 1-9: I001: Import block is un-sorted or un-formatted. Organize imports


[warning] 12-18: RUF022: __all__ is not sorted. Apply an isort-style sorting to __all__


[warning] 259-261: D411: Missing blank line before section (Returns). Add a blank line before 'Returns' in the docstring.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

🤖 Fix all issues with AI agents
In @ratiopath/masks/mask_builders/__init__.py:
- Around line 71-77: The docstring example calls
mask_builder.update_batch(features, xs, ys), but update_batch takes two
arguments: data_batch and coords_batch, where coords_batch is a 2D array of
shape (N, B). Change the example to build a single coords_batch by stacking ys
and xs into an array of shape (2, B) (in whichever coordinate order the
implementation expects) and call mask_builder.update_batch(features,
coords_batch), so the call matches update_batch(data_batch, coords_batch).
- Around line 138-144: The call to mask_builder.update_batch(features, xs, ys)
uses the old argument order; inspect the actual mask_builder.update_batch
signature in the MaskBuilder implementation and call it with the correct
parameters and order (e.g., include tiles if the new signature expects tiles, or
reorder to features then coords, or use keyword args like tiles=tiles,
features=features, xs=xs, ys=ys) so the types match what update_batch expects;
keep the vgg16_model.predict(features) usage (or swap to pass tiles if
update_batch now expects raw tiles) and leave the mask_builder.finalize()
unpacking as-is.

In @ratiopath/masks/mask_builders/storage.py:
- Around line 40-48: The tempfile is created inside a with block and deleted
when the context exits, leaving the np.lib.format.open_memmap result pointing at
a removed file; change creation to persist the temp file (e.g., use
tempfile.NamedTemporaryFile(delete=False) and do not use the context manager or
use tempfile.mkstemp and close the fd) before calling np.lib.format.open_memmap
so the memmap targets an existing file, and, if necessary, record the temp
filename (from temp_file.name) for later cleanup so it isn't leaked.

In @tests/test_mask_builders.py:
- Around line 313-347: The test test_numpy_memmap_persistent_file has a suffix
mismatch: the assembler is constructed with overlap_counter_filepath using
".overlaps" but the assertion checks for ".overlap"; update the assertion to
expect ".overlaps" (i.e., change filepath.with_suffix(".overlap" +
filepath.suffix) to filepath.with_suffix(".overlaps" + filepath.suffix)) so it
matches the overlap_counter_filepath passed into
AutoScalingAveragingClippingNumpyMemMapMaskBuilder2D and the created file.
🧹 Nitpick comments (5)
tests/test_mask_builders.py (3)

19-78: Consider setting random seed for test determinism.

The test generates random coordinates (line 43) but doesn't set a seed, which could lead to non-deterministic test behavior. Consider adding np.random.seed(0) at the start of the test for reproducibility, similar to the pattern used in test_edge_clipping_heatmap_assembler (line 167).

♻️ Suggested fix
 def test_scalar_uniform_averaging_2d(
     mask_extents, channels, mask_tile_extents, mask_tile_strides
 ):
+    np.random.seed(0)
     mask_extents = np.asarray(mask_extents)
     mask_tile_extents = np.asarray(mask_tile_extents)

84-137: Consider setting random seed for test determinism.

Similar to the averaging test, this test generates random coordinates (line 106) without setting a seed, which could lead to flaky tests.

♻️ Suggested fix
 def test_scalar_uniform_max_2d(
     mask_extents, channels, mask_tile_extents, mask_tile_strides
 ):
+    np.random.seed(0)
     mask_extents = np.asarray(mask_extents)
     mask_tile_extents = np.asarray(mask_tile_extents)

349-430: Consider setting random seed for test determinism.

The test generates random coordinates (line 393) without setting a seed, similar to the first two tests reviewed.

♻️ Suggested fix
 def test_autoscaling_scalar_uniform_value_constant_stride(
     source_extents, channels, source_tile_extents, mask_tile_extents
 ):
     """Test AutoScalingScalarUniformValueConstantStrideMaskBuilder with autoscaling and scalar tiling."""
+    np.random.seed(0)
     batch_size = 4
     num_batches = 8
ratiopath/masks/mask_builders/receptive_field_manipulation.py (1)

262-273: Consider preserving the input dtype when creating adjusted_tiles.

The np.zeros() call defaults to float64, which may not match the dtype of data_batch or the accumulator (typically float32). This causes unnecessary memory overhead for the intermediate array.

Proposed fix
     def update_batch(
         self,
         data_batch: Shaped[AccumulatorType, "B C"],
         coords_batch: Shaped[AccumulatorType, "N B"],
     ) -> None:
         """For each scalar/vector in the batch, repeat it in each dimension to form a tile, then update the mask with the tile."""
-        adjusted_tiles = np.zeros((*data_batch.shape, *self.adjusted_tile_extents))
+        adjusted_tiles = np.zeros(
+            (*data_batch.shape, *self.adjusted_tile_extents), dtype=data_batch.dtype
+        )
         adjusted_tiles += data_batch[
             ..., *[np.newaxis] * len(self.adjusted_tile_extents)
         ]
         adjusted_coordinates = coords_batch // self.compression_factors[:, np.newaxis]
         super().update_batch(adjusted_tiles, coords_batch=adjusted_coordinates)
ratiopath/masks/mask_builders/__init__.py (1)

283-286: Address or track the TODO comment.

The get_vips_scale_factors method lacks test coverage as noted.

Would you like me to open an issue to track adding tests for this method?

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c5405cf and 40807ee.

📒 Files selected for processing (7)
  • .github/workflows/tests.yml
  • ratiopath/masks/mask_builders/__init__.py
  • ratiopath/masks/mask_builders/aggregation.py
  • ratiopath/masks/mask_builders/mask_builder.py
  • ratiopath/masks/mask_builders/receptive_field_manipulation.py
  • ratiopath/masks/mask_builders/storage.py
  • tests/test_mask_builders.py
🧰 Additional context used
🧬 Code graph analysis (5)
tests/test_mask_builders.py (3)
ratiopath/masks/mask_builders/__init__.py (3)
  • AutoScalingAveragingClippingNumpyMemMapMaskBuilder2D (169-286)
  • AveragingScalarUniformTiledNumpyMaskBuilder (32-98)
  • MaxScalarUniformTiledNumpyMaskBuilder (101-166)
ratiopath/masks/mask_builders/mask_builder.py (2)
  • update_batch (98-119)
  • finalize (145-158)
ratiopath/masks/mask_builders/aggregation.py (2)
  • finalize (65-68)
  • finalize (110-111)
ratiopath/masks/mask_builders/__init__.py (2)
ratiopath/masks/mask_builders/storage.py (4)
  • NumpyArrayMaskBuilderAllocatorMixin (61-75)
  • NumpyMemMapMaskBuilderAllocatorMixin (12-58)
  • allocate_accumulator (32-58)
  • allocate_accumulator (68-75)
ratiopath/masks/mask_builders/mask_builder.py (2)
  • setup_memory (86-96)
  • allocate_accumulator (76-84)
ratiopath/masks/mask_builders/receptive_field_manipulation.py (1)
ratiopath/masks/mask_builders/mask_builder.py (2)
  • MaskBuilder (45-158)
  • update_batch (98-119)
ratiopath/masks/mask_builders/storage.py (1)
ratiopath/masks/mask_builders/mask_builder.py (2)
  • MaskBuilder (45-158)
  • allocate_accumulator (76-84)
ratiopath/masks/mask_builders/aggregation.py (3)
ratiopath/masks/mask_builders/mask_builder.py (6)
  • MaskBuilder (45-158)
  • compute_acc_slices (17-42)
  • setup_memory (86-96)
  • allocate_accumulator (76-84)
  • update_batch_impl (122-142)
  • finalize (145-158)
ratiopath/masks/mask_builders/__init__.py (1)
  • setup_memory (252-272)
ratiopath/masks/mask_builders/storage.py (2)
  • allocate_accumulator (32-58)
  • allocate_accumulator (68-75)
🪛 GitHub Actions: Tests
tests/test_mask_builders.py

[error] 341-341: Persistent overlap file /tmp/.../persistent_heatmap.overlap.npy should exist after finalization.

🔇 Additional comments (14)
.github/workflows/tests.yml (1)

17-24: LGTM! Dependency installation for image processing tests.

The addition of OpenSlide and LibVips dependencies is appropriate for the new mask builder tests that handle image tiles and memory-mapped storage.

tests/test_mask_builders.py (3)

145-244: LGTM! Comprehensive edge clipping test with proper seed management.

The test properly sets a random seed for reproducibility and thoroughly validates accumulator checksums, finalization, and shape consistency.


246-269: LGTM! Clear and focused edge clipping validation.

This simple test effectively validates that edge clipping correctly zeros out border pixels.


271-311: Test correctly validates intended behavior but depends on storage.py fix.

The test logic is correct and uses good practices (monkeypatching, cleanup validation). However, this test currently won't work as intended due to the tempfile lifecycle issue in ratiopath/masks/mask_builders/storage.py lines 40-48, where the temporary file is deleted immediately upon creation.

The test will function correctly once the storage.py tempfile bug is resolved.

ratiopath/masks/mask_builders/mask_builder.py (2)

17-42: LGTM! Well-implemented slice computation utility.

The function correctly computes accumulator slices with proper type hints and defensive programming (strict=True in zip).


45-158: LGTM! Excellent cooperative inheritance design.

The MaskBuilder base class is well-architected with:

  • Clear separation between concrete update_batch() (mixin wrapping point) and abstract update_batch_impl() (implementation point)
  • Proper setup_memory() hook for mixin composition
  • Comprehensive documentation
  • Good type hints

This design pattern will work well for the mixin-based architecture.

ratiopath/masks/mask_builders/aggregation.py (2)

13-69: LGTM! Correct averaging aggregation implementation.

The mixin properly:

  • Allocates overlap_counter in setup_memory
  • Accumulates tiles and overlap counts in update_batch_impl
  • Safely averages using clip(min=1) to prevent division by zero
  • Returns both accumulator and overlap_counter as documented

71-111: LGTM! Correct maximum aggregation implementation.

The mixin correctly implements max aggregation using np.maximum for element-wise maximum across overlapping tiles, and appropriately returns a tuple containing just the accumulator since no post-processing is needed.

ratiopath/masks/mask_builders/storage.py (1)

61-75: LGTM! Simple and correct array allocation.

The numpy array allocator correctly creates zero-initialized arrays with the specified shape and dtype.

ratiopath/masks/mask_builders/receptive_field_manipulation.py (3)

9-62: LGTM!

The EdgeClippingMaskBuilderMixin correctly implements edge clipping with:

  • Proper coordinate adjustment via clip_start_indices
  • Clean slice construction for clipping
  • Good use of strict=True in zip to catch dimension mismatches

The docstring clearly explains the MRO positioning requirement.


65-118: LGTM!

The 2D mixin properly handles all three clip parameter formats and the validation logic is correct. Good error message for invalid input.


121-214: LGTM!

The AutoScalingConstantStrideMixin correctly:

  • Validates divisibility constraints (line 179) to avoid fractional strides
  • Computes total_strides with proper ceiling for partial edge tiles (lines 187-188)
  • Calculates overflow-buffered extents for both source and mask spaces
  • Adjusts coordinates in update_batch using integer division

The documentation thoroughly explains the resolution mismatch problem and the solution.
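The arithmetic behind these points can be worked through with concrete numbers. The formulas below are my reading of the review text, not a copy of receptive_field_manipulation.py, and all variable names and values are assumptions made for the example.

```python
import numpy as np

source_extents = np.array([1000, 1500], dtype=np.int64)
source_tile_extents = np.array([512, 512], dtype=np.int64)
source_tile_strides = np.array([256, 256], dtype=np.int64)
mask_tile_extents = np.array([64, 64], dtype=np.int64)

# The mask stride must be an integer, so the product has to divide evenly.
scaled = source_tile_strides * mask_tile_extents
if np.any(scaled % source_tile_extents != 0):
    raise ValueError("mask tile stride would be fractional")
mask_tile_strides = scaled // source_tile_extents

# Strides needed to cover the source, rounding up for partial edge tiles.
total_strides = np.ceil(
    (source_extents - source_tile_extents) / source_tile_strides
).astype(np.int64)

# Extents padded so that the last (partial) tile still fits.
buffered_source_extents = total_strides * source_tile_strides + source_tile_extents
mask_extents = total_strides * mask_tile_strides + mask_tile_extents

# Map tile coordinates, shape (N, B), from source space to mask space.
source_coords = np.array([[0, 256], [0, 512]], dtype=np.int64)
mask_coords = (
    source_coords // source_tile_strides[:, np.newaxis]
) * mask_tile_strides[:, np.newaxis]
```

Here a 512-pixel source tile maps to a 64-pixel mask tile, so a source stride of 256 becomes a mask stride of 32.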

ratiopath/masks/mask_builders/__init__.py (2)

252-272: LGTM!

The setup_memory override correctly handles custom filepath logic for memmap-backed accumulators:

  • Uses explicit accumulator_filepath for the main accumulator
  • Derives overlap counter path with .overlaps suffix when only accumulator path is provided
  • Falls back to temp files when no paths are specified

This appropriately bypasses the parent's setup_memory since it needs custom allocation behavior.


289-347: LGTM!

The MRO composition is correct:

  1. NumpyArrayMaskBuilderAllocatorMixin → provides allocate_accumulator
  2. AutoScalingConstantStrideMixin → scales coords from source to mask space
  3. ScalarUniformTiledMaskBuilder → compresses and expands scalar values
  4. AveragingMaskBuilderMixin → provides averaging accumulation and finalization

The docstring example correctly shows the expected usage pattern.
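How such a composition resolves can be checked with stub classes. The stubs below are hypothetical and much simpler than the real mixins; they only demonstrate how the MRO decides which `__init__` runs and how keyword arguments travel down the chain.

```python
class CoreBuilder:
    def __init__(self, mask_extents: tuple[int, int]) -> None:
        self.mask_extents = mask_extents


class AveragingStub(CoreBuilder):
    aggregation = "average"


class AutoScalingStub(CoreBuilder):
    def __init__(self, source_extents: tuple[int, int], scale: int, **kwargs) -> None:
        # Translate source-space arguments into what the next class expects.
        mask_extents = tuple(extent // scale for extent in source_extents)
        super().__init__(mask_extents=mask_extents, **kwargs)


class AllocatorStub(CoreBuilder):
    def allocate(self) -> list[list[float]]:
        height, width = self.mask_extents
        return [[0.0] * width for _ in range(height)]


class ComposedBuilder(AllocatorStub, AutoScalingStub, AveragingStub):
    pass


builder = ComposedBuilder(source_extents=(64, 128), scale=32)
mro_names = [cls.__name__ for cls in ComposedBuilder.__mro__]
mask = builder.allocate()
```

Inspecting `__mro__` like this is a cheap way to confirm that a new combination of mixins is ordered as intended.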

@AdamBajger
Author

@copilot pls run ruff format to fix linting errors

Contributor

Copilot AI commented Jan 21, 2026

@AdamBajger I've opened a new pull request, #27, to work on those changes. Once the pull request is ready, I'll request review from you.

Copilot AI and others added 2 commits January 21, 2026 22:37
Co-authored-by: AdamBajger <33228382+AdamBajger@users.noreply.github.com>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@ratiopath/masks/mask_builders/mask_builder.py`:
- Around line 62-96: The base initializer and setup_memory must explicitly
accept and pass dtype to match the abstract allocate_accumulator signature: add
a dtype parameter to __init__ (e.g., dtype: np.dtype) and store/forward it,
update setup_memory(self, mask_extents, channels, dtype, **kwargs) to call
self.allocate_accumulator(mask_extents, channels, dtype=dtype, **kwargs), and
ensure any references to self.accumulator allocation use the provided dtype so
concrete implementations of allocate_accumulator no longer rely on a hidden
default.
♻️ Duplicate comments (2)
ratiopath/masks/mask_builders/mask_builder.py (2)

87-90: Fix docstring grammar (“methods” → “method”)

Line 89: “This methods can …” → “This method can …”.

✍️ Suggested edit
-        This methods can be overridden by mixins or concrete builders to set up any necessary memory structures.
+        This method can be overridden by mixins or concrete builders to set up any necessary memory structures.

145-156: Align finalize() return type with the docstring

The docstring says it always returns a tuple, but the type hint allows AccumulatorType. Consider narrowing the type or updating the docstring for consistency.

🧩 Suggested type tightening
-    def finalize(self) -> tuple[AccumulatorType, ...] | AccumulatorType:
+    def finalize(self) -> tuple[AccumulatorType, ...]:

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@ratiopath/masks/mask_builders/mask_builder.py`:
- Around line 63-72: Docstring for __init__ is missing documentation for the
dtype parameter; add a short description under Args explaining dtype (its
expected types, e.g., numpy dtype or npt.DTypeLike) and its purpose (the element
type used when allocating the accumulator via allocate_accumulator()), keeping
formatting consistent with the other Args entries and mentioning that it is
forwarded to allocate_accumulator().

In `@ratiopath/masks/mask_builders/receptive_field_manipulation.py`:
- Around line 164-205: The subclass __init__ in receptive_field_manipulation is
not forwarding the dtype argument to the parent MaskBuilder.__init__, causing
tests to fail; update the super().__init__ call in the __init__ method to
include dtype=dtype (i.e., pass the local dtype parameter through alongside
mask_extents, channels, mask_tile_extents, mask_tile_strides, and **kwargs) so
the parent receives the required dtype argument.
🧹 Nitpick comments (1)
ratiopath/masks/mask_builders/receptive_field_manipulation.py (1)

248-261: Make dtype explicit in ScalarUniformTiledMaskBuilder.__init__.

The base class requires dtype, but this constructor doesn’t surface it, which makes the API surprising for callers.

♻️ Suggested adjustment
     def __init__(
         self,
         mask_extents: Int64[AccumulatorType, " N"],
         channels: int,
         mask_tile_extents: Int64[AccumulatorType, " N"],
         mask_tile_strides: Int64[AccumulatorType, " N"],
+        dtype: npt.DTypeLike,
         **kwargs: Any,
     ) -> None:
@@
         super().__init__(
-            mask_extents=adjusted_mask_extents, channels=channels, **kwargs
+            mask_extents=adjusted_mask_extents, channels=channels, dtype=dtype, **kwargs
         )

- dtype parameter missing from concrete classes
- bump version
- the version in use is yanked (uv warns about it) due to a backward-compatibility bug

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

🤖 Fix all issues with AI agents
In `@ratiopath/masks/mask_builders/__init__.py`:
- Around line 1-23: The import block in
ratiopath/masks/mask_builders/__init__.py is unsorted causing lint I001; reorder
the imports into standard groups (stdlib, third-party, local) and alphabetize
within groups so numpy (imported as np and numpy.typing as npt) comes before
jaxtyping.Int64, and third-party imports (np, npt, jaxtyping) are grouped
together before local package imports; specifically ensure Path and Any remain
in the stdlib group, numpy.typing (npt) and numpy (np) are ordered before Int64
from jaxtyping, and the local imports (AveragingMaskBuilderMixin,
MaxMaskBuilderMixin, AccumulatorType, AutoScalingConstantStrideMixin,
EdgeClippingMaskBuilder2DMixin, ScalarUniformTiledMaskBuilder,
NumpyArrayMaskBuilderAllocatorMixin, NumpyMemMapMaskBuilderAllocatorMixin)
follow—alternatively run `ruff format`/`isort` to auto-fix.

In `@ratiopath/masks/mask_builders/mask_builder.py`:
- Around line 89-99: The docstring for the abstract accumulator allocation
method in mask_builders/mask_builder.py (the method that documents parameters
mask_extents, channels, dtype, **kwargs) is missing a blank line before the
"Args:" section which triggers D411; fix it by inserting a single blank line
between the end of the summary paragraph ("how the accumulator(s) are allocated
and stored...") and the "Args:" header so the docstring conforms to the style
guide.

In `@ratiopath/masks/mask_builders/receptive_field_manipulation.py`:
- Around line 102-145: The __init__ docstring for the
receptive_field_manipulation mixin is missing documentation for the **kwargs
parameter which causes D417; update the __init__ docstring in
receptive_field_manipulation.py (the __init__ method shown) to include a brief
entry for **kwargs describing that additional keyword arguments are forwarded to
the superclass (e.g., EdgeClippingMaskBuilderMixin) and any expected
keys/behaviour (such as clip_start_indices/clip_end_indices being provided by
this wrapper), so CI lint D417 is satisfied.
- Around line 278-301: The __init__ docstring for the Receptive Field
Manipulation mask builder is missing documentation for the **kwargs parameter
which causes D417; update the Args section in the __init__ docstring to add a
brief line describing **kwargs (e.g., "Additional keyword arguments forwarded to
super().__init__") so callers know extra keyword arguments are accepted and
where they are passed (refer to the __init__ method and its call to
super().__init__).
- Around line 186-241: The __init__ docstring for the ReceptiveFieldManipulation
mixin is missing documentation for **kwargs causing D417; update the docstring
of __init__ to add a brief parameter entry for **kwargs (e.g., "**kwargs:
Additional keyword arguments forwarded to super().__init__.") and place it with
the other Args so the linter recognizes it, keeping the description concise and
matching the style used for other parameters in this method.
- Around line 32-52: The __init__ docstring is missing documentation for the
**kwargs parameter causing D417; update the docstring in the
receptive_field_manipulation.__init__ method to include a brief entry for
**kwargs explaining that additional keyword arguments are forwarded to the
superclass (super().__init__) and list any expected/used extra keys if relevant
(e.g., configuration hooks or mixin options), so callers know what extra
parameters are accepted.
♻️ Duplicate comments (2)
pyproject.toml (1)

30-30: Add version constraint for jaxtyping.

This dependency lacks a version specifier, unlike all other dependencies in the list (e.g., numpy>=2.2.2, torch>=2.6.0). Add a minimum version constraint such as jaxtyping>=0.3.5 for consistency and to prevent unexpected breaking changes or supply chain risks.

ratiopath/masks/mask_builders/mask_builder.py (1)

104-106: Typo in docstring: "This methods can" should be "This method can".

✏️ Proposed fix
-        This methods can be overridden by mixins or concrete builders to set up any necessary memory structures.
+        This method can be overridden by mixins or concrete builders to set up any necessary memory structures.
🧹 Nitpick comments (3)
ratiopath/masks/mask_builders/receptive_field_manipulation.py (1)

209-220: Consider adding defensive np.asarray() conversions for tuple inputs.

The arithmetic operations at lines 213, 215, 220, 223-231 assume NumPy array semantics. If any parameters (source_extents, source_tile_extents, source_tile_strides, mask_tile_extents) are passed as Python tuples, operations like source_tile_strides[:, np.newaxis] in update_batch (line 249) will raise TypeError.

The type hints provide no runtime enforcement. Consider adding explicit conversions similar to EdgeClippingMaskBuilderMixin (lines 51-52).

♻️ Proposed fix
     def __init__(
         self,
         source_extents: Int64[AccumulatorType, " N"],
         source_tile_extents: Int64[AccumulatorType, " N"],
         source_tile_strides: Int64[AccumulatorType, " N"],
         mask_tile_extents: Int64[AccumulatorType, " N"],
         channels: int,
         dtype: npt.DTypeLike,
         **kwargs: Any,
     ) -> None:
+        source_extents = np.asarray(source_extents, dtype=np.int64)
+        source_tile_extents = np.asarray(source_tile_extents, dtype=np.int64)
+        source_tile_strides = np.asarray(source_tile_strides, dtype=np.int64)
+        mask_tile_extents = np.asarray(mask_tile_extents, dtype=np.int64)
+
         # self.source_extents = source_extents
         self.source_tile_extents = source_tile_extents
         self.mask_tile_extents = mask_tile_extents
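The difference the coercion makes can be seen in a short, self-contained example. The variable names are illustrative only.

```python
import numpy as np

strides_as_tuple = (256, 256)

# A plain tuple does not support NumPy-style multi-axis indexing.
try:
    strides_as_tuple[:, np.newaxis]
except TypeError as exc:
    tuple_error = str(exc)
else:
    tuple_error = ""

# After coercion the same expression produces a column vector.
strides = np.asarray(strides_as_tuple, dtype=np.int64)
column = strides[:, np.newaxis]
```

`np.asarray` does not copy when it is handed an array that already has the requested dtype, so the conversion costs little for callers who already pass arrays.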
ratiopath/masks/mask_builders/__init__.py (2)

289-310: Document why super().setup_memory() is intentionally bypassed.

This setup_memory override allocates both accumulators directly with custom filepath handling, bypassing the cooperative chain. While this works correctly, it opts out of any setup_memory behaviors that future mixins might add. Consider adding a brief comment explaining this is intentional.

💡 Suggested documentation
     def setup_memory(
         self,
         mask_extents: Int64[AccumulatorType, " N"],
         channels: int,
         dtype: npt.DTypeLike = np.float32,
         accumulator_filepath: Path | None = None,
         overlap_counter_filepath: Path | None = None,
         **kwargs: Any,
     ) -> None:
+        # Note: We intentionally bypass super().setup_memory() because we need
+        # custom filepath handling for both the main accumulator and overlap counter.
         self.accumulator = self.allocate_accumulator(

312-324: Return type mismatch and TODO for tests.

  1. The return type annotation is tuple[float, float], but tuple(scale_factors) returns NumPy scalar types. Consider casting to Python floats for consistency with the type hint.

  2. The TODO comment indicates missing test coverage for this method.

💡 Proposed fix for return type
         scale_factors = (
             self.overflow_buffered_source_extents / self.accumulator.shape[1:]
         )  # H, W
-        return tuple(scale_factors)  # TODO: add tests for this method
+        return (float(scale_factors[0]), float(scale_factors[1]))  # TODO: add tests for this method

Would you like me to help generate unit tests for get_vips_scale_factors() or open an issue to track this?

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@ratiopath/masks/mask_builders/__init__.py`:
- Around line 139-140: Docstring example uses the wrong OpenSlide API: replace
the attribute access slide.dimensions[LEVEL] with slide.level_dimensions[LEVEL]
in the example; update the example lines that create the OpenSlide object (slide
= openslide.OpenSlide(...)) and the subsequent dimensions lookup so the code
shows slide.level_dimensions[LEVEL] (using the LEVEL constant) to match the
OpenSlide API.
- Around line 65-66: The docstring example uses the wrong OpenSlide API: replace
the indexed use of slide.dimensions (which always returns base-level (width,
height)) with slide.level_dimensions[LEVEL]; update the example that constructs
slide via openslide.OpenSlide and the subsequent access so it reads that
slide_extent_x, slide_extent_y = slide.level_dimensions[LEVEL] (referencing the
slide variable, openslide.OpenSlide, slide.dimensions -> slide.level_dimensions,
and the LEVEL constant).
♻️ Duplicate comments (6)
ratiopath/masks/mask_builders/receptive_field_manipulation.py (6)

300-306: Consider adding defensive np.asarray conversions for robustness.

While this class is typically used via composition with AutoScalingConstantStrideMixin (which would pass arrays), direct instantiation with tuples would cause np.gcd() to fail or behave unexpectedly.

♻️ Proposed fix
     def __init__(
         self,
         mask_extents: Int64[AccumulatorType, " N"],
         channels: int,
         mask_tile_extents: Int64[AccumulatorType, " N"],
         mask_tile_strides: Int64[AccumulatorType, " N"],
         dtype: npt.DTypeLike,
         **kwargs: Any,
     ) -> None:
+        mask_extents = np.asarray(mask_extents, dtype=np.int64)
+        mask_tile_extents = np.asarray(mask_tile_extents, dtype=np.int64)
+        mask_tile_strides = np.asarray(mask_tile_strides, dtype=np.int64)
+
         self.compression_factors = np.gcd(mask_tile_strides, mask_tile_extents)

43-49: Fix lint failure: D417 requires documenting **kwargs with the asterisks.

The docstring uses kwargs: but the parameter name is **kwargs. D417 expects exact parameter name matching.

✏️ Proposed fix
             clip_start_indices: Array of shape (N,) specifying pixels to clip from the start of each dimension.
             clip_end_indices: Array of shape (N,) specifying pixels to clip from the end of each dimension.
             dtype: Data type for the accumulator.
-            kwargs: Additional keyword arguments passed to the parent class.
+            **kwargs: Additional keyword arguments passed to the parent class.
         """

116-125: Fix lint failure: D417 requires documenting **kwargs with the asterisks.

Same issue as in the parent class.

✏️ Proposed fix
             clip: Specifies how many pixels to clip from each edge. Can be:
                 - Single int: clips that many pixels from all edges
                 - Tuple of 2 ints: (top/bottom, left/right)
                 - Tuple of 4 ints: (top, bottom, left, right)
             dtype: Data type for the accumulator.
-            kwargs: Additional keyword arguments passed to the parent class.
+            **kwargs: Additional keyword arguments passed to the parent class.
         """

203-214: Fix lint failure: D417 requires documenting **kwargs with the asterisks.

✏️ Proposed fix
             source_tile_strides: Stride between input/source tiles (optional, defaults to source_tile_extents).
             dtype: Data type for the accumulator.
-            kwargs: Additional keyword arguments passed to the next class in MRO.
+            **kwargs: Additional keyword arguments passed to the next class in MRO.
         """

212-223: Add np.asarray coercion to prevent TypeError with tuple inputs.

The arithmetic at line 216 (source_tile_strides * self.mask_tile_extents) requires NumPy array semantics. If tuples are passed, this will raise TypeError: can't multiply sequence by non-int of type 'tuple'. The jaxtyping annotations provide no runtime enforcement on their own.

🐛 Proposed fix
     def __init__(
         self,
         source_extents: Int64[AccumulatorType, " N"],
         source_tile_extents: Int64[AccumulatorType, " N"],
         source_tile_strides: Int64[AccumulatorType, " N"],
         mask_tile_extents: Int64[AccumulatorType, " N"],
         channels: int,
         dtype: npt.DTypeLike,
         **kwargs: Any,
     ) -> None:
-        # self.source_extents = source_extents
-        self.source_tile_extents = source_tile_extents
-        self.mask_tile_extents = mask_tile_extents
+        source_extents = np.asarray(source_extents, dtype=np.int64)
+        source_tile_extents = np.asarray(source_tile_extents, dtype=np.int64)
+        source_tile_strides = np.asarray(source_tile_strides, dtype=np.int64)
+        mask_tile_extents = np.asarray(mask_tile_extents, dtype=np.int64)
+
+        self.source_tile_extents = source_tile_extents
+        self.mask_tile_extents = mask_tile_extents

         multiplied_ = source_tile_strides * self.mask_tile_extents

292-299: Fix lint failure: D417 requires documenting **kwargs with the asterisks.

✏️ Proposed fix
             mask_tile_extents: Size of tiles in each dimension in mask space at the original resolution.
             mask_tile_strides: Stride between tile positions in mask space for each dimension.
             dtype: Data type for the accumulator.
-            kwargs: Additional keyword arguments passed to the parent class.
+            **kwargs: Additional keyword arguments passed to the parent class.
         """
🧹 Nitpick comments (2)
ratiopath/masks/mask_builders/__init__.py (2)

289-310: Consider adding a docstring explaining why super().setup_memory() is not called.

This override directly allocates both accumulators without calling the parent's setup_memory(). While this appears intentional (to handle the dual-accumulator pattern with separate filepaths), a brief docstring would help future maintainers understand why the cooperative MRO pattern is bypassed here.

✏️ Proposed documentation
     def setup_memory(
         self,
         mask_extents: Int64[AccumulatorType, " N"],
         channels: int,
         dtype: npt.DTypeLike = np.float32,
         accumulator_filepath: Path | None = None,
         overlap_counter_filepath: Path | None = None,
         **kwargs: Any,
     ) -> None:
+        """Allocate memory-mapped accumulators for mask and overlap counter.
+
+        Note: This intentionally does not call super().setup_memory() because
+        this class requires custom dual-accumulator allocation with separate
+        filepaths for the main accumulator and overlap counter.
+        """
         self.accumulator = self.allocate_accumulator(

321-324: Address the TODO: add tests for get_vips_scale_factors.

This method has a TODO comment indicating missing test coverage.

Do you want me to generate unit tests for this method or open an issue to track this task?

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@ratiopath/masks/mask_builders/__init__.py`:
- Around line 289-310: The setup_memory method currently ignores the dtype and
additional kwargs; update the two calls to allocate_accumulator in setup_memory
(the one assigning self.accumulator and the one assigning self.overlap_counter)
to pass dtype=dtype and forward **kwargs so the memmap allocations respect the
requested dtype and any extra options; keep channels=1 for the overlap counter
but still pass dtype and **kwargs to allocate_accumulator.

…askBuilder2D with detailed docstring and dtype support

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@ratiopath/masks/mask_builders/__init__.py`:
- Around line 298-314: The docstring for setup_memory (the mixin method on line
~297) is missing required blank lines and has an incorrect Returns section;
update the docstring to follow D410/D411 by inserting a blank line between the
short description and the longer description/sections, and ensure there's a
blank line before the "Args:" block and before "Returns:". Correct the Returns
section to state that the method returns None (or remove the Returns section
entirely if not needed) to match the function signature of setup_memory.
Preserve the existing explanation about why super.setup_memory() is not called
and keep parameter descriptions unchanged.
🧹 Nitpick comments (1)
ratiopath/masks/mask_builders/__init__.py (1)

344-347: Address TODO: Add tests and consider casting to Python floats.

The TODO indicates missing test coverage. Also, tuple(scale_factors) returns NumPy scalar types (np.float64), not Python floats as the return type annotation suggests. This can cause issues with JSON serialization or non-NumPy APIs.

📝 Proposed fix for type consistency
         scale_factors = (
             self.overflow_buffered_source_extents / self.accumulator.shape[1:]
         )  # H, W
-        return tuple(scale_factors)  # TODO: add tests for this method
+        return (float(scale_factors[0]), float(scale_factors[1]))

Would you like me to generate unit tests for get_vips_scale_factors or open a new issue to track this task?
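The type difference is easy to check directly. The values below are made up for illustration. Note that `np.float64` subclasses Python's `float`, so `json.dumps` accepts it; other NumPy scalar types such as `np.float32` are rejected, which is where an explicit cast helps most.

```python
import json

import numpy as np

buffered_extents = np.array([1024, 1536], dtype=np.int64)
mask_shape = (128, 192)
scale_factors = buffered_extents / mask_shape  # float64 array

as_numpy = tuple(scale_factors)
as_python = (float(scale_factors[0]), float(scale_factors[1]))

# float64 scalars serialize because they are float subclasses.
roundtrip = json.loads(json.dumps(as_numpy))

# float32 scalars do not.
try:
    json.dumps(np.float32(8.0))
except TypeError:
    float32_rejected = True
else:
    float32_rejected = False
```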

Comment on lines +48 to +51
mask_extents (tuple[int, int]): Spatial dimensions of the mask to build.
channels (int): Number of channels in the scalar values to be assembled into the mask.
mask_tile_extents (tuple[int, int]): Extents of the tiles in mask space for each spatial dimension (height, width).
mask_tile_strides (tuple[int, int]): Strides of the tiles in mask space for each spatial dimension (height, width).
Contributor

remove typings

Comment on lines +123 to +126
mask_extents (tuple[int, int]): Spatial dimensions of the mask to build.
channels (int): Number of channels in the scalar values to be assembled into the mask.
mask_tile_extents (tuple[int, int]): Extents of the tiles in mask space for each spatial dimension (height, width).
mask_tile_strides (tuple[int, int]): Strides of the tiles in mask space for each spatial dimension (height, width).
Contributor

Ditto.

            dtype=dtype,
        )

    def get_vips_scale_factors(self) -> tuple[float, float]:
Contributor

why is this not handled by the EdgeClipping class/mixin?


    def __init__(
        self,
        source_extents: Int64[AccumulatorType, " N"],
Contributor

unused

@Adames4 Adames4 self-requested a review January 28, 2026 09:54
     overlap_vectorized = np.vectorize(overlap_fraction, otypes=[object])

-    return pa.array(overlap_vectorized(overlay))
+    return pa.array(overlap_vectorized(overlay))  # type: ignore [return-value]
Contributor

Why type: ignore?

Comment on lines +18 to +43
def compute_acc_slices(
    coords_batch: Int64[AccumulatorType, "N B"],
    mask_tile_extents: Int64[AccumulatorType, " N"],
) -> list[list[slice]]:
    """Compute slice objects for accumulator indexing.

    Args:
        coords_batch: Array of shape (N, B) with top-left coordinates for B tiles in N dimensions.
        mask_tile_extents: Array of shape (N,) with tile size in mask space for each dimension.

    Returns:
        List of N lists, each containing B slice objects for indexing into accumulator.
    """
    acc_ends = coords_batch + mask_tile_extents[:, np.newaxis]  # shape (N, B)

    acc_slices_batch_per_dim = []
    for dimension in range(coords_batch.shape[0]):
        acc_slices_batch_per_dim.append(
            [
                slice(start, end)
                for start, end in zip(
                    coords_batch[dimension], acc_ends[dimension], strict=True
                )
            ]
        )
    return acc_slices_batch_per_dim
Contributor

Suggested change
 def compute_acc_slices(
     coords_batch: Int64[AccumulatorType, "N B"],
     mask_tile_extents: Int64[AccumulatorType, " N"],
 ) -> list[list[slice]]:
     """Compute slice objects for accumulator indexing.

     Args:
         coords_batch: Array of shape (N, B) with top-left coordinates for B tiles in N dimensions.
         mask_tile_extents: Array of shape (N,) with tile size in mask space for each dimension.

     Returns:
         List of N lists, each containing B slice objects for indexing into accumulator.
     """
-    acc_ends = coords_batch + mask_tile_extents[:, np.newaxis]  # shape (N, B)
+    tile_end_coords = coords_batch + mask_tile_extents[:, np.newaxis]  # shape (N, B)

     acc_slices_batch_per_dim = []
     for dimension in range(coords_batch.shape[0]):
         acc_slices_batch_per_dim.append(
             [
                 slice(start, end)
                 for start, end in zip(
-                    coords_batch[dimension], acc_ends[dimension], strict=True
+                    coords_batch[dimension], tile_end_coords[dimension], strict=True
                 )
             ]
         )
     return acc_slices_batch_per_dim

I think acc_ends is a misleading name for something that represents the coordinates of the tile’s bottom-right corner.

    """

    def setup_memory(
        self, mask_extents, channels, dtype: npt.DTypeLike, **kwargs
Contributor

Missing type annotation on mask_extents and channels

        # Perform base allocation then allocate the overlap counter.
        super().setup_memory(mask_extents, channels, dtype=dtype, **kwargs)
        self.overlap_counter = self.allocate_accumulator(
            mask_extents=mask_extents, channels=1, dtype=dtype, **kwargs
Contributor

dtype in overlap_counter should be integer for example np.uint16 or np.uint32

Comment on lines +80 to +83
    def finalize(self) -> tuple[AccumulatorType, ...]:
        # Average the accumulated mask by the overlap counts
        self.accumulator /= self.overlap_counter.clip(min=1)
        return self.accumulator, self.overlap_counter
Contributor

Suggested change
     def finalize(self) -> tuple[AccumulatorType, ...]:
         # Average the accumulated mask by the overlap counts
-        self.accumulator /= self.overlap_counter.clip(min=1)
-        return self.accumulator, self.overlap_counter
+        return (
+            self.accumulator / self.overlap_counter.clip(min=1),
+            self.overlap_counter,
+        )

This is just a personal preference: with this change, finalize becomes idempotent, meaning you can call it multiple times and get the same result each time.
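The idempotency point can be demonstrated on a toy accumulator. This is a sketch with made-up values, not the mixin itself.

```python
import numpy as np


def make_state() -> tuple[np.ndarray, np.ndarray]:
    """Return a fresh (accumulator, overlap_counter) pair."""
    return np.array([2.0, 6.0, 4.0]), np.array([1, 2, 1])


# In-place division: a second call divides already-averaged values again.
acc, counts = make_state()
acc /= counts.clip(min=1)
first_inplace = acc.copy()
acc /= counts.clip(min=1)
second_inplace = acc.copy()

# Out-of-place division: the accumulator is left untouched.
acc, counts = make_state()
first = acc / counts.clip(min=1)
second = acc / counts.clip(min=1)
```

The trade-off is memory: the out-of-place version creates a second array of the same size as the accumulator, which may matter for large memmap-backed masks.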

@Adames4
Contributor

Adames4 commented Jan 28, 2026

A usability/API suggestion we discussed and agreed on with others (@vejtek, @JakubPekar, @matejpekar): the functionality in ratiopath.masks.mask_builders would be easier to consume if there were a single public factory-style entry point (e.g. MaskBuilder / MaskBuilderFactory) with a unified set of parameters (aggregation: avg/max, storage: ndarray/memmap + filepaths, scaling/stride options, edge clipping, etc.) that internally chooses the right concrete implementation.

That would let users avoid reasoning about mixin composition/MRO and make the API more discoverable. To keep naming clear, the current abstract base MaskBuilder could be renamed to MaskBuilderABC (or similar).

More on naming: I’d strongly prefer we don’t expect users to instantiate ~50-character class names like AutoScalingAveragingClippingNumpyMemMapMaskBuilder2D directly—I find names that long pretty painful to work with.
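One possible shape for such an entry point is a small registry keyed by the user-facing options. Everything in this sketch is hypothetical: the stub classes, the `create_mask_builder` name, and the option strings are placeholders for whatever the maintainers settle on.

```python
from typing import Any, Literal


class _StubBuilder:
    """Stand-in for a concrete builder; it just stores the options it receives."""

    def __init__(self, **options: Any) -> None:
        self.options = options


class AvgNdarrayBuilder(_StubBuilder): ...


class AvgMemmapBuilder(_StubBuilder): ...


class MaxNdarrayBuilder(_StubBuilder): ...


# Maps (aggregation, storage) to the class that implements the combination.
_REGISTRY: dict[tuple[str, str], type[_StubBuilder]] = {
    ("avg", "ndarray"): AvgNdarrayBuilder,
    ("avg", "memmap"): AvgMemmapBuilder,
    ("max", "ndarray"): MaxNdarrayBuilder,
}


def create_mask_builder(
    aggregation: Literal["avg", "max"],
    storage: Literal["ndarray", "memmap"],
    **options: Any,
) -> _StubBuilder:
    """Pick the concrete builder for the requested combination."""
    try:
        builder_cls = _REGISTRY[(aggregation, storage)]
    except KeyError:
        raise ValueError(
            f"unsupported combination: {aggregation!r} + {storage!r}"
        ) from None
    return builder_cls(**options)


builder = create_mask_builder("avg", "memmap", channels=1)
```

Callers would then only see short option names, while the long mixin-composed class names stay an implementation detail.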

6 participants