
Bug: preprocessing resampling might result in tensor with dimensions 255 instead of 256 (set in initial_resolution) #22

@TMartinot

Description


First of all, compliments on the results achieved with this piece of software!

Bug

When I was using the software to register WSIs I encountered an error that was difficult to trace back, but I think I found the small bug responsible. Most of the WSIs registered perfectly fine, but for a few slides I got this error:

Exception: Argument #6: Padding size should be less than the corresponding input dimension, but got: padding (1, 1) at dimension 2 of input [1, 1, 1, 2]

Traceback (most recent call last):
  File "script.py", line 83, in <module>
    main()
  File "script.py", line 79, in main
    deeperhistreg_caller(args)
  File "script.py", line 55, in deeperhistreg_caller
    deeperhistreg.run_registration(**dhr_config)
  File "DeeperHistReg/deeperhistreg/run.py", line 55, in run_registration
    warped_name = [item for item in os.listdir(pathlib.Path(save_path) / experiment_name / "Results_Final") if "warped_source" in item][0]
IndexError: list index out of range

It happened for only a small proportion of the slides, but it consistently failed for the same slides. Unfortunately, the Python error and traceback above are raised only because the nonrigid registration was not performed and hence the final results could not be produced. The actual error has no traceback since it is caught here:

### Run Registration ###
try:
    registration_parameters['logging_path'] = pathlib.Path(save_path) / "logs.txt"
    registration_parameters['case_name'] = experiment_name
    pipeline = fr.DeeperHistReg_FullResolution(registration_parameters)
    pipeline.run_registration(source_path, target_path, save_path)
except Exception as e:
    print(f"Exception: {e}")

Since the program will fail anyway, and the try/except block only makes it harder to trace back the actual error, I would suggest removing it. That leaves it to the user to decide whether they want to catch errors when registering a batch of WSIs, and removes possible confusion about errors. Also in:
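If some error handling is still desired for batch runs, one pattern that keeps the logging but preserves the traceback is to log and re-raise. A minimal self-contained sketch (`run_registration_checked` is a hypothetical wrapper, not part of DeeperHistReg):

```python
import logging

def run_registration_checked(fn, case_name):
    """Run a registration callable; log failures for batch bookkeeping,
    then re-raise so the caller still gets the full traceback."""
    try:
        return fn()
    except Exception:
        logging.exception("Registration failed for %s", case_name)
        raise
```

With this pattern a batch driver can still catch per-slide failures itself, while single-slide runs fail loudly with the real traceback.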

After disabling the try/except block above, I obtained the actual traceback:

Traceback (most recent call last):
  File "script.py", line 86, in <module>
    main()
  File "script.py", line 82, in main
    deeperhistreg_caller(args)
  File "script.py", line 56, in deeperhistreg_caller
    deeperhistreg.run_registration(**dhr_config)
  File "DeeperHistReg/deeperhistreg/run.py", line 44, in run_registration
    pipeline.run_registration(source_path, target_path, save_path)
  File "DeeperHistReg/deeperhistreg/dhr_pipeline/../dhr_pipeline/full_resolution.py", line 299, in run_registration
    self.nonrigid_registration()
  File "DeeperHistReg/deeperhistreg/dhr_pipeline/../dhr_pipeline/full_resolution.py", line 240, in nonrigid_registration
    self.run_nonrigid_registration()
  File "DeeperHistReg/deeperhistreg/dhr_pipeline/../dhr_pipeline/full_resolution.py", line 189, in run_nonrigid_registration
    self.nonrigid_displacement_field = nonrigid_registration_function(self.pre_source, self.pre_target, self.current_displacement_field, nonrigid_registration_params)
  File "DeeperHistReg/deeperhistreg/dhr_pipeline/../dhr_registration/nonrigid_registration_methods.py", line 66, in instance_optimization_nonrigid_registration
    return ion.instance_optimization_nonrigid_registration(source, target, initial_displacement_field, params)
  File "DeeperHistReg/deeperhistreg/dhr_pipeline/../dhr_registration/./dhr_nonrigid_registration/io_nonrigid.py", line 67, in instance_optimization_nonrigid_registration
    displacement_field = io.nonrigid_registration(warped_source, resampled_target, num_levels, used_levels, iterations, learning_rates, alphas,
  File "DeeperHistReg/deeperhistreg/dhr_pipeline/../dhr_registration/./dhr_building_blocks/instance_optimization.py", line 172, in nonrigid_registration
    source_pyramid = u.create_pyramid(source, num_levels=num_levels)
  File "DeeperHistReg/deeperhistreg/dhr_pipeline/../dhr_utils/utils.py", line 462, in create_pyramid
    new_tensor = resample_tensor_to_size(gaussian_smoothing(pyramid[i+1], 1), new_size, mode=mode)
  File "DeeperHistReg/deeperhistreg/dhr_pipeline/../dhr_utils/utils.py", line 90, in gaussian_smoothing
    return tr.GaussianBlur(kernel_size, sigma)(tensor)
  File "venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "venv/lib/python3.10/site-packages/torchvision/transforms/transforms.py", line 1812, in forward
    return F.gaussian_blur(img, self.kernel_size, [sigma, sigma])
  File "venv/lib/python3.10/site-packages/torchvision/transforms/functional.py", line 1380, in gaussian_blur
    output = F_t.gaussian_blur(t_img, kernel_size, sigma)
  File "envs/venv/lib/python3.10/site-packages/torchvision/transforms/_functional_tensor.py", line 760, in gaussian_blur
    img = torch_pad(img, padding, mode="reflect")
  File "venv/lib/python3.10/site-packages/torch/nn/functional.py", line 5096, in pad
    return torch._C._nn.pad(input, pad, mode, value)
RuntimeError: Padding size should be less than the corresponding input dimension, but got: padding (1, 1) at dimension 2 of input [1, 1, 1, 2]

After debugging for a long time, I think the problem is here:

Origin of bug

For certain dimensions I obtained the error described above. It is the result of a tensor being resampled to a size of (1, 1, 255, width) instead of (1, 1, 256, width) (as set in params["preprocessing_params"]["initial_resolution"]), which after further downsampling leads to Exception: Argument #6: Padding size should be less than the corresponding input dimension, but got: padding (1, 1) at dimension 2 of input [1, 1, 1, 2].

source = u.resample(u.gaussian_smoothing(source, initial_smoothing), initial_resample_ratio)
target = u.resample(u.gaussian_smoothing(target, initial_smoothing), initial_resample_ratio)

def resample(tensor: tc.Tensor, resample_ratio: float, mode: str = "bilinear") -> tc.Tensor:
    """
    TODO
    """
    return F.interpolate(tensor, scale_factor=1 / resample_ratio, mode=mode, recompute_scale_factor=False, align_corners=False)

def resample_tensor_to_size(tensor: tc.Tensor, new_size: tc.Tensor, mode: str = 'bilinear') -> tc.Tensor:
    """
    TODO
    """
    return F.interpolate(tensor, size=new_size, mode=mode, align_corners=False)

This seems to happen because the resampling is done using the correctly calculated initial_resample_ratio, but due to floating-point precision and the PyTorch implementation the resulting height dimension becomes 255.99999..., which PyTorch casts to int (i.e. floors) to 255, resulting in the error described above.
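The rounding hazard can be shown in isolation with pure Python double arithmetic, assuming the resample ratio works out to height / initial_resolution and that, as in torch.nn.functional.interpolate, the output size is floored from input_size * scale_factor:

```python
import math

# Mirror how the output size is derived from a scale factor:
# output_size = floor(input_size * scale_factor), in double precision.
initial_resolution = 256
height = 16051                       # the height that triggered the bug

ratio = height / initial_resolution  # 62.69921875, exactly representable
scale = 1 / ratio                    # 256/16051 is NOT exactly representable

resampled = height * scale           # mathematically 256, numerically just below it
print(resampled, math.floor(resampled))
```

Because `scale` is rounded slightly downward, the product undershoots 256 and the floor yields 255 instead of the intended 256.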

See also:

Proposed bugfix

As the output size is what matters most in this case, my suggestion would be to use your resample_tensor_to_size method instead, as this resolves the behavior. This PyTorch forum page suggested that using torch.nn.functional.interpolate with recompute_scale_factor=True should also work, but when I tested it (see the test below) it still produced a tensor with height 255 instead of 256.
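To make the fix concrete, one way to drive resample_tensor_to_size is to compute the explicit target size up front and round it, rather than letting interpolate floor an inexact scale factor. A sketch (`compute_target_size` is a hypothetical helper, assuming the smaller dimension is the one pinned to initial_resolution):

```python
def compute_target_size(height, width, initial_resolution):
    """Hypothetical helper: derive an explicit (H, W) output size so the
    smaller dimension lands exactly on initial_resolution, sidestepping
    floor() applied to an inexact scale factor."""
    ratio = min(height, width) / initial_resolution
    return (round(height / ratio), round(width / ratio))
```

For the failing case (H=16051, W=17600) this yields a height of exactly 256, because round() absorbs the tiny floating-point error that floor() turns into an off-by-one.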

I implemented this change in:


Setup

  • OS: Rocky Linux 8.7 (Green Obsidian) - Slurm 23.02.8
  • GPUs: 3x RTX8000 (VRAM: 48GB) CUDA-compatible setup
  • Python: 3.10.16
  • Pip: pip 23.0.1 from .../venv/lib/python3.10/site-packages/pip (python 3.10)
    • torch 2.5.1+cu118
    • torchvision 0.20.1+cu118
    • torchio 0.21.0
    • torchstain 1.4.1
    • torchsummary 1.5.1

Stand-alone test

Below is a stand-alone test to reproduce the bug and test bugfixes:

import pathlib
import logging
import unittest

import torch as tc
import torch.nn.functional as F

from deeperhistreg.dhr_utils import utils as u


def resample_modified(tensor: tc.Tensor, ratio: float, mode="bilinear"):
    """Modified resample method: use `recompute_scale_factor=True`"""
    return F.interpolate(tensor, scale_factor=1 / ratio, mode=mode, recompute_scale_factor=True, align_corners=False)


def init_logger():
    """Basic logger"""
    log_path = pathlib.Path(__file__).with_suffix(".log")
    fmt = "[%(levelname)s] - %(asctime)s - %(name)s: %(message)s - %(pathname)s:%(lineno)d"

    logging.basicConfig(
        format=fmt,
        level=logging.INFO,
        handlers=[
            logging.FileHandler(log_path),
            logging.StreamHandler()
        ],
        force=True,  # override pytest/unittest handlers
    )

    return logging.getLogger("Resampling-test")


logger = init_logger()


class TestResampling(unittest.TestCase):
    """NOTE: This test mainly focuses on height, but the same applies to width"""

    def setUp(self):
        # Deterministic behavior
        tc.manual_seed(0)
        tc.backends.cudnn.deterministic = True
        tc.backends.cudnn.benchmark = False

        # Use CUDA if available
        self.device = tc.device("cuda" if tc.cuda.is_available() else "cpu")
        logger.info(f"Using device: {self.device}")

        self.initial_resolution = 256
        self.H = 16051
        self.W = 17600

        self.src = tc.rand(1, 3, self.H, self.W, device=self.device)
        self.trg = tc.rand(1, 3, self.H, self.W, device=self.device)
        logger.info(f"Initial shapes - source: {self.src.shape} | target: {self.trg.shape}")

        # Resampling parameters (taken from production code)
        self.ratio = u.calculate_resampling_ratio(
            (self.W, self.W),
            (self.H, self.H),
            self.initial_resolution,
        )
        self.sigma = max(self.ratio - 1, 0.1)

        logger.info(f"Resampling ratio: {self.ratio} with Gaussian smoothing sigma: {self.sigma}")

    def test_resample_without_recompute(self):
        logger.info("TEST RESAMPLE (current implementation)")

        src_smooth = u.gaussian_smoothing(self.src, self.sigma)
        trg_smooth = u.gaussian_smoothing(self.trg, self.sigma)
        logger.info(f"Gaussian-smoothed shapes (src|trg): {src_smooth.shape} | {trg_smooth.shape}")

        out_src = u.resample(src_smooth, self.ratio)
        out_trg = u.resample(trg_smooth, self.ratio)
        logger.info(f"Resampled shapes (src|trg): {out_src.shape} | {out_trg.shape}")

        # Check for mismatch bug
        expected_h = int(round(self.H / self.ratio))
        self.assertEqual(out_src.shape[2], out_trg.shape[2])
        self.assertIn(out_src.shape[2], [expected_h, expected_h - 1, expected_h + 1])

    def test_resample_with_recompute(self):
        logger.info("TEST RESAMPLE WITH RECOMPUTE SCALE RATIO (possible suggestion)")

        src_smooth = u.gaussian_smoothing(self.src, self.sigma)
        trg_smooth = u.gaussian_smoothing(self.trg, self.sigma)
        logger.info(f"Gaussian-smoothed shapes (src|trg): {src_smooth.shape} | {trg_smooth.shape}")

        out_src = resample_modified(src_smooth, self.ratio)
        out_trg = resample_modified(trg_smooth, self.ratio)
        logger.info(f"Resampled shapes (src|trg): {out_src.shape} | {out_trg.shape}")

        # Check output dimensions
        expected_h = int(round(self.H / self.ratio))
        self.assertEqual(out_src.shape[2], out_trg.shape[2])
        self.assertEqual(out_src.shape[2], expected_h)  # height is still 255 here, so this assertion may fail

    def test_resample_to_explicit_size(self):
        logger.info("TEST RESAMPLE TO EXPLICIT SIZE (best bugfix)")

        resample_dim = (self.initial_resolution, round(self.W * (1 / self.ratio)))
        logger.info(f"Target size: {resample_dim}")

        src_smooth = u.gaussian_smoothing(self.src, self.sigma)
        trg_smooth = u.gaussian_smoothing(self.trg, self.sigma)
        logger.info(f"Gaussian-smoothed shapes (src|trg): {src_smooth.shape} | {trg_smooth.shape}")

        out_src = u.resample_tensor_to_size(src_smooth, resample_dim)
        out_trg = u.resample_tensor_to_size(trg_smooth, resample_dim)
        logger.info(f"Resampled shapes (src|trg): {out_src.shape} | {out_trg.shape}")

        # Check that the initial resolution matches the expected resampled height
        self.assertEqual(self.initial_resolution, round(self.H * (1 / self.ratio)))
        # Check output dimensions
        self.assertEqual(tuple(out_src.shape[2:]), resample_dim)
        self.assertEqual(tuple(out_trg.shape[2:]), resample_dim)


if __name__ == "__main__":
    unittest.main()

Slide dimensions at which this problem occurs

It happens only at specific, unfortunate slide dimensions. I ran a more extensive test on two slides:

| SRC W | SRC H | TRG W | TRG H | Ratio | W     | H     | Resampled H | Err |
|------:|------:|------:|------:|------:|------:|------:|------------:|:----|
| 88000 | 79744 | 87872 | 80256 | 0.2   | 17600 | 16051 | 255         | Yes |
| 88000 | 79744 | 87872 | 80256 | 0.2   | 17600 | 16052 | 256         | No  |
| 88000 | 79744 | 87872 | 80256 | 0.2   | 17600 | 16053 | 256         | No  |
| 88000 | 79744 | 87872 | 80256 | 0.2   | 17600 | 16054 | 256         | No  |
| 88000 | 79744 | 87872 | 80256 | 0.2   | 17600 | 16055 | 256         | No  |
| 88000 | 79744 | 87872 | 80256 | 0.1   | 8800  | 8026  | 256         | No  |
| 88000 | 79744 | 87872 | 80256 | 0.1   | 8800  | 8027  | 256         | No  |
| 88000 | 79744 | 87872 | 80256 | 0.1   | 8800  | 8028  | 256         | No  |
| 88000 | 79744 | 87872 | 80256 | 0.1   | 8800  | 8029  | 256         | No  |
| 88000 | 79744 | 87872 | 80256 | 0.1   | 8800  | 8030  | 256         | No  |
| 88000 | 79744 | 87872 | 80256 | 0.1   | 8800  | 8031  | 256         | No  |
| 88000 | 79744 | 87872 | 80256 | 0.1   | 8800  | 8032  | 256         | No  |
| 88000 | 79744 | 87872 | 80256 | 0.1   | 8800  | 8033  | 256         | No  |
| 88000 | 79744 | 87872 | 80256 | 0.1   | 8800  | 8034  | 256         | No  |
| 88000 | 79744 | 87872 | 80256 | 0.1   | 8800  | 8035  | 255         | Yes |
| 88000 | 79744 | 87872 | 80256 | 0.1   | 8800  | 8039  | 256         | No  |
| 88000 | 79744 | 87872 | 80256 | 0.1   | 8800  | 8040  | 255         | Yes |
| 88000 | 79744 | 87872 | 80256 | 0.15  | 13200 | 12038 | 256         | No  |
| 88000 | 79744 | 87872 | 80256 | 0.15  | 13200 | 12047 | 256         | No  |
| 88000 | 79744 | 87872 | 80256 | 0.15  | 13200 | 12048 | 255         | Yes |
| 88000 | 79744 | 87872 | 80256 | 0.15  | 13200 | 12049 | 256         | No  |
| 88000 | 79744 | 87872 | 80256 | 0.15  | 13200 | 12050 | 256         | No  |
| 88000 | 79744 | 87872 | 80256 | 0.15  | 13200 | 12051 | 256         | No  |
| 88000 | 79744 | 87872 | 80256 | 0.15  | 13200 | 12052 | 256         | No  |

Abbreviations:

  • SRC: source
  • TRG: target
  • W: width
  • H: height
  • Err: error
