Codecov Report ❌ Patch coverage is
Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main     #989      +/-   ##
==========================================
+ Coverage   92.07%   92.11%   +0.03%
==========================================
  Files          48       48
  Lines        7445     7440       -5
==========================================
- Hits         6855     6853       -2
+ Misses        590      587       -3
```
melonora left a comment:
Hey @ArneDefauw, thanks for the PR.
I see that rechunking is currently always performed, which could potentially be expensive.
Would it perhaps be better to import `_check_regular_chunks` and check, before the for loop that does the rechunking, whether rechunking is required in the first place? That way you can also avoid the second for loop:
```python
from dask.array.core import _check_regular_chunks

for scale in result:
    data = result[scale]["image"].data
    chunks = data.chunks
    if not _check_regular_chunks(chunks):
        data = data.rechunk(data.chunksize)
        if not _check_regular_chunks(data.chunks):
            raise ValueError(
                f"Chunks are not regular for {scale} of the queried data: {chunks} "
                "and could also not be rechunked regularly. Please report this bug."
            )
        result[scale]["image"].data = data
```

I would also avoid asserts: they are only for development and testing. We should start clearing the remaining asserts in src, and I do so whenever I touch code that contains them.
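For readers unfamiliar with dask's notion of "regular" chunks, here is a minimal pure-Python sketch of the criterion that `_check_regular_chunks` enforces: along each dimension, every chunk must have the same size, except that the final chunk may be smaller. The helper name `chunks_are_regular` is hypothetical; the real check lives in `dask.array.core`.

```python
def chunks_are_regular(chunks):
    # A chunking like ((30, 30, 30, 10), (50, 50)) is "regular": within each
    # dimension all chunks share one size, except the last, which may be smaller.
    for dim in chunks:
        if len(dim) <= 1:
            continue  # a single chunk per dimension is trivially regular
        first = dim[0]
        if any(size != first for size in dim[:-1]):
            return False  # interior chunks differ in size
        if dim[-1] > first:
            return False  # the trailing chunk may only be equal or smaller
    return True

# ((20, 30, 50),) mixes interior sizes, so rechunking would be needed there.
print(chunks_are_regular(((30, 30, 30, 10),)))  # True
print(chunks_are_regular(((20, 30, 50),)))      # False
```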
Hi @melonora, thanks for the review! I think, from what I see in the dask codebase, it is not necessarily a no-op: `rechunk` does not first check whether the chunking stays the same and return early if it does.
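One way to sidestep the question of whether `rechunk` short-circuits would be to compute the chunking that rechunking to `data.chunksize` would yield, and skip the call when the array's chunks already match. A minimal sketch; `expected_chunks` is a hypothetical helper, not dask API:

```python
def expected_chunks(chunksize, shape):
    # Expand a per-dimension chunk size into the explicit chunks tuple that
    # rechunking to that chunksize would produce, e.g. size 30 over extent
    # 100 -> (30, 30, 30, 10).
    out = []
    for size, extent in zip(chunksize, shape):
        full, rest = divmod(extent, size)
        out.append((size,) * full + ((rest,) if rest else ()))
    return tuple(out)

print(expected_chunks((30, 50), (100, 100)))  # ((30, 30, 30, 10), (50, 50))
```

With something like this, the loop could call `data.rechunk(...)` only when `data.chunks != expected_chunks(data.chunksize, data.shape)`, so no graph rewrite is paid for arrays that are already regularly chunked.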
Will let @LucaMarconato have a quick look, but I think this is good to merge, thanks!
Fix for #988.