Conversation
* make volumina request tiles more lazy by first submitting to a queue
* tasks are submitted up to a limit to the lazyflow request pool
* mechanism for cancelling/removing unsubmitted requests from the queue
* compared to before, ilastik will use far less memory (maybe half)

This enables much better interactive usage of ilastik, as the number of submitted requests stays limited. Changing conditions (scrolling, dirtiness) trigger cancellation of queued requests that are no longer relevant.

Fixes ilastik#135
Fixes ilastik/ilastik#1735
Fixes ilastik/ilastik#1376, at least to a degree
Codecov Report

```
@@            Coverage Diff             @@
##             main     #339      +/-   ##
==========================================
+ Coverage   36.33%   36.90%   +0.57%
==========================================
  Files         108      109       +1
  Lines       11439    11565     +126
==========================================
+ Hits         4156     4268     +112
- Misses       7283     7297      +14
```
emilmelnikov left a comment:
Wow, this is a neat solution to a long-standing request contention/cancellation problem!
* interface for tile submission slightly changed: getTiles takes a second argument for the viewport Rect
* test_lazy: not really a fix, but stricter testing that helped debug some issues with cancellation
* note on `test_tiling.py`: it seems to be necessary to clean up the threadpool, otherwise the test fails
* FIXUP: skip tests for LazyflowRequestBuffer if there's no lazyflow; patch the renderer pool; the sequence of tests is probably different now, so it used to work coincidentally
`time.time()` is not guaranteed to be unique.
Almost everything (except `submit`) was already operating under a lock, so the locking overhead of `PriorityQueue` can be omitted.

Co-Authored-By: Emil Melnikov <emilmelnikov@users.noreply.github.com>
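Since almost all queue operations already run under the buffer's own lock, a plain `heapq` plus a monotonically increasing counter does the job; the counter also sidesteps the `time.time()` uniqueness issue by giving every entry a distinct tie-breaker. A minimal sketch (names are illustrative, not the actual volumina code):

```python
import heapq
import itertools
import threading


class PrioritizedBuffer:
    """Sketch: plain heapq guarded by one lock, with a strictly increasing
    counter as tie-breaker, so equal priorities stay FIFO and the task
    objects themselves are never compared."""

    def __init__(self):
        self._lock = threading.Lock()
        self._heap = []
        self._counter = itertools.count()  # distinct tie-breaker per entry

    def push(self, priority, task):
        with self._lock:
            heapq.heappush(self._heap, (priority, next(self._counter), task))

    def pop(self):
        with self._lock:
            if not self._heap:
                return None
            return heapq.heappop(self._heap)[2]
```

`push((-120,), task)` followed by `push((-100,), other)` pops the `(-120,)` entry first; two pushes with equal priority come back in insertion order.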
32167c7 to 38301bd
It's bonkers how much nicer this feels. One unfortunate downside is that rendering requests for labels are now also subject to lazyflow queuing, which I think they were pretty much independent of before (?). Now newly drawn labels can take a while to show up, and in my tryout they usually showed up after the updated predictions (even within the same tile, visible in the last yellow brush stroke in the video). Feels a bit unintuitive, but I'd be happy to accept it for the speedup. I guess it might even be useful as feedback for the user that maybe they should consider not drawing more labels 😁

labels-after-predictions.mp4
For this there is #338: with that we can prioritize, e.g., highest for raw, then labels, then the rest.
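If tuple priorities sort ascending (as in a min-heap), such a layer ordering could look like this; the concrete values are made up for illustration, not taken from #338:

```python
# Hypothetical per-layer priorities: lower tuples sort first in a min-heap.
RAW, LABELS, OTHER = (-120,), (-110,), (-100,)

queued = [(OTHER, "prediction tile"), (RAW, "raw tile"), (LABELS, "label tile")]
queued.sort(key=lambda entry: entry[0])

# Raw data would be fetched first, then labels, then everything else.
order = [name for _, name in queued]
```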
> request_buffer = buffer = LazyflowRequestBuffer(1)
> finished_evt = install_finish_even(buffer, "decr", n_calls=1)

```suggestion
request_buffer = LazyflowRequestBuffer(1)
finished_evt = install_finish_even(request_buffer, "decr", n_calls=1)
```
not sure what this is for
cooooooopy paste :)
> request_buffer = buffer = LazyflowRequestBuffer(1)
> finished_evt = install_finish_even(buffer, "decr", n_calls=2)

```suggestion
request_buffer = LazyflowRequestBuffer(1)
finished_evt = install_finish_even(request_buffer, "decr", n_calls=2)
```
> # submit with higher priority for good measure
> request_buffer.submit(
>     waiting_func2,
>     priority=(-120,),

```suggestion
priority=(-120,),  # higher prio than context.priority (-100)
```
if I understand correctly..?
Yeah well, before I called it default_context I had those comments, too. But I figured it would be much clearer to establish a set of default values that are used in all the tests, except for the one that should be varied. But I guess this was not obvious...
> default_context.priority,
> viewport_ref=default_context.view_port,
> stack_id=default_context.stack_id,
> tile_no=1,

```suggestion
tile_no=1,  # not context.tile_no (which is 0)
```
I did a bunch of renaming in this file for my reading, feel free to take or discard any of these:
- default_context --> context
- finished_evt --> buffer_finished_evt
- default_waiting_func --> buffer_pre_running_func
- test_requests_are_cancelled_bc_keep_tiles --> test_clear_requests_for_other_tile
- test_requests_are_cancelled_bc_keep_different_stack --> test_clear_requests_for_other_stack
- test_requests_other_vp_not_cancelled --> test_clear_keeps_requests_for_other_vp
- test_requests_are_cancelled_bc_priority --> test_clear_duplicates_of_keep_tiles (I couldn't see the role priority plays anywhere...)
> if USE_LAZYFLOW_THREADPOOL:
>     from volumina.utility.lazyflowRequestBuffer import LazyflowRequestBuffer
>     from lazyflow.request import Request

```suggestion
```

duplicate of import above
> renderer_pool = LazyflowRequestBuffer(Request.global_thread_pool.num_workers)
>
> def clear_threadpool_vp(vp: "TileProvider", stack_id: StackId, keep_tiles: list[int]):
If you feel like doing some refactoring, the renderer instantiations and two functions here look like they want to be methods of the respective threadpool, with a common interface abc...
I think in the context of this PR it's fine to keep the actual threadpools hidden. But will consider this refactor for #340, as there this would be definitely motivated.
> self._cleared_tasks += 1
> self._queue = []
>
> def clear_vp_res(self, viewport: "TileProvider", stack_id: StackId, keep_tiles: list[int]):
What's a "res"?
Maybe "clear_queue_outside"
honestly don't remember. maybe I'll go full verbose clear_non_submitted_tiles_outside_field_of_view :)
> # task_o = tmp_queue.get((stack_id, task.tile_no))
> # if task_o and task_o < task:

```suggestion
```
> # Remove task outside current 2d slice from the queue
> if task.vp == viewport and task.stack_id != stack_id:
>     task.cancel()
>     self._cleared_tasks += 1
>     continue
>
> # Remove task in the current slice, but no longer visible (not in keep_tiles)
> if task.vp == viewport and task.stack_id == stack_id and task.tile_no not in keep_tiles:
>     task.cancel()
>     self._cleared_tasks += 1
>     continue

```suggestion
# Remove
# - tasks outside current 2d slice from the queue
# - tasks in the current slice, but no longer visible (not in keep_tiles)
if task.stack_id != stack_id or task.tile_no not in keep_tiles:
    task.cancel()
    self._cleared_tasks += 1
    continue
```
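One way to convince oneself that the suggested simplification is safe, assuming (as the suggestion implicitly does) that entries with a different `vp` are handled before this branch, is to enumerate the remaining cases; the function names below are made up for this check:

```python
from itertools import product


def original(stack_matches, in_keep_tiles):
    # The two branches above, with task.vp == viewport taken as already true.
    if not stack_matches:
        return True  # outside current 2d slice
    if stack_matches and not in_keep_tiles:
        return True  # in slice, but no longer visible
    return False


def simplified(stack_matches, in_keep_tiles):
    # The single suggested condition.
    return not stack_matches or not in_keep_tiles


# All four combinations agree.
for stack_matches, in_keep_tiles in product([True, False], repeat=2):
    assert original(stack_matches, in_keep_tiles) == simplified(stack_matches, in_keep_tiles)
```

Note the assumption matters: without the `task.vp == viewport` guard, the simplified condition would also cancel entries belonging to other viewports.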
Summary:
* remove superfluous buffer variable in tests
* add more comments to tests that deviate from the default_context
* remove superfluous import of Request
* rename clear_threadpool_vp -> clear_non_submitted_tiles_outside_field_of_view
* simplify boolean logic in cancel function
* rename finished_evt -> buffer_finished_evt
* get rid of the only warning in volumina tests?!

Co-authored-by: Benedikt Best <63287233+btbest@users.noreply.github.com>
eb9240c to ff73255
ilastik faces performance problems with bigger data, especially when zooming out. The background is the nature of how volumina requests data from ilastik: whatever is visible will be requested. While this is super cool when looking at small parts of images, this laziness is too eager for zoomed-out views on larger data. As a result, the user would experience an unresponsive UI for a long time, with often tens of thousands of requests being submitted simultaneously. This is also a drain on RAM, as intermediate results of waiting requests must be stored there.
This PR adds the `LazyflowRequestBuffer` class, which closely interacts with `TileProvider`. `TileProvider` will not directly submit its requests for tiles to the request pool, but goes via the lazyflow request buffer. The buffer takes care of submitting the requests up to a maximum number. This greatly reduces the number of simultaneously running requests.

This PR furthermore addresses "cancellation". Running tasks are never cancelled, but with the queue inside the request buffer it is possible to remove unsubmitted requests and dramatically speed up ilastik in interactive mode while scrolling through z, time, or in plane.
As a result the UI stays more or less interactive all the time. Scrolling/zooming is not an issue anymore. In my benchmark, the memory consumption was reduced to half.
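The pattern can be sketched roughly like this (class name, signatures, and the `ThreadPoolExecutor` stand-in are illustrative only; the real `LazyflowRequestBuffer` talks to lazyflow's request pool and tracks more state): tiles are queued by priority, only a bounded number are handed to the worker pool at once, and queued-but-not-yet-submitted entries can be dropped when conditions change.

```python
import heapq
import threading
from concurrent.futures import ThreadPoolExecutor


class RequestBuffer:
    """Illustrative sketch: bound the number of in-flight tasks and
    allow clearing queued (not yet submitted) entries."""

    def __init__(self, pool, max_in_flight):
        self._pool = pool
        self._max = max_in_flight
        self._lock = threading.Lock()
        self._queue = []  # heap of (priority, seq, func)
        self._in_flight = 0
        self._seq = 0  # tie-breaker so funcs are never compared

    def submit(self, priority, func):
        with self._lock:
            heapq.heappush(self._queue, (priority, self._seq, func))
            self._seq += 1
            self._maybe_submit()

    def clear(self, keep):
        # Drop queued entries whose func fails the keep predicate;
        # running tasks are never cancelled.
        with self._lock:
            kept = [entry for entry in self._queue if keep(entry[2])]
            heapq.heapify(kept)
            self._queue = kept

    def _maybe_submit(self):
        # Called with the lock held: hand work to the pool up to the limit.
        while self._queue and self._in_flight < self._max:
            _, _, func = heapq.heappop(self._queue)
            self._in_flight += 1
            self._pool.submit(self._run, func)

    def _run(self, func):
        try:
            func()
        finally:
            with self._lock:
                self._in_flight -= 1
                self._maybe_submit()
```

The `keep` predicate here stands in for the viewport/stack/tile checks that the PR performs when clearing; only queued entries are dropped, never running ones.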
Benchmark results (currently my machine only, but in earlier versions I could see similar results on win/linux)
Benchmark on linux (alienware). Could not scroll-live+80 as it crashed (out of memory) on main, so I reduced to +60...
I don't really have the time to also run the tests on a Windows machine, but there is nothing that would lead me to expect different behavior.
Note: the more scrolling is done, the greater the speedup, because of omitted computations.
Demo: scrolling through cremi with live update on (I wouldn't recommend trying this on main, it goes out of RAM on my machine; note this also uses #338).
requestbuffer.mp4
Bottom line: this does not speed up any individual computation, but by being smarter about what to compute, it results in a much more responsive experience and overall reduction in computations.
Todos:
Edit: Updated with benchmark results on linux.