Conversation
* make volumina request tiles more lazy by first submitting to a queue
* tasks are submitted up to a limit to the lazyflow request pool
* mechanism for cancelling/removing unsubmitted requests from the queue
* compared to before, ilastik will use far less memory (maybe half)

This enables much better interactive usage of ilastik, as the number of submitted requests stays limited. Changing conditions (scrolling, dirtiness) trigger cancellation of queued requests that are no longer relevant.

Fixes ilastik#135
Fixes ilastik/ilastik#1735
Fixes ilastik/ilastik#1376, at least to a degree
Codecov Report

```
@@            Coverage Diff             @@
##             main     #339      +/-   ##
==========================================
+ Coverage   36.33%   36.90%   +0.57%
==========================================
  Files         108      109       +1
  Lines       11439    11565     +126
==========================================
+ Hits         4156     4268     +112
- Misses       7283     7297      +14
```
emilmelnikov left a comment:
Wow, this is a neat solution to a long-standing request contention/cancellation problem!
* interface for tile submission slightly changed: getTiles takes a second argument for the viewport Rect
* test_lazy: not really a fix, but stricter testing that helped debug some issues with cancellation
* note on `test_tiling.py`: it seems to be necessary to clean up the threadpool, otherwise the test fails
* FIXUP: skip tests for LazyflowRequestBuffer if there's no lazyflow; patch the renderer pool; the sequence of tests is probably different now, so it used to work coincidentally
`time.time()` is not guaranteed to be unique.
Almost everything (except `submit`) was already operating under a lock, so the locking overhead of `PriorityQueue` can be omitted.

Co-Authored-By: Emil Melnikov <emilmelnikov@users.noreply.github.com>
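Since almost all queue operations already run under the buffer's own lock, a plain `heapq` plus a monotonically increasing counter does the job; the counter also sidesteps the `time.time()` uniqueness issue by giving every entry a distinct tie-breaker. A minimal sketch (names are illustrative, not the actual volumina code):

```python
import heapq
import itertools
import threading


class PrioritizedBuffer:
    """Sketch: plain heapq guarded by one lock, with a strictly increasing
    counter as tie-breaker, so equal priorities stay FIFO and the task
    objects themselves are never compared."""

    def __init__(self):
        self._lock = threading.Lock()
        self._heap = []
        self._counter = itertools.count()  # distinct tie-breaker per entry

    def push(self, priority, task):
        with self._lock:
            heapq.heappush(self._heap, (priority, next(self._counter), task))

    def pop(self):
        with self._lock:
            if not self._heap:
                return None
            return heapq.heappop(self._heap)[2]
```

`push((-120,), task)` followed by `push((-100,), other)` pops the `(-120,)` entry first; two pushes with equal priority come back in insertion order.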
32167c7 to 38301bd
It's bonkers how much nicer this feels. One unfortunate downside is that rendering requests for labels are now also subject to lazyflow queuing, which I think they were pretty much independent of before (?). Now newly drawn labels can take a while to show up, and in my tryout they usually showed up after the updated predictions (even within the same tile, visible in the last yellow brush stroke in the video). Feels a bit unintuitive, but I'd be happy to accept it for the speedup. I guess it might even be useful as feedback for the user that maybe they should consider not drawing more labels 😁

labels-after-predictions.mp4
For this there is #338: with that we can prioritize, e.g., highest for raw, then labels, then the rest.
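If tuple priorities sort ascending (as in a min-heap), such a layer ordering could look like this; the concrete values are made up for illustration, not taken from #338:

```python
# Hypothetical per-layer priorities: lower tuples sort first in a min-heap.
RAW, LABELS, OTHER = (-120,), (-110,), (-100,)

queued = [(OTHER, "prediction tile"), (RAW, "raw tile"), (LABELS, "label tile")]
queued.sort(key=lambda entry: entry[0])

# Raw data would be fetched first, then labels, then everything else.
order = [name for _, name in queued]
```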
> request_buffer = buffer = LazyflowRequestBuffer(1)
> finished_evt = install_finish_even(buffer, "decr", n_calls=1)

```suggestion
request_buffer = LazyflowRequestBuffer(1)
finished_evt = install_finish_even(request_buffer, "decr", n_calls=1)
```
not sure what this is for
cooooooopy paste :)
> request_buffer = buffer = LazyflowRequestBuffer(1)
> finished_evt = install_finish_even(buffer, "decr", n_calls=2)

```suggestion
request_buffer = LazyflowRequestBuffer(1)
finished_evt = install_finish_even(request_buffer, "decr", n_calls=2)
```
> # submit with higher priority for good measure
> request_buffer.submit(
>     waiting_func2,
>     priority=(-120,),

```suggestion
priority=(-120,),  # higher prio than context.priority (-100)
```
if I understand correctly..?
Yeah well, before I called it default_context I had those comments, too. But I figured it would be much clearer to establish a set of default values that are used in all the tests, except for the one that should be varied. But I guess this was not obvious...
> default_context.priority,
> viewport_ref=default_context.view_port,
> stack_id=default_context.stack_id,
> tile_no=1,

```suggestion
tile_no=1,  # not context.tile_no (which is 0)
```
I did a bunch of renaming in this file for my reading, feel free to take or discard any of these:
- default_context --> context
- finished_evt --> buffer_finished_evt
- default_waiting_func --> buffer_pre_running_func
- test_requests_are_cancelled_bc_keep_tiles --> test_clear_requests_for_other_tile
- test_requests_are_cancelled_bc_keep_different_stack --> test_clear_requests_for_other_stack
- test_requests_other_vp_not_cancelled --> test_clear_keeps_requests_for_other_vp
- test_requests_are_cancelled_bc_priority --> test_clear_duplicates_of_keep_tiles (I couldn't see the role priority plays anywhere...)
> if USE_LAZYFLOW_THREADPOOL:
>     from volumina.utility.lazyflowRequestBuffer import LazyflowRequestBuffer
>     from lazyflow.request import Request

```suggestion
```

duplicate of import above
> renderer_pool = LazyflowRequestBuffer(Request.global_thread_pool.num_workers)
>
> def clear_threadpool_vp(vp: "TileProvider", stack_id: StackId, keep_tiles: list[int]):
If you feel like doing some refactoring, the renderer instantiations and two functions here look like they want to be methods of the respective threadpool, with a common interface abc...
I think in the context of this PR it's fine to keep the actual threadpools hidden. But will consider this refactor for #340, as there this would be definitely motivated.
> self._cleared_tasks += 1
> self._queue = []
>
> def clear_vp_res(self, viewport: "TileProvider", stack_id: StackId, keep_tiles: list[int]):
What's a "res"?
Maybe "clear_queue_outside"
honestly don't remember. maybe I'll go full verbose clear_non_submitted_tiles_outside_field_of_view :)
> # task_o = tmp_queue.get((stack_id, task.tile_no))
> # if task_o and task_o < task:

```suggestion
```
> # Remove task outside current 2d slice from the queue
> if task.vp == viewport and task.stack_id != stack_id:
>     task.cancel()
>     self._cleared_tasks += 1
>     continue
>
> # Remove task in the current slice, but no longer visible (not in keep_tiles)
> if task.vp == viewport and task.stack_id == stack_id and task.tile_no not in keep_tiles:
>     task.cancel()
>     self._cleared_tasks += 1
>     continue

```suggestion
# Remove
# - tasks outside current 2d slice from the queue
# - tasks in the current slice, but no longer visible (not in keep_tiles)
if task.stack_id != stack_id or task.tile_no not in keep_tiles:
    task.cancel()
    self._cleared_tasks += 1
    continue
```
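One way to convince oneself that the suggested simplification is safe, assuming (as the suggestion implicitly does) that entries with a different `vp` are handled before this branch, is to enumerate the remaining cases; the function names below are made up for this check:

```python
from itertools import product


def original(stack_matches, in_keep_tiles):
    # The two branches above, with task.vp == viewport taken as already true.
    if not stack_matches:
        return True  # outside current 2d slice
    if stack_matches and not in_keep_tiles:
        return True  # in slice, but no longer visible
    return False


def simplified(stack_matches, in_keep_tiles):
    # The single suggested condition.
    return not stack_matches or not in_keep_tiles


# All four combinations agree.
for stack_matches, in_keep_tiles in product([True, False], repeat=2):
    assert original(stack_matches, in_keep_tiles) == simplified(stack_matches, in_keep_tiles)
```

Note the assumption matters: without the `task.vp == viewport` guard, the simplified condition would also cancel entries belonging to other viewports.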
Summary:
* remove superfluous buffer variable in tests
* add more comments to tests that deviate from the default_context
* remove superfluous import of Request
* rename clear_threadpool_vp -> clear_non_submitted_tiles_outside_field_of_view
* simplify boolean logic in cancel function
* rename finished_evt -> buffer_finished_evt
* get rid of the only warning in volumina tests?!

Co-authored-by: Benedikt Best <63287233+btbest@users.noreply.github.com>
eb9240c to ff73255
ilastik faces performance problems with bigger data, especially when zooming out. The background is the nature of how volumina requests data from ilastik: whatever is visible will be requested. While this is super cool when looking at small parts of images, this laziness is too eager for zoomed-out views on larger data. As a result, the user would experience an unresponsive UI for a long time, with often tens of thousands of requests being submitted simultaneously. This is also a drain on RAM, as intermediate results of waiting requests must be stored there.
This PR adds the `LazyflowRequestBuffer` class, which closely interacts with `TileProvider`. `TileProvider` will not directly submit its requests for tiles to the request pool, but goes via the lazyflow request buffer. The buffer takes care of submitting the requests up to a maximum number. This greatly reduces the number of simultaneously running requests.

This PR furthermore addresses "cancellation". Running tasks are never cancelled, but with the queue inside the request buffer it is possible to remove unsubmitted requests and dramatically speed up ilastik in interactive mode while scrolling through z, time, or in plane.
As a result the UI stays more or less interactive all the time. Scrolling/zooming is not an issue anymore. In my benchmark, the memory consumption was reduced to half.
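The pattern can be sketched roughly like this (class name, signatures, and the `ThreadPoolExecutor` stand-in are illustrative only; the real `LazyflowRequestBuffer` talks to lazyflow's request pool and tracks more state): tiles are queued by priority, only a bounded number are handed to the worker pool at once, and queued-but-not-yet-submitted entries can be dropped when conditions change.

```python
import heapq
import threading
from concurrent.futures import ThreadPoolExecutor


class RequestBuffer:
    """Illustrative sketch: bound the number of in-flight tasks and
    allow clearing queued (not yet submitted) entries."""

    def __init__(self, pool, max_in_flight):
        self._pool = pool
        self._max = max_in_flight
        self._lock = threading.Lock()
        self._queue = []  # heap of (priority, seq, func)
        self._in_flight = 0
        self._seq = 0  # tie-breaker so funcs are never compared

    def submit(self, priority, func):
        with self._lock:
            heapq.heappush(self._queue, (priority, self._seq, func))
            self._seq += 1
            self._maybe_submit()

    def clear(self, keep):
        # Drop queued entries whose func fails the keep predicate;
        # running tasks are never cancelled.
        with self._lock:
            kept = [entry for entry in self._queue if keep(entry[2])]
            heapq.heapify(kept)
            self._queue = kept

    def _maybe_submit(self):
        # Called with the lock held: hand work to the pool up to the limit.
        while self._queue and self._in_flight < self._max:
            _, _, func = heapq.heappop(self._queue)
            self._in_flight += 1
            self._pool.submit(self._run, func)

    def _run(self, func):
        try:
            func()
        finally:
            with self._lock:
                self._in_flight -= 1
                self._maybe_submit()
```

The `keep` predicate here stands in for the viewport/stack/tile checks that the PR performs when clearing; only queued entries are dropped, never running ones.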
Benchmark results (currently my machine only, but in earlier versions I could see similar results on win/linux)
Benchmark on linux (alienware). Could not scroll-live+80 as it crashed (out of memory) on main, so I reduced to +60...
I don't really have the time to also run the tests on a Windows machine, but there is nothing that would lead me to expect different behavior.
Note: the more scrolling is done, the greater the speedup, because of omitted computations.
Demo: scrolling through cremi with live update on (I wouldn't recommend trying this on main, it goes out of RAM on my machine; note this also uses #338).
requestbuffer.mp4
Bottom line: this does not speed up any individual computation, but by being smarter about what to compute, it results in a much more responsive experience and overall reduction in computations.
Todos:
Edit: Updated with benchmark results on linux.