Add sycl_khr_free_function_commands extension #922
slawekptak wants to merge 77 commits into KhronosGroup:main from
This extension provides an alternative mechanism for submitting commands to a device via free-functions that require developers to opt-in to the creation of event objects. It also proposes alternative names for several commands (e.g., launch) and simplifies some concepts (e.g., by removing the need for the nd_range class).
Previous "0 or more" wording only made sense when reductions could be optionally provided to functions like parallel_for; now that there are dedicated *_reduce functions, at least one reduction is required.
"is" is more consistent with ISO C++ wording.
Co-authored-by: Greg Lueck <gregory.m.lueck@intel.com>
There is no need to constrain T here because T must be device-copyable in order to construct the accessor passed as an argument.
Renaming sycl::nd_item is not a necessary part of the API redesign for submitting work, so it should be moved to its own extension. This will also give us more time to consider the design and naming of any proposed replacement(s), including how they should interact with new functionality proposed in other KHRs.
There are currently no backends that define interop for reductions, so we can remove these functions for now. If we decide later that these functions are necessary, we can release a revision of the KHR.
Co-authored-by: Andrey Alekseenko <al42and@gmail.com>
The WG discussed this, and feels we need a solution for local memory in this KHR.
Regarding local memory: to me, it seems like the least invasive strategy (as in, it doesn't depend on many other changes) that fits with the current specification of this extension would be using requirements for local accessors - since it's a natural fit with how non-local accessors are proposed to be handled. A future extension for e.g. static work group memory could then make that superfluous where it applies.
Revamp the proposed specification to provide convenience APIs that are similar to CUDA's `cudaEventRecord` and `cudaStreamWaitEvent` because this is the immediate request from our customer. I think we do still want to add a `record_event` property, but I think we could add that separately as part of the KHR being proposed in KhronosGroup/SYCL-Docs#922, or as a separate oneapi extension based on that KHR.
Agree with @PeterTh, would like to keep the change of this PR "minimal" so we can merge it and then we can discuss new features. I want to avoid the feature creep problem. This PR is immensely useful as is, so no need to do everything in one go :)
The function names for memory operations now follow the "enqueue_*" pattern, to indicate that these operations are added to the queue and not executed immediately.
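To illustrate what the "enqueue_*" naming is meant to convey, here is a minimal sketch using a toy queue. Everything in it is an assumption made for illustration: `toy_queue`, the `enqueue_copy` signature, and `run_enqueue_demo` are hypothetical and are not the SYCL API or the signatures proposed in this PR. It only shows that an enqueued operation is recorded and runs later, not at the call site.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <vector>

// Toy stand-in for a queue (hypothetical; NOT sycl::queue).
struct toy_queue {
    std::vector<std::function<void()>> pending;

    // Run every enqueued command in submission order, then clear the list.
    void wait() {
        for (auto& cmd : pending) cmd();
        pending.clear();
    }
};

// Hypothetical free function following the "enqueue_*" naming pattern.
// It only records the copy; nothing is copied until the queue runs it.
inline void enqueue_copy(toy_queue& q, const int* src, int* dst,
                         std::size_t count) {
    q.pending.push_back(
        [=] { std::memcpy(dst, src, count * sizeof(int)); });
}

// Returns true if the copy was deferred until wait() and then performed.
inline bool run_enqueue_demo() {
    toy_queue q;
    int src[3] = {1, 2, 3};
    int dst[3] = {0, 0, 0};

    enqueue_copy(q, src, dst, 3);
    bool untouched_before_wait = (dst[0] == 0 && dst[2] == 0);

    q.wait();
    bool copied_after_wait = (dst[0] == 1 && dst[2] == 3);

    return untouched_before_wait && copied_after_wait;
}
```

A real queue may of course start executing before the host waits; the toy model defers everything to `wait()` only to make the distinction visible.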
- Changed the return type of the functions to void (signal_event should be used to track completion).
- Added the signal_event, wait_event and wait_events structs to be used with the requirements object.
- Added the following functions: make_event, enqueue_signal_event, enqueue_wait_event, enqueue_wait_events, enqueue_barrier.
- Removed the following functions: command_barrier, event_barrier.
- Updated the code example.
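The opt-in event model described in this commit can be sketched roughly as follows. This is a toy model under stated assumptions, not the proposed specification: the `toy` namespace, the `queue` and `event` types, `enqueue_fill`, `demo`, and the signature given to `enqueue_signal_event` here are all hypothetical. Only the idea is taken from the commit message: commands return void, and completion is tracked by explicitly requesting that an event be signaled.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

namespace toy {  // illustrative model only; not the sycl namespace

// Toy in-order queue: commands run in submission order on wait().
struct queue {
    std::vector<std::function<void()>> pending;
    void wait() {
        for (auto& cmd : pending) cmd();
        pending.clear();
    }
};

// In this model an event is just a shared completion flag.
struct event {
    std::shared_ptr<bool> done = std::make_shared<bool>(false);
    bool is_complete() const { return *done; }
};

// Commands return void, so no event object is created by default.
inline void enqueue_fill(queue& q, std::vector<int>& data, int value) {
    q.pending.push_back([&data, value] {
        for (auto& x : data) x = value;
    });
}

// Opt-in: the caller explicitly asks for an event to be signaled once
// the commands enqueued before it have run (assumed signature).
inline void enqueue_signal_event(queue& q, const event& e) {
    auto flag = e.done;
    q.pending.push_back([flag] { *flag = true; });
}

// Returns true if the event completes only after the queue has run.
inline bool demo() {
    queue q;
    std::vector<int> data(4, 0);
    event e;

    enqueue_fill(q, data, 7);
    enqueue_signal_event(q, e);
    bool complete_before_wait = e.is_complete();

    q.wait();
    return !complete_before_wait && e.is_complete() && data[3] == 7;
}

}  // namespace toy
```

In this sketch, callers that never need to track completion simply skip the `enqueue_signal_event` call and no event is created for them.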
…gnal_event function.
Co-authored-by: Greg Lueck <gregory.m.lueck@intel.com>
Co-authored-by: Greg Lueck <gregory.m.lueck@intel.com>
Co-authored-by: Greg Lueck <gregory.m.lueck@intel.com>
Co-authored-by: Greg Lueck <gregory.m.lueck@intel.com>
Thanks a bunch for the efforts to push this to the finish line. My recollection is that prefetch_host is sparingly used ATM, and the few apps that use it (qmcpack, exachem, etc.) do so at production scale. My two cents is towards free function standardization of this API just to reduce the cycles over non-free
This is a new, follow-up PR to #644, originally created by John Pennycook. All the future work related to that PR will be continued here. The reason for creating a new PR is that the PR ownership transfer is required.