I should also mention that this could allow us to take multiple callables in

I think it looks fine. The only concern is the terminology ...
b31e2e1 to 491b6c5
We now template the `ttg::device::Task` on the `ExecutionSpace` so that we can determine whether it's a host or device task based on the space. We can then optimize away the select and kernel-wait suspension points. We could remove the send suspension point but we use coroutines for storing the final sends anyway and we don't have access to the task return type in `ttg::device::send()`. This allows tasks to be written once for both host and device without duplicating much of the code. Host tasks that are not coroutines will continue to be supported. Signed-off-by: Joseph Schuchart <joseph.schuchart@stonybrook.edu>
491b6c5 to 91e1d3d
I'm not sure about that. We probably don't want to make coroutine tasks the default. So what are they? For me, these tasks are still device tasks but now we have the ability to execute the same task structure on the host. I agree that the naming is a bit funny though.

That being said,
I agree we don't want to make
To me the word "task" already implies asynchrony of execution, by decoupling the statement of what to do from when to do it (i.e. after submitting a task there is no way to ensure that the work has completed unless you await the result/completion). So
I like Perhaps what we need is to make cotasks and tasks more symmetric?
Device ID alone does not uniquely identify a device. The host always has ID 0. Signed-off-by: Joseph Schuchart <joseph.schuchart@stonybrook.edu>
Host tasks only suspend at the very end. We perform all communication and then destroy the coroutine handle because there is no reason to keep it around. This may enable compiler optimizations and enables backends that otherwise do not handle device tasks to work with host-enabled device tasks. Yes, this needs renaming. Signed-off-by: Joseph Schuchart <joseph.schuchart@stonybrook.edu>
Proposal from the EPEXA meeting:
This was decided at the December '24 EPEXA meeting. We need to figure out what to do with the existing resumable task. Signed-off-by: Joseph Schuchart <joseph.schuchart@stonybrook.edu>
#include "chrono.h"

#if defined(TTG_HAVE_CUDA)
#if defined(CHAIN_CUDA)
What's the difference between CHAIN_CUDA/HIP and ENABLE_CUDA/HIP?
#include "../devblas_helper.h"

#if (defined(TTG_ENABLE_CUDA) || defined(TTG_ENABLE_HIP))
#if (defined(TTG_ENABLE_CUDA) || defined(TTG_ENABLE_HIP) || defined(TTG_ENABLE_DEV_HOST))
Naming is inconsistent; TTG_ENABLE_CPU?
#include "cuda_kernel.h"

#if defined(TTG_HAVE_CUDA)
Shouldn't that be TTG_ENABLE_CUDA or CHAIN_CUDA?
template <typename Key, ttg::Runtime Runtime = ttg::ttg_runtime>
inline detail::send_t sendk(std::size_t i, const Key& key) {
auto *terminal_ptr = ttg::detail::get_out_terminal<Key, void>(i, "ttg::device::send(i, key, value)");
return detail::send_t{detail::sendk_coro(key, *terminal_ptr)};
The key parameter moves from the second position to the first depending on the level.
a2878ce to a49eb91
We now template the `ttg::device::Task` on the `ExecutionSpace` so that we can determine whether it's a host or device task based on the space. We can then optimize away the select and kernel-wait suspension points. We could remove the send suspension point but we use coroutines for storing the final sends anyway and we don't have access to the task return type in `ttg::device::send()`.

This allows tasks to be written once for both host and device without duplicating much of the code. Host tasks that are not coroutines will continue to be supported.