Add AUTO in PTG which dynamically allocate suitable buffer to receive…#759
Add AUTO in PTG which dynamically allocate suitable buffer to receive…#759QingleiCao wants to merge 2 commits intoICLDisco:masterfrom
Conversation
bosilca
left a comment
There was a problem hiding this comment.
I understand what you are trying to do here, and I think it is a good step in the right direction. However, I have two issues with the proposed approach:
- it loses the type of the incoming data. You are forced to drop the type and rollback to receive data as packed (potentially breaking heterogeneity). I understand that the arena must be set to
AUTObut I think in that case the remote_type should be set to the expected type of the data instead of allowing it to bevoid*. - I am really not sure about the way you compute the device_index ? Without running the code, I would think you should mostly get 0, as the task's inputs are not set during the get_datatype function.
parsec/remote_dep_mpi.c
Outdated
| copy->device_private = NULL; | ||
| } | ||
|
|
||
| static void remote_dep_auto_gpu_copy_release(parsec_data_copy_t *copy, int device) |
There was a problem hiding this comment.
The name is confusing as it implies you are releasing the data copy but instead you are just releasing the private memory owned by it.
Why do you need the device ? A coipy is always associated with a device, and it would not make sense to release the memory of a different copy than the one you pass as argument.
| output->data.remote.dst_count = remote_size; | ||
| output->data.remote.src_displ = 0; | ||
| output->data.remote.dst_displ = 0; | ||
| output->data.remote.device_index = remote_dep_auto_pick_gpu_device(newcontext); |
There was a problem hiding this comment.
We have a proper accessor for the task class to identify the device it wants to use. You should be calling the newcontext->taskclass.data_affinity
parsec/remote_dep_mpi.c
Outdated
| } | ||
| parsec_task_t tmp = *task; | ||
| tmp.selected_device = NULL; | ||
| if( PARSEC_SUCCESS != parsec_select_best_device(&tmp) ) { |
There was a problem hiding this comment.
Are you sure this returns what you expect ? For me parsec_select_best_device only works once the task is properly initialized and has all it's input data copies properly set. Before that it will basically count the possible incarnations and returns the ownership of any of the copies. In other words it is mostly ransom or maybe always zero if the input copies are not yet set (which is the most likely).
Thank you for your comments @bosilca. This is still a draft, and I’ll be making updates intermittently. 😄 |
… data