OpenACC Merge #4

Open
rsearles35 wants to merge 40 commits into wdj:master from rsearles35:master

@rsearles35

Merging the OpenACC version of Minisweep.

rsearles35 and others added 30 commits June 7, 2017 09:22
…inline a bunch of the function calls in order to begin OpenACC implementation
* creating a new version of minisweep that uses OpenACC

This change addresses the need by:

* inlined function calls that could not be turned into OpenACC routines

Related/future task(s):

* Parallelize with OpenACC!
… face-initializations). It is slow right now due to too much data transfer. Will optimize this once all portions of the compute are running on the GPU and producing the correct results.
…alculated. Next step is to figure out how to parallelize over energy groups.
… we only launch one kernel per octant iteration instead of 3
… all 5 loops. Also collapsed some of the inner computational loops. Still need to resolve the issue of the spatial loops. Can't collapse when using not-equals...
* Could not parallelize spatial dimensions due to the unpredictable direction of the sweep.

This change addresses the need by:

* Rewrote the sweep to only sweep in one direction. This allowed me to parallelize all 3 loops.
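A minimal sketch of the loop normalization described in this commit, assuming the original spatial loops ran with direction-dependent bounds and a `!=` termination test. The function names, the `dx`/`dy`/`dz` direction arguments, and the pragma clauses are hypothetical and not taken from the PR. It only shows that a fixed-direction loop nest with a per-iteration index mapping visits the same cells in the same order; it does not model the data dependencies of the sweep itself.

```c
#include <assert.h>

/* Direction-dependent form: bounds and the != test depend on the sweep
 * direction (each of dx, dy, dz is +1 or -1), so the nest is not in the
 * canonical form that a collapse clause expects. */
void fill_directional(int nx, int ny, int nz, int dx, int dy, int dz, int *out)
{
    assert((dx == 1 || dx == -1) && (dy == 1 || dy == -1) && (dz == 1 || dz == -1));
    int step = 0;
    for (int iz = (dz > 0 ? 0 : nz - 1); iz != (dz > 0 ? nz : -1); iz += dz)
        for (int iy = (dy > 0 ? 0 : ny - 1); iy != (dy > 0 ? ny : -1); iy += dy)
            for (int ix = (dx > 0 ? 0 : nx - 1); ix != (dx > 0 ? nx : -1); ix += dx)
                out[(iz * ny + iy) * nx + ix] = step++;
}

/* Canonical form: every loop runs 0..n-1 upward; the direction is applied
 * by mapping the counter to a cell index inside the body. */
void fill_canonical(int nx, int ny, int nz, int dx, int dy, int dz, int *out)
{
    assert((dx == 1 || dx == -1) && (dy == 1 || dy == -1) && (dz == 1 || dz == -1));
    /* Illustrative directive only; ignored by a compiler without OpenACC. */
    #pragma acc parallel loop collapse(3) copyout(out[0:nx*ny*nz])
    for (int k = 0; k < nz; ++k)
        for (int j = 0; j < ny; ++j)
            for (int i = 0; i < nx; ++i) {
                const int ix = dx > 0 ? i : nx - 1 - i;
                const int iy = dy > 0 ? j : ny - 1 - j;
                const int iz = dz > 0 ? k : nz - 1 - k;
                /* Same value the directional nest assigns at this step. */
                out[(iz * ny + iy) * nx + ix] = (k * ny + j) * nx + i;
            }
}
```

Calling both functions with the same sizes and directions, for example `fill_directional(3, 2, 2, -1, 1, -1, a)` and `fill_canonical(3, 2, 2, -1, 1, -1, b)`, should leave `a` and `b` identical.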

Related/future task(s):

* Potentially tweaking which loops are collapsed in the gang layer and which ones are collapsed in the vector layer. We are at the tuning/optimization stage now.
…massive data overhead because the local array must increase in size dramatically
…t array access. This should give us better memory coalescing.
…ll 8 directions asynchronously. Each octant runs a gang-parallel KBA wavefront iteration with vector-parallel in-gridcell computations
… in your cmake file will enable the OpenACC version of the code
…as well as devices within nodes if enough ranks are used.
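The wavefront ordering mentioned in the commits above can be illustrated with a simplified sketch. It assumes a toy recurrence in which each cell depends on its three upstream neighbours, so all cells with the same `ix + iy + iz` are independent and can be updated in parallel. The functions `cell_update`, `sweep_serial`, and `sweep_wavefront`, the recurrence, and the directives are hypothetical illustrations, not Minisweep's actual kernels, and the asynchronous handling of the 8 octants is not modelled.

```c
#include <assert.h>

/* Flattened 3D index; relies on nx and ny being in scope. */
#define IDX(ix, iy, iz) (((iz) * ny + (iy)) * nx + (ix))

/* Toy cell update: 1 plus the values of the upstream neighbours, if any. */
#pragma acc routine seq
static long cell_update(const long *v, int nx, int ny, int ix, int iy, int iz)
{
    long sum = 1;
    if (ix > 0) sum += v[IDX(ix - 1, iy, iz)];
    if (iy > 0) sum += v[IDX(ix, iy - 1, iz)];
    if (iz > 0) sum += v[IDX(ix, iy, iz - 1)];
    return sum;
}

/* Reference: plain lexicographic sweep, which respects the dependencies. */
void sweep_serial(long *v, int nx, int ny, int nz)
{
    for (int iz = 0; iz < nz; ++iz)
        for (int iy = 0; iy < ny; ++iy)
            for (int ix = 0; ix < nx; ++ix)
                v[IDX(ix, iy, iz)] = cell_update(v, nx, ny, ix, iy, iz);
}

/* Wavefront sweep: wavefront w holds the cells with ix + iy + iz == w.
 * Their upstream neighbours all lie on wavefront w - 1, so the cells of
 * one wavefront can be processed in any order. */
void sweep_wavefront(long *v, int nx, int ny, int nz)
{
    const int nwave = nx + ny + nz - 2;
    /* Illustrative directives only; ignored without OpenACC support. */
    #pragma acc data copy(v[0:nx*ny*nz])
    {
        for (int w = 0; w < nwave; ++w) {
            #pragma acc parallel loop gang collapse(2) present(v[0:nx*ny*nz])
            for (int iz = 0; iz < nz; ++iz)
                for (int iy = 0; iy < ny; ++iy) {
                    const int ix = w - iy - iz;
                    if (ix >= 0 && ix < nx)
                        v[IDX(ix, iy, iz)] = cell_update(v, nx, ny, ix, iy, iz);
                }
        }
    }
}
```

Running `sweep_serial` and `sweep_wavefront` on two separate arrays of the same size should produce identical contents, which is a quick way to check that the wavefront order honours the upstream dependencies.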