Merge MPI version with changes in develop #35

Open
JamieJQuinn wants to merge 83 commits into develop from merge-develop-mpi

Conversation

JamieJQuinn (Collaborator) commented Jun 2, 2021

This will contain all the changes required for MPI with TROVE. More functionality will be added via additional pull requests targeting this branch (merge-develop-mpi).
The comments below reflect the original changes (up to 4096f62).


This PR implements optional MPI support in TROVE. This includes:

  • initial development by Arjen
  • modification of the MPI implementation to allow compilation and running on non-MPI systems (see the sketch after this description)
  • merging of changes made to develop since the MPI development started in early 2019

This has been tested against the existing benchmarks using the following combinations:

  • non-MPI gfortran
  • non-MPI intel
  • Intel MPI + ifort (on CSD3)
  • OpenMPI + gfortran (on local laptop)

The intensity calculation currently does not work with MPI enabled.
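
A minimal sketch of one common way to make MPI optional at compile time; the macro name `TROVE_USE_MPI` and the overall structure are illustrative assumptions, not necessarily how TROVE actually wires its `USE_MPI` Makefile flag through to the source:

```fortran
! Illustrative only: TROVE_USE_MPI is a hypothetical macro that the build
! system would define (e.g. when the Makefile's USE_MPI flag is set).
program optional_mpi_sketch
#ifdef TROVE_USE_MPI
  use mpi
#endif
  implicit none
  integer :: rank, ierr

#ifdef TROVE_USE_MPI
  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
#else
  rank = 0          ! a serial build behaves as a single rank
  ierr = 0
#endif

  if (rank == 0) print *, 'running on root'

#ifdef TROVE_USE_MPI
  call MPI_Finalize(ierr)
#endif
end program optional_mpi_sketch
```

With this pattern the same source compiles on non-MPI systems (no MPI library needed) and on MPI systems, which is the behaviour the second bullet describes.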

ArjenTamerus and others added 30 commits February 4, 2019 16:22
…ting of split files, reading of MPI-IO formatted files, some pblas experiments
… sure data is distributed on read in perturbation.f90
[perturbation.f90] mpi_aux compatibility update
…arge cases, work around by transposing manually and passing as 'T' instead of 'N'
…r call to symm_mat_element_vector_k. Old method caused massive (3x) slowdown, now eliminated.
- blacs_ctxt not used elsewhere, no need to be public
@JamieJQuinn JamieJQuinn marked this pull request as ready for review June 3, 2021 16:13
@JamieJQuinn JamieJQuinn requested a review from ageorgou June 3, 2021 16:13
ageorgou and others added 8 commits June 29, 2021 17:01
* Start running with Intel compiler

* Use correct compiler command

I don't like this, but it seems there's no way to conditionally
set the environment variable. See also actions/runner#409.
Will possibly change to conditional steps later.

* Remember Intel-related parameters

The scripts set the PATH (and maybe other variables?) but those are
not preserved across steps. Since the build step is becoming more
distinct between the two compilers, let's try splitting it in two
conditional ones.

* Try caching Intel installation

* Look for Intel libraries before tests

* Try to cache whole directory

* Allow running tests with MPI

* Fix syntax

* Install MPI compilers when needed

* Try to force use of ifort with MPI

By default, the Intel version of mpif90 uses gfortran. We can
either switch to using mpiifort instead, or try to control the
underlying compiler this way. See also, for example:
https://www.hpc.cineca.it/center_news/important-use-intel-mpi-wrappers-mpif90-mpicc-mpicxx

* Make sure cache names don't clash with/without MPI

* Update variables so MKL and BLACS can be found

* Fix MKL installation

* Show some more info and build faster on CI

* Only use one MPI process on CI for debugging

See if the error still happens when testing.

* Avoid reading from stdin

* Build faster on CI with gfortran

* Avoid segmentation fault with quick gfortran build

This reverts commit b8413c6.

* Replicate execution on cluster temporarily

Adding the mpiio option may be required for now but will eventually
be removed.

* Don't test file_intensity with MPI

It's currently failing when run with MPI and more than one process
(not just on GitHub Actions, but also on CSD3).

* Simplify USE_MPI in Makefile and CI

Now behaves similarly to the other Makefile variables. This also
lets us make the GitHub Actions job a bit simpler.

* Don't install MKL twice when using MPI, clean up

Previously there were two versions of this function,
`symm_mat_element_vector` and `symm_mat_element_vector_k`, each dealing
with a different symmetry in the molecule. These have been refactored
into one function. This means molecules with Euler symmetry can be
processed with the MPI version of TROVE (at least for, e.g., file1 of
the CH4 benchmark).
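
A hedged sketch of the refactor pattern this commit describes: two near-duplicate routines collapsed into one that branches on the symmetry case. The routine name, argument, and branch bodies below are illustrative stand-ins, not TROVE's actual code:

```fortran
module matel_sketch
  implicit none
contains
  ! One routine replacing a hypothetical pair of near-duplicates,
  ! with the symmetry case passed in explicitly.
  function symm_mat_element(i, j, k_symmetry) result(elem)
    integer, intent(in) :: i, j
    logical, intent(in) :: k_symmetry
    real :: elem
    if (k_symmetry) then
      elem = real(i*j)   ! stand-in for the old _vector_k path
    else
      elem = real(i+j)   ! stand-in for the old _vector path
    end if
  end function symm_mat_element
end module matel_sketch

program demo
  use matel_sketch
  implicit none
  print *, symm_mat_element(2, 3, k_symmetry=.false.)
  print *, symm_mat_element(2, 3, k_symmetry=.true.)
end program demo
```
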
perturbation.f90 (Outdated)
end do
!
! Collect all pre-calculated hcontr values to MPI root. Non-local values have been initialised to 0 so it's safe to just do
! MPI_SUM.
Collaborator:

Was this removed on purpose?

Collaborator (Author):

Yep, didn't seem particularly relevant.
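
A minimal sketch of the collection pattern the diff comment describes, assuming each rank computes only the entries it owns and leaves the rest at zero, so an MPI_SUM reduction to root reassembles the full array without double counting. The array names, size, and round-robin ownership rule are illustrative:

```fortran
program reduce_sketch
  use mpi
  implicit none
  integer, parameter :: rk = selected_real_kind(12)
  integer, parameter :: n = 8            ! illustrative size
  integer :: ierr, rank, nprocs, i
  real(rk) :: hcontr_local(n), hcontr(n)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

  hcontr_local = 0.0_rk                  ! non-local entries stay zero
  do i = rank + 1, n, nprocs             ! round-robin ownership (illustrative)
    hcontr_local(i) = real(i, rk)        ! stand-in for func(...)
  end do

  ! Summing is safe because every entry is non-zero on exactly one rank.
  call MPI_Reduce(hcontr_local, hcontr, n, MPI_DOUBLE_PRECISION, &
                  MPI_SUM, 0, MPI_COMM_WORLD, ierr)

  if (rank == 0) print *, hcontr
  call MPI_Finalize(ierr)
end program reduce_sketch
```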

  hcontr(ideg,jdeg,jrow) = 0.0_rk
else
  hcontr(ideg,jdeg,jrow) = func(icontr,jcontr,jrot,k_i,k_j,tau_i,tau_j)
  if (job%rotsym_do) then
Collaborator:

Haven't checked, but I'm assuming job is in scope here. Even so, is it worth having an explicit extra argument do_rotsym in the subroutine instead of relying on the global?

Collaborator (Author):

`job` seems to be a variable at module scope, so yes, it should be available here. And yes, I think it probably is worth having this as an explicit argument.
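
A sketch of the suggested change, with hypothetical routine and argument names: the flag becomes an explicit dummy argument and the caller passes `job%rotsym_do`, instead of the routine reading the module-scope `job` variable directly:

```fortran
module hcontr_sketch
  implicit none
contains
  subroutine fill_hcontr(hcontr, do_rotsym)
    real, intent(out) :: hcontr(:,:)
    logical, intent(in) :: do_rotsym   ! explicit, instead of job%rotsym_do
    hcontr = 0.0
    if (do_rotsym) then
      ! ... rotational-symmetry path would go here
    end if
  end subroutine fill_hcontr
end module hcontr_sketch

program demo
  use hcontr_sketch
  implicit none
  real :: h(2,2)
  call fill_hcontr(h, do_rotsym=.true.)  ! the caller would pass job%rotsym_do
  print *, h
end program demo
```

This makes the routine's dependencies visible in its interface and easier to test in isolation.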

Comment on lines +4394 to +4395
if (trim(trove%symmetry)=='C2VN') then
  if (sym%N<job%bset(0)%range(2)) then
Collaborator:

Is this just for readability? Does the .and. not short-circuit?

Collaborator (Author):

This fixes a segfault, so I'm guessing either sym%N or job%bset(0)%range(2) is only allocated when trove%symmetry=='C2VN'.

Collaborator:

I assumed it wouldn't matter, but apparently short-circuiting is not standard! (Even though gfortran with optimisations supports it, for instance.)
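
A small illustration of the point, with a hypothetical allocatable `b` standing in for the conditionally-defined values: the Fortran standard allows both operands of .and. to be evaluated, in either order, so a guard written as a single conjunction is not guaranteed safe, while nesting the ifs is:

```fortran
program shortcircuit_sketch
  implicit none
  real, allocatable :: b(:)

  ! Not guaranteed safe: the standard permits evaluating size(b) even
  ! when allocated(b) is .false., and size of an unallocated array is
  ! undefined. Some compilers happen to short-circuit; others may not.
  ! if (allocated(b) .and. size(b) > 0) print *, b(1)

  ! Guaranteed safe, mirroring the nested ifs in the diff above:
  if (allocated(b)) then
    if (size(b) > 0) then
      print *, b(1)
    end if
  end if
end program shortcircuit_sketch
```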

kindex = 0 ; kindex(imode) = 1
!
- coordtransform(:,imode) = FLvect_finitediffs(job_is,trove%Ncoords,kindex,q_eq,step,irho)
+ coordtransform(:,imode) = FLvect_finitediffs(job_is,trove%Nmodes,kindex,q_eq,step,irho)
Collaborator:

This seems like a big thing to have been missed for so long! Do the benchmark problems have Nmodes == Ncoords?

Collaborator (Author):

The changed argument actually sets the size of the output of the function FLvect_finitediffs, which is then assigned to coordtransform. The first dimension of coordtransform is of length trove%Nmodes, which is inconsistent with the value passed. In this situation, Intel Fortran was copying only what fit into the output.

Owner:

Hi Jamie,
Please be careful with this parameter. I can see that there is a conflict with the definition of the array coordtransform. However, trove%Ncoords was correct here. I am pretty sure using trove%Nmodes will break cases where trove%Nmodes/=trove%Ncoords, such as CH4.
I will try to understand the source of this bug. Most likely it is in the definition of coordtransform.

Owner:

Never mind, Jamie. I think you are right and trove%Nmodes should give the correct size for these vectors.
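
A hypothetical reduction of the bug being discussed: a function whose result length is taken from an argument, assigned to an array of a different length. The assignment is non-conforming when the shapes differ, and Intel Fortran apparently copied only what fit, which is why passing trove%Ncoords went unnoticed. Names and sizes here are illustrative:

```fortran
program size_mismatch_sketch
  implicit none
  integer, parameter :: nmodes = 3, ncoords = 5
  real :: coordtransform(nmodes)

  ! Non-conforming when shapes differ: a length-5 result assigned to a
  ! length-3 array. Behaviour varies by compiler; silent truncation is
  ! one possibility.
  ! coordtransform = f(ncoords)

  coordtransform = f(nmodes)   ! shapes agree, as in the fixed line above
  print *, coordtransform

contains

  function f(n) result(v)
    integer, intent(in) :: n
    real :: v(n)
    integer :: i
    v = [(real(i), i = 1, n)]
  end function f

end program size_mismatch_sketch
```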

JamieJQuinn and others added 2 commits August 11, 2021 11:39
- fixes incorrect filename `matelem.chk`
- fixes unassociated array access
- fixes unit tests to only check intensity output if MPI is disabled