Merge MPI version with changes in develop#35
Conversation
…ting of split files, reading of MPI-IO formatted files, some pblas experiments
… sure data is distributed on read in perturbation.f90
[perturbation.f90] mpi_aux compatibility update
…gle-file only for now)
…ion.f90, clean output
…arge cases, work around by transposing manually and passing as 'T' instead of 'N'
…duce-to-root is fine)
…r call to symm_mat_element_vector_k. Old method caused massive (3x) slowdown, now eliminated.
- blacs_ctxt not used elsewhere, no need to be public
Co-authored-by: ageorgou <1186102+ageorgou@users.noreply.github.com>
Refactor comparison script and add testing of intensity log
…tell which files are differing and where
* Start running with Intel compiler * Use correct compiler command I don't like this but it seems there's no way to conditionally set the environment variable. See also actions/runner#409 Will possibly change to conditional steps later. * Remember Intel-related parameters The scripts set the PATH (and maybe other variables?) but those are not preserved across steps. Since the build step is becoming more distinct between the two compilers, let's try splitting it in two conditional ones. * Try caching Intel installation * Look for Intel libraries before tests * Try to cache whole directory * Allow running tests with MPI * Fix syntax * Install MPI compilers when needed * Try to force use of ifort with MPI By default, the Intel version of mpif90 uses gfortran. We can either switch to using mpiifort instead, or try to control the underlying compiler this way. See also, for example: https://www.hpc.cineca.it/center_news/important-use-intel-mpi-wrappers-mpif90-mpicc-mpicxx * Make sure cache names don't clash with/without MPI * Update variables so MKL and BLACS can be found * Fix MKL installation * Show some more info and build faster on CI * Only use one MPI process on CI for debugging See if the error still happens when testing. * Avoid reading from stdin * Build faster on CI with gfortran * Avoid segmentation fault with quick gfortran build This reverts commit b8413c6. * Replicate execution on cluster temporarily Adding the mpiio option may be required for now but will eventually be removed. * Don't test file_intensity with MPI It's currently failing when run with MPI and > 1 processes (not just on GitHub Actions, also on CSD3). * Simplify USE_MPI in Makefile and CI Now behaving similar to the other Makefile variables. This also lets us make the GitHub Actions job a bit simpler. * Don't install MKL twice when using MPI, clean up
Previously there were two versions of this function, `symm_mat_element_vector` and `symm_mat_element_vector_k`, each dealing with a different symmetry in the molecule. These have been refactored into one function. This means molecules with euler symmetry can be processed with the MPI version of TROVE (at least for e.g. file1 of the CH4 benchmark).
perturbation.f90
Outdated
| end do | ||
| ! | ||
| ! Collect all pre-calculated hcontr values to MPI root. Non-local values have been initialised to 0 so it's safe to just do | ||
| ! MPI_SUM. |
There was a problem hiding this comment.
Was this removed on purpose?
There was a problem hiding this comment.
Yep, didn't seem particularly relevant.
| hcontr(ideg,jdeg,jrow) = 0.0_rk | ||
| else | ||
| hcontr(ideg,jdeg,jrow) = func(icontr,jcontr,jrot,k_i,k_j,tau_i,tau_j) | ||
| if (job%rotsym_do) then |
There was a problem hiding this comment.
Haven't checked but I'm assuming job is in scope here. Even so, Is it worth having an explicit extra argument do_rotsym in the subroutine instead of relying on the global?
There was a problem hiding this comment.
Job seems to be a variable at module-scope so yes, it should be available here. I think it probably is worth having this as an explicit argument actually, yes.
| if (trim(trove%symmetry)=='C2VN') then | ||
| if (sym%N<job%bset(0)%range(2)) then |
There was a problem hiding this comment.
Is this just for readability? Does the .and. not shortcircuit?
There was a problem hiding this comment.
This fixes a segfault so I'm guessing either sym%N or job%bset(0)%range(2) are only allocated when trove%symmetry)=='C2VN'.
There was a problem hiding this comment.
I assumed it wouldn't matter but apparently short-circuting is not standard! (even though gfortran with optimisations supports it, for instance)
| kindex = 0 ; kindex(imode) = 1 | ||
| ! | ||
| coordtransform(:,imode) = FLvect_finitediffs(job_is,trove%Ncoords,kindex,q_eq,step,irho) | ||
| coordtransform(:,imode) = FLvect_finitediffs(job_is,trove%Nmodes,kindex,q_eq,step,irho) |
There was a problem hiding this comment.
This seems like a big thing to have been missed for so long! Do the benchmark problems have Nmodes == Ncoords?
There was a problem hiding this comment.
The changed argument actually sets the size of the output of the function FLvect_finitediffs, which is then assigned to coordtransform. The first dim of coordtransform is of length trove%Nmodes which is inconsistent with the value passed. In this situation, intel fortran was broadcasting only what fit into the output.
There was a problem hiding this comment.
hi Jamie,
Please be careful with this parameter. I can see that there is a conflict with the definition of the array coordtransform. However trove%Ncoords was correct here. I am pretty sure using trove%Nmodes will break cases where trove%Nmodes/=trove%Ncoords such as CH4.
I will try to understand the source of this bug. Most likely it is the definition of coordtransform.
There was a problem hiding this comment.
Never mind, Jamie. I think you are right and trove%Nmodes should give the correct size for these vectors.
- fixes incorrect filename `matelem.chk` - fixes unassociated array access - fixes unit tests to only check intensity output if MPI disabled
This will contain all the changes required for MPI with TROVE. More functionality will be added via additional pull requests targetting this branch (
merge-develop-mpi).The comments below reflect the original changes (until 4096f62)
This PR implements optional MPI in TROVE. This includes
This has been tested against the existing benchmarks using the following combinations:
The intensity calculation currently does not work with MPI enabled.