release/1.9: Bug fixes for scotch and crtm-fix, update Ursa site config #1660

Merged
climbfuji merged 12 commits into JCSDA:release/1.9.0 from climbfuji:bugfix/rel19_scotch_oneapi_use_gcc
Jun 10, 2025

@climbfuji
Collaborator

@climbfuji climbfuji commented Jun 4, 2025

Summary

  1. Update of spack with three changes for release/1.9 (see dependencies below)
  2. Update templates/config files for changes in 1.
  3. Update the Ursa site config, which builds without duplicates given the spack changes from 1. This introduces the use of MKL as the virtual provider for blas/lapack/fftw when using Intel/oneAPI, but keeps OpenBLAS/FFTW for GNU. This matches what the NRL systems do and what CI uses, and it is what we should move (back) to on all systems before the spack-stack 2.0.x releases.
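
The compiler-dependent provider choice in item 3 can be pictured as a small lookup. This is only an illustrative Python sketch, not the actual Ursa site config (which is a YAML file); the table layout, the function name `providers_for`, and the use of `fftw-api` as the name of the FFTW virtual package are assumptions.

```python
# Hypothetical lookup mirroring the provider choice described above.
# Package and virtual names follow Spack naming as understood here;
# treat them as assumptions, not as the contents of the Ursa config.
PROVIDERS = {
    "oneapi": {
        "blas": "intel-oneapi-mkl",
        "lapack": "intel-oneapi-mkl",
        "fftw-api": "intel-oneapi-mkl",
    },
    "gcc": {
        "blas": "openblas",
        "lapack": "openblas",
        "fftw-api": "fftw",
    },
}


def providers_for(compiler: str) -> dict:
    """Return the virtual-package providers for a compiler family."""
    family = compiler.split("@", 1)[0]  # "gcc@11.4.1" -> "gcc"
    try:
        return PROVIDERS[family]
    except KeyError:
        raise ValueError(f"no provider table for compiler {compiler!r}") from None


if __name__ == "__main__":
    for spec in ("oneapi", "gcc@11.4.1"):
        print(spec, providers_for(spec))
```

Either way, downstream packages still depend only on the virtual names, which is why the switch should be invisible to applications.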

Note. After this is merged, we need a final round of testing with all UFS regression tests. If successful, create release 1.9.2 and merge necessary changes back to develop.

Testing

Applications affected

release/1.9 applications that will use spack-stack-1.9.2 (UFS, JEDI, ...)

Systems affected

Ursa

Dependencies

Issue(s) addressed

@ulmononian @rickgrubin-noaa please add issues here

Checklist

  • This PR addresses one issue/problem/enhancement, or has a very good reason for not doing so.
  • These changes have been tested on the affected systems and applications.
  • All dependency PRs/issues have been resolved and this PR can be merged.

fc: /usr/bin/gfortran
flags: {}
operating_system: rocky9
target: x86_64
Collaborator Author

@climbfuji climbfuji Jun 6, 2025

The target entries are the magic that fixes all the version conflicts.

- spec: gcc-runtime@11.4.1%gcc@11.4.1
prefix: /usr

# If using intel-oneapi-mkl, make appropriate changes below
Collaborator Author

@climbfuji climbfuji Jun 6, 2025

Since this is a new platform, let's just move straight back to Intel MKL instead of OpenBLAS+FFTW with Intel.

The GCC build still uses OpenBLAS and FFTW, as intended.

Collaborator Author

This is what Jessica Meixner tested in ufs-community/ufs-weather-model#2650 (comment)

Collaborator

@climbfuji will going "straight back to Intel MKL instead of OpenBLAS+FFTW with Intel" on Ursa but not on other platforms cause any inconsistencies? I'm trying to understand whether we will need to rebuild 1.9.1 entirely on all platforms or whether we can hopefully just update scotch...

Collaborator Author

No, it won't. It's entirely transparent to users and downstream applications. I've been asking for this move in general for a very long time. Before we had OpenBLAS with Intel, hpc-stack and jedi-stack used MKL with Intel by default. When spack-stack started, we had problems getting MKL to work and switched to OpenBLAS out of necessity.

In fact, we already switched to MKL on all NRL systems with Intel (and use OpenBLAS with GNU), and nobody noticed.

Collaborator Author

The spack-stack CI runners are like that, too, by the way.

- spec: gcc-runtime@11.4.1%gcc@11.4.1
prefix: /usr

ectrans:
Collaborator Author

Redundant

@climbfuji climbfuji self-assigned this Jun 6, 2025
@climbfuji climbfuji changed the title from "release/1.9: bug fixes for scotch and crtm-fix (WIP)" to "release/1.9: Bug fixes for scotch and crtm-fix, update Ursa site config" Jun 9, 2025
@rickgrubin-noaa
Collaborator

rickgrubin-noaa commented Jun 9, 2025

@climbfuji would you please also add the following to this PR:

    mapl:
      require: '@2.53.0 ~shared ~f2py'
      variants: '+pflogger'

to

    mapl:
      require: '@2.53 ~shared ~f2py'
      variants: '+pflogger'

  • add mapl@2.53.4 to mapl/package.py
    • version("2.53.4", sha256="da38348a72fcbaa2b888578bfa630ab36261206136d33700344ed6792f9f9aeb")
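
Relaxing the requirement from `@2.53.0` to `@2.53` matters because a plain Spack version constraint behaves roughly like a dotted-prefix match, so `@2.53` admits 2.53.4 while `@2.53.0` does not. The sketch below is a simplified Python model of that rule, not Spack's own implementation; the function name `matches_prefix` is hypothetical, and real Spack version handling covers ranges and other cases that this ignores.

```python
def matches_prefix(version: str, constraint: str) -> bool:
    """Return True if `version` starts with the dotted components of `constraint`.

    Simplified model: only plain dotted versions such as "2.53.4" are handled.
    """
    want = constraint.lstrip("@").split(".")
    have = version.split(".")
    # Compare whole components, so "2.5" does not match "@2.53".
    return have[: len(want)] == want


if __name__ == "__main__":
    for constraint in ("@2.53.0", "@2.53"):
        print(constraint, "admits 2.53.4:", matches_prefix("2.53.4", constraint))
```

Under this model the old requirement would reject the newly added 2.53.4, which is presumably why both changes were requested together.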

@climbfuji
Collaborator Author

> @climbfuji would you please also add the following to this PR:
>
>     mapl:
>       require: '@2.53.0 ~shared ~f2py'
>       variants: '+pflogger'
>
> to
>
>     mapl:
>       require: '@2.53 ~shared ~f2py'
>       variants: '+pflogger'
>
>   • add mapl@2.53.4 to mapl/package.py
>     • version("2.53.4", sha256="da38348a72fcbaa2b888578bfa630ab36261206136d33700344ed6792f9f9aeb")

Thanks, will do in a bit!

Collaborator

@ulmononian ulmononian left a comment

thanks @climbfuji

Collaborator

@rickgrubin-noaa rickgrubin-noaa left a comment

Only comment is to revert spack branch in .gitmodules when done.

scotch:
require: '@7.0.4 +mpi+metis~shared~threads~mpi_thread+noarch+esmumps'
require:
- '@7.0.4 +mpi+metis~shared~threads~mpi_thread+noarch+esmumps'

Does this mean we are using scotch 7.0.4, or are we moving to 7.0.7? (Sorry, this question comes from my lack of spack-stack knowledge.)

Collaborator Author

This is still scotch 7.0.4. If it works, then there is no need to update it. The purpose of spack-stack release 1.9.2 is to provide a bug fix / hot fix for the existing 1.9.0 and 1.9.1 releases. Bug fix releases contain only the changes necessary to fix bugs.

Do we know if scotch 7.0.4 works on other machines? Or just Ursa... I'm assuming it's going to work on other machines but don't know. Either way WW3 is ready to handle a mix of the two if necessary.

Collaborator Author

I am pretty sure it will work. We'll install the 1.9.2 release candidate on two machines of your choice with both GNU and oneAPI, and hopefully that will confirm my guess. Ursa + ???

@climbfuji
Collaborator Author

> Only comment is to revert spack branch in .gitmodules when done.

Done.

@climbfuji climbfuji merged commit 7aeb006 into JCSDA:release/1.9.0 Jun 10, 2025
8 of 9 checks passed
@climbfuji climbfuji deleted the bugfix/rel19_scotch_oneapi_use_gcc branch June 10, 2025 15:50
@climbfuji
Collaborator Author

I created tag spack-stack-1.9.2rc2 for spack-stack only. That is sufficient for testing, since the submodule pointer for spack is uniquely referenced by the hash stored in spack-stack.
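
Because git records a submodule in the parent repository as a commit hash (a "gitlink" tree entry with mode 160000), a spack-stack tag on its own pins the spack revision. The Python sketch below shows one way to read that pin from a clone; it assumes `git` is on the PATH and that the submodule lives at the path `spack`, and both function names are hypothetical helpers, not part of spack-stack.

```python
import subprocess


def parse_gitlink(ls_tree_line: str) -> str:
    """Extract the pinned commit hash from one `git ls-tree` output line.

    The line format is "<mode> <type> <object>\t<path>"; a submodule
    entry has mode 160000 and type "commit".
    """
    meta, _path = ls_tree_line.rstrip("\n").split("\t", 1)
    mode, obj_type, sha = meta.split()
    if mode != "160000" or obj_type != "commit":
        raise ValueError("not a submodule (gitlink) entry")
    return sha


def pinned_submodule_commit(ref: str, path: str = "spack") -> str:
    """Return the submodule commit recorded at `path` for the given ref."""
    out = subprocess.run(
        ["git", "ls-tree", ref, path],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return parse_gitlink(out)
```

Run inside a spack-stack clone, `pinned_submodule_commit("spack-stack-1.9.2rc2")` would return the spack commit that the release candidate tag refers to.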

4 participants