Hi,
I'd like to report an error message I'm getting when running certain hashes of MPAS-Model's gsl/develop branch. The error message is related to SMIOL and generally has this form:
ERROR: SMIOLf_define_dim failed with error -2
ERROR: invalid subroutine argument
ERROR: MPAS IO Error: SMIOL error -2: invalid subroutine argument
I started seeing this after upgrading from hash 9ce949e (PR #161) to hash 4d97971 (PR #168). There are several commits between these two, so I ran each of them in turn to see where the error first shows up. It first appears with the commit immediately after 9ce949e, which is 3cb859c (PR #121).
In all cases, I am building and running MPAS via the mpas_app. The base directory I'm working under is
/scratch4/BMC/fv3lam/Gerard.Ketefian/DTC_MPAS_stoch/gsk_code_Ning_latest
Under this, you can find the code and a sample experiment for each of the two hashes in the following locations:
Hash 9ce949ec (this does not generate the error)
------------------------------------------------
mpas_app location: mpas_app.Anders_9ce949ec
MPAS-Model location: mpas_app.Anders_9ce949ec/src/MPAS-Model
mpas_app experiment location: expt_dirs/conus_15km.forecast_only.Anders_9ce949ec
forecast location under experiment: expt_dirs/conus_15km.forecast_only.Anders_9ce949ec/2023091500/forecast
Hash 3cb859cb (this does generate the error)
--------------------------------------------
mpas_app location: mpas_app.Haiqin_3cb859cb
MPAS-Model location: mpas_app.Haiqin_3cb859cb/src/MPAS-Model
mpas_app experiment location: expt_dirs/conus_15km.forecast_only.Haiqin_3cb859cb
forecast location under experiment: expt_dirs/conus_15km.forecast_only.Haiqin_3cb859cb/2023091500/forecast
You can see the error in these log files:
/scratch4/BMC/fv3lam/Gerard.Ketefian/DTC_MPAS_stoch/gsk_code_Ning_latest/expt_dirs/conus_15km.forecast_only.Haiqin_3cb859cb/2023091500/forecast/log.atmosphere.00*.err
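In case it helps anyone reproduce this, here is a minimal Python sketch for tallying the SMIOL failures across all of the per-rank log files. The function name `count_smiol_failures` is hypothetical, and the regular expression is an assumption based only on the one message format quoted above (`SMIOLf_define_dim failed with error -2`); other SMIOL messages may be worded differently and would not be counted.

```python
import glob
import re
import sys
from collections import Counter

# Assumed message shape, taken from the single example in this report:
#   "ERROR: SMIOLf_define_dim failed with error -2"
SMIOL_FAILURE = re.compile(r"ERROR:\s+(SMIOLf?_\w+) failed with error (-?\d+)")


def count_smiol_failures(log_glob):
    """Return a Counter keyed by (routine, error_code) over all matching logs."""
    counts = Counter()
    for path in sorted(glob.glob(log_glob)):
        # errors="replace" keeps the scan going if a log has stray bytes
        with open(path, errors="replace") as fh:
            for line in fh:
                match = SMIOL_FAILURE.search(line)
                if match:
                    counts[(match.group(1), int(match.group(2)))] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the glob in quotes so the shell does not expand it first
    for (routine, code), n in sorted(count_smiol_failures(sys.argv[1]).items()):
        print(f"{routine} error {code}: {n} occurrence(s)")
```

Running it with the quoted glob for the `log.atmosphere.00*.err` files above should show which SMIOL routine fails and how often, which may help narrow down which dimension definition is being rejected.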
Also, I noticed that the history*.nc files in these two runs have very different sizes. The ones from hash 9ce949e (no error message) are 942 MB, while the ones from hash 3cb859c (with error message) are only 67 MB. I suspect the smaller files are a consequence of the error, but I haven't dug into it to confirm.
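The size comparison can be repeated with a short script like the one below. This is only a sketch: the function name `compare_history_sizes` is hypothetical, and it assumes both forecast directories contain history files with matching names. It reports sizes only and does not inspect which variables or dimensions are missing.

```python
from pathlib import Path


def compare_history_sizes(good_dir, bad_dir, pattern="history*.nc"):
    """Return (name, good_bytes, bad_bytes, ratio) for files found in both directories."""
    good = {p.name: p.stat().st_size for p in Path(good_dir).glob(pattern)}
    bad = {p.name: p.stat().st_size for p in Path(bad_dir).glob(pattern)}
    rows = []
    # Only compare files that exist in both runs
    for name in sorted(good.keys() & bad.keys()):
        ratio = bad[name] / good[name] if good[name] else float("nan")
        rows.append((name, good[name], bad[name], ratio))
    return rows
```

Here `good_dir` and `bad_dir` would be the two `2023091500/forecast` directories listed above; a ratio well below 1 for every file would be consistent with the smaller output coming from the failing run.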
I wonder if anyone else has seen this error and whether the latest commit fixes it (or whether there is a fix in the works).
Thanks.
@clark-evans @JeffBeck-NOAA I could not tag Will and Ning in this issue. They'll probably be interested in this. Do we need to/Can we add them to the ufs-community organization?