Examples
We provide some examples of the Efficient Elementary Effects method and how it is used to identify noninformative parameters. One can either
- use the individual, modular scripts in the codes/ subfolder to perform the traditional Elementary Effects method and estimate the sensitivity of model parameters, or
- use the wrapper __run_eee.sh to loop over the modules of the traditional method and hence perform the Efficient Elementary Effects method, distinguishing informative parameters from noninformative ones.
The examples and workflows explained in the following are:
- Ishigami-Homma function (very detailed explanations)
- Oakley-O'Hagan function (compressed explanations)
- RAVEN model with HMETS setup (a real model)
The traditional method was introduced by Morris (1991)[1]. This Sensitivity Analysis (SA) method is also called the Morris method.
As in every SA, one first needs to sample some parameter sets. The calling sequence for the modular script is:
python codes/1_create_parameter_sets.py -t 100 -d examples/ishigami-homma/parameters.dat -o examples/ishigami-homma/parameter_sets
It will sample 100 trajectories (option -t) for the three parameters of the Ishigami-Homma function[2]. The three parameters x1, x2, and x3 are uniformly distributed in the interval [-Pi,Pi]. This is specified in the parameter information file (option -d). Each sampled trajectory will contain N+1 = 3+1 = 4 parameter sets; hence, 400 parameter sets will be sampled in total. The parameter sets will be stored in files following a specified pattern (option -o). Three files are created: one containing the unscaled parameter sets (either uniform in [0,1] or standard Gaussian distributed N[0,1]), one containing the scaled parameter sets that will be used to run the model (here: all parameters ~U[-Pi,Pi]), and one indicating which parameter was changed between consecutive parameter sets within each trajectory. The parameter information file (option -d) needs to be adjusted to run your own model.
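For reference, a minimal Python sketch of the (generic) Ishigami-Homma function is given below. The constants a and b are free parameters of the benchmark; a=7 and b=0.1 is a common choice in the literature and an assumption here (the values used in codes/2_run_model_ishigami-homma.py may differ):

import numpy as np

def ishigami_homma(x, a=7.0, b=0.1):
    # Ishigami-Homma benchmark for x = (x1, x2, x3) in [-Pi,Pi]^3;
    # the constants a and b are assumptions (common literature values)
    x1, x2, x3 = x
    return np.sin(x1) + a * np.sin(x2) ** 2 + b * x3 ** 4 * np.sin(x1)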
Next, the model needs to be run; in total, it will be run (3+1) × 100 = 400 times. The calling sequence for the modular script is:
python codes/2_run_model_ishigami-homma.py -i examples/ishigami-homma/parameter_sets_1_scaled_para3_M.dat -o examples/ishigami-homma/model_output.pkl
The sampled (scaled) parameter sets are provided (option -i) to the script, which runs the model, collects all model outputs in the order of the corresponding parameter sets, and stores them as a dictionary in a binary Pickle file (option -o). This script needs to be adjusted to run the method for your own model.
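A minimal sketch of what such a model-runner does is given below. This is not the shipped script: the assumption that the scaled parameter sets are stored as whitespace-separated rows in an ASCII file is ours, and ishigami_homma is the function sketched above:

import pickle
import numpy as np

# read the scaled parameter sets: one row per model run, one column per parameter
# (whitespace-separated ASCII is an assumption about the file format)
para_sets = np.loadtxt("examples/ishigami-homma/parameter_sets_1_scaled_para3_M.dat")

# run the model once per parameter set, keeping the order of the sets
model_output = {"out1": np.array([ishigami_homma(p) for p in para_sets])}

# store all model outputs as a dictionary in a binary Pickle file
with open("examples/ishigami-homma/model_output.pkl", "wb") as f:
    pickle.dump(model_output, f)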
The last step is to derive the Elementary Effects for each parameter (and for each model output, in case multiple outputs are available in the dictionary stored in the Pickle file). The calling sequence for the modular script is:
python codes/3_derive_elementary_effects.py -i examples/ishigami-homma/model_output.pkl -k out1 -d examples/ishigami-homma/parameters.dat -m examples/ishigami-homma/parameter_sets_1_para3_M.dat -v examples/ishigami-homma/parameter_sets_1_para3_v.dat -o examples/ishigami-homma/eee_results.dat
This module uses the model outputs provided (option -i), the information about the parameters (option -d), the unscaled parameter sets (option -m), and the information on which parameter was changed between consecutive parameter sets (option -v) to derive the Elementary Effects. The results are stored in the ASCII file specified (option -o). In case multiple model outputs are produced by the model and stored in the Pickle file, one can choose the desired model output as the target of the analysis (option -k). Model outputs can be either scalar or one-dimensional. If one wants to determine the Elementary Effects regarding every model output, the option can be set to -k All. The output file (option -o) will then contain multiple columns of Elementary Effects.
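Conceptually, the script forms, for each pair of consecutive parameter sets within a trajectory, the absolute Elementary Effect of the single parameter that was changed, and averages these values per parameter over all trajectories. A minimal sketch for a scalar model output (the data layout, in particular how trajectory boundaries are flagged, is an assumption here):

import numpy as np

def mean_absolute_elementary_effects(y, u, changed, npara):
    # y       : scalar model outputs, one per parameter set (in sampling order)
    # u       : unscaled parameter sets, shape (nsets, npara)
    # changed : for each pair of consecutive sets, the index of the parameter
    #           varied between them, or None at a trajectory boundary
    ee_sum = np.zeros(npara)
    count = np.zeros(npara, dtype=int)
    for k, ipar in enumerate(changed):
        if ipar is None:                       # first set of a new trajectory
            continue
        delta = u[k + 1, ipar] - u[k, ipar]    # step in the unscaled parameter
        ee_sum[ipar] += abs((y[k + 1] - y[k]) / delta)
        count[ipar] += 1
    return ee_sum / count, count               # mean |EE| and trajectory counter

The counter returned here corresponds to the counter column of eee_results.dat.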
The method is not deterministic. Hence the results of the analysis stored in examples/ishigami-homma/eee_results.dat will vary slightly. An example result is given below:
# model output #1: out1
# ii para_name elemeffect(ii),ii=1:3,jj=1:1 counter(ii),ii=1:3,jj=1:1
0 x_1 81.16198538 100
1 x_2 1.15163666 100
2 x_3 15.24403683 100
The iterator ii runs over the parameters (here 3); the iterator jj runs over the model output variables analysed (here 1: 'out1'). The counter indicates how many trajectories were used to derive each Elementary Effect estimate (here 100 for each parameter).
The results show that parameter x1 is the most influential while x2 is the least influential and hence very likely noninformative. The question is whether x3 should be considered influential or not and if one indeed needs 100 trajectories to make this decision. Cuntz & Mai et al. (2015)[3] therefore proposed the Efficient Elementary Effects method.
The Efficient Elementary Effects method[3] performs the traditional EE method with a smaller number of trajectories, estimates the Elementary Effects, automatically derives a cutoff threshold, and then repeats the analysis with only those parameters that are below the threshold (i.e., have not yet shown any importance), checking whether they might be important somewhere else in the parameter space. The length of the trajectories (and hence the number of model runs) is determined by the number of model parameters considered. Since each iteration focusses only on parameters that have not yet shown any importance (Elementary Effect below the threshold), the trajectories become shorter with each iteration. The sequential application of the EE method terminates as soon as no additional parameter above the threshold is detected; a number of confirmation trajectories is run, and then the method stops.
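The following sketch outlines this sequential screening as a plain Python skeleton. All helper callables (sample, run, derive, fit_cutoff) are hypothetical stand-ins for the four modular scripts, and the confirmation step is strongly simplified:

def eee_screening(sample, run, derive, fit_cutoff, parameters):
    # sequential screening after Cuntz & Mai et al. (2015), strongly simplified;
    # sample, run, derive, fit_cutoff are hypothetical stand-ins for the scripts
    noninformative = set(parameters)       # nothing has shown importance yet
    threshold = None
    while noninformative:
        sets = sample(noninformative)      # trajectories shorten as parameters drop out
        ee = derive(run(sets), sets)       # mean absolute Elementary Effect per parameter
        if threshold is None:
            threshold = fit_cutoff(ee)     # cutoff is derived once, after iteration 1
        informative = {p for p in noninformative if ee[p] > threshold}
        if not informative:                # no new parameter above the cutoff: stop
            return noninformative          # these parameters are deemed noninformative
        noninformative -= informative
    return set()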
In short, the Efficient Elementary Effects method[3] loops over the modular scripts of the traditional method, additionally derives a threshold, and determines which parameters are below the threshold (not yet informative) and which are above it (regarded as informative). The calling sequence for the modular script is:
python codes/4_derive_threshold.py -e examples/ishigami-homma/eee_results.dat -m examples/ishigami-homma/parameters.dat -c -1 -p examples/ishigami-homma/eee.pdf -t
This script uses the previously derived Elementary Effects estimates (option -e) and the parameter information (option -m) to fit a logistic function and derive the cutoff threshold (if option -c is set to -1; otherwise the provided value is used as the cutoff). The cutoff is derived only once, after the first iteration. A threshold is derived for each model output. Optionally, one can plot the results as a PDF (option -p) using LaTeX fonts (if -t is set; otherwise standard fonts are used). The option -n will suppress any plotting irrespective of the settings of options -p and -t. An example is shown in Figure 1.
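A conceptual sketch of the logistic fit is given below; it fits a logistic curve to the Elementary Effects ordered by rank. The exact fitting procedure and cutoff criterion of codes/4_derive_threshold.py are the ones of Cuntz & Mai et al. (2015)[3], which this sketch does not reproduce:

import numpy as np
from scipy.optimize import curve_fit

def logistic(x, L, k, x0):
    # logistic curve L / (1 + exp(-k * (x - x0)))
    return L / (1.0 + np.exp(-k * (x - x0)))

def fit_logistic_to_ee(ee_values):
    # fit a logistic curve to the Elementary Effects ordered by rank;
    # the cutoff is then derived from the fitted curve (see the paper
    # for the exact criterion)
    ee = np.sort(np.asarray(ee_values, dtype=float))
    rank = np.arange(1.0, ee.size + 1.0)
    popt, _ = curve_fit(logistic, rank, ee, p0=[ee.max(), 1.0, 0.5 * ee.size])
    return popt    # fitted (L, k, x0)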
The script will write a file that contains the derived cutoff values (cutoff_*.dat; one file for each model output) and a new parameter information file (parameters.dat.new). The latter determines which parameters will be considered in the next iteration, i.e. only parameters that are still noninformative (last column is set to 1). An example new parameter information file parameters.dat.new is given below:
# para dist lower upper default informative(0)_or_noninformative(1)
# mean stddev
x_1 uniform -3.14159 3.14159 0.0 0
x_2 uniform -3.14159 3.14159 0.0 1
x_3 uniform -3.14159 3.14159 0.0 0
Hence, the next iteration will only consider parameter x2. The parameters x1 and x3 are already regarded as informative.
The wrapper script __run_eee.sh is provided to ease the whole process of the sequential setup of new iterations. The calling sequence is:
./__run_eee.sh -s out1 -x 2_run_model_ishigami-homma.py examples/ishigami-homma/
One can specify the model output of interest (option -s; default: All), which needs to be a key of the dictionary saved in the Pickle file, and the script located in codes/ that runs the model (option -x; default: 2_run_model_ishigami-homma.py). The only mandatory argument is the run folder where all iterations and results will be stored. This folder needs to contain a parameter information file (obeying the formatting described above) named parameters.dat; a different filename can be specified using option -m.
The wrapper outputs a summary of results to the terminal and stores it in a file eee_info.dat:
Info about EEE analysis
------------------------------------------------
Number of informative parameters :: 2
Number of model runs in total :: 32
Number of iterations :: 3
It will also produce the final parameter information file parameters.dat.final, which uses the last column to indicate whether a parameter is noninformative (1) or informative (0):
# para dist lower upper default informative(0)_or_noninformative(1)
# mean stddev
x_1 uniform -3.14159 3.14159 0.0 0
x_2 uniform -3.14159 3.14159 0.0 1
x_3 uniform -3.14159 3.14159 0.0 0
The three modular scripts to use for the traditional method are explained in detail for the Ishigami-Homma function. Only minor changes need to be made to run the example for the Oakley-O'Hagan function[4]. This benchmark function has 15 parameters that are all Gaussian distributed with N[0,1]. The settings for the parameters are stored in examples/oakley-ohagan/parameters.dat. The script that runs this model is codes/2_run_model_oakley-ohagan.py.
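For reference, the Oakley-O'Hagan function has the form f(x) = a1^T x + a2^T sin(x) + a3^T cos(x) + x^T M x, where the coefficient vectors a1, a2, a3 and the 15x15 matrix M are tabulated in Oakley & O'Hagan (2004)[4]. A minimal sketch, assuming the published coefficients are supplied as numpy arrays:

import numpy as np

def oakley_ohagan(x, a1, a2, a3, M):
    # Oakley-O'Hagan benchmark: f(x) = a1.x + a2.sin(x) + a3.cos(x) + x.M.x;
    # a1, a2, a3 (length 15) and M (15x15) are the coefficients published
    # with the paper and have to be provided by the caller
    x = np.asarray(x, dtype=float)
    return a1 @ x + a2 @ np.sin(x) + a3 @ np.cos(x) + x @ M @ x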
Sampling 100 trajectories:
python codes/1_create_parameter_sets.py -t 100 -d examples/oakley-ohagan/parameters.dat -o examples/oakley-ohagan/parameter_sets
Run model ((15+1) × 100 = 1600 model runs):
python codes/2_run_model_oakley-ohagan.py -i examples/oakley-ohagan/parameter_sets_1_scaled_para15_M.dat -o examples/oakley-ohagan/model_output.pkl
Derive Elementary Effects:
python codes/3_derive_elementary_effects.py -i examples/oakley-ohagan/model_output.pkl -k out1 -d examples/oakley-ohagan/parameters.dat -m examples/oakley-ohagan/parameter_sets_1_para15_M.dat -v examples/oakley-ohagan/parameter_sets_1_para15_v.dat -o examples/oakley-ohagan/eee_results.dat
Results stored in examples/oakley-ohagan/eee_results.dat:
# model output #1: out1
# ii para_name elemeffect(ii),ii=1:15,jj=1:1 counter(ii),ii=1:15,jj=1:1
0 x_{1} 2.27652936 100
1 x_{2} 3.16814959 100
2 x_{3} 2.00213529 100
3 x_{4} 2.89000059 100
4 x_{5} 1.29779320 100
5 x_{6} 1.53076332 100
6 x_{7} 1.45876640 100
7 x_{8} 1.94229479 100
8 x_{9} 3.14191055 100
9 x_{10} 1.73458308 100
10 x_{11} 2.61002674 100
11 x_{12} 1.86470439 100
12 x_{13} 3.48909637 100
13 x_{14} 2.77272976 100
14 x_{15} 2.56175439 100
The wrapper script __run_eee.sh to run the Efficient Elementary Effects is again provided to ease the whole process of sequentially setting up new iterations. The calling sequence is:
./__run_eee.sh -s All -x 2_run_model_oakley-ohagan.py examples/oakley-ohagan/
The results after the first iteration, which is the iteration that determines the cutoff threshold, are plotted and shown in Figure 2.
Before you can use this example, you need to replace the RAVEN executable examples/raven-hmets/model/Raven.exe with an executable compiled on your system. You can download executables from the Raven webpage. Please test whether they run on your system. If not, please download the source code from the webpage and compile it using make. Then put the executable in the folder examples/raven-hmets/model/. It needs to be named Raven.exe.
The three modular scripts to use for the traditional method are explained in detail for the Ishigami-Homma function. Only minor changes need to be made such that you can run the analysis with the hydrologic modelling framework RAVEN[5]. The provided configuration of RAVEN is an emulation of the hydrologic model HMETS[6]. This model has 21 parameters that are all uniformly distributed in different intervals. The details, including which parameter corresponds to which hydrologic process, are specified in examples/raven-hmets/parameters.dat. The model is set up for the Salmon River watershed, located about 700 km north of Vancouver in the Rocky Mountains of British Columbia, Canada. The watershed has a drainage area of about 4230 km2. The elevation is 606 m above sea level at the streamflow gauge station of the Salmon River near Prince George (WSC gauge # 08KC001).
The script that runs the RAVEN framework and gathers the model outputs Nash-Sutcliffe efficiency nse, streamflow Q, and infiltration infiltration is codes/2_run_model_raven-hmets.py.
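Because RAVEN is an external executable, the model-runner script has to write each parameter set into the model's input files, call the executable, and parse the outputs. A generic sketch is given below; it is not the shipped codes/2_run_model_raven-hmets.py, and write_raven_input, parse_raven_output, and the exact command line of Raven.exe are hypothetical placeholders:

import pickle
import subprocess
import numpy as np

para_sets = np.loadtxt("examples/raven-hmets/parameter_sets_1_scaled_para21_M.dat")

nse, q, infiltration = [], [], []
for pset in para_sets:
    write_raven_input(pset)                  # hypothetical: fill the parameter values
                                             # into the RAVEN/HMETS input templates
    subprocess.run(["./Raven.exe"],          # placeholder call; the actual command
                   cwd="examples/raven-hmets/model",
                   check=True)               # line depends on the model setup
    out = parse_raven_output()               # hypothetical: read NSE, streamflow, and
    nse.append(out["nse"])                   # infiltration from the RAVEN output files
    q.append(out["Q"])
    infiltration.append(out["infiltration"])

model_output = {"nse": np.array(nse),
                "Q": np.array(q),
                "infiltration": np.array(infiltration)}
with open("examples/raven-hmets/model_output.pkl", "wb") as f:
    pickle.dump(model_output, f)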
Sampling 100 trajectories:
python codes/1_create_parameter_sets.py -t 100 -d examples/raven-hmets/parameters.dat -o examples/raven-hmets/parameter_sets
Run model ((21+1) × 100 = 2200 model runs):
python codes/2_run_model_raven-hmets.py -i examples/raven-hmets/parameter_sets_1_scaled_para21_M.dat -o examples/raven-hmets/model_output.pkl
Derive Elementary Effects (for all three model outputs):
python codes/3_derive_elementary_effects.py -i examples/raven-hmets/model_output.pkl -k All -d examples/raven-hmets/parameters.dat -m examples/raven-hmets/parameter_sets_1_para21_M.dat -v examples/raven-hmets/parameter_sets_1_para21_v.dat -o examples/raven-hmets/eee_results.dat
Results stored in examples/raven-hmets/eee_results.dat:
# model output #1: Q
# model output #2: nse
# model output #3: infiltration
# ii para_name elemeffect(ii),ii=1:21,jj=1:3 counter(ii),ii=1:21,jj=1:3
0 x_{01} 2.54554278 0.13297532 0.00000000 100 100 100
1 x_{02} 4.27775411 0.43124328 0.00000000 100 100 100
2 x_{03} 0.45490522 0.02007915 0.00000000 100 100 100
3 x_{04} 2.76840395 0.13228917 0.00000000 100 100 100
4 x_{05} 24.56462733 2.28217199 1.03493986 100 100 100
5 x_{06} 5.61057228 0.33327705 0.21359006 100 100 100
6 x_{07} 36.22319719 3.52295715 0.68957607 100 100 100
7 x_{08} 4.32480290 0.37956917 0.04512969 100 100 100
8 x_{09} 1.71513465 0.30849645 0.03531561 100 100 100
9 x_{10} 1.24891342 0.19245682 0.02841514 100 100 100
10 x_{11} 1.76195779 0.45844115 0.03242172 100 100 100
11 x_{12} 1.35358088 0.09446752 0.07424898 100 100 100
12 x_{13} 7.80307293 0.53965099 0.15902838 100 100 100
13 x_{14} 1.26910122 0.02752010 0.10690308 100 100 100
14 x_{15} 34.49337594 3.51326781 0.44500839 100 100 100
15 x_{16} 16.36373398 2.97578715 0.07537469 100 100 100
16 x_{17} 3.16906820 0.14483982 0.03745722 100 100 100
17 x_{18} 25.21024661 1.08926495 0.18938519 100 100 100
18 x_{19} 10.64999714 0.50688198 0.22960276 100 100 100
19 x_{20} 43.19056140 9.40029330 0.84881332 100 100 100
20 x_{21} 2.32017696 0.16474733 0.00756527 100 100 100
The wrapper script __run_eee.sh to run the Efficient Elementary Effects is again provided to ease the whole process of sequentially setting up new iterations. The calling sequence to run the analysis with all three model outputs (Nash-Sutcliffe efficiency nse, streamflow Q, and infiltration infiltration) in a multi-objective fashion is:
./__run_eee.sh -s All -x 2_run_model_raven-hmets.py examples/raven-hmets/
The results after the first iteration, which is the iteration that determines the cutoff threshold, are plotted and shown in Figure 3 (for NSE nse), Figure 4 (for streamflow Q), and Figure 5 (for infiltration).
After several iterations, only two of the 21 parameters are identified as noninformative regarding all three model outputs.
Info about EEE analysis
------------------------------------------------
Number of informative parameters :: 19
Number of model runs in total :: 146
Number of iterations :: 6
The two non-informative parameters are x3 and x4:
- x3: Convolution (delayed runoff) parameter GAMMA_SHAPE2 in RAVEN and alpha2 in HMETS
- x4: Convolution (delayed runoff) parameter GAMMA_SCALE2 in RAVEN and beta2 in HMETS
Be aware that the results will differ if the EEE analysis is performed regarding only one of the three model outputs. For example, an analysis regarding streamflow Q only, using:
./__run_eee.sh -s Q -x 2_run_model_raven-hmets.py examples/raven-hmets/
will result in:
Info about EEE analysis
------------------------------------------------
Number of informative parameters :: 15
Number of model runs in total :: 172
Number of iterations :: 4
which means that 15 of the 21 parameters are informative. The 6 non-informative parameters regarding streamflow are:
- x3: Convolution (delayed runoff) parameter GAMMA_SHAPE2 in RAVEN and alpha2 in HMETS
- x4: Convolution (delayed runoff) parameter GAMMA_SCALE2 in RAVEN and beta2 in HMETS
- x9: Snow balance parameter SNOW_SWI_MIN in RAVEN and fcmin in HMETS
- x10: Snow balance parameter SNOW_SWI_MAX in RAVEN and fcmin_plus in HMETS
- x11: Snow balance parameter SWI_REDUCT_COEFF in RAVEN and Ccum in HMETS
- x14: Snow balance parameter REFREEZE_EXP in RAVEN and exp_fe in HMETS
[1] Morris, M. D. (1991). Factorial sampling plans for preliminary computational experiments. Technometrics, 33(2), 161–174.
[2] Ishigami, T., & Homma, T. (1990). An importance quantification technique in uncertainty analysis for computer models. Uncertainty Modeling and Analysis.
[3] Cuntz, M., Mai, J., Zink, M., Thober, S., Kumar, R., Schäfer, D., et al. (2015). Computationally inexpensive identification of noninformative model parameters by sequential screening. Water Resources Research, 51(8), 6417–6441.
[4] Oakley, J. E., & O'Hagan, A. (2004). Probabilistic sensitivity analysis of complex models: a Bayesian approach. Journal of the Royal Statistical Society, Series B, 66, 751–769.
[5] Craig, J. R. (2019). Raven: User's and Developer's Manual v2.9.1 (pp. 1–182). University of Waterloo.
[6] Martel, J.-L., Demeester, K., Brissette, F., Poulin, A., & Arsenault, R. (2017). HMETS—A Simple and Efficient Hydrology Model for Teaching Hydrological Modelling, Flow Forecasting and Climate Change Impacts. International Journal of Engineering Education, 33(4), 1307–1316.
© 2019 - Efficient Elementary Effects (EEE) - juliane.mai@uwaterloo.ca