Conversation
nusbaume
left a comment
There was a problem hiding this comment.
Apologies @Isaaciwd and @brianpm for taking almost six months longer then I originally promised, but I finally got around to reviewing this PR. It's definitely some great work!
I have a fairly large amount of change requests, but hopefully most (all?) of them should be easy to resolve. Of course please let me know if you have any questions or concerns with any of my requests or suggestions, and thanks again for the effort!
|
|
||
| MODIS_emd_centers: | ||
| category: "Clouds" | ||
| obs_file: 'MODIS_emd-means_n_init5_centers_1.np' |
There was a problem hiding this comment.
I think this is missing an extra y at the end:
| obs_file: 'MODIS_emd-means_n_init5_centers_1.np' | |
| obs_file: 'MODIS_emd-means_n_init5_centers_1.npy' |
| import xarray as xr | ||
| import matplotlib as mpl | ||
| from mpl_toolkits.axes_grid1 import make_axes_locatable | ||
| from numba import njit |
There was a problem hiding this comment.
Is numba required for this script to run? If so then we should probably add it to the ADF-provided env/conda_environment.yaml file so that this script can be used with that environment as well.
| import os | ||
|
|
||
|
|
||
| #global num_iter, n_samples, data, ds, ht_var_name, tau_var_name, k, height_or_pressure |
There was a problem hiding this comment.
Is this line still needed? If not then I would go ahead and remove it.
| def cloud_regime_analysis(adf, wasserstein_or_euclidean = "euclidean", data_product='all', premade_cloud_regimes=None, lat_range=None, lon_range=None, only_ocean_or_land=False): | ||
| """ | ||
| This script/function is designed to generate 2-D lat/lon maps of Cloud Regimes (CRs), as well as plots of the CR | ||
| centers themselves. It can fit data into CRs using either Wassertstein (AKA Earth Movers Distance) or the more conventional |
There was a problem hiding this comment.
Typo here:
| centers themselves. It can fit data into CRs using either Wassertstein (AKA Earth Movers Distance) or the more conventional | |
| centers themselves. It can fit data into CRs using either Wasserstein (AKA Earth Movers Distance) or the more conventional |
| import glob | ||
| from math import ceil | ||
| import time | ||
| import dask |
There was a problem hiding this comment.
dask is not currently included in the ADF-provided env/conda_environment.yaml file. I would recommend adding it in this PR unless the use of dask is optional for this script.
| # Opening an initial dataset | ||
| init_ds_b = xr.open_dataset(files[0]) | ||
|
|
||
| print(f' -Starting {data} CAM baseline data') #testing |
| # Variable that gets set to true if var is missing in the data file, and is used to skip processing that dataset | ||
| missing_var = False | ||
|
|
||
| # Trying to open time series files from cam)ts_loc |
There was a problem hiding this comment.
Typo here:
| # Trying to open time series files from cam)ts_loc | |
| # Trying to open time series files from cam_ts_loc |
| # Masking out the land or water | ||
| if only_ocean_or_land == 'L': ds_b = ds_b.where(land == 1) | ||
| elif only_ocean_or_land == 'O': ds_b = ds_b.where(land == 0) | ||
| else: raise Exception('Invalid option for only_ocean_or_land: Please enter "O" for ocean only, "L" for land only, or set to False for both land and water') |
There was a problem hiding this comment.
Don't want to kill the ADF here:
| else: raise Exception('Invalid option for only_ocean_or_land: Please enter "O" for ocean only, "L" for land only, or set to False for both land and water') | |
| else: | |
| print('Invalid option for only_ocean_or_land: Please enter "O" for ocean only, "L" for land only, or set to False for both land and water') | |
| return |
| weights_b=weights_b[valid_indicies_b] | ||
|
|
||
| if np.min(mat_b < 0): | ||
| raise Exception (f'Found negative value in ds_b.{var_name}, if this is a fill value for missing data, convert to nans and try again') |
There was a problem hiding this comment.
Don't want to kill the ADF here:
| raise Exception (f'Found negative value in ds_b.{var_name}, if this is a fill value for missing data, convert to nans and try again') | |
| print(f'Found negative value in ds_b.{var_name}, if this is a fill value for missing data, convert to nans and try again') | |
| return |
| @@ -0,0 +1,1369 @@ | |||
| #%% | |||
| print() | |||
There was a problem hiding this comment.
I would delete this line, as I suspect this print statement will be called at a weird time in the ADF run, if it is called at all.
Add a cloud regime analysis script to the ADF.