Conversation
|
Let me know if you have any performance issues with ROC in the scores package. I have some ideas on how to significantly speed up the AUC calculation there! |
|
@nicholasloveday I think I ran into a scores bug with dask here. It seems that there's code in ...
if fcst.max().item() > 1 or fcst.min().item() < 0:
...Where Traceback: """
Traceback (most recent call last):
File "/home/taylor/code/ExtremeWeatherBench/.venv/lib/python3.13/site-packages/xarray/computation/ops.py", line 198, in _call_possibly_missing_method
method = getattr(arg, name)
AttributeError: 'Array' object has no attribute 'item'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/taylor/code/ExtremeWeatherBench/.venv/lib/python3.13/site-packages/joblib/externals/loky/process_executor.py", line 490, in _process_worker
r = call_item()
File "/home/taylor/code/ExtremeWeatherBench/.venv/lib/python3.13/site-packages/joblib/externals/loky/process_executor.py", line 291, in __call__
return self.fn(*self.args, **self.kwargs)
~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/taylor/code/ExtremeWeatherBench/.venv/lib/python3.13/site-packages/joblib/parallel.py", line 607, in __call__
return [func(*args, **kwargs) for func, args, kwargs in self.items]
~~~~^^^^^^^^^^^^^^^^^
File "/home/taylor/code/ExtremeWeatherBench/src/extremeweatherbench/evaluate.py", line 409, in compute_case_operator
_evaluate_metric_and_return_df(
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
forecast_ds=aligned_forecast_ds,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...<5 lines>...
**metric_kwargs,
^^^^^^^^^^^^^^^^
)
^
File "/home/taylor/code/ExtremeWeatherBench/src/extremeweatherbench/evaluate.py", line 565, in _evaluate_metric_and_return_df
metric_result = metric.compute_metric(
forecast_data,
target_data,
**kwargs,
)
File "/home/taylor/code/ExtremeWeatherBench/src/extremeweatherbench/metrics.py", line 44, in _compute_metric_with_docstring
return _original_compute_metric(self, *args, **kwargs)
File "/home/taylor/code/ExtremeWeatherBench/src/extremeweatherbench/metrics.py", line 44, in _compute_metric_with_docstring
return _original_compute_metric(self, *args, **kwargs)
File "/home/taylor/code/ExtremeWeatherBench/src/extremeweatherbench/metrics.py", line 44, in _compute_metric_with_docstring
return _original_compute_metric(self, *args, **kwargs)
File "/home/taylor/code/ExtremeWeatherBench/src/extremeweatherbench/metrics.py", line 138, in compute_metric
return self._compute_metric(forecast, target, **kwargs)
~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/taylor/code/ExtremeWeatherBench/src/extremeweatherbench/metrics.py", line 708, in _compute_metric
roc_curve_data = super()._compute_metric(forecast, target, **kwargs)
File "/home/taylor/code/ExtremeWeatherBench/src/extremeweatherbench/metrics.py", line 684, in _compute_metric
return scores.probability.roc_curve_data(
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
binary_forecast,
^^^^^^^^^^^^^^^^
...<3 lines>...
weights=None,
^^^^^^^^^^^^^
)
^
File "/home/taylor/code/ExtremeWeatherBench/.venv/lib/python3.13/site-packages/scores/plotdata/roc_impl.py", line 132, in roc
if fcst.max().item() > 1 or fcst.min().item() < 0:
~~~~~~~~~~~~~~~^^
File "/home/taylor/code/ExtremeWeatherBench/.venv/lib/python3.13/site-packages/xarray/computation/ops.py", line 210, in func
return _call_possibly_missing_method(self.data, name, args, kwargs)
File "/home/taylor/code/ExtremeWeatherBench/.venv/lib/python3.13/site-packages/xarray/computation/ops.py", line 200, in _call_possibly_missing_method
duck_array_ops.fail_on_dask_array_input(arg, func_name=name)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
File "/home/taylor/code/ExtremeWeatherBench/.venv/lib/python3.13/site-packages/xarray/core/duck_array_ops.py", line 117, in fail_on_dask_array_input
raise NotImplementedError(msg % func_name)
NotImplementedError: 'item' is not yet a valid method on dask arrays
""" |
|
Hi @aaTman , you need to set |
|
The error message indicates that the dask team would ideally implement the required functionality if time allowed (NotImplementedError: 'item' is not yet a valid method on dask arrays) ... we can put something into |
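For readers following the traceback: the failing check calls `.item()` on the result of a reduction over a dask-backed `DataArray`, which xarray rejects. Below is a minimal sketch of one possible workaround, assuming xarray and dask are installed. The helper name `check_probability_range` and the error message are hypothetical, and this is not necessarily how scores resolved the issue.

```python
import numpy as np
import xarray as xr


def check_probability_range(fcst: xr.DataArray) -> None:
    """Raise ValueError if any forecast value lies outside [0, 1].

    Hypothetical helper for illustration only; not part of the scores API.
    """
    # .compute() loads the reduced scalar into memory, so .item() is then
    # called on a NumPy-backed DataArray instead of a dask array.
    fcst_max = fcst.max().compute().item()
    fcst_min = fcst.min().compute().item()
    if fcst_max > 1 or fcst_min < 0:
        raise ValueError("`fcst` contains values outside of the range [0, 1]")


# Made-up data: a small dask-backed forecast array (chunking requires dask).
fcst = xr.DataArray(np.linspace(0.0, 1.0, 8), dims="time").chunk({"time": 4})
check_probability_range(fcst)  # completes without the NotImplementedError
```

Only the two scalar reductions are computed eagerly here; the forecast array itself stays lazy. Calling `.compute()` on a NumPy-backed `DataArray` is also valid, so the same check works for both kinds of input.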
|
Thanks @nicholasloveday and @tennlee! I need to consider exactly how to implement |
|
No problem. I wasn't really across this issue until 20 minutes ago, so I don't have anything to contribute on recommended workarounds until I've had some time to fully grok what's going on. |
|
Okay, I think we fixed this issue, and scores 2.5.0 has just been published. If you can try the new version and let us know if there are still any issues, that would be great. |
|
It's also worth noting: this fixes the dask error, but if you just want to calculate ROC AUC and not the actual ROC plots, there is a much faster way of calculating it (using the Mann-Whitney U approach). I'm hoping to add that to the following scores release. |
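To illustrate the Mann-Whitney U approach mentioned above, here is a minimal NumPy sketch. It assumes 1-D inputs with no NaNs and no weights. The function name `roc_auc_mann_whitney` is hypothetical and is not the scores API, and the eventual scores implementation may differ.

```python
import numpy as np


def roc_auc_mann_whitney(fcst, obs):
    """ROC AUC via the Mann-Whitney U statistic (illustrative sketch).

    fcst: forecast probabilities or scores; obs: binary outcomes (0/1).
    Returns NaN when only one class is present.
    """
    fcst = np.asarray(fcst, dtype=float).ravel()
    obs = np.asarray(obs).ravel().astype(bool)
    n_pos = int(obs.sum())
    n_neg = obs.size - n_pos
    if n_pos == 0 or n_neg == 0:
        return float("nan")

    # Assign 1-based midranks so tied forecasts share their average rank.
    _, inverse, counts = np.unique(fcst, return_inverse=True, return_counts=True)
    end_ranks = np.cumsum(counts)
    start_ranks = end_ranks - counts + 1
    ranks = ((start_ranks + end_ranks) / 2.0)[inverse]

    # U counts (event, non-event) pairs where the event has the higher
    # forecast, with ties counted as half a pair.
    u_stat = ranks[obs].sum() - n_pos * (n_pos + 1) / 2.0
    return float(u_stat / (n_pos * n_neg))


print(roc_auc_mann_whitney([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```

The cost is dominated by a single sort, rather than evaluating a contingency table at every threshold, which is why this route can be much cheaper when only the AUC is needed.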
EWB Pull Request
Description
Adds an implementation of the Receiver Operating Characteristic Skill Score (ROCSS) metric.
This probabilistic metric has been found to be relatively insensitive to the rarity of hydro-climatological events, which makes it suitable for addition into the benchmarking suite. [ref]
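As a rough illustration of the metric, the sketch below assumes the common definition in which the ROC skill score rescales the AUC against a no-skill reference AUC of 0.5, giving 2 * AUC - 1. The function name `roc_skill_score` is hypothetical, and the definition used in this PR may differ.

```python
def roc_skill_score(auc: float, auc_reference: float = 0.5) -> float:
    """Skill of `auc` relative to a reference AUC (illustrative sketch).

    With the default no-skill reference of 0.5 this reduces to 2 * auc - 1:
    1 is a perfect forecast and 0 is no better than the reference.
    """
    return (auc - auc_reference) / (1.0 - auc_reference)


print(roc_skill_score(0.75))  # 0.5
```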
Type of change
How Has This Been Tested?
Unit tests
Checklist: