Improve error handling for numpy.histogram binning errors #68
Description
I encountered a ValueError when using rio_stac to process certain datasets (usually small tiles, please check attached S2 tile). The error originates from a call to numpy.histogram, specifically with the message "Too many bins for data range. Cannot create 10 finite-sized bins." (https://github.com/numpy/numpy/blob/0532af47d6a815298b7841de00bdbc547104b237/numpy/lib/_histograms_impl.py#L449)
This occurs when the data range is extremely small (e.g., all values are identical or nearly identical), which makes it impossible for numpy to calculate a histogram with the requested number of bins.
While this is an issue with the underlying data, the raw numpy error message is not very user-friendly. It would be beneficial for rio_stac to catch this specific ValueError and handle it more gracefully, providing a more informative message to the user about the data's characteristics and why a histogram could not be generated.
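A minimal sketch of what such handling could look like. The helper name `safe_histogram` and the wording of the new message are hypothetical and not part of rio_stac. It matches on the "Too many bins" text quoted above and re-raises any other ValueError unchanged.

```python
import numpy


def safe_histogram(arr, bins=10):
    """Hypothetical wrapper around numpy.histogram for masked arrays.

    Re-raises numpy's binning error with a message that describes the data.
    """
    # Keep only the unmasked values as a plain 1-D ndarray.
    values = arr.compressed()
    try:
        return numpy.histogram(values, bins=bins)
    except ValueError as exc:
        # Only translate the specific binning error; let anything else through.
        if "Too many bins" not in str(exc):
            raise
        raise ValueError(
            f"Could not compute a {bins}-bin histogram: the valid data range "
            f"[{values.min()}, {values.max()}] is too narrow to split into "
            f"{bins} finite-sized bins (all values identical or nearly identical)."
        ) from exc
```

Chaining with `from exc` keeps the original numpy traceback available for debugging.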
The error is raised here, at line 163 in a942d08:
sample, edges = numpy.histogram(arr[~arr.mask])
As a workaround, we currently use an np.allclose check and, when it passes, fall back to the minimal schema-conformant histogram information (3 bins, min, max, and the first bucket holding all entries of the unique value).
Please note: to reproduce this, use e.g. public.ecr.aws/docker/library/python:3.11.8. Depending on the numpy build and linked libraries, the error may not occur.
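For illustration, the workaround could be sketched roughly as follows. The function name `histogram_or_fallback`, the dictionary keys (`count`, `min`, `max`, `buckets`), and the `min_buckets` parameter are assumptions made for this example and may not match what rio_stac emits.

```python
import numpy


def histogram_or_fallback(arr, bins=10, min_buckets=3):
    """Hypothetical sketch: skip numpy.histogram when the data is constant."""
    # Keep only the unmasked values as a plain 1-D ndarray.
    values = arr.compressed()

    # If every valid value is (nearly) identical, build a minimal histogram
    # by hand: the first bucket holds all values, the rest stay empty.
    if values.size and numpy.allclose(values, values[0]):
        return {
            "count": min_buckets,
            "min": float(values.min()),
            "max": float(values.max()),
            "buckets": [int(values.size)] + [0] * (min_buckets - 1),
        }

    # Otherwise compute the histogram as usual.
    counts, edges = numpy.histogram(values, bins=bins)
    return {
        "count": len(counts),  # assumption: number of buckets
        "min": float(edges.min()),
        "max": float(edges.max()),
        "buckets": counts.tolist(),
    }
```

The check runs before numpy.histogram is called, so the result does not depend on whether a given numpy build raises the error.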
Thanks, Erik