For the use case of collecting execution time samples during a program run (and ultimately reporting quantiles), I'd like fill() to be fairly fast.
physt fill() execution time seems to be independent of the binning strategy (my tests trivially fill a constant data value). I was surprised that the bin search is implemented via np.searchsorted() in all cases, even for fixed_width binning.
$ python -m timeit -s 'from physt import h1; h = h1(None, "exponential", 100, range=(1e-6, 1))' 'h.fill(.1)'
10000 loops, best of 5: 38 usec per loop
$ python -m timeit -s 'from physt import h1; h = h1(None, "fixed_width", .01, range=(0, .5))' 'h.fill(.1)'
10000 loops, best of 5: 36.9 usec per loop
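To illustrate why fixed-width binning does not need a search: the bin index can be computed with one subtraction and one division. The sketch below is not physt's code. The function names are hypothetical, `bisect.bisect_right` stands in for `np.searchsorted(..., side="right")` to keep the example dependency-free, and returning -1 for out-of-range values is an assumption made for the illustration.

```python
import bisect
import math


def fixed_width_index(value, lo, width, nbins):
    """O(1) lookup: bin index for half-open bins [lo + i*width, lo + (i+1)*width)."""
    idx = math.floor((value - lo) / width)
    # -1 marks a value outside the binned range (a choice made for this sketch).
    return idx if 0 <= idx < nbins else -1


def search_index(value, edges):
    """O(log n) lookup over sorted edges, analogous to a searchsorted-based fill."""
    idx = bisect.bisect_right(edges, value) - 1
    return idx if 0 <= idx < len(edges) - 1 else -1


# Four bins of width 0.25 covering [0, 1); these values are exact in binary
# floating point, so both lookups agree even on the bin edges.
lo, width, nbins = 0.0, 0.25, 4
edges = [lo + i * width for i in range(nbins + 1)]

for v in (-0.1, 0.0, 0.3, 0.5, 0.99, 1.0):
    assert fixed_width_index(v, lo, width, nbins) == search_index(v, edges)
```

With widths such as 0.01 that are not exactly representable, the two lookups can disagree for values sitting right on a bin edge, so a real implementation would need to decide how to handle that case.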
Comparing to (unmaintained) https://github.com/carsonfarmer/streamhist:
$ python -m timeit -s 'from streamhist import StreamHist; h = StreamHist()' 'h.update(.1)'
50000 loops, best of 5: 7.52 usec per loop
(aside: streamhist is quite nice about managing binning and being able to report arbitrary quantiles. Perhaps some of it could be adopted.)
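For reference, a rough idea of what quantile reporting could look like on top of an ordinary binned histogram. This is a minimal sketch, not streamhist's algorithm and not an existing physt API: `approx_quantile` is a hypothetical helper that takes plain lists of bin edges and counts and assumes samples are spread uniformly within each bin.

```python
def approx_quantile(edges, counts, q):
    """Estimate the q-quantile (0 <= q <= 1) from bin edges and per-bin counts.

    Assumes samples are uniformly distributed inside each bin, so the result
    is only as precise as the binning allows.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram is empty")
    target = q * total  # number of samples that should lie below the quantile
    cumulative = 0
    for i, count in enumerate(counts):
        if count and cumulative + count >= target:
            # Interpolate linearly inside the bin that contains the target rank.
            fraction = (target - cumulative) / count
            return edges[i] + fraction * (edges[i + 1] - edges[i])
        cumulative += count
    return edges[-1]


edges = [0.0, 1.0, 2.0, 3.0]
counts = [1, 2, 1]
median = approx_quantile(edges, counts, 0.5)
```

For execution-time samples, exponential bins would keep the relative error of such an estimate roughly constant across several orders of magnitude.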