fill() performance #57

@belm0

For the use case of collecting execution time samples during a program run (and ultimately reporting quantiles), I'd like fill() to be fairly fast.

physt's fill() execution time seems to be independent of the binning strategy (my tests trivially fill with a constant data value). I was surprised that the bin search is implemented via np.searchsorted() in all cases, even for fixed_width binning, where the bin index could be computed directly from the value.

$ python -m timeit -s 'from physt import h1; h = h1(None, "exponential", 100, range=(1e-6, 1))' 'h.fill(.1)'
10000 loops, best of 5: 38 usec per loop

$ python -m timeit -s 'from physt import h1; h = h1(None, "fixed_width", .01, range=(0, .5))' 'h.fill(.1)'
10000 loops, best of 5: 36.9 usec per loop
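
To illustrate what I mean for the fixed_width case, here is a minimal sketch (not physt's actual code; the function names and the -1 "out of range" convention are my own assumptions). It compares a binary search over the bin edges, using the standard library's bisect as a stand-in for np.searchsorted(), with an O(1) arithmetic lookup that is only valid when all bins have the same width.

```python
import bisect


def make_edges(low, high, n_bins):
    """Build n_bins + 1 equally spaced bin edges covering [low, high]."""
    return [low + i * (high - low) / n_bins for i in range(n_bins + 1)]


def bin_index_search(edges, value):
    """Generic lookup: binary search over sorted edges (works for any binning).

    Returns -1 when the value falls outside [edges[0], edges[-1]).
    """
    if value < edges[0] or value >= edges[-1]:
        return -1
    return bisect.bisect_right(edges, value) - 1


def bin_index_fixed_width(low, high, n_bins, value):
    """Fixed-width lookup: compute the index arithmetically, no search needed.

    Returns -1 when the value falls outside [low, high).
    """
    if value < low or value >= high:
        return -1
    index = int((value - low) * n_bins / (high - low))
    # Guard against floating-point rounding pushing a value just below
    # `high` into a non-existent bin.
    return min(index, n_bins - 1)


# Same shape as the fixed_width timing above: width .01 over (0, .5) is 50 bins.
edges = make_edges(0.0, 0.5, 50)
print(bin_index_search(edges, 0.1), bin_index_fixed_width(0.0, 0.5, 50, 0.1))
```

Values that sit exactly on a bin edge can land in different bins under the two approaches because of floating-point rounding, so a real implementation would need to pick and document one convention.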

Comparing to (unmaintained) https://github.com/carsonfarmer/streamhist:

$ python -m timeit -s 'from streamhist import StreamHist; h = StreamHist()' 'h.update(.1)'
50000 loops, best of 5: 7.52 usec per loop

(aside: streamhist is quite nice about managing binning and being able to report arbitrary quantiles. Perhaps some of it could be adopted.)
