@yuema137
Collaborator

This PR enforces deterministic sorting behavior by implementing the sort_enforcement module. Specifically, it:

  1. Makes mergesort the default sorting algorithm
  2. Disables the unstable sorting algorithms (quicksort and heapsort), which do not guarantee the relative order of equal elements
  3. Wraps np.sort and np.argsort to enforce the behaviors mentioned above

The implementation is robust to import order: the enforcement takes effect whenever strax is imported, regardless of whether numpy is imported before or after it. This ensures consistent sorting behavior across all strax operations and addresses the non-deterministic sorting issues reported in #916.

A unit test is also added accordingly.
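
As a rough illustration of the enforcement described above, here is a minimal sketch of how np.sort and np.argsort could be wrapped so that mergesort is the default and the unstable kinds are rejected. The names `enforce_stable`, `stable_sort`, `stable_argsort`, and `SortingError` are assumptions made for this example and are not necessarily the names or structure used in the sort_enforcement module.

```python
import numpy as np

# Kinds that do not preserve the relative order of equal elements.
UNSTABLE_KINDS = ("quicksort", "heapsort")


class SortingError(ValueError):
    """Hypothetical error raised when an unstable sorting kind is requested."""


def enforce_stable(sort_func):
    """Return a wrapper around `sort_func` that defaults to mergesort
    and refuses quicksort and heapsort."""

    def wrapper(arr, axis=-1, kind="mergesort", order=None):
        if kind in UNSTABLE_KINDS:
            raise SortingError(
                f"kind={kind!r} is not stable; use 'mergesort' or 'stable' instead."
            )
        return sort_func(arr, axis=axis, kind=kind, order=order)

    wrapper.__name__ = f"stable_{sort_func.__name__}"
    return wrapper


# Wrapped versions; the original numpy functions are left untouched here.
stable_sort = enforce_stable(np.sort)
stable_argsort = enforce_stable(np.argsort)

values = np.array([3, 1, 2, 1])
print(stable_sort(values))
print(stable_argsort(values))
```

This sketch builds separate wrapped functions instead of replacing the attributes on the numpy module, so it has no side effects on other code that imports numpy.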

@yuema137 yuema137 requested a review from dachengx October 23, 2024 22:48
@yuema137 yuema137 marked this pull request as draft October 23, 2024 22:48
@coveralls

coveralls commented Oct 23, 2024

Coverage Status

coverage: 90.074% (+0.009%) from 90.065%
when pulling 248a2a8 on set_default_as_mergesort
into c3dd2e1 on master.

@yuema137
Collaborator Author

Unfortunately, the kind for np.sort is hard-coded as quicksort in numba now:
https://github.com/numba/numba/blob/0f363d1b2dd19f2aa1a8cec5f0a99c3dd95512f8/numba/np/arrayobj.py#L6524

For np.argsort, mergesort and quicksort are both supported though.

So we have two options:

  1. Make a PR to numba to enable mergesort for np.sort (need to check if it's easy)
  2. Move all the sorting in strax outside of the numba decorators (need to check performance)

I'm looking into the feasibility of both options.
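
Since np.argsort accepts mergesort even where np.sort does not, one possible workaround is to obtain a stable sort by indexing the array with the result of a stable argsort. The sketch below shows the idea in plain NumPy only; the helper name `sort_via_stable_argsort` is made up for this example, and this is not necessarily the approach the PR ended up taking.

```python
import numpy as np


def sort_via_stable_argsort(values):
    """Sort `values` by indexing with a mergesort-based argsort.

    Returns the sorted array and the permutation that produced it.
    """
    order = np.argsort(values, kind="mergesort")
    return values[order], order


times = np.array([5, 2, 5, 2, 1])
sorted_times, order = sort_via_stable_argsort(times)
print(sorted_times)
print(order)  # equal values keep their original relative order
```

The permutation can also be reused to reorder other arrays that belong to the same records, which a plain sort of the values does not provide.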

@dachengx dachengx changed the title Add enforcement for np.sort and np.argsort Add enforcement for np.sort and np.argsort Nov 13, 2024
@yuema137 yuema137 requested a review from dachengx November 13, 2024 16:04
@dachengx dachengx marked this pull request as ready for review November 13, 2024 16:05
@yuema137 yuema137 requested a review from dachengx November 14, 2024 05:34
Collaborator

@dachengx dachengx left a comment

@dachengx
Collaborator

For future reference, this PR follows up on XENONnT/straxen#1176.

@dachengx dachengx merged commit 8489aa2 into master Nov 14, 2024
6 checks passed
@dachengx dachengx deleted the set_default_as_mergesort branch November 14, 2024 07:37
dachengx added a commit that referenced this pull request May 9, 2025
* set mergesort as default and disable unstable kinds

* add unittest

* formatting

* formatting

* change name to sort_enforcement

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* break long error messages

* keep the original sorting in numpy

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove unused import

* always use stablesort

* add numba-supported version of stableargsort

* use better naming for stablesort

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* use jitable to allow both regular function and numba-decorated function for highest_density_region

* remove redundant numba_sort

* explicitly import stablesort from strax for numba decorated functions

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* consistent import style within one module

* remove unused import

* add sorting error

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable numba support for stable_sort

* consistent import style for stable sort

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add kwargs

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* modify docstring for stable_sort

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove kwargs

* update variable name

* update test_sort with hypothesis

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rewrite highest_density_region to decouple stable_sort from numba part

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove unused import

* break long lines

* remove numba decorator for the main function

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix typo

* rewrite hitlets to use non-numba HDR region

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* format hitlets.py

* unify growing_result import to fix mypy error

* remove redundant space

* Remove unnecessary indent

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: dachengx <dx2227@columbia.edu>