
Conversation

@cakedev0
Contributor

Fixes: #389

To test the sort stability of array libraries, you usually need arrays of more than 16 elements (for fewer than 16 elements, libraries often fall back to an inherently stable small-array sort, even when an unstable algorithm is requested).

Because this was not tested properly, a gap in array-api-compat went undetected by this test suite. See issue data-apis/array-api-compat#354. You can try running:

ARRAY_API_TESTS_VERSION="2024.12" ARRAY_API_TESTS_MODULE=array_api_compat.torch pytest array_api_tests/test_sorting_functions.py

with this branch of array-api-tests to see that it indeed fails.
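
(A minimal sketch, not part of the test suite, of why the array length matters when checking stability: for a constant array, a stable argsort must return the identity permutation, but many libraries only use a stable small-array sort below roughly 16 elements, so the check needs longer inputs to be meaningful.)

import numpy as np

def is_stable_on_constant_input(xp, n):
    # argsort of an all-equal array is the identity permutation iff the sort
    # preserved the original order of equal elements (i.e. it is stable).
    out = xp.argsort(xp.zeros(n, dtype=xp.int8))
    return bool(xp.all(out == xp.arange(n)))

print(is_stable_on_constant_input(np, 16))  # often True even for unstable defaults
print(is_stable_on_constant_input(np, 17))  # True only if the default sort is stable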

@ev-br
Member

ev-br commented Oct 27, 2025

Confirmed it errors out, but I don't understand the failure TBH. What goes wrong with sorting an array full of unsigned int8 zeros? Could you please explain?

>                       ph.assert_scalar_equals("argsort", type_=int, idx=idx, out=int(out[idx]), expected=o, kw=kw)
E                       AssertionError: out[(0,)]=8, but should be 0 [argsort()]
E                       
E                       ========== FAILING CODE SNIPPET:
E                       xp.argsort(tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=torch.uint8), **kw) with kw = {}
E                       ====================
E                       
E                       Falsifying example: test_argsort(
E                           x=tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=torch.uint8),
E                           data=data(...),
E                       )
E                       Draw 1 (kw): {}
E                       Explanation:
E                           These lines were always and only run by failing examples:
E                               /home/br/repos/array-api-tests/array_api_tests/test_sorting_functions.py:92

@cakedev0
Contributor Author

When you stable-argsort a constant array, you expect the output to be [0, 1, 2, ..., n - 1], but if the sort is actually not stable, that won't be the case (the result will just be some arbitrary permutation of [0, 1, 2, ..., n - 1]), hence the error. It's exactly what's described in data-apis/array-api-compat#354.

Note: The test would be much more explicit without using hypothesis; it would look like:
assert xp.all(xp.argsort(xp.asarray([0] * 30)) == xp.arange(30))
Maybe I can write a simple regression test like this?
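
(A hypothetical sketch of what such a regression test could look like in array-api-compat's tests/test_torch.py; the name and placement are illustrative, not the actual test:)

import array_api_compat.torch as xp

def test_argsort_stable_by_default():
    # The array API default is stable=True, so on a constant array argsort
    # must return the identity permutation, even above the ~16-element cutoff
    # where bare torch would otherwise pick an unstable algorithm.
    x = xp.asarray([0] * 30)
    assert xp.all(xp.argsort(x) == xp.arange(30))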

@ev-br
Member

ev-br commented Oct 28, 2025

Thanks for the explanation!
Indeed, TIL that the default argsorting strategy in pytorch differs between 16 and 17 elements:

In [1]: import torch

In [2]: t = torch.zeros(16)

In [3]: torch.argsort(t)
Out[3]: tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])

In [4]: t = torch.zeros(17)

In [5]: torch.argsort(t)
Out[5]: tensor([ 8, 16, 15, 14, 13, 12, 11, 10,  9,  0,  7,  6,  5,  4,  3,  2,  1])
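
(For context, not part of the original session: bare torch defaults to stable=False, while the array API spec defaults to stable=True. On a torch version whose argsort exposes the stable keyword, forcing it recovers the identity permutation:)

torch.argsort(t, stable=True)
# -> tensor([ 0,  1,  2, ..., 16]), the identity permutation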

Note: The test would be much more explicit without using hypothesis, it would look like:
assert xp.all(xp.argsort(xp.asarray([0] * 30)) == xp.arange(30))
Maybe I can write a simple regression test like this?

That would be nice indeed. We tend to keep library-specific tests for behaviors which differ between "bare" and array-api-compat wrapped libraries in array-api-compat itself, so this test would be a great addition to
https://github.com/data-apis/array-api-compat/blob/main/tests/test_torch.py
in data-apis/array-api-compat#356, which needs a small tweak anyway.

@cakedev0
Contributor Author

cakedev0 commented Oct 28, 2025

the default argsorting strategy in pytorch differs between 16 and 17 elements:

Note that it's probably a common trait of sorting implementations; in numpy:

np.argsort(np.zeros(16, dtype='int8'))  # stable
np.argsort(np.zeros(17, dtype='int8'))  # unstable

(you don't see this behavior for dtypes that benefit from SIMD optimization though)

It seems that it's a common optimization to switch to insertion sort for small arrays, and insertion sort is inherently stable. See https://github.com/numpy/numpy/blob/main/numpy/_core/src/npysort/quicksort.cpp#L70 in numpy.
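
(As an aside, not from the thread: numpy also exposes an explicitly stable algorithm via kind='stable', which keeps the identity permutation regardless of length:)

np.argsort(np.zeros(17, dtype='int8'), kind='stable')  # array([ 0,  1, ..., 16])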

@betatim
Member

betatim left a comment

This looks good to me.

I had to go clicking through the many layers of hypothesis and the array API helpers but finally got to https://github.com/HypothesisWorks/hypothesis/blob/7b93827d14d21ec27cf3e2fd45ee5ace904082f1/hypothesis-python/src/hypothesis/extra/_array_helpers.py#L81 which helped me understand what min_side and max_side do :D

Given that you have to click a lot to get there, and that min_side doesn't exactly scream "this is the size of the array", I think it would be good to add a comment about what the goal of max_side=50 is (you have to understand what "side" means and that 16/17 is a magic number; neither is really explicit from max_side=50, if you ask me). Just to make the life of people from the future easier.

The best solution would be if we could explicitly add a "small size" and a "large size" in addition to those that hypothesis generates. That way it would be clear what the magic numbers (16 and 17) are. But maybe this is too big a change for the scope of this PR (which is nice and compact).
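
(For readers unfamiliar with these hypothesis parameters, a rough illustration using bare hypothesis.extra.numpy rather than the suite's own shape helper: min_side/max_side bound the length of each generated dimension, so max_side=50 lets the strategy draw 1-D arrays well past the 16-element cutoff:)

from hypothesis import given
from hypothesis.extra.numpy import array_shapes

@given(array_shapes(min_dims=1, max_dims=1, min_side=0, max_side=50))
def test_side_bounds(shape):
    # each drawn shape is a 1-tuple whose single "side" is at most 50
    assert 0 <= shape[0] <= 50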

@cakedev0
Contributor Author

I agree this would be quite hard to understand for future readers. I added a short comment that references issue #389; that should already make it much easier to understand now.

@ev-br ev-br merged commit fa42c24 into data-apis:master Oct 28, 2025
3 checks passed
@ev-br
Member

ev-br commented Oct 28, 2025

Thanks @cakedev0 for getting to the bottom of this, thanks @betatim for the review!
I agree both that the meaning of these parameters is very well hidden in the layers and layers of hypothesis and test suite helpers, and that the patch is nice and concise. So let's land this as is, and if somebody feels like further improving it with "large" and "small" arrays, that'd be a great follow-up.

ev-br added a commit to ev-br/array-api-compat that referenced this pull request Oct 28, 2025
cross-ref data-apis#356
which wrapped torch.argsort to fix the default, and
data-apis/array-api-tests#390
which made a matching change in the array-api-test suite.