Skip to content

Conversation

@MartinBraquet
Copy link
Contributor

@MartinBraquet MartinBraquet commented May 19, 2025

PR is ready for review.

@MartinBraquet MartinBraquet requested a review from Dr-Irv as a code owner May 19, 2025 11:11
@MartinBraquet MartinBraquet changed the title Added :DataFrame.nsorted to select top n rows according to column-dependent order Added DataFrame.nsorted to select top n rows according to column-dependent order May 19, 2025
@MartinBraquet MartinBraquet changed the title Added DataFrame.nsorted to select top n rows according to column-dependent order ENH: Added DataFrame.nsorted to select top n rows according to column-dependent order May 19, 2025
@github-actions
Copy link
Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Jun 19, 2025
@MartinBraquet
Copy link
Contributor Author

The PR has been waiting for reviewers since then.

@MartinBraquet
Copy link
Contributor Author

MartinBraquet commented Jul 20, 2025

@rhshadrach @snitish @Dr-Irv

Copy link
Member

@rhshadrach rhshadrach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! Some minor comments. Can you also add this API docs in doc/source/reference/frame.rst and series.rst.

Comment on lines 419 to 420
# Arguments for nsorted, nsmallest and nlargest.
NsmallestNlargestKeep = Literal["first", "last", "all"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove TypeAlias? Also, can you keep the style here consistent by not ending with a period .

Copy link
Contributor Author

@MartinBraquet MartinBraquet Jul 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason, good catch! Note to myself: add the type after extracting a constant in Pycharm, because the IDE won't do it for me (so far).

Comment on lines 7751 to 7752
This method is equivalent to
``df.nsorted(n, columns, ascending=False)``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer if this line was removed; including nsorted in the See Also section below seems sufficient.

Comment on lines 7895 to 7896
This method is equivalent to
``df.nsorted(n, columns, ascending=True)``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

dtype = self.obj.dtype
if not self.is_valid_dtype_n_method(dtype):
raise TypeError(f"Cannot use method '{method}' with dtype {dtype}")
raise TypeError(f"Cannot use n-sorting with dtype {dtype}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is okay but wonder if Cannot sort dtype {dtype} might be more clear?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to agree with your point. I updated the message.

Comment on lines +229 to +231
Series.nlargest
Series.nsmallest
Series.nsorted
Copy link
Contributor Author

@MartinBraquet MartinBraquet Jul 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added nsorted and moved the sorting methods to the "Reshaping, sorting" section

Comment on lines +3843 to +3869
"""
Return the top `n` elements, sorted in ascending or descending order.
Parameters
----------
n : int, default 5
Return this many descending sorted values.
ascending : bool
If True, return the smallest `n` elements. If False, return
largest `n` elements.
keep : {'first', 'last', 'all'}, default 'first'
When there are duplicate values that cannot all fit in a
Series of `n` elements:
- ``first`` : return the first `n` occurrences in order
of appearance.
- ``last`` : return the last `n` occurrences in reverse
order of appearance.
- ``all`` : keep all occurrences. This can result in a Series of
size larger than `n`.
Returns
-------
Series
The `n` top values in the Series.
See Also
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since nsorted got added to the series API docs, I believe it requires its proper docstring as well. There is a lot of redundancy with nlargest and nsmallest. Let me know if it should be done differently.

@MartinBraquet
Copy link
Contributor Author

MartinBraquet commented Jul 22, 2025

I applied the comments and the checks passed, so the PR should be ready for re-review.

Copy link
Contributor

@Dr-Irv Dr-Irv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some small doc things I saw. Otherwise I'm OK if other maintainers are OK

@MartinBraquet MartinBraquet requested a review from rhshadrach July 29, 2025 18:19
@MartinBraquet
Copy link
Contributor Author

@rhshadrach Any ETA for review?

Copy link
Member

@rhshadrach rhshadrach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm - @mroeschke would you also be able to take a look.

@rhshadrach rhshadrach added this to the 3.0 milestone Sep 2, 2025
@rhshadrach rhshadrach added Enhancement Sorting e.g. sort_index, sort_values and removed Stale labels Sep 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Enhancement Sorting e.g. sort_index, sort_values

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ENH: Enable nsmallest/nlargest on object dtype

3 participants