@rhshadrach rhshadrach commented Oct 25, 2025

Continuation of #62752

There are three things I'd like to accomplish here:

  1. Users passing sort=False get an unsorted result. Today the result is sorted anyway, which is a bug and should be fixed without a deprecation cycle.
  2. Users not specifying sort see no change in behavior, but are warned whenever enforcing the deprecation would change their result.
  3. Changing the behavior of sort here does not break user code through pandas' internal usage of concat.
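
As a minimal illustration of (1), assuming the case described in the linked issue (every object has a DatetimeIndex and we concatenate along columns), the snippet below builds two frames with deliberately out-of-order dates. The frame and column names are made up for the example. What the `sort=False` line prints depends on the installed pandas version, so no output is claimed for it.

```python
import pandas as pd

# Two frames whose DatetimeIndex values are deliberately out of order.
left = pd.DataFrame(
    {"a": [1, 2]}, index=pd.to_datetime(["2024-01-03", "2024-01-01"])
)
right = pd.DataFrame(
    {"b": [3, 4]}, index=pd.to_datetime(["2024-01-02", "2024-01-01"])
)

# The non-concatenation axis here is the row index.
unsorted_request = pd.concat([left, right], axis=1, sort=False)
sorted_request = pd.concat([left, right], axis=1, sort=True)

# Version dependent: True where sort=False is still ignored for DatetimeIndex.
print(unsorted_request.index.is_monotonic_increasing)
# An explicit sort=True should always give a sorted row index.
print(sorted_request.index.is_monotonic_increasing)
```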

It seems (3) will be very hard to ascertain. There are many places where concat is used without specifying sort but either (a) the index cannot be a DatetimeIndex or (b) alignment has already been done, so sort has no impact. I plan to take a deeper look into internal usage to see whether this deprecation can impact other parts of the API. So far I've found only one case, in groupby.shift, which I think we can call a bug.

To accomplish these, it seems to me that we need a somewhat expensive check for whether not sorting impacts the result; otherwise users will get many spurious warnings.
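
One way to picture such a check, as a rough sketch rather than what this PR implements: take the union of the indexes without sorting and see whether it is already monotonic. If it is, sorting would be a no-op and a warning would be spurious. The helper name `sorting_would_change_result` is hypothetical.

```python
import pandas as pd


def sorting_would_change_result(indexes):
    """Hypothetical helper: True if sorting the unsorted union reorders it."""
    result = indexes[0]
    for other in indexes[1:]:
        # sort=False keeps the order of appearance instead of sorting.
        result = result.union(other, sort=False)
    # An already-monotonic union is left unchanged by sorting.
    return not result.is_monotonic_increasing


out_of_order = [
    pd.DatetimeIndex(["2024-01-03", "2024-01-01"]),
    pd.DatetimeIndex(["2024-01-02", "2024-01-01"]),
]
already_sorted = [
    pd.DatetimeIndex(["2024-01-01", "2024-01-02"]),
    pd.DatetimeIndex(["2024-01-02", "2024-01-03"]),
]

print(sorting_would_change_result(out_of_order))
print(sorting_would_change_result(already_sorted))
```

The cost is an extra union plus a monotonicity scan per call, which is the "somewhat expensive" part.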

@rhshadrach rhshadrach added the Bug, Deprecate (Functionality to remove in pandas), and Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) labels Oct 25, 2025
@rhshadrach
Member Author

@jbrockmendel - this is ready for a look. I still need to look more into the internal usages of concat as mentioned in the OP, but going to hold off until this looks to be the way forward.

When all objects passed to :func:`concat` have a :class:`DatetimeIndex`,
passing ``sort=False`` will now result in the non-concatenation axis not
being sorted. Previously, the result would always be sorted along
the non-concatenation axis even when ``sort=False`` is passed.
Member

GH ref here to the issue?



-def union_indexes(indexes, sort: bool | None = True) -> Index:
+def union_indexes(indexes, sort: bool | None | lib.NoDefault = True) -> Index:
Member

is None still needed here?

Member Author

By both type checking and test coverage it looks like no, will remove.
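
For readers unfamiliar with the "no default" pattern behind the `lib.NoDefault` annotation, here is a self-contained sketch. It uses a stand-in sentinel, not pandas' own object, and the `resolve_sort` helper and its warning text are hypothetical. The point is only that a sentinel lets the code tell "sort was not passed" apart from an explicit `sort=False`.

```python
import enum
import warnings


class _NoDefault(enum.Enum):
    # Stand-in sentinel for this sketch only.
    no_default = "NO_DEFAULT"


no_default = _NoDefault.no_default


def resolve_sort(sort=no_default, *, sorting_changes_result):
    """Hypothetical helper: pick the effective sort flag for the all-DatetimeIndex case."""
    if sort is no_default:
        # The caller did not specify sort: keep the legacy sorted result,
        # and warn only when the future unsorted result would differ.
        if sorting_changes_result:
            warnings.warn(
                "Illustrative message: the result will not be sorted in a "
                "future version; pass sort explicitly.",
                FutureWarning,
                stacklevel=2,
            )
        return True
    # An explicit True or False is honored as given.
    return bool(sort)
```

With this shape, `resolve_sort(False, sorting_changes_result=True)` returns `False` silently, while omitting `sort` returns `True` and warns only when the result would actually change.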

intersect
or any(not isinstance(index, DatetimeIndex) for index in non_concat_axis)
or all(
id(prev) == id(curr)
Member

prev is curr?
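
A small sketch of why the two spellings agree: for objects that are alive at the same time, `id(a) == id(b)` holds exactly when `a is b`. The pairwise iteration over adjacent items is an assumption here, since the diff fragment above is cut off before the loop, and both helper names are hypothetical.

```python
def all_same_object_by_id(items):
    # Spelling from the diff fragment: compare identities via id().
    return all(id(prev) == id(curr) for prev, curr in zip(items, items[1:]))


def all_same_object(items):
    # Suggested spelling: the identity operator says the same thing directly.
    return all(prev is curr for prev, curr in zip(items, items[1:]))


shared = [1, 2, 3]
equal_copy = list(shared)  # equal contents, but a distinct object

print(all_same_object([shared, shared, shared]))
print(all_same_object([shared, equal_copy]))
```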

@jbrockmendel
Member

> I still need to look more into the internal usages of concat as mentioned in the OP, but going to hold off until this looks to be the way forward.

I'm fine with this. Though seeing the increased complexity, I'd also be OK with the previous PR's breaking change approach.


Development

Successfully merging this pull request may close these issues.

DEPR: pd.concat special cases DatetimeIndex to sort even when sort=False
