BUG: Do not ignore sort in concat for DatetimeIndex #62752

rhshadrach · 2025-10-19T12:38:34Z

closes DEPR: pd.concat special cases DatetimeIndex to sort even when sort=False #57335 (Replace xxxx with the GitHub issue number)
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added type annotations to new arguments/methods/functions.
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

jbrockmendel · 2025-10-19T15:06:52Z

I'm fine with this, but given the discussion on the issue would also be OK with doing this as a deprecation if that's viable.

rhshadrach · 2025-10-19T19:02:02Z

I'm on board with deprecating instead. Two things:

I think we should make it so that specifying sort=False does not sort when the deprecation is introduced. Only unspecified sort will still sort by default for DatetimeIndex.
In order to make this less noisy, it seems beneficial to determine if concat-without-sorting would give a different result and only emit the warning when it does. I'm using union_indexes, not sure yet if that will be correct in all cases. Trying to determine this deeper in the code seems hard. Regardless this will be somewhat of an expensive op. Before I sink more time, I'd like to make sure there is agreement that this is worth the computation.

jbrockmendel · 2025-10-19T19:35:24Z

Yah, IIRC deprecating that behavior was tricky because there was also a non-concat codepath that led to that particular unwanted sort = True line.

If anyone else wants to merge this as is I won't complain. Otherwise we can bring it up at the dev call on Wednesday.

jbrockmendel · 2025-10-22T19:29:49Z

Discussed on today’s dev call. Consensus was “deprecation might be nice if not too difficult but I don’t feel strongly about it and am willing to defer to Richard”

rhshadrach · 2025-10-23T12:25:42Z

Are we good with warning anytime the non-concat index is datetime and the sort argument is not specified? This would be my preferred approach, contrary to what I mentioned above. It seems difficult and computationally expensive to not warn anytime the behavior wouldn't change because the result would be naturally sorted.

jbrockmendel · 2025-10-23T15:59:09Z

you want that warning in place forever, not as a deprecation? im not wild about that

rhshadrach · 2025-10-23T16:15:44Z

No, just to deprecate.

jbrockmendel · 2025-10-23T16:24:54Z

Oh ok. Then its a tradeoff between potentially false-positive warnings and difficult/costly is_monotonic_increasing calls?

rhshadrach · 2025-10-23T20:27:34Z

[Demonstrating here with integer index; same applies to datetimes]

Yes, and note that monotone increasing can be sufficient but not necessary:

ser1 = pd.Series([1, 2], index=[1, 2])
ser2 = pd.Series([3, 4], index=[2, 1])
print(ser2.index.is_monotonic_increasing)
# False
print(pd.concat([ser1, ser2], sort=False, axis=1))
#    0  1
# 1  1  4
# 2  2  3

So there are still false positives. One also needs to check that the tail element of each is less than the head of the next, so monotonic increasing alone is not sufficient.

ser1 = pd.Series([1, 2], index=[1, 3])
ser2 = pd.Series([3, 4], index=[2, 4])
print(ser1.index.is_monotonic_increasing)
# True
print(ser2.index.is_monotonic_increasing)
# True
print(pd.concat([ser1, ser2], sort=False, axis=1))
#      0    1
# 1  1.0  NaN
# 3  2.0  NaN
# 2  NaN  3.0
# 4  NaN  4.0

jbrockmendel · 2025-10-23T22:13:34Z

Ok. Per the call, I’m happy to defer to you on this. Let me know when ready for review

rhshadrach · 2025-10-25T15:17:52Z

In the commits Deprecate with union_indexes and More perfomrant, more warnings, there are two versions of this deprecation. One where we use union_indexes to check if sort will have an impact on the result, and one where we avoid this computation but still check two common cases. I'm 50/50 on which one to go with.

In either case, we need to handle the 60ish places where we call concat internally. I spent a while trying to produce bugs due to the sorting behavior of datetimes on main from internal uses of concat, haven't found any yet. It needs to be a situation where the indices are not equal, and many places we align prior to calling concat. Do we just pass sort=False everywhere internally? Should this be noted in the whatsnew? E.g.

There are many places where pandas internally uses concat and this behavior change can appear. While we have not been able to find any behavior differences, users with data that have an unsorted DatetimeIndex may end up with results that remain unsorted whereas in prior versions of pandas the result was sorted.

Admittedly not great.

Edit: Here is an example of what I think is a bug.

idx = date_range("1/1/2000", periods=3, freq="h")
df = DataFrame(
    {"a": [1, 2, 3], "b": [True, True, False]},
    index=[idx[2], idx[0], idx[1]],
)
print(df)
print(df.groupby("b")[["a"]].shift([0, 1], freq="h"))
#                      a      b
# 2000-01-01 02:00:00  1   True
# 2000-01-01 00:00:00  2   True
# 2000-01-01 01:00:00  3  False
#                      a_0  a_1
# 2000-01-01 00:00:00  2.0  NaN
# 2000-01-01 01:00:00  3.0  2.0
# 2000-01-01 02:00:00  1.0  3.0
# 2000-01-01 03:00:00  NaN  1.0

BUG: Do not ignore sort in concat for DatetimeIndex

9b2c476

rhshadrach added Datetime Datetime data dtype Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Oct 19, 2025

rhshadrach requested a review from jbrockmendel October 19, 2025 13:26

jbrockmendel approved these changes Oct 19, 2025

View reviewed changes

rhshadrach marked this pull request as draft October 19, 2025 19:03

rhshadrach mentioned this pull request Oct 25, 2025

PERF: Assign unique for monotonic but not strictly monotonic index #62839

Merged

5 tasks

rhshadrach added 3 commits October 25, 2025 09:28

Deprecate with union_indexes

32128eb

More performant, more warnings

3b35993

Fixup

ac0dc97

Cleanups

52dbb3d

rhshadrach closed this Oct 25, 2025

rhshadrach mentioned this pull request Oct 25, 2025

DEPR/BUG: Do not ignore sort in concat for DatetimeIndex #62843

Open

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

BUG: Do not ignore sort in concat for DatetimeIndex #62752

BUG: Do not ignore sort in concat for DatetimeIndex #62752

Uh oh!

rhshadrach commented Oct 19, 2025

Uh oh!

jbrockmendel commented Oct 19, 2025

Uh oh!

rhshadrach commented Oct 19, 2025 •

edited

Loading

Uh oh!

jbrockmendel commented Oct 19, 2025

Uh oh!

jbrockmendel commented Oct 22, 2025

Uh oh!

rhshadrach commented Oct 23, 2025

Uh oh!

jbrockmendel commented Oct 23, 2025

Uh oh!

rhshadrach commented Oct 23, 2025

Uh oh!

jbrockmendel commented Oct 23, 2025

Uh oh!

rhshadrach commented Oct 23, 2025

Uh oh!

jbrockmendel commented Oct 23, 2025

Uh oh!

rhshadrach commented Oct 25, 2025 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

BUG: Do not ignore sort in concat for DatetimeIndex #62752

BUG: Do not ignore sort in concat for DatetimeIndex #62752

Uh oh!

Conversation

rhshadrach commented Oct 19, 2025

Uh oh!

jbrockmendel commented Oct 19, 2025

Uh oh!

rhshadrach commented Oct 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jbrockmendel commented Oct 19, 2025

Uh oh!

jbrockmendel commented Oct 22, 2025

Uh oh!

rhshadrach commented Oct 23, 2025

Uh oh!

jbrockmendel commented Oct 23, 2025

Uh oh!

rhshadrach commented Oct 23, 2025

Uh oh!

jbrockmendel commented Oct 23, 2025

Uh oh!

rhshadrach commented Oct 23, 2025

Uh oh!

jbrockmendel commented Oct 23, 2025

Uh oh!

rhshadrach commented Oct 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

rhshadrach commented Oct 19, 2025 •

edited

Loading

rhshadrach commented Oct 25, 2025 •

edited

Loading