loicdiridollou
Member

day1 = pd.Timedelta(1, unit="D")
day10 = pd.Timedelta(10, unit="D")
pd.timedelta_range(
day1, day10, periods=10, freq="D" # type: ignore[call-overload] # pyright: ignore[reportArgumentType]
Member Author

Passing all four should be banned

Collaborator

@Dr-Irv Dr-Irv left a comment

You've added new overloads, but I don't see their benefit without tests that fail with the current stubs and pass with your changes.

Comment on lines 106 to 120
def date_range(
end: str | DateAndDatetimeLike,
periods: int,
freq: str | timedelta | Timedelta | BaseOffset | None = None,
tz: TimeZones = None,
normalize: bool = False,
name: Hashable | None = None,
inclusive: IntervalClosedType = "both",
unit: TimeUnit | None = None,
) -> DatetimeIndex: ...
@overload
def date_range(
start: str | DateAndDatetimeLike,
periods: int,
freq: str | timedelta | Timedelta | BaseOffset | None = None,
Collaborator

These would only work if the arguments are keyword arguments, so I think you need to put an asterisk (*) right after the opening parenthesis.
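
A minimal sketch of the keyword-only suggestion, using a hypothetical `make_range` function rather than the actual pandas-stubs signature. The bare `*` forces every parameter after it to be passed by keyword, so overloads that omit `start` or `end` cannot be confused by positional calls.

```python
from __future__ import annotations

from typing import overload


# `make_range` is a hypothetical stand-in, not the pandas-stubs signature.
@overload
def make_range(*, start: int, periods: int) -> list[int]: ...
@overload
def make_range(*, end: int, periods: int) -> list[int]: ...
def make_range(
    *,
    start: int | None = None,
    end: int | None = None,
    periods: int,
) -> list[int]:
    # Exactly one of start/end must anchor the range.
    if (start is None) == (end is None):
        raise TypeError("pass exactly one of start or end")
    if start is not None:
        return list(range(start, start + periods))
    assert end is not None
    return list(range(end - periods + 1, end + 1))
```

With this shape, `make_range(start=1, periods=3)` and `make_range(end=5, periods=3)` each match one overload, while a positional call such as `make_range(1, 3)` is rejected both by type checkers and at runtime.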

Comment on lines 128 to 144
def date_range(
start: str | DateAndDatetimeLike | None,
end: str | DateAndDatetimeLike | None,
freq: str | timedelta | Timedelta | BaseOffset | None = None,
tz: TimeZones = None,
normalize: bool = False,
name: Hashable | None = None,
inclusive: IntervalClosedType = "both",
unit: TimeUnit | None = None,
) -> DatetimeIndex: ...
@overload
def date_range(
- start: str | DateAndDatetimeLike | None = None,
- end: str | DateAndDatetimeLike | None = None,
- periods: int | None = None,
- freq: str | timedelta | Timedelta | BaseOffset = "D",
+ start: str | DateAndDatetimeLike,
+ end: str | DateAndDatetimeLike,
+ periods: int,
+ freq: None = None,
tz: TimeZones = None,
Collaborator

I don't see the point here of having separate overloads. Also, you need to put back the default values.

Member Author

The original goal was to allow only three of the four arguments start, end, periods, and freq, as the docs state.
In practice, though, the freq argument is often not passed.

Comment on lines 84 to 97
start: TimedeltaConvertibleTypes,
end: TimedeltaConvertibleTypes,
periods: int,
freq: None = None,
name: Hashable | None = None,
closed: Literal["left", "right"] | None = None,
*,
unit: None | str = ...,
) -> TimedeltaIndex: ...
@overload
def timedelta_range(
start: TimedeltaConvertibleTypes,
periods: int,
freq: None = None,
Collaborator

I think you're going to have a similar issue in terms of mixing positional vs. keyword arguments.

Member Author

This is now fixed.

@loicdiridollou
Member Author

I have added two invalid-usage type tests, because that is what we want to check: passing all four arguments must not be allowed.

Comment on lines 108 to 109
end: str | DateAndDatetimeLike = ...,
periods: int = ...,
Collaborator

I think this would allow pd.date_range() to be accepted.

The error message is "Of the four parameters: start, end, periods, and freq, exactly three must be specified".

So if you want this to work, you have to have overloads where 3 are required. You can't have = ... on any of them (or default values either), because that means the argument is optional.
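
To illustrate why `= ...` makes a parameter optional, here is a small runtime check with `inspect`. The two functions are hypothetical stand-ins for stub signatures, not the real `date_range` overloads.

```python
import inspect


# Hypothetical stand-ins for two stub-style signatures.
def all_optional(start: str = ..., end: str = ..., periods: int = ...) -> None: ...
def all_required(start: str, end: str, periods: int) -> None: ...


def accepts_empty_call(func) -> bool:
    """Return True if calling func() with no arguments matches its signature."""
    try:
        inspect.signature(func).bind()
    except TypeError:
        return False
    return True
```

Here `accepts_empty_call(all_optional)` is true and `accepts_empty_call(all_required)` is false, which is the same reason a type checker would accept a bare `pd.date_range()` against an overload whose parameters all carry `= ...`.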

Member Author

What I am unsure about is that most of the usage I have seen just passes start and end; freq is often left out, so actual usage does not match the docs.
Should we force the user to pass three of the four, or also allow only two?

Collaborator

So I think the docs and error message are a bit misleading.

You'll need to do a test to determine when 2 are allowed. I think the rule is as follows:

  • Whether or not freq is specified, you need at least 2 out of the 3 of start, end, periods

Member Author

The examples in the docs don't even pass exactly three arguments. Looking at the code, they rely on freq defaulting to None and then being replaced by "D" (day).
https://pandas.pydata.org/docs/dev/reference/api/pandas.timedelta_range.html#pandas.timedelta_range

See https://github.com/pandas-dev/pandas/blob/78ec27668163808b3488da6ed2c64d0b8e9bfbc8/pandas/core/indexes/timedeltas.py#L330-L331 for the code where they change freq=None to freq="D".
Should we force the user to pass 3 arguments even though the docs example only uses 2?
Happy to open a PR in the pandas repo to ask for clarification.

Member Author

Sorry, it took me a while to find all the docs, and you had already answered in the meantime.
Let me set up a quick test to validate that we need two of the three of start, end, and periods.

Member Author

It seems that freq is indeed the argument that does not need to be passed (it is also the last of the four in the signature).
I think the correct way to type hint this is, as you said, to require 2 of the 3, with freq typed as str | None = ... so that it need not be passed when two of the others are. I also tried passing all four, and that works as long as one of the three is None (though explicitly passing None would be a strange use case, since it is the default).

>>> pd.date_range("2023-04-05", "2023-04-07", freq="D")
DatetimeIndex(['2023-04-05', '2023-04-06', '2023-04-07'], dtype='datetime64[ns]', freq='D')
>>> pd.date_range("2023-04-05", "2023-04-07")
DatetimeIndex(['2023-04-05', '2023-04-06', '2023-04-07'], dtype='datetime64[ns]', freq='D')

Member Author

pandas-dev/pandas#62161

I have asked for clarifications.

Member Author

I have read and reread the docs; they are a bit confusing at first glance, but I will add all the tests for it.

Comment on lines 84 to 86
start: TimedeltaConvertibleTypes = ...,
end: TimedeltaConvertibleTypes = ...,
periods: int = ...,
Collaborator

same issue here as with date_range()

Comment on lines 1566 to 1571
if TYPE_CHECKING_INVALID_USAGE:
day1 = pd.Timedelta(1, unit="D")
day10 = pd.Timedelta(10, unit="D")
pd.timedelta_range(
day1, day10, 10, "D" # type: ignore[call-overload] # pyright: ignore[reportArgumentType]
)
Collaborator

should add checks where only 0, 1 or 2 of the 3 arguments are specified, as that should fail as well.

Comment on lines 1753 to 1758
if TYPE_CHECKING_INVALID_USAGE:
day1 = pd.Timestamp("2023-04-03")
day10 = pd.Timestamp("2023-04-08")
pd.date_range(
day1, day10, 10, "D" # type: ignore[call-overload] # pyright: ignore[reportCallIssue]
)
Collaborator

same comment - need to check that 0, 1 or 2 arguments specified causes a failure.

@loicdiridollou
Member Author

Rewrote the tests and the overloads so that passing 2 (when omitting freq) or 3 of start, end, periods, and freq is accepted, while passing 1 or all 4 is rejected.
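
The rule reached in this thread can be sketched as a small pure-Python predicate. This is only an illustration of the accepted argument combinations as discussed here; `is_accepted` is a hypothetical helper and not part of pandas or pandas-stubs.

```python
from __future__ import annotations


def is_accepted(
    start: object = None,
    end: object = None,
    periods: int | None = None,
    freq: object = None,
) -> bool:
    """Mirror the argument rule the overloads are meant to encode."""
    # Count how many of start, end, periods were actually supplied.
    given = sum(arg is not None for arg in (start, end, periods))
    if given == 2:
        return True  # freq may be passed or omitted
    if given == 3:
        return freq is None  # all four together is rejected
    return False  # zero or one of start/end/periods is never enough
```

For example, `is_accepted("2023-04-05", "2023-04-07")` and `is_accepted("2023-04-05", "2023-04-07", freq="D")` both hold, whereas `is_accepted("2023-04-05", "2023-04-07", 3, "D")` does not.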

Collaborator

@Dr-Irv Dr-Irv left a comment

@Dr-Irv Dr-Irv merged commit e799ec1 into pandas-dev:main Aug 22, 2025
13 checks passed
Development

Successfully merging this pull request may close these issues.

Rewrite timedelta_range to use overloads in core/indexes/timedelta