Commit fd56024

Merge remote-tracking branch 'upstream/main' into aijams-take-function-invalid-dtype
2 parents acdfb62 + d597079 commit fd56024

9 files changed

+293
-60
lines changed


.github/workflows/unit-tests.yml

Lines changed: 1 addition & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -181,8 +181,7 @@ jobs:
181181
timeout-minutes: 90
182182
strategy:
183183
matrix:
184-
# Note: Don't use macOS latest since macos 14 appears to be arm64 only
185-
os: [macos-13, macos-14, windows-2025]
184+
os: [macos-15-intel, macos-15, windows-2025]
186185
env_file: [actions-311.yaml, actions-312.yaml, actions-313.yaml]
187186
fail-fast: false
188187
runs-on: ${{ matrix.os }}

.github/workflows/wheels.yml

Lines changed: 3 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -98,10 +98,9 @@ jobs:
9898
- [ubuntu-24.04, musllinux_x86_64]
9999
- [ubuntu-24.04-arm, manylinux_aarch64]
100100
- [ubuntu-24.04-arm, musllinux_aarch64]
101-
- [macos-13, macosx_x86_64]
102-
# Note: M1 images on Github Actions start from macOS 14
103-
- [macos-14, macosx_arm64]
104-
- [windows-2022, win_amd64]
101+
- [macos-15-intel, macosx_x86_64]
102+
- [macos-15, macosx_arm64]
103+
- [windows-2025, win_amd64]
105104
- [windows-11-arm, win_arm64]
106105
python: [["cp311", "3.11"], ["cp312", "3.12"], ["cp313", "3.13"], ["cp313t", "3.13"], ["cp314", "3.14"], ["cp314t", "3.14"]]
107106
include:

asv_bench/benchmarks/ctors.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -23,7 +23,7 @@ def gen_of_str(arr):
2323

2424

2525
def arr_dict(arr):
26-
return dict(zip(range(len(arr)), arr))
26+
return dict(zip(range(len(arr)), arr, strict=True))
2727

2828

2929
def list_of_tuples(arr):

asv_bench/benchmarks/series_methods.py

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -16,7 +16,7 @@ def setup(self):
1616
self.idx = date_range(
1717
start=datetime(2015, 10, 26), end=datetime(2016, 1, 1), freq="50s"
1818
)
19-
self.data = dict(zip(self.idx, range(len(self.idx))))
19+
self.data = dict(zip(self.idx, range(len(self.idx)), strict=True))
2020
self.array = np.array([1, 2, 3])
2121
self.idx2 = Index(["a", "b", "c"])
2222

@@ -407,7 +407,9 @@ def setup(self, num_to_replace):
407407
self.to_replace_list = np.random.choice(self.arr, num_to_replace)
408408
self.values_list = np.random.choice(self.arr1, num_to_replace)
409409

410-
self.replace_dict = dict(zip(self.to_replace_list, self.values_list))
410+
self.replace_dict = dict(
411+
zip(self.to_replace_list, self.values_list, strict=True)
412+
)
411413

412414
def time_replace_dict(self, num_to_replace):
413415
self.ser.replace(self.replace_dict)

doc/source/whatsnew/v3.0.0.rst

Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -515,6 +515,22 @@ If we had passed ``pd.Int64Dtype()`` or ``"int64[pyarrow]"`` for the dtype in th
515515

516516
With ``"mode.nan_is_na"`` set to ``False``, ``ser.to_numpy()`` (and ``frame.values`` and ``np.asarray(obj)``) will convert to ``object`` dtype if :class:`NA` entries are present, where before they would coerce to ``NaN``. To retain a float numpy dtype, explicitly pass ``na_value=np.nan`` to :meth:`Series.to_numpy`.
517517

518+
The ``__module__`` attribute now points to public modules
519+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
520+
521+
The ``__module__`` attribute on functions and classes in the public API has been
522+
updated to refer to the preferred public module from which to access the object,
523+
rather than the module in which the object happens to be defined (:issue:`55178`).
524+
525+
This produces more informative displays in the Python console for classes, e.g.,
526+
instead of ``<class 'pandas.core.frame.DataFrame'>`` you now see
527+
``<class 'pandas.DataFrame'>``, and in interactive tools such as IPython, e.g.,
528+
instead of ``<function pandas.io.parsers.readers.read_csv(...)>`` you now see
529+
``<function pandas.read_csv(...)>``.
530+
531+
This may break code that relies on the previous ``__module__`` values (e.g.
532+
doctests inspecting the ``type()`` of a DataFrame object).
533+
518534
.. _whatsnew_300.api_breaking.deps:
519535

520536
Increased minimum version for Python
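
The `__module__` change described in the whatsnew entry above can be illustrated without pandas. The sketch below is an assumption-laden illustration, not the pandas implementation: `set_public_module`, `InternalFrame`, `PublicFrame`, and the package name `mypkg` are all hypothetical. It shows that overriding `__module__` is enough to change what `repr()` reports for a class.

```python
def set_public_module(module_name):
    """Hypothetical decorator: report `module_name` as the object's module."""

    def decorator(obj):
        obj.__module__ = module_name
        return obj

    return decorator


class InternalFrame:
    """Keeps the name of whatever module it was defined in."""


@set_public_module("mypkg")
class PublicFrame:
    """Reports the made-up public package `mypkg` as its module."""


print(InternalFrame.__module__)  # the defining module
print(PublicFrame.__module__)    # the overridden, public-facing module
print(repr(PublicFrame))         # repr() is built from __module__ and __qualname__
```

Code that compares `__module__` strings or matches on `repr()` output, such as the doctests mentioned in the entry, would see the new value.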

pandas/core/indexes/base.py

Lines changed: 109 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -1185,12 +1185,12 @@ def astype(self, dtype: Dtype, copy: bool = True):
11851185
How to handle negative values in `indices`.
11861186
11871187
* False: negative values in `indices` indicate positional indices
1188-
from the right (the default). This is similar to
1189-
:func:`numpy.take`.
1188+
from the right (the default). This is similar to
1189+
:func:`numpy.take`.
11901190
11911191
* True: negative values in `indices` indicate
1192-
missing values. These values are set to `fill_value`. Any other
1193-
other negative values raise a ``ValueError``.
1192+
missing values. These values are set to `fill_value`. Any other
1193+
other negative values raise a ``ValueError``.
11941194
11951195
fill_value : scalar, default None
11961196
If allow_fill=True and fill_value is not None, indices specified by
@@ -1216,7 +1216,6 @@ def astype(self, dtype: Dtype, copy: bool = True):
12161216
Index(['c', 'c', 'b', 'c'], dtype='str')
12171217
"""
12181218

1219-
@Appender(_index_shared_docs["take"] % _index_doc_kwargs)
12201219
def take(
12211220
self,
12221221
indices,
@@ -1225,6 +1224,51 @@ def take(
12251224
fill_value=None,
12261225
**kwargs,
12271226
) -> Self:
1227+
"""
1228+
Return a new Index of the values selected by the indices.
1229+
1230+
For internal compatibility with numpy arrays.
1231+
1232+
Parameters
1233+
----------
1234+
indices : array-like
1235+
Indices to be taken.
1236+
axis : int, optional
1237+
The axis over which to select values, always 0.
1238+
allow_fill : bool, default True
1239+
How to handle negative values in `indices`.
1240+
1241+
* False: negative values in `indices` indicate positional indices
1242+
from the right (the default). This is similar to
1243+
:func:`numpy.take`.
1244+
1245+
* True: negative values in `indices` indicate
1246+
missing values. These values are set to `fill_value`. Any
1247+
other negative values raise a ``ValueError``.
1248+
1249+
fill_value : scalar, default None
1250+
If allow_fill=True and fill_value is not None, indices specified by
1251+
-1 are regarded as NA. If Index doesn't hold NA, raise ValueError.
1252+
**kwargs
1253+
Required for compatibility with numpy.
1254+
1255+
Returns
1256+
-------
1257+
Index
1258+
An index formed of elements at the given indices. Will be the same
1259+
type as self, except for RangeIndex.
1260+
1261+
See Also
1262+
--------
1263+
numpy.ndarray.take: Return an array formed from the
1264+
elements of a at the given indices.
1265+
1266+
Examples
1267+
--------
1268+
>>> idx = pd.Index(["a", "b", "c"])
1269+
>>> idx.take([2, 2, 1, 2])
1270+
Index(['c', 'c', 'b', 'c'], dtype='str')
1271+
"""
12281272
if kwargs:
12291273
nv.validate_take((), kwargs)
12301274
if is_scalar(indices):
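
The `allow_fill` and `fill_value` rules in the `take` docstring above can be sketched in plain Python. This is an illustration of the documented rules only, not the pandas code path: `take_sketch` is a hypothetical helper that works on lists and inserts `fill_value` directly.

```python
def take_sketch(values, indices, allow_fill=True, fill_value=None):
    """Plain-Python illustration of the documented `take` rules."""
    if allow_fill and fill_value is not None:
        # -1 marks a missing value; any other negative index is an error.
        if any(i < -1 for i in indices):
            raise ValueError("only -1 may be used as a negative index when filling")
        return [fill_value if i == -1 else values[i] for i in indices]
    # Otherwise negative indices count from the right, as in numpy.take.
    return [values[i] for i in indices]


letters = ["a", "b", "c"]
print(take_sketch(letters, [2, 2, 1, 2]))
print(take_sketch(letters, [0, -1]))                   # -1 means "last element"
print(take_sketch(letters, [0, -1], fill_value="NA"))  # -1 means "missing"
```

Filling only takes effect when both conditions hold, which is why the second call treats `-1` as a position from the right even though `allow_fill` is left at `True`.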
@@ -1272,26 +1316,27 @@ def _maybe_disallow_fill(self, allow_fill: bool, fill_value, indices) -> bool:
12721316
allow_fill = False
12731317
return allow_fill
12741318

1275-
_index_shared_docs["repeat"] = """
1276-
Repeat elements of a %(klass)s.
1319+
def repeat(self, repeats, axis: None = None) -> Self:
1320+
"""
1321+
Repeat elements of an Index.
12771322
1278-
Returns a new %(klass)s where each element of the current %(klass)s
1323+
Returns a new Index where each element of the current Index
12791324
is repeated consecutively a given number of times.
12801325
12811326
Parameters
12821327
----------
12831328
repeats : int or array of ints
12841329
The number of repetitions for each element. This should be a
12851330
non-negative integer. Repeating 0 times will return an empty
1286-
%(klass)s.
1331+
Index.
12871332
axis : None
12881333
Must be ``None``. Has no effect but is accepted for compatibility
12891334
with numpy.
12901335
12911336
Returns
12921337
-------
1293-
%(klass)s
1294-
Newly created %(klass)s with repeated elements.
1338+
Index
1339+
Newly created Index with repeated elements.
12951340
12961341
See Also
12971342
--------
@@ -1300,17 +1345,14 @@ def _maybe_disallow_fill(self, allow_fill: bool, fill_value, indices) -> bool:
13001345
13011346
Examples
13021347
--------
1303-
>>> idx = pd.Index(['a', 'b', 'c'])
1348+
>>> idx = pd.Index(["a", "b", "c"])
13041349
>>> idx
13051350
Index(['a', 'b', 'c'], dtype='object')
13061351
>>> idx.repeat(2)
13071352
Index(['a', 'a', 'b', 'b', 'c', 'c'], dtype='object')
13081353
>>> idx.repeat([1, 2, 3])
13091354
Index(['a', 'b', 'b', 'c', 'c', 'c'], dtype='object')
13101355
"""
1311-
1312-
@Appender(_index_shared_docs["repeat"] % _index_doc_kwargs)
1313-
def repeat(self, repeats, axis: None = None) -> Self:
13141356
repeats = ensure_platform_int(repeats)
13151357
nv.validate_repeat((), {"axis": axis})
13161358
res_values = self._values.repeat(repeats)
@@ -5993,10 +6035,61 @@ def _should_fallback_to_positional(self) -> bool:
59936035
(array([-1, 1, 3, 4, -1]), array([0, 2]))
59946036
"""
59956037

5996-
@Appender(_index_shared_docs["get_indexer_non_unique"] % _index_doc_kwargs)
59976038
def get_indexer_non_unique(
59986039
self, target
59996040
) -> tuple[npt.NDArray[np.intp], npt.NDArray[np.intp]]:
6041+
"""
6042+
Compute indexer and mask for new index given the current index.
6043+
6044+
The indexer should then be used as an input to ndarray.take to align the
6045+
current data to the new index.
6046+
6047+
Parameters
6048+
----------
6049+
target : Index
6050+
An iterable containing the values to be used for computing indexer.
6051+
6052+
Returns
6053+
-------
6054+
indexer : np.ndarray[np.intp]
6055+
Integers from 0 to n - 1 indicating that the index at these
6056+
positions matches the corresponding target values. Missing values
6057+
in the target are marked by -1.
6058+
missing : np.ndarray[np.intp]
6059+
An indexer into the target of the values not found.
6060+
These correspond to the -1 in the indexer array.
6061+
6062+
See Also
6063+
--------
6064+
Index.get_indexer : Computes indexer and mask for new index given
6065+
the current index.
6066+
Index.get_indexer_for : Returns an indexer even when non-unique.
6067+
6068+
Examples
6069+
--------
6070+
>>> index = pd.Index(["c", "b", "a", "b", "b"])
6071+
>>> index.get_indexer_non_unique(["b", "b"])
6072+
(array([1, 3, 4, 1, 3, 4]), array([], dtype=int64))
6073+
6074+
In the example below there are no matched values.
6075+
6076+
>>> index = pd.Index(["c", "b", "a", "b", "b"])
6077+
>>> index.get_indexer_non_unique(["q", "r", "t"])
6078+
(array([-1, -1, -1]), array([0, 1, 2]))
6079+
6080+
For this reason, the returned ``indexer`` contains only integers equal to -1.
6081+
It demonstrates that there's no match between the index and the ``target``
6082+
values at these positions. The mask [0, 1, 2] in the return value shows that
6083+
the first, second, and third elements are missing.
6084+
6085+
Notice that the return value is a tuple containing two items. In the example
6086+
below the first item is an array of locations in ``index``. The second
6087+
item is a mask showing that the first and third elements are missing.
6088+
6089+
>>> index = pd.Index(["c", "b", "a", "b", "b"])
6090+
>>> index.get_indexer_non_unique(["f", "b", "s"])
6091+
(array([-1, 1, 3, 4, -1]), array([0, 2]))
6092+
"""
60006093
target = self._maybe_cast_listlike_indexer(target)
60016094

60026095
if not self._should_compare(target) and not self._should_partial_index(target):
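
The two return values documented for `get_indexer_non_unique` above can be reproduced with a short plain-Python sketch. It is an illustration of the documented semantics only: `get_indexer_non_unique_sketch` is a hypothetical helper that returns lists rather than NumPy arrays and does not reflect how pandas computes the result.

```python
def get_indexer_non_unique_sketch(index_values, target):
    """Return (indexer, missing) following the documented semantics."""
    # Record every position at which each index value occurs.
    positions = {}
    for pos, value in enumerate(index_values):
        positions.setdefault(value, []).append(pos)

    indexer, missing = [], []
    for target_pos, value in enumerate(target):
        matches = positions.get(value)
        if matches is None:
            indexer.append(-1)          # no match for this target value
            missing.append(target_pos)  # remember which target was not found
        else:
            indexer.extend(matches)     # one entry per matching index position
    return indexer, missing


index_values = ["c", "b", "a", "b", "b"]
print(get_indexer_non_unique_sketch(index_values, ["b", "b"]))
print(get_indexer_non_unique_sketch(index_values, ["f", "b", "s"]))
```

Because a target value can match several positions, the indexer may be longer than the target, as in the `["b", "b"]` case.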
