BUG: Fix Series.str.contains with compiled regex on Arrow string dtype #61946

Merged
31 commits
f3829fd
BUG : Fix Series.str.contains with compiled regex on Arrow string
Aniketsy Jul 25, 2025
c2a64fa
BUG: Fix handling of compiled regex in Series.str.contains for Arrow-…
Aniketsy Jul 25, 2025
838b1c5
BUG: Fix handling of compiled regex in Series.str.contains for Arrow-…
Aniketsy Jul 25, 2025
563f1f1
STYLE: Fix formatting and docstring issues in str.contains
Aniketsy Jul 25, 2025
fda5619
Fixed ruff format
Aniketsy Jul 25, 2025
324e609
Move fix into _str_contains of ArrowExtensionArray
Aniketsy Jul 26, 2025
b474604
Move fix into _str_contains of ArrowExtensionArray
Aniketsy Jul 26, 2025
3345bc7
Revert changes to pandas/core/strings/accessor.py from PR #61946
Aniketsy Jul 26, 2025
9f06042
Move fix into _str_contains of ArrowExtensionArray
Aniketsy Jul 26, 2025
cbab096
Move fix into _str_contains of ArrowExtensionArray
Aniketsy Jul 26, 2025
d88f8d1
BUG: Fix Series.str.contains with compiled regex and arrow strings (#…
Aniketsy Jul 28, 2025
a0decbc
Revert changes to pandas/core/arrays/arrow/array.py in PR
Aniketsy Jul 28, 2025
8fc81e0
BUG: Fix Series.str.contains with compiled regex on Arrow string dtyp…
Aniketsy Jul 28, 2025
8e226cd
BUG: Fix Series.str.contains with compiled regex on Arrow string dtyp…
Aniketsy Jul 28, 2025
6768fb1
BUG: Fix Series.str.contains with compiled regex on Arrow string dtyp…
Aniketsy Jul 29, 2025
05ae24f
BUG: Fix Series.str.contains with compiled regex on Arrow string dtyp…
Aniketsy Jul 29, 2025
702384d
Revert changes to test_strings.py
Aniketsy Jul 29, 2025
0be9a18
BUG: Fix Series.str.contains with compiled regex on Arrow string dtyp…
Aniketsy Jul 29, 2025
4ddc7db
BUG: Fix Series.str.contains with compiled regex on Arrow string dtyp…
Aniketsy Jul 29, 2025
9a7e640
BUG: Fix Series.str.contains with compiled regex on Arrow string dtyp…
Aniketsy Jul 29, 2025
b00fbe0
BUG: Fix Series.str.contains with compiled regex on Arrow string dtyp…
Aniketsy Jul 29, 2025
76f741c
Revert test_strings.py changes and remove accidental whatsnew file
Aniketsy Jul 29, 2025
8e65078
Revert test_strings.py changes and remove accidental whatsnew file
Aniketsy Jul 29, 2025
4912758
Merge remote-tracking branch 'upstream/main' into fix-arrow-contains-…
Aniketsy Jul 30, 2025
0e620ca
BUG: Fix Series.str.contains with compiled regex on Arrow string dtyp…
Aniketsy Jul 30, 2025
915b38f
BUG: Fix Series.str.contains with compiled regex on Arrow string dtyp…
Aniketsy Jul 30, 2025
ff10123
Fix test_str_contains_compiled_regex_arrow_dtype to properly handle d…
Aniketsy Aug 13, 2025
da5f363
Fix test_str_contains_compiled_regex_arrow_dtype to properly handle d…
Aniketsy Aug 13, 2025
966f43a
Fix test_str_contains_compiled_regex_arrow_dtype to properly handle d…
Aniketsy Aug 13, 2025
a3c2a82
Merge remote-tracking branch 'upstream/main' into fix-arrow-contains-…
jorisvandenbossche Aug 14, 2025
bb4319b
consistent test name
jorisvandenbossche Aug 14, 2025
4 changes: 2 additions & 2 deletions doc/source/whatsnew/v2.3.2.rst
@@ -26,8 +26,8 @@ Bug fixes
   "string" type in the JSON Table Schema for :class:`StringDtype` columns
   (:issue:`61889`)
 - Boolean operations (``|``, ``&``, ``^``) with bool-dtype objects on the left and :class:`StringDtype` objects on the right now cast the string to bool, with a deprecation warning (:issue:`60234`)
-- Fixed ``~Series.str.match`` and ``~Series.str.fullmatch`` with compiled regex
-  for the Arrow-backed string dtype (:issue:`61964`)
+- Fixed ``~Series.str.match``, ``~Series.str.fullmatch`` and ``~Series.str.contains``
+  with compiled regex for the Arrow-backed string dtype (:issue:`61964`, :issue:`61942`)

 .. ---------------------------------------------------------------------------
 .. _whatsnew_232.contributors:
2 changes: 2 additions & 0 deletions pandas/core/arrays/string_arrow.py
@@ -346,6 +346,8 @@ def _str_contains(
     ):
         if flags:
             return super()._str_contains(pat, case, flags, na, regex)
+        if isinstance(pat, re.Pattern):
+            pat = pat.pattern

         return ArrowStringArrayMixin._str_contains(self, pat, case, flags, na, regex)
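
For readers skimming the hunk above, the added branch can be illustrated outside pandas with a minimal standard-library sketch. The helper name `unwrap_compiled_pattern` is hypothetical and is not part of pandas; it only mirrors the `isinstance(pat, re.Pattern)` check and the `pat.pattern` unwrapping shown in the diff.

```python
import re


def unwrap_compiled_pattern(pat):
    """Hypothetical helper (not a pandas API): mirror the branch added in the diff.

    A compiled regex is replaced by its source string; a plain string is
    returned unchanged.
    """
    if isinstance(pat, re.Pattern):
        return pat.pattern
    return pat


# A compiled pattern is reduced to the string it was compiled from.
print(unwrap_compiled_pattern(re.compile("ba.")))  # ba.

# A plain string passes through untouched.
print(unwrap_compiled_pattern("ba."))  # ba.

# Note: `.pattern` holds only the source string; flags given to re.compile
# are stored separately on `.flags`.
print(re.compile("ba.", re.IGNORECASE).pattern)  # ba.
```

One thing this sketch makes visible is that `.pattern` carries only the regex source, so any flags supplied to `re.compile` are kept on the compiled object's `.flags` attribute rather than in the unwrapped string.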

13 changes: 13 additions & 0 deletions pandas/tests/strings/test_find_replace.py
@@ -281,6 +281,19 @@ def test_contains_nan(any_string_dtype):
     tm.assert_series_equal(result, expected)


+def test_contains_compiled_regex(any_string_dtype):
+    # GH#61942
+    ser = Series(["foo", "bar", "baz"], dtype=any_string_dtype)
+    pat = re.compile("ba.")
+    result = ser.str.contains(pat)
+
+    expected_dtype = (
+        np.bool_ if is_object_or_nan_string_dtype(any_string_dtype) else "boolean"
+    )
+    expected = Series([False, True, True], dtype=expected_dtype)
+    tm.assert_series_equal(result, expected)
+
+
 # --------------------------------------------------------------------------------------
 # str.startswith
 # --------------------------------------------------------------------------------------
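
As a quick cross-check of the values the new test expects, the same pattern can be applied to the same strings with the standard library alone. This is only a sketch: it assumes `str.contains` behaves like `re.search` per element (as it does for object-dtype strings) and says nothing about result dtypes, which the test handles through `expected_dtype`.

```python
import re

# Same inputs and pattern as test_contains_compiled_regex.
values = ["foo", "bar", "baz"]
pat = re.compile("ba.")

# Element-wise "contains": True when the pattern matches anywhere in the string.
matches = [pat.search(value) is not None for value in values]
print(matches)  # [False, True, True]
```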