Open
Labels: Bug; Numeric Operations (Arithmetic, Comparison, and Logical operations); Strings (String extension data type and string data)
Description
At the moment, you can freely compare a string Series with a mixed object-dtype Series:
>>> import pandas as pd
>>> ser_string = pd.Series(["a", "b"])
>>> ser_mixed = pd.Series([1, "b"])
>>> ser_string == ser_mixed
0    False
1     True
dtype: bool
But with the string dtype enabled (using pyarrow), this now raises an error:
>>> pd.options.future.infer_string = True
>>> ser_string = pd.Series(["a", "b"])
>>> ser_mixed = pd.Series([1, "b"])
>>> ser_string == ser_mixed
...
File ~/scipy/repos/pandas/pandas/core/arrays/arrow/array.py:510, in ArrowExtensionArray._box_pa_array(cls, value, pa_type, copy)
...
--> 510 pa_array = pa.array(value, from_pandas=True)
...
ArrowInvalid: Could not convert 'b' with type str: tried to convert to int64
This happens because the ArrowExtensionArray tries to convert the other operand to an Arrow array as well, and that conversion fails for mixed types.
In general, I think our rule is that an == comparison never fails, and instead just gives False where the values are not comparable.
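To make the expected rule concrete, here is a minimal pure-Python sketch of the "try the typed conversion, fall back to elementwise comparison" pattern. It is not pandas or pyarrow code: `to_string_array` and `eq_with_fallback` are hypothetical names used only for illustration, and the strict conversion merely stands in for the Arrow conversion that fails above.

```python
from typing import Any, List, Sequence


def to_string_array(values: Sequence[Any]) -> List[str]:
    """Hypothetical stand-in for a strict typed conversion.

    Raises TypeError on the first non-string value, loosely mimicking how a
    typed conversion of a mixed sequence can fail.
    """
    for value in values:
        if not isinstance(value, str):
            raise TypeError(f"Could not convert {value!r} to str")
    return list(values)


def eq_with_fallback(left: Sequence[str], right: Sequence[Any]) -> List[bool]:
    """Elementwise == that does not raise for mixed-type operands."""
    if len(left) != len(right):
        raise ValueError("Lengths must match to compare")
    try:
        # Fast path: the other operand is homogeneous strings.
        typed_right = to_string_array(right)
    except TypeError:
        # Fallback: compare element by element; a value of a different type
        # compares unequal, so the result is False at that position.
        return [a == b for a, b in zip(left, right)]
    return [a == b for a, b in zip(left, typed_right)]


result = eq_with_fallback(["a", "b"], [1, "b"])
print(result)  # [False, True]
```

With this shape, the mixed operand from the report yields False for the non-comparable element rather than an exception, which matches the object-dtype result shown earlier.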