ENH: add ignore index logic for df.isin() function #62620
The pyodide unit test build failure isn't related to the changes I have proposed and seems to be a general issue affecting other PRs too.
@saakshimore, thanks for working on this. Could you please add a note in Regarding your tests:
    raise ValueError("cannot compute isin with a duplicate axis.")
result = self.eq(values.reindex_like(self))
if ignore_index:
    result = self.isin(values.to_dict("list"))
Converting a DataFrame to a Python dictionary is going to be a huge performance hit, and may introduce different comparison semantics than what pandas offers naturally. Is there a way to accomplish this using the built-in indexers, much like the not ignore_index case?
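One possible direction, sketched here only to make the suggestion concrete: compare column by column with `Series.isin`, which already ignores the index of the values it is given, so no Python dictionary is built. The helper name `isin_ignore_index` is hypothetical and not part of pandas or this PR, and the sketch assumes both frames have unique column names.

```python
import pandas as pd


def isin_ignore_index(df: pd.DataFrame, values: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: column-wise membership test that ignores row labels.

    Assumes unique column names in both frames.
    """
    out = {}
    for col in df.columns:
        if col in values.columns:
            # Series.isin treats its argument as a plain collection of values,
            # so the index of values[col] plays no role in the match.
            out[col] = df[col].isin(values[col])
        else:
            # No column of that name in `values`: nothing can match.
            out[col] = pd.Series(False, index=df.index)
    return pd.DataFrame(out, index=df.index, columns=df.columns)


df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})
values = pd.DataFrame({"a": [3, 1], "b": ["q", "x"]}, index=[10, 20])
result = isin_ignore_index(df, values)
```

On this small input the result matches what `df.isin(values.to_dict("list"))` returns. Whether it is actually faster on large frames would need benchmarking.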
doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.
Added the ignore_index feature for df.isin() as part of #7258. The next step would be to add the check_entire_row param.
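A small illustration of the behaviour this is after, as I understand it. The `ignore_index` keyword exists only in this PR, so the sketch below uses the existing dict form of `DataFrame.isin`, which is what the diff above calls internally; the frame names are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})
other = pd.DataFrame({"a": [3, 1]}, index=[10, 20])

# Current behaviour: `other` is aligned to df's labels first. The row labels
# 0, 1, 2 do not appear in `other`, so nothing matches.
aligned = df.isin(other)

# Index-agnostic matching: each column is checked against the list of values
# in the same-named column of `other`, regardless of row labels.
ignoring = df.isin(other.to_dict("list"))
```

Here `aligned` is all False, while `ignoring` marks the rows holding 1 and 3 as True.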