fix kfintech conflict resolver dropping group columns #8
Merged
sandeeprjs92 merged 1 commit into main on Apr 14, 2026
resolve_kfintech_conflicts used groupby.apply with include_groups=False to silence a pandas 2.2 FutureWarning. That flag strips the group columns (transaction_number, folio_number) from the apply result, so the downstream composite_key builder crashed with KeyError on any KFintech file that reached this step. Rewrite as a manual row-index scan: loop over groups, collect indices of rows to keep, return df.loc[kept]. Keeps every column, no deprecated API, same semantics. Regression test in test_cleaner.py runs a minimal KFintech df through Cleaner.run and asserts the expected columns survive.
resolve_kfintech_conflicts used groupby.apply(include_groups=False) to silence a pandas 2.2 FutureWarning. That flag also strips the group columns (transaction_number, folio_number) from the apply result, so the downstream composite-key builder crashed with KeyError: 'transaction_number' on any KFintech file that reached this step; tools/end_to_end_demo.py hit it on kfintech_sample.csv.

Rewrite as a manual row-index scan: loop over groups, collect indices of rows to keep, return df.loc[kept]. Keeps every column, no deprecated API, same semantics.

Regression test in test_cleaner.py runs a minimal KFintech DataFrame through Cleaner.run and asserts that transaction_number, folio_number, and composite_key all survive.

Verified end-to-end locally: 96 unit tests pass, tools/end_to_end_demo.py produces 49 transactions across all three sample files (CAMS CSV, KFintech CSV, KFintech DBF).
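
The manual row-index scan described above might look roughly like the sketch below. It is an illustration under stated assumptions, not the merged implementation: resolve_conflicts_sketch, the amount column, and the "keep the last row of each group" rule are hypothetical, and the frame is assumed to have a unique, sortable index so that df.loc selects exactly the kept rows.

```python
import pandas as pd

GROUP_COLS = ["transaction_number", "folio_number"]


def resolve_conflicts_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Keep one row per group by collecting index labels, without groupby.apply."""
    kept = []
    # Iterating a groupby yields (key, sub-frame) pairs; each sub-frame keeps
    # the original index labels and every column, group columns included.
    for _, group in df.groupby(GROUP_COLS, sort=False):
        # Hypothetical conflict rule: keep the last row of each group.
        kept.append(group.index[-1])
    # Sort the labels so surviving rows stay in their original order.
    return df.loc[sorted(kept)]


# Hypothetical miniature KFintech frame for illustration.
df = pd.DataFrame(
    {
        "transaction_number": ["T1", "T1", "T2"],
        "folio_number": ["F1", "F1", "F2"],
        "amount": [100.0, -100.0, 250.0],
    }
)

resolved = resolve_conflicts_sketch(df)

# The check a regression test would make: the group columns survive.
assert set(GROUP_COLS) <= set(resolved.columns)
print(resolved)
```

Because the result is a plain df.loc selection on the original frame, no column can be dropped along the way. One caveat of this sketch: groupby skips rows whose key columns contain NaN by default, so such rows would not appear in the output unless dropna=False is passed.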