[SC-9912] Fix index issue when using input_grid #335

Merged
AnilSorathiya merged 2 commits into main from juan/sc-9112/fix-indices-error-in-weakspots-diagnosis-when-using-input_grid
Mar 11, 2025

Conversation

@juanmleng
Contributor

Internal Notes for Reviewers

Fixes an InvalidIndexError caused by non-unique indices during DataFrame concatenation. The error occurs because pandas is asked to concatenate DataFrames whose indices are not unique, which happens when we combine results from multiple features and their bins.
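
A minimal sketch of how duplicate index labels can arise, assuming each per-feature result is a small DataFrame with pandas' default integer index. The column values and variable names are illustrative only and are not taken from WeakspotsDiagnosis.py.

```python
import pandas as pd

# Hypothetical per-feature results; each frame gets the default index 0..n-1.
age_bins = pd.DataFrame({"Feature": ["age", "age"], "Number of Records": [120, 80]})
income_bins = pd.DataFrame({"Feature": ["income", "income"], "Number of Records": [95, 105]})

# Stacking the frames keeps each frame's own labels, so 0 and 1 each appear twice.
stacked = pd.concat([age_bins, income_bins])
has_unique_index = stacked.index.is_unique  # False

# Asking pandas to discard the old labels yields a fresh 0..n-1 index.
renumbered = pd.concat([age_bins, income_bins], ignore_index=True)
```

Later operations that need to align rows by label can fail on an index like `stacked.index`, whereas `renumbered` has one label per row.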

External Release Notes

@juanmleng added the bug label (Something isn't working) on Mar 11, 2025
@juanmleng self-assigned this on Mar 11, 2025
@juanmleng added the internal label (Not to be externalized in the release notes) on Mar 11, 2025
@github-actions
Contributor

PR Summary

This pull request refactors the WeakspotsDiagnosis function in the WeakspotsDiagnosis.py file to improve data handling and metric computation. The key changes include:

  1. Renaming Columns: The column previously named "Shape" has been renamed to "Number of Records" to better reflect its content.

  2. DataFrame Initialization: The results DataFrames results_1 and results_2 are now initialized outside the loop, allowing for the accumulation of results across iterations.

  3. Refactoring Metric Computation: The metric computation logic has been refactored to use temporary dictionaries r1 and r2 for storing intermediate results, which are then concatenated into the main results DataFrames.

  4. Improved Sorting and Indexing: The final DataFrame's index is now reset and its rows are sorted by the "Feature" and "Dataset" columns, ensuring a consistent and clear output format.

These changes aim to enhance the clarity and efficiency of the code, particularly in how results are accumulated and presented.
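
The accumulation pattern described above can be sketched roughly as follows. This is not the code from the PR: the metric values, the `bin_metrics` list, and the `row` dictionary (standing in for r1 and r2) are hypothetical, and the sketch collects one-row frames in a list and concatenates once, a common variant of concatenating into a results DataFrame on every iteration.

```python
import pandas as pd

# Hypothetical per-bin metrics: (feature, dataset, record count, accuracy).
bin_metrics = [
    ("income", "Training", 95, 0.81),
    ("age", "Test", 40, 0.77),
    ("age", "Training", 120, 0.90),
]

# Accumulator created once, outside the loop.
frames = []

for feature, dataset, n_records, accuracy in bin_metrics:
    # Temporary dictionary holding one row of intermediate results.
    row = {
        "Feature": [feature],
        "Dataset": [dataset],
        "Number of Records": [n_records],
        "Accuracy": [accuracy],
    }
    frames.append(pd.DataFrame(row))

# Every one-row frame carries the label 0, so the combined index is not unique.
results = pd.concat(frames)

# Sort by the two key columns, then replace the labels with a clean 0..n-1 index.
results = results.sort_values(["Feature", "Dataset"]).reset_index(drop=True)
```

Sorting before resetting the index is a choice made in this sketch so that the final labels follow the sorted order.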

Test Suggestions

  • Verify that the column name change from 'Shape' to 'Number of Records' is correctly reflected in all outputs.
  • Test the WeakspotsDiagnosis function with datasets containing both categorical and continuous features to ensure correct binning and metric computation.
  • Check that the final DataFrame output is correctly sorted by 'Feature' and 'Dataset'.
  • Ensure that the function handles empty dataframes correctly without errors.
  • Validate that the visual assessment logic for weak spots remains functional and accurate.
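
Some of these checks could be expressed as plain assertions on the output table. The helper below is a hypothetical sketch: `check_weakspots_table` is not part of the library, and it assumes the table is a pandas DataFrame with "Feature" and "Dataset" columns holding strings.

```python
import pandas as pd


def check_weakspots_table(table: pd.DataFrame) -> None:
    """Assert a few properties from the test suggestions (hypothetical helper)."""
    # The renamed column is present and the old name is gone.
    assert "Number of Records" in table.columns
    assert "Shape" not in table.columns
    # Each row has its own index label.
    assert table.index.is_unique
    # Rows are ordered by Feature, then Dataset.
    keys = list(zip(table["Feature"], table["Dataset"]))
    assert keys == sorted(keys)


# Illustrative data only; an empty table with the same columns also passes.
sample = pd.DataFrame(
    {
        "Feature": ["age", "age", "income"],
        "Dataset": ["Test", "Training", "Training"],
        "Number of Records": [40, 120, 95],
    }
)
check_weakspots_table(sample)
```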

Contributor

@cachafla left a comment

Nice

@AnilSorathiya merged commit b7f49a3 into main on Mar 11, 2025
6 checks passed
@johnwalz97 deleted the juan/sc-9112/fix-indices-error-in-weakspots-diagnosis-when-using-input_grid branch on August 20, 2025 17:03
Labels

bug: Something isn't working
internal: Not to be externalized in the release notes

4 participants