[SC-13352] Small fixes to TrainingTestDegradation test by juanmleng · Pull Request #452 · validmind/validmind-library

juanmleng · 2025-11-25T13:44:22Z

Pull Request Description

What and why?

What

Removed incorrect references to "accuracy" metric from the docstring. The test only evaluates precision, recall, and f1-score per class, not accuracy.
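To illustrate the distinction, here is a minimal sketch of per-class precision, recall, and f1-score computed from plain label lists. It is not the library's implementation; the helper name `per_class_metrics` and the sample labels are hypothetical. The point is that each class gets its own three scores, whereas accuracy would be one overall number and does not appear in the per-class result.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return {label: {"precision", "recall", "f1-score"}} for each class.

    Hypothetical helper for illustration only, not ValidMind code.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for actual, predicted in zip(y_true, y_pred):
        if actual == predicted:
            tp[actual] += 1
        else:
            fp[predicted] += 1  # predicted this class, but it was wrong
            fn[actual] += 1     # missed an instance of the actual class

    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted_count = tp[label] + fp[label]
        actual_count = tp[label] + fn[label]
        precision = tp[label] / predicted_count if predicted_count else 0.0
        recall = tp[label] / actual_count if actual_count else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1-score": f1}
    return report


# Made-up labels: four samples of class 0 and two of class 1
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
report = per_class_metrics(y_true, y_pred)
for label, scores in report.items():
    print(label, scores)
```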

Why

The test docstring is used as context for LLM-based test descriptions. The inaccuracy in the docstring causes the LLM to incorrectly state that accuracy is being computed, which results in low faithfulness scores when evaluating the generated descriptions against the actual test implementation.

How to test

Run the TrainingTestDegradation test, using either the customer churn or the credit risk scorecard notebook, and check that the generated description does not reference accuracy as one of the computed metrics.
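The manual check can be approximated with a small script. This is a rough sketch, assuming the generated description is available as a plain string; the function name `mentions_accuracy` and both sample descriptions are made up for illustration and are not real LLM output.

```python
import re


def mentions_accuracy(description: str) -> bool:
    """Return True if the text contains 'accuracy' as a whole word (case-insensitive)."""
    return re.search(r"\baccuracy\b", description, flags=re.IGNORECASE) is not None


# Hypothetical descriptions, standing in for the text generated before and after the fix
before = "The test compares accuracy, precision, recall and F1 score across datasets."
after = "The test compares precision, recall and F1 score for each class."

print(mentions_accuracy(before))
print(mentions_accuracy(after))
```

This is a coarse word match: it would also flag a description that correctly says accuracy is not computed, so a hit still needs a human read.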

Before and after: [screenshots of the generated description; images not preserved]

What needs special review?

Dependencies, breaking changes, and deployment notes

Release notes

Checklist

github-actions · 2025-11-25T13:44:54Z

PR Summary

This PR updates the docstring of the TrainingTestDegradation test by removing references to the accuracy metric. The docstring now states that the test compares only precision, recall, and f1 score between the training and test datasets. The explanation of the acceptable-degradation threshold was also updated to state its default value (0.10) explicitly. These changes improve the clarity and consistency of the test's description without altering the underlying test logic.

Test Suggestions

Run the tests to verify that model performance is only evaluated using precision, recall, and f1 score.
Validate that the output table correctly lists the train score, test score, degradation percentage, and pass/fail status for the specified metrics.
Check that documentation changes are reflected accurately in any auto-generated test reports or rendered documentation views.
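For reviewers who want a concrete picture of the table described above, the following is a minimal sketch, not the library's code. It assumes degradation is the relative drop from the train score to the test score and that a row passes when the drop is strictly below the threshold; both rules, the function name `degradation_rows`, the parameter name `max_threshold`, and the sample scores are assumptions. Only the 0.10 default and the column contents come from this PR.

```python
def degradation_rows(train_scores, test_scores, max_threshold=0.10):
    """Build one table row per metric (hypothetical helper, for illustration only)."""
    rows = []
    for metric in ("precision", "recall", "f1-score"):
        train = train_scores[metric]
        test = test_scores[metric]
        # Assumed definition: relative drop from train to test, 0 if train score is 0
        degradation = (train - test) / train if train else 0.0
        rows.append(
            {
                "Metric": metric,
                "Train Score": train,
                "Test Score": test,
                "Degradation (%)": round(degradation * 100, 2),
                # Assumed rule: pass when the drop is strictly below the threshold
                "Pass/Fail": "Pass" if degradation < max_threshold else "Fail",
            }
        )
    return rows


# Made-up scores for a single class
train = {"precision": 0.80, "recall": 0.90, "f1-score": 0.85}
test = {"precision": 0.76, "recall": 0.72, "f1-score": 0.74}
rows = degradation_rows(train, test)
for row in rows:
    print(row)
```

With these made-up numbers, precision drops by about 5% and stays under the 0.10 threshold, while recall drops by about 20% and does not.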

AnilSorathiya

thanks!

Updated docstring

880be93

juanmleng self-assigned this Nov 25, 2025

juanmleng added the bug (Something isn't working) and internal (Not to be externalized in the release notes) labels Nov 25, 2025

juanmleng requested a review from AnilSorathiya November 25, 2025 14:36

AnilSorathiya approved these changes Nov 25, 2025


juanmleng merged commit 0b849ec into main Nov 25, 2025
19 checks passed

juanmleng deleted the juan/sc-13352/small-fixes-to-training-test-degradation-test branch November 25, 2025 15:31

juanmleng merged 1 commit into main from juan/sc-13352/small-fixes-to-training-test-degradation-test
