Merged
44 commits
95bd0fd  Draft notebook for model validation quickstart (validbeck, May 21, 2025)
e8a46eb  Validator intro (validbeck, May 21, 2025)
a1e1746  Headings (validbeck, May 21, 2025)
bc75cc5  Champion model export for validation quickstart (validbeck, May 21, 2025)
f535c26  Verify data quality WIP (validbeck, May 21, 2025)
06923fd  Save point (validbeck, May 21, 2025)
a6438b0  Save point (validbeck, May 21, 2025)
2482178  Save point (validbeck, May 21, 2025)
9863af1  Edit (validbeck, May 21, 2025)
d81a049  Editing test example (validbeck, May 21, 2025)
8214ace  Data comparison tests (validbeck, May 21, 2025)
9eda2cb  Data comparison tests edit (validbeck, May 21, 2025)
b75ae8b  Import champion wip (validbeck, May 21, 2025)
8d0549a  Modified validator credentials for app scorecard (validbeck, May 21, 2025)
388ae20  Performance tests WIP (validbeck, May 21, 2025)
4e3ad3b  Performance tests edit (validbeck, May 21, 2025)
83e4743  Performance tests edit2 (validbeck, May 21, 2025)
f94f06d  Diagnostic test WIP (validbeck, May 21, 2025)
f6ab1bd  Feature importance tests WIP (validbeck, May 21, 2025)
2868b51  Editing... (validbeck, May 22, 2025)
15f12bf  More editing (validbeck, May 22, 2025)
b65ab59  ToC (validbeck, May 22, 2025)
47a66ad  Editing (validbeck, May 22, 2025)
1a80c53  More context to validation series (validbeck, May 22, 2025)
e3967b6  Added extended Next steps to model development series (validbeck, May 22, 2025)
58c5310  Broken link, cleanup (validbeck, May 22, 2025)
60d6c35  Tweak (validbeck, May 22, 2025)
7e5e83d  Update notebooks/quickstart/quickstart_model_validation.ipynb (validbeck, May 27, 2025)
62dd32e  Update notebooks/quickstart/quickstart_model_validation.ipynb (validbeck, May 27, 2025)
e456d97  Update notebooks/quickstart/quickstart_model_validation.ipynb (validbeck, May 27, 2025)
31c07d4  Update notebooks/quickstart/quickstart_model_validation.ipynb (validbeck, May 27, 2025)
f674971  Update notebooks/quickstart/quickstart_model_validation.ipynb (validbeck, May 27, 2025)
5c2586a  Update notebooks/quickstart/quickstart_model_validation.ipynb (validbeck, May 27, 2025)
e614fb5  Update notebooks/quickstart/quickstart_model_validation.ipynb (validbeck, May 27, 2025)
3107dd4  Update notebooks/quickstart/quickstart_model_validation.ipynb (validbeck, May 27, 2025)
fbb4aee  Update notebooks/quickstart/quickstart_model_validation.ipynb (validbeck, May 27, 2025)
339d696  Update notebooks/quickstart/quickstart_model_validation.ipynb (validbeck, May 27, 2025)
b397956  Update notebooks/quickstart/quickstart_model_validation.ipynb (validbeck, May 27, 2025)
4a8e33f  Update notebooks/quickstart/quickstart_model_validation.ipynb (validbeck, May 27, 2025)
f5b4a85  Fixing JSON errors from suggestions (validbeck, May 27, 2025)
e5d98af  Fixing context around Juan's suggestions (validbeck, May 27, 2025)
88396e9  Removing unneeded validation dataset initialization (validbeck, May 27, 2025)
6248385  Proofreading Ama's suggestions (validbeck, May 27, 2025)
4bf5266  Readding the validation dataset lol oops (validbeck, May 21, 2025)
2 changes: 2 additions & 0 deletions .gitignore
@@ -197,6 +197,8 @@ lending_club_loan_data_*.csv
!notebooks/code_samples/model_validation/xgb_model_champion.pkl
# Sample logistic regression model for validation series — do not remove!
!notebooks/tutorials/model_validation/lr_model_champion.pkl
# Sample XGBoost model for validation quickstart — do not remove!
!notebooks/quickstart/xgboost_model_champion.pkl

notebooks/llm/datasets/*.jsonl

@@ -202,13 +202,19 @@
"\n",
"In order to log tests as a validator instead of as a developer, on the model details page that appears after you've successfully registered your sample model:\n",
"\n",
"1. Remove yourself as a developer: \n",
"1. Remove yourself as a model owner: \n",
"\n",
" - Click on the **OWNERS** tile.\n",
" - Click the **x** next to your name to remove yourself from that model's role.\n",
" - Click **Save** to apply your changes to that role.\n",
"\n",
"2. Remove yourself as a developer: \n",
"\n",
" - Click on the **DEVELOPERS** tile.\n",
" - Click the **x** next to your name to remove yourself from that model's role.\n",
" - Click **Save** to apply your changes to that role.\n",
"\n",
"2. Add yourself as a validator: \n",
"3. Add yourself as a validator: \n",
"\n",
" - Click on the **VALIDATORS** tile.\n",
" - Select your name from the drop-down menu.\n",
@@ -1358,9 +1364,9 @@
"\n",
"## Run diagnostic tests\n",
"\n",
"Next we want to inspect the robustness and stability testing comparison between our champion and challenger model.\n",
"Next, we want to inspect the robustness and stability testing comparison between our champion and challenger model.\n",
"\n",
"Use `list_tests()` to identify all the model diagnosis tests for classification:"
"Use `list_tests()` to list all available diagnosis tests applicable to classification tasks:"
]
},
{
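The hunk above revises how the notebook describes `list_tests()` for discovering classification diagnosis tests. As a rough illustration of the idea only (the registry, tags, and signature below are invented stand-ins, not the actual `validmind.tests` internals), filtering a test catalog by task might look like:

```python
# Hypothetical sketch of filtering a test registry by task type.
# The real validmind.tests.list_tests() supports richer filtering;
# the registry below is invented for illustration only.
REGISTRY = {
    "model_validation.sklearn.OverfitDiagnosis": {"tasks": ["classification", "regression"]},
    "model_validation.sklearn.RobustnessDiagnosis": {"tasks": ["classification", "regression"]},
    "model_validation.sklearn.ClassifierPerformance": {"tasks": ["classification"]},
    "data_validation.DatasetDescription": {"tasks": []},
}

def list_tests(task=None, name_contains=None):
    """Return test IDs matching an optional task and substring filter."""
    results = []
    for test_id, meta in REGISTRY.items():
        if task is not None and task not in meta["tasks"]:
            continue
        if name_contains is not None and name_contains.lower() not in test_id.lower():
            continue
        results.append(test_id)
    return sorted(results)

print(list_tests(task="classification", name_contains="Diagnosis"))
```

The real call in the notebooks returns the library's own test IDs; this sketch only shows the filtering pattern.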
4 changes: 2 additions & 2 deletions notebooks/quickstart/quickstart_model_documentation.ipynb
@@ -617,7 +617,7 @@
"\n",
"### Assign predictions\n",
"\n",
"Once the model has been registered you can assign model predictions to the training and testing datasets.\n",
"Once the model has been registered, you can assign model predictions to the training and testing datasets.\n",
"\n",
"- The [`assign_predictions()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#assign_predictions) from the `Dataset` object can link existing predictions to any number of models.\n",
"- This method links the model's class prediction values and probabilities to our `vm_train_ds` and `vm_test_ds` datasets.\n",
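The bullets above describe linking predictions to datasets. As a conceptual sketch of that pattern only (the `VMDataset` class, method signature, and toy model below are invented for illustration; see the linked `assign_predictions()` reference for the real API), storing per-model predictions on a dataset object could look like:

```python
# Conceptual sketch of linking model predictions to a dataset object,
# in the spirit of vm_dataset.assign_predictions(model). All names here
# are illustrative stand-ins, not the ValidMind implementation.
class VMDataset:
    def __init__(self, rows):
        self.rows = rows          # list of feature dicts
        self.predictions = {}     # model_id -> list of class predictions
        self.probabilities = {}   # model_id -> list of probabilities

    def assign_predictions(self, model_id, predict_fn):
        """Compute and store class predictions and probabilities for one model."""
        probs = [predict_fn(row) for row in self.rows]
        self.probabilities[model_id] = probs
        self.predictions[model_id] = [int(p >= 0.5) for p in probs]

ds = VMDataset([{"x": 0.2}, {"x": 0.9}])
ds.assign_predictions("champion", lambda row: row["x"])  # toy "model"
print(ds.predictions["champion"])  # [0, 1]
```

The key design point mirrored here is that one dataset can hold predictions from any number of models, keyed by model, which is what lets later tests compare champion and challenger on the same data.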
@@ -755,7 +755,7 @@
"\n",
"2. In the left sidebar that appears for your model, click **Documentation**.\n",
"\n",
"What you see is the full draft of your model documentation in a more easily consumable version. From here, you can make qualitative edits to model documentation, view guidelines, collaborate with validators, and submit your model documentation for approval when it's ready. [Learn more ...](https://docs.validmind.ai/guide/working-with-model-documentation.html)"
" What you see is the full draft of your model documentation in a more easily consumable version. From here, you can make qualitative edits to model documentation, view guidelines, collaborate with validators, and submit your model documentation for approval when it's ready. [Learn more ...](https://docs.validmind.ai/guide/working-with-model-documentation.html)"
]
},
{
1,174 changes: 1,174 additions & 0 deletions notebooks/quickstart/quickstart_model_validation.ipynb

Large diffs are not rendered by default.

Binary file added notebooks/quickstart/xgboost_model_champion.pkl
Binary file not shown.
@@ -896,7 +896,19 @@
"\n",
"### Work with your model documentation\n",
"\n",
"Now that you've logged all your test results and generated a draft for your model documentation, head to the ValidMind Platform to make qualitative edits, view guidelines, collaborate with validators, and submit your model documentation for approval when it's ready. **Learn more:** [Working with model documentation](https://docs.validmind.ai/guide/model-documentation/working-with-model-documentation.html)"
"Now that you've logged all your test results and generated a draft for your model documentation, head to the ValidMind Platform to wrap up your model documentation. Continue to work on your model documentation by:\n",
"\n",
"- **Run and log more tests:** Use the skills you learned in this series of notebooks to run and log more individual tests, including custom tests, then insert them into your documentation as supplementary evidence. (Learn more: [`validmind.tests`](https://docs.validmind.ai/validmind/validmind/tests.html))\n",
"\n",
"- **Inserting additional test results:** Add **Test-Driven Blocks** under any relevant section of your model documentation. (Learn more: [Work with test results](https://docs.validmind.ai/guide/model-documentation/work-with-test-results.html))\n",
"\n",
"- **Making qualitative edits to your test descriptions:** Click on the description of any inserted test results to review and edit the ValidMind-generated test descriptions for quality and accuracy. (Learn more: [Working with model documentation](https://docs.validmind.ai/guide/model-documentation/working-with-model-documentation.html#add-or-edit-documentation))\n",
"\n",
"- **View guidelines:** In any section of your model documentation, click **​ValidMind Insights** in the top right corner to reveal the Documentation Guidelines for each section to help guide the contents of your model documentation. (Learn more: [View documentation guidelines](https://docs.validmind.ai/guide/model-documentation/view-documentation-guidelines.html))\n",
"\n",
"- **Collaborate with other stakeholders:** Use the ValidMind Platform's real-time collaborative features to work seamlessly together with the rest of your organization, including model validators. Review suggested changes in your content blocks, work with versioned history, and use comments to discuss specific portions of your model documentation. (Learn more: [Collaborate with others](https://docs.validmind.ai/guide/model-documentation/collaborate-with-others.html))\n",
"\n",
"When your model documentation is complete and ready for review, submit it for approval from the same ValidMind Platform where you made your edits and collaborated with the rest of your organization, ensuring transparency and a thorough model development history. (Learn more: [Submit for approval](https://docs.validmind.ai/guide/model-documentation/submit-for-approval.html))"
]
},
{
@@ -564,7 +564,7 @@
"\n",
"## Documenting test results\n",
"\n",
"Now that we've done some analysis on two different datasets, we can use ValidMind to easily document why certain things were done to our raw data with testing to support it. As we learned above, every test result returned by the `run_test()` function has a `.log()` method that can be used to send the test results to the ValidMind Platform.\n",
"Now that we've done some analysis on two different datasets, we can use ValidMind to easily document why certain things were done to our raw data with testing to support it. Every test result returned by the `run_test()` function has a `.log()` method that can be used to send the test results to the ValidMind Platform.\n",
"\n",
"When logging validation test results to the platform, you'll need to manually add those results to the desired section of the validation report. To demonstrate how to add test results to your validation report, we'll log our data quality tests and insert the results via the ValidMind Platform."
]
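The paragraph above describes the `run_test(...).log()` workflow. As a mock sketch of that shape only (the classes, the in-memory `PLATFORM_LOG`, and the missing-values check below are invented stand-ins, not the `validmind` implementation), the pattern of a test result object carrying its own `.log()` method can be illustrated as:

```python
# Conceptual sketch of the run_test(...).log() pattern: every test run
# returns a result object whose .log() sends it to the platform. The
# names below are invented stand-ins, not the validmind implementation.
PLATFORM_LOG = []  # stands in for the ValidMind Platform backend

class TestResult:
    def __init__(self, test_id, passed, summary):
        self.test_id, self.passed, self.summary = test_id, passed, summary

    def log(self):
        """Send this result to the (mock) platform for later linking."""
        PLATFORM_LOG.append({"test_id": self.test_id, "passed": self.passed,
                             "summary": self.summary})
        return self

def run_test(test_id, values, max_missing=0):
    missing = sum(v is None for v in values)
    return TestResult(test_id, missing <= max_missing,
                      f"{missing} missing value(s)")

run_test("data_validation.MissingValues", [1, None, 3]).log()
print(PLATFORM_LOG[0]["passed"])  # False
```

As the notebook notes, logging is only half the workflow: the logged result still has to be linked into the right section of the validation report in the ValidMind Platform.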
@@ -520,7 +520,7 @@
"\n",
"## Running model evaluation tests\n",
"\n",
"With everything ready for us, let's run the rest of our validation tests. We'll focus on comprehensive testing around model performance of both the champion and challenger models going forward as we've already verified the data quality of the datasets used to train the champion model."
"With our setup complete, let's run the rest of our validation tests. Since we have already verified the data quality of the dataset used to train our champion model, we will now focus on comprehensive performance evaluations of both the champion and challenger models."
]
},
{
@@ -584,7 +584,10 @@
"\n",
"#### Evaluate performance of the champion model\n",
"\n",
"Now, let's run and log our batch of model performance tests using our testing dataset (`vm_test_ds`) for our champion model:"
"Now, let's run and log our batch of model performance tests using our testing dataset (`vm_test_ds`) for our champion model:\n",
"\n",
"- The test set serves as a proxy for real-world data, providing an unbiased estimate of model performance since it was not used during training or tuning.\n",
"- The test set also acts as protection against selection bias and model tweaking, giving a final, more unbiased checkpoint."
]
},
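The bullets above explain why the held-out test set gives an unbiased performance estimate. A minimal sketch of that reasoning, using a toy threshold "model" and invented data rather than the notebook's champion model:

```python
# Minimal sketch of why we score the champion on a held-out test set:
# the test split gives a less biased estimate because the model never
# saw it during fitting. Data and "model" below are toy stand-ins.
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

predict = lambda x: int(x >= 0.5)  # toy threshold "model"

train_x, train_y = [0.1, 0.4, 0.6, 0.9], [0, 0, 1, 1]
test_x,  test_y  = [0.2, 0.55, 0.7, 0.45], [0, 1, 1, 1]

train_acc = accuracy(train_y, [predict(x) for x in train_x])
test_acc  = accuracy(test_y,  [predict(x) for x in test_x])
print(train_acc, test_acc)  # 1.0 0.75
```

The gap between the two numbers is exactly the signal the notebook relies on: training accuracy reflects data the model has seen, test accuracy simulates new data.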
{
@@ -725,9 +728,9 @@
"\n",
"### Run diagnostic tests\n",
"\n",
"Next we want to inspect the robustness and stability testing comparison between our champion and challenger model.\n",
"Next, we want to inspect the robustness and stability testing comparison between our champion and challenger model.\n",
"\n",
"Use `list_tests()` to identify all the model diagnosis tests for classification:"
"Use `list_tests()` to list all available diagnosis tests applicable to classification tasks:"
]
},
{
@@ -743,9 +746,12 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's see if models suffer from any *overfit* potentials and also where there are potential sub-segments of issues with the [`OverfitDiagnosis` test](https://docs.validmind.ai/tests/model_validation/sklearn/OverfitDiagnosis.html). \n",
"Let’s now assess the models for potential signs of *overfitting* and identify any sub-segments where performance may inconsistent with the [`OverfitDiagnosis` test](https://docs.validmind.ai/tests/model_validation/sklearn/OverfitDiagnosis.html).\n",
"\n",
"Overfitting occurs when a model learns the training data too well, capturing not only the true pattern but noise and random fluctuations resulting in excellent performance on the training dataset but poor generalization to new, unseen data:\n",
"\n",
"Overfitting occurs when a model learns the training data too well, capturing not only the true pattern but noise and random fluctuations resulting in excellent performance on the training dataset but poor generalization to new, unseen data."
"- Since the training dataset (`vm_train_ds`) was used to fit the model, we use this set to establish a baseline performance for how well the model performs on data it has already seen.\n",
"- The testing dataset (`vm_test_ds`) was never seen during training, and here simulates real-world generalization, or how well the model performs on new, unseen data. "
]
},
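The cell above motivates the `OverfitDiagnosis` test by contrasting training and testing performance. A toy sketch of the underlying idea only (segment names, scores, and the cutoff below are invented; the real test computes its own metrics and regions):

```python
# Sketch of the idea behind OverfitDiagnosis: compare per-segment accuracy
# on training vs. testing data and flag segments whose gap exceeds a cutoff.
# Segments, scores, and cutoff here are illustrative stand-ins.
def overfit_segments(train_scores, test_scores, cutoff=0.10):
    """Return segment names whose train-test accuracy gap exceeds cutoff."""
    flagged = []
    for segment in train_scores:
        gap = train_scores[segment] - test_scores[segment]
        if gap > cutoff:
            flagged.append(segment)
    return flagged

train_acc = {"age<30": 0.97, "age>=30": 0.91}
test_acc  = {"age<30": 0.78, "age>=30": 0.89}
print(overfit_segments(train_acc, test_acc))  # ['age<30']
```

A large per-segment gap is the "sub-segment of issues" the notebook refers to: the model generalizes well overall but poorly for one slice of the population.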
{
@@ -767,9 +773,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's also conduct *robustness* and *stability* testing of the two models with the [`RobustnessDiagnosis` test](https://docs.validmind.ai/tests/model_validation/sklearn/RobustnessDiagnosis.html).\n",
"Let's also conduct *robustness* and *stability* testing of the two models with the [`RobustnessDiagnosis` test](https://docs.validmind.ai/tests/model_validation/sklearn/RobustnessDiagnosis.html). Robustness refers to a model's ability to maintain consistent performance, and stability refers to a model's ability to produce consistent outputs over time across different data subsets.\n",
"\n",
"Robustness refers to a model's ability to maintain consistent performance, and stability refers to a model's ability to produce consistent outputs over time across different data subsets."
"Again, we'll use both the training and testing datasets to establish baseline performance and to simulate real-world generalization:"
]
},
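The cell above introduces robustness and stability testing. As a rough sketch of the perturbation intuition behind `RobustnessDiagnosis` only (the deterministic shift, toy model, and data below are invented; the real test applies its own perturbation scheme):

```python
# Sketch of the intuition behind RobustnessDiagnosis: perturb inputs with
# small noise and check how much accuracy degrades. The deterministic
# "noise" and toy model below are illustrative stand-ins.
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

predict = lambda x: int(x >= 0.5)
xs = [0.05, 0.40, 0.48, 0.95]
ys = [0, 0, 0, 1]

baseline = accuracy(ys, [predict(x) for x in xs])
perturbed = accuracy(ys, [predict(x + 0.05) for x in xs])  # shift features slightly
print(baseline, perturbed)  # 1.0 0.75
```

A model whose accuracy collapses under tiny input shifts (here, the point sitting near the decision threshold flips) is exactly what robustness testing is meant to surface.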
{
@@ -811,6 +817,13 @@
"FI"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We'll only use our testing dataset (`vm_test_ds`) here, to provide a realistic, unseen sample that mimic future or production data, as the training dataset has already influenced our model during learning:"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -1144,13 +1144,17 @@
"\n",
"- **Inserting additional test results:** Click **Link Evidence to Report** under any section of 2. Validation in your validation report. (Learn more: [Link evidence to reports](https://docs.validmind.ai/guide/model-validation/assess-compliance.html#link-evidence-to-reports))\n",
"\n",
"- **Making qualitative edits to your test descriptions:** Expand any linked evidence under Validator Evidence and click **See evidence details** to review and edit the ValidMind-generated test descriptions for quality and accuracy.\n",
"- **Making qualitative edits to your test descriptions:** Expand any linked evidence under Validator Evidence and click **See evidence details** to review and edit the ValidMind-generated test descriptions for quality and accuracy. (Learn more: [Preparing validation reports](https://docs.validmind.ai/guide/model-validation/preparing-validation-reports.html#get-started))\n",
"\n",
"- **Adding more findings:** Click **Link Finding to Report** in any validation report section, then click **+ Create New Finding**. (Learn more: [Add and manage model findings](https://docs.validmind.ai/guide/model-validation/add-manage-model-findings.html))\n",
"\n",
"- **Adding risk assessment notes:** Click under **Risk Assessment Notes** in any validation report section to access the text editor and content editing toolbar, including an option to generate a draft with AI. Edit your ValidMind-generated test descriptions (Learn more: [Work with content blocks](https://docs.validmind.ai/guide/model-documentation/work-with-content-blocks.html#content-editing-toolbar))\n",
"- **Adding risk assessment notes:** Click under **Risk Assessment Notes** in any validation report section to access the text editor and content editing toolbar, including an option to generate a draft with AI. Once generated, edit your ValidMind-generated test descriptions to adhere to your organization's requirements. (Learn more: [Work with content blocks](https://docs.validmind.ai/guide/model-documentation/work-with-content-blocks.html#content-editing-toolbar))\n",
"\n",
"- **Assessing compliance:** Under the Guideline for any validation report section, click **ASSESSMENT** and select the compliance status from the drop-down menu. (Learn more: [Provide compliance assessments](https://docs.validmind.ai/guide/model-validation/assess-compliance.html#provide-compliance-assessments))"
"- **Assessing compliance:** Under the Guideline for any validation report section, click **ASSESSMENT** and select the compliance status from the drop-down menu. (Learn more: [Provide compliance assessments](https://docs.validmind.ai/guide/model-validation/assess-compliance.html#provide-compliance-assessments))\n",
"\n",
"- **Collaborate with other stakeholders:** Use the ValidMind Platform's real-time collaborative features to work seamlessly together with the rest of your organization, including model developers. Propose suggested changes in the model documentation, work with versioned history, and use comments to discuss specific portions of the model documentation. (Learn more: [Collaborate with others](https://docs.validmind.ai/guide/model-documentation/collaborate-with-others.html))\n",
"\n",
"When your validation report is complete and ready for review, submit it for approval from the same ValidMind Platform where you made your edits and collaborated with the rest of your organization, ensuring transparency and a thorough model validation history. (Learn more: [Submit for approval](https://docs.validmind.ai/guide/model-documentation/submit-for-approval.html))"
]
},
{