training: Updated Validator Fundamentals courses#713

Merged
validbeck merged 70 commits into main from
beck/sc-8468/update-validator-fundamentals-course
May 21, 2025

Conversation

@validbeck
Collaborator

@validbeck commented May 12, 2025

Internal Notes for Reviewers

sc-8468

Validator Fundamentals

REGISTRATION PAGE PREVIEW — 🚨 SEE COURSE MODULE PREVIEW DISCLAIMER BELOW!!!! 🚨

  • Like the Developer Fundamentals, I updated this course to be much more comprehensive yet straightforward.
  • There are now updated versions of the login instructions, individual module learning outcomes, module summaries, additional slides for things not previously covered, etc.
  • I thoroughly tested & timed these courses to make sure the instructions and the estimates made sense.

Course modules

🚨 NO LIVE PREVIEW 🚨 — AS THE IFRAME EMBEDS PULL FROM THE ROOT OF THE DOCS-DEMO FOLDER, YOU'LL NEED TO PULL DOWN THIS PR TO REVIEW & TEST THE TRAINING AS THE ONLINE HOSTED VERSION WON'T DRAW FROM THE CORRECT NOTEBOOK FILES:

  1. Retrieve the branch & load the preview:
gh pr checkout 713
cd site
quarto preview
  2. Navigate to Training > Validator Fundamentals.

Notebook execution

Important

I pulled in the latest notebook edits from the validmind/validmind-library#370 branch because I needed the updated credentials information in the notebooks as well. As a result, if we call make get-source again before that PR is out on a release tag, we'll want to revert the notebooks/ folder back to main.

tests/ and validmind/ have been reverted to main so as not to update them prematurely.

Proof that the development and validation execute notebooks both work even if the nested PR iframe can't display them properly:

Developer | Validator
Set up the ValidMind Library | Set up the ValidMind Library for validation
Start the model development process | Start the model validation process
Integrate custom tests | Developing a potential challenger model
Finalize testing and documentation | Finalize testing and reporting

Makefile

Adjusted the Makefile to simply copy over the notebooks/tutorials/ folder into a notebooks/EXECUTED/ directory including the model_validation courses:

@echo "Duplicating all files from notebooks/tutorials/model_validation/ for execution"

Also updated the backup make execute command:

execute:

Now the command asks which profile you want and which set of notebooks to execute (or a custom path), then asks for your credentials to create a temporary .env file. You need separate developer and validator credentials to execute each set of notebooks without errors:

❯ make execute
Select PROFILE:
1) exe-demo
2) exe-staging
3) exe-prod
Enter choice [1-3] (default: 1): 1
Select FILE_PATH:
1) notebooks/EXECUTED/model_development
2) notebooks/EXECUTED/model_validation
3) Enter custom filepath
Enter choice [1-3]: 1
Enter VM_API_HOST (default: https://api.prod.validmind.ai/api/v1/tracking): 
Enter VM_API_KEY: ***
Enter VM_API_SECRET: ***
Enter VM_API_MODEL: ***
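
For readers who want a feel for how prompts like these can work, here is a minimal shell sketch. It is not the Makefile's actual recipe: the function names resolve_profile and write_env_file are hypothetical, and only the profile names, the default choice, and the four VM_API_* variable names are taken from the session above.

```shell
# Hypothetical sketch; not the actual Makefile recipe from this PR.

# Map a menu choice (1-3) to a Quarto profile name; empty input falls back to 1.
resolve_profile() {
  case "${1:-1}" in
    1) echo "exe-demo" ;;
    2) echo "exe-staging" ;;
    3) echo "exe-prod" ;;
    *) echo "Invalid choice: $1" >&2; return 1 ;;
  esac
}

# Write the four credentials to an env file readable only by the current user.
# Arguments: target file, host, key, secret, model.
write_env_file() {
  target="$1"
  (
    umask 077
    printf 'VM_API_HOST=%s\nVM_API_KEY=%s\nVM_API_SECRET=%s\nVM_API_MODEL=%s\n' \
      "$2" "$3" "$4" "$5" > "$target"
  )
}
```

An interactive wrapper could collect the answer with read -r choice and pass "$choice" to resolve_profile, so pressing Enter selects the default.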

Warning

When you test this script, you may run into the following Lua warning:

WARNING (/Applications/quarto/share/filters/main.lua:9521) Unable to parse table from raw html block: skipping.

This is just a warning, and doesn't actually affect the render of the executed notebooks. (Try it, you may see an error in the .ipynb file, but not in the live .html output.)

GitHub workflows

These have now been adjusted to execute the validation notebooks alongside the development ones:

Demo/PRs | Staging | Prod
- name: Create valid.env file | - name: Create valid.env file | - name: Create valid.env file
- name: Execute ONLY the ValidMind for model validation series with heap development | - name: Execute ONLY the ValidMind for model validation series with heap staging | - name: Execute ONLY the ValidMind for model validation series with heap production

How it works

Code samples are from the PR preview/validate workflow; the functionality has been replicated across all workflows/actions:

  1. I added an additional GitHub secret called PLATFORM_VALID_MODEL for the ValidMind Academy validation model.
  2. Now the workflows create two temporary .env files, dev.env & valid.env, from the PLATFORM_DEV_MODEL & PLATFORM_VALID_MODEL secrets respectively:
    # If yes then create the dev.env file for use in execution step
    - name: Create dev.env file
      if: steps.filter.outputs.notebooks == 'true'
      id: create_dev_env
      run: |
        touch dev.env
        echo VM_API_HOST=${{ secrets.PLATFORM_API_HOST }} >> dev.env
        echo VM_API_KEY=${{ secrets.PLATFORM_API_KEY }} >> dev.env
        echo VM_API_SECRET=${{ secrets.PLATFORM_API_SECRET }} >> dev.env
        echo VM_API_MODEL=${{ secrets.PLATFORM_DEV_MODEL }} >> dev.env
        cat dev.env

    # If yes then create the valid.env file for use in execution step
    - name: Create valid.env file
      if: steps.filter.outputs.notebooks == 'true'
      id: create_valid_env
      run: |
        touch valid.env
        echo VM_API_HOST=${{ secrets.PLATFORM_API_HOST }} >> valid.env
        echo VM_API_KEY=${{ secrets.PLATFORM_API_KEY }} >> valid.env
        echo VM_API_SECRET=${{ secrets.PLATFORM_API_SECRET }} >> valid.env
        echo VM_API_MODEL=${{ secrets.PLATFORM_VALID_MODEL }} >> valid.env
        cat valid.env
  3. The two .env files are passed as inputs into the composite action for executing the notebooks:
    # Only execute the demo notebooks for training if .env files are created
    - name: Execute demo ValidMind for model development and validation series
      if: ${{ vars.ENABLE_DEMO_NOTEBOOK == 'true' && steps.create_dev_env.outcome == 'success' && steps.create_valid_env.outcome == 'success' }}
      uses: ./.github/actions/demo-notebook
      id: execute-demo-notebook
      with:
        dev_env: dev.env
        valid_env: valid.env
  4. The composite action checks for the required .env inputs:
inputs:
  dev_env:
    description: "Load the created dev.env file"
    required: true
  valid_env:
    description: "Load the created valid.env file"
    required: true
  5. It then executes the notebooks with each respective .env file, temporarily copying it into a plain .env (no filename prefix) so that the notebooks don't throw an .env not found error:
    - name: Ensure dev.env file is available
      shell: bash
      id: find_dev_env
      run: |
        if [ ! -f "${{ inputs.dev_env }}" ]; then
          echo "Error: dev.env file not found at ${{ inputs.dev_env }}"
          exit 1
        fi

    - name: Execute ONLY the ValidMind for model development series with heap development
      shell: bash
      if: ${{ steps.find_dev_env.outcome == 'success' }}
      run: |
          cd site
          cp ../${{ inputs.dev_env }} ../.env
          source ../.env
          quarto render --profile exe-demo notebooks/EXECUTED/model_development &> render_errors.log || {
            echo "Execute for ValidMind for model development series failed";
            cat render_errors.log;
            exit 1;
          }

    - name: Ensure valid.env file is available
      shell: bash
      id: find_valid_env
      run: |
        if [ ! -f "${{ inputs.valid_env }}" ]; then
          echo "Error: valid.env file not found at ${{ inputs.valid_env }}"
          exit 1
        fi

    - name: Execute ONLY the ValidMind for model validation series with heap development
      shell: bash
      if: ${{ steps.find_valid_env.outcome == 'success' }}
      run: |
          cd site
          cp ../${{ inputs.valid_env }} ../.env
          source ../.env
          quarto render --profile exe-demo notebooks/EXECUTED/model_validation &> render_errors.log || {
            echo "Execute for ValidMind for model validation series failed";
            cat render_errors.log;
            exit 1;
          }
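
The two execution steps above follow the same pattern: check that the env file exists, load it, run the render, and print the log on failure. That pattern could be factored into one shell function. This is a sketch under assumptions, not code from the PR: run_with_env is a hypothetical helper, set -a is added so that sourced variables are also exported to child processes (the original steps do not do this), and the copy to a plain .env that the notebooks rely on is omitted here.

```shell
# Hypothetical helper; not part of the PR's composite action.
# Usage: run_with_env <env-file> <log-file> <command> [args...]
# Pass the env file with a path component (for example ./dev.env) so that
# the `.` builtin does not search PATH for it.
run_with_env() {
  env_file="$1"
  log_file="$2"
  shift 2

  if [ ! -f "$env_file" ]; then
    echo "Error: env file not found at $env_file" >&2
    return 1
  fi

  (
    # The subshell keeps the credentials out of the caller's environment.
    set -a            # export every variable assigned while sourcing
    . "$env_file"
    set +a
    "$@"
  ) > "$log_file" 2>&1 || {
    echo "Command failed: $*" >&2
    cat "$log_file" >&2
    return 1
  }
}
```

With this helper, a step body could shrink to something like run_with_env ../dev.env render_errors.log quarto render --profile exe-demo notebooks/EXECUTED/model_development, using the same quarto invocation shown in the steps above.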

Note

It does up the runtime, because the validator notebooks have A LOT of code cells that need to be run, but I added caching for the environment used to execute the notebooks so it at least cuts down on some of that. Example from demo-notebook:

- name: Cache pip dependencies

Seeing as the filter is in place for PR previews (only execute when site/notebooks/EXECUTED/ is changed) and we push to staging/prod infrequently, I think this is an OK trade-off until we replace the execution environment.

Before cache: 36m exe. step | After cache: 30m exe. step
Workflow run | Workflow run

Developer Fundamentals

REGISTRATION PAGE PREVIEW — 🚨 SEE COURSE MODULE PREVIEW DISCLAIMER ABOVE!!!! 🚨

I gave these another proofread/quality-check and fixed some minor errors and screen real estate issues.

External Release Notes

ValidMind Academy

Validator Fundamentals

With the goal of delivering more approachable and comprehensive training, we've overhauled our Validator Fundamentals course into four modules with clear learning outcomes:

  1. Module 1 — Using ValidMind for Model Validation
  2. Module 2 — Running Data Quality Tests
  3. Module 3 — Developing Challenger Models
  4. Module 4 — Finalizing Validation Reports

This training series walks you through the basics of using ValidMind to validate datasets and models, from running and logging tests with the ValidMind Library to finalizing your model's validation report in the ValidMind Platform. You'll learn how to use ValidMind to automate testing, populate evidence in your validation report, track finding resolution, and more — register to take the course now!

@github-actions
Contributor

A PR preview is available: Preview URL

@validbeck added the documentation and highlight labels and removed the internal label on May 13, 2025
@validbeck changed the title from "[DRAFT] Updated Validator Fundamentals training" to "training: Updated Validator Fundamentals courses" on May 13, 2025
@validbeck requested a review from nrichers on May 13, 2025 20:14

Collaborator

@nrichers left a comment

Another great update to our training content, @validbeck! 👏 👏👏 I especially enjoyed reading through the challenger model section, nicely done!

Some random observations:

  • I'd noticed that the UI for adding content blocks had changed and we needed to update our animated GIF and here you are, you've already done it. Thank you!
  • The "Collaborate with others" section really needs to be more about governance & workflows (of which comments are a small part), but that's a future update.

> It does up the runtime, because the validator notebooks have A LOT of code cells that need to be run, but I added caching for the environment used to execute the notebooks so it at least cuts down on some of that.

We should perhaps consider parallelizing the notebook execution rather than inlining it as one of the "Validate docs site" steps, and then syncing that to S3 after notebook execution. When I experimented with the Lighthouse checks, I ended up doing the same.
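
As a rough illustration of the idea, independent of how the GitHub Actions jobs would actually be arranged, two renders can be run concurrently in plain shell and their exit codes collected afterwards. run_parallel is a hypothetical helper, not something in this repository.

```shell
# Hypothetical sketch: run two commands concurrently and fail if either fails.
# Each argument is a command string evaluated by `sh -c`.
run_parallel() {
  status=0

  sh -c "$1" &
  pid_first=$!
  sh -c "$2" &
  pid_second=$!

  # `wait <pid>` returns the exit status of that background job.
  wait "$pid_first" || status=1
  wait "$pid_second" || status=1
  return "$status"
}
```

Note that the current execution steps both copy their credentials to the same .env file, so running them side by side would need separate working directories or separate runners.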

@validbeck
Collaborator Author

> I'd noticed that the UI for adding content blocks had changed and we needed to update our animated GIF and here you are, you've already done it. Thank you!

Yes, this was taken care of in #707.

> The "Collaborate with others" section really needs to be more about governance & workflows (of which comments are a small part), but that's a future update.

I think for "fundamentals" this is sufficient. I know we are thinking of adding a governance course that covers the governance features in more depth. This course is already huge, so I vote that we build materials for that supplementary or related course instead. Either way, it's out of scope for this current initiative.

> We should perhaps consider parallelizing the notebook execution rather than inlining it as one of the "Validate docs site" steps, and then syncing that to S3 after notebook execution. When I experimented with the Lighthouse checks, I ended up doing the same.

Great idea — can you start a Story for this since it looks like you have some suggestions/proven methods that work here? 🙏🏻

@validbeck merged commit 4990cdb into main on May 21, 2025
3 checks passed
@validbeck deleted the beck/sc-8468/update-validator-fundamentals-course branch on May 21, 2025 18:33