[SC 11945] Demo notebook for agent testing documentation by AnilSorathiya · Pull Request #422 · validmind/validmind-library

AnilSorathiya · 2025-09-04T11:12:19Z

Pull Request Description

What and why?

This PR introduces a comprehensive Banking AI Agent demo using LangGraph and ValidMind integration, along with supporting framework improvements. The changes include:

Before: The agents directory contained multiple incomplete and redundant notebook demos with inconsistent implementations.
After: A single, comprehensive banking-specific AI agent demo with:

Complete LangGraph-based agent implementation for banking use cases
Specialized banking tools for credit risk, fraud detection, and account management
Comprehensive test dataset with 160+ banking scenarios
Proper ValidMind integration for agent validation and testing
Cleaned up codebase with removed duplicate/incomplete demos

Key additions:

langgraph_agent_simple_banking_demo.ipynb - Complete banking agent demo notebook
banking_tools.py - Specialized banking tools (credit risk analyzer, fraud detection, etc.)
banking_test_dataset.py - Comprehensive test dataset with 160+ banking scenarios
Enhanced @scorer decorator functionality for better test integration
Updated dependencies for better compatibility (numpy, scipy, statsmodels constraints)
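
For readers who have not opened the notebook, here is a minimal sketch of what a credit-risk style tool and a few test scenarios could look like. It is not the code in banking_tools.py or banking_test_dataset.py: the names `analyze_credit_risk`, `CreditRiskResult`, `SCENARIOS`, and `run_scenarios`, and the debt-to-income thresholds, are hypothetical and chosen for illustration only. It uses plain Python with no LangGraph or ValidMind calls.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CreditRiskResult:
    """Outcome of the hypothetical credit risk check."""

    debt_to_income: float
    risk_level: str


def analyze_credit_risk(monthly_income: float, monthly_debt: float) -> CreditRiskResult:
    """Classify an applicant by debt-to-income ratio.

    The 0.2 and 0.4 thresholds are illustrative assumptions, not values
    taken from the PR.
    """
    if monthly_income <= 0:
        raise ValueError("monthly_income must be positive")
    dti = monthly_debt / monthly_income
    if dti < 0.2:
        level = "low"
    elif dti < 0.4:
        level = "medium"
    else:
        level = "high"
    return CreditRiskResult(debt_to_income=round(dti, 2), risk_level=level)


# Hypothetical scenarios: each pairs tool inputs with the expected risk level,
# in the spirit of a small agent test dataset.
SCENARIOS = [
    {"monthly_income": 5000.0, "monthly_debt": 500.0, "expected": "low"},
    {"monthly_income": 5000.0, "monthly_debt": 1500.0, "expected": "medium"},
    {"monthly_income": 5000.0, "monthly_debt": 2500.0, "expected": "high"},
]


def run_scenarios(scenarios):
    """Run the tool on every scenario and return (scenario, passed) pairs."""
    results = []
    for scenario in scenarios:
        outcome = analyze_credit_risk(
            scenario["monthly_income"], scenario["monthly_debt"]
        )
        results.append((scenario, outcome.risk_level == scenario["expected"]))
    return results
```

In an agent setup, a function like this would typically be registered as a tool the agent can call, and the scenario list would feed the validation tests; the notebook is the reference for how this PR actually wires that up.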

How to test

Run notebooks/agents/langgraph_agent_simple_banking_demo.ipynb

What needs special review?

Dependencies, breaking changes, and deployment notes

Release notes

Checklist

johnwalz97

couple of nitpicks

pyproject.toml

validmind/tests/decorator.py

juanmleng · 2025-09-25T21:27:14Z

notebooks/agents/langgraph_agent_simple_banking_demo.ipynb

+    "\n",
+    "This notebook demonstrates how to build and validate a **banking-specific AI agent** using LangGraph integrated with ValidMind for comprehensive testing and monitoring.\n",
+    "\n",
+    "## What You'll Learn\n",


Maybe the intro should focus slightly more on how users can document and validate agents with ValidMind?

Agree. The intro section makes users believe that the objective is to build a sample AI agent, but really the objective is to learn how to validate an AI agent.

A prerequisite is having an AI agent to test and document, so the notebook also builds one for the user, using a very simple, hypothetical use case. Building the demo use case should not be the objective of the notebook, though.

notebooks/agents/langgraph_agent_simple_banking_demo.ipynb

juanmleng

Just left a couple of minor comments. Other than that, lgtm!

notebooks/code_samples/agents/langgraph_agent_simple_banking_demo.ipynb

pyproject.toml

johnwalz97

lgtm

github-actions · 2025-09-26T17:52:33Z

PR Summary

This pull request removes multiple legacy agent demo notebooks and associated code samples that were previously used to demonstrate the integration of LangChain, LangGraph, and banking-specific agent functionality with the ValidMind Library. In particular, the PR removes entire notebooks (e.g., the LangChain and LangGraph demo notebooks, both simple and complex versions, as well as the banking agent demo) and related utility modules (such as langchain_utils.py and agents/utils.py). In addition, some test decorators and a unit metric (RootMeanSquaredError) have been slightly refactored for conciseness. Overall, the changes are aimed at cleaning up outdated or redundant examples, thereby simplifying the codebase and reducing maintenance overhead.

Test Suggestions

Run the full suite of unit tests to ensure that removals have not inadvertently impacted any dependent modules.
Execute integration tests on ValidMind model and dataset functionality to verify that all tests still pass after the removal of old notebooks.
Perform manual smoke testing in environments using the updated library to ensure that new documentation and code paths function correctly.

AnilSorathiya added 30 commits June 24, 2025 11:18

support agent use case

1b3f67a

wrapper function for agent

723fcab

ragas metrics

28d9fbb

update ragas metrics

ecf8e09

fix lint error

53e8879

create helper functions

1662368

Merge branch 'main' into anilsorathiya/sc-10863/add-support-for-llm-a…

cc84cbc

…gentic-model-in-vm-library

delete old notebook

6f09780

update description for each section

0bb731e

simplify agent

e758979

simple demo notebook using langchain agent

7c35cfe

Update description of the simplified langgraph agent demo notebook

9bb70e9

add brief description to tests

894d52a

add brief description to tests

d86a9af

Allow dict return type predict_fn

884000f

update notebook and refactor utils

fbd5aa9

lint fix

daceabf

Merge branch 'main' into anilsorathiya/sc-11324/extend-the-predict-fn…

5f8823a

…-in-the-init-model-to

fix the test failure

70a5636

new unit tests for multiple columns return in assign_predictions

33b06fb

update notebooks to return multiple values in predict_fn

8e12bd2

general plotting and stats tests

e38929d

clear output

e900a65

Merge branch 'main' into anilsorathiya/sc-11380/add-generlize-plots-a…

a08e881

…nd-statistical-tests

remove duplicate tests

16f4700

update notebook

bb9f9af

Integration between deepeval and validmind

5078a7a

Merge branch 'main' into anilsorathiya/sc-11452/support-for-the-deepe…

2eb6abb

…val-dataset-llmtestcase

add MetricValues class for metric return type

ad0b719

Return MetricValues in the unit tests

94ca006

AnilSorathiya added 7 commits September 23, 2025 14:08

add copyright statement

5c1f604

update pyproject for numpy dependency

6766adb

update banking tools

61d1b20

add more cases to dataset

4f44143

more tests in demo notebook

8053fa3

fix tests

f7ee917

remove old notebooks

ef1bd83

AnilSorathiya requested review from cachafla, johnwalz97 and juanmleng September 24, 2025 18:29

update notebook

c9bd908

johnwalz97 requested changes Sep 25, 2025

pyproject.toml (outdated, resolved)

validmind/tests/decorator.py (outdated, resolved)

AnilSorathiya added 2 commits September 25, 2025 16:59

move import top of the file

48d5504

update notebook

4bb7f96

juanmleng reviewed Sep 25, 2025

notebooks/agents/langgraph_agent_simple_banking_demo.ipynb (outdated, resolved)

juanmleng reviewed Sep 25, 2025

notebooks/agents/langgraph_agent_simple_banking_demo.ipynb (outdated, resolved)

juanmleng approved these changes Sep 25, 2025

cachafla reviewed Sep 26, 2025

notebooks/code_samples/agents/langgraph_agent_simple_banking_demo.ipynb (resolved)

cachafla reviewed Sep 26, 2025

pyproject.toml (resolved)

AnilSorathiya added 2 commits September 26, 2025 09:42

uncommit the package dependency change

47cf614

update introduction and move notebook

80d258a

johnwalz97 approved these changes Sep 26, 2025

AnilSorathiya added 3 commits September 26, 2025 17:14

log dataset

6d59490

uncommit poetry.lock

a82c7c2

remove output

15d57f1

AnilSorathiya marked this pull request as ready for review September 26, 2025 22:12

AnilSorathiya merged commit 745993e into main Sep 26, 2025
18 checks passed

AnilSorathiya deleted the anilsorathiya/sc-11945/demo-notebook-for-agent-testing-documentation branch September 26, 2025 22:19

AnilSorathiya merged 68 commits into main from anilsorathiya/sc-11945/demo-notebook-for-agent-testing-documentation

4 participants
