[SC 11945] Demo notebook for agent testing documentation #422
| "\n", | ||
| "This notebook demonstrates how to build and validate a **banking-specific AI agent** using LangGraph integrated with ValidMind for comprehensive testing and monitoring.\n", | ||
| "\n", | ||
| "## What You'll Learn\n", |
Maybe the intro should focus slightly more on how users can document and validate agents with ValidMind?
Agree. The intro section makes users believe that the objective is to build a sample AI agent, but the real objective is to learn how to validate an AI agent.
A prerequisite is to have an AI agent to test and document, so the notebook also builds one for the user, using a very simple, hypothetical use case. Building the demo use case should not be the objective of the notebook, though.
juanmleng
left a comment
Just left a couple of minor comments. Other than that, LGTM!
PR Summary
This pull request removes multiple legacy agent demo notebooks and associated code samples that were previously used to demonstrate the integration of LangChain, LangGraph, and banking-specific agent functionality with the ValidMind Library. In particular, the PR removes entire notebooks (e.g., the LangChain and LangGraph demo notebooks, both simple and complex versions, as well as the banking agent demo) and related utility modules (such as langchain_utils.py and agents/utils.py). In addition, some test decorators and a unit metric (RootMeanSquaredError) have been slightly refactored for conciseness. Overall, the changes are aimed at cleaning up outdated or redundant examples, thereby simplifying the codebase and reducing maintenance overhead.
Test Suggestions
Pull Request Description
What and why?
This PR introduces a comprehensive Banking AI Agent demo using LangGraph and ValidMind integration, along with supporting framework improvements. The changes include:
Before: The agents directory contained multiple incomplete and redundant notebook demos with inconsistent implementations.
After: A single, comprehensive banking-specific AI agent demo with:
Key additions:
How to test
Run notebooks/agents/langgraph_agent_simple_banking_demo.ipynb
What needs special review?
Dependencies, breaking changes, and deployment notes
Release notes
Checklist