Fix langgraph version in the agentic demo notebook #471

AnilSorathiya · 2026-01-27T16:15:51Z

Pull Request Description

What and why?

The library has no hard dependency on langgraph. Only this notebook uses it, to build an agent.

Add langgraph with a pinned version to the notebook's install command.
Change the system prompt to generate output of 500 words. This helps the faithfulness test break the output into sentences and analyse them.
Remove the toxicity test from the notebook.
Slim the test dataset to reduce the execution time of the tests.
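
For reviewers who want to confirm that the notebook environment actually picked up the pinned dependency, a minimal sketch follows. It uses only the standard library. The helper name `check_pin` and the version string in the usage comment are placeholders for illustration; the PR description does not state the pinned version.

```python
from importlib.metadata import PackageNotFoundError, version


def check_pin(package: str, expected: str) -> str:
    """Compare the installed version of `package` against an expected pin.

    Returns "ok", "mismatch", or "missing". This is a hypothetical helper,
    not part of the notebook or of the validmind library.
    """
    try:
        installed = version(package)
    except PackageNotFoundError:
        # The package is not installed in the current environment.
        return "missing"
    return "ok" if installed == expected else "mismatch"


# Example call; replace "0.0.0" with the version pinned in the notebook.
# print(check_pin("langgraph", "0.0.0"))
```

Running the commented call right after the install cell would show whether the kernel needs a restart to see the pinned version.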

How to test

Run the notebook

What needs special review?

Dependencies, breaking changes, and deployment notes

Release notes

Checklist

cachafla · 2026-01-28T01:11:55Z

Merging this PR should fix the dependency failures: #470

validbeck · 2026-01-28T01:59:29Z

@AnilSorathiya Before you merge, I think the "AI Agent Evaluation Metrics > Execution Layer" is missing the scorer/score assignment for StepEfficiencyMetric — can you please add an example?

Edit

I see we're actually missing that test entirely. I've added it (and a section for the metric score in the working new version of the notebook) in my WIP branch here:

document_agentic_ai.ipynb (WIP)
scorers/llm/deepeval/StepEfficiency.py
scorers/llm/deepeval/__init__.py:

validmind-library/validmind/scorers/llm/deepeval/__init__.py

Line 16 in 4f2fb51

from .StepEfficiency import StepEfficiency

When I have the PR up for review someone will need to confirm the test does what we want it to do as I'm not the expert there.

github-actions · 2026-01-28T11:47:32Z

PR Summary

This PR introduces several changes aimed at refining the banking-related notebooks and test datasets, along with minor code style improvements in the test loading module. Notable changes include:

In the banking test dataset file, one of the test cases for credit risk and additional tests for fraud detection were removed. This streamlines the dataset to focus on core account management and credit risk scenarios.
Within the banking demo notebook, the installation command now includes a fixed version for the 'langgraph' dependency, ensuring compatibility and predictability. Instructions within a code cell have been expanded to request that responses be detailed, user-friendly, and include a concise execution plan. This change is intended to guide users in producing more comprehensive output from the banking assistance agent.
The data handling in the notebook has been updated by replacing a sample filter with using the complete dataset, which may provide more consistent tests and performance evaluation.
Formatting improvements across the notebooks involve adjustments in the display of data frames (e.g., using vm_test_dataset._df instead of vm_test_dataset._df.head()) to better showcase the underlying data.
In the tests loading file, the function signature in _get_test_function_from_provider was reformatted and the error message was consolidated into one line for improved readability.

The overall functional changes focus on refining the test cases, clarifying output expectations of the LLM-driven banking responses, and ensuring that dependency versions are strictly managed for predictable behavior during agent execution.

Test Suggestions

Run the notebook cells to confirm that all cells execute successfully, particularly after the dependency version change and dataset modifications.
Verify that the removal of extra test cases from the banking dataset does not affect other modules dependent on the dataset.
Write unit tests for the _get_test_function_from_provider function to ensure that the correct exception is raised when a test provider is not found.
Test the LLM output to ensure that the newly added instructions (detailed explanation, 500+ words, concise execution plan) are adhered to in the generated responses.
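
The last suggestion can be approximated with a small length check, sketched below under stated assumptions. The function names are hypothetical, the sentence splitter is a naive regex rather than whatever the faithfulness test uses internally, and the 500-word threshold comes from the PR description.

```python
import re


def word_count(text: str) -> int:
    """Count whitespace-separated tokens in the response."""
    return len(text.split())


def split_sentences(text: str) -> list[str]:
    """Naively split on whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def meets_min_words(text: str, min_words: int = 500) -> bool:
    """Return True when the response has at least `min_words` words."""
    return word_count(text) >= min_words
```

A check like `meets_min_words(response)` could be asserted on a handful of agent outputs after running the notebook, with `split_sentences` giving a rough idea of how many statements the response contains.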

AnilSorathiya · 2026-01-28T11:49:21Z

@validbeck The StepEfficiency metric has been removed from our codebase because of an underlying bug in deepeval.
The latest version of vm-lib doesn't have the StepEfficiency test file. I have cleaned up the descriptions in the notebook that I had missed.
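
Since a scorer can be present in one release of the library and absent in the next, a notebook could guard the import instead of failing at the top of a cell. This is only a sketch: `optional_attr` is a hypothetical helper, and the module path in the usage comment is the one quoted earlier in this thread.

```python
from importlib import import_module
from typing import Any, Optional


def optional_attr(module_name: str, attr: str) -> Optional[Any]:
    """Return `attr` from `module_name`, or None if either is unavailable."""
    try:
        module = import_module(module_name)
    except ImportError:
        # The module (or one of its parent packages) is not installed.
        return None
    # The module exists but may no longer export the requested name.
    return getattr(module, attr, None)


# Usage sketch; the result is None when the scorer is not available.
# StepEfficiency = optional_attr("validmind.scorers.llm.deepeval", "StepEfficiency")
```

A cell can then skip the related section when the returned value is None.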

add langgraph installation in the notebook

4515d5d

AnilSorathiya added the labels bug (Something isn't working) and internal (Not to be externalized in the release notes) Jan 27, 2026

remove toxicity test from notebook

e56504f

AnilSorathiya requested a review from juanmleng January 27, 2026 17:01

juanmleng approved these changes Jan 27, 2026

cachafla approved these changes Jan 28, 2026

remove stepefficiency description from notebook

c6d19bc

AnilSorathiya merged commit 031c591 into main Jan 28, 2026
17 checks passed

AnilSorathiya deleted the anilsorathiya/sc-14175/fix-langgraph-version-in-agentic-demo-notebook branch January 28, 2026 12:23

validbeck mentioned this pull request Jan 28, 2026

notebook: Document an agentic AI system #472

Merged

11 tasks
