fix: context length issues #312

Closed
johnwalz97 wants to merge 4 commits into main from
john6797/sc-8457/experian-gets-an-error-on-too-large-dataset
Conversation

@johnwalz97
Contributor

Internal Notes for Reviewers

Pulling hotfix in to main.

External Release Notes

@johnwalz97 johnwalz97 added the bug (Something isn't working) and internal (Not to be externalized in the release notes) labels on Feb 11, 2025
@johnwalz97 johnwalz97 requested a review from cachafla on February 11, 2025 at 17:43
@github-actions
Contributor

PR Summary

This pull request introduces two main enhancements to the validmind library:

  1. Summary Truncation: A new function _truncate_summary has been added to handle the truncation of summaries that exceed a specified token limit. This function uses the tiktoken library to encode the summary and ensure it does not exceed the max_tokens limit. If the summary is too long, it truncates the middle part and appends a truncation notice.

  2. Enhanced Error Handling: The error handling in the generate_description function has been improved. Specifically, when an exception occurs due to the maximum context length being exceeded, a warning is logged, and a default description is used. This change provides more informative logging and ensures that the system can gracefully handle large test results without failing.

These changes aim to improve the robustness and usability of the description generation process in the validmind library.
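The middle-out truncation described above can be sketched as follows. This is a minimal illustration, not the PR's actual `_truncate_summary`: the function name, signature, and truncation notice here are hypothetical, and the tokenizer is passed in as a pair of `encode`/`decode` callables standing in for the `tiktoken` encoding the PR reportedly uses (e.g. the result of `tiktoken.get_encoding(...)`).

```python
def truncate_summary(summary, max_tokens, encode, decode):
    """Truncate `summary` to roughly `max_tokens` tokens.

    Keeps the head and tail of the summary and replaces the middle
    with a truncation notice, so both the opening context and the
    concluding figures survive. `encode` maps text to a token list
    and `decode` maps a token list back to text.
    """
    notice = "\n...[truncated]...\n"
    tokens = encode(summary)
    if len(tokens) <= max_tokens:
        # Already within budget; return unchanged.
        return summary
    # Split the budget between the head and the tail of the summary.
    keep = max_tokens // 2
    head = decode(tokens[:keep])
    tail = decode(tokens[-keep:])
    return head + notice + tail
```

With `tiktoken`, the callables would be `enc.encode` and `enc.decode` for a chosen encoding; the example below uses whitespace tokenization purely for demonstration.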

Test Suggestions

  • Test the _truncate_summary function with summaries of varying lengths to ensure it correctly truncates when necessary.
  • Verify that the generate_description function logs appropriate warnings and uses default descriptions when the maximum context length is exceeded.
  • Check the integration of tiktoken to ensure it encodes and decodes summaries correctly.
  • Ensure that existing functionality in generate_description is not affected by the new error handling logic.

@github-actions
Contributor

PR Summary

This pull request introduces a new function _truncate_summary in the validmind/ai/test_descriptions.py file to handle cases where the summary of a test description exceeds a specified maximum token length. The function uses the tiktoken library to encode the summary and truncate it if necessary, ensuring that the summary does not exceed the max_tokens limit. If truncation occurs, a warning is logged, and the summary is truncated with a note indicating the truncation.

Additionally, the generate_description function has been updated to use _truncate_summary for the summary field, ensuring that the generated descriptions adhere to context length restrictions. The error handling in the wrapped function has also been improved to provide more specific warnings when a test result is too large to generate a description.
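The fallback behavior described above can be sketched as a small wrapper. This is an assumption-laden illustration, not the PR's code: the function name, the `default_description` parameter, and matching on the phrase "maximum context length" (a string that appears in OpenAI-style context-window errors) are all hypothetical stand-ins for whatever the real `generate_description` error handling does.

```python
import logging

logger = logging.getLogger(__name__)


def generate_description_safe(generate, test_id, default_description):
    """Call the LLM-backed `generate(test_id)`, falling back to a
    default description when the prompt exceeds the model's context
    window instead of letting the error propagate."""
    try:
        return generate(test_id)
    except Exception as e:
        if "maximum context length" in str(e):
            # Log a specific, actionable warning and degrade gracefully.
            logger.warning(
                "Test result for %s is too large to generate a description; "
                "using the default description instead.",
                test_id,
            )
            return default_description
        # Unrelated failures should still surface to the caller.
        raise
```

The key design point the PR summary describes is that only the context-length failure is swallowed; any other exception still propagates, so real bugs are not hidden behind the default description.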

Test Suggestions

  • Test the _truncate_summary function with summaries of varying lengths to ensure it correctly truncates when necessary.
  • Verify that the generate_description function correctly integrates _truncate_summary and handles summaries exceeding the maximum token length.
  • Check that appropriate warnings are logged when a summary is truncated or when a test result is too large.
  • Ensure that the default description is used when an exception occurs in generate_description.

@johnwalz97 johnwalz97 closed this Feb 20, 2025
@johnwalz97 johnwalz97 deleted the john6797/sc-8457/experian-gets-an-error-on-too-large-dataset branch February 27, 2025 19:23
