You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Eliminate Network Calls on Import with Lazy Tiktoken Loading
This PR refactors the validmind.ai.test_descriptions module to eliminate network calls during import by implementing lazy loading for tiktoken. Previously, import tiktoken at the module level would trigger network requests to download encoding data, causing delays and failures in environments without network access.
The solution implements a hybrid approach that attempts to import tiktoken once at module load within a try-catch block, caching the result in module-level flags (_TIKTOKEN_AVAILABLE and _TIKTOKEN_ENCODING). The _truncate_summary function now checks these cached flags with zero runtime overhead:
Before: Direct import causes network call
importtiktokendef_truncate_summary(summary, test_id, max_tokens=100_000):
encoding=tiktoken.encoding_for_model("gpt-4o") # Called every timesummary_tokens=encoding.encode(summary)
...
After: Cached import with character-based fallback
_TIKTOKEN_AVAILABLE=False_TIKTOKEN_ENCODING=Nonetry:
importtiktoken_TIKTOKEN_ENCODING=tiktoken.encoding_for_model("gpt-4o")
_TIKTOKEN_AVAILABLE=Trueexcept (ImportError, Exception):
pass# Fall back to character-based estimationdef_truncate_summary(summary, test_id, max_tokens=100_000):
if_TIKTOKEN_AVAILABLE:
summary_tokens=_TIKTOKEN_ENCODING.encode(summary) # Use cached encoding
...
else:
estimated_tokens=len(summary) //4# Simple fallback
...
When tiktoken is available, the implementation uses accurate token counting. When unavailable (no network, import failure), it gracefully falls back to character-based estimation (~4 characters per token). This ensures the library works reliably in all environments while maintaining accuracy when possible. Comprehensive unit tests verify both code paths execute correctly with proper assertions on mocked function calls.
How to test
What needs special review?
Dependencies, breaking changes, and deployment notes
Release notes
Checklist
What and why
Screenshots or videos (Frontend)
How to test
What needs special review
Dependencies, breaking changes, and deployment notes
This PR introduces significant enhancements to the token estimation and summary truncation logic within the project. The changes include:
Implementation of a character-based token estimation function (_estimate_tokens_simple) and a corresponding text truncation function (_truncate_text_simple) that are used as a fallback when the tiktoken library is unavailable.
Modification of the _truncate_summary function to dynamically choose between using tiktoken for accurate token counting and falling back to the character-based methods. This ensures that summary truncation works reliably in different runtime environments.
Addition of comprehensive unit tests in tests/test_test_descriptions.py that validate both the tiktoken and fallback code paths. These tests cover scenarios such as:
Token estimation for texts of varying lengths.
Proper truncation behavior for both short and excessively long texts.
Correct selection of the code path based on the availability of the tiktoken module using patching techniques.
Minor version updates in configuration files to reflect the new release version.
Overall, these changes enhance the robustness of the module by ensuring that summary truncation is both accurate (using tiktoken when possible) and resilient (with a reliable fallback).
Test Suggestions
Test with multi-byte or Unicode characters to ensure the character-based estimation remains consistent.
Add edge case tests where the summary length is just around the max_tokens threshold to verify boundaries.
Include tests that simulate failures in tiktoken functions (e.g., encoding/decoding errors) to further validate fallback behavior.
Run performance benchmarks for long text inputs to ensure the fallback method scales well.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
internalNot to be externalized in the release notes
2 participants
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Pull Request Description
What and why?
Eliminate Network Calls on Import with Lazy Tiktoken Loading
This PR refactors the
validmind.ai.test_descriptionsmodule to eliminate network calls during import by implementing lazy loading fortiktoken. Previously,import tiktokenat the module level would trigger network requests to download encoding data, causing delays and failures in environments without network access.The solution implements a hybrid approach that attempts to import
tiktokenonce at module load within a try-catch block, caching the result in module-level flags (_TIKTOKEN_AVAILABLEand_TIKTOKEN_ENCODING). The_truncate_summaryfunction now checks these cached flags with zero runtime overhead:Before: Direct import causes network call
After: Cached import with character-based fallback
When
tiktokenis available, the implementation uses accurate token counting. When unavailable (no network, import failure), it gracefully falls back to character-based estimation (~4 characters per token). This ensures the library works reliably in all environments while maintaining accuracy when possible. Comprehensive unit tests verify both code paths execute correctly with proper assertions on mocked function calls.How to test
What needs special review?
Dependencies, breaking changes, and deployment notes
Release notes
Checklist