…y caching Regex instance
nkolev92
approved these changes
Mar 26, 2026
martinrrm
approved these changes
Mar 26, 2026
🤖 AI-Generated Pull Request 🤖
This pull request was generated by the VS Perf Rel AI Agent. Please review this AI-generated PR with extra care! For more information, visit our wiki. Please share feedback with TIP Insights
Issue:
LicenseExpressionTokenizer.HasValidCharacters() constructs a new Regex(...) on every invocation using a hardcoded constant pattern string. Each Regex constructor call triggers internal parsing that allocates RegexParser, RegexTree, RegexNode, RegexCharClass, RegexCode, StringBuilder, and other objects (~12–15 short-lived objects per call).

This method is called once per license expression parse via the call chain:

SearchObject.ProcessSearchResultsAsync → PackageSearchMetadataCacheItem → PackageSearchMetadataContextInfo.Create → PackageSearchMetadata.get_LicenseMetadata → NuGetLicenseExpressionParser.Parse → GetTokens → LicenseExpressionTokenizer.HasValidCharacters

For a typical NuGet package search, this produces hundreds of Regex constructions, each allocating ~12–14 short-lived internal objects, contributing to measurable GC pressure on a path that the allocation telemetry specifically flagged.
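The allocation difference described above can be observed with a rough measurement like the one below. This is only a sketch under stated assumptions: the pattern string, the class name RegexAllocationSketch, the helper MeasureAllocatedBytes, and the iteration count of 500 are all hypothetical stand-ins, not code from this PR, and GC.GetAllocatedBytesForCurrentThread is assumed to be available (.NET Core 3.0+ or .NET Framework 4.8).

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexAllocationSketch
{
    // Hypothetical pattern for illustration only; the real constant lives in
    // LicenseExpressionTokenizer and is not reproduced here.
    public const string HypotheticalPattern = @"^[a-zA-Z0-9.\-\s+()]+$";

    // One shared instance, parsed once when the type is initialized.
    private static readonly Regex Cached = new Regex(HypotheticalPattern);

    // Returns the bytes allocated on the current thread while running
    // the action the given number of times (after one warm-up call).
    public static long MeasureAllocatedBytes(Action action, int iterations)
    {
        action(); // warm up so one-time initialization is not counted
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static void Main()
    {
        const string input = "MIT OR Apache-2.0";
        const int iterations = 500; // assumed stand-in for "hundreds" of parses

        // Per-call construction: the pattern is re-parsed on every iteration.
        long perCall = MeasureAllocatedBytes(
            () => new Regex(HypotheticalPattern).IsMatch(input), iterations);

        // Cached instance: only the match itself runs on each iteration.
        long cached = MeasureAllocatedBytes(
            () => Cached.IsMatch(input), iterations);

        Console.WriteLine($"per-call construction: {perCall} bytes");
        Console.WriteLine($"cached instance:       {cached} bytes");
    }
}
```

The absolute numbers depend on the runtime and are not claimed here; the point is only that the per-call figure should be noticeably larger than the cached one.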
This matches the allocation stack trace showing StringBuilder allocated inside Regex..ctor called by HasValidCharacters during NuGet package search.

Issue type: Reduce repeated identical allocations from per-call Regex construction on a hot path

Proposed fix: Promote the per-call Regex local variable to a private static readonly Regex field with RegexOptions.CultureInvariant. The pattern and matching behavior are identical; Regex.IsMatch is documented thread-safe on constructed instances. RegexOptions.Compiled is intentionally omitted: the pattern is a trivial single character-class match where interpreted mode is sub-microsecond, and Compiled would add a non-collectible DynamicMethod on .NET Framework 4.7.2 with no measurable benefit for this pattern complexity and call frequency.

This follows the existing convention in the codebase: PackageIdValidator.IdRegex uses the identical private static readonly Regex pattern for package ID validation.

Best practices wiki
See related failure in PRISM
ADO work item
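For reviewers who want the shape of the change at a glance, the proposed fix can be sketched roughly as follows. This is a minimal illustration, not the actual diff: the class name LicenseCharacterCheckSketch, both method names, and the pattern string are hypothetical, and the real pattern in LicenseExpressionTokenizer may differ.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LicenseCharacterCheckSketch
{
    // Hypothetical pattern for illustration only; not copied from the NuGet source.
    private const string AllowedCharactersPattern = @"^[a-zA-Z0-9.\-\s+()]+$";

    // Before: a new Regex is constructed (and the pattern re-parsed) on every call.
    public static bool HasValidCharactersPerCall(string value)
    {
        var regex = new Regex(AllowedCharactersPattern);
        return regex.IsMatch(value);
    }

    // After: one shared instance, constructed once when the type is initialized.
    // RegexOptions.Compiled is deliberately not used, mirroring the reasoning above.
    private static readonly Regex AllowedCharacters =
        new Regex(AllowedCharactersPattern, RegexOptions.CultureInvariant);

    public static bool HasValidCharactersCached(string value)
    {
        return AllowedCharacters.IsMatch(value);
    }

    public static void Main()
    {
        foreach (var expression in new[] { "MIT OR Apache-2.0", "MIT; GPL" })
        {
            // Both variants should agree on every input for this pattern.
            Console.WriteLine(
                $"{expression}: per-call={HasValidCharactersPerCall(expression)}, " +
                $"cached={HasValidCharactersCached(expression)}");
        }
    }
}
```

Because instance methods such as IsMatch on a constructed Regex are safe to call from multiple threads, the shared field needs no locking.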