fix: prevent frontend freeze on extremely long AI outputs #451
Open
voidborne-d wants to merge 1 commit into ValueCell-ai:main from
Conversation
…ai#133) Two changes to ChatMessage:

1. Use React.useDeferredValue for the streaming text so ReactMarkdown can skip intermediate re-parses when tokens arrive faster than the browser can render. This keeps the UI thread responsive.

2. During streaming, cap the text fed to ReactMarkdown at 50k chars (showing the tail). Once streaming completes, the full text is rendered normally. This avoids the O(n²) cumulative parse cost that causes 100% CPU and memory spikes on very long outputs.
Fixes #133
Problem
When an AI model produces very long output (e.g. 100k+ chars), the frontend freezes with 100% CPU and memory spikes. This is 100% reproducible.
Root cause: During streaming, every new token triggers a full re-render of ChatMessage. ReactMarkdown re-parses the entire markdown AST synchronously on each render. For a 100k-char message receiving tokens at ~50/s, this means ~50 full 100k-char markdown parses per second: O(n²) cumulative work that blocks the main thread.

Fix
Two targeted changes to ChatMessage.tsx:

1. useDeferredValue for streaming text

React can now skip intermediate renders when tokens arrive faster than the browser can paint. The markdown only re-parses when the browser has idle time, keeping the UI responsive.
2. Tail-render cap for extremely long outputs
During streaming, if text exceeds 50k chars, only the last 50k chars are fed to ReactMarkdown. A small hint tells the user content is truncated while streaming. Once streaming completes, the full text is rendered normally.
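The first change might look roughly like this inside ChatMessage.tsx. This is a minimal sketch, not the PR's actual diff: the component shape, prop names, and the bare ReactMarkdown call are assumptions; only the use of React's useDeferredValue is taken from the PR description.

```tsx
import React from "react";
import ReactMarkdown from "react-markdown";

// Hypothetical props; the real ChatMessage component has more.
interface ChatMessageProps {
  text: string; // streamed message text, growing token by token
}

export function ChatMessage({ text }: ChatMessageProps) {
  // When `text` updates faster than React can render, React keeps showing
  // the markdown tree built from the previous deferred value and re-parses
  // only when the main thread has spare time, instead of on every token.
  const deferredText = React.useDeferredValue(text);
  return <ReactMarkdown>{deferredText}</ReactMarkdown>;
}
```

Because useDeferredValue marks the expensive subtree as interruptible rather than throttling updates on a timer, the rendered text still catches up to the latest tokens as soon as the browser is idle.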
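The second change, the tail-render cap, can be isolated into a pure helper. A sketch under assumptions: the function name, return shape, and the 50k constant as a named variable are hypothetical; the slicing behavior matches the PR description.

```typescript
// Hypothetical helper mirroring the PR's tail-render cap.
const STREAMING_RENDER_CAP = 50_000; // 50k chars, per the PR description

function capStreamingText(
  text: string,
  isStreaming: boolean
): { shown: string; truncated: boolean } {
  // While streaming, feed only the last 50k chars to ReactMarkdown,
  // so each re-parse stays bounded no matter how long the output grows.
  if (isStreaming && text.length > STREAMING_RENDER_CAP) {
    return { shown: text.slice(-STREAMING_RENDER_CAP), truncated: true };
  }
  // Once streaming completes (or for short texts), render the full text.
  return { shown: text, truncated: false };
}
```

The `truncated` flag is what would drive the small "content truncated while streaming" hint mentioned in the PR.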
Why this works

useDeferredValue is a React 18 primitive designed exactly for this case: deferring expensive re-renders of fast-changing data.

Testing