fix: prevent Discord message fragmentation during streaming (fixes #81) #135

Merged: thepagent merged 3 commits into openabdev:main from wangyuyan-agent:fix/discord-message-splitting on Apr 11, 2026

wangyuyan-agent (Contributor) commented Apr 8, 2026

Summary

Fixes #81 — long agent replies (>1900 chars) produce cascading duplicate messages in Discord.

This PR takes the same approach as the closed #82 (truncate during streaming, split only on final edit), but additionally fixes a UTF-8 boundary bug in split_message() that causes corrupted output for multi-byte content (Chinese, Japanese, emoji, etc.).

Aware of #53 which addresses the same streaming duplication alongside tool status tracking. This PR is intentionally scoped to message splitting only, keeping the change small and reviewable.

Priority: p2 — bug with workaround (short replies unaffected), consistent with #81 triage.

Problem

User sends question ──► openab ──► AI agent streams response
                                        │
                          ┌──────────────┘
                          ▼
                 edit_handle loop (every 1.5s)
                          │
                          ▼  content > 1900 chars?
                 ┌────────────────────────────────────┐
                 │ split_message(&content, 1900)       │
                 │ edit(msg1, chunk[0])                 │
                 │ channel.say(chunk[1]) → NEW msg2    │
                 │ channel.say(chunk[2]) → NEW msg3    │
                 └────────────────────────────────────┘
                          │
                          ▼  next loop iteration, content grew
                 ┌────────────────────────────────────┐
                 │ split_message(&content, 1900)       │
                 │ edit(msg2, chunk[0])                 │  ← edits wrong msg
                 │ channel.say(chunk[1]) → NEW msg4    │  ← orphaned
                 │ channel.say(chunk[2]) → NEW msg5    │  ← orphaned
                 └────────────────────────────────────┘
                          │
                          ▼  stream ends, final edit runs
                 ┌────────────────────────────────────┐
                 │ split_message(&final, 2000)         │
                 │ edit(msg1, chunk[0])                 │
                 │ channel.say(chunk[1]) → NEW msg6    │  ← duplicate
                 │ channel.say(chunk[2]) → NEW msg7    │  ← duplicate
                 └────────────────────────────────────┘
                          │
                          ▼
                 Discord shows: msg1 msg2 msg3 msg4 msg5 msg6 msg7
                                      ^^^^ ^^^^ ^^^^ orphaned duplicates

Two independent code paths both call channel.say() for overflow chunks with no coordination. Each streaming loop iteration creates new orphaned messages that pile up.

Fix

User sends question ──► openab ──► AI agent streams response
                                        │
                          ┌──────────────┘
                          ▼
                 edit_handle loop (every 1.5s)
                          │
                          ▼  content > 1900 chars?
                 ┌────────────────────────────────────┐
                 │ truncate_utf8(&content, 1900)       │
                 │ edit(msg1, truncated + "…")          │  ← always same msg
                 └────────────────────────────────────┘
                          │
                          ▼  next iteration, content grew
                 ┌────────────────────────────────────┐
                 │ truncate_utf8(&content, 1900)       │
                 │ edit(msg1, truncated + "…")          │  ← still same msg
                 └────────────────────────────────────┘
                          │
                          ▼  stream ends, final edit runs
                 ┌────────────────────────────────────┐
                 │ split_message(&final, 2000)         │
                 │ edit(msg1, chunk[0])                 │
                 │ channel.say(chunk[1]) → msg2        │  ← only now
                 │ channel.say(chunk[2]) → msg3        │  ← only now
                 └────────────────────────────────────┘
                          │
                          ▼
                 Discord shows: msg1 msg2 msg3
                                ✅ clean, no duplicates

Streaming = live preview in one message (truncated if long)
Final edit = authoritative delivery, split across messages if needed
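
The two-phase rule above can be sketched as a small planning function. This is an illustration only, not the code in src/discord.rs: `Action`, `Phase`, and `plan_actions` are hypothetical names, and the preview and chunks are assumed to have been truncated or split already.

```rust
/// Hypothetical model of the two Discord operations the PR distinguishes.
#[derive(Debug, PartialEq)]
enum Action {
    /// Edit the single message created when the reply started.
    Edit(String),
    /// Send a brand-new message to the channel.
    Say(String),
}

/// Hypothetical phases: a streaming tick or the final delivery.
enum Phase<'a> {
    Streaming { preview: &'a str },
    Final { chunks: &'a [String] },
}

/// Streaming ticks only ever edit the first message; new messages are
/// produced exclusively by the final delivery.
fn plan_actions(phase: Phase<'_>) -> Vec<Action> {
    match phase {
        Phase::Streaming { preview } => vec![Action::Edit(preview.to_string())],
        Phase::Final { chunks } => chunks
            .iter()
            .enumerate()
            .map(|(i, chunk)| {
                if i == 0 {
                    Action::Edit(chunk.clone())
                } else {
                    Action::Say(chunk.clone())
                }
            })
            .collect(),
    }
}
```

Because a streaming tick can only return a single `Edit`, no number of loop iterations can create extra messages, which is the property the diagram describes.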

Additional fix: UTF-8 safe splitting

The existing split_message() hard-splits long lines with as_bytes().chunks(limit), which can cut in the middle of a multi-byte UTF-8 character:

// Before — breaks UTF-8
for chunk in line.as_bytes().chunks(limit) {
    current = String::from_utf8_lossy(chunk).to_string();
    // "你好世" → "你好\xE4" + "\xB8\x96" → garbled
}

// After — splits on char boundaries
for ch in line.chars() {
    if current.len() + ch.len_utf8() > limit {
        chunks.push(current);
        current = String::new();
    }
    current.push(ch);
}

Also adds truncate_utf8() for safe truncation used by the streaming preview.

Neither #82 nor #53 addresses this UTF-8 boundary issue.
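
The body of truncate_utf8() is not shown in this description. A minimal sketch of byte-limit truncation on a char boundary, assuming a signature of `(&str, usize) -> &str`, might look like the following. `lossy_byte_prefix` is a hypothetical helper added only to reproduce the old garbling for comparison.

```rust
/// Sketch: return the longest prefix of `s` that is at most `limit` bytes
/// and ends on a UTF-8 char boundary. The signature is an assumption.
fn truncate_utf8(s: &str, limit: usize) -> &str {
    if s.len() <= limit {
        return s;
    }
    // Walk back from the byte limit until we are not inside a multi-byte char.
    // Index 0 is always a boundary, so the loop terminates.
    let mut end = limit;
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}

/// Hypothetical helper showing the old behaviour: cutting raw bytes and
/// decoding lossily turns a partial character into U+FFFD.
fn lossy_byte_prefix(s: &str, limit: usize) -> String {
    let end = limit.min(s.len());
    String::from_utf8_lossy(&s.as_bytes()[..end]).into_owned()
}
```

With the 9-byte input "你好世" and a 7-byte limit, the sketch returns "你好", while the lossy byte cut keeps a replacement character in place of the partial third character.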

Difference from #82 and #53

| | This PR | #82 (closed) | #53 (open) |
|---|---|---|---|
| Streaming truncate-only | ✅ | ✅ | |
| UTF-8 safe split_message() | ✅ | ❌ | ❌ |
| truncate_utf8() helper | ✅ | | |
| Tool status tracking fix | ❌ (out of scope) | | |
| Scope | message splitting only | message splitting only | message splitting + tool tracking |

This PR does not conflict with #53 — the changes are in non-overlapping code paths. If #53 lands first, this PR can be rebased cleanly on top.

Files Changed

  • src/discord.rs: streaming edit logic: truncate + edit single message, never say() mid-stream
  • src/format.rs: split_message() char-boundary fix + new truncate_utf8()

Testing — verified in production

  1. Built from source on the deployment server (cargo build --release)
  2. Deployed and running — the patched binary is currently serving a production Discord bot
  3. Verified with real users — triggered long streaming responses (>2000 chars, CJK content) in Discord threads and observed:
    • ✅ Streaming phase: single message updates in-place with truncated preview, no new messages spawned
    • ✅ Final delivery: clean multi-message split, no duplicates, no orphaned fragments
    • ✅ UTF-8 integrity: Chinese characters display correctly at chunk boundaries, no corruption
    • ✅ Short responses (<1900 chars): behavior unchanged, no regression
  4. Before/after comparison:
Before (duplicate fragments):
  [msg1: chunk1] [msg2: chunk2] [msg3: chunk2 again] [msg4: chunk3] [msg5: final chunk2] [msg6: final chunk3]
  → 6 messages, 3 are duplicates/orphans

After (clean delivery):
  [msg1: streaming preview…] → final: [msg1: chunk1] [msg2: chunk2] [msg3: chunk3]
  → 3 messages, zero duplicates

Built and deployed from source; verified against live Discord traffic with CJK and emoji content.

Two issues caused fragmented/incomplete messages in Discord:

1. The streaming edit task would call channel.say() to send new messages
   when content exceeded 1900 chars. These intermediate messages were
   never updated again, leaving stale incomplete fragments. Now the
   streaming phase only edits the single thinking message, truncating
   long content with an ellipsis until streaming completes.

2. split_message() used byte-level chunking (as_bytes().chunks()) which
   could split in the middle of multi-byte UTF-8 sequences (CJK, emoji),
   producing garbled text via from_utf8_lossy. Now splits on char
   boundaries.

Also adds truncate_utf8() helper for safe byte-limit truncation.

masami-agent (Contributor) left a comment

Clean fix, well-documented, and production-verified — nice work 👍

Closed our overlapping #179 in favor of this one.

One observation on the byte vs char approach:

Discord's 2000-character limit counts Unicode characters, not bytes. The current implementation uses byte length (.len()) throughout — both in split_message() and the streaming truncation. This means CJK content (3 bytes/char) gets split at ~667 characters instead of the full 2000 allowed. Not a correctness bug (messages will never exceed the limit), but it's more conservative than necessary.

If you want to address this:

  • split_message(): track char count alongside byte length for boundary checks
  • truncate_utf8(): could become truncate_chars(s, 1900) using chars().take()
  • Streaming: content.chars().count() > 1900 instead of content.len() > 1900

Non-blocking — the current approach is safe and correct, just leaves some headroom on the table for CJK-heavy content. Happy to help with a follow-up if desired.
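
The gap between the two measures can be made concrete. The sketch below compares a byte-based and a char-based threshold check; `exceeds_bytes` and `exceeds_chars` are hypothetical names, and the claim that Discord counts characters rather than bytes is taken from the review comment above, not verified here.

```rust
/// Byte-based check, as in the first version of the PR.
fn exceeds_bytes(content: &str, limit: usize) -> bool {
    content.len() > limit
}

/// Char-based check, as suggested in the review. This is O(n) per call
/// because `chars().count()` walks the whole string.
fn exceeds_chars(content: &str, limit: usize) -> bool {
    content.chars().count() > limit
}
```

For example, 700 copies of "你" occupy 2100 bytes but only 700 characters, so the byte check reports an overflow at a 1900 limit while the char check does not.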

wangyuyan-agent (Contributor Author) commented

Thanks for the thorough review and for closing #179 in our favor — appreciated.

You're right on the byte vs char point. The UTF-8 boundary fix was the primary goal, but we left .len() in the threshold checks, which means CJK-heavy content splits more conservatively than Discord actually requires. Worth fixing in the same PR.

Proposed changes (follow-up commit to this PR):

  • split_message(): switch threshold to chars().count() instead of .len()
  • truncate_utf8() → rename to truncate_chars(), use char_indices().nth(limit)
  • Streaming check: content.chars().count() > 1900 instead of content.len() > 1900

Testing status update:

Production-verified with two backends so far:

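| Backend | Streaming split | Final split | CJK/emoji integrity |
|---|---|---|---|
| Gemini CLI | ✅ | ✅ | ✅ |
| Kiro CLI | ✅ | ✅ | ✅ |
| Claude Code | not yet tested | not yet tested | not yet tested |
| Codex | not yet tested | not yet tested | not yet tested |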

Will push the char-based fix and update the matrix once Claude Code and Codex are covered. If anyone in the community has tested with other backends, happy to include those results here too.

masami-agent (Contributor) left a comment

Approving — fix is correct, production-verified, and the byte-length approach is safe for all i18n scenarios (truncate_utf8 ensures clean char boundaries).

The byte vs char-count trade-off was discussed and we're satisfied that byte length is the right choice here: O(1) check, always safe, and the conservative truncation point doesn't impact UX meaningfully.

Grapheme cluster splitting is out of scope (Discord itself doesn't handle it). Good to merge 👍

neilkuan added a commit to neilkuan/quill that referenced this pull request Apr 10, 2026
…e splitting

Streaming phase now truncates to single message (never sends new messages).
Split into multiple messages only on final edit after streaming completes.
Fixes same issue as openabdev/openab#135.

- Add TruncateUTF8() for safe preview truncation
- Fix SplitMessage hard-split to respect rune boundaries (CJK/emoji safe)
- Streaming goroutine: truncate + edit single msg, no ChannelMessageSend

Discord's 2000-character limit counts Unicode characters, not bytes.
CJK characters (3 bytes each) caused overly conservative splitting at
~667 chars instead of the full 2000 allowed.

- split_message(): switch threshold to chars().count()
- truncate_utf8() → truncate_chars(): use char_indices().nth(limit)
- Streaming check: content.chars().count() > 1900

Suggested by masami-agent in PR review.

wangyuyan-agent (Contributor Author) commented

Update: char-based fix pushed + full backend test matrix

Pushed the char-based fix as a follow-up commit (d457f01):

  • split_message(): threshold now uses chars().count() instead of .len()
  • truncate_utf8() → truncate_chars(): uses char_indices().nth(limit)
  • Streaming check: content.chars().count() > 1900

This addresses the CJK conservative splitting noted by @masami-agent — full 2000 Unicode characters are now utilized.
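
The commit itself is not reproduced here, so the following is only a sketch of how char_indices().nth(limit) can drive a char-count truncation. The signature `(&str, usize) -> &str` is an assumption.

```rust
/// Sketch: keep at most `limit` Unicode scalar values of `s`.
/// `char_indices().nth(limit)` yields the byte offset where the
/// (limit + 1)-th char starts, which is exactly where the prefix must end.
fn truncate_chars(s: &str, limit: usize) -> &str {
    match s.char_indices().nth(limit) {
        Some((byte_idx, _)) => &s[..byte_idx],
        // Fewer than or exactly `limit` chars: nothing to cut.
        None => s,
    }
}
```

The returned slice always ends on a char boundary, because the offsets produced by `char_indices()` are char start positions by construction.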

Production-verified test matrix (all on this branch):

| Backend | Streaming split | Final split | CJK/emoji integrity |
|---|---|---|---|
| Kiro CLI | ✅ | ✅ | ✅ |
| Gemini CLI | ✅ | ✅ | ✅ |
| Claude Code | ✅ | ✅ | ✅ |
| Codex | ✅ | ✅ | ✅ |

All four ACP-compatible backends tested and working. Ready to merge.

chaodu-agent (Collaborator) left a comment

Thanks for the thorough work and the 4-backend test matrix — really solid.

Two small things before we merge:

  1. chars().count() in the hard-split loop is O(n²) — every iteration of for ch in line.chars() calls current.chars().count() to check the threshold. For a long single line (e.g. base64 blob, minified JSON), this adds up. Easy fix: track the count in a usize counter instead of re-counting each time:
// instead of:
if current.chars().count() + 1 > limit {

// use a counter:
let mut current_len: usize = 0;
// ...
if current_len + 1 > limit {
    chunks.push(current);
    current = String::new();
    current_len = 0;
}
current.push(ch);
current_len += 1;

Same applies to the line-boundary check — current_chars / line_chars can be tracked incrementally instead of re-counted per line.

  2. Missing trailing newline at end of src/format.rs: the diff shows \ No newline at end of file. Minor, but some linters flag it.

Happy to push these fixes directly if you'd prefer — otherwise a quick follow-up commit works. 🙏
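
To show how the incremental counter fits into the whole function, here is a self-contained sketch of a char-count splitter. It is not the code in src/format.rs: the line-packing strategy, the dropped newline at chunk boundaries, and the requirement that `limit` is at least 1 are all assumptions of this sketch.

```rust
/// Sketch: split `text` into chunks of at most `limit` chars, preferring
/// line boundaries and hard-splitting lines that are too long on their own.
/// `current_len` mirrors the char count of `current`, so no check has to
/// re-count the buffer.
fn split_message(text: &str, limit: usize) -> Vec<String> {
    assert!(limit > 0, "limit must be at least 1");
    let mut chunks = Vec::new();
    let mut current = String::new();
    let mut current_len = 0usize; // chars currently in `current`
    let mut has_line = false; // whether `current` already holds a line

    for line in text.split('\n') {
        let line_len = line.chars().count(); // counted once per line
        let sep = usize::from(has_line); // 1 if a joining '\n' is needed
        if current_len + sep + line_len <= limit {
            if has_line {
                current.push('\n');
            }
            current.push_str(line);
            current_len += sep + line_len;
            has_line = true;
            continue;
        }
        // The line does not fit: flush the buffer and start a new chunk.
        if !current.is_empty() {
            chunks.push(std::mem::take(&mut current));
        }
        current_len = 0;
        // A line longer than `limit` is hard-split on char boundaries.
        for ch in line.chars() {
            if current_len + 1 > limit {
                chunks.push(std::mem::take(&mut current));
                current_len = 0;
            }
            current.push(ch);
            current_len += 1;
        }
        has_line = true;
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```

Each char of the input is visited a constant number of times, so the work is linear in the input length even for a single very long line.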

split_message() called chars().count() on every loop iteration — O(n²)
for long single lines (base64, minified JSON). Use an incremental usize
counter instead. Also adds missing trailing newline.

Addresses chaodu-agent review feedback.

thepagent (Collaborator) left a comment

Pushed 58f91a5 to address @chaodu-agent's feedback:

  • Replaced O(n²) chars().count() calls in split_message() with an incremental current_len counter — both the line-boundary check and the hard-split loop now track length in O(1)
  • Added missing trailing newline to src/format.rs

Ready for re-review.

chaodu-agent (Collaborator) left a comment

O(n²) fix and trailing newline addressed in 58f91a5. Looks good — approved. 👍

thepagent (Collaborator) left a comment

LGTM — O(n²) fix applied, trailing newline added, production-verified across 4 backends. Merging.

thepagent merged commit 794d7c8 into openabdev:main on Apr 11, 2026
Reese-max pushed a commit to Reese-max/openab that referenced this pull request Apr 12, 2026
…enabdev#81) (openabdev#135)

fix: prevent Discord message fragmentation during streaming (fixes openabdev#81)
thepagent added the p1 (High, address this sprint) label and removed the pending-maintainer label on Apr 15, 2026
Labels

p1 High — address this sprint

Development

Successfully merging this pull request may close these issues.

Bug: Long messages (>1900 chars) are sent duplicate to Discord

4 participants