fix(inbound): preserve quoted context for voice messages with ref_msg (#48) by draix · Pull Request #63 · Tencent/openclaw-weixin

draix · 2026-04-14T04:28:17Z

Summary

When a user sends a voice message that quotes/replies to a previous text message, the agent received the transcribed text but lost the quoted context. Text replies with ref_msg already prepended [引用: ...] correctly — voice replies did not, leaving the agent unable to understand what the user was responding to.

Root cause

bodyFromItemList() in src/messaging/inbound.ts handled TEXT + ref_msg with full quoted-context logic, but the VOICE branch returned voice_item.text unconditionally without checking ref_msg:

// Before: ref_msg silently ignored for voice
if (item.type === MessageItemType.VOICE && item.voice_item?.text) {
  return item.voice_item.text;  // ← ref_msg dropped entirely
}

Fix

Apply the same quoted-context logic to VOICE as TEXT. The fix is structurally identical to the TEXT branch above it — same null checks, same isMediaItem guard, same parts accumulation, same output format.

| Scenario | Before | After |
| --- | --- | --- |
| Voice replies to text (title) | `yes please` | `[引用: Can you schedule a meeting?]\nyes please` |
| Voice replies to text (content) | `agreed` | `[引用: Let's meet at 3pm]\nagreed` |
| Voice replies to image/video/file/voice | `nice photo` | `nice photo` (unchanged; media can't be quoted as text) |
| Standalone voice (no `ref_msg`) | `schedule meeting` | `schedule meeting` (unchanged) |
| Untranscribed voice | `''` | `''` (unchanged) |
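
For readers without the diff at hand, the VOICE branch after the fix might look roughly like the sketch below. This is not the code from the PR. `MessageItemType`, `voice_item`, `ref_msg`, `message_item`, `title`, and `isMediaItem` are names the description mentions; the type shapes, the string values of `MessageItemType`, the `text_item` field, and the standalone function name `voiceBody` are assumptions made so the example is self-contained.

```typescript
// Hypothetical stand-ins for the repo's types. The real definitions may
// differ; anything not quoted in the PR description is an assumption.
const MessageItemType = {
  TEXT: "TEXT",
  IMAGE: "IMAGE",
  VIDEO: "VIDEO",
  FILE: "FILE",
  VOICE: "VOICE",
} as const;
type ItemType = (typeof MessageItemType)[keyof typeof MessageItemType];

interface RefMessage {
  title?: string;
  message_item?: MessageItem;
}

interface MessageItem {
  type: ItemType;
  text_item?: { text?: string }; // assumed shape for TEXT items
  voice_item?: { text?: string }; // transcription, as named in the PR
  ref_msg?: RefMessage;
}

// Media items cannot be rendered as quoted text.
function isMediaItem(item: MessageItem): boolean {
  return (
    item.type === MessageItemType.IMAGE ||
    item.type === MessageItemType.VIDEO ||
    item.type === MessageItemType.FILE ||
    item.type === MessageItemType.VOICE
  );
}

// Sketch of the VOICE branch only: returns the transcription, prefixed with
// "[引用: ...]" when the quoted message can be expressed as text.
function voiceBody(item: MessageItem): string {
  const text = item.voice_item?.text;
  // Untranscribed voice: nothing to prepend the quote to.
  if (item.type !== MessageItemType.VOICE || !text) return "";

  const ref = item.ref_msg;
  if (!ref) return text; // standalone voice
  if (ref.message_item && isMediaItem(ref.message_item)) return text;

  // Accumulate whichever quotable parts are present.
  const parts: string[] = [];
  if (ref.title) parts.push(ref.title);
  const quoted = ref.message_item?.text_item?.text;
  if (quoted) parts.push(quoted);

  return parts.length > 0 ? `[引用: ${parts.join(" | ")}]\n${text}` : text;
}
```

Under these assumptions, a voice item transcribed as `yes please` whose `ref_msg.title` is `Can you schedule a meeting?` yields the "After" value in the first row of the table, and an empty `ref_msg` falls through to the plain transcription.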

Testing

11 new tests in describe("voice messages with quoted context (#48)"):

Quoted context included:

title-only ref_msg → [引用: title]\n<voice text>
message_item-only ref_msg → [引用: content]\n<voice text>
title + message_item → [引用: title | content]\n<voice text>

Quoted context omitted (media replies — can't express as text):

ref_msg is IMAGE → voice text only
ref_msg is VIDEO → voice text only
ref_msg is FILE → voice text only
ref_msg is VOICE → voice text only

No-change regression guards:

empty ref_msg → voice text only
standalone voice (no ref_msg) → voice text only
untranscribed voice, no ref_msg → empty body
untranscribed voice, with ref_msg → empty body
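
The eleven cases above lend themselves to a table-driven check. The sketch below is not the test file from the PR and uses no test framework: it encodes the listed expectations as data and reports which cases a given body function fails. The `Item` shape, the `text_item` field, the fixture strings, and the helper names `voice`, `ofType`, and `failingCases` are all hypothetical.

```typescript
// Hypothetical minimal item shape; the repo's real types may differ.
type Kind = "TEXT" | "IMAGE" | "VIDEO" | "FILE" | "VOICE";
interface Item {
  type: Kind;
  text_item?: { text?: string };
  voice_item?: { text?: string };
  ref_msg?: { title?: string; message_item?: Item };
}
interface VoiceCase {
  name: string;
  item: Item;
  expected: string;
}

// Fixture builders (illustrative names).
const voice = (text: string | undefined, ref_msg?: Item["ref_msg"]): Item => ({
  type: "VOICE",
  voice_item: { text },
  ref_msg,
});
const ofType = (type: Kind): Item =>
  type === "TEXT" ? { type, text_item: { text: "content" } } : { type };

const cases: VoiceCase[] = [
  // Quoted context included
  { name: "title-only", item: voice("hi", { title: "title" }), expected: "[引用: title]\nhi" },
  { name: "message_item-only", item: voice("hi", { message_item: ofType("TEXT") }), expected: "[引用: content]\nhi" },
  { name: "title + message_item", item: voice("hi", { title: "title", message_item: ofType("TEXT") }), expected: "[引用: title | content]\nhi" },
  // Quoted context omitted (media replies)
  { name: "ref is IMAGE", item: voice("hi", { message_item: ofType("IMAGE") }), expected: "hi" },
  { name: "ref is VIDEO", item: voice("hi", { message_item: ofType("VIDEO") }), expected: "hi" },
  { name: "ref is FILE", item: voice("hi", { message_item: ofType("FILE") }), expected: "hi" },
  { name: "ref is VOICE", item: voice("hi", { message_item: ofType("VOICE") }), expected: "hi" },
  // No-change regression guards
  { name: "empty ref_msg", item: voice("hi", {}), expected: "hi" },
  { name: "standalone voice", item: voice("hi"), expected: "hi" },
  { name: "untranscribed, no ref", item: voice(undefined), expected: "" },
  { name: "untranscribed, with ref", item: voice(undefined, { title: "title" }), expected: "" },
];

// Returns the names of the cases that the given body function gets wrong.
function failingCases(body: (item: Item) => string): string[] {
  return cases.filter((c) => body(c.item) !== c.expected).map((c) => c.name);
}
```

Passing the pre-fix behavior, `(item) => item.voice_item?.text ?? ""`, to `failingCases` flags exactly the three quoted-context cases, which matches the root cause described earlier: every other case already behaved correctly before the change.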

tsc --noEmit passes. All 36 inbound tests pass.

Note on src/auth/pairing.test.ts: the one failing test in the suite (uses withFileLock for concurrency safety) reproduces on main without this change — it is pre-existing and unrelated to this PR.

Fixes #48

…Tencent#48)

When a user sends a voice message that quotes/replies to a previous text message, the transcribed text was returned without the quoted context. Text replies with ref_msg already prepended '[引用: ...]'; voice replies did not, making the agent unaware of what the user was responding to.

Root cause
----------
bodyFromItemList() handled TEXT + ref_msg with full quoted-context logic but the VOICE branch returned voice_item.text unconditionally, ignoring ref_msg entirely.

Fix
---
Apply the same quoted-context logic to VOICE as TEXT:
- If ref_msg is absent: return transcribed text as-is (unchanged).
- If ref_msg.message_item is a media type (IMAGE/VIDEO/FILE/VOICE): return transcribed text only, since media cannot be quoted as text.
- If ref_msg has title and/or a TEXT message_item: prepend '[引用: <title> | <text>]' before the transcribed voice text.
- If voice has no transcription (voice_item.text absent): return '' regardless of ref_msg (nothing to prepend the quote to).

The fix is structurally identical to the TEXT branch above it: same null checks, same isMediaItem guard, same parts accumulation, same output format.

Testing
-------
11 new tests added to 'voice messages with quoted context (Tencent#48)':

Quoted-context included:
- title-only ref_msg → '[引用: title]\n<voice text>'
- message_item-only ref_msg → '[引用: content]\n<voice text>'
- title + message_item → '[引用: title | content]\n<voice text>'

Quoted-context omitted (media replies):
- ref_msg is IMAGE → voice text only
- ref_msg is VIDEO → voice text only
- ref_msg is FILE → voice text only
- ref_msg is VOICE → voice text only

No-change cases (regression guards):
- empty ref_msg → voice text only
- standalone voice → voice text only
- untranscribed, no ref → empty body
- untranscribed, with ref → empty body

TypeScript: tsc --noEmit passes. All 36 inbound tests pass.

The pre-existing failure in src/auth/pairing.test.ts is unrelated: it reproduces on main without this change (confirmed).

draix force-pushed the fix/48-voice-quoted-context branch from d03b9f1 to 0bb27ec on April 14, 2026 04:35

draix wants to merge 1 commit into Tencent:main from draix:fix/48-voice-quoted-context
