[SECURITY] Overlong UTF-8 accepted in readStringJS — msgpackr + cbor-x #177

@Xvush

Summary

readStringJS() in both msgpackr and cbor-x accepts overlong UTF-8 sequences, which RFC 3629 §3 forbids. The raw bytes contain no dangerous characters, yet they decode to dangerous characters, so any security filter that inspects only the wire format is bypassed.

Same root cause in both packages: the 2-byte decoder does ((byte1 & 0x1f) << 6) | byte2 without checking that the result is ≥ U+0080. Same gap in the 3-byte and 4-byte paths.
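To make the arithmetic concrete, here is a minimal sketch of the unchecked 2-byte step applied to the bytes C0 AF. The helper name `decodeTwoByteUnchecked` is hypothetical; it only mirrors the expression quoted above and is not the libraries' code.

```javascript
// Hypothetical helper mirroring the unchecked 2-byte expression quoted above.
function decodeTwoByteUnchecked(byte1, byte2) {
  return ((byte1 & 0x1f) << 6) | (byte2 & 0x3f);
}

// 0xC0 & 0x1f = 0x00 and 0xAF & 0x3f = 0x2F, so the result is U+002F.
const cp = decodeTwoByteUnchecked(0xc0, 0xaf);
console.log(cp.toString(16));         // "2f"
console.log(String.fromCharCode(cp)); // "/"
console.log(cp >= 0x80);              // false: a valid 2-byte sequence must encode U+0080 or above
```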

Affected

| Package | Version tested | File | Lines | Condition |
|---|---|---|---|---|
| msgpackr | 1.11.8 | unpack.js | 621–641 | Without native addon, or string ≤ 64 bytes |
| cbor-x | 1.6.0 | decode.js | 611–631 | String ≤ 64 bytes (JS fallback path) |

With the native addon loaded and a string longer than 64 bytes, TextDecoder handles decoding (safe). Without the addon, or for short strings, readStringJS() handles decoding (vulnerable).
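For comparison, the sketch below feeds the same two bytes to the platform TextDecoder in Node. The report does not say whether the libraries construct it in fatal mode, so both modes are shown; in neither case does the output contain "/".

```javascript
const bytes = Uint8Array.of(0xc0, 0xaf);

// Default (non-fatal) mode: invalid bytes are replaced with U+FFFD.
const lenient = new TextDecoder('utf-8').decode(bytes);
console.log(lenient.includes('/'));      // false
console.log(lenient.includes('\ufffd')); // true

// Fatal mode: malformed input throws a TypeError instead of decoding.
let threw = false;
try {
  new TextDecoder('utf-8', { fatal: true }).decode(bytes);
} catch (err) {
  threw = err instanceof TypeError;
}
console.log(threw); // true
```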

PoC — msgpackr

const { unpack } = require('msgpackr');

// 0xC0 0xAF = overlong encoding of "/" (U+002F)
// Wire bytes contain no 0x2F — passes any byte-level filter
const payload = Buffer.concat([
  Buffer.from([0xa2]),         // msgpack fixstr, 2 bytes
  Buffer.from([0xc0, 0xaf])   // overlong "/"
]);

const result = unpack(payload);
console.log(result);           // "/"
console.log(result === '/');   // true

PoC — cbor-x

const { decode } = require('cbor-x');

// CBOR text string (major type 3), 2 bytes, overlong "/"
const payload = Buffer.from([0x62, 0xc0, 0xaf]);

const result = decode(payload);
console.log(result);           // "/"
console.log(result === '/');   // true

Attack vectors

| Wire bytes | Decodes to | Attack |
|---|---|---|
| C0 AF | / | Path traversal (..C0AF..C0AF etc/passwd) |
| C0 AE | . | Path traversal |
| C0 BC | < | XSS |
| C0 BE | > | XSS |
| C0 A7 | ' | SQLi |
| C0 80 | NUL | NUL injection / string truncation |
| ED A0 80 | U+D800 | Lone surrogate, not an overlong form (undefined behavior downstream) |
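The 2-byte entries in the table can be reproduced mechanically. The sketch below uses a hypothetical helper, `overlongTwoByte`, that places the 7 bits of an ASCII code point into the 2-byte UTF-8 layout (110xxxxx 10xxxxxx); it is for checking the table, not part of either library.

```javascript
// Hypothetical helper: build the (invalid) overlong 2-byte form of a code point below U+0080.
function overlongTwoByte(ch) {
  const cp = ch.codePointAt(0);
  if (cp >= 0x80) {
    throw new RangeError('only code points below U+0080 have a 2-byte overlong form');
  }
  return [0xc0 | (cp >> 6), 0x80 | (cp & 0x3f)];
}

// Format a byte array as upper-case hex pairs, e.g. "C0 AF".
const hex = (bytes) =>
  bytes.map((b) => b.toString(16).toUpperCase().padStart(2, '0')).join(' ');

for (const ch of ['/', '.', '<', '>', "'", '\0']) {
  console.log(JSON.stringify(ch), hex(overlongTwoByte(ch)));
}
```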

A WAF/validator sees no /, no <, no ' in the raw bytes. The application deserializes via msgpackr/cbor-x and gets the real characters.
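As an illustration of that gap, the sketch below models a byte-level filter with a hypothetical function, `wireLooksSafe`, that scans the raw payload for forbidden ASCII bytes. It is an assumption about how such a filter might work, not a model of any specific WAF.

```javascript
// Hypothetical byte-level filter: rejects payloads whose raw bytes contain
// any of the listed ASCII characters.
function wireLooksSafe(payload, forbidden = ['/', '<', '>', "'"]) {
  return !forbidden.some((ch) => payload.includes(ch.charCodeAt(0)));
}

const plain = Buffer.from([0xa1, 0x2f]);          // msgpack fixstr, literal "/"
const overlong = Buffer.from([0xa2, 0xc0, 0xaf]); // msgpack fixstr, overlong "/"

console.log(wireLooksSafe(plain));    // false: 0x2F is present
console.log(wireLooksSafe(overlong)); // true: no 0x2F on the wire
```

A filter of this kind only becomes reliable when it runs on the decoded strings, or when the decoder rejects malformed UTF-8 outright.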

Downstream

  • msgpackr: 3M+ weekly downloads
  • cbor-x: 425K weekly, used by @simplewebauthn/server (757K weekly) for WebAuthn/FIDO2 attestation parsing

Fix

Add minimum codepoint checks after decoding each sequence:

 } else if ((byte1 & 0xe0) === 0xc0) {
     const byte2 = src[position++] & 0x3f
-    units.push(((byte1 & 0x1f) << 6) | byte2)
+    const cp = ((byte1 & 0x1f) << 6) | byte2
+    if (cp < 0x80) throw new Error('Overlong UTF-8 sequence')
+    units.push(cp)

Same pattern for 3-byte (reject < 0x800 and surrogates 0xD800–0xDFFF) and 4-byte (reject < 0x10000 and > 0x10FFFF).

Both unpack.js in msgpackr and decode.js in cbor-x need the same fix — the readStringJS functions are nearly identical.
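Putting the checks for all three multi-byte paths together, a strict decoder could look roughly like the sketch below. This is not the libraries' code: `decodeUtf8Strict`, its parameters, and the error messages are illustrative, and the continuation-byte validation goes beyond the minimum-code-point checks proposed here.

```javascript
// Sketch of a strict UTF-8 decoder with minimum-code-point checks.
// Names and error messages are illustrative, not taken from msgpackr or cbor-x.
function decodeUtf8Strict(src, start = 0, end = src.length) {
  const units = [];
  let position = start;
  // Read one continuation byte (10xxxxxx) and return its 6 payload bits.
  // This validation is an extra assumption beyond the proposed fix.
  const cont = () => {
    if (position >= end) throw new Error('Truncated UTF-8 sequence');
    const b = src[position++];
    if ((b & 0xc0) !== 0x80) throw new Error('Invalid UTF-8 continuation byte');
    return b & 0x3f;
  };
  while (position < end) {
    const byte1 = src[position++];
    let cp;
    if ((byte1 & 0x80) === 0) {
      cp = byte1; // 1 byte: U+0000..U+007F
    } else if ((byte1 & 0xe0) === 0xc0) {
      cp = ((byte1 & 0x1f) << 6) | cont();
      if (cp < 0x80) throw new Error('Overlong UTF-8 sequence');
    } else if ((byte1 & 0xf0) === 0xe0) {
      cp = ((byte1 & 0x0f) << 12) | (cont() << 6) | cont();
      if (cp < 0x800) throw new Error('Overlong UTF-8 sequence');
      if (cp >= 0xd800 && cp <= 0xdfff) throw new Error('Surrogate in UTF-8');
    } else if ((byte1 & 0xf8) === 0xf0) {
      cp = ((byte1 & 0x07) << 18) | (cont() << 12) | (cont() << 6) | cont();
      if (cp < 0x10000) throw new Error('Overlong UTF-8 sequence');
      if (cp > 0x10ffff) throw new Error('Code point out of range');
    } else {
      throw new Error('Invalid UTF-8 lead byte');
    }
    if (cp > 0xffff) {
      // Encode supplementary code points as a UTF-16 surrogate pair.
      cp -= 0x10000;
      units.push(0xd800 | (cp >> 10), 0xdc00 | (cp & 0x3ff));
    } else {
      units.push(cp);
    }
  }
  // Spreading is fine for short strings; very long inputs would need chunking.
  return String.fromCharCode(...units);
}

console.log(decodeUtf8Strict(Buffer.from('a/é'))); // "a/é"
try {
  decodeUtf8Strict(Buffer.from([0xc0, 0xaf]));
} catch (err) {
  console.log(err.message); // "Overlong UTF-8 sequence"
}
```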

Tested on Node v20.19.5, Linux.

— Malik X (@Xvush) / 90-day coordinated disclosure
