Fix x86_64 SIGILL crash: correct TURN patch offsets and use full NOP sled #636

Merged
frenck merged 2 commits into hassio-addons:main from zglate:pr/fix-635-x86_64-turn-crash
Apr 14, 2026

@zglate zglate commented Apr 13, 2026

Fixes #635.

Root cause

The x86_64 patch offsets introduced in #631 (0x114D1F, 0x11583D) were 102 bytes off. They were derived from a corrupted extraction of libubnt_webrtc_jni.so that had md5sum output accidentally prepended to the ELF header. The 102-byte text prefix shifted every derived offset by the same amount.

The corruption was caught and the committed binary was re-extracted cleanly, but the shifted offsets were retained. The md5 check was added after that point, so it couldn't catch that the offsets were now wrong. aarch64 was unaffected because its extraction was clean.

At the wrong x86_64 offsets, the single 0x90 NOP byte was written into the middle of unrelated instructions (lock xadd, cmp, etc.), producing a malformed instruction stream. The crash PC 0x115841 reported in #635 is exactly 4 bytes past the broken write at 0x11583D, inside the resulting garbage.
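
The arithmetic behind the root cause can be checked with a short sketch. The prefix length, the offsets, and the crash PC are taken from the description above; the synthetic buffers and variable names are illustrative assumptions, not code from the PR.

```python
# Illustrative check of the offset shift described above.
PREFIX_LEN = 102  # bytes of md5sum text prepended to the ELF image

shifted_offsets = [0x114D1F, 0x11583D]  # offsets committed in #631
correct_offsets = [off - PREFIX_LEN for off in shifted_offsets]

for bad, good in zip(shifted_offsets, correct_offsets):
    print(f"{bad:#x} - {PREFIX_LEN} = {good:#x}")

# Distance from the second (wrong) write site to the crash PC from #635.
crash_pc = 0x115841
crash_delta = crash_pc - shifted_offsets[1]
print(f"crash PC is {crash_delta} bytes past {shifted_offsets[1]:#x}")

# Synthetic demonstration: prepending a text prefix to a binary shifts
# every pattern offset found in it by exactly the prefix length.
pattern = bytes.fromhex("ba1a000000")        # mov edx, 0x1a
clean = bytes(64) + pattern + bytes(16)      # stand-in for the clean library
corrupted = b"x" * PREFIX_LEN + clean        # stand-in for the bad extraction
shift = corrupted.find(pattern) - clean.find(pattern)
print(f"pattern offset shifted by {shift} bytes")
```

Subtracting the prefix length from each committed offset should yield the corrected offsets listed in the next section.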

What this PR changes

Correct x86_64 offsets. The actual mov edx, 0x1a pattern in the pristine library is at 0x114CB9 and 0x1157D7:

>>> data[0x114CB9:0x114CBE].hex()  # pristine library
'ba1a000000'  # mov edx, 0x1a
>>> data[0x1157D7:0x1157DC].hex()
'ba1a000000'  # mov edx, 0x1a

5-byte NOP sled instead of 1-byte. mov edx, 0x1a on x86_64 is a 5-byte instruction (ba 1a 00 00 00). A 1-byte NOP leaves the immediate bytes 1a 00 00 00 in place, which the CPU decodes as garbage instructions (sbb al, [rax] + add [rax], al) and faults with SIGILL when rax doesn't point to writable memory. The 5-byte NOP cleanly replaces the whole instruction. The subsequent mov rsi, rbx and call AppendFieldEmpty still execute, with edx retaining its prior value, matching the behavior on aarch64 (which NOPs the 4-byte mov w2, #0x1a but leaves the bl call intact).
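
As a minimal sketch of the difference between the two patch widths, the snippet below applies a 1-byte and a 5-byte NOP write to a synthetic buffer. The `patch` helper and the trailing `c3` marker byte are assumptions made for illustration; only the instruction bytes come from the description above.

```python
MOV_EDX_0X1A = bytes.fromhex("ba1a000000")  # mov edx, 0x1a (5 bytes)
NOP = 0x90                                  # single-byte x86 NOP


def patch(code: bytes, offset: int, count: int) -> bytes:
    """Return a copy of `code` with `count` NOP bytes written at `offset`."""
    out = bytearray(code)
    out[offset:offset + count] = bytes([NOP]) * count
    return bytes(out)


# Synthetic buffer: the target instruction followed by one marker byte.
code = MOV_EDX_0X1A + b"\xc3"

one_byte = patch(code, 0, 1)   # only the opcode byte is replaced
five_byte = patch(code, 0, 5)  # the whole instruction is replaced

print(one_byte.hex())   # immediate bytes 1a 00 00 00 are still present
print(five_byte.hex())  # no bytes of the original instruction remain
```

With the 1-byte write, the leftover immediate bytes are what the decoder goes on to interpret as instructions; the 5-byte write leaves nothing behind.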

Pre-patch and post-patch byte verification. The build now explicitly checks that each target offset contains the expected opcode before applying the patch, and that the NOPs are actually written after. This would have caught the original offset bug at build time.
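
The check-then-patch-then-check flow can be sketched as follows. This is a hypothetical Python analogue, not the Dockerfile's shell implementation; `verify_and_patch` and the small offsets in the demo are invented for illustration.

```python
def verify_and_patch(data: bytearray, offset: int,
                     expected: bytes, replacement: bytes) -> None:
    """Patch `data` in place, but only if `expected` is found at `offset`."""
    if len(expected) != len(replacement):
        raise ValueError("expected and replacement must be the same length")

    # Pre-patch check: refuse to write unless the target bytes match.
    found = bytes(data[offset:offset + len(expected)])
    if found != expected:
        raise ValueError(
            f"pre-patch check failed at {offset:#x}: "
            f"expected {expected.hex()}, found {found.hex()}"
        )

    data[offset:offset + len(replacement)] = replacement

    # Post-patch check: confirm the replacement bytes actually landed.
    written = bytes(data[offset:offset + len(replacement)])
    if written != replacement:
        raise ValueError(f"post-patch check failed at {offset:#x}")


MOV = bytes.fromhex("ba1a000000")   # mov edx, 0x1a
NOPS = bytes.fromhex("9090909090")  # 5-byte NOP sled

# Synthetic stand-in for the library, with the instruction at offset 8.
library = bytearray(32)
library[8:13] = MOV

verify_and_patch(library, 8, MOV, NOPS)  # correct offset: patch applies

# A wrong offset is rejected before any bytes are written.
untouched = bytearray(32)
untouched[8:13] = MOV
try:
    verify_and_patch(untouched, 10, MOV, NOPS)
    rejected = False
except ValueError as err:
    rejected = True
    print(err)
```

A pre-patch check of this shape is what would have turned the shifted offsets into a build failure rather than a runtime crash.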

Testing

I was unable to runtime-reproduce the #635 crash locally (requires a populated HAOS install with remote access enabled). I did verify:

  • Pristine library md5 matches the expected value in the Dockerfile
  • The correct offsets contain ba 1a 00 00 00 (the mov edx, 0x1a opcode)
  • The previous offsets point at unrelated bytes that would produce the observed crash when 1-byte NOP'd
  • The build succeeds with pre/post-patch byte verification

It would help to have @charlesomer (the #635 reporter) confirm the fix on their amd64 HAOS install before merging.

Apologies

The original extraction bug and shifted offsets were my mistake in #631. Sorry for the followup.

Summary by CodeRabbit

  • Chores
    • Enhanced Docker build verification to ensure binary modifications are correctly applied across supported architectures, improving build reliability and process integrity.

…sled

Fixes hassio-addons#635.

The x86_64 offsets introduced in hassio-addons#631 (0x114D1F, 0x11583D) were 102
bytes off. They were derived from a binary that had md5sum output
accidentally prepended during extraction over SSH (cat mixed text and
binary output); the 102-byte prefix shifted all derived offsets by the
same amount. The corrupted binary was later replaced with a clean
extract, but the shifted offsets were retained. The md5 check was added
after the fix, so it could not catch that the offsets were still wrong.

At the incorrect offsets, the single-byte 0x90 NOPs were written into
the middle of unrelated instructions (lock xadd, cmp, etc.), producing
a malformed instruction stream that crashes with SIGILL when executed.
The crash PC 0x115841 reported in hassio-addons#635 is 4 bytes past the
broken write at 0x11583D, inside the resulting garbage.

aarch64 was not affected: its extraction was clean, offsets are correct,
and a 4-byte NOP cleanly replaces the full 'mov w2, #0x1a' instruction.

Also:

- Use a 5-byte NOP sled on x86_64 to cover the full 'mov edx, 0x1a'
  instruction. A 1-byte NOP on x86_64 leaves residual immediate bytes
  (`1a 00 00 00`) that the CPU decodes as garbage instructions
  (sbb/add on [rax]), causing SIGILL. This matches the aarch64 pattern
  of NOPing the full instruction.
- Add pre-patch and post-patch byte verification so this class of
  mistake fails the build loudly in the future.

coderabbitai bot commented Apr 13, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 795bd457-dca1-4d38-b8b6-31192c74cae2

📥 Commits

Reviewing files that changed from the base of the PR and between e85f9fb and 478055b.

📒 Files selected for processing (1)
  • unifi/Dockerfile

Walkthrough

Added pre- and post-patch byte verification and adjusted patch offsets/bytes in the Dockerfile that modifies libubnt_webrtc_jni.so, including architecture-specific checks for AArch64 and x86_64 and updated x86_64 NOP-sled writes.

Changes

Cohort / File(s) | Summary
Binary patching & verification (unifi/Dockerfile) | Introduced verify_bytes helper and pre-patch checks for AArch64 (42038052) and x86_64 (ba1a000000). Updated x86_64 patching to write 5-byte NOP sleds (9090909090) at new offsets (replacing prior single-byte writes). Added post-patch verifications (AArch64: 1f2003d5; x86_64: 9090909090).
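
The AArch64 byte strings in the summary can be cross-checked against the A64 encodings. The sketch below assumes the bytes are stored little-endian, as they appear in the file, and uses a hypothetical `decode_movz_w` helper that recognizes only the unshifted 32-bit MOVZ form.

```python
def decode_movz_w(word: int):
    """Decode an unshifted 32-bit MOVZ as (rd, imm16); None for other words."""
    # Bits 31..21 of `MOVZ Wd, #imm16` (hw = 0) are fixed; mask and compare.
    if word & 0xFFE00000 != 0x52800000:
        return None
    rd = word & 0x1F               # destination register number
    imm16 = (word >> 5) & 0xFFFF   # 16-bit immediate
    return rd, imm16


# Byte strings from the summary, read as little-endian 32-bit words.
pre_patch = int.from_bytes(bytes.fromhex("42038052"), "little")
post_patch = int.from_bytes(bytes.fromhex("1f2003d5"), "little")

decoded = decode_movz_w(pre_patch)
print(f"pre-patch word  {pre_patch:#010x} -> {decoded}")
print(f"post-patch word {post_patch:#010x}")

A64_NOP = 0xD503201F  # architectural NOP encoding
print("post-patch word is NOP:", post_patch == A64_NOP)
```

Under those assumptions the pre-patch word decodes to register 2 with immediate 0x1a, which agrees with the `mov w2, #0x1a` named in the PR description.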

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • frenck

Poem

🐰 I hopped into binaries, bytes in my paw,
I checked every offset and fixed what I saw.
NOPs in a row, prechecks in place,
No more odd crashes to ruin the race,
A tiny patch dance — now code can hop on with thaw.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title accurately describes the main fix: correcting x86_64 TURN patch offsets and using a full NOP sled instead of single-byte NOPs.
Linked Issues check | ✅ Passed | The PR directly addresses issue #635 by correcting the erroneous x86_64 patch offsets, implementing proper 5-byte NOP sleds, and adding verification checks to prevent similar errors.
Out of Scope Changes check | ✅ Passed | All changes are scoped to fixing the x86_64 SIGILL crash by correcting offsets, improving NOP replacement strategy, and adding pre/post-patch verification.
Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
unifi/Dockerfile (1)

64-68: Good post-patch verification.

The verification correctly confirms:

  • aarch64: "1f2003d5" (ARM64 NOP instruction)
  • x86_64: "9090909090" (5-byte NOP sled)

One minor observation: if any test command fails, the error message won't indicate which specific check failed. Since the md5sum check (lines 49-50) guards against library changes, this is acceptable, but for future maintainability you could consider adding brief echo statements before each test to aid debugging.

♻️ Optional: Add diagnostic output before tests
     # Verify patches actually landed
+    && echo "Verifying patches..." \
     && test "$(dd if=${AARCH64_LIB} bs=1 skip=$((0x167214)) count=4 2>/dev/null | od -An -tx1 | tr -d ' \n')" = "1f2003d5" \

Or wrap each test in a conditional with descriptive failure messages:

&& { test "$(dd if=${X86_64_LIB} bs=1 skip=$((0x114CB9)) count=5 2>/dev/null | od -An -tx1 | tr -d ' \n')" = "9090909090" \
     || { echo "FAIL: x86_64 patch at 0x114CB9 did not apply"; exit 1; }; } \
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@unifi/Dockerfile` around lines 64 - 68, The post-patch verification tests
using the dd/od checks against ${AARCH64_LIB} and ${X86_64_LIB} (the dd calls
that compare bytes at offsets 0x167214, 0x167D20, 0x114CB9, 0x1157D7) currently
fail silently with no indication which check failed; update each test to emit a
short diagnostic or fail message on error (for example print which
library/offset is being checked before running, or wrap each test so a failed
test prints "FAIL: <arch> patch at 0x<offset> did not apply" and exits non-zero)
so maintainers can quickly identify which specific dd/od check failed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9b430d7c-0143-4c8a-8ddb-a2d21e824956

📥 Commits

Reviewing files that changed from the base of the PR and between 5a807c6 and e85f9fb.

📒 Files selected for processing (1)
  • unifi/Dockerfile

Reviewer suggestion: wrap the pre/post-patch checks in a helper that
reports which architecture, site, and offset failed, rather than a
silent test exit. Preserves the same guarantees while making build
failures self-explanatory.
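
One way such a reporting helper could look is sketched below. It is a hypothetical Python analogue of the suggestion, not code from the Dockerfile; the `Site` type, the `check_sites` name, and the small offsets in the synthetic images are all invented for illustration.

```python
from typing import NamedTuple


class Site(NamedTuple):
    arch: str        # architecture label, e.g. "x86_64"
    name: str        # human-readable label for the patch site
    offset: int      # offset of the first byte to check
    expected: bytes  # bytes that must be present at that offset


def check_sites(images, sites):
    """Return one descriptive message per site whose bytes do not match."""
    failures = []
    for site in sites:
        image = images[site.arch]
        found = image[site.offset:site.offset + len(site.expected)]
        if found != site.expected:
            failures.append(
                f"FAIL: {site.arch} {site.name} at {site.offset:#x}: "
                f"expected {site.expected.hex()}, found {found.hex()}"
            )
    return failures


NOPS_X86 = bytes.fromhex("9090909090")
NOP_A64 = bytes.fromhex("1f2003d5")

# Synthetic images with made-up offsets (not the real library offsets).
x86 = bytearray(64)
x86[0x10:0x15] = NOPS_X86  # site 1 patched; site 2 at 0x20 left unpatched
a64 = bytearray(64)
a64[0x10:0x14] = NOP_A64

images = {"x86_64": bytes(x86), "aarch64": bytes(a64)}
sites = [
    Site("x86_64", "site 1", 0x10, NOPS_X86),
    Site("x86_64", "site 2", 0x20, NOPS_X86),
    Site("aarch64", "site 1", 0x10, NOP_A64),
]

failures = check_sites(images, sites)
for line in failures:
    print(line)
```

Collecting every failure before exiting, rather than stopping at the first, lets a single build log show all mismatched sites at once.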
@frenck frenck added the bugfix Inconsistencies or issues which will cause a problem for users or implementors. label Apr 13, 2026
@frenck frenck requested a review from Copilot April 13, 2026 21:31

Copilot AI left a comment

Pull request overview

Updates the UniFi add-on build to prevent an amd64 (x86_64) SIGILL crash by correcting the WebRTC TURN patch locations and making the patch safer via byte-level validation.

Changes:

  • Fix x86_64 patch offsets to the correct mov edx, 0x1a instruction sites.
  • Replace the previous 1-byte NOP with a full 5-byte NOP sequence on x86_64.
  • Add pre-patch and post-patch byte verification to fail the build if offsets/opcodes don’t match expectations.

zglate commented Apr 13, 2026

@coderabbitai review

coderabbitai bot commented Apr 13, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@frenck frenck merged commit 886de50 into hassio-addons:main Apr 14, 2026
18 of 19 checks passed
@charlesomer

Sorry, I thought I had commented yesterday. I was able to start and run these changes after adding your repo to Home Assistant. I didn't see any crashes, but I only went as far as starting the container; I didn't attempt to restore a setup.

zglate commented Apr 15, 2026

I saw your comment (you put it on issue #631, which wasn't the right spot, but it worked). I pushed a new fix that should resolve your issue, @charlesomer, and it was merged last night. Give it a shot, and if there are still issues, let me know. Thanks!

Labels

bugfix Inconsistencies or issues which will cause a problem for users or implementors.

Development

Successfully merging this pull request may close these issues.

Crash loop with latest release (5.1.0)

4 participants