FM-7: WebContent XPC OOM-killed after multi-day run (551MB growth → UI freeze) #258

@a5af

Summary

After running v0.32.90 for ~4 days, the UI became completely unresponsive. The Tauri host process and backend sidecar were both healthy and idle. The WKWebView WebContent XPC process had been silently killed by macOS JETSAM under memory pressure.

Symptoms

  • UI frozen, no response to clicks or keyboard input
  • agentmux and agentmuxsrv-rs both at 0% CPU, threads in normal wait states
  • sample shows Tauri main thread in __CFRunLoopServiceMachPort → mach_msg2_trap (idle, not deadlocked)
  • No crash report in ~/Library/Logs/DiagnosticReports/

Evidence

PID    COMMAND                       CPU    MEM (phys)
81550  agentmux (v0.32.90)           0.0%   41MB
81551  com.apple.WebKit.GPU          0.0%   —
81552  com.apple.WebKit.Networking   0.0%   —
81553  com.apple.WebKit.WebContent   0.0%   287MB  ← peak was 551MB
81555  agentmuxsrv-rs                0.0%   —
  • WebContent Idle exit: clean flag — this is the macOS JETSAM memory-pressure kill marker
  • System free pages at time of freeze: 3853 pages = ~63MB (at the 16 KB page size used on Apple Silicon)
  • WebContent peak physical footprint: 551MB (dropped to 287MB after partial reclaim)
  • WebContent CPU minutes: 71m over 4 days (not abnormal, confirms no spin loop)
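
The free-memory figure above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes a 16 KiB (16384-byte) page size, which is what Apple Silicon macOS uses. The helper name pagesToMegabytes is made up for illustration.

```typescript
// Assumption: 16 KiB pages, as on Apple Silicon macOS.
const PAGE_SIZE_BYTES = 16384;

// Convert a page count into decimal megabytes (1 MB = 1,000,000 bytes).
function pagesToMegabytes(pages: number, pageSizeBytes: number = PAGE_SIZE_BYTES): number {
  return (pages * pageSizeBytes) / 1_000_000;
}

// Free pages reported at the time of the freeze.
const freeMB = pagesToMegabytes(3853);
console.log(freeMB.toFixed(1)); // 63.1

// Footprint reclaimed between the 551MB peak and the 287MB reading.
const reclaimedMB = 551 - 287;
console.log(reclaimedMB); // 264
```

Roughly 63MB of free memory next to a 551MB renderer is consistent with the system being under memory pressure when the freeze occurred.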

Root Cause

macOS JETSAM killed the WebContent XPC process when the system hit memory pressure. The process had grown to 551MB over ~4 days of continuous use with multiple terminal panes open.

The primary growth driver is xterm.js scrollback accumulation. The current default is 2000 lines per terminal (term.tsx:160). Each line with rich styled output (e.g. claude's colored stream-json output) carries significant per-cell overhead in the xterm.js internal row model and WebGL texture atlas. With many terminals × days of active output, this grows unboundedly.

There is no recovery mechanism: when WKWebView's renderer is killed, the app goes blank and stays blank. WKWebView reports this through the WKNavigationDelegate callback webViewWebContentProcessDidTerminate(_:), but the Tauri shell does not handle it.

Two Required Fixes

Fix A — Scrollback limit (memory growth)

  • Lower default scrollback from 2000 → 500 lines
  • Honor term:scrollback setting (already plumbed in) for users who need more
  • Consider proactively trimming scrollback on long-running panes (e.g. truncate to 200 lines when the pane has been running > 1h with no user scroll activity)
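
The scrollback policy in Fix A could look roughly like the sketch below. It is not the existing term.tsx code. The names (resolveScrollback, targetScrollback, PaneState) are hypothetical, and "no user scroll activity" is interpreted here as "no scroll in the last hour", which is only one possible reading. The sketch also assumes idle trimming may lower a user-configured value, which the issue leaves open.

```typescript
// Values taken from Fix A; everything else in this sketch is an assumption.
const DEFAULT_SCROLLBACK = 500;
const IDLE_TRIM_SCROLLBACK = 200;
const IDLE_TRIM_AFTER_MS = 60 * 60 * 1000; // 1 hour

// Hypothetical per-pane bookkeeping; not an existing AgentMux type.
interface PaneState {
  startedAtMs: number;
  lastUserScrollAtMs: number | null; // null if the user never scrolled
}

// Honor a term:scrollback setting when it is a non-negative integer,
// otherwise fall back to the lowered default.
function resolveScrollback(setting: unknown): number {
  if (typeof setting === "number" && Number.isInteger(setting) && setting >= 0) {
    return setting;
  }
  return DEFAULT_SCROLLBACK;
}

// Decide how many scrollback lines a pane should keep right now.
function targetScrollback(configured: number, pane: PaneState, nowMs: number): number {
  // Idle time counts from the last scroll, or from pane start if none occurred.
  const idleSinceMs = pane.lastUserScrollAtMs ?? pane.startedAtMs;
  const idleLongEnough = nowMs - idleSinceMs > IDLE_TRIM_AFTER_MS;
  // Never raise the limit above what was configured.
  return idleLongEnough ? Math.min(configured, IDLE_TRIM_SCROLLBACK) : configured;
}

const pane: PaneState = { startedAtMs: 0, lastUserScrollAtMs: null };
console.log(targetScrollback(resolveScrollback(undefined), pane, 2 * IDLE_TRIM_AFTER_MS)); // 200
```

In recent xterm.js versions the result would presumably be applied by assigning to terminal.options.scrollback. Whether that releases memory promptly in the WebGL renderer would need to be measured.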

Fix B — WebContent crash recovery

  • Handle the WKNavigationDelegate callback webViewWebContentProcessDidTerminate(_:) on the native (Tauri) side
  • On termination: reload the WKWebView page and let TermResyncHandler (already present) reconnect the terminal controllers
  • Show a brief "renderer restarted" notification so the user knows what happened
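
The recovery order in Fix B can be modeled as a small sketch. The real handler has to live in native code, since the page's JavaScript dies with the renderer. This TypeScript only illustrates the sequence. The RendererHost interface and its methods are hypothetical and do not exist in AgentMux, Tauri, or WebKit.

```typescript
// Hypothetical stand-in for the native shell; these names are illustrative only.
interface RendererHost {
  reloadPage(): void;                  // stands in for reloading the WKWebView page
  notifyUser(message: string): void;   // stands in for a brief user-facing notice
}

// Called when the host learns that the WebContent process was terminated.
function handleRendererTerminated(host: RendererHost): void {
  // Reload first so the frontend boots again and the existing resync path
  // (TermResyncHandler in the issue) can reconnect the terminal controllers.
  host.reloadPage();
  // Then tell the user why the UI briefly went blank.
  host.notifyUser("Renderer restarted");
}

// Usage with a recording fake, to make the order of operations visible.
const events: string[] = [];
const fakeHost: RendererHost = {
  reloadPage: () => { events.push("reload"); },
  notifyUser: (message: string) => { events.push(`notify:${message}`); },
};
handleRendererTerminated(fakeHost);
console.log(events); // [ 'reload', 'notify:Renderer restarted' ]
```

If the system is still under memory pressure, the reloaded renderer could be killed again, so a real implementation may want to limit how often it reloads automatically.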

Versions

  • AgentMux: v0.32.90
  • macOS: Darwin kernel 25.2.0
  • Hardware: Apple Silicon (aarch64)

Related

  • FM-1 through FM-6: docs/investigations/fix-plan-zombie-cpu-2026-03-23.md
  • TermResyncHandler in frontend/app/view/term/term.tsx — already handles backend restart resync, can be extended for WebContent restart
