ProcessMemoryMonitor uses workingSetSize instead of privateBytes and lacks trend detection #3285

@gregpriday

Summary

The process memory monitor polls per-process metrics every 30 seconds and fires warn logs when absolute thresholds are exceeded. Two problems make this less useful than it should be: the metric used (workingSetSize) fluctuates with OS swapping and generates false positives, and there is no trend analysis — a process growing at 50 MB/hour will not trigger any warning until it crosses a static threshold, which may be hours away.

Problem Statement

Wrong metric for leak detection. app.getAppMetrics() returns a memory object with both workingSetSize and privateBytes. The current monitor uses workingSetSize exclusively:

workingSetSize is total physical RAM mapped to the process, including pages that the OS can freely swap out. On macOS with memory compression active, this number fluctuates significantly even when actual heap allocation is stable — triggering false warnings on legitimate sessions. privateBytes is unshared memory exclusive to the process and is the standard metric for leak detection because it reflects what the process actually owns.
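As a rough illustration of the metric switch, the sketch below converts one entry from app.getAppMetrics() into a sample in megabytes. The interfaces are local stand-ins that mirror the documented memory fields so the snippet compiles without Electron; `toMemorySample` and `MemorySample` are hypothetical names, not part of the existing monitor. Electron's documentation reports these values in kilobytes and marks privateBytes as optional, so the sketch assumes a fallback to workingSetSize when the field is absent.

```typescript
// Local stand-ins for the shapes returned by app.getAppMetrics().
// Declared here so the sketch compiles without importing electron.
interface MemoryInfoLike {
  workingSetSize: number; // kilobytes
  peakWorkingSetSize: number; // kilobytes
  privateBytes?: number; // kilobytes; optional in Electron's typings
}

interface ProcessMetricLike {
  pid: number;
  type: string;
  memory: MemoryInfoLike;
}

// Hypothetical sample shape used only in this sketch.
interface MemorySample {
  pid: number;
  type: string;
  megabytes: number;
  source: "privateBytes" | "workingSetSize";
}

// Prefer privateBytes; fall back to workingSetSize when it is not reported.
function toMemorySample(metric: ProcessMetricLike): MemorySample {
  const privateKb = metric.memory.privateBytes;
  if (typeof privateKb === "number") {
    return {
      pid: metric.pid,
      type: metric.type,
      megabytes: privateKb / 1024,
      source: "privateBytes",
    };
  }
  return {
    pid: metric.pid,
    type: metric.type,
    megabytes: metric.memory.workingSetSize / 1024,
    source: "workingSetSize",
  };
}
```

Recording the `source` alongside the value lets a log reader tell whether a warning came from the preferred metric or from the fallback.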

No growth rate analysis. The monitor only compares the current sample to a static threshold:

A slow memory leak, the kind introduced by an unclosed event listener or an unbounded Zustand slice, grows at 5–20 MB/hour. At a 300 MB threshold for the Browser process, a leak starting at 150 MB has 150 MB of headroom, so it would go undetected for 7.5 hours at 20 MB/hour and for up to 30 hours at 5 MB/hour on a developer's long-running session. By then the evidence is gone and the leak is hard to locate.
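The detection delay follows directly from the headroom divided by the growth rate. A minimal sketch of that arithmetic, using a hypothetical helper name:

```typescript
// Hours until a process growing at a constant rate reaches a static threshold.
// Returns Infinity when the process is not growing.
function hoursUntilThreshold(
  currentMb: number,
  thresholdMb: number,
  growthMbPerHour: number,
): number {
  if (growthMbPerHour <= 0) return Infinity;
  const headroomMb = Math.max(0, thresholdMb - currentMb);
  return headroomMb / growthMbPerHour;
}

// Figures from the paragraph above: 150 MB start, 300 MB threshold.
const fastLeakHours = hoursUntilThreshold(150, 300, 20); // 7.5
const slowLeakHours = hoursUntilThreshold(150, 300, 5); // 30
```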

The correct approach is to track the rate of change over a rolling window using smoothed trend analysis (e.g. Exponential Moving Average over bucket-minimum values to filter GC sawtooth). This detects slow leaks early and eliminates false positives from transient spikes.
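One way to read "EMA over bucket-minimum values" is to compute the growth rate between consecutive bucket minima and smooth those rates with an exponential moving average. The sketch below takes that interpretation; `BucketMinimum`, `smoothedGrowthMbPerHour`, and the default smoothing factor of 0.2 are assumptions made for illustration, not values taken from the monitor.

```typescript
// One trough per bucket, as produced by a bucket-minimum pass over raw samples.
interface BucketMinimum {
  timeMs: number;
  megabytes: number;
}

const MS_PER_HOUR = 3600000;

// EMA of the per-interval growth rate (MB/hour) between consecutive minima.
// Returns null when fewer than two usable points exist.
function smoothedGrowthMbPerHour(
  minima: BucketMinimum[],
  alpha: number = 0.2,
): number | null {
  const [first, ...rest] = minima;
  if (first === undefined || rest.length === 0) return null;

  let previous = first;
  let ema: number | null = null;
  for (const current of rest) {
    const elapsedMs = current.timeMs - previous.timeMs;
    // Skip out-of-order or duplicate timestamps rather than divide by zero.
    if (elapsedMs > 0) {
      const rate =
        ((current.megabytes - previous.megabytes) * MS_PER_HOUR) / elapsedMs;
      ema = ema === null ? rate : alpha * rate + (1 - alpha) * ema;
      previous = current;
    }
  }
  return ema;
}
```

A smaller `alpha` reacts more slowly to any single interval, which trades detection latency for fewer false positives. A linear regression over the window would be a reasonable alternative to the EMA.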

Startup allocation bursts are not suppressed. V8 JIT compilation and module loading during the first 10–15 minutes of app startup produce rapid allocations that a naive threshold check would flag as a leak.

Desired Behavior

  • The monitor tracks privateBytes per process rather than workingSetSize
  • Each 30s sample contributes to a per-process rolling history, with bucket minima used to filter GC noise
  • A smoothed trend (rate of change in MB/hour) is calculated from the history
  • Sustained growth exceeding a configurable threshold (e.g., >5 MB/hour over a 30-minute window) is logged at warn level
  • The first 15 minutes of app uptime are excluded from trend evaluation to absorb startup allocation
  • Absolute threshold warnings are preserved but use privateBytes as the source value
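The gating rules in the list above (startup suppression, a full evaluation window, and a configurable rate threshold) could be expressed as a small pure function. This is a sketch under assumptions: `TrendConfig`, `TrendVerdict`, and `evaluateTrend` are hypothetical names, and the defaults simply restate the figures from the list.

```typescript
interface TrendConfig {
  warmupMs: number; // uptime excluded from trend evaluation
  windowMs: number; // history span required before a trend is trusted
  thresholdMbPerHour: number; // sustained growth that triggers a warning
}

const DEFAULT_TREND_CONFIG: TrendConfig = {
  warmupMs: 15 * 60 * 1000,
  windowMs: 30 * 60 * 1000,
  thresholdMbPerHour: 5,
};

type TrendVerdict = "suppressed-warmup" | "insufficient-history" | "ok" | "warn";

// Decide whether a smoothed growth rate should produce a warn-level log.
function evaluateTrend(
  uptimeMs: number,
  historySpanMs: number,
  growthMbPerHour: number,
  config: TrendConfig = DEFAULT_TREND_CONFIG,
): TrendVerdict {
  // Startup allocation bursts are ignored entirely.
  if (uptimeMs < config.warmupMs) return "suppressed-warmup";
  // Require a full window so a short burst cannot look like a trend.
  if (historySpanMs < config.windowMs) return "insufficient-history";
  return growthMbPerHour > config.thresholdMbPerHour ? "warn" : "ok";
}
```

Keeping the decision separate from polling and logging makes it straightforward to unit test the suppression window with fabricated timestamps instead of fake timers.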

Context

The monitor is initialized unconditionally on every app launch via:

The logging infrastructure (logDebug, logWarn) is already in use within the monitor:

Existing tests cover the polling and snapshot behavior:

Acceptance Criteria

  • Memory samples use privateBytes instead of workingSetSize for threshold comparisons and trend analysis
  • A process growing at a sustained rate above the trend threshold (configurable, default 5 MB/hour) produces a warn-level log entry identifying the process type, PID, and projected hourly growth
  • Trend evaluation is suppressed for the first 15 minutes after the monitor starts, preventing startup allocation from triggering false warnings
  • Single-sample spikes (e.g., temporary allocations that are GC'd) do not trigger trend warnings
  • Existing test suite remains green; new tests cover trend detection and the startup suppression window
  • Absolute threshold warnings (Browser, Tab, Utility) continue to fire, now based on privateBytes

Edge Cases & Risks

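  • On macOS, process.getSystemMemoryInfo().free is unreliable due to memory compression and should not be used for trend calculations. Per-process values from app.getAppMetrics() are the better source, but Electron's MemoryInfo documentation marks privateBytes as optional and Windows-only, so its availability on macOS and Linux should be verified and a fallback defined rather than assuming the field is OS-agnostic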
  • If a process is killed and restarted (e.g., renderer reload), its PID changes; history keyed by PID will naturally reset without special handling
  • GC cycles create sawtooth patterns in raw heap samples — using bucket-minimum values (the trough of each 60s window) rather than raw samples eliminates most false trend signals
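The bucket-minimum idea and the PID-keyed history can be sketched as two small helpers. The names `Sample`, `bucketMinima`, and `pruneHistories` are hypothetical. The pruning step is an addition beyond what the issue asks for: history for a restarted process does reset on its own because the PID changes, but entries for dead PIDs would otherwise stay in the map indefinitely.

```typescript
interface Sample {
  timeMs: number;
  megabytes: number;
}

// Collapse raw samples into one trough per fixed-width bucket (60 s by default).
// Each result is stamped with the start time of its bucket.
function bucketMinima(samples: Sample[], bucketMs: number = 60000): Sample[] {
  const troughs = new Map<number, Sample>();
  for (const sample of samples) {
    const bucketStart = Math.floor(sample.timeMs / bucketMs) * bucketMs;
    const existing = troughs.get(bucketStart);
    if (existing === undefined || sample.megabytes < existing.megabytes) {
      troughs.set(bucketStart, { timeMs: bucketStart, megabytes: sample.megabytes });
    }
  }
  return Array.from(troughs.values()).sort((a, b) => a.timeMs - b.timeMs);
}

// Drop histories for PIDs that were absent from the latest poll.
function pruneHistories(
  histories: Map<number, Sample[]>,
  livePids: Set<number>,
): void {
  for (const pid of Array.from(histories.keys())) {
    if (!livePids.has(pid)) histories.delete(pid);
  }
}
```

With 30-second polling, a 60-second bucket holds two samples, so a spike is filtered only when the other sample in that bucket is lower. A wider bucket would filter more aggressively at the cost of fewer points per window.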

Labels: backend, enhancement, performance
