test(perf): add hot-swap memory load testing harness and analysis #652
3rabiii wants to merge 1 commit into OneBusAway:main
Conversation
aaronbrethorst left a comment
Great investigation, Adel — this is thorough work and the findings are genuinely useful for capacity planning. The test harness, k6 scenario, and analysis document together make a solid toolkit for understanding hot-swap behavior. A few things to address before merging:
Critical
(none)
Important
- Shell script uses `/proc`, which doesn't exist on macOS (`scripts/hotswap-memory-test.sh:84,179`). The `monitor_rss` function and `run_hotswap_test` both check `[ -d "/proc/$pid" ]`, which will always be false on macOS/Darwin. Since many contributors develop on macOS, RSS monitoring will silently do nothing. Consider using `kill -0 "$pid" 2>/dev/null` to check whether a process is alive, which works on both Linux and macOS.
- Inconsistent multiplier in the analysis doc (`docs/hotswap_memory_analysis.md`). Most of the document reports the peak multiplier as 1.25x, but lines 119 and 146 say 1.30x:
  - Line 119: "The observed 1.30x multiplier is acceptable..."
  - Line 146: "No immediate mitigation needed: The 1.30x multiplier is within acceptable bounds"

  Pick one and use it consistently.
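As a sketch of the portable liveness check suggested above: the function below mirrors what the review describes (`monitor_rss` polling a PID's RSS), but the body is illustrative, not the actual script — it assumes `ps -o rss=` output in KiB, which holds on both Linux and macOS.

```shell
# Hypothetical portable monitor_rss; names follow the review's
# description of scripts/hotswap-memory-test.sh, not its real contents.
monitor_rss() {
    pid="$1"
    out="${2:-rss.log}"
    # kill -0 sends no signal; it only checks that the process exists,
    # so it works on Linux and macOS, unlike [ -d "/proc/$pid" ].
    while kill -0 "$pid" 2>/dev/null; do
        # ps -o rss= is portable and prints the resident set size in KiB.
        rss_kib=$(ps -o rss= -p "$pid" | tr -d ' ')
        printf '%s %s\n' "$(date +%s)" "$rss_kib" >> "$out"
        sleep 1
    done
}
```

The same `kill -0` test can replace both `/proc` checks (lines 84 and 179) without touching the rest of the script's logic.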
Fit and Finish
-
k6 script indentation (
scripts/hotswap-memory-test.sh:116): Thek6 runline has inconsistent indentation — it uses spaces where the rest of the file uses tabs, and the continuation line is misaligned relative to its context. -
Consider whether
docs/hotswap_memory_analysis.mdshould live in the repo. This file contains hardcoded results from a single test run on a specific machine. These numbers will drift as the codebase evolves. It might be better suited as a GitHub issue comment on #504, or moved intoloadtest/README.mdas a "Sample Results" section with a note that results will vary. Up to you — just flagging that static benchmark results in docs tend to become misleading over time.
Verdict
Request changes — please fix the `/proc` portability issue and the inconsistent multiplier, then this is good to go.
This PR introduces the testing harness, continuous monitoring scripts, and comprehensive analysis for the GTFS hot-swap memory behavior (`ForceUpdate`).
To ensure accuracy and validate the system's behavior, I ran the newly implemented `TestHotSwapMemory_LargeAgency` using the TriMet (Large Agency) dataset with FTS5 enabled, simulating heavy production load (10 concurrent readers) during the swap window.
Answers to Issue Questions (Based on Local Test Results)
Peak memory multiplier: 1.25x baseline (Baseline: 2.92 GiB, Peak: 3.64 GiB). This is significantly better than the anticipated 2.0x multiplier; Go's concurrent GC actively and effectively reclaims memory during the build phase (10 GC cycles ran during the swap).
Settle time: ~1.7 seconds (1,728 ms) for memory to settle after the write lock is released and the swap completes.
Request failures during the swap: No. The failure rate was effectively 0.00%; out of 245,340 requests made during the test, only 1 failed. The RWMutex successfully protects data consistency without noticeable API impact.
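The RWMutex behavior described above can be sketched as follows. This is a minimal illustration of the read-lock/pointer-swap pattern the test exercises, assuming a store type and `ForceUpdate` shape that are hypothetical stand-ins for the repository's actual code.

```go
// Illustrative hot-swap pattern: readers take RLock, the replacement
// dataset is built outside the lock (the window where both copies are
// resident, hence the peak multiplier), and the write lock is held
// only for the pointer swap. Names are hypothetical, not the repo's.
package main

import (
	"fmt"
	"sync"
)

type Dataset struct{ Version int }

type Store struct {
	mu   sync.RWMutex
	data *Dataset
}

// Get is the hot path: a read lock, so many readers run in parallel.
func (s *Store) Get() *Dataset {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.data
}

// ForceUpdate swaps in an already-built dataset under a brief write
// lock, so readers are blocked only for the pointer assignment.
func (s *Store) ForceUpdate(next *Dataset) {
	s.mu.Lock()
	s.data = next
	s.mu.Unlock()
}

func main() {
	s := &Store{data: &Dataset{Version: 1}}
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ { // 10 concurrent readers, as in the test
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = s.Get().Version // never observes a half-built dataset
		}()
	}
	s.ForceUpdate(&Dataset{Version: 2})
	wg.Wait()
	fmt.Println(s.Get().Version) // prints 2
}
```

Because the write lock covers only the assignment, in-flight reads complete against the old dataset and new reads see the new one, which is consistent with the near-zero failure rate observed.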
Container sizing: For TriMet-sized agencies (~24 MB compressed GTFS), a 4 GB container limit is perfectly safe and recommended. I have documented the recommended limits for all agency sizes (Small to XL) in the updated `README.md` and the detailed markdown report.

Deliverables in this PR:
- `scripts/hotswap-memory-test.sh`: Bash utility for live RSS monitoring and heap profiling.
- `hotswap_memory_analysis.md`: Full documentation of findings and container sizing recommendations.
- Updated `loadtest/README.md` and `.gitignore` to prevent tracking heavy dumps.

No immediate mitigations (like streaming imports or explicit `runtime.GC()`) are necessary at this stage, since the peak multiplier is well contained (1.25x).
Proof of Work:
@aaronbrethorst
Closes #504