
test(perf): add hot-swap memory load testing harness and analysis #652

Open
3rabiii wants to merge 1 commit into OneBusAway:main from 3rabiii:perf/hot-swap-memory-analysis


@3rabiii
Contributor

@3rabiii 3rabiii commented Mar 10, 2026

This PR introduces the testing harness, continuous monitoring scripts, and analysis for the GTFS hot-swap memory behavior (ForceUpdate).
To ensure accuracy and validate the system's behavior, I ran the newly implemented TestHotSwapMemory_LargeAgency using the TriMet (Large Agency) dataset with FTS5 enabled, simulating heavy production load (10 concurrent readers) during the swap window.

Answers to Issue Questions (Based on Local Test Results)

  1. What is the peak memory multiplier?
    1.25x baseline (baseline: 2.92 GiB, peak: 3.64 GiB), well below the anticipated 2.0x multiplier. Go's concurrent GC reclaimed memory during the build phase (10 GC cycles ran during the swap).
  2. How long does the old data take to be GC'd after the swap?
    ~1.7 seconds (1,728 ms) for memory to settle after the write lock is released and the swap completes.
  3. Do any requests fail or timeout during the swap window?
    Effectively no: 1 of 245,340 requests made during the test failed, a failure rate that rounds to 0.00%. The RWMutex protects data consistency without noticeable API impact.
  4. At what agency size does this become a problem for typical container limits?
    For TriMet-sized agencies (~24 MB compressed GTFS), a 4 GB container limit is safe and is the recommended setting. I have documented the recommended limits for all agency sizes (Small to XL) in the updated README.md and the detailed markdown report.
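
To make the multiplier and settle measurements above concrete, here is a minimal, self-contained sketch of the general approach: take a baseline heap reading, build a replacement dataset while readers hold the read lock, sample the heap just before the swap, then sample again after a GC. `Store`, `Measure`, and `Multiplier` are hypothetical names invented for this illustration and are not taken from the PR's test harness. Because nothing is reclaimed while the replacement is built, this toy should land near 2x rather than the 1.25x reported above.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// Store is a hypothetical stand-in for in-memory GTFS data guarded by an RWMutex.
type Store struct {
	mu   sync.RWMutex
	data []byte
}

// Len reads under the read lock, the way request handlers would.
func (s *Store) Len() int {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return len(s.data)
}

// Swap replaces the dataset under the write lock.
func (s *Store) Swap(next []byte) {
	s.mu.Lock()
	s.data = next
	s.mu.Unlock()
}

// heapAlloc returns the bytes of allocated heap objects.
func heapAlloc() uint64 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapAlloc
}

// Multiplier is the peak heap divided by the baseline heap.
func Multiplier(peak, baseline uint64) float64 {
	return float64(peak) / float64(baseline)
}

// Result holds heap samples (in bytes) and the number of completed reads.
type Result struct {
	Baseline, Peak, Settled uint64
	Reads                   int64
}

// Measure swaps a dataset of the given size while `readers` goroutines read it.
func Measure(size, readers int) Result {
	s := &Store{data: make([]byte, size)}
	runtime.GC()
	res := Result{Baseline: heapAlloc()}

	var reads int64
	stop := make(chan struct{})
	var wg sync.WaitGroup
	for i := 0; i < readers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for {
				select {
				case <-stop:
					return
				default:
					_ = s.Len()
					atomic.AddInt64(&reads, 1)
				}
			}
		}()
	}

	next := make([]byte, size) // build phase: old and new data are both live
	res.Peak = heapAlloc()
	s.Swap(next) // the old data becomes unreachable here
	close(stop)
	wg.Wait()

	runtime.GC() // force a collection so the old data is reclaimed
	res.Settled = heapAlloc()
	res.Reads = atomic.LoadInt64(&reads)
	runtime.KeepAlive(s) // keep the new data live through the last sample
	return res
}

func main() {
	r := Measure(64<<20, 10)
	fmt.Printf("baseline=%d peak=%d settled=%d reads=%d multiplier=%.2fx\n",
		r.Baseline, r.Peak, r.Settled, r.Reads, Multiplier(r.Peak, r.Baseline))
}
```

The sketch samples `HeapAlloc`, which counts Go heap objects only; RSS as seen by a container limit can be higher and can lag behind, since the runtime returns freed memory to the OS gradually.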

Deliverables in this PR:

  • internal/gtfs/hot_swap_memory_test.go: Automated memory profiling test with a perftest tag.
  • loadtest/k6/hotswap_scenario.js: k6 load testing script with heuristic swap-window detection.
  • scripts/hotswap-memory-test.sh: Bash utility for live RSS monitoring and heap profiling.
  • hotswap_memory_analysis.md: Full documentation of findings and container sizing recommendations.
  • Updated loadtest/README.md and .gitignore to prevent tracking heavy dumps.

No immediate mitigations (such as streaming imports or explicit runtime.GC()) are necessary at this stage, since the peak multiplier stays contained at 1.25x.

Proof of Work:

Screenshot From 2026-03-10 13-07-46

@aaronbrethorst
closes #504

Member

@aaronbrethorst aaronbrethorst left a comment

Great investigation, Adel — this is thorough work and the findings are genuinely useful for capacity planning. The test harness, k6 scenario, and analysis document together make a solid toolkit for understanding hot-swap behavior. A few things to address before merging:

Critical

(none)

Important

  1. Shell script uses /proc which doesn't exist on macOS (scripts/hotswap-memory-test.sh:84,179). The monitor_rss function and run_hotswap_test both check [ -d "/proc/$pid" ], which will always be false on macOS/Darwin. Since many contributors develop on macOS, RSS monitoring will silently do nothing. Consider using kill -0 "$pid" 2>/dev/null to check if a process is alive, which works on both Linux and macOS.

  2. Inconsistent multiplier in the analysis doc (docs/hotswap_memory_analysis.md). Most of the document reports the peak multiplier as 1.25x, but lines 119 and 146 say 1.30x:

    • Line 119: "The observed 1.30x multiplier is acceptable..."
    • Line 146: "No immediate mitigation needed: The 1.30x multiplier is within acceptable bounds"

    Pick one and use it consistently.

Fit and Finish

  1. k6 script indentation (scripts/hotswap-memory-test.sh:116): The k6 run line has inconsistent indentation — it uses spaces where the rest of the file uses tabs, and the continuation line is misaligned relative to its context.

  2. Consider whether docs/hotswap_memory_analysis.md should live in the repo. This file contains hardcoded results from a single test run on a specific machine. These numbers will drift as the codebase evolves. It might be better suited as a GitHub issue comment on #504, or moved into loadtest/README.md as a "Sample Results" section with a note that results will vary. Up to you — just flagging that static benchmark results in docs tend to become misleading over time.

Verdict

Request changes — please fix the /proc portability issue and the inconsistent multiplier, then this is good to go.

Development

Successfully merging this pull request may close these issues.

Measure memory behavior during GTFS hot-swap under load
