Performance: API Reader Starvation due to O(N) Map Allocation inside Write Lock (realTimeMutex) #670

@vedanthnyk25

Description

Currently, internal/gtfs/realtime.go suffers from severe lock contention under load, leading to a pile-up of blocked goroutines and high p99 latency for API readers.

The rebuildMergedRealtimeLocked function is called to rebuild the global routing maps (realTimeTripLookup and realTimeVehicleLookupByTrip). However, it performs this O(N) map allocation and data copying while holding an exclusive write lock (realTimeMutex.Lock()).

The Bottleneck

When processing a real-world GTFS-RT feed with tens of thousands of active trips, the map allocation takes several milliseconds. During this critical section:

  1. Every incoming HTTP API request attempting to read state (e.g., calling GetRealTimeTrips) blocks while waiting for the RLock.
  2. This creates a massive queue of blocked goroutines.
  3. When the write lock is finally released, the "thundering herd" of blocked readers wakes up simultaneously, thrashing the Go scheduler and causing cascading timeouts.
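
For illustration, the contended pattern described above looks roughly like the sketch below. This is not the project's actual code: the `contendedManager` type, the `update` and `lookup` methods, and the `map[string]int` value type are hypothetical simplifications. Only the identifiers `realTimeMutex`, `realTimeTripLookup`, and `rebuildMergedRealtimeLocked` come from this issue.

```go
package main

import (
	"fmt"
	"sync"
)

// contendedManager is a hypothetical, simplified stand-in for the real manager.
type contendedManager struct {
	realTimeMutex      sync.RWMutex
	realTimeTripLookup map[string]int // trip ID -> index (placeholder value type)
}

// rebuildMergedRealtimeLocked allocates and fills a new map.
// The caller must already hold the write lock.
func (m *contendedManager) rebuildMergedRealtimeLocked(tripIDs []string) {
	lookup := make(map[string]int, len(tripIDs))
	for i, id := range tripIDs {
		lookup[id] = i
	}
	m.realTimeTripLookup = lookup
}

// update holds the exclusive lock for the whole O(N) rebuild,
// so every reader waits for the allocation and copy to finish.
func (m *contendedManager) update(tripIDs []string) {
	m.realTimeMutex.Lock()
	defer m.realTimeMutex.Unlock()
	m.rebuildMergedRealtimeLocked(tripIDs)
}

// lookup is the reader path: it blocks on RLock while update is running.
func (m *contendedManager) lookup(id string) (int, bool) {
	m.realTimeMutex.RLock()
	defer m.realTimeMutex.RUnlock()
	i, ok := m.realTimeTripLookup[id]
	return i, ok
}

func main() {
	m := &contendedManager{}
	m.update([]string{"trip-a", "trip-b"})
	_, ok := m.lookup("trip-b")
	fmt.Println(ok)
}
```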

Note: This bottleneck is consistent with the 75%+ failure rate and 1-minute latency spikes documented in docs/mutex_contention_analysis.md.

Proposed Solution: Lock-Free Copy-On-Write (COW)

To eliminate the reader starvation, we should move the O(N) allocation entirely out of the critical section using a Copy-On-Write pattern with atomic.Value.

Implementation Steps:

  1. Group the lookup maps into a single state struct (e.g., RealTimeState).
  2. Store this struct in the API manager using atomic.Value.
  3. Inside the rebuild function, allocate and populate the new maps in local memory without acquiring the global lock.
  4. Once the new maps are fully built, perform an O(1) atomic pointer swap to make them active.

This ensures that API readers never block waiting for background feed processing: each read costs a single uncontended atomic load, provided a published state is treated as immutable and never modified in place.
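
The steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not a drop-in patch: `Manager`, `NewManager`, `Rebuild`, `Snapshot`, and the `map[string]int` value types are hypothetical placeholders, and `RealTimeState` is only the name suggested in step 1.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// RealTimeState groups the lookup maps into one snapshot (step 1).
// Once published it must never be mutated.
type RealTimeState struct {
	TripLookup          map[string]int // placeholder value type
	VehicleLookupByTrip map[string]int // placeholder value type
}

// Manager is a hypothetical stand-in for the API manager (step 2).
type Manager struct {
	state atomic.Value // always holds a *RealTimeState
}

// NewManager publishes an empty state so readers never see a nil value.
func NewManager() *Manager {
	m := &Manager{}
	m.state.Store(&RealTimeState{
		TripLookup:          map[string]int{},
		VehicleLookupByTrip: map[string]int{},
	})
	return m
}

// Rebuild fills new maps in local memory without taking a lock (step 3),
// then makes them active with a single atomic store (step 4).
func (m *Manager) Rebuild(tripIDs []string) {
	next := &RealTimeState{
		TripLookup:          make(map[string]int, len(tripIDs)),
		VehicleLookupByTrip: make(map[string]int, len(tripIDs)),
	}
	for i, id := range tripIDs {
		next.TripLookup[id] = i
		next.VehicleLookupByTrip[id] = i
	}
	m.state.Store(next)
}

// Snapshot is the reader path: one atomic load, no mutex.
// Callers must treat the returned state as read-only.
func (m *Manager) Snapshot() *RealTimeState {
	return m.state.Load().(*RealTimeState)
}

func main() {
	m := NewManager()
	m.Rebuild([]string{"trip-a", "trip-b"})
	fmt.Println(len(m.Snapshot().TripLookup)) // prints 2
}
```

Two design points to keep in mind. If more than one goroutine can trigger a rebuild (for example, one per feed), the writers still need to be serialized among themselves, for instance with a writer-only mutex that readers never touch. On Go 1.19 or newer, `atomic.Pointer[RealTimeState]` is a typed alternative to `atomic.Value` that avoids the type assertion in the reader path.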

@aaronbrethorst @Ahmedhossamdev
Should I work on this?
