
Editor "Load version" scrambles pattern stops across routes — bug in gtfs-lib dependency (PatternFinder.createPatternObjects) #648

@canales


Summary

When loading a published GTFS version into the GTFS editor ("Load version"), every pattern in the feed receives the stop sequence of a different pattern, crossing route boundaries. Shapes display correctly; only stop sequences are affected. The corruption is silent — no error or warning is surfaced. The bug is deterministic and reproduces on every load of any multi-route feed.

Note: The fix does not belong in this repo. The root cause is in ibi-group/gtfs-lib at PatternFinder.createPatternObjects(). Filing here because ibi-group/gtfs-lib does not have issues enabled. Resolution requires a patch to gtfs-lib and a dependency bump in datatools-server.


Steps to Reproduce

  1. Start with any published feed that has 2+ routes, each with 2+ trip patterns
  2. Click Load version on the published version and confirm
  3. Open any route in the editor → Trip patterns tab
  4. Inspect the stop sequence of any pattern

Expected: stops match the pattern's shape and the published stop_times

Actual: stops belong to a different pattern, typically from a different route


Root Cause (in ibi-group/gtfs-lib)

PatternFinder.createPatternObjects() includes the following logic:

boolean usePatternsFromFeed =
    patternsFromFeed.size() == tripsForPattern.keySet().size();

if (usePatternsFromFeed) {
    pattern.pattern_id =
        patternsFromFeed.get(patternsFromFeedIndex).pattern_id; // ← BY INDEX
}

When pattern counts match, file-loaded pattern IDs are assigned to derived patterns by array index position, not by content-based match. The two lists have incompatible orderings:

| List | Sort order |
| --- | --- |
| patternsFromFeed | pattern_id ascending |
| tripsForPattern (LinkedHashMultimap) | first trip occurrence in trips.txt |

These orderings are unrelated for any real-world feed. Every pattern_stop row ends up written with the pattern_id of a different pattern. Since the patterns table is not recreated when usePatternsFromFeed = true, it retains correct route/shape metadata — but the pattern_stops reference wrong pattern_ids. No SQL constraint catches this; the corruption is silent and semantic.

The code comment at the assignment site explicitly acknowledges the problem:

"There is no viable relationship between patterns that are loaded from a feed (patternsFromFeed) and patterns generated here."
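Because no constraint catches the mismatch, a consistency check along the following lines could flag an affected feed. This is only a sketch: `PatternStopConsistencyCheck` and both input maps are hypothetical stand-ins, and it assumes the caller has already read the stored pattern stops and, for each pattern_id, the stop sequence of one trip that references it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of gtfs-lib or datatools-server.
class PatternStopConsistencyCheck {

    /**
     * Returns the pattern_ids whose stored stop sequence differs from the stop
     * sequence of a representative trip assigned to that pattern.
     */
    static List<String> findMismatchedPatterns(
            Map<String, List<String>> storedStopsByPatternId,
            Map<String, List<String>> tripStopsByPatternId) {
        List<String> mismatched = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : storedStopsByPatternId.entrySet()) {
            List<String> tripStops = tripStopsByPatternId.get(entry.getKey());
            // Patterns without a representative trip are skipped, not reported.
            if (tripStops != null && !tripStops.equals(entry.getValue())) {
                mismatched.add(entry.getKey());
            }
        }
        return mismatched;
    }

    public static void main(String[] args) {
        Map<String, List<String>> stored = new LinkedHashMap<>();
        stored.put("1", Arrays.asList("s7", "s8", "s9")); // stops that belong to another pattern
        stored.put("2", Arrays.asList("s4", "s5"));

        Map<String, List<String>> fromTrips = new LinkedHashMap<>();
        fromTrips.put("1", Arrays.asList("s1", "s2", "s3"));
        fromTrips.put("2", Arrays.asList("s4", "s5"));

        System.out.println(findMismatchedPatterns(stored, fromTrips)); // prints [1]
    }
}
```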


Proof

For a feed with 2 routes (Route A, Route B), each with 2 patterns:

patternsFromFeed order (ascending pattern_id):

| Index | pattern_id | Route | Shape |
| --- | --- | --- | --- |
| 0 | "1" | Route A | shape X |
| 1 | "2" | Route A | shape Y |
| 2 | "3" | Route B | shape W |
| 3 | "5" | Route B | shape Z |

Note: gap at "4" — a previously deleted pattern left a non-contiguous sequence, a common real-world condition.

tripsForPattern order (trip file order):

| Index | Route | Shape |
| --- | --- | --- |
| 0 | Route B | shape Z |
| 1 | Route B | shape W |
| 2 | Route A | shape X |
| 3 | Route A | shape Y |

Positional assignment result — 0 out of 4 correct:

| pattern_id written | Stops stored | patterns table says |
| --- | --- | --- |
| "1" | Route B / shape Z stops | Route A / shape X |
| "2" | Route B / shape W stops | Route A / shape Y |
| "3" | Route A / shape X stops | Route B / shape W |
| "5" | Route A / shape Y stops | Route B / shape Z |

Both mismatches were verified programmatically against a real GTFS feed and confirmed in the live editor.
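The 0-out-of-4 result for this example can be reproduced with a small standalone simulation. It is a sketch only: `PositionalAssignmentDemo` is a hypothetical class, plain lists stand in for the gtfs-lib collections, and shapes are used as the identity of each derived pattern.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration; not a gtfs-lib class.
class PositionalAssignmentDemo {

    /**
     * Hands out file pattern_ids by list position and counts how many derived
     * patterns end up with the pattern_id that belongs to their own shape.
     */
    static int countCorrectByIndex(
            List<String> filePatternIds, List<String> fileShapes, List<String> derivedShapes) {
        int correct = 0;
        for (int i = 0; i < derivedShapes.size(); i++) {
            boolean matches = fileShapes.get(i).equals(derivedShapes.get(i));
            System.out.println("pattern_id " + filePatternIds.get(i)
                    + ": stores stops of shape " + derivedShapes.get(i)
                    + ", patterns table says shape " + fileShapes.get(i));
            if (matches) {
                correct++;
            }
        }
        return correct;
    }

    public static void main(String[] args) {
        // patternsFromFeed order: ascending pattern_id
        List<String> filePatternIds = Arrays.asList("1", "2", "3", "5");
        List<String> fileShapes = Arrays.asList("X", "Y", "W", "Z");
        // tripsForPattern order: first trip occurrence in trips.txt
        List<String> derivedShapes = Arrays.asList("Z", "W", "X", "Y");

        int correct = countCorrectByIndex(filePatternIds, fileShapes, derivedShapes);
        System.out.println(correct + " out of " + derivedShapes.size() + " correct");
        // last line printed: 0 out of 4 correct
    }
}
```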


Suggested Fix (in ibi-group/gtfs-lib)

Replace positional index assignment with a shape_id-keyed lookup in PatternFinder.createPatternObjects():

// Build map before the loop
Map<String, Pattern> filePatternsByShape = new HashMap<>();
for (Pattern p : patternsFromFeed) {
    filePatternsByShape.put(p.shape_id, p);
}

// Inside the loop — replace positional with content-keyed match
if (usePatternsFromFeed) {
    String shapeId = pattern.associatedShapes.isEmpty() ? null
        : pattern.associatedShapes.iterator().next();
    Pattern filePattern = filePatternsByShape.get(shapeId);
    if (filePattern != null) {
        pattern.pattern_id = filePattern.pattern_id;
        pattern.name = filePattern.name;
    } else {
        pattern.pattern_id = Integer.toString(nextPatternId++);
    }
}

shape_id is the natural stable key — it is written to both trips.txt and datatools_patterns.txt at export time, making it the correct join key between the two lists.
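To illustrate the effect of the keyed lookup on the four-pattern example from the Proof section, here is a minimal standalone sketch. `ShapeKeyedAssignmentDemo`, the map of shape_id to pattern_id, and the `nextPatternId` parameter are hypothetical simplifications, not gtfs-lib code. It also assumes each shape_id belongs to exactly one file pattern; if two file patterns shared a shape, the later `put` would overwrite the earlier one, so that case would need separate handling.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration; not a gtfs-lib class.
class ShapeKeyedAssignmentDemo {

    /**
     * Assigns each derived pattern (identified here by its shape_id) the
     * pattern_id of the file pattern with the same shape_id. Derived patterns
     * with no match receive a fresh id starting at nextPatternId.
     */
    static Map<String, String> assignByShape(
            Map<String, String> filePatternIdByShape,
            List<String> derivedShapes,
            int nextPatternId) {
        Map<String, String> assigned = new LinkedHashMap<>();
        for (String shapeId : derivedShapes) {
            String fileId = filePatternIdByShape.get(shapeId);
            assigned.put(shapeId, fileId != null ? fileId : Integer.toString(nextPatternId++));
        }
        return assigned;
    }

    public static void main(String[] args) {
        Map<String, String> filePatternIdByShape = new HashMap<>();
        filePatternIdByShape.put("X", "1");
        filePatternIdByShape.put("Y", "2");
        filePatternIdByShape.put("W", "3");
        filePatternIdByShape.put("Z", "5");

        // Derived patterns arrive in trip file order, unrelated to pattern_id order.
        List<String> derivedShapes = Arrays.asList("Z", "W", "X", "Y");

        System.out.println(assignByShape(filePatternIdByShape, derivedShapes, 6));
        // prints {Z=5, W=3, X=1, Y=2}
    }
}
```

With the lookup keyed on shape_id, the ordering of either list no longer affects which pattern_id a derived pattern receives.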


Related

  • Secondary issue also identified in gtfs-lib: shape_id is not part of TripPatternKey.equals(), so trips with the same stops but different shapes are merged into one pattern with non-deterministic shape assignment (associatedShapes.iterator().next() on a HashSet). Lower severity but worth addressing separately.
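One possible direction for the secondary issue is to make the shape part of the key's identity. The class below is a hypothetical illustration of that idea, not the actual TripPatternKey: the real class may carry other fields, and whether splitting patterns by shape is desirable is a design decision for the gtfs-lib maintainers.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical key for illustration; not the real gtfs-lib TripPatternKey.
class ShapeAwarePatternKey {
    final List<String> stopIds;
    final String shapeId; // may be null for trips without a shape

    ShapeAwarePatternKey(List<String> stopIds, String shapeId) {
        this.stopIds = stopIds;
        this.shapeId = shapeId;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof ShapeAwarePatternKey)) return false;
        ShapeAwarePatternKey that = (ShapeAwarePatternKey) other;
        // Same stops but different shapes are treated as different patterns.
        return stopIds.equals(that.stopIds) && Objects.equals(shapeId, that.shapeId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(stopIds, shapeId);
    }

    public static void main(String[] args) {
        ShapeAwarePatternKey a = new ShapeAwarePatternKey(Arrays.asList("s1", "s2"), "X");
        ShapeAwarePatternKey b = new ShapeAwarePatternKey(Arrays.asList("s1", "s2"), "Y");
        ShapeAwarePatternKey c = new ShapeAwarePatternKey(Arrays.asList("s1", "s2"), "X");
        System.out.println(a.equals(b)); // prints false
        System.out.println(a.equals(c)); // prints true
    }
}
```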
