Summary
When loading a published GTFS version into the GTFS editor ("Load version"), every pattern in the feed receives the stop sequence of a different pattern, crossing route boundaries. Shapes display correctly; only stop sequences are affected. The corruption is silent — no error or warning is surfaced. The bug is deterministic and reproduces on every load of any multi-route feed.
Note: The fix does not belong in this repo. The root cause is in ibi-group/gtfs-lib at PatternFinder.createPatternObjects(). Filing here because ibi-group/gtfs-lib does not have issues enabled. Resolution requires a patch to gtfs-lib and a dependency bump in datatools-server.
Steps to Reproduce
- Use any published feed with 2+ routes, each with 2+ trip patterns
- Click Load version on the published version and confirm
- Open any route in the editor → Trip patterns tab
- Inspect the stop sequence of any pattern
Expected: stops match the pattern's shape and the published stop_times
Actual: stops belong to a different pattern, typically from a different route
Root Cause (in ibi-group/gtfs-lib)
PatternFinder.createPatternObjects() includes the following logic:
    boolean usePatternsFromFeed =
        patternsFromFeed.size() == tripsForPattern.keySet().size();
    if (usePatternsFromFeed) {
        pattern.pattern_id =
            patternsFromFeed.get(patternsFromFeedIndex).pattern_id; // ← BY INDEX
    }
When pattern counts match, file-loaded pattern IDs are assigned to derived patterns by array index position, not by content-based match. The two lists have incompatible orderings:
| List | Sort order |
| --- | --- |
| patternsFromFeed | pattern_id ascending |
| tripsForPattern (LinkedHashMultimap) | first trip occurrence in trips.txt |
These orderings are unrelated for any real-world feed. Every pattern_stop row ends up written with the pattern_id of a different pattern. Since the patterns table is not recreated when usePatternsFromFeed = true, it retains correct route/shape metadata — but the pattern_stops reference wrong pattern_ids. No SQL constraint catches this; the corruption is silent and semantic.
The code comment at the assignment site explicitly acknowledges the problem:
"There is no viable relationship between patterns that are loaded from a feed (patternsFromFeed) and patterns generated here."
Proof
For a feed with 2 routes (Route A, Route B), each with 2 patterns:
patternsFromFeed order (ascending pattern_id):
| Index | pattern_id | Route | Shape |
| --- | --- | --- | --- |
| 0 | "1" | Route A | shape X |
| 1 | "2" | Route A | shape Y |
| 2 | "3" | Route B | shape W |
| 3 | "5" | Route B | shape Z |
Note: gap at "4" — a previously deleted pattern left a non-contiguous sequence, a common real-world condition.
tripsForPattern order (trip file order):
| Index | Route | Shape |
| --- | --- | --- |
| 0 | Route B | shape Z |
| 1 | Route B | shape W |
| 2 | Route A | shape X |
| 3 | Route A | shape Y |
Positional assignment result — 0 out of 4 correct:
| pattern_id written | Stops stored | patterns table says |
| --- | --- | --- |
| "1" | Route B / shape Z stops | Route A / shape X |
| "2" | Route B / shape W stops | Route A / shape Y |
| "3" | Route A / shape X stops | Route B / shape W |
| "5" | Route A / shape Y stops | Route B / shape Z |
Both mismatches verified programmatically against a real GTFS feed and confirmed in the live editor.
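The positional assignment in the tables above can be replayed with a small standalone sketch. This is not gtfs-lib code: `PositionalAssignmentDemo`, `PatternRow`, and the helper methods are hypothetical stand-ins, and the data is the illustrative 2-route feed from this section rather than a real feed.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy replay of the positional ID assignment; all names here are hypothetical. */
final class PositionalAssignmentDemo {

    /** Minimal stand-in for a pattern: an ID plus the route/shape its stops belong to. */
    static final class PatternRow {
        final String patternId; // null until an ID has been assigned
        final String route;
        final String shape;

        PatternRow(String patternId, String route, String shape) {
            this.patternId = patternId;
            this.route = route;
            this.shape = shape;
        }
    }

    /** Patterns as loaded from the feed, sorted by ascending pattern_id. */
    static List<PatternRow> feedOrder() {
        return List.of(
            new PatternRow("1", "Route A", "shape X"),
            new PatternRow("2", "Route A", "shape Y"),
            new PatternRow("3", "Route B", "shape W"),
            new PatternRow("5", "Route B", "shape Z"));
    }

    /** Derived patterns, in order of first trip occurrence; no IDs yet. */
    static List<PatternRow> tripOrder() {
        return List.of(
            new PatternRow(null, "Route B", "shape Z"),
            new PatternRow(null, "Route B", "shape W"),
            new PatternRow(null, "Route A", "shape X"),
            new PatternRow(null, "Route A", "shape Y"));
    }

    /** Copies the ID at position i of the feed list onto the derived pattern at position i. */
    static List<PatternRow> assignByIndex(List<PatternRow> fromFeed, List<PatternRow> derived) {
        List<PatternRow> result = new ArrayList<>();
        for (int i = 0; i < derived.size(); i++) {
            PatternRow d = derived.get(i);
            result.add(new PatternRow(fromFeed.get(i).patternId, d.route, d.shape));
        }
        return result;
    }

    /** Counts assigned patterns whose ID refers to the same shape in the feed list. */
    static long countCorrect(List<PatternRow> fromFeed, List<PatternRow> assigned) {
        return assigned.stream()
            .filter(a -> fromFeed.stream().anyMatch(
                f -> f.patternId.equals(a.patternId) && f.shape.equals(a.shape)))
            .count();
    }

    public static void main(String[] args) {
        List<PatternRow> feed = feedOrder();
        List<PatternRow> assigned = assignByIndex(feed, tripOrder());
        for (PatternRow a : assigned) {
            System.out.println(a.patternId + " -> " + a.route + " / " + a.shape + " stops");
        }
        System.out.println(countCorrect(feed, assigned) + " out of " + assigned.size() + " correct");
    }
}
```

With the example data, every ID lands on stops from the other route, so the correct count is 0.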
Suggested Fix (in ibi-group/gtfs-lib)
Replace positional index assignment with a shape_id-keyed lookup in PatternFinder.createPatternObjects():
    // Build map before the loop
    Map<String, Pattern> filePatternsByShape = new HashMap<>();
    for (Pattern p : patternsFromFeed) {
        filePatternsByShape.put(p.shape_id, p);
    }

    // Inside the loop: replace positional with content-keyed match
    if (usePatternsFromFeed) {
        String shapeId = pattern.associatedShapes.isEmpty() ? null
            : pattern.associatedShapes.iterator().next();
        Pattern filePattern = filePatternsByShape.get(shapeId);
        if (filePattern != null) {
            pattern.pattern_id = filePattern.pattern_id;
            pattern.name = filePattern.name;
        } else {
            pattern.pattern_id = Integer.toString(nextPatternId++);
        }
    }
shape_id is the natural stable key — it is written to both trips.txt and datatools_patterns.txt at export time, making it the correct join key between the two lists.
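As a rough illustration of the shape_id-keyed matching, here is a self-contained sketch run against the example feed from the Proof section. It is not a patch for gtfs-lib: `ShapeKeyedAssignmentDemo`, its simplified `Pattern` class, and the `firstFreeId` fallback parameter are assumptions made for this example, and it assumes each shape_id maps to at most one file-loaded pattern.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch of content-keyed ID assignment; class and field names are hypothetical. */
final class ShapeKeyedAssignmentDemo {

    /** Simplified pattern holding only the fields this sketch needs. */
    static final class Pattern {
        String patternId;
        String name;
        final String shapeId;

        Pattern(String patternId, String name, String shapeId) {
            this.patternId = patternId;
            this.name = name;
            this.shapeId = shapeId;
        }
    }

    /** Indexes file-loaded patterns by shape_id; the first pattern seen for a shape wins. */
    static Map<String, Pattern> indexByShape(List<Pattern> fromFeed) {
        Map<String, Pattern> byShape = new HashMap<>();
        for (Pattern p : fromFeed) {
            byShape.putIfAbsent(p.shapeId, p);
        }
        return byShape;
    }

    /**
     * Copies ID and name from the file-loaded pattern with the same shape_id.
     * Derived patterns without a match get fresh IDs starting at firstFreeId,
     * which the caller must choose so that it does not collide with existing IDs.
     */
    static void assignIds(List<Pattern> fromFeed, List<Pattern> derived, int firstFreeId) {
        Map<String, Pattern> byShape = indexByShape(fromFeed);
        int nextPatternId = firstFreeId;
        for (Pattern d : derived) {
            Pattern match = byShape.get(d.shapeId);
            if (match != null) {
                d.patternId = match.patternId;
                d.name = match.name;
            } else {
                d.patternId = Integer.toString(nextPatternId++);
            }
        }
    }

    /** Runs the assignment on the example feed plus one derived pattern with an unknown shape. */
    static List<Pattern> runExample() {
        List<Pattern> fromFeed = List.of(
            new Pattern("1", "A via X", "shape X"),
            new Pattern("2", "A via Y", "shape Y"),
            new Pattern("3", "B via W", "shape W"),
            new Pattern("5", "B via Z", "shape Z"));
        List<Pattern> derived = new ArrayList<>(List.of(
            new Pattern(null, null, "shape Z"),
            new Pattern(null, null, "shape W"),
            new Pattern(null, null, "shape X"),
            new Pattern(null, null, "shape Y"),
            new Pattern(null, null, "shape V"))); // no counterpart in the feed
        assignIds(fromFeed, derived, 6);
        return derived;
    }

    public static void main(String[] args) {
        for (Pattern p : runExample()) {
            System.out.println(p.shapeId + " -> pattern_id " + p.patternId);
        }
    }
}
```

In this sketch the trip-order list receives IDs "5", "3", "1", "2", matching the shapes recorded in the feed, and the unmatched pattern falls back to "6". A real fix would also need to decide what to do when several file-loaded patterns share one shape_id.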
Related
- Secondary issue also identified in gtfs-lib: shape_id is not part of TripPatternKey.equals(), so trips with the same stops but different shapes are merged into one pattern with non-deterministic shape assignment (associatedShapes.iterator().next() on a HashSet). Lower severity but worth addressing separately.
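The merging effect of the secondary issue can be shown with a toy grouping key. The sketch below does not reproduce the real TripPatternKey; `PatternKeyDemo`, `Trip`, and `countPatterns` are hypothetical, and the key is modelled as a plain list so that the only difference between the two runs is whether the shape ID takes part in equality.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of pattern grouping with and without shape_id in the key. */
final class PatternKeyDemo {

    /** Hypothetical trip: an ordered stop list plus the shape it follows. */
    static final class Trip {
        final List<String> stops;
        final String shapeId;

        Trip(List<String> stops, String shapeId) {
            this.stops = stops;
            this.shapeId = shapeId;
        }
    }

    /** Counts distinct grouping keys; the shape ID is part of the key only when requested. */
    static int countPatterns(List<Trip> trips, boolean includeShapeInKey) {
        Set<List<Object>> keys = new HashSet<>();
        for (Trip t : trips) {
            List<Object> key = new ArrayList<>();
            key.add(t.stops);
            if (includeShapeInKey) {
                key.add(t.shapeId);
            }
            keys.add(key);
        }
        return keys.size();
    }

    /** Two trips share a stop sequence but follow different shapes; a third differs in stops. */
    static List<Trip> exampleTrips() {
        return List.of(
            new Trip(List.of("s1", "s2", "s3"), "shape X"),
            new Trip(List.of("s1", "s2", "s3"), "shape Y"),
            new Trip(List.of("s1", "s4"), "shape X"));
    }

    public static void main(String[] args) {
        List<Trip> trips = exampleTrips();
        System.out.println("stops-only key: " + countPatterns(trips, false) + " patterns");
        System.out.println("stops+shape key: " + countPatterns(trips, true) + " patterns");
    }
}
```

With a stops-only key the first two trips collapse into one pattern (2 patterns in total), while including the shape ID keeps them apart (3 patterns). Which shape the merged pattern ends up with then depends on set iteration order, which HashSet does not guarantee.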