Bug: Mods duplicating in Firestore — URL format mismatch in icarus-mod-tools sync pipeline
Summary
Mods are being duplicated in the Firestore mods collection (and therefore on projectdaedalus.app/mods and in the Mod Manager). Out of 437 total mod documents, 18 mods are duplicated (15 exact same-author duplicates, 3 same-name-different-author). The root cause is that the imt sync pipeline creates two separate Firestore documents for the same mod because GitHub URLs appear in two different-but-equivalent string formats, and no URL normalization is applied anywhere in the pipeline.
Affected Mods (18 duplicates found)
| Mod Name | Author | DocID (Copy A — updated) | DocID (Copy B — orphaned) |
|---|---|---|---|
| Ammo Crafting Overhaul | AgentKush | C36TS9hnyLTpdvZlQtp9 | MRAJGEkocxjFkLlDzjuQ |
| Armor Set Bonuses Enhanced | AgentKush | PVroMlKJxc0B0hifUhwP | bdCEYK62gdl5EQyssNZW |
| Creature Difficulty Scaling | AgentKush | hZ4KC1AxNU3trvnoO9tt | ywxphBQ0u9mM78Uzsdkg |
| Creature Drop Enhancer | AgentKush | B486JpYYGqaVGV9azcid | r4hNce8YNsuhxMtyXygz |
| Exotic Economy Overhaul | AgentKush | ZoAqKslCgHTh05gmFN9L | hMWOOZjijHNBV89vbHdG |
| Extended Spoil Timers | AgentKush | HsF6nRiXT1OsnMKbNSrP | qLnloK5jAz1AATpiiYGy |
| Faster Crafting | AgentKush | 49aoTZnWS1aGfCsXyHpp | VYEMpqUxCwpBfSgIFZoV |
| Fish and Farming Boost | AgentKush | QjjJOWjmMCQgYGPfP2Rr | ZcIRrbgz7jnwxBT4h9Vc |
| Hardcore Rebalance Pack | AgentKush | TYFO3Oywk2xIb2dX2DZs | qbP6EorPnjDqDhzpk0dv |
| Performance Optimizer | AgentKush | eQh97ZFKg4OTkWme8W3v | xo3yyL0ez61tZvnzS9cj |
| Stack Size Overhaul | AgentKush | 6gNWsTqVbmmjXZ91tZD8 | t7pRO3RC2sq4wQa34GQA |
| Tier Upgrade Forge | AgentKush | cXi3IIm7NF0FF3YodSvu | gyw13Wg07HCFPvyIUHPK |
| Trap and Defense Expansion | AgentKush | Y0dtJQlZiWl1O9q7z7RC | bDZEOAKJ8dmMDIJx4h7O |
| Jimk Fixed Weather Vane | Jimk72 | kVFxZDaWtEJPnnFqi8gN | z6pIfflkYpFjfshtxqA9 |
| Extraction 10 Seconds | Jimk72 / TheOrangeFloof | LRmG4qwVxMyN6aDSdtlt | SR9N5m9olseyhxcEwp1a |
| Extraction 5 Seconds | TheOrangeFloof / Jimk72 | BuM5jiC6WBbdSojVYkoE | khzIZhD0GpRxUg8zvq5c |
| Larkwell Care Package | Begginfokillz / Eric | So6FTF4m3gdV0m6XaFPU | c5y23huHKkdOgCpRVkW4 |
| Zay - Extended Frozen Ore | Zayon | tuYdRhOkJXoM2fe1wtbn | WU6IA6olDg2xIVNE0EAR |
Root Cause
The URL format mismatch
Modders write URLs in modinfo.json using the human-friendly GitHub format:
https://github.com/AgentKush/Icarus-mods/raw/main/Faster_Crafting/Faster_Crafting.EXMODZ
The GitHub Contents API (via Octokit) returns download_url values in the raw format:
https://raw.githubusercontent.com/AgentKush/Icarus-mods/main/Faster_Crafting/Faster_Crafting.EXMODZ
These point to the exact same file (GitHub 302-redirects one to the other), but they're different strings. There is no URL normalization anywhere in icarus-mod-tools to canonicalize them before comparison or storage.
How this creates duplicates
During a single imt sync run, the meta/modinfo/list collection ends up with both URL formats pointing to the same modinfo.json:
- https://raw.githubusercontent.com/AgentKush/Icarus-mods/main/modinfo.json (from GitHub API scan)
- https://github.com/AgentKush/Icarus-mods/raw/main/modinfo.json (possibly from manual imt add or a secondary discovery path)
When imt sync mods runs, it fetches the modinfo.json from both URLs, getting identical JSON content. Both produce Modinfo objects with the same name/author. The first creates a Firestore document. The second should find it via sync.find(list) and update — but instead creates a new document.
Why sync.find() misses the existing document
The most likely explanation is Firestore eventual consistency. The duplicate pairs are created within ~5 seconds of each other:
'Tier Upgrade Forge':
Copy A created: 2026-02-22T05:00:44.459953Z
Copy B created: 2026-02-22T05:00:49.731891Z (5.3 seconds later)
The batch processes all mods from the first modinfo.json URL, then immediately processes the same mods from the second URL. The find query runs before Firestore has fully indexed the documents from the first batch, returns nil, and a duplicate is created.
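The gap between the two creation times can be checked directly. This is a minimal sketch that only parses the two timestamps quoted above; the variable names are illustrative.

```ruby
require 'time'

# Creation timestamps quoted above for 'Tier Upgrade Forge'.
copy_a = Time.iso8601('2026-02-22T05:00:44.459953Z')
copy_b = Time.iso8601('2026-02-22T05:00:49.731891Z')

# Subtracting two Time objects yields the difference in seconds as a Float.
gap = copy_b - copy_a
puts gap.round(1) # 5.3
```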
Evidence: Copy B entries are orphaned
Copy B entries have createTime == updateTime — they were created once and never updated again by subsequent sync runs:
Copy A (updated regularly):
Created: 2026-02-22T05:00:44Z
Updated: 2026-02-25T01:00:52Z ← different, gets regular updates
Copy B (orphaned):
Created: 2026-02-22T05:00:49Z
Updated: 2026-02-22T05:00:49Z ← same as create, never touched again
This is consistent with Copy B being created by accident and never matched again by subsequent sync runs, likely because those runs only process one URL format from meta/modinfo/list.
Additional data anomaly: "Eye Colors Expanded!" has wrong download URLs
The mod Eye Colors Expanded! (DocID: 55F4mIY6qi5RYsAY278Y) has file URLs that actually point to More Drop Ship Slots files:
exmodz: .../More%20Drop%20Ship%20Slots/More%20Drop%20Ship%20Slots.EXMODZ
pak: .../More%20Drop%20Ship%20Slots/More%20Drop%20Ship%20Slots_P.pak
This is probably a modder error in their modinfo.json but worth noting.
Suggested Fix
1. Add URL normalization (primary fix)
Add a helper method to normalize all GitHub URLs to a single canonical format:
    def normalize_github_url(url)
      return url if url.nil? || url.empty?

      # github.com/OWNER/REPO/raw/BRANCH/path → raw.githubusercontent.com/OWNER/REPO/BRANCH/path
      url.sub(
        %r{https://github\.com/([^/]+/[^/]+)/raw/},
        'https://raw.githubusercontent.com/\1/'
      ).sub(
        %r{https://github\.com/([^/]+/[^/]+)/blob/},
        'https://raw.githubusercontent.com/\1/'
      )
    end
Apply this when:
- Storing download_url values in meta/modinfo/list (during imt sync modinfo)
- Parsing URLs from modinfo.json content (during imt sync mods)
- Storing file URLs in the mods collection documents
2. Deduplicate meta/modinfo/list (secondary fix)
Before running imt sync mods, deduplicate the modinfo URL list by normalized URL so the same modinfo.json isn't fetched twice.
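A minimal sketch of that deduplication, assuming the list is available as a plain array of URL strings. `dedupe_modinfo_urls` is a hypothetical name, not an existing method in icarus-mod-tools, and the normalizer is repeated only so the snippet runs on its own.

```ruby
# Same substitution as the normalize_github_url helper proposed in this issue.
def normalize_github_url(url)
  return url if url.nil? || url.empty?

  url.sub(%r{https://github\.com/([^/]+/[^/]+)/raw/}, 'https://raw.githubusercontent.com/\1/')
     .sub(%r{https://github\.com/([^/]+/[^/]+)/blob/}, 'https://raw.githubusercontent.com/\1/')
end

# Hypothetical helper: normalize every URL, then drop repeats.
def dedupe_modinfo_urls(urls)
  urls.compact.map { |u| normalize_github_url(u) }.uniq
end

list = [
  'https://raw.githubusercontent.com/AgentKush/Icarus-mods/main/modinfo.json',
  'https://github.com/AgentKush/Icarus-mods/raw/main/modinfo.json'
]

puts dedupe_modinfo_urls(list)
# prints one line: https://raw.githubusercontent.com/AgentKush/Icarus-mods/main/modinfo.json
```

Storing the normalized form (rather than only comparing by it) keeps later runs from reintroducing the second spelling.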
3. Add a pre-create check with retry (belt-and-suspenders)
Before creating a new document in sync_list, add a brief delay or re-query to handle Firestore eventual consistency:
    doc_id = sync.find(list)
    if doc_id.nil?
      sleep(0.5)               # Brief pause for Firestore consistency
      doc_id = sync.find(list) # Retry
    end
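The single retry above could be generalized to a bounded number of attempts. The sketch below is an assumption-laden illustration: `find_with_retry` is a hypothetical helper, and the lambda only stands in for `sync.find(list)` so the snippet runs without Firestore.

```ruby
# Hypothetical helper: call the block until it returns a non-nil value or
# the attempts run out, sleeping between tries. Returns nil if nothing is found.
def find_with_retry(attempts: 3, delay: 0.5)
  attempts.times do |i|
    result = yield
    return result unless result.nil?

    sleep(delay) if i < attempts - 1 # no pause after the final attempt
  end
  nil
end

# Stand-in for sync.find(list): returns nil twice, then a document ID.
calls = 0
fake_find = lambda do
  calls += 1
  calls < 3 ? nil : 'existing-doc-id'
end

doc_id = find_with_retry(attempts: 3, delay: 0.01) { fake_find.call }
puts doc_id # existing-doc-id
```

A retry only narrows the window for a race. If the lookup misses for another reason, such as the two URL spellings producing different lookup keys, retrying will not help, which is why normalization remains the primary fix.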
4. Cleanup existing duplicates
The 18 "Copy B" orphaned documents should be deleted from Firestore. These are the entries where createTime == updateTime from the table above.
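One way to list cleanup candidates before deleting anything, sketched over in-memory records. The field names (`:id`, `:name`, `:create_time`, `:update_time`) and the sample rows are assumptions for illustration, not the actual Firestore schema or real documents.

```ruby
# Hypothetical helper: among documents that share a name, return the ones
# whose create and update timestamps are identical (never touched again).
def orphan_candidates(docs)
  docs.group_by { |d| d[:name] }
      .select { |_name, group| group.size > 1 }
      .flat_map { |_name, group| group.select { |d| d[:create_time] == d[:update_time] } }
end

docs = [
  { id: 'copy-a', name: 'Example Mod',
    create_time: '2026-02-22T05:00:44Z', update_time: '2026-02-25T01:00:52Z' },
  { id: 'copy-b', name: 'Example Mod',
    create_time: '2026-02-22T05:00:49Z', update_time: '2026-02-22T05:00:49Z' },
  # A unique mod that was never updated is not a duplicate and is not flagged.
  { id: 'unique', name: 'Another Mod',
    create_time: '2026-02-22T05:01:00Z', update_time: '2026-02-22T05:01:00Z' }
]

puts orphan_candidates(docs).map { |d| d[:id] }
# prints: copy-b
```

Grouping by name alone also surfaces the same-name-different-author pairs, so those results are worth reviewing by hand before deletion.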
How to Reproduce
- Have a modinfo.json with URLs using github.com/.../raw/... format
- Ensure the repo is in the tracked repos list
- Run imt sync modinfo, which stores the modinfo.json URL in raw.githubusercontent.com format (from GitHub API)
- If meta/modinfo/list also contains the github.com/.../raw/... format URL (from manual add or secondary discovery), both are stored
- Run imt sync mods: both URLs get fetched, producing duplicate Modinfo objects
- Rapid-fire Firestore writes bypass the find deduplication due to eventual consistency
Environment
- Icarus-Mod-Tools gem v2.5.x
- Google Cloud Firestore (project: projectdaedalus-fb09f)
- Mod Manager: IcarusModManagerPATCH241
- Website: projectdaedalus.app/mods