This repository was archived by the owner on Jun 2, 2023. It is now read-only.

Description
When training the RGCN in PyTorch, if `shuffle=True` then the reaches get mixed up during training and no longer maintain the relationships encoded in the adjacency matrix. `shuffle` is `False` by default, but it's an easy thing to overlook. @jsadler2, @jdiaz4302, @jds485, not sure if any of you are using River-dl RGCN workflows, but I wanted to give you a heads up if so.
Not sure what the best way to safeguard against this is. Right now the RGCN treats each reach time series as a training instance. I think it's more accurate to treat an entire sequence for the entire network as a training instance, so that shuffling reorders the sequences the model sees for the whole network rather than mixing up the reaches themselves. Basically, that means going from an input shape of [n reaches, sequence length, n features] to [batch size, n reaches, sequence length, n features]. Does that make sense? Any hot takes?
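To make the idea concrete, here's a minimal NumPy sketch (dimension names and sizes are hypothetical, not from river-dl) contrasting the two layouts: in the first, shuffling the batch axis mixes reaches; in the second, each training instance is a full-network sequence, so shuffling only reorders sequences and every instance stays aligned with the adjacency matrix.

```python
import numpy as np

# Hypothetical dimensions for illustration only.
n_reaches, n_sequences, seq_len, n_features = 5, 8, 10, 3

# Current layout: each reach time series is its own training instance,
# shape [n_reaches * n_sequences, seq_len, n_features]. Shuffling the
# first axis mixes reaches across instances, breaking the correspondence
# with the adjacency matrix.
per_reach = np.arange(
    n_reaches * n_sequences * seq_len * n_features, dtype=float
).reshape(n_reaches * n_sequences, seq_len, n_features)

# Proposed layout: one training instance = one sequence for the WHOLE
# network, shape [n_sequences, n_reaches, seq_len, n_features]. This
# assumes per_reach is laid out reach-major, as constructed above.
network_instances = (
    per_reach
    .reshape(n_reaches, n_sequences, seq_len, n_features)
    .transpose(1, 0, 2, 3)
)

# Shuffling the batch axis now permutes sequences, not reaches: every
# instance still contains all reaches in their original order.
rng = np.random.default_rng(0)
order = rng.permutation(n_sequences)
shuffled = network_instances[order]

print(shuffled.shape)  # (8, 5, 10, 3)
```

With this layout, a PyTorch `DataLoader` with `shuffle=True` would permute only the leading batch axis, leaving the reach dimension intact inside each instance.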