Train stride tradeoffs #27

@alecgunny

Data sampling at train time depends on a kernel_stride parameter that indicates how much time (in seconds) to place between kernels that get sampled from the training timeseries. This implicitly sets:

  • The set of initial timestamps of kernels that will be sampled from the timeseries
  • The number of kernels that constitute an epoch
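To make the two implications concrete, here is a minimal sketch of a strided sampling grid. The function name `kernel_starts` and the `duration`, `kernel_length`, and `sample_rate` values are assumptions chosen for illustration; this is not DeepClean's actual data loader.

```python
import numpy as np


def kernel_starts(duration, kernel_length, kernel_stride, sample_rate):
    """Start indices (in samples) of kernels placed on a strided grid.

    Hypothetical helper: all times are in seconds, sample_rate is in Hz.
    """
    num_samples = int(round(duration * sample_rate))
    kernel_size = int(round(kernel_length * sample_rate))
    # A stride can never be finer than one sample
    stride_size = max(int(round(kernel_stride * sample_rate)), 1)

    last_start = num_samples - kernel_size
    if last_start < 0:
        # The timeseries is shorter than a single kernel
        return np.array([], dtype=int)
    return np.arange(0, last_start + 1, stride_size)


# Illustrative values only
starts = kernel_starts(
    duration=1024, kernel_length=8, kernel_stride=0.25, sample_rate=4096
)
kernels_per_epoch = len(starts)
print(f"first start times (s): {starts[:4] / 4096}")
print(f"kernels per epoch: {kernels_per_epoch}")
```

Under these assumptions, the grid of start indices is the only set of views the network ever sees, and its length is what defines an epoch.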

Do we actually need either of these?

  • The first point just limits the number of different views of the data the network gets to see, which would seem to encourage overfitting.
  • The second point is largely arbitrary: an epoch for us is just an opportunity to do some validation. We could instead sample uniformly from the timeseries for each batch and validate at some pre-determined interval.
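The alternative described above might look roughly like the following sketch, which draws kernel start indices uniformly at random for every batch and validates on a fixed step interval instead of at epoch boundaries. The names `sample_batch` and `validate_every`, the toy timeseries, and all sizes are hypothetical.

```python
import numpy as np


def sample_batch(timeseries, kernel_size, batch_size, rng):
    """Draw a batch of kernels with uniformly random start indices."""
    max_start = len(timeseries) - kernel_size
    # The upper bound of rng.integers is exclusive, hence the +1
    starts = rng.integers(0, max_start + 1, size=batch_size)
    # Broadcast to a (batch_size, kernel_size) array of sample indices
    idx = starts[:, None] + np.arange(kernel_size)[None, :]
    return timeseries[idx]


rng = np.random.default_rng(0)
x = np.arange(1000, dtype=np.float32)  # stand-in for a training timeseries

validate_every = 5  # assumed validation interval, in update steps
validation_steps = []
for step in range(1, 21):
    batch = sample_batch(x, kernel_size=64, batch_size=8, rng=rng)
    # ... a training update on `batch` would go here ...
    if step % validate_every == 0:
        # ... a validation pass would go here ...
        validation_steps.append(step)
```

With this scheme there is no notion of an epoch at all: every start index is equally likely at every step, and validation frequency becomes an independent knob.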

The counterpoint is that if we sampled uniformly from the timeseries (with an effective stride of 1 / sample_rate), all of our kernels would overlap enormously with other kernels in the dataset, which might decrease the efficiency with which we feed data to the network.

Think of it this way. If we set kernel_stride = kernel_length, then every kernel would present entirely new information to the network, but we would have far fewer samples to train on. If kernel_stride = 1 / sample_rate, most kernels present almost no new information, but we have a lot of them. We could even see this second case as a sort of real-time augmentation of a less-frequently sampled set of kernels.
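As a rough way to put numbers on the two extremes, the sketch below counts the available kernels and the fraction of samples that adjacent kernels share for a given stride. The helper `stride_summary` and the example values (a 1024 s timeseries, 8 s kernels, 4096 Hz) are assumptions for illustration, not figures from this project.

```python
def stride_summary(duration, kernel_length, kernel_stride, sample_rate):
    """Return (number of kernels, overlap fraction between adjacent kernels).

    Hypothetical helper: works in integer sample counts to avoid
    floating-point rounding in the floor division.
    """
    num_samples = round(duration * sample_rate)
    kernel_size = round(kernel_length * sample_rate)
    stride_size = max(round(kernel_stride * sample_rate), 1)

    num_kernels = (num_samples - kernel_size) // stride_size + 1
    # Adjacent kernels share (kernel_size - stride_size) samples
    overlap = max(0.0, 1.0 - stride_size / kernel_size)
    return num_kernels, overlap


sample_rate = 4096
for stride in (8, 1, 1 / sample_rate):
    n, overlap = stride_summary(1024, 8, stride, sample_rate)
    print(f"kernel_stride={stride:.6f}s  kernels={n}  overlap={overlap:.6f}")
```

The count grows roughly as 1 / kernel_stride while the overlap fraction approaches 1, which is the tradeoff in question: many highly redundant kernels versus few independent ones.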

I have to imagine there's a tradeoff here between the number of updates required for convergence and the converged validation loss, but I don't really know what it is. Measuring this and explaining it more fully would be really valuable for all GW DL applications that have to build kernels from these longer timeseries.

Labels

  • data: research topic about data used to train DeepClean
  • research topic: question about DeepClean optimization and interpretation
