In the way the master receives the "update" from the workers, the BN running mean (a running statistic, not a trained parameter) is treated like any other weight: the master interprets the diff as a gradient and applies its optimizer step to it, instead of applying something dedicated to a value that is not updated by gradient descent.
The gamma/beta weights of the BN are fine in this respect, since those are genuinely gradient-trained.
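To illustrate the distinction (a minimal sketch, not mpi-learn's actual API — `master_update` and its arguments are hypothetical): gradient-trained parameters like gamma/beta can take an optimizer-style step, but an EMA statistic like the running mean should just absorb the workers' diff directly.

```python
import numpy as np

def master_update(weights, diffs, lr, is_trainable):
    """Hypothetical master-side update distinguishing parameter kinds.

    Treating every diff as a gradient (i.e., taking a gradient-descent
    step on it) is wrong for batch-norm running statistics, which are
    EMA state rather than gradient-trained parameters.
    """
    new_weights = []
    for w, d, trainable in zip(weights, diffs, is_trainable):
        if trainable:
            # gamma/beta: a gradient-style step is appropriate
            new_weights.append(w - lr * d)
        else:
            # running mean/var: the diff already encodes the workers'
            # EMA movement, so apply it as-is, without lr scaling
            new_weights.append(w + d)
    return new_weights

# toy BN layer state: [gamma, beta, running_mean]
weights = [np.ones(3), np.zeros(3), np.full(3, 0.5)]
diffs = [np.full(3, 0.1), np.full(3, 0.1), np.full(3, 0.2)]
updated = master_update(weights, diffs, lr=0.01,
                        is_trainable=[True, True, False])
```

With these numbers the running mean moves by the full diff (0.5 → 0.7), while gamma only takes the small lr-scaled step (1.0 → 0.999); a master that treated all three uniformly as gradients would barely move the running mean at all.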
I fear that this is interacting badly with svalleco#3.
@duanders if you have any insight into how to modify the mpi-learn optimizers to take this into account, please do tell.