Conversation

@Chris-Nicholls

From the dropout paper http://www.cs.toronto.edu/~rsalakhu/papers/srivastava14a.pdf :

If a unit is retained with probability p during training, the outgoing weights of that unit are multiplied by p at test time

The test activations should be scaled by (1-drop_prob), not drop_prob.
For example, if drop_prob is 0, this layer should have no effect, so activations should be scaled by 1.
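
A minimal sketch of the intended behaviour, in NumPy. The function name, arguments, and structure here are illustrative only, not this repository's actual API; they just show the scaling the paper describes (keep probability p = 1 - drop_prob applied at test time):

```python
import numpy as np

def dropout_forward(x, drop_prob, train=True, rng=None):
    """Dropout with the scaling from the Srivastava et al. paper.

    keep_prob = 1 - drop_prob is the retention probability p.
    At test time activations are scaled by keep_prob, so drop_prob = 0
    leaves the input unchanged (scale factor 1).
    """
    rng = rng or np.random.default_rng()
    keep_prob = 1.0 - drop_prob
    if train:
        # Keep each unit with probability keep_prob, zero it otherwise.
        mask = rng.random(x.shape) < keep_prob
        return x * mask
    # Test time: scale by keep_prob (1 - drop_prob), NOT by drop_prob.
    return x * keep_prob
```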

@ratajczak

Hi Chris, it looks like a duplicate of #61.

@Chris-Nicholls
Author

Yup, you're right. Looks like this isn't being maintained anyway.
