@Kuratius Kuratius commented Jan 5, 2023

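The gradient calculation is missing the derivative of the sigmoid function for the output layer. In addition, the weights of the hidden layer were updated before the update for the input-layer weights was computed, so the input-layer update was calculated from the already-modified hidden-layer weights instead of the ones used in the forward pass. Neither is formally correct. I imagine the latter probably works in a lot of cases if you take small enough steps. The former probably only works because the sigmoid function is strictly increasing: its derivative is always positive, so leaving it out changes the size of each update but not its sign.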
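To make the two points concrete, here is a minimal sketch of one training step that handles both. It is not the code from this pull request: it assumes a tiny network with one hidden layer, a single sigmoid output, no biases, and a squared-error loss, and the names `train_step`, `loss`, `w_hidden`, and `w_out` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Return (hidden activations, output) for a one-hidden-layer sigmoid net."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    y = sigmoid(sum(w * hj for w, hj in zip(w_out, h)))
    return h, y


def loss(x, target, w_hidden, w_out):
    """Squared-error loss (an assumption; the original loss is not shown)."""
    _, y = forward(x, w_hidden, w_out)
    return 0.5 * (y - target) ** 2


def train_step(x, target, w_hidden, w_out, lr=0.1):
    """One gradient-descent step; returns new weights, inputs are not mutated.

    w_hidden[j][i] is the weight from input i to hidden unit j.
    w_out[j] is the weight from hidden unit j to the output.
    """
    h, y = forward(x, w_hidden, w_out)

    # Output delta includes the sigmoid derivative y * (1 - y).
    delta_out = (y - target) * y * (1.0 - y)

    # Hidden deltas are computed from the OLD w_out, before any update.
    delta_hidden = [
        delta_out * w_out[j] * h[j] * (1.0 - h[j]) for j in range(len(h))
    ]

    # Only now apply both updates.
    new_w_out = [w_out[j] - lr * delta_out * h[j] for j in range(len(h))]
    new_w_hidden = [
        [w_hidden[j][i] - lr * delta_hidden[j] * x[i] for i in range(len(x))]
        for j in range(len(h))
    ]
    return new_w_hidden, new_w_out
```

Computing every delta first and applying the updates afterwards keeps the step equal to the true gradient of the loss, which can be checked against a finite-difference estimate.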
