rmsprop final steps #26

I'm slightly confused about how the final steps described in the docs relate to the code below. Should the Nesterov momentum be applied before the parameters are updated, i.e. as a single update `self.wrt -= step1 + step2`?

```
        step1 = step_m1 * self.momentum
        self.wrt -= step1
        gradient = self.fprime(self.wrt, *args, **kwargs)

        self.moving_mean_squared = (
            self.decay * self.moving_mean_squared
            + (1 - self.decay) * gradient ** 2)
        step2 = self.step_rate * gradient
        step2 /= sqrt(self.moving_mean_squared + 1e-8)
        self.wrt -= step2

        step = step1 + step2
```
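For reference, here is a small standalone sketch comparing the two orderings. It is not the library's code: the toy objective, the scalar parameter, the function names and the hyperparameter values are all assumptions chosen for illustration. Under those assumptions, applying `step1` first and `step2` afterwards seems to give the same final parameter as a single `wrt -= step1 + step2`, as long as the gradient is still evaluated at the look-ahead point `wrt - step1`.

```python
import math


def fprime(w):
    # Gradient of the toy objective 0.5 * w**2 (an assumption for this sketch).
    return w


def two_stage_step(w, step_m1, mms, step_rate=0.01, decay=0.9, momentum=0.5):
    """Same order as the snippet above: momentum step, gradient, RMSprop step."""
    step1 = step_m1 * momentum
    w -= step1                          # move to the look-ahead point first
    gradient = fprime(w)                # gradient at the look-ahead point
    mms = decay * mms + (1 - decay) * gradient ** 2
    step2 = step_rate * gradient / math.sqrt(mms + 1e-8)
    w -= step2
    return w, step1 + step2, mms


def combined_step(w, step_m1, mms, step_rate=0.01, decay=0.9, momentum=0.5,
                  lookahead=True):
    """Single update `w -= step1 + step2`; `lookahead` picks the gradient point."""
    step1 = step_m1 * momentum
    gradient = fprime(w - step1) if lookahead else fprime(w)
    mms = decay * mms + (1 - decay) * gradient ** 2
    step2 = step_rate * gradient / math.sqrt(mms + 1e-8)
    return w - (step1 + step2), step1 + step2, mms


# Hypothetical starting state: parameter, previous step, moving mean square.
w0, step0, mms0 = 1.0, 0.1, 1.0

two_stage = two_stage_step(w0, step0, mms0)
one_stage = combined_step(w0, step0, mms0)
no_lookahead = combined_step(w0, step0, mms0, lookahead=False)

# True when both orderings end at the same parameter value.
same_parameters = math.isclose(two_stage[0], one_stage[0])
# True when dropping the look-ahead gradient changes the result.
differs_without_lookahead = not math.isclose(two_stage[0], no_lookahead[0])
```

If this sketch is right, the two-stage form in the snippet is just a convenient way to get the gradient at the look-ahead point without building a temporary copy of the parameters. The ordering would then only matter for where `fprime` is evaluated, not for the final value of `wrt`.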

