
Commit 23af742

add literature reference in nag optimizer learn.md
1 parent ae7ce40

File tree

1 file changed: +6 −0 lines

Problems/X_nag_optimizer/learn.md

Lines changed: 6 additions & 0 deletions
@@ -26,6 +26,12 @@ Where:
 
 The key difference from classical momentum is that the gradient is evaluated at $\theta_{lookahead, t-1}$ instead of $\theta_{t-1}$.
 
+Read more at:
+
+1. Nesterov, Y. (1983). A method for solving the convex programming problem with convergence rate O(1/k²). Doklady Akademii Nauk SSSR, 269(3), 543-547.
+2. Ruder, S. (2017). An overview of gradient descent optimization algorithms. [arXiv:1609.04747](https://arxiv.org/pdf/1609.04747)
+
+
 ## Problem Statement
 
 Implement the Nesterov Accelerated Gradient optimizer update step function. Your function should take the current parameter value, gradient function, and velocity as inputs, and return the updated parameter value and new velocity.
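For context on the step this problem statement asks for, here is a minimal sketch assuming the common NAG formulation from Ruder (2017), $v_t = \mu v_{t-1} - \eta \nabla f(\theta_{t-1} + \mu v_{t-1})$ and $\theta_t = \theta_{t-1} + v_t$. The function name `nag_update` and the hyperparameter defaults are illustrative, not taken from the repository, and the exact signature expected by the problem's tests may differ:

```python
def nag_update(theta, grad_fn, velocity, lr=0.01, momentum=0.9):
    # Hypothetical helper, not the repository's reference solution.
    # Lookahead point: where the current velocity would carry theta.
    theta_lookahead = theta + momentum * velocity
    # Key difference from classical momentum: the gradient is evaluated
    # at the lookahead point rather than at theta itself.
    grad = grad_fn(theta_lookahead)
    new_velocity = momentum * velocity - lr * grad
    new_theta = theta + new_velocity
    return new_theta, new_velocity

# Example on f(theta) = theta^2, whose gradient is 2 * theta.
theta, velocity = 5.0, 0.0
for _ in range(200):
    theta, velocity = nag_update(theta, lambda t: 2.0 * t, velocity)
print(theta)  # converges toward the minimum at 0
```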
