Gradient Descent
Jinho D. Choi edited this page Jan 1, 2017
- Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms, Collins, EMNLP, 2002.
- Solving Large Scale Linear Prediction Problems Using Stochastic Gradient Descent Algorithms, Zhang, ICML, 2004.
- Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, Duchi et al., JMLR, 2011.
- Stochastic Gradient Descent Training for L1-regularized Log-linear Models with Cumulative Penalty, Tsuruoka et al., ACL, 2009.
- Dual Averaging Methods for Regularized Stochastic Learning and Online Optimization, Xiao, JMLR, 2010.
- AdaDelta: An Adaptive Learning Rate Method, Zeiler, arXiv:1212.5701, 2012.
- Adam: A Method for Stochastic Optimization, Kingma and Ba, ICLR, 2015.†
Copyright © 2015-2019 Emory University - All Rights Reserved.
