This is the code for the paper
Improving Stochastic Gradient Descent with Feedback
Jayanth Koushik* and Hiroaki Hayashi*
(* equal contribution)
All results from the paper, and more, are in the data folder. For example, data/cnn/cifar10/eve.pkl has the results for using Eve to optimize a CNN on CIFAR10. The pickle files contain the loss history and cross-validation parameters. Additionally, all results are visualized in the Jupyter notebook src/compare_opts.ipynb.

The fixed models used in the paper are in src/models.py and are implemented in Keras. The experiments can be run using src/runexp.py; run this script with --help as an argument to see the interface. The code for the character language model is in src/charnn.py and is implemented in Theano.

A Keras implementation of our algorithm, Eve, is in src/eve.py. A Theano implementation is also available in src/theano_utils.py.
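The results files can be inspected with a few lines of Python. The following is a minimal sketch, not part of the repository: the helper names load_results and describe are hypothetical, and because the internal layout of the pickled objects is not documented here, the code only reports the types it finds rather than assuming particular keys. Files written under Python 2 may need encoding="latin1", and unpickling may require the libraries that were used when saving (for example NumPy).

```python
import pickle
from pathlib import Path


def load_results(path, encoding="ASCII"):
    """Unpickle one results file and return whatever object it contains.

    `encoding` is forwarded to pickle.load; "latin1" is a common choice
    for pickles that were written under Python 2.
    """
    with Path(path).open("rb") as f:
        return pickle.load(f, encoding=encoding)


def describe(obj):
    """Summarize an unpickled object without assuming its layout.

    For a dict, map each key to the type name of its value;
    for anything else, return the type name of the object itself.
    """
    if isinstance(obj, dict):
        return {key: type(value).__name__ for key, value in obj.items()}
    return type(obj).__name__


if __name__ == "__main__":
    # Path taken from the example above; run from the repository root.
    results_path = Path("data/cnn/cifar10/eve.pkl")
    if results_path.exists():
        print(describe(load_results(results_path)))
    else:
        print(f"{results_path} not found")
```

Only unpickle files from a source you trust, since pickle can execute arbitrary code while loading.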
If you find this code useful, please cite:
@misc{1611.01505,
Author = {Jayanth Koushik and Hiroaki Hayashi},
Title = {Improving Stochastic Gradient Descent with Feedback},
Year = {2016},
Eprint = {arXiv:1611.01505},
}