In this homework assignment you will implement Adam and a few of its variants. Your optimizers will be tested on a simplified implementation of GPT based on the minGPT repository by Andrej Karpathy. The model takes a sequence of characters from a text file as input and attempts to predict the next character. It can generate novel text by starting from a seed string and repeatedly predicting one more character.
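The generation loop described above can be sketched with a toy stand-in for the model. This is only an illustration and is not the code in this repo: `fit_bigram`, `predict_next`, and `generate` are hypothetical names, and the "model" is a simple bigram counter with greedy decoding rather than GPT. The seed is assumed to be non-empty.

```python
from collections import Counter, defaultdict


def fit_bigram(text):
    """Count, for each character, which characters follow it in the text."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, context):
    """Toy stand-in for the model: most frequent follower of the last character."""
    followers = counts.get(context[-1])
    if not followers:
        return " "  # assumption: fall back to a space for unseen characters
    return followers.most_common(1)[0][0]


def generate(counts, seed, num_chars):
    """Start from the seed and repeatedly append one predicted character."""
    out = seed
    for _ in range(num_chars):
        out += predict_next(counts, out)
    return out


model = fit_bigram("abababab")
print(generate(model, "a", 5))
```

A real character-level model would condition on the whole context and usually sample from a predicted distribution instead of always taking the most likely character, but the outer loop has the same shape.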
(These instructions assume you are using Google Colab. If you use something else, you are on your own.)
In Colab, go to file->open notebook, choose the "GitHub" option, and paste the URL of this repo (https://github.com/acutkosky/EC500PA2). You may also clone the repo into your own GitHub account if you wish. Select the PA2.ipynb file to open.
Go to runtime->change runtime type, and select "GPU".
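After changing the runtime type, it can help to confirm that the notebook actually sees a GPU. The following is a minimal sketch, assuming PyTorch is available in the runtime (as it normally is on Colab); `describe_device` is a hypothetical helper name and is not part of this repo.

```python
def describe_device():
    """Return a short description of the device PyTorch would use."""
    try:
        import torch  # assumption: PyTorch is installed in the runtime
    except ImportError:
        return "torch not installed"
    if torch.cuda.is_available():
        return "cuda: " + torch.cuda.get_device_name(0)
    return "cpu"


print(describe_device())
```

If this prints `cpu`, the runtime type change probably did not take effect and is worth rechecking before training.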
License: MIT