Kaggle competition. https://www.kaggle.com/c/random-acts-of-pizza/data?sampleSubmission.csv
Milestones
Download the data and figure out how to import it into Python/NumPy objects so we can process it with scikit-learn. Split the data into training and development sets for running our own experiments. Establish a baseline and submit it to Kaggle for verification. For the submission, we should probably train the model on all the data we have. Send the instructor a link to the leaderboard that shows our baseline score. Due Date: 7/14
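The first milestone could be sketched roughly as below. This is only a minimal sketch under several assumptions: that the training data is a JSON list of request dictionaries, that the field names are `request_text_edit_aware` and `requester_received_pizza`, and that AUC is a reasonable development metric. All of these should be checked against the competition's data page. The bag-of-words logistic regression is just one possible baseline, and the helper names are hypothetical.

```python
import json

import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Assumed field names; verify them against the downloaded JSON.
TEXT_KEY = "request_text_edit_aware"
LABEL_KEY = "requester_received_pizza"


def load_records(path):
    """Read a JSON file that holds a list of request dictionaries."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def to_arrays(records, text_key=TEXT_KEY, label_key=LABEL_KEY):
    """Turn the records into a list of texts and a NumPy array of 0/1 labels."""
    texts = [r[text_key] for r in records]
    labels = np.array([int(r[label_key]) for r in records])
    return texts, labels


def baseline_dev_auc(texts, labels, dev_fraction=0.25, seed=0):
    """Hold out a development set, fit a bag-of-words baseline, return dev AUC."""
    train_x, dev_x, train_y, dev_y = train_test_split(
        texts, labels, test_size=dev_fraction, random_state=seed, stratify=labels
    )
    vectorizer = CountVectorizer()
    model = LogisticRegression(max_iter=1000)
    # Fit the vocabulary on the training split only, then reuse it for dev.
    model.fit(vectorizer.fit_transform(train_x), train_y)
    dev_scores = model.predict_proba(vectorizer.transform(dev_x))[:, 1]
    return roc_auc_score(dev_y, dev_scores)
```

For the actual Kaggle submission, the same vectorizer and model would be refit on all of the labeled data before predicting on the test file.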
Create a brief summary of the work we’ve done and what we plan to do. Due Date: 7/28
Try different models and parameter settings. Engineer new features. Use feature selection techniques. Examine errors and iterate. See how much progress on the leaderboard we can make.
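One way to organize the model, parameter, and feature-selection experiments is a scikit-learn pipeline wrapped in a grid search. The sketch below is an illustration only: the choice of TF-IDF features, chi-squared feature selection, logistic regression, AUC scoring, and the grid values in `EXAMPLE_GRID` are all assumptions, and `search_text_models` is a hypothetical helper name.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline

# Illustrative grid; the values are placeholders to be tuned on real data.
EXAMPLE_GRID = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],
    "select__k": [5, "all"],
    "clf__C": [0.1, 1.0, 10.0],
}


def search_text_models(texts, labels, param_grid, n_splits=3, seed=0):
    """Cross-validate every parameter combination and return the fitted search."""
    pipeline = Pipeline([
        ("tfidf", TfidfVectorizer()),
        ("select", SelectKBest(chi2)),  # chi2 needs non-negative features
        ("clf", LogisticRegression(max_iter=1000)),
    ])
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    search = GridSearchCV(pipeline, param_grid, scoring="roc_auc", cv=cv)
    search.fit(texts, labels)
    return search
```

After fitting, `search.best_params_` and `search.cv_results_` show which settings helped. Misclassified development examples can then be inspected by hand to suggest new features.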
Synthesize our work in an IPython notebook. The plan is to make the notebooks public and as useful as possible to people getting started with machine learning. Experiments and analysis that make the concepts we’ve learned clear matter more than our ranking on the leaderboard (though good performance will make people more interested in reading what we have to say).
Try to limit the experiments we include in the notebook to just those that improved results or were interesting in some useful way. If we’d like to include extra work, add an appendix at the end. Include a summary table that shows the relative contribution of each important idea we’ve implemented. Due Date: 8/13
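The summary table could be produced with a small helper like the one below. It is a sketch that assumes each idea was added on top of the previous one and evaluated on the same development set, so the change in score between consecutive rows approximates that idea's contribution. The function names and the column layout are hypothetical.

```python
def contribution_rows(results):
    """Given ordered (idea, dev_score) pairs, add the change from the previous row."""
    rows = []
    previous = None
    for idea, score in results:
        # The first row (the baseline) has nothing to be compared against.
        delta = None if previous is None else score - previous
        rows.append((idea, score, delta))
        previous = score
    return rows


def format_table(rows):
    """Render the rows as a plain-text table with aligned columns."""
    width = max([len("Idea")] + [len(idea) for idea, _, _ in rows])
    lines = [f"{'Idea':<{width}}  {'Dev score':>9}  {'Change':>7}"]
    for idea, score, delta in rows:
        change = "-" if delta is None else f"{delta:+.3f}"
        lines.append(f"{idea:<{width}}  {score:>9.3f}  {change:>7}")
    return "\n".join(lines)
```

Calling `print(format_table(contribution_rows(results)))` with our own list of (idea, score) pairs in the order they were added would print the table in a notebook cell.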