COMPSCI 589: Open Source ML Course

##Introduction

COMPSCI 589 is an open source applied machine learning course designed for senior undergraduate students and junior (master's-level) graduate students. The course materials have been developed by Prof. Benjamin M. Marlin at the College of Information and Computer Sciences, University of Massachusetts Amherst since fall 2014.

##How To Use These Materials

The course slides were created in LaTeX using the Beamer package. Pre-compiled PDF slides are available in the slides directory. Pre-compiled PDF handouts (without animations) are available in the handouts directory. Most of the lectures also have accompanying Jupyter notebook demos, which are located in the demos/code directory.

The LaTeX source for the slides is available in the src directory. The title slide for each lecture can be customized with your course number, your name, and your affiliation by editing the src/config.tex file and recompiling the slides. To recompile the slides, you will need pdflatex installed with the Beamer package. Slides and handouts can be recompiled individually, or all at once using the supplied compile_all_slides.sh bash script.
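For readers who would rather drive the build from Python than from the supplied bash script, the following is a minimal sketch of the same idea. It assumes that each lecture is a top-level .tex file in src, which is not confirmed by this README. The function names and the build output directory are hypothetical, and the sketch itself needs Python 3, independent of the demos.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdflatex_command(tex_name, out_dir):
    """Return the pdflatex argument list for one LaTeX source file."""
    return [
        "pdflatex",
        "-interaction=nonstopmode",  # keep going instead of prompting on errors
        "-output-directory", str(out_dir),
        tex_name,
    ]


def compile_slides(src_dir="src", out_dir="build", dry_run=False):
    """Compile every .tex file in src_dir except config.tex.

    Assumption: each lecture is a top-level .tex file in src_dir. Adjust the
    glob pattern if the repository is laid out differently.
    """
    src = Path(src_dir)
    out = Path(out_dir).resolve()
    commands = [
        build_pdflatex_command(tex.name, out)
        for tex in sorted(src.glob("*.tex"))
        if tex.name != "config.tex"  # settings file, not a slide deck
    ]
    if dry_run:
        # Only report what would be run; nothing is compiled.
        return commands
    if shutil.which("pdflatex") is None:
        raise RuntimeError("pdflatex was not found on PATH")
    out.mkdir(parents=True, exist_ok=True)
    for cmd in commands:
        # Beamer decks usually need two passes to resolve navigation data.
        for _ in range(2):
            # Run inside src so that config.tex can be found by the decks.
            subprocess.run(cmd, cwd=str(src), check=True)
    return commands
```

Calling `compile_slides(dry_run=True)` lists the commands without running them, which is a cheap way to check the assumed file layout before compiling anything.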

The demos require Python 2.7, Jupyter notebook, and a current version of scikit-learn. Some demos use additional packages including Theano and wxPython.
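Before opening the notebooks, it can help to check which of these packages the current interpreter can import. The sketch below is an illustration and not part of the course materials. The helper names are hypothetical, and the import names (`notebook`, `sklearn`, `theano`, `wx`) are the usual module names for Jupyter notebook, scikit-learn, Theano, and wxPython. It is written to run under both Python 2.7 and Python 3.

```python
from __future__ import print_function

import importlib
import sys

# Import names differ from project names: scikit-learn imports as sklearn,
# wxPython as wx, and Jupyter notebook as notebook.
REQUIRED = ["notebook", "sklearn"]
OPTIONAL = ["theano", "wx"]  # only needed by some demos


def check_modules(names):
    """Map each module name to a version string, or None if it cannot be imported."""
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:
            # Treat any import-time failure as "not usable here".
            found[name] = None
        else:
            found[name] = getattr(module, "__version__", "unknown")
    return found


def report():
    """Print the interpreter version and the status of each package."""
    print("Python", sys.version.split()[0])
    for label, names in (("required", REQUIRED), ("optional", OPTIONAL)):
        for name, version in sorted(check_modules(names).items()):
            status = version if version is not None else "MISSING"
            print("{0:<9} {1:<10} {2}".format(label, name, status))


if __name__ == "__main__":
    report()
```

A missing optional package only affects the demos that name it in the list of demos.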

##Course Topics and Readings

The course introduces core machine learning models and algorithms for classification, regression, clustering, and dimensionality reduction. On the theory side, the course focuses on understanding models and the relationships between them. On the applied side, the course focuses on effectively using machine learning methods to solve real-world problems with an emphasis on model selection, regularization, design of experiments, and presentation and interpretation of results. The course also explores the use of machine learning methods across different computing contexts including desktop and cloud computing. The course focuses on Python, Scikit-Learn, and Apache Spark as toolkits.

The readings are taken from An Introduction to Statistical Learning [ISL], and The Elements of Statistical Learning, Second Edition [ESL], both of which are freely available.

##Course Contents

###Unit 1: Classification

  • Lecture 1: Course Overview - Supervised and Unsupervised Learning

    Materials: Slides | Handouts | latex

    Reading: ISL Section 1 (p.1-9), Section 2.1.4 (p.27-29)

  • Lecture 2: KNN and Decision Trees

    Materials: Slides | Handouts | latex

    Reading: ESL Section 2.3.2 (p.14-16), ISL Section 8 (p.303, 311-314), ESL Section 2.5 (p.22-23)

  • Lecture 3: Naïve Bayes, LDA, and Logistic Regression

    Materials: Slides | Handouts | latex

    Reading: ESL Section 4 (p. 101-102, 106-110, 119-120, 127-132)

  • Lecture 4: Overfitting, Regularization, and Cross-Validation

    Materials: Slides | Handouts | latex

    Reading: ISL Section 2.2.3 (p.37), Section 5 (p.176-183, 184-186)

  • Lecture 5: Support Vector Machines, Basis Expansion, and Kernels

    Materials: Slides | Handouts | latex

    Reading: ISL Section 9.5 (p.356-359)

  • Lecture 6: Neural Networks and Deep Learning

    Materials: Slides | Handouts | latex

    Reading: ESL Section 11.3 (p.392-395, 397-409)

  • Lecture 7: Ensembles and Classification

    Materials: Slides | Handouts | latex

    Reading: ISL Section 8.2 (p.316-324)

###Unit 2: Regression

  • Lecture 8: Linear Regression, Ridge and the Lasso

    Materials: Slides | Handouts | latex

    Reading: ISL Section 3.1 (p.61-63), Section 3.2 (p.71-75), Section 6.2 (p.214-224), Section 3.3.2 (p.86-92)

  • Lecture 9: KNN, Regression Trees, and Feature Selection

    Materials: Slides | Handouts | latex

    Reading: ISL Section 3.5 (p.104-109), Section 8.1.1 (p.304-311), Section 6.1 (p.205-210)

  • Lecture 10: Support Vector and Neural Network Regression

    Materials: Slides | Handouts | latex

    Reading: ESL Section 11.3 (p.392-401), ESL Section 12.3.6 (p.434-438)

  • Lecture 11: KOLS and Gaussian Process Regression

    Materials: Slides | Handouts | latex

    Reading: Gaussian Processes in Machine Learning

###Unit 3: Large-Scale Learning

###Unit 4: Clustering

  • Lecture 15: Hierarchical Clustering

    Materials: Slides | Handouts | latex

    Reading: ISL Section 10.3.2 (p.390-401)

  • Lecture 16: K-Means Clustering

    Materials: Slides | Handouts | latex

    Reading: ISL Section 10.3.1 (p.386-390), ESL Section 6.8 (p.214-216), Section 8.5 (p.272-276)

  • Lecture 17: Mixture Models

    Materials: Slides | Handouts | latex

    Reading: ISL Section 10.3.1 (p.386-390), ESL Section 6.8 (p.214-216), Section 8.5 (p.272-276)

###Unit 5: Dimensionality Reduction

  • Lecture 18: Linear Dimensionality Reduction and SVD

    Materials: Slides | Handouts | latex

    Reading: ESL Section 14.5.1 (p.534-536)

  • Lecture 19: Principal Component Analysis

    Materials: Slides | Handouts | latex

    Reading: ISL Section 10.2 (p.374-385)

  • Lecture 20: Sparse Coding, Non-negative Matrix Factorization, and Independent Component Analysis

    Materials: Slides | Handouts | latex

    Reading: ESL Section 14.6 (p.553-557), Section 14.7 (p.557-570), Sparse Coding

  • Lecture 21: Kernel PCA and Spectral Clustering

    Materials: Slides | Handouts | latex

    Reading: ESL Section 14.5.3 (p.544-547), ESL Section 14.5.4 (p.547-550)

  • Lecture 22: Multidimensional Scaling and Isomap

    Materials: Slides | Handouts | latex

    Reading: ESL Section 14.8-9 (p.570-576)

##List of Demos

  • Lecture01: Introduction to Python
  • Lecture02: KNN and Decision Trees
  • Lecture03: Naive Bayes, LDA and Logistic Regression
  • Lecture04: Model Complexity and Overfitting
  • Lecture05: SVMs, Basis Expansions and Kernels
  • Lecture06: Neural Network Classification (uses Theano)
  • Lecture11: Gaussian Processes (uses wxPython)
  • Lecture15: Hierarchical Clustering
  • Lecture16: KMeans Clustering
  • Lecture17: Mixture Models
  • Lecture18-20: Linear Dimensionality Reduction

##Legal

Copyright 2016 Benjamin M. Marlin. These materials are provided under the GNU GENERAL PUBLIC LICENSE Version 3 (GPL 3). As permitted by GPL 3 Section 7(b), all attributions present in this work must be preserved in all copies and derived works.

##Support

The development of these materials is supported by the National Science Foundation through award # IIS-1350522.

