Decision trees, bootstrap aggregation (bagging), and random forests, with feature importance computed using Breiman's algorithm.
Implementation of multinomial and ordinal logistic regression (the log-likelihood is optimized with L-BFGS). Interpretation of the multinomial logistic regression (MLR) coefficients.
Ridge regression (using the closed-form solution with an intercept) and lasso regression (optimized with Powell's method). Grid search for the best ridge regularization weight on a superconductor dataset.
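To make the closed-form ridge solution with an intercept concrete, here is a minimal NumPy sketch. It assumes the intercept is left unpenalized, which is handled by centering the data first; the function names `ridge_fit` and `ridge_predict` and the synthetic data are illustrative and are not taken from the homework code.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression with an unpenalized intercept (assumed setup)."""
    # Centering X and y removes the intercept from the penalized problem.
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    yc = y - y_mean
    # Solve (Xc^T Xc + lam * I) w = Xc^T yc instead of forming an explicit inverse.
    w = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    # Recover the intercept from the means.
    b = y_mean - x_mean @ w
    return w, b

def ridge_predict(X, w, b):
    return X @ w + b

# Synthetic data (illustrative only): y = X @ true_w + 4 + small noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 4.0 + 0.01 * rng.normal(size=200)

w, b = ridge_fit(X, y, lam=1e-6)       # nearly unregularized fit
w_big, b_big = ridge_fit(X, y, lam=1e4)  # heavy regularization shrinks the weights
```

A grid search over the regularization weight would call `ridge_fit` once per candidate `lam` and keep the value with the lowest validation error.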
Implementation of kernelized ridge regression and support vector regression. Grid search and model evaluation on a real-world dataset with RBF and polynomial kernels.
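As a rough illustration of kernelized ridge regression with an RBF kernel, the sketch below solves the dual system (K + λI)α = y and predicts with k(x, X)α. The helper names, the kernel parameterization via `sigma`, and the sine toy data are assumptions made for this example, not the homework's actual interface or dataset.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq_dist = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2 * A @ B.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2 * sigma**2))

def krr_fit(X, y, lam, sigma=1.0):
    """Dual coefficients of kernel ridge regression: (K + lam * I) alpha = y."""
    K = rbf_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def krr_predict(X_train, alpha, X_new, sigma=1.0):
    return rbf_kernel(X_new, X_train, sigma) @ alpha

# Toy data (illustrative only): noiseless samples of sin(x) on [0, 2*pi].
X_train = np.linspace(0, 2 * np.pi, 40)[:, None]
y_train = np.sin(X_train[:, 0])

alpha = krr_fit(X_train, y_train, lam=1e-3, sigma=1.0)
X_new = np.array([[1.0], [2.5], [4.0]])
y_pred = krr_predict(X_train, alpha, X_new)
```

Swapping in a polynomial kernel only requires replacing `rbf_kernel`; the solve and predict steps stay the same.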
Implementation of standard risk estimation techniques: validation set, train-test split, cross-validation. Demonstration and interpretation of these techniques on a toy data-generating process (DGP), implemented in R.
Implementation of deep neural networks and the backpropagation algorithm. Each layer supports ReLU and sigmoid activations, and further activations can easily be added. The correctness of backpropagation is verified by comparing the computed partial derivatives with numerical estimates. Comparison of deep neural networks with the machine learning algorithms implemented in previous homeworks on the housing dataset. Hyperparameter optimization on a dataset of ~50k records.
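The gradient check described above can be sketched as follows: analytic gradients from backpropagation are compared against central finite differences. This is a minimal example under assumed choices (one sigmoid hidden layer, a linear output, squared-error loss, and the hypothetical names `loss_and_grads` and `numerical_grads`); it is not the homework's network.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_grads(params, X, y):
    """Forward pass and backpropagation for a 1-hidden-layer network."""
    W1, b1, W2, b2 = params
    a1 = sigmoid(X @ W1 + b1)          # hidden activations, shape (n, h)
    out = a1 @ W2 + b2                 # linear output, shape (n, 1)
    diff = out - y
    loss = 0.5 * np.mean(diff**2)
    # Backward pass: propagate dLoss/dOut through each layer.
    d_out = diff / X.shape[0]
    gW2 = a1.T @ d_out
    gb2 = d_out.sum(axis=0)
    d_z1 = (d_out @ W2.T) * a1 * (1 - a1)   # sigmoid derivative is a * (1 - a)
    gW1 = X.T @ d_z1
    gb1 = d_z1.sum(axis=0)
    return loss, [gW1, gb1, gW2, gb2]

def numerical_grads(params, X, y, eps=1e-6):
    """Central-difference estimate of each partial derivative."""
    grads = []
    for p in params:
        g = np.zeros_like(p)
        for idx in np.ndindex(p.shape):
            old = p[idx]
            p[idx] = old + eps
            loss_plus, _ = loss_and_grads(params, X, y)
            p[idx] = old - eps
            loss_minus, _ = loss_and_grads(params, X, y)
            p[idx] = old                 # restore the parameter
            g[idx] = (loss_plus - loss_minus) / (2 * eps)
        grads.append(g)
    return grads

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
y = rng.normal(size=(5, 1))
params = [rng.normal(size=(3, 4)), np.zeros(4), rng.normal(size=(4, 1)), np.zeros(1)]

_, analytic = loss_and_grads(params, X, y)
numeric = numerical_grads(params, X, y)
max_err = max(np.max(np.abs(a - n)) for a, n in zip(analytic, numeric))
```

A small `max_err` suggests the backward pass matches the loss. With ReLU layers the same check applies, although finite differences can be unreliable for pre-activations very close to zero.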
Implementation of hard-margin and soft-margin SVMs using the CVXOPT library for quadratic programming. 2D demonstrations are included for both algorithms.
Implementation of principal component analysis with NumPy, demonstrated on the well-known Iris dataset.
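For reference, PCA in NumPy can be sketched via an eigendecomposition of the sample covariance matrix. The function name `pca` and its return values are assumptions for this example, and the data below is a synthetic stand-in with four features, not the Iris dataset.

```python
import numpy as np

def pca(X, n_components):
    """Project X onto its top principal components (covariance eigendecomposition)."""
    Xc = X - X.mean(axis=0)                      # center each feature
    cov = Xc.T @ Xc / (X.shape[0] - 1)           # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]               # columns are principal directions
    explained = eigvals[order] / eigvals.sum()   # explained variance ratio
    return Xc @ components, components, explained

# Synthetic stand-in data (NOT Iris): one dominant latent direction plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(150, 1))
X = latent @ np.array([[3.0, 1.0, 0.0, 0.0]]) + 0.1 * rng.normal(size=(150, 4))

Z, components, explained = pca(X, n_components=2)
```

Here `Z` holds the 2D projection that would typically be plotted, and `explained` gives the share of total variance captured by each component.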