
Hands On Machine Learning

with Scikit-Learn and TensorFlow

This repository contains my notes and exercise solutions from reading Hands-On Machine Learning with Scikit-Learn and TensorFlow by Aurélien Géron.

All of the code and notes are written in Jupyter notebooks designed to run in Google Colab. This saves me from having to set up a virtual environment and install machine learning libraries each time: Colab has most of the modules this book uses installed by default. Note that some Colab-specific code in here will not work on a run-of-the-mill Jupyter Notebook kernel, and mathematical formulas render differently in Colab than in GitHub's ipynb renderer.

Table of Contents

  1. Topics Covered
  2. Papers Cited
  3. License

Topics Covered

Chapter 2: End-to-End Machine Learning Project

Housing.ipynb

  • Downloading data for a machine learning project.
  • Inspecting data with pandas.
  • Plotting histograms of the data.
  • Splitting data into training and test sets.
  • Discretizing continuous features.
  • Visualizing geographic data.
  • Computing the correlation between features with pandas.
  • Visualizing relationships between features using scatter plots.
  • Combining features to make new ones.
  • Cleaning the data for machine learning.
  • Handling categorical features using OneHotEncoder.
  • Defining custom feature transformations.
  • Scikit-Learn's Pipeline object.
  • Computing the root mean squared error (RMSE) of a regression model.
  • Cross validation.
  • Grid searching hyperparameters with Scikit-Learn's GridSearchCV (a minimal sketch follows this list).
  • Evaluating models with a test set.
  • Training a Support Vector Machine (SVM) on the Housing dataset.
  • Fine-tuning a RandomForestRegressor using RandomizedSearchCV.
  • Creating a pipeline for a machine learning model with the Housing dataset.
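
As a quick illustration of the Pipeline and GridSearchCV pattern from this chapter, here is a minimal sketch; it uses synthetic data as a stand-in for the housing dataset, and the hyperparameter grid is hypothetical, not the notebook's actual one:

    # Hedged sketch: Pipeline + GridSearchCV on synthetic regression data.
    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.impute import SimpleImputer
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = make_regression(n_samples=200, n_features=8, noise=10, random_state=42)

    pipeline = Pipeline([
        ("imputer", SimpleImputer(strategy="median")),  # fill in missing values
        ("scaler", StandardScaler()),                   # standardize features
        ("forest", RandomForestRegressor(random_state=42)),
    ])

    # Hypothetical grid; the notebook's actual search space may differ.
    param_grid = {"forest__n_estimators": [10, 30],
                  "forest__max_features": [4, 6, 8]}
    grid_search = GridSearchCV(pipeline, param_grid, cv=5,
                               scoring="neg_mean_squared_error")
    grid_search.fit(X, y)
    print("RMSE:", np.sqrt(-grid_search.best_score_))  # recover RMSE from the score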

Chapter 3: Classification

MNIST.ipynb

  • The MNIST dataset, a dataset of images of handwritten digits.
  • Training a binary classifier.
  • Measuring accuracy of a classifier model.
  • Confusion matrix.
  • Precision and recall (see the sketch after this list).
  • F1 score.
  • Precision/recall tradeoff.
  • The receiver operating characteristic (ROC) curve.
  • Multiclass classifier.
  • Error analysis using the confusion matrix and plotting examples where the model was wrong.
  • Multilabel classification.
  • Multioutput classification.
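
To make the classification metrics above concrete, here is a minimal sketch using Scikit-Learn's small built-in digits dataset as a stand-in for MNIST:

    # Hedged sketch: precision, recall, and F1 for a binary "is it a 5?" classifier.
    from sklearn.datasets import load_digits
    from sklearn.linear_model import SGDClassifier
    from sklearn.metrics import confusion_matrix, f1_score, precision_score, recall_score
    from sklearn.model_selection import cross_val_predict

    X, y = load_digits(return_X_y=True)
    y_is_5 = (y == 5)                                     # binary target

    sgd_clf = SGDClassifier(random_state=42)
    y_pred = cross_val_predict(sgd_clf, X, y_is_5, cv=3)  # out-of-fold predictions

    print(confusion_matrix(y_is_5, y_pred))
    print("precision:", precision_score(y_is_5, y_pred))
    print("recall:   ", recall_score(y_is_5, y_pred))
    print("F1:       ", f1_score(y_is_5, y_pred))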

SpamClassifier.ipynb

  • Downloading the email dataset.
  • Processing email data in Python.
  • Counting the most common words and symbols in text data.
  • Viewing the headers in email data with Python.
  • Parsing HTML into plaintext with Python.
  • Transforming words into their stems with NLTK's PorterStemmer class.
  • Extracting URLs from text with Python.
  • Defining a transformer class using Scikit-Learn which extracts the word counts from email data (a minimal sketch follows this list).
  • Defining a transformer class which transforms word counts into a vector which can be used as an input to a machine learning model.
  • Using LogisticRegression to classify emails as spam or ham.
  • Evaluating the model's performance with cross-validation.
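
The custom transformers in this notebook follow Scikit-Learn's fit/transform convention. Here is a hypothetical, stripped-down version of the word-count idea; the notebook's actual classes do more work on real email data:

    # Hedged sketch: a minimal custom transformer in the Scikit-Learn style.
    from collections import Counter
    from sklearn.base import BaseEstimator, TransformerMixin

    class WordCounter(BaseEstimator, TransformerMixin):
        """Hypothetical transformer mapping raw text to word-count dicts."""
        def fit(self, X, y=None):
            return self                              # stateless: nothing to learn
        def transform(self, X):
            return [Counter(text.lower().split()) for text in X]

    print(WordCounter().fit_transform(["Buy cheap meds now", "Lunch at noon?"]))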

Titanic.ipynb

  • Downloading the Titanic dataset from Kaggle.
  • Defining a Pipeline to transform the data from Kaggle into input for a machine learning model.
  • Training an SGDClassifier to determine if a passenger survived or died on the Titanic.
  • Evaluating the SGDClassifier with cross-validation.
  • Grid searching with GridSearchCV to fine-tune the hyperparameters of a RandomForestClassifier.

Chapter 4: Training Models

TrainingModels.ipynb

  • Linear Regression.
  • Mean squared error (MSE).
  • The normal equation and its computational complexity.
  • Batch Gradient Descent (a NumPy sketch follows this list).
  • Learning rate.
  • Stochastic Gradient Descent.
  • Mini-Batch Gradient Descent.
  • Polynomial regression with Scikit-Learn.
  • Learning curves.
  • Regularization.
  • Ridge Regression with Scikit-Learn.
  • Lasso Regression with Scikit-Learn.
  • Elastic Net with Scikit-Learn.
  • Early stopping.
  • Logistic Regression.
  • Decision boundaries.
  • Softmax Regression with Scikit-Learn on the Iris dataset.
  • Implementing Batch Gradient Descent and Softmax Regression without using Scikit-Learn.
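
For reference, this is roughly what a from-scratch Batch Gradient Descent implementation looks like for plain Linear Regression; the notebook extends the same idea to Softmax Regression:

    # Sketch: Batch Gradient Descent minimizing MSE for Linear Regression.
    import numpy as np

    np.random.seed(42)
    m = 100
    X = 2 * np.random.rand(m, 1)
    y = 4 + 3 * X + np.random.randn(m, 1)       # y = 4 + 3x + Gaussian noise
    X_b = np.c_[np.ones((m, 1)), X]             # add bias feature x0 = 1

    eta, n_iterations = 0.1, 1000               # learning rate and step count
    theta = np.random.randn(2, 1)               # random initialization
    for _ in range(n_iterations):
        gradients = 2 / m * X_b.T.dot(X_b.dot(theta) - y)  # gradient of MSE
        theta -= eta * gradients
    print(theta)                                # should be close to [[4.], [3.]]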

Chapter 5: Support Vector Machines

SupportVectorMachines.ipynb

  • Linear Support Vector Machine (SVM) classification.
  • Hard margin classification.
  • Soft margin classification.
  • Scikit-Learn's LinearSVC classifier.
  • Nonlinear SVM classification.
  • Using polynomial kernels for SVM classification.
  • Adding features using a similarity function and landmark instances.
  • Gaussian Radial Basis Function (RBF).
  • Using Gaussian RBF kernels for SVM classification (see the sketch after this list).
  • Computational complexity of SVMs.
  • SVM Regression.
  • Scikit-Learn's LinearSVR class.
  • Nonlinear SVM regression using Scikit-Learn's SVR class.
  • The SVM decision function.
  • The training objective for an SVM for hard and soft margin classification.
  • Quadratic programming.
  • Solving the dual problem of a quadratic programming problem.
  • Kernelized SVMs and applying the kernel trick.
  • Computing the decision function for a nonlinear SVM using the kernel trick.
  • Online SVMs.
  • Using the hinge loss function for Gradient Descent.
  • Using a QP solver to train an SVM by solving the dual problem.
  • Training a LinearSVC and an SGDClassifier to produce the same model and comparing the results.
  • Training an SVM classifier to classify handwritten digits.
  • Training an SVM regression model to predict housing prices with the California housing dataset.
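
A minimal sketch of nonlinear SVM classification with a Gaussian RBF kernel, using the moons dataset purely as an illustration (the hyperparameters are arbitrary):

    # Hedged sketch: RBF-kernel SVM inside a scaling pipeline.
    from sklearn.datasets import make_moons
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    X, y = make_moons(n_samples=200, noise=0.15, random_state=42)
    rbf_svm = Pipeline([
        ("scaler", StandardScaler()),           # SVMs are sensitive to feature scale
        ("svc", SVC(kernel="rbf", gamma=5, C=0.001)),
    ])
    rbf_svm.fit(X, y)
    print(rbf_svm.predict([[0.5, 0.0]]))        # classify a new point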

Chapter 6: Decision Trees

DecisionTrees.ipynb

  • Training a DecisionTreeClassifier with Scikit-Learn (a minimal sketch follows this list).
  • Visualizing a Decision Tree's decision-making process.
  • Making predictions with a Decision Tree.
  • Gini impurity.
  • White-box versus black-box models.
  • Estimating class probabilities with a DecisionTreeClassifier.
  • The Classification And Regression Tree (CART) algorithm.
  • The computational complexity of training and making predictions with a Decision Tree.
  • Using entropy instead of Gini impurity to train a Decision Tree.
  • Parametric versus nonparametric machine learning models.
  • Training a DecisionTreeRegressor with Scikit-Learn.
  • The cost function for training a Decision Tree for regression.
  • Computing the approximate depth of a Decision Tree.
  • Training a DecisionTreeClassifier to classify instances of the moons dataset.
  • Implementing a Random Forest using Scikit-Learn's DecisionTreeClassifier.
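
A minimal sketch of the basic Decision Tree workflow on the Iris dataset (the hyperparameters are illustrative):

    # Sketch: training a Decision Tree and estimating class probabilities.
    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier

    iris = load_iris()
    X, y = iris.data[:, 2:], iris.target        # petal length and width only
    tree_clf = DecisionTreeClassifier(max_depth=2, random_state=42)
    tree_clf.fit(X, y)
    print(tree_clf.predict_proba([[5.0, 1.5]])) # per-class probability estimates
    print(tree_clf.predict([[5.0, 1.5]]))       # most likely class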

Chapter 7: Ensemble Learning

EnsembleLearning.ipynb

  • Voting classifiers.
  • Hard voting classifiers.
  • Strong learners versus weak learners.
  • Scikit-Learn's VotingClassifier class (a minimal sketch follows this list).
  • Bagging and pasting.
  • Scikit-Learn's BaggingClassifier.
  • Out-of-bag evaluation.
  • Random Patches and Random Subspaces.
  • Scikit-Learn's RandomForestClassifier.
  • Extremely Randomized Trees (Extra-Trees).
  • Feature importance.
  • Boosting (in the context of Ensemble Learning).
  • AdaBoost.
  • Using Scikit-Learn's AdaBoostClassifier.
  • Gradient Boosting, Gradient Tree Boosting, and Gradient Boosted Regression Trees (GBRTs).
  • Training a GBRT with and without Scikit-Learn's GradientBoostingRegressor class.
  • Stacked generalization (stacking).
  • Training a voting ensemble model on the MNIST dataset using Scikit-Learn.
  • Training an ensemble which uses stacking and comparing the results to the voting ensemble.
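
A minimal sketch of a hard-voting ensemble in Scikit-Learn (the constituent models and dataset are illustrative):

    # Sketch: hard voting across heterogeneous classifiers.
    from sklearn.datasets import make_moons
    from sklearn.ensemble import RandomForestClassifier, VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.svm import SVC

    X, y = make_moons(n_samples=500, noise=0.3, random_state=42)
    voting_clf = VotingClassifier(
        estimators=[("lr", LogisticRegression(random_state=42)),
                    ("rf", RandomForestClassifier(random_state=42)),
                    ("svc", SVC(random_state=42))],
        voting="hard",                          # majority class wins
    )
    voting_clf.fit(X, y)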

Chapter 8: Dimensionality Reduction

DimensionalityReduction.ipynb

  • The curse of dimensionality.
  • Projection.
  • Manifold learning and the manifold hypothesis.
  • Principal Component Analysis (PCA).
  • Singular Value Decomposition (SVD).
  • PCA with Scikit-Learn.
  • Explained variance ratio (see the PCA sketch after this list).
  • Reconstructing the original data after PCA and reconstruction error.
  • Incremental PCA (IPCA).
  • IncrementalPCA with Scikit-Learn.
  • Randomized PCA.
  • Kernel PCA (kPCA).
  • KernelPCA with Scikit-Learn.
  • Locally Linear Embedding (LLE).
  • LocallyLinearEmbedding with Scikit-Learn.
  • Multidimensional Scaling (MDS).
  • Isomap.
  • t-Distributed Stochastic Neighbor Embedding (t-SNE).
  • Linear Discriminant Analysis (LDA).
  • Training a Random Forest with the MNIST dataset and observing how PCA helps reduce the time it takes to train the model.
  • Reducing the dimension of the MNIST dataset and plotting the result.
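
A minimal PCA sketch showing the explained-variance-ratio idea, again with the small digits dataset standing in for MNIST:

    # Sketch: PCA keeping enough components for ~95% explained variance.
    from sklearn.datasets import load_digits
    from sklearn.decomposition import PCA

    X = load_digits().data
    pca = PCA(n_components=0.95)                # float: target variance ratio
    X_reduced = pca.fit_transform(X)
    print(X.shape, "->", X_reduced.shape)
    print(pca.explained_variance_ratio_.sum())  # roughly 0.95
    X_recovered = pca.inverse_transform(X_reduced)  # approximate reconstruction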

Chapter 9: Up and Running with TensorFlow

UpAndRunningWithTensorflow.ipynb

  • Creating a TensorFlow graph.
  • Running a computation defined with a TensorFlow graph (a minimal sketch follows this list).
  • Managing TensorFlow graphs.
  • Lifecycle of a node value.
  • Linear Regression with TensorFlow.
  • Manually computing the gradient versus using the autodiff algorithm versus using a GradientDescentOptimizer.
  • Gradient Descent using a MomentumOptimizer.
  • Feeding data to a training algorithm using tf.placeholder().
  • Implementing Mini-Batch Gradient Descent with TensorFlow.
  • Saving and restoring a model.
  • Visualizing the graph and training curves using TensorBoard in Google Colab.
  • Name scopes.
  • Rectified linear units (ReLU).
  • Creating a neural network by iteratively applying ReLU operations.
  • Sharing variables.
  • Implementing Logistic Regression using TensorFlow.
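
The notebooks use the TensorFlow 1.x graph/session API throughout, as in the book. A minimal sketch of that workflow (shapes and names are illustrative):

    # Sketch: build a TF 1.x graph, then evaluate a node inside a session.
    import tensorflow as tf                       # assumes TensorFlow 1.x

    X = tf.placeholder(tf.float32, shape=(None, 3), name="X")
    w = tf.Variable(tf.random_normal((3, 1)), name="w")
    y_pred = tf.matmul(X, w, name="predictions")  # graph node, no value yet

    init = tf.global_variables_initializer()
    with tf.Session() as sess:
        init.run()
        # Nothing is computed until a node is evaluated in a session:
        print(sess.run(y_pred, feed_dict={X: [[1., 2., 3.]]}))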

Chapter 10: Introduction to Artificial Neural Networks

ArtificialNeuralNetworks.ipynb

  • The invention of artificial neural networks (ANNs).
  • Biological neurons.
  • Performing computations with artificial neurons.
  • Perceptrons and Linear threshold units (LTUs).
  • Hebb's rule (Hebbian learning).
  • Scikit-Learn's Perceptron class.
  • Multilayer perceptrons (MLPs) and deep neural networks (DNNs).
  • Backpropagation.
  • Softmax function.
  • Feedforward neural networks (FNNs).
  • Training a DNN with TensorFlow's high-level API.
  • Training a DNN with plain TensorFlow (a minimal sketch follows this list).
  • Fine-tuning neural network hyperparameters.
  • Training a DNN for classifying the MNIST dataset.
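
A minimal sketch of the plain-TensorFlow MLP construction this chapter builds up to (TensorFlow 1.x API; layer sizes are illustrative):

    # Sketch: a small MLP for 10-class classification in TF 1.x.
    import tensorflow as tf

    X = tf.placeholder(tf.float32, shape=(None, 784), name="X")
    y = tf.placeholder(tf.int32, shape=(None,), name="y")

    hidden1 = tf.layers.dense(X, 300, activation=tf.nn.relu, name="hidden1")
    hidden2 = tf.layers.dense(hidden1, 100, activation=tf.nn.relu, name="hidden2")
    logits = tf.layers.dense(hidden2, 10, name="logits")  # softmax lives in the loss

    xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits)
    loss = tf.reduce_mean(xentropy)
    training_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)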

Chapter 11: Training Deep Neural Nets

TrainingDeepNeuralNets.ipynb

  • The vanishing/exploding gradients problem.
  • Xavier and He initialization (see the sketch after this list).
  • Leaky ReLU activation function.
  • Exponential linear unit (ELU) activation function.
  • Batch Normalization.
  • Implementing Batch Normalization with TensorFlow.
  • Gradient clipping.
  • Reusing pretrained models.
  • Reusing pretrained TensorFlow models.
  • Reusing models from other frameworks when using TensorFlow.
  • Freezing lower layers with TensorFlow.
  • Caching frozen layers.
  • Model zoos.
  • Unsupervised pretraining.
  • Pretraining with an auxiliary task.
  • Momentum optimizers.
  • TensorFlow's MomentumOptimizer.
  • Nesterov Accelerated Gradient.
  • AdaGrad algorithm.
  • TensorFlow's AdagradOptimizer.
  • RMSProp algorithm.
  • TensorFlow's RMSPropOptimizer.
  • Adaptive moment estimation (Adam).
  • TensorFlow's AdamOptimizer.
  • Learning rate scheduling.
  • Using regularization while training a neural network.
  • Implementing regularization with TensorFlow.
  • Dropout and implementing models that use dropout with TensorFlow.
  • Follow the Regularized Leader (FTRL).
  • FTRL-Proximal and TensorFlow's FtrlOptimizer class.
  • Max-norm regularization and implementing it in TensorFlow.
  • Data augmentation.
  • Training a DNN to classify the MNIST dataset.
  • Transfer learning with the MNIST dataset.
  • Pretraining with an auxiliary task before training a DNN to classify the MNIST dataset.
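
A minimal sketch combining a few of these techniques in one hidden layer: He initialization, the ELU activation, and dropout (TensorFlow 1.x API; sizes and rates are illustrative):

    # Sketch: one hidden layer with He init, ELU, and dropout (TF 1.x).
    import tensorflow as tf

    training = tf.placeholder_with_default(False, shape=(), name="training")
    X = tf.placeholder(tf.float32, shape=(None, 784), name="X")

    he_init = tf.variance_scaling_initializer(scale=2.0)  # variance 2/fan_in
    hidden = tf.layers.dense(X, 300, activation=tf.nn.elu,
                             kernel_initializer=he_init, name="hidden1")
    hidden_drop = tf.layers.dropout(hidden, rate=0.5,     # active only while
                                    training=training)    # training is True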

Chapter 12: Distributing TensorFlow Across Devices and Servers

DistributingTensorflow.ipynb

  • Nvidia's Compute Unified Device Architecture library (CUDA).
  • Managing the GPU RAM.
  • Placing operations on devices.
  • The dynamic placer algorithm versus the simple placer.
  • Logging which device each node is pinned to.
  • Parallel execution in TensorFlow.
  • Distributing devices across multiple servers.
  • TensorFlow cluster specifications (a minimal sketch follows this list).
  • Master and worker services.
  • Sharing variables across servers.
  • Resource containers.
  • TensorFlow queues.
  • Asynchronous communication using queues.
  • TensorFlow's FIFOQueue.
  • TensorFlow's RandomShuffleQueue.
  • TensorFlow's PaddingFIFOQueue.
  • Multithreaded readers using TensorFlow's Coordinator and QueueRunner.
  • TensorFlow's string_input_producer() function.
  • TensorFlow's input_producer(), range_input_producer(), and slice_input_producer() functions.
  • TensorFlow's shuffle_batch() function.
  • In-graph versus between-graph replication.
  • Model parallelism.
  • Data parallelism.
  • Synchronous versus asynchronous updates.
  • Bandwidth saturation.
  • Training multiple TensorFlow DNNs in parallel.
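
A minimal sketch of a cluster specification (the host names and ports are placeholders):

    # Sketch: define a TF 1.x cluster and start this process's server.
    import tensorflow as tf

    cluster_spec = tf.train.ClusterSpec({
        "ps":     ["machine-a.example.com:2221"],  # parameter server (placeholder)
        "worker": ["machine-a.example.com:2222",   # placeholder worker hosts
                   "machine-b.example.com:2222"],
    })
    # Each process runs one server for its own job name and task index:
    server = tf.train.Server(cluster_spec, job_name="worker", task_index=0)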

ParallelNeuralNetworks.ipynb

  • Training a distributed DNN with TensorFlow.
  • Comparing the performance of synchronous versus asynchronous updates.

Chapter 13: Convolutional Neural Networks

ConvolutionalNeuralNetworks.ipynb

  • The visual cortex in animals and local receptive fields.
  • Convolutional neural networks (CNNs).
  • Convolutional layers.
  • Zero padding.
  • Stride.
  • Filters and feature maps.
  • Stacking feature maps.
  • Convolutional layers in TensorFlow (a minimal sketch follows this list).
  • VALID versus SAME padding.
  • Memory requirements of a CNN.
  • Pooling layers.
  • ILSVRC ImageNet challenge and other visual challenges CNNs can solve.
  • LeNet-5.
  • AlexNet.
  • GoogLeNet.
  • Residual Network (ResNet).
  • VGGNet and Inception-v4.
  • Convolution operations with TensorFlow.
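
A minimal sketch of a convolutional layer followed by max pooling (TensorFlow 1.x API; filter counts and sizes are illustrative):

    # Sketch: conv layer + max pooling on image batches (TF 1.x).
    import tensorflow as tf

    X = tf.placeholder(tf.float32, shape=(None, 28, 28, 1))  # e.g. MNIST images
    conv = tf.layers.conv2d(X, filters=32, kernel_size=3, strides=1,
                            padding="same",                  # zero padding
                            activation=tf.nn.relu)
    pool = tf.nn.max_pool(conv, ksize=[1, 2, 2, 1],          # 2x2 pooling window
                          strides=[1, 2, 2, 1], padding="VALID")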

DeepDream.ipynb

  • Displaying a model graph from TensorBoard inline in Colab.
  • Naive feature map visualization.
  • Improving the visualization using gradient ascent.
  • Laplacian Pyramid Gradient Normalization.
  • Visualizing different feature maps in the Inception model.
  • Google's DeepDream algorithm.

Inception.ipynb

  • Preparing data for the Inception v3 model.
  • Downloading the pretrained Inception v3 model.
  • Defining the Inception v3 model graph with TensorFlow.
  • Restoring the model parameters for a TensorFlow model.
  • Labeling images with Inception v3 and TensorFlow.
  • Implementing data augmentation for image data using NumPy.
  • Creating a model graph with pretrained lower layers from the Inception v3 model.
  • Training the higher layers of the new model for a different classification task.

MNIST.ipynb

  • Implementing an augmented version of LeNet-5 using TensorFlow.
  • Training a CNN to classify the MNIST dataset with over 99% accuracy.

Chapter 14: Recurrent Neural Networks

RecurrentNeuralNetworks.ipynb

  • Recurrent neural networks (RNNs).
  • Recurrent neurons.
  • Implementing recurrent neurons in TensorFlow.
  • Static unrolling through time with TensorFlow.
  • Dynamic unrolling through time with TensorFlow's BasicRNNCell (a minimal sketch follows this list).
  • Handling variable input sequence lengths in TensorFlow.
  • Handling variable output sequence lengths.
  • Backpropagation through time (BPTT).
  • Training a sequence classifier with TensorFlow.
  • Training a model to predict time series.
  • Generating new sequences with creative RNNs.
  • Deep RNNs and TensorFlow's MultiRNNCell.
  • Creating a deep RNN across devices.
  • Applying dropout while training an RNN.
  • The difficulty of training RNNs for long sequences.
  • Truncated backpropagation through time.
  • Long Short-Term Memory (LSTM) cell.
  • TensorFlow's BasicLSTMCell.
  • Peephole connections.
  • Gated Recurrent Unit (GRU) cell.
  • TensorFlow's GRUCell.
  • Natural language processing (NLP).
  • Word embeddings.
  • Computing embeddings using TensorFlow.
  • Defining a model graph for sequence-to-sequence machine translation using TensorFlow.
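
A minimal sketch of dynamic unrolling with variable-length sequences (TensorFlow 1.x API; the dimensions are illustrative):

    # Sketch: dynamic_rnn over variable-length input sequences (TF 1.x).
    import tensorflow as tf

    n_steps, n_inputs, n_neurons = 20, 10, 100
    X = tf.placeholder(tf.float32, shape=(None, n_steps, n_inputs))
    seq_length = tf.placeholder(tf.int32, shape=(None,))  # true length per example

    cell = tf.nn.rnn_cell.BasicRNNCell(num_units=n_neurons)
    # outputs: activations at every step; states: final state of each sequence
    outputs, states = tf.nn.dynamic_rnn(cell, X, dtype=tf.float32,
                                        sequence_length=seq_length)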

Exercises.ipynb

  • Embedded Reber grammars.
  • Training a TensorFlow model to classify whether a sequence matches a Reber grammar.
  • Training a model for the "How Much Did it Rain II" Kaggle competition.
  • Developing a Spanish-to-English translation system using TensorFlow.

Chapter 15: Autoencoders

Autoencoders.ipynb

  • Autoencoders.
  • Codings.
  • Undercomplete autoencoders.
  • Performing PCA with an undercomplete linear autoencoder using TensorFlow.
  • Stacked autoencoders.
  • Training a stacked autoencoder with TensorFlow (a graph sketch follows this list).
  • Tying weights.
  • Training one autoencoder at a time in multiple TensorFlow graphs.
  • Training one autoencoder at a time in a single TensorFlow graph.
  • Caching outputs from the frozen layer to speed up training.
  • Visualizing the reconstructions.
  • Visualizing the extracted features.
  • Unsupervised pretraining using an autoencoder for a classification task.
  • Stacked denoising autoencoders.
  • Sparse autoencoders.
  • Kullback-Leibler divergence.
  • Variational autoencoders.
  • Latent loss.
  • Implementing a variational autoencoder using TensorFlow.
  • Contractive autoencoders.
  • Stacked convolutional autoencoders.
  • Generative stochastic networks (GSNs).
  • Winner-take-all (WTA) autoencoders.
  • Generative adversarial networks (GANs).
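
A minimal sketch of a stacked autoencoder graph (TensorFlow 1.x API; layer sizes are illustrative and the weights are not tied here):

    # Sketch: stacked autoencoder with an MSE reconstruction loss (TF 1.x).
    import tensorflow as tf

    n_inputs = 784                                  # e.g. flattened MNIST
    X = tf.placeholder(tf.float32, shape=(None, n_inputs))

    hidden1 = tf.layers.dense(X, 300, activation=tf.nn.elu)
    codings = tf.layers.dense(hidden1, 150, activation=tf.nn.elu)  # bottleneck
    hidden3 = tf.layers.dense(codings, 300, activation=tf.nn.elu)
    outputs = tf.layers.dense(hidden3, n_inputs)    # reconstruction

    reconstruction_loss = tf.reduce_mean(tf.square(outputs - X))   # MSE
    training_op = tf.train.AdamOptimizer(0.001).minimize(reconstruction_loss)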

Exercises.ipynb

  • Pretraining with a convolutional autoencoder to train an image classification model using TensorFlow.
  • Semantic hashing.
  • Implementing an autoencoder to compute semantic hashes of images using TensorFlow.
  • Training a semantic hashing model by pretraining a CNN for image classification.
  • Training a convolutional variational autoencoder (CVAE) to generate new instances of the Oxford Flowers dataset.

Chapter 16: Reinforcement Learning

For this chapter, I highly recommend opening these notebooks in Colab. GitHub does not render the <video> tags in these notebooks, which show the agents playing the OpenAI Gym games.

ReinforcementLearning.ipynb

  • Reinforcement learning.
  • Policy and policy search.
  • Genetic algorithms.
  • Policy gradients.
  • OpenAI Gym.
  • Ms. Pacman OpenAI Gym environment.
  • CartPole OpenAI Gym environment (a minimal interaction-loop sketch follows this list).
  • Hard coding a policy.
  • Neural network policies with TensorFlow.
  • Training a neural network to learn a hard-coded policy.
  • The credit assignment problem.
  • REINFORCE algorithms.
  • Training a neural network agent for the CartPole environment using policy gradients.
  • Markov chains.
  • Markov decision processes (MDPs).
  • Bellman Optimality Equation.
  • Value Iteration algorithm.
  • Q-Values.
  • Temporal Difference Learning (TD Learning).
  • Approximate Q-Learning and Deep Q-Learning.
  • Deep Q-networks (DQNs).
  • Replay memory.
  • Preprocessing observations from the Breakout OpenAI Gym environment.
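
A minimal sketch of the basic Gym interaction loop with a random policy (this uses the old Gym step/reset API that was current when these notebooks were written):

    # Sketch: one CartPole episode under a random policy (old Gym API).
    import gym

    env = gym.make("CartPole-v0")
    obs = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = env.action_space.sample()      # random policy
        obs, reward, done, info = env.step(action)
        total_reward += reward
    print("episode reward:", total_reward)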

MsPacMan.ipynb

  • The Ms. Pacman OpenAI Gym environment.
  • Preprocessing the observations from the Ms. Pacman environment.
  • Training a DQN to play Ms. Pacman.

Exercises.ipynb

  • The BipedalWalker OpenAI Gym environment.
  • Training a neural network policy for the BipedalWalker environment using policy gradients.
  • The Pong OpenAI Gym environment.
  • Preprocessing the Pong environment for training a DQN.
  • Training a DQN to play Pong.

Papers Cited

Artificial Neural Networks

Autoencoders

Convolutional Neural Networks

Dimensionality Reduction

Distributing TensorFlow

Ensemble Learning

Recurrent Neural Networks

Reinforcement Learning

Support Vector Machines

Training Neural Networks

License

This code is released under the Apache 2.0 License. Please see LICENSE for more information.
