Reading log for the book "Deep Learning from Scratch" (Japanese: 『ゼロから作る Deep Learning』)
- Basic Python3 syntax
- How to use the following libraries
- numpy
- matplotlib
- A perceptron is an algorithm with inputs and an output: when the weighted sum of the inputs exceeds a threshold, it outputs a specific value
- A perceptron has two kinds of parameters: weights and a bias
- Using perceptrons, we can implement logic circuits (such as AND/OR gates)
- An XOR gate cannot be built with a single-layer perceptron, but it can be implemented with a two-layer (multi-layer) perceptron
- A single-layer perceptron can only represent linear regions, while a multi-layer perceptron can represent non-linear regions
- A multi-layer perceptron can (in principle) represent a computer
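The gates above can be sketched as simple perceptrons; the weight and bias values below are one valid choice (any values satisfying the gate's inequalities work):

```python
import numpy as np

def AND(x1, x2):
    # fires (returns 1) when the weighted sum plus bias exceeds 0
    x = np.array([x1, x2])
    return int(np.sum(np.array([0.5, 0.5]) * x) - 0.7 > 0)

def NAND(x1, x2):
    x = np.array([x1, x2])
    return int(np.sum(np.array([-0.5, -0.5]) * x) + 0.7 > 0)

def OR(x1, x2):
    x = np.array([x1, x2])
    return int(np.sum(np.array([0.5, 0.5]) * x) - 0.2 > 0)

def XOR(x1, x2):
    # not representable by a single layer, but stacking
    # NAND/OR and then AND (two layers) works
    return AND(NAND(x1, x2), OR(x1, x2))
```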
- Use sigmoid, ReLU, and other smooth functions as activation functions
- Using NumPy's multi-dimensional arrays, we can implement neural networks efficiently
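As a sketch, a tiny 2-3-2 forward pass written with NumPy matrix products; the specific weight values here are arbitrary illustrations, not from the book:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def relu(x):
    return np.maximum(0, x)

x = np.array([1.0, 0.5])                              # input (2,)
W1 = np.array([[0.1, 0.3, 0.5], [0.2, 0.4, 0.6]])     # (2, 3)
b1 = np.array([0.1, 0.2, 0.3])
z1 = sigmoid(np.dot(x, W1) + b1)                      # hidden layer (3,)
W2 = np.array([[0.1, 0.4], [0.2, 0.5], [0.3, 0.6]])   # (3, 2)
b2 = np.array([0.1, 0.2])
y = np.dot(z1, W2) + b2                               # output layer (2,)
```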
- Machine learning problems are broadly divided into classification and regression
- Activation function of the output layer
- Regression: identity function
- Classification: softmax function
- In classification problems, the number of output-layer neurons is set equal to the number of classes
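A minimal softmax sketch; subtracting the maximum before exponentiating avoids overflow without changing the result:

```python
import numpy as np

def softmax(a):
    # exp(a - c) / sum(exp(a - c)); subtracting the max c is safe
    # because the constant cancels in the ratio
    c = np.max(a)
    exp_a = np.exp(a - c)
    return exp_a / np.sum(exp_a)
```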
- Batch: a bundle of inputs processed together at once
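Batched prediction can be sketched like this; `predict()` is a hypothetical stand-in for a trained network's forward pass:

```python
import numpy as np

def predict(x_batch, W):
    # stand-in for a real network: one matrix product per batch
    return np.dot(x_batch, W)

np.random.seed(0)
x = np.random.rand(100, 4)    # 100 samples, 4 features each
W = np.random.rand(4, 3)      # maps 4 features to 3 class scores
batch_size = 25
pred = []
for i in range(0, len(x), batch_size):
    y = predict(x[i:i + batch_size], W)   # scores, shape (25, 3)
    pred.extend(np.argmax(y, axis=1))     # predicted class per sample
```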
PIL
install via pip install pillow
execute the following command:
$ python download_dataset.py
this script then generates dataset/mnist.pkl
- Data sets used in machine learning are divided into training data and test data
- Train on the training data, then evaluate the generalization ability of the trained model on the test data
- Training a neural network means using the loss function as an index and updating the weight parameters so that the value of the loss function becomes small
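Such a loss function can be sketched as the cross-entropy error for a one-hot label; the tiny delta guards against log(0):

```python
import numpy as np

def cross_entropy_error(y, t):
    # y: predicted probabilities, t: one-hot label
    delta = 1e-7  # avoids -inf when a predicted probability is 0
    return -np.sum(t * np.log(y + delta))

t = np.array([0, 0, 1, 0])              # correct class is index 2
good = np.array([0.1, 0.1, 0.7, 0.1])   # confident and correct
bad = np.array([0.7, 0.1, 0.1, 0.1])    # confident and wrong
```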
- When updating the weight parameters, an update step along the gradient direction is repeated, using the gradient of the loss with respect to the weights
- Computing a derivative from the difference of function values at slightly shifted inputs is called numerical differentiation
- The gradient of the weight parameters can be obtained by numerical differentiation
- Numerical differentiation is slow but simple to implement; the somewhat more complicated error backpropagation method, implemented in the next chapter, obtains the gradient much faster
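A numerical-gradient sketch using the central difference (f(x+h) - f(x-h)) / 2h for each element:

```python
import numpy as np

def numerical_gradient(f, x):
    h = 1e-4
    grad = np.zeros_like(x)
    for idx in range(x.size):
        tmp = x[idx]
        x[idx] = tmp + h
        fxh1 = f(x)           # f(x + h) in this coordinate
        x[idx] = tmp - h
        fxh2 = f(x)           # f(x - h) in this coordinate
        grad[idx] = (fxh1 - fxh2) / (2 * h)
        x[idx] = tmp          # restore the original value
    return grad

def f(x):
    # f(x0, x1) = x0^2 + x1^2; true gradient is (2*x0, 2*x1)
    return np.sum(x ** 2)

g = numerical_gradient(f, np.array([3.0, 4.0]))
```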
I have the 1st printing of this book, which has many mistakes,
so I checked the errata and the current version of the sample code.
Some of the code no longer works as-is.
- By using a computational graph, it's possible to visually grasp the computation process
- Each node of a computational graph performs a local calculation
- Forward propagation through the computational graph performs the normal computation, while backpropagation through it yields the derivative at each node
- By implementing the components of the neural network as layers, the gradient can be computed efficiently (backpropagation)
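A layer with forward/backward methods, in the spirit of the book's multiplication node (MulLayer); on the backward pass each input's gradient is the upstream gradient times the *other* stored input:

```python
class MulLayer:
    def __init__(self):
        self.x = None
        self.y = None

    def forward(self, x, y):
        self.x = x       # remember inputs for the backward pass
        self.y = y
        return x * y

    def backward(self, dout):
        dx = dout * self.y   # swap the stored inputs
        dy = dout * self.x
        return dx, dy

# the book's apple example: price = apple_price * count * tax
apple, num, tax = 100, 2, 1.1
mul_apple, mul_tax = MulLayer(), MulLayer()
price = mul_tax.forward(mul_apple.forward(apple, num), tax)

# backward from dprice = 1 gives d(price)/d(each input)
dapple_price, dtax = mul_tax.backward(1)
dapple, dnum = mul_apple.backward(dapple_price)
```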
- By comparing the gradients obtained from numerical differentiation and from backpropagation, we can confirm that the backpropagation implementation is free of errors (gradient check)
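A minimal gradient-check sketch on a single sigmoid unit: the analytic derivative sigmoid'(x) = y(1-y) should agree with a central difference to within tiny error:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

x = 0.5
y = sigmoid(x)
analytic = y * (1 - y)                               # backprop-style derivative
h = 1e-4
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)  # numerical derivative
diff = abs(analytic - numeric)                       # should be nearly zero
```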
- Besides SGD, famous parameter-update methods include Momentum, AdaGrad, and Adam
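SGD and Momentum can be sketched as update rules sharing one interface; the toy objective f(w) = w² (gradient 2w) is only for demonstration:

```python
import numpy as np

class SGD:
    # plain gradient descent: W <- W - lr * dW
    def __init__(self, lr=0.01):
        self.lr = lr

    def update(self, params, grads):
        for key in params:
            params[key] -= self.lr * grads[key]

class Momentum:
    # velocity per parameter: v <- m*v - lr*dW, then W <- W + v
    def __init__(self, lr=0.01, momentum=0.9):
        self.lr, self.momentum, self.v = lr, momentum, None

    def update(self, params, grads):
        if self.v is None:
            self.v = {k: np.zeros_like(val) for k, val in params.items()}
        for key in params:
            self.v[key] = self.momentum * self.v[key] - self.lr * grads[key]
            params[key] += self.v[key]

# minimize f(w) = w^2 with each optimizer
p_sgd = {'w': np.array([5.0])}
p_mom = {'w': np.array([5.0])}
sgd, mom = SGD(lr=0.1), Momentum(lr=0.1)
for _ in range(100):
    sgd.update(p_sgd, {'w': 2 * p_sgd['w']})
    mom.update(p_mom, {'w': 2 * p_mom['w']})
```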
- The way the initial weight values are chosen is important for learning to proceed correctly
- Xavier initialization and He initialization are effective choices for the initial weight values
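Both initializations scale a standard Gaussian by the fan-in n: Xavier uses std = sqrt(1/n) (suits sigmoid/tanh), He uses std = sqrt(2/n) (suits ReLU):

```python
import numpy as np

np.random.seed(0)
n_in, n_out = 100, 50
# Xavier: standard deviation 1/sqrt(n_in)
w_xavier = np.random.randn(n_in, n_out) / np.sqrt(n_in)
# He: standard deviation sqrt(2/n_in)
w_he = np.random.randn(n_in, n_out) * np.sqrt(2.0 / n_in)
```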
- By using Batch Normalization, learning proceeds faster and becomes more robust to the initial weight values
- Weight decay and Dropout are regularization methods that suppress overfitting
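A Dropout layer sketch along the lines of the book's version: during training, neurons are randomly zeroed; at test time every neuron is used, scaled by the fraction kept during training:

```python
import numpy as np

class Dropout:
    def __init__(self, dropout_ratio=0.5):
        self.dropout_ratio = dropout_ratio
        self.mask = None

    def forward(self, x, train_flg=True):
        if train_flg:
            # keep a neuron when its random draw exceeds the ratio
            self.mask = np.random.rand(*x.shape) > self.dropout_ratio
            return x * self.mask
        # test time: scale by the expected keep fraction
        return x * (1.0 - self.dropout_ratio)

    def backward(self, dout):
        # gradients flow only through the neurons that were kept
        return dout * self.mask
```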
- An efficient way to search for hyperparameters is to proceed while gradually narrowing down the range where good values exist
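One round of such a search can be sketched as random sampling on a log scale (the ranges below are illustrative, not prescribed by the book); after evaluating candidates, the ranges are narrowed around the best results:

```python
import numpy as np

np.random.seed(0)
# sample a candidate learning rate from 1e-6 .. 1e-2 on a log scale
lr = 10 ** np.random.uniform(-6, -2)
# sample a candidate weight-decay coefficient from 1e-8 .. 1e-4
weight_decay = 10 ** np.random.uniform(-8, -4)
```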
- A CNN adds convolution and pooling layers to the fully connected network used so far
- Convolution and pooling layers can be implemented simply and efficiently using im2col
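A minimal im2col sketch (single channel, stride 1, no padding): each output position becomes a row holding the patch it sees, so convolution reduces to one matrix product:

```python
import numpy as np

def im2col(img, fh, fw):
    # unroll every fh x fw patch of a 2-D image into a row
    h, w = img.shape
    out_h, out_w = h - fh + 1, w - fw + 1
    col = np.zeros((out_h * out_w, fh * fw))
    for i in range(out_h):
        for j in range(out_w):
            col[i * out_w + j] = img[i:i + fh, j:j + fw].ravel()
    return col

img = np.arange(16, dtype=float).reshape(4, 4)
filt = np.array([[1.0, 0.0], [0.0, -1.0]])   # img[i,j] - img[i+1,j+1]
col = im2col(img, 2, 2)                      # shape (9, 4)
out = (col @ filt.ravel()).reshape(3, 3)     # convolution as a matmul
```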
- Visualizing a CNN shows that increasingly abstract information is extracted as the layers get deeper
- Representative CNN architectures include LeNet and AlexNet
- Big data and GPUs have greatly contributed to the development of deep learning
- For many problems, performance improvements can be expected by making the network deeper
- A recent trend in the ILSVRC image-recognition competition is that deep learning methods monopolize the top ranks, and the networks used keep getting deeper
- Famous networks include VGG, GoogLeNet, and ResNet
- Deep learning can be sped up with GPUs, distributed training, reduced numerical precision, and so on
- Deep learning (neural networks) can be used not only for object recognition but also for object detection and segmentation
- Applications of deep learning include image captioning, image generation, and reinforcement learning. Recently, applying deep learning to autonomous driving has also come to be expected
- How to use Pillow (PIL), an image-processing library compatible with Python 3.5: installation of PIL
- CS231n