
Deep Learning from scratch

Reading log for the book "Deep Learning from Scratch" (Japanese title: 『ゼロから作る Deep Learning』)

Summary

Chapter 1: Introduction to Python

  • Basic Python3 syntax
  • How to use the following libraries (see the sketch below)
    • numpy
    • matplotlib
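
A minimal sketch of the numpy array and matplotlib plotting usage the chapter covers (a sin/cos plot of the kind shown in the chapter):

import numpy as np
import matplotlib.pyplot as plt

# numpy: element-wise arithmetic on arrays
x = np.arange(0, 6, 0.1)   # values from 0 to 6 in steps of 0.1
y1 = np.sin(x)
y2 = np.cos(x)

# matplotlib: plot both curves with labels and a legend
plt.plot(x, y1, label="sin")
plt.plot(x, y2, linestyle="--", label="cos")
plt.xlabel("x")
plt.ylabel("y")
plt.legend()
plt.show()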

Chapter 2: Perceptron

  • A perceptron is an algorithm that takes multiple inputs and produces a single output: it outputs a specific value (fires) when the weighted sum of its inputs exceeds a threshold
  • A perceptron has parameters: weights and a bias
  • Using perceptrons, we can implement logic circuits (such as AND/OR gates); see the sketch below
  • An XOR gate can be implemented with a perceptron of two or more layers
  • A single-layer perceptron can only represent linearly separable regions, but a multi-layer perceptron can represent non-linear regions
  • A multi-layer perceptron can, in principle, represent a computer
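
A minimal sketch of perceptron logic gates, following the chapter's approach (the specific weight and bias values are one possible hand-picked choice):

import numpy as np

def AND(x1, x2):
    # the perceptron fires (outputs 1) when the weighted sum plus bias exceeds 0
    x, w, b = np.array([x1, x2]), np.array([0.5, 0.5]), -0.7
    return 1 if np.sum(w * x) + b > 0 else 0

def OR(x1, x2):
    x, w, b = np.array([x1, x2]), np.array([0.5, 0.5]), -0.2
    return 1 if np.sum(w * x) + b > 0 else 0

def NAND(x1, x2):
    x, w, b = np.array([x1, x2]), np.array([-0.5, -0.5]), 0.7
    return 1 if np.sum(w * x) + b > 0 else 0

def XOR(x1, x2):
    # XOR is not linearly separable, so it is built from two layers of perceptrons
    return AND(NAND(x1, x2), OR(x1, x2))

for pair in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(pair, XOR(*pair))   # -> 0, 1, 1, 0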

Chapter 3: Neural Network

  • Sigmoid, ReLU, and other smooth functions are used as activation functions
  • Using Numpy's multi-dimensional array operations, neural networks can be implemented efficiently (see the sketch below)
  • Machine learning problems are broadly divided into classification and regression
  • Activation function of the output layer
    • Regression: identity function
    • Classification: softmax function
  • In classification problems, the number of output-layer neurons is set equal to the number of classes
  • Batch: a unit of input data processed at once
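
A minimal sketch of the activation functions and a small batched forward pass in numpy (the network weights here are arbitrary illustration values):

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def relu(x):
    return np.maximum(0, x)

def softmax(x):
    x = x - np.max(x)          # shift for numerical stability
    return np.exp(x) / np.sum(np.exp(x))

# 2-input, 3-hidden, 2-output network processed as a batch of two samples
X = np.array([[1.0, 0.5],
              [0.2, 0.8]])
W1 = np.array([[0.1, 0.3, 0.5],
               [0.2, 0.4, 0.6]])
b1 = np.array([0.1, 0.2, 0.3])
W2 = np.array([[0.1, 0.4],
               [0.2, 0.5],
               [0.3, 0.6]])
b2 = np.array([0.1, 0.2])

hidden = sigmoid(X @ W1 + b1)     # hidden layer activation
scores = hidden @ W2 + b2         # output layer scores (2 classes)
print(softmax(scores[0]))         # class probabilities for the first sample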

notice

PIL library installation

Install PIL via pip:

$ pip install pillow

MNIST data generation

Execute the following command:

$ python download_dataset.py

This script generates dataset/mnist.pkl.
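
A quick way to check the generated file (assuming, as in the book's dataset script, that the pickle holds a dict of numpy arrays keyed train_img / train_label / test_img / test_label; adjust the keys if download_dataset.py stores a different structure):

import pickle

with open("dataset/mnist.pkl", "rb") as f:
    mnist = pickle.load(f)

print(mnist.keys())
print(mnist["train_img"].shape)    # e.g. (60000, 784)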

Chapter 4: Training of Neural Network

  • Data sets used in machine learning are divided into training data and test data
  • The model learns from the training data, and its generalization ability is evaluated on the test data
  • Training a neural network uses a loss function as its index and updates the weight parameters so that the value of the loss function becomes small
  • When updating the weight parameters, the gradient of the weight parameters is used, and the weights are repeatedly updated in the gradient direction
  • Approximating the derivative by a finite difference with a small step is called numerical differentiation (see the sketch below)
  • The gradient of the weight parameters can be obtained by numerical differentiation
  • Numerical differentiation takes time to compute, but its implementation is simple. The somewhat more complicated backpropagation method implemented in the next chapter obtains the gradient much faster
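
A minimal sketch of numerical differentiation and a plain gradient-descent loop, along the lines of the chapter (the quadratic test function is just an illustration):

import numpy as np

def numerical_gradient(f, x):
    # central difference (f(x+h) - f(x-h)) / (2h), applied to each element of x
    h = 1e-4
    grad = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        tmp = x[idx]
        x[idx] = tmp + h
        fxh1 = f(x)
        x[idx] = tmp - h
        fxh2 = f(x)
        grad[idx] = (fxh1 - fxh2) / (2 * h)
        x[idx] = tmp              # restore the original value
    return grad

def gradient_descent(f, init_x, lr=0.1, steps=100):
    x = init_x
    for _ in range(steps):
        x -= lr * numerical_gradient(f, x)
    return x

f = lambda x: x[0] ** 2 + x[1] ** 2
print(gradient_descent(f, np.array([-3.0, 4.0])))   # converges toward (0, 0)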

notice

I have the 1st printing of this book, which contains many mistakes.

So I checked the errata and the current version of the sample code.

Some of the code no longer works as-is.

Chapter 5: Backpropagation

  • By using a computational graph, the computation process can be grasped visually
  • Nodes of a computational graph perform local computations
  • Forward propagation through the computational graph performs the normal computation; backpropagation through the graph yields the derivative at each node
  • By implementing the components of the neural network as layers, the gradient can be computed efficiently (backpropagation); see the sketch below
  • By comparing the results of numerical differentiation and backpropagation, it can be confirmed that the backpropagation implementation contains no errors (gradient check)
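
A minimal sketch of a computational-graph layer with forward and backward passes, in the spirit of the chapter's multiplication-node (apple price) example:

class MulLayer:
    # multiplication node: forward computes x*y, backward "swaps" the inputs
    def __init__(self):
        self.x = None
        self.y = None

    def forward(self, x, y):
        self.x, self.y = x, y
        return x * y

    def backward(self, dout):
        # d(x*y)/dx = y and d(x*y)/dy = x, scaled by the upstream gradient
        return dout * self.y, dout * self.x

# forward pass: 2 apples at 100 yen each, with 10% tax
apple_layer = MulLayer()
tax_layer = MulLayer()
apple_price = apple_layer.forward(100, 2)
total = tax_layer.forward(apple_price, 1.1)          # 220.0

# backward pass: sensitivity of the total to each input
dapple_price, dtax = tax_layer.backward(1.0)
dapple, dnum = apple_layer.backward(dapple_price)
print(total, dapple, dnum, dtax)                     # 220.0 2.2 110.0 200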

Chapter 6: Techniques for training

  • Besides SGD, there are well-known parameter-update methods such as Momentum, AdaGrad, and Adam (two of them are sketched below)
  • How the initial values of the weights are chosen is important for correct learning
  • The Xavier initial value and the He initial value are effective choices for weight initialization
  • By using Batch Normalization, learning proceeds faster and becomes more robust to the initial values
  • Weight decay and Dropout are regularization methods that suppress overfitting
  • An efficient way to search for hyperparameters is to gradually narrow down the range in which good values exist
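
A minimal sketch of two of the update rules mentioned above, written as small optimizer classes (the params/grads dict interface is similar to the one used in the book's sample code):

import numpy as np

class SGD:
    def __init__(self, lr=0.01):
        self.lr = lr

    def update(self, params, grads):
        # move each parameter a small step against its gradient
        for key in params:
            params[key] -= self.lr * grads[key]

class Momentum:
    def __init__(self, lr=0.01, momentum=0.9):
        self.lr = lr
        self.momentum = momentum
        self.v = None

    def update(self, params, grads):
        if self.v is None:
            self.v = {k: np.zeros_like(val) for k, val in params.items()}
        for key in params:
            # keep a velocity term so updates accumulate in a consistent direction
            self.v[key] = self.momentum * self.v[key] - self.lr * grads[key]
            params[key] += self.v[key]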

Chapter 7: Convolutional neural networks

  • A CNN adds convolution layers and pooling layers to the fully connected networks seen so far
  • Convolution and pooling layers can be implemented simply and efficiently using im2col (see the sketch below)
  • Visualizing a CNN shows that increasingly abstract information is extracted as the layers get deeper
  • Representative CNN architectures are LeNet and AlexNet
  • Big data and GPUs have contributed greatly to the development of deep learning
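
A minimal sketch of im2col and how it turns convolution into a single matrix multiplication (shapes assume NCHW layout; the book's library provides a similar utility):

import numpy as np

def im2col(input_data, filter_h, filter_w, stride=1, pad=0):
    # unfold every filter-sized patch into one row of a 2-D matrix
    N, C, H, W = input_data.shape
    out_h = (H + 2 * pad - filter_h) // stride + 1
    out_w = (W + 2 * pad - filter_w) // stride + 1

    img = np.pad(input_data, [(0, 0), (0, 0), (pad, pad), (pad, pad)], 'constant')
    col = np.zeros((N, C, filter_h, filter_w, out_h, out_w))
    for y in range(filter_h):
        y_max = y + stride * out_h
        for x in range(filter_w):
            x_max = x + stride * out_w
            col[:, :, y, x, :, :] = img[:, :, y:y_max:stride, x:x_max:stride]
    return col.transpose(0, 4, 5, 1, 2, 3).reshape(N * out_h * out_w, -1)

# convolution as matrix multiplication: 10 filters of 3x5x5 over a 3x7x7 input
x = np.random.rand(1, 3, 7, 7)
filters = np.random.rand(10, 3, 5, 5)
col = im2col(x, 5, 5)                       # (9, 75): 9 patch positions, 75 values each
out = col @ filters.reshape(10, -1).T       # (9, 10)
out = out.reshape(1, 3, 3, 10).transpose(0, 3, 1, 2)
print(out.shape)                            # (1, 10, 3, 3)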

Chapter 8: Deep learning

  • For many problems, performance improvements can be expected by making the network deeper
  • A recent trend in the image recognition competition ILSVRC is that deep learning methods monopolize the top ranks, and the networks used are also deep
  • Famous networks include VGG, GoogLeNet, and ResNet
  • Deep learning can be sped up with GPUs, distributed training, reduced numerical precision (bit reduction), and so on
  • Deep learning (neural networks) can be used not only for object recognition but also for object detection and segmentation
  • Applications of deep learning include image captioning, image generation, and reinforcement learning. Recently, deep learning is also expected to be used for autonomous driving

References

Author

Kota SAITO
