
deeplearning.ai's Introduction

Hello World


I'm Anh, from 🇻🇳 and a 👨‍💻 who likes to build systems with DDD.

Currently interested in System Design 👷 🏗️ and Databases.

GitHub Statistics

Tech Toolbox 🧰

Database 🗃️

MySQL MongoDB

Infrastructure

AWS

Programming Language 💻

JS NodeJS TS Dart Python

UI 🎨

React Redux MUI

Mobile 📱

Flutter

ORM ⚙️

Sequelize

Tool ⚒️

Git

Testing 🧪

Jest

Deep Learning 🤖

Tensorflow

Architecture 🏗️

Clean Architecture + DDD

CQRS, Event Sourcing


deeplearning.ai's Issues

Random Initialization

If we initialize the parameters to 0, all hidden units will have the same influence on the output layer, i.e. the hidden units will be symmetric.


Why do we choose 0.01 instead of 100 or 1000? Because z = Wx + b, so if W is large, z will be large. If we then use tanh as the activation function, the output a will sit on the flat part of the tanh curve, so gradient descent (learning) will be slow.
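
A minimal numpy sketch of this initialization (not from the course materials; the layer sizes n_x and n_h are made-up example values): break symmetry with small random weights and keep the biases at zero.

```python
import numpy as np

# Example layer sizes (assumed for illustration): n_x inputs, n_h hidden units.
n_x, n_h = 3, 4

W1 = np.random.randn(n_h, n_x) * 0.01  # small random values keep z = Wx + b near 0,
                                       # where the tanh/sigmoid slope is still large
b1 = np.zeros((n_h, 1))                # zero biases are fine: symmetry is already
                                       # broken by the random W1
```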


Ref: https://www.youtube.com/watch?v=yXcQ4B-YSjQ&list=PLkDaE6sCZn6Ec-XTbcX1uRg2_u4xOEky0&index=35

Gradient Descent

Use Gradient Descent to find the pair (w, b) for Logistic Regression that minimizes the cost function J(w, b).
J(w, b) is a convex function.
We first initialize (w, b) to some values, say (w0, b0). Using gradient descent, we then move step by step down toward the minimum of J(w, b).


Alpha is the learning rate; it controls the size of each gradient descent step.
dJ(w, b)/dw is the derivative: it gives the update applied to w, and it is also the slope of the tangent to the graph of J(w, b).
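
A rough sketch of one gradient descent step for logistic regression, assuming X is an (n_x, m) feature matrix and Y is a (1, m) label vector (names and shapes are illustrative, not from the notes):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def gradient_descent_step(w, b, X, Y, alpha):
    m = X.shape[1]
    A = sigmoid(w.T @ X + b)      # predictions y_hat, shape (1, m)
    dw = (X @ (A - Y).T) / m      # dJ/dw, same shape as w
    db = np.sum(A - Y) / m        # dJ/db
    w = w - alpha * dw            # step in the direction opposite to the slope
    b = b - alpha * db
    return w, b
```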


Ref: https://www.youtube.com/watch?v=uJryes5Vk1o&list=PLkDaE6sCZn6Ec-XTbcX1uRg2_u4xOEky0&index=10

Activation Functions

Using the tanh function makes training easier because tanh has mean 0 (it centers your data around 0 instead of 0.5). For Binary Classification, however, you should use the sigmoid function for the output layer (to compute ŷ ∈ {0, 1}).

However, the slopes of both tanh and sigmoid become small when the input is very large or very small.

We also have the ReLU function:

a = max(0, z)

Conclusion:

  • Binary Classification ===> SIGMOID function for the OUTPUT LAYER.
  • ReLU (derivative = 0 when z < 0) and Tanh ===> use in HIDDEN LAYERS. Using ReLU makes your model learn faster than Tanh or Sigmoid (see the sketch after this list).
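
A small numpy sketch comparing the three activations and their slopes (the input vector z is just an example):

```python
import numpy as np

z = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])

sigmoid = 1 / (1 + np.exp(-z))   # output layer for binary classification
tanh = np.tanh(z)                # zero-mean, good for hidden layers
relu = np.maximum(0, z)          # a = max(0, z)

# Slopes: sigmoid and tanh saturate (slope ~ 0) for large |z|,
# while ReLU keeps a slope of 1 for every z > 0 -- hence faster learning.
d_sigmoid = sigmoid * (1 - sigmoid)
d_tanh = 1 - tanh ** 2
d_relu = (z > 0).astype(float)
```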


Ref: https://www.youtube.com/watch?v=Xvg00QnyaIY&list=PLkDaE6sCZn6Ec-XTbcX1uRg2_u4xOEky0&index=30

Exponentially Weighted Averages


If beta is larger, the weight on theta(t) is smaller, so the average adapts slowly to changes in temperature. With a smaller beta we average over a smaller window: we get a lot more noise, but the average adapts quickly to temperature changes.
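
A minimal sketch of the update v = beta * v + (1 - beta) * theta applied to a temperature series (the thetas list and the ewa name are assumptions for illustration):

```python
def ewa(thetas, beta=0.9):
    """Exponentially weighted average of a sequence of daily temperatures."""
    v = 0.0
    averages = []
    for theta in thetas:
        v = beta * v + (1 - beta) * theta  # larger beta -> smoother but slower to adapt
        averages.append(v)
    return averages

# Roughly averages over 1 / (1 - beta) days: beta=0.9 ~ 10 days, beta=0.98 ~ 50 days.
```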


Ref: https://www.youtube.com/watch?v=lAq96T8FkTw&list=PLkDaE6sCZn6Hn0vK8co82zjQtt3T2Nkqc&index=17

Dropout Regularization

With dropout we go through each layer and set a probability of eliminating each node in the NN. For example, with a coin toss, the probability of keeping/removing a node is 0.5/0.5.


d3: dropout vector for layer 3.
Shutting off 10% of the nodes corresponds to keep_prob = 0.9. With nodes shut off, a[3] is reduced, so to keep the expected value of z[4] unchanged we must divide a[3] by keep_prob. This (inverted dropout) technique ensures that the expected value of a[3] does not change, which makes test time easier because it avoids a scaling problem.


At test time, we don't use dropout.
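
A minimal sketch of inverted dropout for layer 3 at training time, assuming keep_prob = 0.9 and illustrative shapes:

```python
import numpy as np

keep_prob = 0.9                             # shut off ~10% of the nodes
a3 = np.random.randn(5, 10)                 # example activations of layer 3

d3 = np.random.rand(*a3.shape) < keep_prob  # dropout vector/mask for layer 3
a3 = a3 * d3                                # eliminate the masked nodes
a3 = a3 / keep_prob                         # scale up so the expected value of a3
                                            # (and hence z4 = W4 @ a3 + b4) is unchanged
```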


Ref: https://www.youtube.com/watch?v=D8PJAL-MZv8&list=PLkDaE6sCZn6Hn0vK8co82zjQtt3T2Nkqc&index=6

np.sum

Sum of array elements over a given axis.

import numpy as np

np.sum([[0, 1], [0, 5]], axis=0)  # axis=0: sum vertically (down the columns) --> [0, 6]
np.sum([[0, 1], [0, 5]], axis=1)  # axis=1: sum horizontally (across the rows) --> [1, 5]
np.sum([[0, 1], [0, 5]])          # no axis specified: sum of all elements --> 6


Image Ref: https://qiita.com/Phoeboooo/items/b464b7df3c64a33caf94

Bias/Variance

High bias: the classifier is too simple (it under-fits the data).
High variance: the classifier is too complicated (it over-fits the data).


Look at the training set error to check for a "high bias" problem. Look at the dev set error (relative to the training error) to check for a "high variance" problem. Sometimes the Bayes error may itself be high (e.g. blurry images, ...).


The purple line has both "high bias" and "high variance": the linear part under-fits (high bias), while the curved part over-fits (high variance).
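
As a rough illustration of the diagnostic above, here is a hypothetical helper (not from the notes) that compares training and dev errors against arbitrary example thresholds:

```python
def diagnose(train_error, dev_error, bayes_error=0.0):
    high_bias = (train_error - bayes_error) > 0.05    # training error far above Bayes error
    high_variance = (dev_error - train_error) > 0.05  # dev error far above training error
    return high_bias, high_variance

print(diagnose(train_error=0.15, dev_error=0.16))  # (True, False): high bias
print(diagnose(train_error=0.01, dev_error=0.11))  # (False, True): high variance
print(diagnose(train_error=0.15, dev_error=0.30))  # (True, True): both problems
```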


Ref: https://www.youtube.com/watch?v=SjQyLhQIXSM&list=PLkDaE6sCZn6Hn0vK8co82zjQtt3T2Nkqc&index=2
