
jupyter-notebooks's People

Contributors

rdipietro

jupyter-notebooks's Issues

b=b-7 ?

There is an error in the first equation.

Why, in order to get the smallest number of bits, must we use the log of the probability of occurrence?

Hi,

Great article!
I have a question; I'm not sure whether it's OK to post it here as an issue. If it's not, I'll delete it.

When talking about encoding elements of a distribution in order to minimise the number of bits, your article says this:

It turns out that if you have access to the underlying distribution y, then to use the smallest number of bits on average, you should assign log(1/y_i) bits to the i-th symbol.

Why log(1/y_i)?

How can we prove that this is the minimum?
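
One standard way to see it, sketched here as an outside note rather than the post's own derivation, is via Gibbs' inequality, under the idealization that code lengths need not be integers. Any lengths $l_i$ satisfying the Kraft equality $\sum_i 2^{-l_i} = 1$ can be written as $l_i = \log_2(1/q_i)$ for some distribution $q$, and then:

```latex
\sum_i y_i \, l_i
  = \sum_i y_i \log_2 \frac{1}{q_i}
  = \underbrace{\sum_i y_i \log_2 \frac{1}{y_i}}_{H(y)}
  + \underbrace{\sum_i y_i \log_2 \frac{y_i}{q_i}}_{D_{\mathrm{KL}}(y \,\|\, q) \;\ge\; 0}
```

Since the KL term is nonnegative and vanishes exactly when $q = y$, the expected length is minimized at $l_i = \log_2(1/y_i)$. (Restricting to integer lengths costs less than one extra bit per symbol, e.g. via Shannon coding, which is why the claim is about the average.)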

Some questions about notation

Hi Rob!

I enjoyed reading "A Friendly Introduction to Cross-Entropy Loss", but I have a few questions about the notation used, because I don't think I've come across it before.

I think the notation is explained well enough in the post, but I'm wondering whether it's standard notation in some fields, and if it is, whether you could point me towards a webpage/PDF providing an overview of the notation :)

It's the red parts that are new to me.

Scan - multiple inputs/outputs

Hi, I am a beginner in TensorFlow and I ran into your brilliant scan tutorial. I wonder whether you could share some guidance on iterating across multiple tensors at once and/or returning tuples from fn.

So far I've tried passing a list of tensors (possibly wrapped in tf.identity), but it doesn't work.

I believe this would be a valuable extension of your post :)

Petr
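
For reference, a minimal sketch of what seems to work: tf.scan accepts a (possibly nested) structure of tensors for both elems and initializer, with fn taking and returning matching structures. The tensors and step function below are made up for illustration (this runs eagerly in TF 2; under TF 1 you would evaluate the outputs inside a session):

```python
import tensorflow as tf

xs = tf.constant([1.0, 2.0, 3.0])
ys = tf.constant([10.0, 20.0, 30.0])

def step(acc, elems):
    # elems is a tuple holding one slice from each input tensor.
    x, y = elems
    running_sum, running_count = acc
    return running_sum + x * y, running_count + 1.0

# Passing tuples for both elems and initializer makes scan iterate the
# inputs in lockstep and return a matching tuple of stacked outputs.
sums, counts = tf.scan(
    step, (xs, ys),
    initializer=(tf.constant(0.0), tf.constant(0.0)))

# sums   -> [ 10.,  50., 140.]
# counts -> [  1.,   2.,   3.]
```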

Issues with the Cross Entropy and About Me links in the Contents section

I am not sure whether everyone can reproduce this issue, but in the Contents section I am not able to open the links for Cross Entropy and About Me.
Maybe you can give the links like this:

Contents

...

* [Cross Entropy](https://rdipietro.github.io/friendly-intro-to-cross-entropy-loss/#Cross-Entropy)
* [About Me](https://rdipietro.github.io/friendly-intro-to-cross-entropy-loss/#About-Me)

...

and it should work like a charm.

Actually, I believe you can fix the hyperlinks in markdown using the same structure, i.e., instead of using <url>/cross-entropy, use <url>/Cross-Entropy (the same letters as in the heading text, separated by hyphens), and it should work for all links in the Contents section.

Possible typo in "A Friendly Introduction to Cross-Entropy Loss"

First, thanks for the excellent tutorial! I've always wondered about the relationship between entropy and cross-entropy loss, and you explained it perfectly.

I did notice a possible typo in the "Predictive Power" section:

is just the first entry of $\hat{y}^{(1)} = (0.4, 0.1, 0.5)^T$, which is $y_1^{(1)} = 0.4$.

Shouldn't that last $y_1^{(1)}$ be $\hat{y}_1^{(1)}$?

Why are the samples assumed to be identically distributed?

I am probably miles off from an understanding here, and this is my first time raising an issue on GitHub, so I beg everyone's indulgence.
The post makes the point that

Because we usually assume that our samples are independent and identically distributed, the likelihood over all of our examples decomposes into a product over the likelihoods of individual examples:
$L(\{y^{(n)}\}, \{\hat{y}^{(n)}\}) = \prod_n L(y^{(n)}, \hat{y}^{(n)})$

(Sorry, first issue on GitHub; I don't know how to typeset math.)
Here is what my issue is:

  1. How come the samples are identically distributed? Doesn't each sample have its own ground-truth distribution, which is 0 everywhere except for a 1 at the actual label's place? For example (from the post):

To keep going with this example, let's assume we have a total of four training images, with labels {landscape, something else, landscape, house}, giving us ground-truth distributions $y^{(1)} = (1.0, 0.0, 0.0)^T$, $y^{(2)} = (0.0, 0.0, 1.0)^T$, $y^{(3)} = (1.0, 0.0, 0.0)^T$, and $y^{(4)} = (0.0, 1.0, 0.0)^T$.

So don't $y^{(1)}$, $y^{(2)}$, $y^{(3)}$, $y^{(4)}$ have different distributions?

  2. Is the identically distributed part actually needed anywhere? Isn't the entire mathematics of the post consistent with the independence assumption alone?
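
As a concrete check of the factorization above, here is a small NumPy sketch using the post's four ground-truth distributions; the predictions are made-up illustrative values, except the first row, which reuses $\hat{y}^{(1)} = (0.4, 0.1, 0.5)^T$ from the post:

```python
import numpy as np

# Ground-truth one-hot distributions from the post's four-image example.
y = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])

# Predicted distributions (illustrative values, not from the post,
# apart from the first row).
y_hat = np.array([[0.4, 0.1, 0.5],
                  [0.2, 0.2, 0.6],
                  [0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])

# Per-example likelihood: the probability the model assigns to the true
# class (for one-hot y, the dot product picks out that single entry).
per_example = np.sum(y * y_hat, axis=1)   # [0.4, 0.6, 0.7, 0.8]

# Independence is what licenses the product; "identically distributed"
# refers to the (image, label) pairs being draws from one underlying data
# distribution, not to the individual y^(n) vectors being equal.
total_likelihood = np.prod(per_example)   # 0.1344
```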

Distinguishing y and y_hat

I can never remember whether y_hat is supposed to be the target value or the prediction output, which makes skim-reading your article quite difficult.

Two possible solutions...

  1. Add a position:fixed legend that stays visible.
  2. Use clearer names, e.g. p for the prediction output, t for the true value.
