Comments (7)

nicodjimenez avatar nicodjimenez commented on July 22, 2024

@evbo ,
you are correct, my example was intended as multiple inputs, single output. Typically, there is an inner product layer + softmax on top of the network which takes the hidden state at every time point and predicts something (e.g. the current character). Since this would involve writing another layer and I wanted to keep the code as simple as possible, I implicitly made this extra layer w^T h, where w = [1,0,...] and h is the value of the hidden layer. Would it be less confusing if I added another layer?

With a single input and a single output, the network simply does not have enough complexity to predict a complex sequence. This would work a lot better if you added an inner product layer on top in order to decode the internal dynamics.
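To make the implicit readout concrete, here is a minimal sketch in plain Python. It is not the repository's code: the helper `readout`, the hidden values in `h`, and the weights in `w_learned` are all made up for illustration. It only shows that w^T h with w = [1,0,...] picks out the first hidden unit, while a separate inner product layer would mix all of them.

```python
def readout(w, h):
    """Inner product w^T h between a weight vector and a hidden state."""
    if len(w) != len(h):
        raise ValueError("w and h must have the same length")
    return sum(w_i * h_i for w_i, h_i in zip(w, h))


# Made-up hidden state of a 4-unit LSTM at one time step (assumption).
h = [0.3, -0.1, 0.7, 0.2]

# The implicit layer described above: w = [1, 0, ...].
w_fixed = [1.0] + [0.0] * (len(h) - 1)
pred_fixed = readout(w_fixed, h)  # identical to h[0]

# An explicit inner product layer would have free weights instead.
# These values are hypothetical, not learned from anything.
w_learned = [0.5, -1.0, 0.25, 2.0]
pred_learned = readout(w_learned, h)
```

With the fixed `w`, only the first hidden unit can influence the prediction; with free weights, every unit can.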

from lstm.

evbo avatar evbo commented on July 22, 2024

Thanks for confirming. This makes sense. I think the confusion is largely stylistic: some people come from a background where an output squashing function is customary, and some do not. Even according to Lipton's paper, the extra squashing function appears to be optional, with no evidence that it is necessary.


mishfaq avatar mishfaq commented on July 22, 2024

Could someone kindly help me understand what is going on here?
y_list = [-0.5, 0.2, 0.1, -0.5]
input_val_arr = [np.random.random(x_dim) for _ in y_list]

for _ in y_list
What does the underscore ( _ ) mean in the for loop above?

Please also elaborate; I am new to Python and neural networks.


ScottMackay2 avatar ScottMackay2 commented on July 22, 2024

You could search for the _ syntax yourself on Google (it is just a convention for saying that the variable will not be used). Also, this was clearly not meant as a Python tutorial; you should follow one first.
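For anyone else landing here with the same question, a small sketch of the convention follows. It swaps in the standard library's `random` module for `np.random.random` so that it runs without NumPy, and the value of `x_dim` is an assumption chosen for illustration.

```python
import random

x_dim = 3  # hypothetical input size, chosen for illustration
y_list = [-0.5, 0.2, 0.1, -0.5]

# `_` is an ordinary variable name. By convention it signals that the
# loop value is ignored and only the number of iterations matters.
input_val_arr = [[random.random() for _ in range(x_dim)] for _ in y_list]

# The same thing written as an explicit loop.
input_val_arr_long = []
for _ in y_list:
    input_val_arr_long.append([random.random() for _ in range(x_dim)])
```

Both versions build one random input vector per entry of `y_list`; the entries themselves are never read inside the loop.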


zackchase avatar zackchase commented on July 22, 2024


ScottMackay2 avatar ScottMackay2 commented on July 22, 2024

Sorry. It was not my intention to come across the wrong way.


MrLeexm avatar MrLeexm commented on July 22, 2024

@evbo ,
you are correct, my example was intended as multiple inputs, single output. Typically, there is an inner product layer + softmax on top of the network which takes the hidden state at every time point and predicts something (e.g. the current character). Since this would involve writing another layer and I wanted to keep the code as simple as possible, I implicitly made this extra layer w^T h, where w = [1,0,...] and h is the value of the hidden layer. Would it be less confusing if I added another layer?

With a single input and a single output, the network simply does not have enough complexity to predict a complex sequence. This would work a lot better if you added an inner product layer on top in order to decode the internal dynamics.

I sincerely hope you can answer my question soon; it is killing me. From your code, it looks like a multiple-input, multiple-output LSTM. You took the first value of state.h in each of the four Lstm_time nodes as the final output. Isn't this multiple output? As far as I know, single output means taking only the final node's output as the result. Am I right?
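The two readings in this question can be sketched side by side. This is only an illustration under stated assumptions: the hidden states in `h_per_step` are made up, and the squared-error loss is assumed for the example rather than taken from the repository.

```python
# Made-up hidden states for 4 time steps of a 3-unit LSTM (assumption).
h_per_step = [
    [0.1, 0.4, -0.2],
    [0.3, 0.0, 0.5],
    [-0.2, 0.6, 0.1],
    [0.4, -0.3, 0.2],
]
y_list = [-0.5, 0.2, 0.1, -0.5]

# Reading 1: one prediction per time step, taken from the first hidden unit.
preds_every_step = [h[0] for h in h_per_step]
loss_every_step = sum((p - y) ** 2 for p, y in zip(preds_every_step, y_list))

# Reading 2: only the final time step produces a prediction.
pred_last = h_per_step[-1][0]
loss_last = (pred_last - y_list[-1]) ** 2
```

In the first reading every time step contributes a target and an error term; in the second, only the last step does.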
