Comments (8)

nicholas-leonard commented on July 17, 2024

When calling forward on recurrent modules during training, the output will be different for each time-step, so this isn't a problem. This is because the recurrent modules (i.e. AbstractRecurrent instances) manage the memory allocated to different time-steps internally. That way you can call forward multiple times on a recurrent module to produce a different output (with its own memory) every time. Does this answer your question?

from dpnn.

eriche2016 commented on July 17, 2024

Hi Nicholas, thanks for your reply. Can you be more specific? I noticed that in https://github.com/Element-Research/rnn/blob/master/Recurrent.lua#L73, if recurrentModule references the same module at each time-step, then wouldn't this module's (i.e. recurrentModule's) output reference the same chunk of memory? Besides, I don't see any strategy like copy or clone here to avoid that problem. Doesn't this cause a problem, or am I misunderstanding something?

eriche2016 commented on July 17, 2024

What do you mean by "internally"? Can you point it out?

nicholas-leonard commented on July 17, 2024

@eriche2016 So when an AbstractRecurrent calls getStepClone to get a clone for that time-step, there are two use cases. If the internal recurrentModule is also an AbstractRecurrent instance, then it returns itself when calling stepClone(); otherwise it creates a sharedClone. The latter is a clone of itself where the parameters and gradParameters are shared between the clone and the original. In the former case, the recurrentModule, which is also an AbstractRecurrent, will do its own internal stepClone of its recurrentModule when forward is called and the time-step is incremented. In that case, the resulting clone shares parameters with the original, but has its own output, gradInput and other such stateful tensors. Get it?
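
To make the shared-clone idea above concrete, here is a minimal pure-Lua sketch. It is not the dpnn or rnn implementation: plain tables stand in for tensors, and the names newModule, sharedCloneSketch and forward are hypothetical helpers invented for this illustration. It only shows the property described in the comment, namely that parameters are shared by reference while each clone keeps its own output buffer.

```lua
-- Hypothetical stand-in for a module: tables play the role of tensors.
local function newModule(weight)
  return { weight = { weight }, gradWeight = { 0 }, output = {} }
end

-- Sketch of a "shared clone": parameters are shared by reference,
-- while stateful buffers such as output are freshly allocated.
local function sharedCloneSketch(module)
  return {
    weight = module.weight,         -- same table: shared with the original
    gradWeight = module.gradWeight, -- same table: shared with the original
    output = {},                    -- new table: private to this clone
  }
end

-- Toy forward pass: scale the input and store it in the module's own output.
local function forward(module, input)
  module.output[1] = module.weight[1] * input
  return module.output
end

local original = newModule(2)
local stepClones = {}
local outputs = {}
for step = 1, 3 do
  -- One clone per time-step, as an assumption of this sketch.
  stepClones[step] = sharedCloneSketch(original)
  outputs[step] = forward(stepClones[step], step)
end

-- Each time-step kept its own output buffer.
print(outputs[1][1], outputs[2][1], outputs[3][1])

-- Updating the shared parameter is visible through every clone.
original.weight[1] = 10
print(stepClones[1].weight[1], stepClones[3].weight[1])
```

In this sketch the three output tables are distinct objects, so a later forward call does not overwrite an earlier time-step's result, while a change to the original's weight is seen by every clone.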

eriche2016 commented on July 17, 2024

@nicholas-leonard Hi, thanks for your reply, but I am still confused. I declared a test model as below:
r = nn.Recurrent( 7, nn.LookupTable(100, 7), nn.Linear(7, 7), nn.Sigmoid(), 5 )
After checking the code, I inserted print(self.dpnn_stepclone) in the sharedClone method. In interactive mode, I ran the command below several times:
r:forward(torch.Tensor{2})
I get nil from the inserted print statement, which suggests that self.dpnn_stepclone is useless here. Note that self.dpnn_stepclone is set to true in AbstractRecurrent.lua:
https://github.com/Element-Research/rnn/blob/master/AbstractRecurrent.lua#L6. self.dpnn_stepclone is an attribute of the AbstractRecurrent class, but it is not an attribute of Module, so self.dpnn_stepclone is useless here. Am I correct, or am I missing something?

nicholas-leonard commented on July 17, 2024

@eriche2016 It should be print(self.dpnn_stepClone).

eriche2016 commented on July 17, 2024

@nicholas-leonard No, it is print(self.dpnn_stepclone); you can check it in https://github.com/Element-Research/rnn/blob/master/AbstractRecurrent.lua#L6 and https://github.com/Element-Research/dpnn/blob/master/Module.lua#L43. Note that I have also tested inside the if statement at https://github.com/Element-Research/dpnn/blob/master/Module.lua#L43, and it never prints true. I guess self.dpnn_stepclone is an attribute of the AbstractRecurrent class, not an attribute of the Module class. Am I correct? If so, it would be useless to use self.dpnn_stepclone in Module.lua.

eriche2016 commented on July 17, 2024

I see the difference now. The example I used was not appropriate, because r is a module of type nn.AbstractRecurrent; however, within the Recurrent.lua file, self.recurrentModule is not a module of type nn.AbstractRecurrent but a module of type nn.Module, so it has no self.dpnn_stepclone attribute. Thank you very much anyway.
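
The resolution above can be illustrated with a small pure-Lua sketch. It uses plain metatables rather than Torch classes, and the names Module, AbstractRecurrent and stepCloneSketch here are hypothetical stand-ins, not the actual dpnn or rnn code. The assumption, taken from the thread, is that the flag dpnn_stepclone is set only on the recurrent class, and that a step clone returns the module itself when the flag is set and builds a new object otherwise.

```lua
-- Hypothetical base class standing in for nn.Module.
local Module = {}
Module.__index = Module

function Module.new()
  return setmetatable({}, Module)
end

-- Mimics the branch discussed in the thread: return self when the
-- flag is set, otherwise build a new object (a stand-in for sharedClone).
function Module:stepCloneSketch()
  if self.dpnn_stepclone then
    return self
  end
  return Module.new()
end

-- Hypothetical subclass standing in for nn.AbstractRecurrent.
local AbstractRecurrent = setmetatable({}, { __index = Module })
AbstractRecurrent.__index = AbstractRecurrent
AbstractRecurrent.dpnn_stepclone = true -- flag defined on the subclass only

function AbstractRecurrent.new(recurrentModule)
  local self = setmetatable({}, AbstractRecurrent)
  self.recurrentModule = recurrentModule
  return self
end

local inner = Module.new()              -- plain module, like the internal recurrentModule
local r = AbstractRecurrent.new(inner)  -- recurrent wrapper, like r in the thread

print(r.dpnn_stepclone)                 -- flag found through the subclass
print(inner.dpnn_stepclone)             -- plain module has no such field
print(r:stepCloneSketch() == r)         -- recurrent instance returns itself
print(inner:stepCloneSketch() == inner) -- plain module gets a new object
```

Under these assumptions, a print placed where the plain inner module is cloned shows nil, which matches what was observed, while the flag still takes effect whenever the module being cloned is itself a recurrent instance.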
