Questions on method sharedClone in Module.lua (dpnn issue, closed)

element-research commented on July 17, 2024

Questions on method sharedClone in Module.lua

from dpnn.

Comments (8)

nicholas-leonard commented on July 17, 2024

When calling forward on recurrent modules during training, the output will be different for each time-step, so this isn't a problem. This is because the recurrent modules (i.e. AbstractRecurrent instances) internally manage the memory allocated to different time-steps. That way you can call forward multiple times on a recurrent module to produce a different output (with its own memory) every time. Does this answer your question?

eriche2016 commented on July 17, 2024

Hi Nicolas, thanks for your reply. Can you be more specific? I noticed that in https://github.com/Element-Research/rnn/blob/master/Recurrent.lua#L73, if recurrentModule references the same module at each time-step, wouldn't this module's (i.e. recurrentModule's) output reference the same memory chunk? Besides, I don't see any strategy like copy or clone here to avoid such a problem. Doesn't this cause a problem, or am I misunderstanding something?
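
The concern above can be reproduced in a toy setting. This is a minimal plain-Lua sketch, not code from rnn or dpnn: tables stand in for tensors, and the names module, forward and outputs are hypothetical. It only shows what would go wrong if one module with a single output buffer were reused at every time-step without any cloning.

```lua
-- Toy illustration only: tables stand in for tensors, nothing here is rnn/dpnn code.
local module = {weight = {2}, output = {}}

-- Toy forward: writes the result into the module's single output buffer.
local function forward(m, x)
   m.output[1] = m.weight[1] * x
   return m.output
end

-- Reuse the SAME module at every time-step and keep the returned outputs.
local outputs = {}
for t, x in ipairs({10, 20, 30}) do
   outputs[t] = forward(module, x)
end

-- Every stored output is the same table, so earlier steps were overwritten.
print(outputs[1] == outputs[3])
print(outputs[1][1], outputs[3][1])
```

All three entries of outputs point at the same table, so only the result of the last step survives.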

eriche2016 commented on July 17, 2024

What do you mean by "internally"? Can you point it out?

nicholas-leonard commented on July 17, 2024

@eriche2016 So when an AbstractRecurrent calls getStepClone to get a clone for that time-step, there are two use cases. If the internal recurrentModule is also an AbstractRecurrent instance, then it returns itself when calling stepClone(); otherwise it creates a sharedClone. The latter is a clone of itself where the parameters and gradParameters are shared between clone and original. In the former case, the recurrentModule, which is also an AbstractRecurrent, will do its own internal stepClone of its recurrentModule when forward is called and the time-step is incremented. In that case, the resulting clone shares parameters with the original, but has its own output, gradInput and other such stateful tensors. Get it?
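
The sharing described above can be sketched in plain Lua without Torch. This is only an illustration under assumptions: tables stand in for tensors, and newLinear, sharedClone and forward are hypothetical toy functions, not the dpnn implementation. The point is that each per-step clone references the same parameter table while owning its own output table.

```lua
-- Plain-Lua sketch; tables stand in for tensors.
-- newLinear, sharedClone and forward are hypothetical toys, not dpnn code.

local function newLinear(weight)
   return {weight = weight, output = {}}
end

-- The clone references the SAME weight table but gets a fresh output table.
local function sharedClone(module)
   return {weight = module.weight, output = {}}
end

-- Toy forward: scale the input by the single weight, write into output.
local function forward(module, x)
   module.output[1] = module.weight[1] * x
   return module.output
end

local original = newLinear({2})

-- One clone per time-step, all sharing the original's parameters.
local stepClones = {}
for t = 1, 3 do
   stepClones[t] = sharedClone(original)
end

local out1 = forward(stepClones[1], 10)
local out2 = forward(stepClones[2], 20)

-- Each time-step keeps its own output buffer.
print(out1 ~= out2, out1[1], out2[1])

-- A parameter update on the original is visible to every clone.
original.weight[1] = 3
local out3 = forward(stepClones[3], 10)
print(out3[1])
```

Because only the stateful tables are duplicated, memory for the parameters is paid once, while outputs from earlier time-steps stay intact for the backward pass.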

eriche2016 commented on July 17, 2024

@nicholas-leonard Hi, thanks for your reply, but I am still confused. I declared a test model as below:
r = nn.Recurrent(7, nn.LookupTable(100, 7), nn.Linear(7, 7), nn.Sigmoid(), 5)
After checking the code, I inserted print(self.dpnn_stepclone) in the method sharedClone. In interactive mode, I ran the command below several times:
r:forward(torch.Tensor{2})
I get nil from the inserted print statement, which suggests self.dpnn_stepclone is useless here. Note that self.dpnn_stepclone is set to true in AbstractRecurrent.lua:
https://github.com/Element-Research/rnn/blob/master/AbstractRecurrent.lua#L6. self.dpnn_stepclone is an attribute of the AbstractRecurrent class, but it is not an attribute of Module, so self.dpnn_stepclone is useless here. Am I correct, or am I missing something?

nicholas-leonard commented on July 17, 2024

@eriche2016 It should be print(self.dpnn_stepClone).

eriche2016 commented on July 17, 2024

@nicholas-leonard No, it is print(self.dpnn_stepclone); you can check it in https://github.com/Element-Research/rnn/blob/master/AbstractRecurrent.lua#L6 and https://github.com/Element-Research/dpnn/blob/master/Module.lua#L43. Note that I have also tested inside the if statement at https://github.com/Element-Research/dpnn/blob/master/Module.lua#L43, and it never prints true. I guess self.dpnn_stepclone is an attribute of the AbstractRecurrent class, not an attribute of the Module class. Am I correct? If so, it is useless to use self.dpnn_stepclone in Module.lua.

eriche2016 commented on July 17, 2024

I see the difference now. The example I used was not a proper one, because r is a module of type nn.AbstractRecurrent. However, within the Recurrent.lua file, self.recurrentModule is not a module of type nn.AbstractRecurrent but a plain nn.Module, so it has no self.dpnn_stepclone attribute. Thank you very much anyway.
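
The conclusion above can be checked with a plain-Lua sketch of class inheritance. This is an assumption-laden toy, not the actual dpnn or rnn code: Module, AbstractRecurrent and stepClone here are hypothetical stand-ins built with metatables, and the body of stepClone only mimics the kind of check discussed in this thread. A field set on the subclass is visible on subclass instances but reads as nil on a plain base-class instance.

```lua
-- Toy class hierarchy built with metatables; hypothetical stand-ins only.
local Module = {}
Module.__index = Module

function Module.new()
   return setmetatable({}, Module)
end

-- Stand-in for the check discussed above: recurrent modules return
-- themselves, everything else gets a fresh clone.
function Module:stepClone()
   if self.dpnn_stepclone then
      return self
   end
   return setmetatable({}, getmetatable(self))
end

-- Subclass that inherits from Module and sets the flag on the class itself.
local AbstractRecurrent = setmetatable({}, {__index = Module})
AbstractRecurrent.__index = AbstractRecurrent
AbstractRecurrent.dpnn_stepclone = true

function AbstractRecurrent.new()
   return setmetatable({}, AbstractRecurrent)
end

local plain = Module.new()
local recurrent = AbstractRecurrent.new()

print(plain.dpnn_stepclone)            -- nil: the flag is not defined on Module
print(recurrent.dpnn_stepclone)        -- true: inherited from the subclass
print(recurrent:stepClone() == recurrent)
print(plain:stepClone() == plain)
```

So the flag is not useless in the base-class method: the same method runs for both kinds of module, and the flag decides which branch is taken.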
