Comments (6)
For now there isn't much benefit in separating the optimizer from the module, even though it would be quite easy to do. This is because the code is 100% generated and the method signature makes sense: updating a module takes a mutable reference to an optimizer and the gradients.
We have to keep in mind that the Module trait declares how the parameters are serialized, deserialized, and updated, not how they are used. It's all about handling the state without any boilerplate code (all generated). At some point, it might be good to have a more generic trait to update the weights without an explicit dependency on the Gradients type and the Optimizer trait.
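To make the coupling concrete, here is a minimal sketch of what such a signature looks like. This is a hypothetical illustration, not Burn's actual API: the Gradients, Optimizer, Module, Linear, and Sgd names and signatures below are all assumptions made for the example.

```rust
// Hypothetical sketch (not Burn's actual API): the module's update method
// depends directly on the Gradients type and the Optimizer trait.
pub struct Gradients(pub Vec<f32>);

pub trait Optimizer {
    fn step(&mut self, param: &mut f32, grad: f32);
}

pub trait Module {
    // Updating requires a mutable reference to an optimizer and the gradients.
    fn update_params<O: Optimizer>(&mut self, grads: &Gradients, optim: &mut O);
}

pub struct Linear {
    pub weight: Vec<f32>,
}

impl Module for Linear {
    fn update_params<O: Optimizer>(&mut self, grads: &Gradients, optim: &mut O) {
        // Apply one optimizer step per parameter/gradient pair.
        for (w, &g) in self.weight.iter_mut().zip(grads.0.iter()) {
            optim.step(w, g);
        }
    }
}

pub struct Sgd {
    pub lr: f32,
}

impl Optimizer for Sgd {
    fn step(&mut self, param: &mut f32, grad: f32) {
        // Plain gradient descent: w <- w - lr * g.
        *param -= self.lr * grad;
    }
}
```

A more generic trait would abstract over both Gradients and Optimizer so the module no longer names them explicitly.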
from burn.
Hi,
Burn is still different from PyTorch in its module implementation. The Module trait and derive macro only define a state with its parameters and potentially other fields.
Computing the forward pass doesn't change the state of the module, and you are free to implement it as you choose. It doesn't have to be a method attached to the struct; you can pass the state to other functions if you prefer.
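For example, a forward pass can be a free function that only borrows the module state. This is a hypothetical sketch under that assumption; the Linear struct and forward function here are invented for illustration, not Burn's API.

```rust
// Hypothetical sketch: the forward pass as a free function that borrows the
// module state immutably, rather than a method that mutates it.
pub struct Linear {
    pub weight: f32,
    pub bias: f32,
}

pub fn forward(module: &Linear, x: f32) -> f32 {
    // Reads the state, never mutates it.
    module.weight * x + module.bias
}
```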
In your example, it seems that the Linear struct only contains the hyper-parameters, and the params are owned by the LinearParams struct, which is created by a method on the hyper-parameter struct. You could replicate that pattern with Burn if you wanted to by attaching the forward pass to the config struct as well.
I don't think this is a superior design, because you may need to pass both the state and the hyper-parameter struct around instead of just one, which increases the number of arguments your functions take for no apparent benefit.
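A rough sketch of the pattern being described, to show where the extra argument comes from. The LinearConfig and LinearParams names and the init/forward methods are assumptions made for this illustration, not any library's real API.

```rust
// Hypothetical sketch: the hyper-parameter (config) struct builds a separate
// parameter struct, and the forward pass is attached to the config, so both
// structs must be passed around together.
pub struct LinearConfig {
    pub d_input: usize,
    pub d_output: usize,
}

pub struct LinearParams {
    pub weight: Vec<f32>, // row-major, d_output rows of d_input columns
}

impl LinearConfig {
    pub fn init(&self) -> LinearParams {
        LinearParams {
            weight: vec![0.0; self.d_input * self.d_output],
        }
    }

    // Forward needs the config AND the params: two arguments instead of one.
    pub fn forward(&self, params: &LinearParams, x: &[f32]) -> Vec<f32> {
        (0..self.d_output)
            .map(|o| {
                (0..self.d_input)
                    .map(|i| params.weight[o * self.d_input + i] * x[i])
                    .sum()
            })
            .collect()
    }
}
```

With the Module approach, the config is consumed at construction time and callers only ever handle one value.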
This is still interesting to think about, and there is no apparent limitation on how the Module trait is defined. Note that I just removed the Forward trait from Burn because there seems to be no need for an abstraction to define the forward pass. At the moment these are just plain, simple methods on modules.
You can also call them like functions:

```rust
let params = Forward::new(&config);
Linear::forward(params, x);
```
There are still some mutable methods on modules, namely updating the parameters with an optimizer and updating the current device. It might be better if those methods returned the updated state instead, like pretty much all tensor functions do. Do you think it would be an improvement?
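A minimal sketch of the functional alternative being proposed, assuming an owned-update style; the Linear struct and update signature are hypothetical, invented for this example.

```rust
// Hypothetical sketch: update consumes the module and returns the new state,
// mirroring how tensor functions return new tensors instead of mutating
// in place.
pub struct Linear {
    pub weight: f32,
}

impl Linear {
    pub fn update(self, grad: f32, lr: f32) -> Self {
        // Build and return the updated module instead of taking &mut self.
        Linear {
            weight: self.weight - lr * grad,
        }
    }
}
```

The same pattern would apply to device placement: a to_device method could consume the module and return the relocated one.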
Yes, I think that goes in the direction I imagined. It might be desirable if the Module were independent of the optimizer.
@nathanielsimard does your recent work overhauling the training and module APIs cover anything in this request? Since you had much discussion with James about the functional aspects of Burn, could you please tell us what we will and will not do? With that, we can reach a resolution and either close this issue or keep it open if some parts remain valid. Regardless, we should update this ticket while the work is recent and fresh.
I think we should close this issue, but it doesn't mean there isn't any more work to be done to improve the API. However, this issue is probably too vague to ever be considered complete. We might open more focused issues that link to this one for continuity.