Comments (5)

drasmuss commented on June 16, 2024

Yeah, that definitely seems like something that'd be good to have. One challenge I can foresee with the config approach is that I don't think there's an easy way to distinguish between different regularization targets within an Ensemble. E.g., if I do net.config[nengo.Ensemble].l2_regularization, am I regularizing the encoders? biases? output activities?

One possibility is that nengo.Ensemble targets encoders, and nengo.ensemble.Neurons targets output activities. Biases wouldn't be targetable, but they aren't targetable with the Probe approach either (since biases aren't probeable), so we're not any worse off.
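
To make that concrete, the split might look like the following (a hypothetical sketch; the l2_regularization parameter doesn't exist, and configuring it on these classes is an assumption):

net.config[nengo.Ensemble].l2_regularization = 0.001          # regularizes encoders
net.config[nengo.ensemble.Neurons].l2_regularization = 0.001  # regularizes output activities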

Another option would be to use a helper-function approach, instead of the config system. Something like:

with nengo_dl.Simulator(net) as sim:
    # standard loss on the probe we care about
    my_loss = {my_probe: "mse"}
    # helper adds {probe: objective} entries implementing the regularization
    my_loss.update(nengo_dl.utils.l2_regularization(nengo.Ensemble, 0.001))
    sim.train(..., objective=my_loss)

Under the hood this would just be using the Probe approach, but it would automate the creation of the Probes and objective functions. The advantage is that we wouldn't have to add any new logic to nengo_dl; everything would be encapsulated within that helper function.
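
For illustration, a minimal sketch of what such a helper might do internally (the function name, signature, and the objective-function signature here are assumptions, not the actual nengo_dl API):

import nengo
import tensorflow as tf

def l2_regularization(net, weight=0.001):
    # build a {probe: objective} dict that penalizes the probed values
    losses = {}
    with net:
        for ens in net.all_ensembles:
            # probe the quantity to regularize (here, the decoded output)
            p = nengo.Probe(ens)
            # objective applied to the probed values during training;
            # `targets` is ignored since regularization needs no target signal
            losses[p] = lambda outputs, targets: weight * tf.nn.l2_loss(outputs)
    return losses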

The third option would be to go more in-depth, adding functionality directly to the nengo_dl Simulator to support regularization. Something like:

with nengo_dl.Simulator(net) as sim:
    sim.add_l2_regularization(nengo.Ensemble, "bias", 0.001)

The advantage of this approach is that it's more flexible, and would let us do things like target the biases. But I'm definitely a bit reluctant to add new top-level functions to the Simulator like that, as it adds complexity that is directly exposed to new users.

The fourth option would be the most general, allowing users to pass arbitrary Tensors for the objective. For example, you could do something like:

with nengo_dl.Simulator(net) as sim:
    # sum an L2 penalty over every trainable variable in the graph
    reg_loss = tf.reduce_sum([tf.nn.l2_loss(v) for v in tf.trainable_variables()])
    sim.train(..., objective=reg_loss)

That is, rather than us trying to add support for these things directly in nengo_dl, we just let users do whatever they want through TensorFlow, and make it easier to insert that TensorFlow logic into a nengo_dl model. The advantage of this approach is that it is the most flexible (users could do all kinds of things, not just regularization). But it requires users to be more familiar with TensorFlow.
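
For instance, a user could regularize only a subset of the variables, which the other options wouldn't easily support (the name filter below is just an assumption about how the variables happen to be named):

# collect only the variables whose names mention encoders
enc_vars = [v for v in tf.trainable_variables() if "encoders" in v.name]
reg_loss = 0.001 * tf.reduce_sum([tf.nn.l2_loss(v) for v in enc_vars])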

Another consideration: rather than directly specifying the regularization type through the parameter, we could allow users to pass the regularization function they want. E.g.:

net.config[nengo.Connection].regularization = tf.nn.l2_loss

I like this because it means that users can now use the same method for whatever regularization type they want. The disadvantage is that it's a bit more awkward to specify scaling weights:

net.config[nengo.Connection].regularization = lambda x: 0.001 * tf.nn.l2_loss(x)

That doesn't seem toooo bad though?
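
If the inline lambda does start to feel clunky, one tidier alternative would be something like functools.partial (just a sketch of the idea, not part of the proposal):

import functools
import tensorflow as tf

def scaled_l2(x, weight):
    # weighted L2 penalty
    return weight * tf.nn.l2_loss(x)

net.config[nengo.Connection].regularization = functools.partial(scaled_l2, weight=0.001)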

I think my overall inclination would be the net.config[nengo.Connection].regularization = tf.nn.l2_loss approach, to start. I'd keep an eye on it though, and if it isn't meeting our needs, move to one of the more general approaches.

hunse commented on June 16, 2024

arvoelke commented on June 16, 2024

I'm finding that nengo_dl is over-fitting to my training data. Wondering what approach is currently recommended? I'm thinking the easiest solution right now would be to add noise to my training input data?

drasmuss commented on June 16, 2024

There are a lot of different ways to avoid overfitting. Adding more training data is a good approach; perturbing your data with noise is one way to do that, but you can use various data augmentation methods depending on what your data looks like (e.g. shifting, cropping, or just collecting/generating more raw data).

You could also add noise to the activations (firing rates) within your model. The most common way to do this nowadays is through dropout layers. You can set these up so that they are only active during training, so the dropout layers effectively disappear when you're running your network later.

Adding weight regularization can also help avoid overfitting. In practice you'd probably use a combination of all of those techniques, but if I were picking one to start with it would be adding more training data, as that is usually the easiest (it doesn't require any changes to your model).
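
As a starting point, noise-based augmentation can be as simple as the following (a minimal sketch, assuming your training inputs are in a numpy array called train_inputs; the noise scale is just an example value):

import numpy as np

rng = np.random.RandomState(seed=0)
# jitter each input with small Gaussian noise to create extra samples
noisy_inputs = train_inputs + rng.normal(scale=0.1, size=train_inputs.shape)
augmented_inputs = np.concatenate([train_inputs, noisy_inputs], axis=0)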

drasmuss commented on June 16, 2024

Some new tools for regularization were implemented in #73. It's basically the helper-function approach from above, with some modifications. I also added an option to reduce probe memory usage, as @hunse suggested. I'm going to close this for now, but we can definitely re-open it if we find we want to explore one of the other options discussed above.
