Comments (4)

botev commented on August 17, 2024

OK, to understand this part you need to read a bit of the advanced docs here.

KFAC works in two stages: it first marks/labels certain computations with specific "layer_tags", and then separately assigns a block class to each tag.
The generic_tag is only compatible with the NaiveDiagonal and NaiveFull curvature blocks, while the Dense blocks are currently compatible only with the dense_tag.

So what needs to happen is that your model must contain a correctly tagged dense computation in order to use the DenseTwoKroneckerFactored block. If this is not picked up automatically, you can call kfac_jax.register_dense in your code to manually register your computation as dense.
I can't comment on why it is not picked up automatically without seeing your model code.
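
A minimal sketch of manual registration, in case it helps. The layer `my_dense`, the parameter names and the shapes are hypothetical, and the positional argument order `(y, x, w, b)` reflects my reading of `kfac_jax.register_dense`, so please check it against your installed version. The fallback branch is only there so the forward pass can be inspected without kfac_jax; it does no tagging.

```python
import jax
import jax.numpy as jnp

try:
    import kfac_jax
    register_dense = kfac_jax.register_dense
except ImportError:
    # Illustration-only fallback: returns the output untouched, no tagging.
    def register_dense(y, x, w, b=None, **kwargs):
        return y


def my_dense(params, x):
    """Hypothetical hand-written dense layer with separate weight and bias."""
    w, b = params["w"], params["b"]
    y = x @ w + b
    # Manually mark (x, w, b) -> y as a dense layer computation.
    # Assumption: the tag returns y unchanged in value.
    return register_dense(y, x, w, b)


key = jax.random.PRNGKey(0)
params = {"w": jax.random.normal(key, (3, 2)), "b": jnp.zeros((2,))}
x = jnp.ones((4, 3))  # batch of 4 inputs with 3 features each
y = my_dense(params, x)
```

The returned value is what the rest of the model should consume, so that the tag stays part of the traced computation.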

from kfac-jax.

botev commented on August 17, 2024

PS: If you resolve your problem, please do close the issue.

joeryjoery commented on August 17, 2024

Right, I see. I was using my own Dense layers, defined with hk.Module, and I assumed that the optimizer would recognize them automatically. My implementation is a bit different from hk.Linear: I was writing my own implementation of KFAC, and variants thereof, so I concatenated the bias vector into the weight matrix. Could this be the cause?

Anyways, I can indeed get it working when I switch out my implementation for the hk.Linear modules. Your comment clears up my confusion! Thanks!

botev commented on August 17, 2024

Yes. If the weights and bias enter as separate parameters, but you then concatenate them and use the concatenated result, KFAC will not recognize this automatically, because it has no notion that it is a dense layer computation.
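
To make the difference concrete, here is a small plain-JAX sketch comparing the two formulations. The function names and shapes are made up for illustration. My assumption, not something stated in this thread, is that automatic registration relies on matching patterns in the traced computation, so the extra `concatenate` steps between the parameters and the matrix product would be what hides the dense layer.

```python
import jax
import jax.numpy as jnp


def dense_separate(w, b, x):
    # Standard form: the parameters feed the matmul and the add directly.
    return x @ w + b


def dense_concatenated(w, b, x):
    # Bias folded into the weight matrix, input padded with a column of ones.
    wb = jnp.concatenate([w, b[None, :]], axis=0)
    ones = jnp.ones((x.shape[0], 1), dtype=x.dtype)
    x_aug = jnp.concatenate([x, ones], axis=1)
    return x_aug @ wb


w = jnp.arange(6.0).reshape(3, 2)
b = jnp.array([0.5, -0.5])
x = jnp.ones((4, 3))

# Both forms compute the same values, but their traced programs differ.
jaxpr_separate = str(jax.make_jaxpr(dense_separate)(w, b, x))
jaxpr_concatenated = str(jax.make_jaxpr(dense_concatenated)(w, b, x))
print(jaxpr_concatenated)
```

Printing the two jaxprs side by side shows the additional operations in the concatenated version, even though the outputs agree numerically.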
