
Comments (2)

zhixuan-lin commented on August 16, 2024

Hi @nevakanezzar,

This magic number was introduced in early versions of the code to keep z_pres_logits bounded. I don't remember exactly why we did that, but it may be that unbounded z_pres_logits caused numerical errors. I'm not 100% sure, but just using z_pres_logits = self.z_pres_net(cat_enc) may also be fine.

There are several things you may try to encourage the model to learn more objects:

  1. Set the object size prior z_scale_mean_start_value and z_scale_end_start_value to proper values. Otherwise, objects that are too large or too small might not be detected. You would want sigmoid(z_scale_mean_start_value) to be roughly size(object) / size(image).
  2. Start with a high z_pres_start_value, like 0.8.
  3. Increase the number of cells. That is, set G to 16 if you are currently using 8.
  4. Use a smaller bg_sigma than fg_sigma, and set proper values for fix_alpha_steps and fix_alpha_value. For example, for joint training on 10 Atari games, bg_sigma=0.1, fg_sigma=0.2, fix_alpha_value=0.1, fix_alpha_steps>2000 tend to work well. This one is a bit subtle and tends to have a huge impact on what is learned, but I'm not completely sure how the combination of these hyperparameters affects the training process. A reasonable combination to try is bg_sigma=0.15, fg_sigma=0.2, fix_alpha_value=0.1, fix_alpha_steps=2000.
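For the first point, a rough starting value can be obtained by inverting the sigmoid. The snippet below is only a sketch under stated assumptions: the helper name scale_prior_logit and the example sizes (a 16-pixel object in a 128-pixel image) are made up for illustration and are not part of the SPACE code base.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def scale_prior_logit(object_size: float, image_size: float) -> float:
    """Hypothetical helper: return v such that sigmoid(v) == object_size / image_size.

    This is just the logit (inverse sigmoid) of the size ratio.
    """
    ratio = object_size / image_size
    if not 0.0 < ratio < 1.0:
        raise ValueError("object_size must be positive and smaller than image_size")
    return math.log(ratio / (1.0 - ratio))


# Assumed example: objects about 16 px wide in a 128 px image.
start_value = scale_prior_logit(object_size=16, image_size=128)
print(f"z_scale_mean_start_value candidate: {start_value:.3f}")
print(f"sigmoid(candidate) = {sigmoid(start_value):.3f}")
```

The result is only a starting point; the size ratio itself is an estimate, so some tuning around this value is likely still needed.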

Please let me know if you need further clarification or help :)


ThomasRot commented on August 16, 2024

If someone stumbles upon this again, I'll add my interpretation of the 8.8 after working with SPACE a little:
We are passing logits, i.e. values that the (exponential-based) sigmoid turns into probabilities. But note that all these values lie in (-1, 1) after the z_pres result is passed through the tanh. Without scaling, the resulting probabilities would lie in (sigmoid(-1) ≈ 0.27, sigmoid(1) ≈ 0.73), which is unintuitive. Steepening the sigmoid by the factor 8.8 gives sigmoid(8.8 * -1) ≈ 0.00015 at the lower end instead. So in short, the 8.8 acts as a domain transfer from (-1, 1) to nearly the sigmoid's full range (0, 1).
So why exactly 8.8? I do not know; in Germany this would be seen as weird, but afaik in other places such a value is simply for good luck!
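To make the effect of the scaling concrete, here is a minimal sketch using only the Python standard library. It assumes the bounded logits are computed as scale * tanh(raw), as described above; the function name bounded_pres_prob and the sample raw values are hypothetical and not taken from the SPACE code.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def bounded_pres_prob(raw: float, scale: float = 8.8) -> float:
    """Squash a raw network output with tanh, scale it, then map it to a probability.

    The logit is confined to (-scale, scale), so the probability can never
    reach exactly 0 or 1.
    """
    logit = scale * math.tanh(raw)
    return sigmoid(logit)


# Compare the probability range with and without the 8.8 factor.
for raw in (-10.0, -1.0, 0.0, 1.0, 10.0):
    unscaled = bounded_pres_prob(raw, scale=1.0)
    scaled = bounded_pres_prob(raw, scale=8.8)
    print(f"raw={raw:+6.1f}  scale=1.0 -> {unscaled:.5f}  scale=8.8 -> {scaled:.5f}")
```

With scale=8.8 the probabilities stay within roughly (0.00015, 0.99985), so they can get close to 0 or 1 while the logits remain bounded.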

