Comments (2)
Hi @nevakanezzar,
This magic number was introduced in early versions of the code to keep z_pres_logits
bounded. I don't remember exactly why we did that, but unbounded z_pres_logits
may have caused numerical errors. I'm not 100% sure, but simply using z_pres_logits = self.z_pres_net(cat_enc)
may also be fine.
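To make the "numerical errors" guess concrete, here is a minimal sketch in plain Python (not the SPACE code). It assumes the magic number is the factor 8.8 applied after a tanh, as described further down in this thread; `bound_logit` and the example value `raw = 40.0` are hypothetical. It shows one possible failure mode: an unbounded logit can make the sigmoid round to exactly 1.0, so that a term like log(1 - p) would be undefined. This happens even sooner in single precision.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def bound_logit(raw: float, scale: float = 8.8) -> float:
    """Hypothetical helper: squash a raw network output into (-scale, scale)."""
    return scale * math.tanh(raw)


raw = 40.0  # made-up large raw output, for illustration only

p_unbounded = sigmoid(raw)             # rounds to exactly 1.0 in float64
p_bounded = sigmoid(bound_logit(raw))  # stays strictly below 1.0

print("unbounded saturates:", p_unbounded == 1.0)
print("bounded 1 - p:", 1.0 - p_bounded)
```

With the bounded version, 1 - p stays around 1.5e-4, so its logarithm is finite. Whether this was the original motivation is not confirmed by the authors.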
There are several things you can try to encourage the model to learn more objects:
- Set the object size priors z_scale_mean_start_value and z_scale_end_start_value to proper values. Otherwise, objects that are too large or too small might not be detected. You would want sigmoid(z_scale_mean_start_value) to be roughly size(object) / size(image).
- Start with a high z_pres_start_value, like 0.8.
- Increase the number of cells, i.e. set G to 16 if you are currently using 8.
- Use a smaller bg_sigma than fg_sigma, and set proper values for fix_alpha_steps and fix_alpha_value. For example, for joint training on 10 Atari games, bg_sigma=0.1, fg_sigma=0.2, fix_alpha_value=0.1, fix_alpha_steps>2000 tend to work well. This one is a bit subtle and tends to have a huge impact on what is learned, but I'm not completely sure how the combination of these hyperparameters affects the training process. A reasonable combination to try is bg_sigma=0.15, fg_sigma=0.2, fix_alpha_value=0.1, fix_alpha_steps=2000.
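As a rough illustration of the size-prior rule above, the start value can be obtained by inverting the sigmoid (the logit function). This is only a sketch: the helper name `scale_mean_start_value`, the 16 px object and the 128 px image are made-up examples, and the flat dictionary does not reflect how the SPACE config files are actually laid out.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def logit(p: float) -> float:
    """Inverse of the sigmoid, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def scale_mean_start_value(object_size: float, image_size: float) -> float:
    """Hypothetical helper: choose v so that sigmoid(v) ~= object_size / image_size."""
    ratio = object_size / image_size
    if not 0.0 < ratio < 1.0:
        raise ValueError("object must be smaller than the image")
    return logit(ratio)


# Assumed example: objects about 16 px wide in a 128 px wide image.
v = scale_mean_start_value(16, 128)
print(f"z_scale_mean_start_value ~= {v:.3f}, sigmoid(v) = {sigmoid(v):.3f}")

# Values suggested in the comment above, collected in one place.
# The flat layout is an assumption, not the real config structure.
suggested = {
    "z_pres_start_value": 0.8,
    "G": 16,
    "bg_sigma": 0.15,
    "fg_sigma": 0.2,
    "fix_alpha_value": 0.1,
    "fix_alpha_steps": 2000,
}
```

For a size ratio of 1/8 this gives a start value of about -1.95; smaller objects push the value further into the negatives.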
Please let me know if you need further clarification or help :)
from space.
If someone stumbles upon this again, I'll add my interpretation of the 8.8 after working with SPACE a little:
We are passing logits, i.e. values that are turned into probabilities by the (exponential-based) sigmoid. But note that all these values lie in the domain (-1, 1) after the z_pres result is passed through the tanh. That means the resulting probabilities would lie in (sigmoid(-1) ~= 0.27, sigmoid(1) ~= 0.73), which is unintuitive. Steepening the sigmoid by the factor 8.8 gives sigmoid(8.8 * -1) ~= 0.00015 and sigmoid(8.8 * 1) ~= 0.99985. So in short, the 8.8 acts as a domain transfer from (-1, 1) to nearly the full sigmoid range (0, 1).
So why exactly 8.8? I do not know; in Germany this would be seen as weird, but afaik in other places such a value is simply for good luck!
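The range argument is easy to check numerically. The snippet below is a small sketch in plain Python, independent of the SPACE code; `presence_prob_range` is a hypothetical helper that returns the limiting probabilities for a tanh output scaled by a given factor.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def presence_prob_range(scale: float) -> tuple[float, float]:
    """Limits of sigmoid(scale * t) as t ranges over tanh's output interval (-1, 1)."""
    return sigmoid(-scale), sigmoid(scale)


# Compare no scaling (factor 1) with the factor 8.8 discussed above.
ranges = {scale: presence_prob_range(scale) for scale in (1.0, 8.8)}

for scale, (low, high) in ranges.items():
    print(f"scale={scale}: p in ({low:.5f}, {high:.5f})")
```

Without scaling, the presence probability can never drop below roughly 0.27 or rise above roughly 0.73; with the factor 8.8 it covers almost all of (0, 1).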