Comments (6)
This code is correct. The *2 is there because each center position is an (x, y) pair, so the next dense layer should have k*2 inputs.
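To make the k*2 point concrete, here is a minimal sketch in NumPy rather than the repository's actual code. The helper names (flatten_centers, dense_layer), the hidden size of 16, and the tanh activation are illustrative assumptions; only the idea that k (x, y) centers become 2*k inputs comes from the comment above.

```python
import numpy as np


def flatten_centers(centers):
    """Flatten a (k, 2) array of (x, y) patch centers into a length 2*k vector."""
    centers = np.asarray(centers, dtype=np.float64)
    if centers.ndim != 2 or centers.shape[1] != 2:
        raise ValueError("expected an array of shape (k, 2)")
    return centers.reshape(-1)


def dense_layer(inputs, weights, bias):
    """A plain fully connected layer with tanh (activation chosen for illustration)."""
    return np.tanh(weights @ inputs + bias)


top_k = 3
# Hypothetical normalized centers, one (x, y) row per selected patch.
centers = np.array([[0.9219, 0.9219],
                    [0.4219, 0.4844],
                    [0.8594, 0.9219]])

features = flatten_centers(centers)  # shape (top_k * 2,)

hidden_dim = 16  # assumed size, not taken from the repository
rng = np.random.default_rng(0)
weights = rng.standard_normal((hidden_dim, top_k * 2))  # input width must be k*2
bias = np.zeros(hidden_dim)

hidden = dense_layer(features, weights, bias)
print(features.shape, hidden.shape)
```

If the weight matrix were built with top_k columns instead of top_k * 2, the matrix product would fail with a shape mismatch, which matches the breakage reported when the *2 is removed.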
from brain-tokyo-workshop.
Thanks @lerrytang! I was just about to comment saying I tried removing it and the code's not working anymore :)
I guess I need to dive deeper into the attention module code to really understand what it's doing. Thanks again!
from brain-tokyo-workshop.
Again, thanks for your interest. You are always welcome to ask more questions :)
from brain-tokyo-workshop.
You are awesome @lerrytang! If you offer so generously... I'm confused by the interaction between SelfAttention and MLPSolution... As far as I can tell, the MLPSolution is only receiving the (x,y) coordinates of the best top_k patches... and making a decision on which action to take only with that info? From reading your article I thought that the agent could 'look' at those patches to make the decision, but I'm not seeing anywhere in the code where the MLPSolution is able to access the pixel info from the patches! Is this magic or what!?!?!? :P
For example, setting top_k to 1 and printing the dimension of MLPSolution's input (print('MLPSolution input', inputs, inputs.shape)), I get:
...
MLPSolution input tensor([0.9219, 0.9219]) torch.Size([2])
MLPSolution input tensor([0.4219, 0.4844]) torch.Size([2])
MLPSolution input tensor([0.9219, 0.9219]) torch.Size([2])
MLPSolution input tensor([0.4219, 0.4844]) torch.Size([2])
MLPSolution input tensor([0.8594, 0.9219]) torch.Size([2])
...
So... where is the agent actually looking at the pixel input from the patches??!?!?
from brain-tokyo-workshop.
That's the interesting part :)
The agent has 2 parts: the self-attention visual module (SelfAttention) and the controller. MLPSolution is the latter (despite its name).
The former part gets all the RGB images and does the patch voting to get the K patches; this is the only place in the code where image info is looked at. After that we discard the unimportant patches, extract features from the K selected ones, and feed those features to the controller. As for feature extraction, one could extract any feature from these K patches, but in our experiments we simply used the locations and disregarded the content info, which is why you don't see any pixel-processing code in MLPSolution.
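The flow described above can be sketched roughly as follows. This is a NumPy illustration under stated assumptions, not the repository's implementation. The function names, the random query/key projections, and the settings (64-pixel image, patch size 7, stride 4) are all hypothetical. Those settings were picked only because they give centers such as 59/64 = 0.921875, which is consistent with the 0.9219 in the printout earlier in this thread.

```python
import numpy as np


def extract_patches(image, patch_size, stride):
    """Slide a window over an (H, W, C) image; return flattened patches, row-major."""
    h, w, _ = image.shape
    rows = range(0, h - patch_size + 1, stride)
    cols = range(0, w - patch_size + 1, stride)
    return np.stack([image[r:r + patch_size, c:c + patch_size].ravel()
                     for r in rows for c in cols])


def patch_centers(image_size, patch_size, stride):
    """Pixel (x, y) center of every patch, in the same order as extract_patches."""
    n = (image_size - patch_size) // stride + 1
    offsets = patch_size // 2 + stride * np.arange(n)
    ys, xs = np.meshgrid(offsets, offsets, indexing="ij")
    return np.stack([xs.ravel(), ys.ravel()], axis=1)


def select_top_k_centers(image, w_q, w_k, top_k, patch_size=7, stride=4):
    """Patch voting: the only step that reads pixels. Returns normalized centers."""
    patches = extract_patches(image, patch_size, stride)        # (N, D)
    queries, keys = patches @ w_q, patches @ w_k                # (N, d) each
    scores = queries @ keys.T / np.sqrt(w_q.shape[1])           # (N, N)
    scores = scores - scores.max(axis=1, keepdims=True)         # stable softmax
    attention = np.exp(scores)
    attention /= attention.sum(axis=1, keepdims=True)
    votes = attention.sum(axis=0)                               # column sums
    top = np.argsort(votes)[::-1][:top_k]                       # K most-voted
    centers = patch_centers(image.shape[0], patch_size, stride)
    return centers[top] / image.shape[0]                        # pixel content dropped


rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))                 # stand-in for one RGB frame
w_q = rng.standard_normal((7 * 7 * 3, 4))       # hypothetical projections
w_k = rng.standard_normal((7 * 7 * 3, 4))

controller_input = select_top_k_centers(image, w_q, w_k, top_k=2).reshape(-1)
print(controller_input.shape)
```

Note that the controller input contains only 2 * top_k coordinates. The pixels influence which patches win the vote, but none of them are passed downstream.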
Hope this helps.
from brain-tokyo-workshop.
OMG that's amazing!! I guess I had missed that part in the article:
This is even more impressive then!! ok... I need to think about what all this means. Thanks again @lerrytang !
from brain-tokyo-workshop.