Comments (8)
Hi,
You can use this flag (https://github.com/shariqiqbal2810/MAAC/blob/master/utils/critics.py#L162) to return the attention weights of each agent over the other agents for all the time points that are passed in as input.
from maac.
Thanks for the instructions! But I still have two questions:
- Is the flag used for returning the attention weights of the samples collected in the training process?
- Can I obtain the attention weights at a fixed time step during evaluation (the decentralized execution process)?
from maac.
The flag is simply used whenever you call the forward pass on the critic module. Example:
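from utils.critics import AttentionCritic
critic = AttentionCritic(sa_sizes)
# inps (omitted in the original snippet): assumed to be the per-agent (state, action) inputs to the forward pass
rets = critic(inps, return_q=True, return_attend=True)
# rets[0][0] contains the Q-value for agent 0 for those inputs; rets[0][1] contains the attention weights for agent 0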
As such, the attention weights are calculated for whatever states and actions you pass into the critic during the forward pass, so you can calculate the attention weights both during training and execution if you would like.
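As a minimal sketch of the execution-time case, the loop below records attention weights step by step. It does not use the MAAC classes: `collect_attention`, `policy_fns`, `critic_fn`, `dummy_policy`, and `dummy_critic` are hypothetical stand-ins, and `critic_fn` is assumed to return one list of attention weights per agent for the (observation, action) pairs it receives.

```python
from typing import Callable, List, Sequence, Tuple

Obs = Sequence[float]
Action = int


def collect_attention(
    policy_fns: List[Callable[[Obs], Action]],
    critic_fn: Callable[[List[Tuple[Obs, Action]]], List[List[float]]],
    episode_obs: List[List[Obs]],
) -> List[List[List[float]]]:
    """Return attention weights indexed as [time step][agent][other agent]."""
    history = []
    for step_obs in episode_obs:
        # The critic needs actions as well as observations, so query each policy first.
        actions = [policy(obs) for policy, obs in zip(policy_fns, step_obs)]
        # Pair every agent's observation with its chosen action and call the critic.
        critic_in = list(zip(step_obs, actions))
        history.append(critic_fn(critic_in))
    return history


# Stand-ins used only to make the sketch runnable (not part of MAAC).
def dummy_policy(obs: Obs) -> Action:
    return int(sum(obs) > 0)


def dummy_critic(inps: List[Tuple[Obs, Action]]) -> List[List[float]]:
    # Uniform attention over the other agents, purely for illustration.
    n_others = len(inps) - 1
    return [[1.0 / n_others] * n_others for _ in inps]


# Four time steps, three agents, two-dimensional observations.
episode = [[[0.1, 0.2], [-0.3, 0.0], [0.5, 0.5]] for _ in range(4)]
weights = collect_attention([dummy_policy] * 3, dummy_critic, episode)
print(len(weights), len(weights[0]), len(weights[0][0]))
```

With the real critic, the stand-in `critic_fn` would be replaced by a forward pass with `return_attend=True`, keeping only the attention part of what it returns.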
from maac.
Thanks for your advice! In the execution process, the agents only get observations.
- Should I first get the actions from the policies and then send the obs-action pair to the critic to calculate the attention weights? Is there any method to calculate the weights depending only on observations?
- As for Figure 6 in your article, where the rover is paired with different towers: are the attention weights computed during training and averaged over several time points, or computed during execution?
- If the attention weights change dynamically within an episode, how should they be visualized? Thanks very much!
from maac.
- Yes, that is correct. The attention weights are calculated as part of the state-action value prediction network, so there is no way to get them without inputting actions.
- For Figure 6, the "attention entropy" is reported as an average over all data points in the mini-batch provided during training. It's important to note here that Figure 6 is not plotting the actual attention weights, but rather their entropy (i.e. how uniformly the attention weights are distributed).
- You can simply plot the attention weights on a per timepoint basis.
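To make the distinction between attention weights and their entropy concrete, here is a small sketch that computes the entropy at each time point. It uses the standard Shannon entropy in nats and is not taken from the MAAC code; the function names and the example weights are hypothetical.

```python
import math
from typing import List, Sequence


def attention_entropy(weights: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one attention distribution."""
    # Terms with zero weight contribute nothing, so they are skipped.
    return -sum(w * math.log(w) for w in weights if w > 0.0)


def entropy_per_timepoint(weights_over_time: Sequence[Sequence[float]]) -> List[float]:
    """Entropy of the attention weights at each time point."""
    return [attention_entropy(w) for w in weights_over_time]


# Hypothetical attention weights of one agent over three other agents
# at three consecutive time points.
example = [
    [1 / 3, 1 / 3, 1 / 3],  # uniform attention: maximum entropy, log(3)
    [0.8, 0.1, 0.1],        # mostly focused on one agent
    [1.0, 0.0, 0.0],        # fully focused on one agent: zero entropy
]
entropies = entropy_per_timepoint(example)
print(entropies)
```

Either the per-time-point weights themselves or this entropy series can then be passed to whatever plotting tool you prefer.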
from maac.
Thank you very much! In fact, the figure I meant is this one (maybe Figure 7 in your final version), captioned "Attention weights when subjected to different Tower pairings for Rover 1 in Rover-Tower environment".
Are these attention weights computed during training and averaged over several time points, or computed during execution?
from maac.
Oh I see. These are calculated from a single timepoint during execution.
from maac.
Hi, is all_attend_probs[i] the attention weights of agent i, or the attention weights of the other agents excluding agent i?
from maac.