
Comments (3)

yuchen-x commented on July 28, 2024

Dear Shariq,

I have the same question as jeanibarz.

Also, in Figure 1, the 2nd MLP layer receives the output of the 1st MLP. In your code, however, I assume the 2nd MLP layer is called self.critics:

[Image: Screenshot from 2020-12-18 10-31-14]

The input of self.critics is an encoding of the observations only (called s_encoding) plus other values, rather than the output of the 1st MLP layer:

[Image: Screenshot from 2020-12-18 10-34-50]

Is my understanding correct?

Thanks!


aklein1995 commented on July 28, 2024

By analyzing the code, I was led to the following conclusion.

Each agent has its own encoding function, state_encoder, which, contrary to what is said in the paper, computes the embedding e_i from the state only:

MAAC/utils/critics.py

Lines 52 to 59 in 6174a01

state_encoder = nn.Sequential()
if norm_in:
    state_encoder.add_module('s_enc_bn', nn.BatchNorm1d(
                                sdim, affine=False))
state_encoder.add_module('s_enc_fc1', nn.Linear(sdim,
                                                hidden_dim))
state_encoder.add_module('s_enc_nl', nn.LeakyReLU())
self.state_encoders.append(state_encoder)

However, as explained in the paper, we also need the embeddings of the other agents, e_j. These come from self.critic_encoders, which take both the state and the action of each agent and are shared:

MAAC/utils/critics.py

Lines 35 to 44 in 6174a01

for sdim, adim in sa_sizes:
    idim = sdim + adim
    odim = adim
    encoder = nn.Sequential()
    if norm_in:
        encoder.add_module('enc_bn', nn.BatchNorm1d(idim,
                                                    affine=False))
    encoder.add_module('enc_fc1', nn.Linear(idim, hidden_dim))
    encoder.add_module('enc_nl', nn.LeakyReLU())
    self.critic_encoders.append(encoder)

Later, in the forward function, the current agent's own key and value outputs are discarded:

MAAC/utils/critics.py

Lines 128 to 130 in 6174a01

for i, a_i, selector in zip(range(len(agents)), agents, curr_head_selectors):
    keys = [k for j, k in enumerate(curr_head_keys) if j != a_i]
    values = [v for j, v in enumerate(curr_head_values) if j != a_i]
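To make the filtering step concrete, here is a minimal pure-Python sketch of one attention head for a single agent. It assumes scaled dot-product attention with a softmax over the remaining agents. The function name attend_to_others and the toy vectors are hypothetical and not taken from the repository, whose code works on batched PyTorch tensors.

```python
import math

def attend_to_others(a_i, selector, keys, values):
    """Attention output for agent a_i over all *other* agents (one head).

    selector: query vector of agent a_i
    keys, values: one vector per agent, indexed by agent id
    """
    # Drop the agent's own key/value, mirroring the `if j != a_i` filter.
    other_keys = [k for j, k in enumerate(keys) if j != a_i]
    other_values = [v for j, v in enumerate(values) if j != a_i]

    # Scaled dot-product logits between the query and each remaining key.
    scale = math.sqrt(len(selector))
    logits = [sum(q * k for q, k in zip(selector, key)) / scale
              for key in other_keys]

    # Numerically stable softmax over the other agents.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the other agents' value vectors.
    dim = len(other_values[0])
    out = [sum(w * v[d] for w, v in zip(weights, other_values))
           for d in range(dim)]
    return out, weights
```

With N agents, each agent therefore attends over N - 1 entries, so its own state-action embedding never reaches its own attention output.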

Thus, looking only at the code, I would say that Figure 1 is not completely correct, @jeanibarz.

Moreover, @yuchen-x, I think you are right: the 2nd MLP is called self.critics. The input of this second MLP is correct, since s_encodings corresponds to e_i, and the values in *other_all_values correspond to x_i:

MAAC/utils/critics.py

Lines 111 to 112 in 6174a01

# extract state encoding for each agent that we're returning Q for
s_encodings = [self.state_encoders[a_i](states[a_i]) for a_i in agents]

That is, state_encoder is the first MLP, and its output is e_i, which takes only the state into account.
self.critic_encoders takes both state and action into account, and is used to get e_j.
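As I read it, the data flow for one agent can be sketched roughly as below. This is only an illustration in plain Python: critic_input, mean_pool and the identity encoders are hypothetical stand-ins for the learned modules, and the real attention step is replaced by a simple average.

```python
def mean_pool(e_i, others):
    # Hypothetical stand-in for attention: average the other agents' embeddings.
    dim = len(others[0])
    return [sum(e[d] for e in others) / len(others) for d in range(dim)]

def critic_input(a_i, states, actions, encode_state, encode_state_action,
                 attend=mean_pool):
    # e_i: embedding of agent a_i computed from its state only.
    e_i = encode_state(states[a_i])
    # e_j: embeddings of every agent computed from state and action together.
    e_all = [encode_state_action(s + a) for s, a in zip(states, actions)]
    # x_i: summary of the other agents (agent a_i's own embedding is excluded).
    others = [e for j, e in enumerate(e_all) if j != a_i]
    x_i = attend(e_i, others)
    # The second MLP would receive the concatenation of e_i and x_i.
    return e_i + x_i

identity = lambda v: list(v)
states = [[1.0], [2.0], [3.0]]
actions = [[0.0], [1.0], [0.0]]
demo = critic_input(0, states, actions, identity, identity)
```

Note that agent 0's action never enters demo, which matches the observation that the input of the second MLP depends on the agent's state but not on its own action.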

This is not definitive evidence, just my own conclusion after looking at the code and the explanation in the paper.


shariqiqbal2810 commented on July 28, 2024

Hi all, sorry for the confusion. The deviation of the code from the figure is described in the section of the paper entitled "Multi-Agent Advantage Function." Essentially, we want to calculate a Q-value for each possible action such that we can compute an advantage function, so we remove actions from the input and feed the state alone to a separate encoder from which we compute queries. I guess my intention was that the simplified version (with no advantage function) was easier to understand visually in a figure, but I realize how that can be misleading. Hope this clears things up!
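As a rough illustration of why one Q-value per action is useful, the sketch below assumes a discrete action space and a baseline equal to the policy-weighted average of the Q-values. The function name advantage and the numbers are hypothetical, not taken from the repository.

```python
def advantage(q_values, policy_probs, action):
    """A(o, a) = Q(o, a) - sum over a' of pi(a' | o) * Q(o, a')."""
    # Baseline: expected Q-value under the agent's current policy.
    baseline = sum(p * q for p, q in zip(policy_probs, q_values))
    return q_values[action] - baseline

q = [1.0, 2.0, 4.0]       # one Q-value per possible action
pi = [0.5, 0.25, 0.25]    # policy probabilities for the same actions
adv = advantage(q, pi, 2)
```

Because the critic outputs all Q-values in one pass, the baseline needs no extra forward passes, which is only possible if the agent's own action is not part of the critic's input.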

