facebookresearch / collaq

A code implementation for our arXiv paper "Multi-agent Adhoc Team Play using Decompositional Q function"

License: Other

Python 100.00%

collaq's Introduction

Overview

We propose CollaQ, a novel way to decompose the Q function for decentralized policies in multi-agent modeling. On the StarCraft II Multi-Agent Challenge, CollaQ outperforms existing state-of-the-art techniques (i.e., QMIX, QTRAN, and VDN), improving the win rate by 40% with the same number of samples. In the more challenging ad hoc team play setting (i.e., reweighting, adding, or removing units without re-training or fine-tuning), CollaQ outperforms the previous SoTA by over 30%.

Please check our website for comprehensive results.

3-min video for paper introduction.

Please cite our arXiv paper if you use this codebase:

@article{zhang2020multi,
  title={Multi-Agent Collaboration via Reward Attribution Decomposition},
  author={Zhang, Tianjun and Xu, Huazhe and Wang, Xiaolong and Wu, Yi and Keutzer, Kurt and Gonzalez, Joseph E and Tian, Yuandong},
  journal={arXiv preprint arXiv:2010.08531},
  year={2020}
}

Note: We use SC2.4.6.2, and we ran all baselines ourselves with this version of SC2.

Installation instructions

git clone git@github.com:facebookresearch/CollaQ.git

The requirements.txt file can be used to install the necessary packages into a virtual environment with python == 3.6.0 (not recommended).

Install smac and sacred:

git submodule sync && git submodule update --init --recursive
cd third_party/sacred
git apply ../sacred.patch
cd ../smac
git apply ../smac.patch
cd ../pymarl
git apply ../pymarl.patch

Building up src folder for code

cd ../..
cp -r third_party/pymarl/src .
cp src_code/config/* src/config/algs/
cp src_code/controllers/* src/controllers/
cp src_code/learners/* src/learners/
cp src_code/modules/* src/modules/agents/

Results

Run an experiment

SC2PATH=.../pymarl/StarCraftII

QMIX

python src/main.py --config=qmix --env-config=sc2 with env_args.map_name=MMM2,

CollaQ

python src/main.py --config=qmix_interactive_reg --env-config=sc2 with env_args.map_name=MMM2,

CollaQ with Attn

python src/main.py --config=qmix_interactive_reg_attn --env-config=sc2 with env_args.map_name=MMM2,

CollaQ Removing Agents

python src/main.py --config=qmix_interactive_reg_attn --env-config=sc2 with env_args.map_name=29m_vs_30m,28m_vs_30m, obs_agent_id=False

CollaQ Removing Agents

python src/main.py --config=qmix_interactive_reg_attn --env-config=sc2 with env_args.map_name=27m_vs_30m,28m_vs_30m, obs_agent_id=False

CollaQ Swapping Agents

python src/main.py --config=qmix_interactive_reg_attn --env-config=sc2 with env_args.map_name=3s1z_vs_zg_easy,1s3z_vs_zg_easy,2s2z_vs_zg_easy, obs_agent_id=False

The config files are all located in src/config. --config refers to the config files in src/config/algs, and --env-config refers to the config files in src/config/envs.
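As a rough illustration of the lookup described above, the sketch below maps the two command-line names onto file paths. The helper `config_paths` is hypothetical and not part of PyMARL or CollaQ; it assumes only the directory layout stated above and a `.yaml` file extension.

```python
from pathlib import PurePosixPath

# Directory layout taken from the README text above.
CONFIG_ROOT = PurePosixPath("src/config")


def config_paths(config: str, env_config: str) -> dict:
    """Return the files that --config and --env-config would point at.

    Hypothetical helper; the ".yaml" suffix is an assumption.
    """
    return {
        "alg": CONFIG_ROOT / "algs" / f"{config}.yaml",
        "env": CONFIG_ROOT / "envs" / f"{env_config}.yaml",
    }


paths = config_paths("qmix_interactive_reg", "sc2")
print(paths["alg"])
print(paths["env"])
```

Checking that both files exist before launching a long run is a cheap way to catch a mistyped config name.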

All results will be stored in the Results folder.

Watching Replay

python src/main.py --config=qmix --env-config=sc2 with env_args.map_name=5m_vs_6m, evaluate=True checkpoint_path=results/models/5m_vs_6m/... save_replay=True

Saving and loading learnt models

Saving models

You can save the learnt models to disk by setting save_model = True, which is set to False by default. The frequency of saving models can be adjusted using the save_model_interval configuration. Models will be saved in the result directory, under the folder called models. The directory corresponding to each run will contain models saved throughout the experiment, each within a folder named after the number of timesteps passed since the start of the learning process.

Loading models

Learnt models can be loaded using the checkpoint_path parameter, after which the learning will proceed from the corresponding timestep.
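The layout described above (one folder per saved timestep inside each run directory) can be inspected with a few lines of Python. This is a minimal sketch, not the loader used by the codebase: `list_checkpoint_steps` and `pick_checkpoint` are hypothetical names, and the `load_step` argument (0 meaning "take the latest") is an assumption made for illustration.

```python
import os


def list_checkpoint_steps(run_dir: str) -> list:
    """Return the timestep folders found in one run's model directory."""
    steps = []
    for name in os.listdir(run_dir):
        full = os.path.join(run_dir, name)
        # Only purely numeric folder names are treated as timesteps.
        if os.path.isdir(full) and name.isdigit():
            steps.append(int(name))
    return sorted(steps)


def pick_checkpoint(steps: list, load_step: int = 0) -> int:
    """Pick the latest step, or the one closest to load_step if non-zero."""
    if not steps:
        raise ValueError("no checkpoints found")
    if load_step == 0:
        return max(steps)
    return min(steps, key=lambda s: abs(s - load_step))
```

For example, `pick_checkpoint(list_checkpoint_steps(run_dir))` returns the most recent timestep folder of the run stored in `run_dir`.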

Watching StarCraft II replays

The save_replay option allows saving replays of models which are loaded using checkpoint_path. Once the model is successfully loaded, test_nepisode episodes are run in test mode and a .SC2Replay file is saved in the Replay directory of StarCraft II. Please make sure to use the episode runner if you wish to save a replay, i.e., runner=episode. The name of the saved replay file starts with the given env_args.save_replay_prefix (map_name if empty), followed by the current timestamp.
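Because the replay name starts with the prefix (or the map name when the prefix is empty), saved replays can be located by prefix. The sketch below is illustrative only: `replay_prefix` and `find_replays` are hypothetical helpers, and the exact timestamp format in the file name is deliberately not assumed.

```python
import glob
import os


def replay_prefix(save_replay_prefix: str, map_name: str) -> str:
    """Fall back to the map name when no prefix is configured."""
    return save_replay_prefix or map_name


def find_replays(replay_dir: str, prefix: str) -> list:
    """List .SC2Replay files whose names start with prefix, newest first."""
    pattern = os.path.join(replay_dir, glob.escape(prefix) + "*.SC2Replay")
    return sorted(glob.glob(pattern), key=os.path.getmtime, reverse=True)
```

The first element of `find_replays(...)` is then the most recently written replay for that prefix.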

The saved replays can be watched by double-clicking on them or using the following command:

python -m pysc2.bin.play --norender --rgb_minimap_size 0 --replay NAME.SC2Replay

Note: Replays cannot be watched using the Linux version of StarCraft II. Please use either the Mac or Windows version of the StarCraft II client.

Acknowledgement

Our vanilla RL algorithm is based on PyMARL, which is an open source implementation of algorithms in StarCraft II.

License

This code is under the CC-BY-NC 4.0 (Attribution-NonCommercial 4.0 International) license.

collaq's Issues

OSError: [Errno 12] Cannot allocate memory

When I run the following command:

python src/main.py --config=qmix_interactive_reg_attn --env-config=sc2 with env_args.map_name=29m_vs_30m,28m_vs_30m, obs_agent_id=False

the following error occurred:

Maps "1s3z_vs_zg_easy,2s2z_vs_zg_easy," not found

When I run

python3 src/main.py --config=qmix_interactive_reg_attn --env-config=sc2 with env_args.map_name=3s1z_vs_zg_easy, 1s3z_vs_zg_easy,2s2z_vs_zg_easy, obs_agent_id=False

the following error occurred:

The maps "1s3z_vs_zg_easy,2s2z_vs_zg_easy," could not be found in the smac maps directory.

Could you please provide the maps?

Question of copyright of this repo

Hi, I found that many files in this repo contain

# Copyright (c) Facebook, Inc. and its affiliates.

However, to my knowledge, this repo is mainly based on https://github.com/oxwhirl/pymarl. My question is: is it proper to add a copyright claim to files that come from pymarl?

4 questions about the experiment mentioned in the paper

Dear author, I have 4 questions about the experiments mentioned in the paper:
Question 1: Which map is used in Table 4?

Question 2: Which parameter is used for "without random agent IDs" in Figure 10?

Question 3: How should the configuration file be set for the "Random Action" algorithm in Figure 9 (in the Ad hoc Resource Collection Env)?

Question 4: Some experimental maps mentioned in the paper could not be found. Can the designed maps 3s1z_vs_16zg, 1s3z_vs_16zg and 2s2z_vs_16zg be shared?
Can the modified maps 28m_vs_30m and 29m_vs_30m be shared?

Could

Could you tell me which PyTorch version you use? I think the version may be what causes this error.

I use :
torch 1.7.1
torchvision 0.8.2

[INFO 22:03:34] my_main Saving models to results/models/M-M-M-2/qmix_interactive_reg__2020-12-21_22-02-55/38
[INFO 22:06:29] my_main Updated target network
[ERROR 22:06:57] pymarl Failed after 0:04:02!
Traceback (most recent calls WITHOUT Sacred internals):
File "src/main.py", line 35, in my_main
run(_run, config, _log)
File "/media/ps/data/StarCraft2/CollaQ/src/run.py", line 48, in run
run_sequential(args=args, logger=logger)
File "/media/ps/data/StarCraft2/CollaQ/src/run.py", line 287, in run_sequential
logger.print_recent_stats()
File "/media/ps/data/StarCraft2/CollaQ/src/utils/logging.py", line 48, in print_recent_stats
item = "{:.4f}".format(np.mean([x[1] for x in self.stats[k][-window:]]))
File "/home/ps/anaconda3/envs/collaq/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 2920, in mean
out=out, **kwargs)
File "/home/ps/anaconda3/envs/collaq/lib/python3.6/site-packages/numpy/core/_methods.py", line 85, in _mean
ret = ret.dtype.type(ret / rcount)
AttributeError: 'torch.dtype' object has no attribute 'type'
[INFO 22:06:57] absl Shutdown gracefully.
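The traceback above shows np.mean failing on the values stored in self.stats. One plausible cause, which is an assumption and not confirmed in this thread, is that some logged values are torch tensors instead of plain Python numbers. A minimal sketch of converting such values before averaging follows; `to_float`, `recent_mean`, and `FakeTensor` are hypothetical names, and `FakeTensor` only stands in for a 0-d tensor so the example runs without torch.

```python
def to_float(value):
    """Convert tensor-like scalars (anything with .item()) to a Python float."""
    if hasattr(value, "item"):
        return float(value.item())
    return float(value)


def recent_mean(stat_entries, window=5):
    """Average the last `window` values of (timestep, value) pairs."""
    values = [to_float(v) for _, v in stat_entries[-window:]]
    return sum(values) / len(values)


class FakeTensor:
    """Stand-in for a 0-d torch tensor so the sketch runs without torch."""

    def __init__(self, v):
        self._v = v

    def item(self):
        return self._v


stats = [(0, 1.0), (100, FakeTensor(2.0)), (200, 3)]
print("{:.4f}".format(recent_mean(stats)))  # prints 2.0000
```

Converting at the point where a value is logged, rather than where it is printed, would keep the stored stats uniform.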

How to set gpu id?

There is something wrong

when I run
python src/main.py --config=qmix --env-config=sc2 with env_args.map_name=3m

Traceback (most recent call last):
File "src/main.py", line 19, in <module>
ex = Experiment("pymarl")
File "/home/XXXXXX/miniconda3/envs/mycollq/lib/python3.7/site-packages/sacred/experiment.py", line 75, in __init__
_caller_globals=caller_globals)
File "/home/XXXXXX/miniconda3/envs/mycollq/lib/python3.7/site-packages/sacred/ingredient.py", line 57, in __init__
gather_sources_and_dependencies(_caller_globals)
File "/home/XXXXXX/miniconda3/envs/mycollq/lib/python3.7/site-packages/sacred/dependencies.py", line 487, in gather_sources_and_dependencies
sources = gather_sources(globs, experiment_path)
File "/home/XXXXXX/miniconda3/envs/mycollq/lib/python3.7/site-packages/sacred/dependencies.py", line 440, in get_sources_from_imported_modules
return get_sources_from_modules(iterate_imported_modules(globs), base_path)
File "/home/XXXXXX/miniconda3/envs/mycollq/lib/python3.7/site-packages/sacred/dependencies.py", line 409, in get_sources_from_modules
filename = os.path.abspath(mod.__file__)
File "/home/XXXXXX/miniconda3/envs/mycollq/lib/python3.7/posixpath.py", line 378, in abspath
path = os.fspath(path)
TypeError: expected str, bytes or os.PathLike object, not NoneType
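The last frames above show os.path.abspath receiving None, which means some imported module had no usable file path. The sketch below illustrates the failure mode and a defensive guard. It is not sacred's code or its actual fix; `module_source_paths` is a hypothetical helper, and the two demo modules are constructed only for this example.

```python
import os
import types


def module_source_paths(modules):
    """Yield absolute source paths, skipping modules without a file path."""
    for mod in modules:
        path = getattr(mod, "__file__", None)
        if path is None:
            # Passing None to os.path.abspath raises the TypeError seen above.
            continue
        yield os.path.abspath(path)


# Demo modules: one without a file path, one with a made-up path.
fileless = types.ModuleType("fileless_demo")
fileless.__file__ = None
with_file = types.ModuleType("with_file_demo")
with_file.__file__ = "demo.py"

paths = list(module_source_paths([fileless, with_file]))
print(paths)
```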

Hyperparameters for Difficult Maps

I find it quite hard to reproduce your results with the default hyperparameters on maps such as MMM2. Do you have any suggestions for tuning the hyperparameters?

env_info KeyError: 'obs_alone_shape'

I built a virtual environment with python == 3.6.12 and installed the packages with requirements.txt. However, when I tried to run with the following command:

python src/main.py --config=qmix --env-config=sc2 with env_args.map_name=MMM2

I got an error:
[ERROR 11:51:20] pymarl Failed after 0:00:00!
Traceback (most recent calls WITHOUT Sacred internals):
File "src/main.py", line 41, in my_main
run(_run, config, _log)
File "/home/wang/files/MARL/CollaQ/src/run.py", line 54, in run
run_sequential(args=args, logger=logger)
File "/home/wang/files/MARL/CollaQ/src/run.py", line 101, in run_sequential
"obs_alone": {"vshape": env_info["obs_alone_shape"], "group": "agents"},
KeyError: 'obs_alone_shape'

After that, I went to check the generation of 'env_info', and found that there was no such key 'obs_alone_shape':

def get_env_info(self):
    env_info = {"state_shape": self.get_state_size(),
                "obs_shape": self.get_obs_size(),
                "n_actions": self.get_total_actions(),
                "n_agents": self.n_agents,
                "episode_limit": self.episode_limit}
    return env_info

in file smac/env/multiagentenv.py

Was that because I had installed the packages in the wrong way?
Hope to receive a reply soon.
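One possible explanation, offered as an assumption and not as a confirmed diagnosis, is that the smac copy being imported lacks the changes from smac.patch. A quick way to check is to compare the keys returned by get_env_info against what run.py reads. The helper below is a hypothetical sketch; the required-key list combines the keys in the snippet above with the one named in the KeyError.

```python
# Keys from the quoted get_env_info plus the one named in the KeyError.
REQUIRED_ENV_INFO_KEYS = (
    "state_shape",
    "obs_shape",
    "obs_alone_shape",
    "n_actions",
    "n_agents",
    "episode_limit",
)


def missing_env_info_keys(env_info, required=REQUIRED_ENV_INFO_KEYS):
    """Return the required keys that env_info does not provide."""
    return [key for key in required if key not in env_info]


# Example dictionary shaped like the unpatched get_env_info output above.
unpatched = {"state_shape": 1, "obs_shape": 1, "n_actions": 1,
             "n_agents": 1, "episode_limit": 1}
print(missing_env_info_keys(unpatched))  # prints ['obs_alone_shape']
```

Printing the module path of the imported smac package alongside this check shows whether the patched submodule or another installed copy is in use.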

There is something wrong

When I run
python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=2s3z,
(note the trailing comma), the following error is raised:
[ERROR 23:02:44] pymarl Failed after 0:00:00!
Traceback (most recent calls WITHOUT Sacred internals):
File "src/main.py", line 35, in my_main
run(_run, config, _log)
File "/media/ps/data/StarCraft2/CollaQ/src/run.py", line 48, in run
run_sequential(args=args, logger=logger)
File "/media/ps/data/StarCraft2/CollaQ/src/run.py", line 78, in run_sequential
runner = r_REGISTRY[args.runner](args=args, logger=logger)
File "/media/ps/data/StarCraft2/CollaQ/src/runners/episode_runner.py", line 15, in __init__
self.env = env_REGISTRY[self.args.env](**self.args.env_args)
File "/media/ps/data/StarCraft2/CollaQ/src/envs/__init__.py", line 7, in env_fn
return env(**kwargs)
File "/media/ps/data/StarCraft2/CollaQ/src/smac/env/starcraft2/starcraft2.py", line 198, in __init__
map_params = get_map_params(self.map_name[0])
File "/media/ps/data/StarCraft2/CollaQ/src/smac/env/starcraft2/maps/__init__.py", line 10, in get_map_params
return map_param_registry[map_name]
KeyError: 'M'
I then found that you can change
"self.map_name = map_name" to "self.map_name = [map_name]"
at line 196 of src/smac/env/starcraft2/starcraft2.py

And then run
python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=2s3z
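The traceback indexes self.map_name[0], so when map_name is a plain string the lookup receives only its first character, which is consistent with a one-letter KeyError. The workaround above wraps a single name in a list. A slightly more general sketch is shown below; `normalize_map_names` is a hypothetical helper, not part of smac or CollaQ, and how the value actually arrives from the command line is not assumed here.

```python
def normalize_map_names(map_name):
    """Return map names as a list, whether given one name or several."""
    if isinstance(map_name, str):
        # Accept "a,b," style strings as well as a single bare name.
        return [name.strip() for name in map_name.split(",") if name.strip()]
    return list(map_name)


# Indexing a bare string yields a character, not a map name.
print("MMM2"[0])                       # prints M
print(normalize_map_names("MMM2")[0])  # prints MMM2
print(normalize_map_names("29m_vs_30m,28m_vs_30m,"))
```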

Could anyone fix this error?

git submodule sync && git submodule update --init --recursive
cd third_party/sacred
git apply ../sacred.patch
cd ../smac
git apply ../smac.patch
cd ../pymarl
git apply ../pymarl.patch

error: patch failed: src/components/transforms.py:19
error: src/components/transforms.py: patch does not apply
error: patch failed: src/controllers/__init__.py:2
error: src/controllers/__init__.py: patch does not apply
error: patch failed: src/modules/agents/__init__.py:1
error: src/modules/agents/__init__.py: patch does not apply
error: patch failed: src/modules/critics/coma.py:67
error: src/modules/critics/coma.py: patch does not apply
error: patch failed: src/modules/mixers/vdn.py:7
error: src/modules/mixers/vdn.py: patch does not apply
