CollaQ

Overview

We propose CollaQ, a novel way to decompose the Q-function for decentralized policies in multi-agent modeling. On the StarCraft II Multi-Agent Challenge, CollaQ outperforms existing state-of-the-art techniques (i.e., QMIX, QTRAN, and VDN), improving the win rate by 40% with the same number of samples. In the more challenging ad hoc team play setting (i.e., reweighting, adding, or removing units without re-training or fine-tuning), CollaQ outperforms the previous SoTA by over 30%.

Please check our website for comprehensive results.

3-min video for paper introduction.

Please cite our arXiv paper if you use this codebase:

@article{zhang2020multi,
  title={Multi-Agent Collaboration via Reward Attribution Decomposition},
  author={Zhang, Tianjun and Xu, Huazhe and Wang, Xiaolong and Wu, Yi and Keutzer, Kurt and Gonzalez, Joseph E and Tian, Yuandong},
  journal={arXiv preprint arXiv:2010.08531},
  year={2020}
}

Note: We use SC2.4.6.2, and we ran all baselines ourselves on this version of SC2.

Installation instructions

git clone git@github.com:facebookresearch/CollaQ.git

The requirements.txt file can be used to install the necessary packages into a virtual environment with python == 3.6.0 (not recommended).

Install smac and sacred, and patch pymarl:

git submodule sync && git submodule update --init --recursive
cd third_party/sacred
git apply ../sacred.patch
cd ../smac
git apply ../smac.patch
cd ../pymarl
git apply ../pymarl.patch

Build the src folder for the code

cd ../..
cp -r third_party/pymarl/src .
cp src_code/config/* src/config/algs/
cp src_code/controllers/* src/controllers/
cp src_code/learners/* src/learners/
cp src_code/modules/* src/modules/agents/
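The copy steps above can also be sketched in Python, which may help on systems without a POSIX shell. This is a minimal sketch, not part of the repository: the function name `build_src` and the `COPY_PLAN` table are hypothetical, and it assumes the `src` folder does not exist yet and that the destination folders are already present in the copied pymarl tree.

```python
import shutil
from pathlib import Path

# (source directory, destination directory) pairs mirroring the cp commands above.
COPY_PLAN = [
    ("src_code/config", "src/config/algs"),
    ("src_code/controllers", "src/controllers"),
    ("src_code/learners", "src/learners"),
    ("src_code/modules", "src/modules/agents"),
]


def build_src(repo_root):
    """Copy third_party/pymarl/src to src, then overlay the files from src_code."""
    root = Path(repo_root)
    # Equivalent of `cp -r third_party/pymarl/src .`; copytree fails if src already exists.
    shutil.copytree(str(root / "third_party" / "pymarl" / "src"), str(root / "src"))

    copied = []
    for rel_from, rel_to in COPY_PLAN:
        dest_dir = root / rel_to
        for item in sorted((root / rel_from).iterdir()):
            # Like `cp dir/*` without -r: regular, non-hidden files only.
            if item.is_file() and not item.name.startswith("."):
                shutil.copy(str(item), str(dest_dir / item.name))
                copied.append(dest_dir / item.name)
    return copied
```

Calling `build_src(".")` from the repository root would perform the same copies as the shell commands and return the list of overlaid files.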

Results

[Figures: sc2_standard, sc2_vip, sc2_sar]

Run an experiment

SC2PATH=.../pymarl/StarCraftII
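Before launching a run, it can be useful to confirm that SC2PATH is set and points to a real directory. The helper below is a small sketch for that check; the name `resolve_sc2path` is hypothetical and is not part of this codebase.

```python
import os


def resolve_sc2path(environ=None):
    """Return the directory named by SC2PATH, or raise a descriptive error."""
    environ = os.environ if environ is None else environ
    path = environ.get("SC2PATH")
    if not path:
        raise RuntimeError("SC2PATH is not set; point it at your StarCraftII directory.")
    path = os.path.expanduser(path)
    if not os.path.isdir(path):
        raise RuntimeError("SC2PATH=%r is not an existing directory." % path)
    return path
```

Calling `resolve_sc2path()` with no arguments reads the current process environment; passing a dictionary makes the check easy to test.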

QMIX

python src/main.py --config=qmix --env-config=sc2 with env_args.map_name=MMM2,

CollaQ

python src/main.py --config=qmix_interactive_reg --env-config=sc2 with env_args.map_name=MMM2,

CollaQ with Attn

python src/main.py --config=qmix_interactive_reg_attn --env-config=sc2 with env_args.map_name=MMM2,

CollaQ Removing Agents

python src/main.py --config=qmix_interactive_reg_attn --env-config=sc2 with env_args.map_name=29m_vs_30m,28m_vs_30m, obs_agent_id=False

CollaQ Adding Agents

python src/main.py --config=qmix_interactive_reg_attn --env-config=sc2 with env_args.map_name=27m_vs_30m,28m_vs_30m, obs_agent_id=False

CollaQ Swapping Agents

python src/main.py --config=qmix_interactive_reg_attn --env-config=sc2 with env_args.map_name=3s1z_vs_zg_easy,1s3z_vs_zg_easy,2s2z_vs_zg_easy, obs_agent_id=False

The config files are all located in src/config. --config refers to the config files in src/config/algs, and --env-config refers to the config files in src/config/envs.
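When scripting many runs, the commands above can be assembled programmatically. The sketch below only builds the argument list; the helper name `build_command` is hypothetical, and the trailing comma after the map names simply mirrors the example commands rather than reflecting any documented requirement.

```python
import shlex


def build_command(config, env_config, map_names, **overrides):
    """Assemble the argument list for src/main.py in the style of the commands above."""
    # Map names are joined with commas; the trailing comma copies the examples above.
    map_arg = "env_args.map_name=" + ",".join(map_names) + ","
    args = [
        "python", "src/main.py",
        "--config=" + config,          # a file in src/config/algs
        "--env-config=" + env_config,  # a file in src/config/envs
        "with", map_arg,
    ]
    # Extra key=value overrides such as obs_agent_id=False, in a stable order.
    args += ["%s=%s" % (key, value) for key, value in sorted(overrides.items())]
    return args


cmd = build_command("qmix_interactive_reg_attn", "sc2",
                    ["29m_vs_30m", "28m_vs_30m"], obs_agent_id=False)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The returned list can be passed to `subprocess.run` directly, which avoids shell quoting issues with the comma-separated map names.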

All results will be stored in the results folder.

Watching Replay

python src/main.py --config=qmix --env-config=sc2 with env_args.map_name=5m_vs_6m, evaluate=True checkpoint_path=results/models/5m_vs_6m/... save_replay=True

Saving and loading learnt models

Saving models

You can save the learnt models to disk by setting save_model = True, which is set to False by default. The frequency of saving models can be adjusted using the save_model_interval configuration option. Models will be saved in the result directory, under the folder called models. The directory corresponding to each run will contain the models saved throughout the experiment, each within a folder named after the number of timesteps passed since the start of the learning process.

Loading models

Learnt models can be loaded using the checkpoint_path parameter, after which the learning will proceed from the corresponding timestep.
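Since each saved model sits in a folder named after its timestep, picking a checkpoint to inspect can be sketched as below. This is an illustration only: the function names are hypothetical, and it assumes the timestep folders are named with plain integers and nothing else.

```python
import os


def list_checkpoint_steps(run_dir):
    """Return the sorted timesteps for which run_dir contains a checkpoint folder."""
    steps = []
    for name in os.listdir(run_dir):
        # Assumption: timestep folders have purely numeric names.
        if name.isdigit() and os.path.isdir(os.path.join(run_dir, name)):
            steps.append(int(name))
    return sorted(steps)


def pick_checkpoint(run_dir, target_step=None):
    """Return the folder for the latest timestep, or the one closest to target_step."""
    steps = list_checkpoint_steps(run_dir)
    if not steps:
        raise FileNotFoundError("no timestep folders found in %s" % run_dir)
    if target_step is None:
        step = steps[-1]
    else:
        # On a tie, the smaller timestep wins because steps is sorted ascending.
        step = min(steps, key=lambda s: abs(s - target_step))
    return os.path.join(run_dir, str(step))
```

Here `run_dir` stands for the directory of a single run, i.e. the kind of path passed as checkpoint_path.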

Watching StarCraft II replays

The save_replay option allows saving replays of models which are loaded using checkpoint_path. Once the model is successfully loaded, test_nepisode episodes are run in test mode and a .SC2Replay file is saved in the Replay directory of StarCraft II. Please make sure to use the episode runner if you wish to save a replay, i.e., runner=episode. The name of the saved replay file starts with the given env_args.save_replay_prefix (map_name if empty), followed by the current timestamp.

The saved replays can be watched by double-clicking on them or using the following command:

python -m pysc2.bin.play --norender --rgb_minimap_size 0 --replay NAME.SC2Replay

Note: Replays cannot be watched using the Linux version of StarCraft II. Please use either the Mac or Windows version of the StarCraft II client.
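Because replay file names start with the chosen prefix, the replays from a given run can be located with a simple search. The sketch below relies only on that naming rule; the helper name `find_replays` is hypothetical, and the location of the Replay directory depends on your StarCraft II installation, so it is passed in as an argument.

```python
import glob
import os


def find_replays(replay_dir, prefix):
    """Return .SC2Replay files in replay_dir whose names start with prefix, newest first."""
    # glob.escape keeps special characters in the directory or prefix literal.
    pattern = os.path.join(glob.escape(replay_dir), glob.escape(prefix) + "*.SC2Replay")
    return sorted(glob.glob(pattern), key=os.path.getmtime, reverse=True)
```

The first entry of the returned list is the most recently modified replay, which can then be passed to the pysc2 command above via --replay.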

Acknowledgement

Our vanilla RL algorithm is based on PyMARL, an open-source implementation of multi-agent reinforcement learning algorithms for StarCraft II.

License

This code is under the CC-BY-NC 4.0 (Attribution-NonCommercial 4.0 International) license.
