
I2C: Learning Individually Inferred Communication

Installation

  • Known dependencies: Python (3.5.4), OpenAI gym (0.10.5), tensorflow (1.14.0), numpy (1.18.2)

Environment options

  • --scenario: defines which environment in the MPE is to be used (default: "cn")

  • --max-episode-len maximum length of each episode for the environment (default: 25)

  • --num-episodes total number of training episodes (default: 60000)

  • --num-adversaries: number of adversaries in the environment (default: 0)

Core training parameters

  • --lr: learning rate (default: 1e-2)

  • --gamma: discount factor (default: 0.95)

  • --batch-size: batch size (default: 800)

  • --num-units: number of units in the MLP (default: 128)
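The environment options and core training parameters above are standard command-line flags. As a minimal sketch of how such a parser could be defined with argparse (the actual parser in train.py may differ in structure and naming):

```python
import argparse

def parse_args(argv=None):
    # Hypothetical parser mirroring the documented flags and defaults;
    # the repository's train.py may define these differently.
    parser = argparse.ArgumentParser("I2C training options")
    # Environment options
    parser.add_argument("--scenario", type=str, default="cn",
                        help="which environment in the MPE to use")
    parser.add_argument("--max-episode-len", type=int, default=25,
                        help="maximum length of each episode")
    parser.add_argument("--num-episodes", type=int, default=60000,
                        help="total number of training episodes")
    parser.add_argument("--num-adversaries", type=int, default=0,
                        help="number of adversaries in the environment")
    # Core training parameters
    parser.add_argument("--lr", type=float, default=1e-2,
                        help="learning rate")
    parser.add_argument("--gamma", type=float, default=0.95,
                        help="discount factor")
    parser.add_argument("--batch-size", type=int, default=800,
                        help="batch size")
    parser.add_argument("--num-units", type=int, default=128,
                        help="number of units in the MLP")
    return parser.parse_args(argv)

# Parse with no arguments to inspect the defaults.
args = parse_args([])
```

Note that argparse converts dashed flag names to underscored attributes, so --batch-size is read as args.batch_size.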

Training for prior network

  • --prior-buffer-size: prior network training buffer size

  • --prior-num-iter: prior network training iterations

  • --prior-training-rate: prior network training rate

  • --prior-training-percentile: percentile threshold on KL divergence values used to generate training labels
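The prior network is trained on binary labels derived from KL divergence values, with --prior-training-percentile setting the cutoff. As an illustrative numpy sketch of percentile-based labeling (the function and variable names here are assumptions, not the repository's):

```python
import numpy as np

def kl_to_labels(kl_values, percentile=70):
    """Hypothetical labeling step: mark a pair as requiring
    communication (label 1) when its KL divergence value exceeds
    the given percentile of all observed KL values, else 0."""
    kl_values = np.asarray(kl_values, dtype=np.float64)
    threshold = np.percentile(kl_values, percentile)
    return (kl_values > threshold).astype(np.int32)

# Toy KL values for ten agent pairs; with percentile=70 the
# threshold falls between the 7th and 8th smallest values.
kl = np.array([0.01, 0.2, 0.05, 0.9, 0.3, 0.02, 0.6, 0.15, 0.4, 0.08])
labels = kl_to_labels(kl, percentile=70)
```

Raising the percentile (e.g. 80 for Predator Prey below) makes the labeling stricter, so fewer pairs are labeled as needing communication.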

Checkpointing

  • --exp-name: name of the experiment, used as the file name to save all results (default: None)

  • --save-dir: directory where intermediate training results and model will be saved (default: "/tmp/policy/")

  • --save-rate: model is saved every time this number of episodes has been completed (default: 1000)

  • --load-dir: directory where training state and model are loaded from (default: "")

  • --plots-dir: directory where training curves are saved (default: "./learning_curves/")

  • --restore_all: whether to restore an existing I2C network
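Saving is driven by --save-rate: a checkpoint is written each time that many episodes complete, under --save-dir. As a sketch of that control flow (the checkpoint naming and the experiment name "i2c_cn" are illustrative; the repository saves with a TensorFlow saver inside its training loop):

```python
import os

def checkpoint_path(save_dir="/tmp/policy/", exp_name="i2c_cn", episode=0):
    # Build a plausible checkpoint path; the repository's exact
    # naming scheme may differ.
    os.makedirs(save_dir, exist_ok=True)
    return os.path.join(save_dir, "{}_ep{}".format(exp_name, episode))

def should_save(episode, save_rate=1000):
    # True once every `save_rate` completed episodes.
    return episode > 0 and episode % save_rate == 0

# Over 3000 episodes with the default save-rate, three saves occur.
saves = [checkpoint_path(episode=ep)
         for ep in range(1, 3001) if should_save(ep)]
```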

Training procedure

I2C can be learned either end-to-end or in a two-phase manner. This code implements the end-to-end approach, which may take more training time than the two-phase alternative.

For Cooperative Navigation: python3 train.py --scenario 'cn' --prior-training-percentile 70

For Predator Prey: python3 train.py --scenario 'pp' --prior-training-percentile 80

Citations

If you use this code, please cite our paper.

Ziluo Ding, Tiejun Huang, and Zongqing Lu. Learning Individually Inferred Communication for Multi-Agent Cooperation. NeurIPS'20.

@inproceedings{ding2020learning,
    	title={Learning Individually Inferred Communication for Multi-Agent Cooperation},
    	author={Ding, Ziluo and Huang, Tiejun and Lu, Zongqing},
    	booktitle={NeurIPS},
    	year={2020}
}

Acknowledgements

This code is developed based on the source code of MADDPG by Ryan Lowe.

