Multi-Agent Cooperation in Sequential Social Dilemmas

This project was completed in Spring 2019 as part of the senior requirement for Yale Computer Science.

This work is an implementation and exploration of current work in Multiagent Reinforcement Learning (MARL). It is highly recommended that you read the following two papers before diving in.

  1. Multi-agent reinforcement learning in sequential social dilemmas
  2. Intrinsic Social Motivation via Causal Influence in Multi-Agent RL

Quick Start

  1. Activate your virtual environment
  2. pip install -r requirements.txt
  3. python train.py
  4. python test.py ~/ray_results/prison_A3C/[training_instance]/ [checkpoint_num]

Training results are saved by default in the ray_results directory inside your home directory (~/ray_results).
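To find a value for [checkpoint_num], it can help to list the checkpoints a training run produced. The helper below is a minimal sketch, not part of this repository: the function name list_checkpoints is hypothetical, and it assumes checkpoints are stored as subdirectories named checkpoint_<num> inside the training instance directory, which may differ between Ray versions.

```python
import re
from pathlib import Path


def list_checkpoints(training_instance_dir):
    """Return the checkpoint numbers found in a training instance directory.

    Assumption: each checkpoint lives in a subdirectory named
    `checkpoint_<num>`. Adjust the pattern if your Ray version differs.
    """
    pattern = re.compile(r"^checkpoint_(\d+)$")
    numbers = []
    for child in Path(training_instance_dir).expanduser().iterdir():
        match = pattern.match(child.name)
        # Only count directories whose name matches the assumed pattern.
        if match and child.is_dir():
            numbers.append(int(match.group(1)))
    return sorted(numbers)
```

Calling list_checkpoints on the same directory you would pass to test.py returns the available numbers in ascending order, so the last element is the most recent checkpoint.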

Environments

Pycolab provides the abstraction for creating environments. Although this repository includes three environments, only the PrisonEnvironment has been fully developed and tested.

The PrisonEnvironment instantiates a gridworld variant of the classic Prisoner's Dilemma. At each step of the game, both agents independently choose to move left, move right, or stay still. The left side of the board represents full defection and the right side of the board represents full cooperation. Intermediate positions are a linear combination of the extremes. Rewards are distributed every 10 timesteps of the game. The figure below shows the corresponding rewards for four primary states of the game.

Game states
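The reward rule described above can be sketched as follows. This is an illustration under stated assumptions, not the environment's actual code: the payoff values, the board width, and all names are hypothetical, and "linear combination of the extremes" is read here as a bilinear interpolation between the four primary states. Only the 10-timestep reward interval comes from the description.

```python
# Hypothetical payoffs for the four primary states (NOT taken from the repo).
# Keys are (my_choice, other_choice) with "C" = cooperate, "D" = defect;
# values are the reward I receive.
PAYOFFS = {
    ("C", "C"): 3.0,
    ("C", "D"): 0.0,
    ("D", "C"): 5.0,
    ("D", "D"): 1.0,
}
BOARD_WIDTH = 5       # assumed number of columns on the board
REWARD_INTERVAL = 10  # rewards are distributed every 10 timesteps


def cooperation_level(column, width=BOARD_WIDTH):
    """Map a column to [0, 1]: leftmost = full defection, rightmost = full cooperation."""
    if not 0 <= column < width:
        raise ValueError("column is outside the board")
    return column / (width - 1)


def reward(my_column, other_column, width=BOARD_WIDTH):
    """Interpolate my reward between the four primary states."""
    a = cooperation_level(my_column, width)
    b = cooperation_level(other_column, width)
    return (
        a * b * PAYOFFS[("C", "C")]
        + a * (1 - b) * PAYOFFS[("C", "D")]
        + (1 - a) * b * PAYOFFS[("D", "C")]
        + (1 - a) * (1 - b) * PAYOFFS[("D", "D")]
    )


def is_reward_step(timestep, interval=REWARD_INTERVAL):
    """Return True on the timesteps at which rewards are handed out."""
    return timestep > 0 and timestep % interval == 0
```

With these assumed values, the corners of the board reproduce the four primary payoffs, and an agent standing in the middle column against another agent in the middle column receives the average of all four.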

Running python play.py lets you play the game manually. The script is especially helpful for debugging the environment in isolation.

Learning

Reinforcement learning is handled by RLlib. Currently, all training uses the A3C algorithm.

Unresolved Issues

When A3C agents are initialized in test.py, asynchronous changes to the environment interfere with the game visualization. One workaround is to wait until those interactions with the environment have finished before starting the game; this only takes a few seconds. A better solution would be to fix the issue and submit a PR!
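The waiting workaround could be sketched with a threading.Event from the Python standard library. This is a generic pattern under assumptions, not a fix verified against test.py: initialize_agents and run_visualization are hypothetical stand-ins, and it is an open question whether the agent setup in this project exposes a point at which completion can be signalled.

```python
import threading
import time


def initialize_agents(ready):
    """Hypothetical stand-in for agent setup that touches the environment."""
    time.sleep(0.1)  # placeholder for the few seconds of setup
    ready.set()      # signal that setup has finished


def run_visualization(ready, timeout=30.0):
    """Block until setup has finished, then start the (placeholder) game."""
    if not ready.wait(timeout):
        raise TimeoutError("agent initialization did not finish in time")
    return "visualization started"


def main():
    ready = threading.Event()
    worker = threading.Thread(target=initialize_agents, args=(ready,))
    worker.start()
    status = run_visualization(ready)  # waits here until ready.set() is called
    worker.join()
    return status
```

Waiting on an explicit signal rather than sleeping for a fixed number of seconds keeps the delay as short as the setup actually takes.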

Related Works

  1. Leibo, J. Z., Zambaldi, V., Lanctot, M., Marecki, J., & Graepel, T. (2017). Multi-agent reinforcement learning in sequential social dilemmas. In Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems (pp. 464-473).

  2. Hughes, E., Leibo, J. Z., Phillips, M., Tuyls, K., Dueñez-Guzman, E., Castañeda, A. G., Dunning, I., Zhu, T., McKee, K., Koster, R., Roff, H., & Graepel, T. (2018). Inequity aversion improves cooperation in intertemporal social dilemmas. In Advances in Neural Information Processing Systems (pp. 3330-3340).

  3. Jaques, N., Lazaridou, A., Hughes, E., Gulcehre, C., Ortega, P. A., Strouse, D. J., Leibo, J. Z. & de Freitas, N. (2018). Intrinsic Social Motivation via Causal Influence in Multi-Agent RL. arXiv preprint arXiv:1810.08647.

  4. Credit to Sequential Social Dilemma Games for providing a useful example of RLlib.

Contributors

jweinstein2, annieechen, simonjmendelsohn
