Rainbow-IQN Ape-X

License: Apache 2.0

Rainbow-IQN Ape-X is a new distributed algorithm, state-of-the-art on Atari, that combines the following three papers:
Rainbow: Combining Improvements in Deep Reinforcement Learning [1].
IQN: Implicit Quantile Networks (IQN) for Distributional Reinforcement Learning [2].
Ape-X: Distributed Prioritized Experience Replay [3].

Introduction

This repository is an open-source implementation of a distributed version of Rainbow-IQN that follows the Ape-X paper for the distributed part (there is also a distributed version of Rainbow alone, i.e. Rainbow Ape-X).
The code presented here is the basis of our paper Is Deep Reinforcement Learning really superhuman on Atari? [4], in which we introduce SABER: a Standardized Atari BEnchmark for general Reinforcement learning algorithms.

Importantly, this code was the reinforcement learning part of the algorithm I developed to win the CARLA challenge on Track 2 (Cameras Only). This success showed the strength of Rainbow-IQN Ape-X as a general algorithm.

Requirements/Installation

To install all dependencies with Anaconda, run $ conda env create -f environment.yml.
Without Anaconda, install PyTorch and then install the following packages with pip: atari-py, redlock-py, plotly, opencv-python.
If you are unsure about any installation step, take a look at the Dockerfile.

Afterwards, you can install the package with:

$ pip install --editable ./rainbow-iqn-apex

You can then use functions and classes from this project in other projects. If you change the source files, the changes take effect the next time you restart the Python interpreter (or reload the package with importlib):

import rainbowiqn
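
As a minimal sketch of the reload step mentioned above, the helper below re-imports an already loaded package with importlib.reload. The name reload_package is hypothetical and is not part of this project. The demo uses the standard-library json module as a stand-in so that the snippet runs without the project installed; with the editable install in place you would pass "rainbowiqn" instead.

```python
import importlib
import sys


def reload_package(name):
    """Import `name` if needed, otherwise reload it, and return the module.

    Hypothetical helper for illustration; it is not part of rainbowiqn.
    importlib.reload only re-executes the named module itself, not the
    submodules it has already imported.
    """
    if name in sys.modules:
        return importlib.reload(sys.modules[name])
    return importlib.import_module(name)


if __name__ == "__main__":
    # "json" stands in for "rainbowiqn" so the sketch runs anywhere.
    module = reload_package("json")
    print(module.__name__)
```

Because only the top-level module is re-executed, restarting the interpreter remains the more reliable option after editing a submodule.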

Uninstall it with:

$ pip uninstall rainbow-iqn-apex

This code has been tested on Ubuntu 16 and 18.

Sanity check

Open 3 terminals to check that everything is working (this launches an experiment with one actor on space_invaders):

# Terminal 1. This launches the redis server on port 6379.
$ redis-server redis_rainbow_6379.conf 
 
# Terminal 2. This launches the learner.
$ python rainbowiqn/launch_learner.py --memory-capacity 100000 \
                                      --learn-start 8000 \
                                      --log-interval 2500
                                      
# Terminal 3. This launches the actor.
$ python rainbowiqn/launch_actor.py --id-actor 0 \
                                    --memory-capacity 100000 \
                                    --learn-start 8000 \
                                    --log-interval 2500
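
If you prefer a single terminal, the three commands above can be started from one script. The following is only a sketch: the helper names (build_commands, run_sanity_check), the 2-second startup delay, and the 120-second run time are assumptions and not part of this project, while the executables and flags are copied from the commands above. It is meant to be run from the repository root.

```python
import shlex
import subprocess
import time

# Flags copied from the sanity-check commands above.
COMMON_FLAGS = [
    "--memory-capacity", "100000",
    "--learn-start", "8000",
    "--log-interval", "2500",
]


def build_commands(actor_id=0):
    """Return argv lists for the redis server, the learner, and one actor."""
    redis_cmd = ["redis-server", "redis_rainbow_6379.conf"]
    learner_cmd = ["python", "rainbowiqn/launch_learner.py", *COMMON_FLAGS]
    actor_cmd = [
        "python", "rainbowiqn/launch_actor.py",
        "--id-actor", str(actor_id),
        *COMMON_FLAGS,
    ]
    return [redis_cmd, learner_cmd, actor_cmd]


def run_sanity_check(duration_sec=120.0, startup_delay_sec=2.0):
    """Start the three processes, wait, then stop them in reverse order."""
    processes = []
    try:
        for cmd in build_commands():
            processes.append(subprocess.Popen(cmd))
            # Assumed delay so that redis is up before the learner connects.
            time.sleep(startup_delay_sec)
        time.sleep(duration_sec)
    finally:
        for proc in reversed(processes):
            proc.terminate()
        for proc in processes:
            proc.wait()


if __name__ == "__main__":
    # Only print the commands here; call run_sanity_check() to launch them.
    for cmd in build_commands():
        print(shlex.join(cmd))
```

With this approach the logs of all three processes are interleaved in one terminal, so separate terminals stay easier to read.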

If, after a short time (probably about 1 minute), logs like the following appear in the learner and actor terminals, everything is OK!

[2019-08-12T17:40:11] T = 12500 / 50000000
Time between 2 log_interval for learner (14.410 sec)  # (for the learner)

[2019-08-12T17:40:06] T = 12500 / 50000000
Time between 2 log_interval for actor 0 (13.249 sec)  # (for the actor)
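
To check such logs automatically, the two line formats shown above can be parsed with regular expressions. This is a sketch based only on the sample lines in this README; the function name parse_log_line and the returned dictionary keys are hypothetical, and the real log format may contain other lines that this parser ignores.

```python
import re
from datetime import datetime

# Matches e.g. "[2019-08-12T17:40:11] T = 12500 / 50000000"
STEP_RE = re.compile(r"^\[(?P<stamp>[^\]]+)\] T = (?P<step>\d+) / (?P<total>\d+)")
# Matches e.g. "Time between 2 log_interval for actor 0 (13.249 sec)"
INTERVAL_RE = re.compile(
    r"^Time between 2 log_interval for (?P<who>learner|actor \d+) "
    r"\((?P<seconds>\d+(?:\.\d+)?) sec\)"
)


def parse_log_line(line):
    """Return a dict describing a recognised log line, or None otherwise."""
    line = line.strip()
    match = STEP_RE.match(line)
    if match:
        return {
            "kind": "step",
            "time": datetime.fromisoformat(match["stamp"]),
            "step": int(match["step"]),
            "total": int(match["total"]),
        }
    match = INTERVAL_RE.match(line)
    if match:
        return {
            "kind": "interval",
            "who": match["who"],
            "seconds": float(match["seconds"]),
        }
    return None


if __name__ == "__main__":
    print(parse_log_line("[2019-08-12T17:40:11] T = 12500 / 50000000"))
    print(parse_log_line("Time between 2 log_interval for learner (14.410 sec)"))
```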

Afterwards, kill all 3 terminals and see the wiki to learn how to launch real experiments!

To test a pretrained snapshot, download the trained weights from the release and then run the following command:

# Drop the --render flag for faster evaluation
$ python rainbowiqn/test_multiple_seed.py --model with_weight/Rainbow_IQN/space_invaders/last_model_space_invaders_50000000.pth \
                                          --game space_invaders --render

By default, all experiments are run on SABER. This includes all recommendations of Machado et al. [5] (i.e. ignoring the life signal, using sticky actions, always using the full 18-action set, and reporting results as the mean score over 100 consecutive training episodes) and a new parameter that we call max_stuck_time (5 minutes by default).
This parameter makes it possible to allow infinite episodes while still terminating an episode when the agent is stuck. More details can be found in our paper Is Deep Reinforcement Learning really superhuman on Atari? [4].
In the paper we discuss why allowing infinite episodes is important for fair and comparable results. It also makes comparison against the human world record possible and shows the very large gap that remains before superhuman performance can be claimed.
We showed that the use of the term superhuman performance in previous papers is misleading. General RL agents are definitely far from superhuman on most Atari games!
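
The sketch below illustrates two of the ingredients listed above: sticky actions and max_stuck_time. It is not the code used in this repository. The names StuckTimer and sticky_action are hypothetical. The sketch assumes that "stuck" means no reward has been received for max_stuck_time of game time (this README does not define it, see the paper [4]) and that the game runs at 60 frames per second. The repeat probability of 0.25 is the value recommended by Machado et al. [5].

```python
import random


class StuckTimer:
    """Track game time since the last reward (assumed meaning of "stuck")."""

    def __init__(self, max_stuck_time_sec=300.0, frames_per_sec=60.0):
        # 300 s matches the 5 minute default; 60 frames/s is an assumption.
        self.max_stuck_frames = int(max_stuck_time_sec * frames_per_sec)
        self.frames_since_reward = 0

    def reset(self):
        """Call at the start of every episode."""
        self.frames_since_reward = 0

    def update(self, reward, frames_elapsed=1):
        """Record one step; return True when the episode should end."""
        if reward != 0:
            self.frames_since_reward = 0
        else:
            self.frames_since_reward += frames_elapsed
        return self.frames_since_reward >= self.max_stuck_frames


def sticky_action(new_action, previous_action, rng, repeat_prob=0.25):
    """Return previous_action with probability repeat_prob, else new_action."""
    if previous_action is not None and rng.random() < repeat_prob:
        return previous_action
    return new_action


if __name__ == "__main__":
    rng = random.Random(0)
    timer = StuckTimer(max_stuck_time_sec=1.0, frames_per_sec=4.0)
    previous = None
    for step in range(6):
        action = sticky_action(step % 18, previous, rng)
        previous = action
        stuck = timer.update(reward=0.0)  # no reward: the agent is stuck
        print(step, action, stuck)
        if stuck:
            break
```

Tracking time since the last reward, rather than total episode length, is what lets episodes run indefinitely while an agent that stops making progress is still cut off.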

This codebase borrows heavily from:

  • @kaixhin for Rainbow (see the Kaixhin license, MIT License)
  • @dopamine for the TensorFlow implementation of IQN (see compute_loss_iqn.py for the Dopamine license)

[1] Rainbow: Combining Improvements in Deep Reinforcement Learning
[2] Implicit Quantile Networks (IQN) for Distributional Reinforcement Learning
[3] Distributed Prioritized Experience Replay
[4] Is Deep Reinforcement Learning really superhuman on Atari?
[5] Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents

Contributors

gabrieldemarmiesse, marintoro
