Code With Deep Reinforcement Learning

Single Agent Algorithm

Value Based

  • Deep Q Network(DQN) (off-policy)
  • Double Deep Q Network(Double DQN) (off-policy)
  • Dueling Deep Q Network(Dueling DQN) (off-policy)
  • Dueling Double Deep Q Network(D3QN) (off-policy)
  • Noisy Networks for Exploration(NoisyDQN) (off-policy)
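
As a rough illustration of how DQN and Double DQN differ, the sketch below computes both bootstrap targets for a single transition. It is a minimal pure-Python example, not code from this repository: the function names, argument names, and the default gamma are assumptions made for illustration.

```python
def dqn_target(reward, next_q_target, done, gamma=0.99):
    """DQN target: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, done, gamma=0.99):
    """Double DQN target: the online network selects, the target network evaluates."""
    if done:
        return reward
    # Index of the greedy action according to the online network.
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]


# Hypothetical Q-values for the next state (three discrete actions).
next_q_online = [1.0, 2.0, 0.5]
next_q_target = [3.0, 1.5, 0.2]
print(dqn_target(1.0, next_q_target, done=False))
print(double_dqn_target(1.0, next_q_online, next_q_target, done=False))
```

In this example the Double DQN target is the smaller of the two, because the action picked by the online network is not the one the target network rates highest. Decoupling selection from evaluation in this way is what reduces overestimation.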

Actor-Critic Method

  • Advantage Actor-Critic(A2C) (on-policy)
  • Asynchronous Advantage Actor-Critic(A3C) (on-policy)
  • Proximal Policy Optimization(PPO) with Generalized Advantage Estimation(GAE) (on-policy, but close to off-policy)
  • Phasic Policy Gradient(PPG) (on-policy PPO + off-policy Critic, which shares parameters with PPO's Critic)
  • Deep Deterministic Policy Gradient(DDPG) (off-policy)
  • Twin Delayed Deep Deterministic policy gradient(TD3) (off-policy)
  • Soft Actor-Critic(SAC) (off-policy)
  • Truncated Quantile Critics(TQC) (off-policy)
  • Distribution Correction based on Soft Actor-Critic(DisCor)
  • Randomized Ensembled Double Q-Learning(REDQ)
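
The PPO entry above mentions GAE. The following is a minimal pure-Python sketch of Generalized Advantage Estimation over one rollout, not the implementation used in this repository: the function name `compute_gae`, its arguments, and the default `gamma` and `lam` values are assumptions chosen for illustration.

```python
def compute_gae(rewards, values, dones, last_value, gamma=0.99, lam=0.95):
    """Return (advantages, returns) for one rollout using GAE(lambda).

    rewards, values, dones are per-step lists of equal length;
    last_value is the critic's estimate for the state after the final step.
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value
    # Walk the rollout backwards so each step can reuse the accumulated advantage.
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD error; bootstrapping is cut at episode boundaries.
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        gae = delta + gamma * lam * not_done * gae
        advantages[t] = gae
        next_value = values[t]
    # Value targets for the critic are advantage plus baseline.
    returns = [a + v for a, v in zip(advantages, values)]
    return advantages, returns


advantages, returns = compute_gae(
    rewards=[1.0, 1.0], values=[0.5, 0.5], dones=[False, True],
    last_value=0.0, gamma=1.0, lam=1.0,
)
print(advantages, returns)
```

With `gamma=1.0` and `lam=1.0` the returns reduce to plain Monte Carlo returns, which is a convenient sanity check. Smaller `lam` values trade variance for bias.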

Deep reinforcement learning with a latent variable model

  • Stochastic Latent Actor-Critic(SLAC)
  • SAC with AutoEncoder(SAC_AE)

Regularizing Deep Reinforcement Learning from Pixels

  • Data regularized Q(DrQ-v1)
  • Data regularized Q(DrQ-v2)

Imitation Learning / Inverse Reinforcement Learning

  • Behavior Cloning(BC)
  • Generative Adversarial Imitation Learning(GAIL)

ReplayBuffer Structure

  • Prioritized Experience Replay(PER)
  • Hindsight Experience Replay(HER)
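
To give a feel for what Prioritized Experience Replay does, here is a small pure-Python sketch of proportional prioritization. It is an illustration under simplifying assumptions, not this repository's buffer: the class name `PrioritizedReplaySketch`, its methods, and the default `alpha`, `beta`, and `eps` values are hypothetical, and sampling is O(N) rather than using a sum-tree.

```python
import random


class PrioritizedReplaySketch:
    """Proportional prioritized replay with linear-time sampling (no sum-tree)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-5):
        self.capacity = capacity
        self.alpha = alpha      # how strongly priorities skew sampling
        self.eps = eps          # keeps every priority strictly positive
        self.storage = []
        self.priorities = []
        self.pos = 0            # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions get the current maximum priority so they are sampled soon.
        max_prio = max(self.priorities, default=1.0)
        if len(self.storage) < self.capacity:
            self.storage.append(transition)
            self.priorities.append(max_prio)
        else:
            self.storage[self.pos] = transition
            self.priorities[self.pos] = max_prio
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        n = len(self.storage)
        indices = random.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights, normalized by the largest weight in the batch.
        weights = [(n * probs[i]) ** (-beta) for i in indices]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        batch = [self.storage[i] for i in indices]
        return batch, indices, weights

    def update_priorities(self, indices, td_errors):
        # Priority is the absolute TD error plus a small constant.
        for i, err in zip(indices, td_errors):
            self.priorities[i] = abs(err) + self.eps


buffer = PrioritizedReplaySketch(capacity=3)
for step in range(5):
    buffer.add(("state", "action", float(step)))
batch, indices, weights = buffer.sample(batch_size=2)
buffer.update_priorities(indices, td_errors=[0.5, 2.0])
```

After each learning step the agent would call `update_priorities` with the new TD errors, so transitions with larger errors are replayed more often while the importance weights correct for the resulting sampling bias.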

Neural network architecture designed for deep reinforcement learning

  • Deep Dense Architectures in reinforcement learning(D2RL)

Exploration

  • Intrinsic Curiosity Module(ICM)

Distributed Reinforcement Learning

  • Ape-X(similar structure)
  • MPI

Multi Agent Algorithm

Actor-Critic Method

  • Multi Agent Deep Deterministic Policy Gradient(MADDPG)
  • MADDPG-style variants of TD3 and SAC
  • Multi Agent Proximal Policy Optimization(MAPPO)
  • COMA

Value Based

  • QMIX

Installation

  • Clone the repo and cd into it:
    git clone https://github.com/namjiwon1023/Code_With_RL
    cd Code_With_RL
  • If you don't have PyTorch installed already, install your preferred build of PyTorch. In most cases, if you have a GPU, you can install the PyTorch 1.8.1 LTS CUDA 10.2 version with
    pip3 install torch torchvision torchaudio -f https://download.pytorch.org/whl/lts/1.8/torch_lts.html
    or, if you don't have a GPU, the PyTorch 1.8.1 LTS CPU version with
    pip3 install torch==1.8.1+cpu torchvision==0.9.1+cpu torchaudio==0.8.1 -f https://download.pytorch.org/whl/lts/1.8/torch_lts.html

File Structure

  • Hyperparameter # Algorithm Hyperparameters
    • dqn.yaml
    • doubledqn.yaml
    • duelingdqn.yaml
    • d3qn.yaml
    • noisydqn.yaml
    • ddpg.yaml
    • td3.yaml
    • sac.yaml
    • ppo.yaml
    • a2c.yaml
    • behaviorcloning.yaml
    • etc.
  • agent.py
    • reinforcement learning algorithm
  • network.py
    • QNetwork
    • NoisyLinear
    • ActorNetwork
    • CriticNetwork
  • replaybuffer.py
    • Simple PPO Rollout Buffer
    • Off-Policy Experience Replay
  • runner.py
    • Training loop
    • Evaluator
  • main.py
    • Start training
    • Start evaluation
  • utils.py
    • Make gif image
    • Drawing
    • Basic tools

Quick Start

To train a new network, run python main.py --algorithm=<algorithm>, where <algorithm> is one of the algorithm names listed below.

To test a pretrained network, run python main.py --algorithm=<algorithm> --evaluate=True

Reinforcement learning algorithms that can now be selected:

  • DQN
  • Double_DQN
  • Dueling_DQN
  • D3QN
  • Noisy_DQN
  • DDPG
  • TD3
  • SAC
  • PPO
  • A2C
  • BC_SAC
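
The commands above pass `--algorithm` and `--evaluate=True` to main.py. As a hedged sketch of how such flags could be parsed with `argparse`, the example below uses the algorithm names from the list above as the allowed choices. It is not the repository's actual main.py: `build_parser` and `str2bool` are hypothetical helpers, and the default of `False` for `--evaluate` is an assumption.

```python
import argparse

# Algorithm names copied from the list above.
ALGORITHMS = ["DQN", "Double_DQN", "Dueling_DQN", "D3QN", "Noisy_DQN",
              "DDPG", "TD3", "SAC", "PPO", "A2C", "BC_SAC"]


def str2bool(value):
    """Parse 'True'/'False' style strings.

    Using type=bool directly would treat any non-empty string,
    including 'False', as True.
    """
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    parser = argparse.ArgumentParser(description="Hypothetical launcher sketch")
    parser.add_argument("--algorithm", choices=ALGORITHMS, required=True,
                        help="which algorithm to train or evaluate")
    parser.add_argument("--evaluate", type=str2bool, default=False,
                        help="evaluate a saved network instead of training")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["--algorithm=TD3", "--evaluate=True"])
print(args.algorithm, args.evaluate)  # prints: TD3 True
```

Restricting `--algorithm` with `choices` makes a misspelled name fail immediately with a usage message instead of partway through setup.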

Discrete action space recommendation: Dueling Double DQN (D3QN)

Continuous action space recommendation: use TD3 if you are comfortable tuning hyperparameters; otherwise use PPO or SAC. If the reward function of the training environment was written by a beginner, use PPO.

Training Environment

Discrete action :

Continuous action :

Multi-Agent Training Environment:

Training Result

Value Based Algorithm Compare Result:



Policy Based Algorithm Compare Result:


Distributed Reinforcement Learning Structure

DRL Structure

Requirements

Python 3.6+ : conda create -n icsl_rl python=3.6
PyTorch 1.6+ : https://pytorch.org/get-started/locally/
NumPy : pip install numpy
OpenAI Gym : https://github.com/openai/gym
Matplotlib : pip install matplotlib
TensorBoard : pip install tensorboard

Citation:

To cite this repository:

@misc{algorithms_drl,
  author = {Zhiyuan Nan},
  title = {Code With Deep Reinforcement Learning},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/namjiwon1023/Code_With_RL}},
}

References

Key Papers in Deep RL

PG Travel Guide

utilForever/rl-paper-study

Khanrc's blog

CUN-bjy/rl-paper-review
