cleanrl's Introduction

CleanRL (Clean Implementation of RL Algorithms)

CleanRL is a Deep Reinforcement Learning library that provides high-quality single-file implementations with research-friendly features. The implementations are clean and simple, yet they scale to thousands of experiments using AWS Batch. The highlight features of CleanRL are:

  • 📜 Single-file implementation
    • Every detail about an algorithm is put into the algorithm's own file. This makes it easier to fully understand an algorithm and to do research with it.
  • 📊 Benchmarked Implementation (7+ algorithms and 34+ games at https://benchmark.cleanrl.dev)
  • 📈 Tensorboard Logging
  • 🪛 Local Reproducibility via Seeding
  • 🎮 Gameplay Video Capturing
  • 🧫 Experiment Management with Weights and Biases
  • 💸 Cloud Integration with Docker and AWS
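
To illustrate the seeding feature listed above, here is a minimal sketch using only Python's standard library. The function `rollout` and its toy random walk are hypothetical stand-ins for a training run and are not part of CleanRL. A real experiment would likely also need to seed NumPy, PyTorch, and the environment.

```python
import random
from typing import List


def rollout(seed: int, steps: int = 5) -> List[float]:
    """Toy stand-in for a training run: a seeded random walk."""
    rng = random.Random(seed)  # private generator, so runs do not interfere
    position = 0.0
    trajectory = []
    for _ in range(steps):
        position += rng.uniform(-1.0, 1.0)
        trajectory.append(position)
    return trajectory


# The same seed reproduces the same trajectory; a different seed does not.
same = rollout(seed=1) == rollout(seed=1)
different = rollout(seed=1) != rollout(seed=2)
print(same, different)
```

Using a dedicated `random.Random` instance rather than the module-level generator keeps the sketch free of hidden global state, which is one common way to make a run repeatable.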

You can read more about CleanRL in our technical paper and documentation.

Good luck and have fun 🚀

Get started

Prerequisites: Python and Poetry.

To run experiments locally, give the following a try:

git clone https://github.com/vwxyzjn/cleanrl.git && cd cleanrl
poetry install

# alternatively, you could use `poetry shell` and do
# `python cleanrl/ppo.py`
poetry run python cleanrl/ppo.py \
    --seed 1 \
    --env-id CartPole-v0 \
    --total-timesteps 50000

# open another terminal and enter `cd cleanrl/cleanrl`
tensorboard --logdir runs
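
The command above passes `--seed`, `--env-id`, and `--total-timesteps`. As a rough sketch of how such flags could be declared, assuming the script parses them with Python's `argparse` (an assumption, not something stated here), consider the following. The function name `parse_args` and all default values are hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Parse the three flags used in the example command."""
    parser = argparse.ArgumentParser()
    # Defaults below are illustrative only; the actual scripts may differ.
    parser.add_argument("--seed", type=int, default=1,
                        help="random seed of the experiment")
    parser.add_argument("--env-id", type=str, default="CartPole-v1",
                        help="id of the environment")
    parser.add_argument("--total-timesteps", type=int, default=25000,
                        help="total number of environment steps")
    return parser.parse_args(argv)


# Parse the same values as the command above, without touching sys.argv.
args = parse_args(["--seed", "1",
                   "--env-id", "CartPole-v0",
                   "--total-timesteps", "50000"])

# argparse turns dashes into underscores in attribute names.
print(args.seed, args.env_id, args.total_timesteps)
```

Passing an explicit list to `parse_args` makes the sketch easy to try in a notebook or test; calling it with no argument falls back to the real command line.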

To use experiment tracking with wandb, run:

wandb login # only required for the first time
poetry run python cleanrl/ppo.py \
    --seed 1 \
    --env-id CartPole-v0 \
    --total-timesteps 50000 \
    --track \
    --wandb-project-name cleanrltest
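
To repeat the tracked run above over several seeds, the command line can be generated programmatically. The sketch below only builds and prints the commands; it uses the flags shown above, while the helper name `build_command` and the choice of seeds 1 to 3 are hypothetical.

```python
import shlex


def build_command(seed, env_id="CartPole-v0", total_timesteps=50000,
                  track=False, project="cleanrltest"):
    """Return the argument list for one PPO run with the given seed."""
    cmd = [
        "poetry", "run", "python", "cleanrl/ppo.py",
        "--seed", str(seed),
        "--env-id", env_id,
        "--total-timesteps", str(total_timesteps),
    ]
    if track:
        # Tracking flags are appended only when requested.
        cmd += ["--track", "--wandb-project-name", project]
    return cmd


# One command per seed; nothing is executed here.
commands = [build_command(seed, track=True) for seed in (1, 2, 3)]
for cmd in commands:
    print(shlex.join(cmd))
```

To actually launch the runs, each list could be handed to `subprocess.run(cmd, check=True)` from the repository root.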

To run training scripts in other games:

poetry shell

# classic control
python cleanrl/dqn.py --env-id CartPole-v1
python cleanrl/ppo.py --env-id CartPole-v1
python cleanrl/c51.py --env-id CartPole-v1

# atari
poetry install -E atari
python cleanrl/dqn_atari.py --env-id BreakoutNoFrameskip-v4
python cleanrl/c51_atari.py --env-id BreakoutNoFrameskip-v4
python cleanrl/ppo_atari.py --env-id BreakoutNoFrameskip-v4
python cleanrl/apex_dqn_atari.py --env-id BreakoutNoFrameskip-v4

# NEW: 3-4x speed-up, free of side effects, with envpool's atari (Linux only)
poetry install -E envpool
python cleanrl/ppo_atari_envpool.py --env-id BreakoutNoFrameskip-v4
# Learn Pong-v5 in ~5-10 mins with the tuned hyperparameters below
# With these settings, side effects such as lower sample efficiency might occur
poetry run python cleanrl/ppo_atari_envpool.py --clip-coef=0.2 --num-envs=16 --num-minibatches=8 --num-steps=128 --update-epochs=3

# pybullet
poetry install -E pybullet
python cleanrl/td3_continuous_action.py --env-id MinitaurBulletDuckEnv-v0
python cleanrl/ddpg_continuous_action.py --env-id MinitaurBulletDuckEnv-v0
python cleanrl/sac_continuous_action.py --env-id MinitaurBulletDuckEnv-v0

# procgen
poetry install -E procgen
python cleanrl/ppo_procgen.py --env-id starpilot
python cleanrl/ppg_procgen.py --env-id starpilot

# ppo + lstm
python cleanrl/ppo_atari_lstm.py --env-id BreakoutNoFrameskip-v4
python cleanrl/ppo_memory_env_lstm.py
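
The tuned envpool command above sets `--num-envs=16`, `--num-steps=128`, `--num-minibatches=8`, and `--update-epochs=3`. Assuming the usual PPO bookkeeping, in which each update collects one step from every environment for `num_steps` steps and splits the result evenly into minibatches, the implied sizes can be worked out as follows. The helper name `ppo_batch_sizes` is hypothetical.

```python
def ppo_batch_sizes(num_envs, num_steps, num_minibatches, update_epochs):
    """Derive rollout and optimisation sizes from PPO hyperparameters."""
    # Transitions collected per policy update.
    batch_size = num_envs * num_steps
    # Transitions per gradient step.
    minibatch_size = batch_size // num_minibatches
    # Gradient steps taken on each collected batch.
    gradient_steps = update_epochs * num_minibatches
    return batch_size, minibatch_size, gradient_steps


batch_size, minibatch_size, gradient_steps = ppo_batch_sizes(
    num_envs=16, num_steps=128, num_minibatches=8, update_epochs=3
)
print(batch_size, minibatch_size, gradient_steps)  # 2048 256 24
```

Under these assumptions, each update gathers 2048 transitions and performs 24 gradient steps on minibatches of 256.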

You may also use a prebuilt development environment hosted in Gitpod.

Algorithms Implemented

| Algorithm | Variants Implemented |
| --- | --- |
| ✅ Proximal Policy Optimization (PPO) | ppo.py, docs |
| | ppo_atari.py, docs |
| | ppo_continuous_action.py, docs |
| | ppo_atari_lstm.py |
| | ppo_procgen.py |
| ✅ Deep Q-Learning (DQN) | dqn.py |
| | dqn_atari.py |
| ✅ Categorical DQN (C51) | c51.py |
| | c51_atari.py |
| ✅ Apex Deep Q-Learning (Apex-DQN) | apex_dqn_atari.py |
| ✅ Soft Actor-Critic (SAC) | sac_continuous_action.py |
| ✅ Deep Deterministic Policy Gradient (DDPG) | ddpg_continuous_action.py |
| ✅ Twin Delayed Deep Deterministic Policy Gradient (TD3) | td3_continuous_action.py |

Open RL Benchmark

CleanRL has a subproject called Open RL Benchmark (https://benchmark.cleanrl.dev/), where we have tracked thousands of experiments across domains. The benchmark is interactive, and researchers can easily query information such as GPU utilization and videos of an agent's gameplay, which are normally hard to obtain from other RL benchmarks.

Support and get involved

We have a Discord Community for support. Feel free to ask questions. Posts in GitHub Issues and PRs are also welcome. Our past video recordings are available on YouTube.

Citing CleanRL

If you use CleanRL in your work, please cite our technical paper:

@article{huang2021cleanrl,
    title={CleanRL: High-quality Single-file Implementations of Deep Reinforcement Learning Algorithms}, 
    author={Shengyi Huang and Rousslan Fernand Julien Dossa and Chang Ye and Jeff Braga},
    year={2021},
    eprint={2111.08819},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

cleanrl's People

Contributors

vwxyzjn, dosssman, bragajj, yooceii, jkterry1, adamcakg, felipemartins96, helges, sudo-michael, bentrevett, chutaklee
