
joyrl's Introduction

JoyRL


JoyRL is a parallel reinforcement learning library based on PyTorch and Ray. Unlike existing RL libraries, JoyRL aims to relieve users of the burden of implementing algorithms with tricky details, unfriendly APIs, and so on. Users can train and test RL algorithms by configuring hyperparameters alone, which makes the library much easier for beginners to learn and use. JoyRL also supports many state-of-the-art RL algorithms, including RLHF, the core of ChatGPT (see Algorithms below). In addition, JoyRL provides a modular framework so that users can customize their own algorithms and environments.

Install

⚠️ Note: do not install JoyRL through a mirror!

# you need to install Anaconda first
conda create -n joyrl python=3.10
conda activate joyrl
pip install -U joyrl

Torch install:

# CPU
pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1
# CUDA 11.8
pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu118
# CUDA 12.1
pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu121
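
After installing, it can help to confirm which PyTorch build the environment actually picked up. The helper below is a minimal sketch and not part of JoyRL: `describe_torch_install` is a hypothetical name, and it relies only on `torch.__version__` and `torch.cuda.is_available()`.

```python
import importlib.util


def describe_torch_install() -> str:
    """Return a one-line summary of the local PyTorch install (hypothetical helper)."""
    # find_spec avoids an ImportError when torch is missing.
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch

    # A CUDA wheel on a machine without a usable GPU still reports "cpu" here.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    return f"torch {torch.__version__} ({device})"


if __name__ == "__main__":
    print(describe_torch_install())
```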

Usage

Quick Start

The following is a demo of how to use JoyRL. First create a YAML file to configure the hyperparameters, then run the command below in your terminal. That is all you need to do to train a DQN agent on the CartPole-v1 environment.

joyrl --yaml ./presets/ClassControl/CartPole-v1/CartPole-v1_DQN.yaml

Alternatively, run the following code in a Python file:

import joyrl
if __name__ == "__main__":
    print(joyrl.__version__)
    yaml_path = "./presets/ClassControl/CartPole-v1/CartPole-v1_DQN.yaml"
    joyrl.run(yaml_path = yaml_path)

Documentation

More tutorials and API documentation are hosted on JoyRL docs or JoyRL 中文文档.

Algorithms

| Name | Reference | Author | Notes |
|---|---|---|---|
| Q-learning | RL introduction | johnjim0816 | |
| Sarsa | RL introduction | johnjim0816 | |
| DQN | DQN Paper | johnjim0816 | |
| Double DQN | DoubleDQN Paper | johnjim0816 | |
| Dueling DQN | DuelingDQN Paper | johnjim0816 | |
| NoisyDQN | NoisyDQN Paper | johnjim0816 | |
| DDPG | DDPG Paper | johnjim0816 | |
| TD3 | TD3 Paper | johnjim0816 | |
| A2C/A3C | A3C Paper | johnjim0816 | |
| PPO | PPO Paper | johnjim0816 | |
| SoftQ | SoftQ Paper | johnjim0816 | |

Why JoyRL?

RL Platform GitHub Stars # of Alg. (1) Custom Env Async Training RNN Support Multi-Head Observation Backend
Baselines GitHub stars 9 ✔️ (gym) ✔️ TF1
Stable-Baselines GitHub stars 11 ✔️ (gym) ✔️ TF1
Stable-Baselines3 GitHub stars 7 ✔️ (gym) ✔️ PyTorch
Ray/RLlib GitHub stars 16 ✔️ ✔️ ✔️ ✔️ TF/PyTorch
SpinningUp GitHub stars 6 ✔️ (gym) PyTorch
Dopamine GitHub stars 7 TF/JAX
ACME GitHub stars 14 ✔️ (dm_env) ✔️ ✔️ TF/JAX
keras-rl GitHub stars 7 ✔️ (gym) Keras
cleanrl GitHub stars 9 ✔️ (gym) poetry
rlpyt GitHub stars 11 ✔️ ✔️ PyTorch
ChainerRL GitHub stars 18 ✔️ (gym) ✔️ Chainer
Tianshou GitHub stars 20 ✔️ (Gymnasium) ✔️ ✔️ PyTorch
JoyRL GitHub stars 11 ✔️ (Gymnasium) ✔️ ✔️ ✔️ PyTorch

Here are some other highlights of JoyRL:

  • Provide a series of Chinese courses JoyRL Book (with the English version in progress), suitable for beginners to start with a combination of theory

Contributors

John Jim, Peking University

Qi Wang, Shanghai Jiao Tong University

Yiyuan Yang, University of Oxford

joyrl's People

Contributors

johnjim0816, kailigithub, qiwang067, scchy, skypow2012, zdynb


joyrl's Issues

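About joyrl: a beginner's confusion, asking the author for help

Most tutorials available online teach with existing environments, but for many people who are not reinforcement learning researchers, the hardest part is quickly applying an RL package to problems in their own field. I therefore hope the author can publish a tutorial, or even record a video, that covers the whole process: downloading joyrl, writing a custom environment, calling the corresponding RL algorithm in joyrl, and finally solving the problem. I think a complete tutorial like this would be a real step forward compared with other tutorials. I hope the author will consider it and address this beginner pain point soon. The postgraduate entrance exam has just finished, and many incoming graduate students are starting to look for directions that interest them; a good tutorial at this moment would be a great help to those who want to learn RL.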

The _check_obs_action_space_info method in offline_run.py is wrong

Running the offline test code raises the following error:

(RL) liber@DESKTOP-HJ34P4I D:\Projects\joyrl>python offline_run.py --yaml presets/ClassControl/CartPole-v1/CartPole-v1_DQN.yaml
2024-09-03 10:35:53,380 INFO worker.py:1612 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8265 
Traceback (most recent call last):
  File "D:\Projects\joyrl\offline_run.py", line 240, in <module>
    launcher.run()
  File "D:\Projects\joyrl\offline_run.py", line 229, in run
    self._check_obs_action_space_info(env)
  File "D:\Projects\joyrl\offline_run.py", line 217, in _check_obs_action_space_info
    action_type_list, action_size_list = self._check_obs_action_space_info(env)
  File "D:\Projects\joyrl\offline_run.py", line 217, in _check_obs_action_space_info
    action_type_list, action_size_list = self._check_obs_action_space_info(env)
  File "D:\Projects\joyrl\offline_run.py", line 217, in _check_obs_action_space_info
    action_type_list, action_size_list = self._check_obs_action_space_info(env)
  [Previous line repeated 991 more times]
  File "D:\Projects\joyrl\offline_run.py", line 216, in _check_obs_action_space_info
    self.cfg.obs_space_info = ObsSpaceInfo(size = state_size_list, type = state_type_list)
  File "D:\Projects\joyrl\joyrl\framework\core_types.py", line 78, in __init__
    self._check_type_size()
  File "D:\Projects\joyrl\joyrl\framework\core_types.py", line 81, in _check_type_size
    assert len(self.type) == len(self.size), 'obs type and size must have the same length'
RecursionError: maximum recursion depth exceeded while calling a Python object

On inspection, two methods named _check_obs_action_space_info are defined. Renaming the second one to _check_obs_state_space_info, and changing self._check_obs_action_space_info(env) in the run method to self._check_obs_state_space_info(env), resolved the problem.
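
The failure mode in this report can be reproduced in isolation. This is a minimal sketch, not the actual `offline_run.py` code: the classes and return values are made up, and only the method names are borrowed from the report. In Python, a second `def` with the same name inside a class body silently replaces the first, so a call that was meant to reach the first method ends up calling itself.

```python
class BrokenLauncher:
    """Two methods share one name; the second definition replaces the first."""

    def _check_obs_action_space_info(self, env):
        # Intended as the action-space helper, but it is overwritten below.
        return ["discrete"], [2]

    def _check_obs_action_space_info(self, env):  # noqa: F811 (deliberate redefinition)
        # Meant to call the helper above; it actually calls itself.
        return self._check_obs_action_space_info(env)


class FixedLauncher:
    """Same logic with the second method renamed, mirroring the reported fix."""

    def _check_obs_action_space_info(self, env):
        return ["discrete"], [2]

    def _check_obs_state_space_info(self, env):
        action_types, action_sizes = self._check_obs_action_space_info(env)
        return action_types, action_sizes


def hits_recursion_limit(launcher) -> bool:
    """Return True if the launcher's check recurses until Python gives up."""
    try:
        launcher._check_obs_action_space_info(env=None)
    except RecursionError:
        return True
    return False
```

Calling `hits_recursion_limit(BrokenLauncher())` returns `True`, while the renamed version returns normally. Linters such as pyflakes flag this pattern as a redefinition of an unused name.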

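Many other .yaml files fail to run

Hello, I would like to know why I keep getting errors when I switch the .yaml file to other files in presets. Can only the example you provided be run? Many of the other .yaml files do not work.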

Issues with `observation_space` and `action_space` of a custom environment

Python version: 3.10.14
joyrl version: 0.6.5.1
Pytorch version: torch 2.2.1+cu121
torchaudio 2.2.1+cu121
torchvision 0.17.1+cu121

I intend to define the observation space as follows:
[Image_of_one_channel, 1d-vector, 1d-vector]
and the action (output) space as follows:
[1d-vector]

I defined the spaces as follows:

self.observation_space = spaces.Tuple(spaces=[
    spaces.Box(low=0, high=float("inf"), shape=(1, N, N)),
    spaces.Box(low=0, high=float("inf"), shape=(A,)),
    spaces.Box(low=0, high=float("inf"), shape=(B, ))
])
self.action_space = spaces.Box(low=0, high=float("inf"), shape=(B, 1))

where the capitalized letters stand for numbers. However, I noticed that support for spaces.Tuple is not yet implemented, as shown below:

# run.py, class Launcher
...
    def _check_obs_action_space_info(self, env):
        obs_space = env.observation_space
        if isinstance(obs_space, Box):
            if len(obs_space.shape) == 3:
                state_type_list = [ObsType.IMAGE]
                state_size_list = [[obs_space.shape[0], obs_space.shape[1], obs_space.shape[2]]]
            else:
                state_type_list = [ObsType.VECTOR]
                state_size_list = [[obs_space.shape[0]]]
        elif isinstance(obs_space, Discrete):
            state_type_list = [ObsType.VECTOR]
            state_size_list = [[obs_space.n]]
        else:
            raise ValueError('obs_space type error')
...
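
One way the check above could be extended to composite observations is to handle each sub-space of a `Tuple` in turn and concatenate the results. The following is a sketch under stated assumptions, not JoyRL's implementation: `Box`, `Discrete`, and `Tuple` are small stand-ins that mimic only the `shape`, `n`, and `spaces` attributes of the Gymnasium classes, `ObsType` is a hypothetical two-member enum modelled on the names in the snippet, and `parse_obs_space` is an invented helper name.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Sequence


class ObsType(Enum):  # hypothetical stand-in for JoyRL's ObsType
    VECTOR = 1
    IMAGE = 2


@dataclass
class Box:  # stand-in exposing only `shape`
    shape: Sequence[int]


@dataclass
class Discrete:  # stand-in exposing only `n`
    n: int


@dataclass
class Tuple:  # stand-in exposing only `spaces`
    spaces: Sequence[object]


def parse_obs_space(space) -> tuple[list[ObsType], list[list[int]]]:
    """Return parallel lists of observation types and sizes for a space."""
    if isinstance(space, Tuple):
        types, sizes = [], []
        # Recurse into each sub-space and keep the results in order.
        for sub_space in space.spaces:
            sub_types, sub_sizes = parse_obs_space(sub_space)
            types.extend(sub_types)
            sizes.extend(sub_sizes)
        return types, sizes
    if isinstance(space, Box):
        if len(space.shape) == 3:
            return [ObsType.IMAGE], [list(space.shape)]
        return [ObsType.VECTOR], [[space.shape[0]]]
    if isinstance(space, Discrete):
        return [ObsType.VECTOR], [[space.n]]
    raise ValueError(f"unsupported observation space: {type(space).__name__}")


# Example with made-up sizes: one single-channel image and two vectors.
obs_space = Tuple([Box((1, 8, 8)), Box((3,)), Box((5,))])
types, sizes = parse_obs_space(obs_space)
```

With the real Gymnasium classes the same attribute names apply, but the rest of the launcher and the network builder would also need to accept more than one entry in these lists.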

The second problem arises when I try to define the action space as a continuous vector in $\mathbf{R}^k$; unfortunately I did not save the error log. In general, the action layers are parsed so that the last layer has output shape 0, which raises an error. When I tried to modify the source code to force it to be non-zero, other errors occurred.

Finally, the corresponding network architecture field in the .yaml file is:

algo_cfg:
  branch_layers:
    - name: view
      layers:
      - layer_type: conv2d
        in_channel: 1
        out_channel: 16 
        kernel_size: 4
        stride: 2
        activation: relu
      - layer_type: pooling
        pooling_type: max2d
        kernel_size: 2
        stride: 2
        padding: 0
      - layer_type: flatten
      - layer_type: norm
        norm_type: LayerNorm
        normalized_shape: 512
      - layer_type: linear
        layer_size: [128]
        activation: relu
    - name: lwh
      layers:
      - layer_type: linear
        layer_size: [32]
        activation: relu
      - layer_type: linear
        layer_size: [32]
        activation: relu
    - name: parts
      layers:
      - layer_type: linear
        layer_size: [256]
        activation: relu
      - layer_type: linear
        layer_size: [256]
        activation: relu
  merge_layers:
    - layer_type: linear
      layer_size: [256]
      activation: relu
    - layer_type: linear
      layer_size: [256]
      activation: relu
...
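
When debugging a branch like `view`, one quantity worth checking is the feature count that reaches the `norm` layer. The sketch below assumes the standard output-size formula for convolution and pooling, `floor((n + 2p - k) / s) + 1`, zero padding for the `conv2d` layer (the config does not state it), and that `normalized_shape` is meant to equal the flattened size. `conv_out` and `flattened_size` are hypothetical helper names, not JoyRL functions.

```python
def conv_out(size: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Spatial output size of a conv or pooling layer along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_size(n: int, out_channels: int = 16) -> int:
    """Features after conv2d(k=4, s=2) -> max2d(k=2, s=2) -> flatten on a 1 x n x n input."""
    after_conv = conv_out(n, kernel=4, stride=2)
    after_pool = conv_out(after_conv, kernel=2, stride=2)
    return out_channels * after_pool * after_pool


if __name__ == "__main__":
    # Example input sizes chosen for illustration only.
    for n in (12, 20, 26):
        print(n, flattened_size(n))
```

Under these assumptions the result is always 16 times a perfect square (64, 256, and 576 for the sizes printed), so a value of 512 cannot occur for a square input. If the assumptions match the real network, `normalized_shape` would need to be set to the computed size for the chosen N.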
