Hello, I am a beginner in reinforcement learning. Thank you very much for providing such an accessible library for learning. Since the number of steps in each of my episodes is not fixed, I would like to know how to train and record metrics on a per-episode basis. If that is not easy to implement, would training on a per-step basis cause any problems, and what exactly is being recorded in that case? Is it the average over several episodes, or something else? Thank you!
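As a general illustration (not tied to this library's specific API), per-episode recording just means accumulating reward until the episode terminates and logging once per episode, which works regardless of how long each episode runs. The environment and all names below are hypothetical stand-ins:

```python
import random

class DummyEnv:
    """Hypothetical environment whose episodes have variable length."""
    def reset(self):
        self.t = 0
        self.horizon = random.randint(5, 15)  # episode length is not fixed
        return 0.0
    def step(self, action):
        self.t += 1
        reward = 1.0
        done = self.t >= self.horizon  # episode ends after a variable number of steps
        return 0.0, reward, done

env = DummyEnv()
episode_returns = []
for episode in range(10):
    obs = env.reset()
    ep_return, done = 0.0, False
    while not done:                      # run until the episode ends, whatever its length
        obs, reward, done = env.step(0)  # placeholder action
        ep_return += reward
    episode_returns.append(ep_return)    # one record per episode, not per step

print(len(episode_returns))  # 10
```

If a library instead logs per step, a common convention is to report a running average of the returns of the most recently completed episodes, so the curve is still interpretable even though the x-axis is environment steps.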
Hello, this is an issue about file 2.1. Since I ran the program on the CPU rather than CUDA, I changed the default device to CPU in the main script, and the model trained successfully.
But when loading the second model, I get: "RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU." I don't know how to fix this.
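The error message itself points to the fix: a checkpoint saved from CUDA tensors must be remapped to CPU at load time with `map_location`. A minimal self-contained sketch (the model and checkpoint path here are placeholders, not this library's actual files):

```python
import torch

# Stand-in for the model saved during training (hypothetical architecture).
model = torch.nn.Linear(4, 2)
torch.save(model.state_dict(), "checkpoint.pth")

# map_location forces every storage in the checkpoint onto the CPU, which avoids
# the "Attempting to deserialize object on a CUDA device" error on CPU-only machines.
state = torch.load("checkpoint.pth", map_location=torch.device("cpu"))
model.load_state_dict(state)
print(all(t.device.type == "cpu" for t in state.values()))  # True
```

In practice, find the `torch.load(...)` call in the loading script and add the `map_location=torch.device('cpu')` argument; saving with `state_dict()` rather than pickling the whole model also makes checkpoints more portable across devices.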
I hope this message finds you well. I am a PhD student from SJTU. A friend of mine recommended this repository to me. I noticed similarities between the tutorial I am currently developing and yours.
I am in the process of creating a Reinforcement Learning (RL) tutorial that aims to provide a comprehensive resource with both code examples and in-depth mechanism explanations. You can find the initial codebase for my tutorial at this repository: https://github.com/SCP-CN-001/RL101. At present, it appears that both of us have completed the coding segment of our respective tutorials.
I am reaching out to gauge your interest in collaborating on the documentation aspect of these tutorials. If you find merit in the idea of combining our efforts to enhance the educational value of our materials, I believe we can create a more comprehensive and impactful resource.
If this proposal intrigues you, please feel free to reach out to me via the email address provided on my GitHub profile: https://github.com/SCP-CN-001.
Looking forward to the possibility of collaborating with you on this endeavor.
Hi, I'm having a problem with action prediction in TD3. Once the agent starts learning, it tends to predict actions at the boundaries of the action space. Do you know what might be causing this?