This repository provides code, exercises, and implementations of popular reinforcement learning algorithms. I compiled it from the resources listed below to complement my own learning.
Each folder corresponds to one or more chapters of the textbook and/or course it follows. In addition to exercises and solutions, each folder contains a list of learning goals, a brief concept summary, and links to the relevant readings.
tylertaewook/RLpractice
├── 1. MDP
│ └── gym_test.py
├── 2. Dynamic Programming
│ ├── Gamblers Problem.ipynb
│ ├── Policy Evaluation.ipynb
│ ├── Policy Iteration.ipynb
│ └── Value Iteration.ipynb
├── 3. Monte Carlo
│ ├── Blackjack Playground.ipynb
│ ├── MC Control with Epsilon-Greedy Policies.ipynb
│ ├── MC Prediction.ipynb
│ └── Off-Policy MC Control with Weighted Importance Sampling.ipynb
├── 4. Temporal Difference
│ ├── Cliff Environment Playground.ipynb
│ ├── Q-Learning.ipynb
│ ├── SARSA.ipynb
│ └── Windy Gridworld Playground.ipynb
├── DQN
│ ├── Breakout Playground.ipynb
│ ├── Deep Q Learning.ipynb
│ └── dqn.py
├── Function Approximation
│ ├── MountainCar Playground.ipynb
│ └── Q-Learning with Value Function Approximation.ipynb
├── LICENSE
├── Policy Gradient
│ ├── CliffWalk Actor Critic Solution.ipynb
│ ├── CliffWalk REINFORCE with Baseline Solution.ipynb
│ ├── Continuous MountainCar Actor Critic Solution.ipynb
│ ├── README.md
│ └── a3c
├── Pytorch
│ ├── CNN-Transfer.ipynb
│ ├── CNN-advanced.ipynb
│ ├── CNN.ipynb
│ ├── DNN.ipynb
│ ├── GAN.ipynb
│ ├── PyTorch Tutorial.ipynb
│ ├── Tutorial_Autograd.ipynb
│ ├── Tutorial_DQN.ipynb
│ ├── Tutorial_Dataloader.ipynb
│ ├── Tutorial_Model.ipynb
│ ├── Tutorial_Optimization SaveLoading model.ipynb
│ ├── Tutorial_Savemodel.ipynb
│ └── Tutorial_Tensors.ipynb
└── README.md
RL
PyTorch
Master resource: https://github.com/ritchieng/the-incredible-pytorch#Tutorials
Fundamental concepts of PyTorch: https://github.com/jcjohnson/pytorch-examples
Minimal tutorial (no comments): https://github.com/vinhkhuc/PyTorch-Mini-Tutorials
After the official PyTorch tutorial: https://github.com/yunjey/pytorch-tutorial
- Dynamic Programming Policy Evaluation
- Dynamic Programming Policy Iteration
- Dynamic Programming Value Iteration
- Monte Carlo Prediction
- Monte Carlo Control with Epsilon-Greedy Policies
- Monte Carlo Off-Policy Control with Importance Sampling
- SARSA (On-Policy TD Learning)
- Q-Learning (Off-Policy TD Learning)
- Q-Learning with Linear Function Approximation
- Deep Q-Learning for Atari Games
- Double Deep Q-Learning for Atari Games
- Deep Q-Learning with Prioritized Experience Replay (WIP)
- Policy Gradient: REINFORCE with Baseline
- Policy Gradient: Actor Critic with Baseline
- Policy Gradient: Actor Critic with Baseline for Continuous Action Spaces
- Deterministic Policy Gradients for Continuous Action Spaces (WIP)
- Deep Deterministic Policy Gradients (DDPG) (WIP)
- Asynchronous Advantage Actor Critic (A3C)
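
To give a feel for what the tabular entries in this list involve, here is a minimal sketch of Q-Learning. It is not taken from the notebooks in this repo. The five-state chain environment, the `step` and `epsilon_greedy` helpers, and all hyperparameter values are assumptions made up for illustration, and it uses only the Python standard library rather than Gym.

```python
import random

N_STATES = 5          # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Toy deterministic environment: reward 1 only on reaching the last state."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def epsilon_greedy(q_row, epsilon, rng):
    """Random action with probability epsilon, else a greedy one (ties broken at random)."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q_row)
    return rng.choice([a for a in ACTIONS if q_row[a] == best])


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-Learning on the toy chain; returns the learned Q-table."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = epsilon_greedy(q[state], epsilon, rng)
            next_state, reward, done = step(state, action)
            # Off-policy TD target: bootstrap from the best next action,
            # regardless of which action the behaviour policy takes next.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
# Greedy action in each non-terminal state after training.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(greedy_policy)
```

SARSA would differ only in the target line: it bootstraps from the action the behaviour policy actually selects in `next_state` instead of taking the `max`, which is what makes it on-policy.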