Several policy gradient agents have been implemented for a discrete environment (CartPole-v0) and a continuous environment (Pendulum-v0) from OpenAI Gym. All the agents are implemented using TensorFlow.
The CartPole-v0 environment from OpenAI Gym is a discrete environment whose action space contains two actions, 0 and 1. The agents implemented are:
- Vanilla Policy Gradient
- Policy Gradient with Baselines
- Actor-Critic Agent
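
The difference between the vanilla and baseline variants can be sketched in plain Python without TensorFlow. The helper names `discounted_returns` and `advantages_with_mean_baseline`, the discount factor of 0.99, and the choice of the episode mean as the baseline are all assumptions made for illustration; the agents in this repository may use a learned value function or another baseline instead.

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = []
    running = 0.0
    # Walk the episode backwards so each step reuses the return of the next one.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages_with_mean_baseline(returns):
    """Subtract the mean return (an assumed, very simple baseline)."""
    baseline = sum(returns) / len(returns)
    return [g - baseline for g in returns]


if __name__ == "__main__":
    # CartPole-v0 gives a reward of 1 per step, so a 3-step episode looks like this.
    episode_rewards = [1.0, 1.0, 1.0]
    returns = discounted_returns(episode_rewards, gamma=0.5)
    print(returns)
    print(advantages_with_mean_baseline(returns))
```

In a vanilla policy gradient update, the log-probability of each action would be weighted by its return; with a baseline, it would be weighted by the advantage instead, which typically reduces the variance of the gradient estimate.
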
The Pendulum-v0 environment from OpenAI Gym is a continuous environment with action space [-2, 2]. The agents implemented are:
- Vanilla Policy Gradient
- Policy Gradient with Baselines
- Actor-Critic Agent
- Gaussian Policy Agent
- Normalized Gaussian Policy Agent
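
For the continuous case, a Gaussian policy could look roughly like the sketch below, written in plain Python rather than TensorFlow. The function names `gaussian_log_prob` and `sample_action` are hypothetical, and clipping the sampled action to the [-2, 2] range is an assumption about how out-of-range samples might be handled; it is not taken from the agents in this repository.

```python
import math
import random

# Pendulum-v0 action bounds, as stated above.
ACTION_LOW, ACTION_HIGH = -2.0, 2.0


def gaussian_log_prob(action, mean, std):
    """Log-density of a 1-D normal distribution N(mean, std^2) at `action`."""
    var = std * std
    return -0.5 * ((action - mean) ** 2 / var + math.log(2.0 * math.pi * var))


def sample_action(mean, std, rng=random):
    """Sample from the policy and clip to the valid range (assumed handling)."""
    raw = rng.gauss(mean, std)
    clipped = min(max(raw, ACTION_LOW), ACTION_HIGH)
    # Return both: the raw sample for the log-probability, the clipped one for the env.
    return raw, clipped


if __name__ == "__main__":
    rng = random.Random(0)
    raw, clipped = sample_action(mean=0.5, std=1.0, rng=rng)
    print(raw, clipped, gaussian_log_prob(raw, 0.5, 1.0))
```

In a policy network, `mean` (and possibly `std`) would be outputs of the model for the current state, and the log-probability would be multiplied by the return or advantage to form the loss.
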