Here you can find several projects dedicated to Deep Reinforcement Learning methods.
These projects were developed as part of the Udacity Deep Reinforcement Learning Nanodegree program.
The projects cover the main Deep Reinforcement Learning architectures:
Value-Based Methods and the Bellman Equation,
Policy-Based Methods,
Policy-Gradient Methods, and
Actor-Critic Methods.
- Monte-Carlo Methods
In Monte Carlo (MC), we play episodes of the game until we reach the end, collect the rewards along the way,
and move backward from the end of the episode to its start. We repeat this a sufficient number of times and
average the value of each state (see the first-visit MC sketch after this list).
- Temporal Difference Methods and Q-learning
- Reinforcement Learning in Continuous Space (Deep Q-Network)
- Function Approximation and Neural Networks
The Universal Approximation Theorem (UAT) states that a feed-forward neural network with a
single hidden layer containing a finite number of nodes can approximate any continuous function,
provided rather mild assumptions about the form of the activation function are satisfied
(see the curve-fitting sketch after this list).
- Policy-Based Methods, Hill Climbing, Simulated Annealing
Random-restart hill climbing is a surprisingly effective algorithm in many cases. Simulated annealing is a useful
probabilistic technique because, by occasionally accepting worse candidates, it avoids mistaking a local extremum
for a global one (see the annealing sketch after this list).
- Policy-Gradient Methods, REINFORCE, PPO
We define a performance measure J(\theta) to maximize and learn the policy parameters \theta through
approximate gradient ascent (see the REINFORCE sketch after this list).
- Actor-Critic Methods, A3C, A2C, DDPG, SAC
The key difference between A3C and A2C is the asynchronous part. A3C consists of multiple independent
agents (networks) with their own weights, which interact with different copies of the environment in
parallel. Thus, they can explore a bigger part of the state-action space in much less time
(the loss both variants share is sketched after this list).
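A minimal sketch of the first-visit Monte-Carlo idea above, assuming a classic Gym-style API (`env.reset()` returns a state, `env.step()` returns `(state, reward, done, info)`) and a given `policy` function; all names are illustrative:

```python
from collections import defaultdict

def mc_state_values(env, policy, num_episodes=10_000, gamma=1.0):
    """First-visit Monte-Carlo estimate of V(s) under a fixed policy.
    States must be hashable (e.g., Gridworld cells)."""
    returns = defaultdict(list)                  # state -> observed returns
    for _ in range(num_episodes):
        # Play one episode to the end, recording (state, reward) pairs.
        episode, state, done = [], env.reset(), False
        while not done:
            action = policy(state)
            next_state, reward, done, _ = env.step(action)
            episode.append((state, reward))
            state = next_state
        # Move backward through the episode, accumulating the return G.
        G = 0.0
        for t in reversed(range(len(episode))):
            s, r = episode[t]
            G = r + gamma * G
            if s not in (e[0] for e in episode[:t]):   # first visit only
                returns[s].append(G)
    # Average the collected returns to estimate each state's value.
    return {s: sum(gs) / len(gs) for s, gs in returns.items()}
```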
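To make the UAT point concrete, a sketch of a single-hidden-layer network fitting a continuous function (sin here) in PyTorch:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.linspace(-3.0, 3.0, 256).unsqueeze(1)   # inputs
y = torch.sin(x)                                   # continuous target function

# One hidden layer with a finite number of nodes, as in the UAT statement.
net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

for _ in range(2000):
    loss = nn.functional.mse_loss(net(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()

print(f"final MSE: {loss.item():.6f}")   # approaches zero as the fit improves
```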
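A generic simulated-annealing sketch for maximizing a black-box score function; in the policy-based setting, `x` would be a policy's weight vector and `f` the episode return (names are illustrative, and `f` is assumed deterministic):

```python
import math
import random

def simulated_annealing(f, x0, neighbor, t0=1.0, cooling=0.995, steps=5_000):
    """Maximize f by occasionally accepting downhill moves; the acceptance
    probability exp(delta / t) is what keeps the search from locking onto
    a local extremum early."""
    x, best, t = x0, x0, t0
    for _ in range(steps):
        cand = neighbor(x)                 # random perturbation of x
        delta = f(cand) - f(x)
        if delta > 0 or random.random() < math.exp(delta / t):
            x = cand
            if f(x) > f(best):
                best = x
        t *= cooling                       # cool the temperature schedule
    return best
```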
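The gradient-ascent step in REINFORCE amounts to maximizing \sum_t G_t \log \pi_\theta(a_t|s_t); below is a one-episode PyTorch sketch, assuming a Gym-style environment and a `policy` network that outputs a probability vector over discrete actions (names are illustrative):

```python
import torch

def reinforce_episode(env, policy, optimizer, gamma=0.99):
    """One episode of REINFORCE: collect a trajectory, then ascend J(theta)."""
    log_probs, rewards = [], []
    state, done = env.reset(), False
    while not done:
        probs = policy(torch.as_tensor(state, dtype=torch.float32))
        dist = torch.distributions.Categorical(probs)
        action = dist.sample()
        state, reward, done, _ = env.step(action.item())
        log_probs.append(dist.log_prob(action))
        rewards.append(reward)
    # Discounted return G_t for every time step, computed backward.
    returns, G = [], 0.0
    for r in reversed(rewards):
        G = r + gamma * G
        returns.insert(0, G)
    returns = torch.as_tensor(returns)
    # Gradient *ascent* on J(theta) == gradient descent on its negative.
    loss = -(torch.stack(log_probs) * returns).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return sum(rewards)
```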
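A2C and A3C optimize the same per-step objective; the asynchrony in A3C only changes where the gradients are computed. A sketch of the shared loss, assuming one tensor entry per transition (names are illustrative):

```python
import torch

def actor_critic_loss(log_probs, values, returns,
                      entropies=None, value_coef=0.5, entropy_coef=0.01):
    """Shared A2C/A3C loss for a batch of transitions."""
    # Advantage A = G - V(s); detached so the policy term does not
    # backpropagate into the critic.
    advantages = returns - values.detach()
    policy_loss = -(log_probs * advantages).mean()    # actor: ascend J(theta)
    value_loss = (returns - values).pow(2).mean()     # critic: regress to returns
    loss = policy_loss + value_coef * value_loss
    if entropies is not None:
        loss = loss - entropy_coef * entropies.mean() # entropy bonus for exploration
    return loss
```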
- CartPole, Policy-Based Methods, Hill Climbing
- CartPole, Policy-Gradient Methods, REINFORCE
- Markov Decision Process, Monte-Carlo, Gridworld 6x6
- Pong, Policy-Gradient Methods, PPO
- Pong, Policy-Gradient Methods, REINFORCE
- Project 1: Navigation, Deep Q-Network, ReplayBuffer
- Project 2: Continuous Control-Reacher, DDPG, environment Reacher (Double-Jointed Arm)
- Project 2: Continuous Control-Crawler, PPO, environment Crawler
- Project 3: Collaboration_Competition-Tennis, Multi-agent DDPG, environment Tennis
- BipedalWalker, Twin Delayed DDPG (TD3)
- BipedalWalker, PPO, Vectorized Environment
- BipedalWalker, Soft Actor-Critic (SAC)
- BipedalWalker, A2C, Vectorized Environment
- CarRacing with PPO, Learning from Raw Pixels
- Pong, 8 parallel agents
- CarRacing, Single agent, Learning from pixels
- Crawler, 12 parallel agents
- BipedalWalker, 16 parallel agents
- on Policy-Gradient Methods, see 1, 2, 3.
- on REINFORCE, see 1, 2, 3.
- on PPO, see 1, 2, 3, 4, 5.
- on DDPG, see 1, 2.
- on Actor-Critic Methods and A3C, see 1, 2, 3, 4.
- on TD3, see 1, 2, 3.
- on SAC, see 1, 2, 3, 4, 5.
- on A2C, see 1, 2, 3, 4, 5.
How does the Bellman equation work in Deep Reinforcement Learning?
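For reference, the Bellman optimality equation that value-based methods such as DQN approximate (here for the action-value function):

```latex
Q^*(s, a) \;=\; \mathbb{E}\left[\, r + \gamma \max_{a'} Q^*(s', a') \;\middle|\; s, a \,\right]
```

DQN turns this fixed-point condition into a regression target, r + \gamma \max_{a'} Q(s', a'; \theta^-), for the network's prediction Q(s, a; \theta).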