luke-davidson / reinforcementlearning

Programming assignments completed for my Reinforcement Learning course. Topics include bandit algorithms, dynamic programming, policy iteration, Monte Carlo methods, SARSA, Q-Learning, Dyna-Q/Dyna-Q+, gradient control methods, state aggregation methods, and Deep Q-Networks (DQNs).

Language: Jupyter Notebook (100%)

Topics: bandit-algorithms, deep-learning, deep-q-network, deep-reinforcement-learning, dyna-q, dynamic-programming, gradient-descent-algorithm, monte-carlo, policy-gradient, policy-iteration, q-learning, reinforcement-learning, sarsa-learning

Reinforcement Learning

This repo holds all programming assignments completed for my Reinforcement Learning course (Fall 2022).

Note: Scaffolding code was given for some of these assignments. All of my work is located inside block comments labeled ##### MY WORK START ##### and ##### MY WORK END #####.

Assignment Descriptions

Ex0 --- Exploration Policies

Introducing Reinforcement Learning and policies --- rewards and effects of random, expected-better and expected-worse policies.

Ex1 --- Exploration, Exploitation and Action Selection

Exploring the effects of exploration, exploitation and action selection within the k-armed bandit environment --- epsilon-greedy policies, Q-value initialization, UCB action selection.
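
As a rough illustration of the two action-selection rules named above, here is a minimal sketch that is not taken from the assignment code. The function names, the exploration constant `c`, and the random tie-breaking are assumptions made for this example.

```python
import math
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a uniformly random arm, else a greedy arm."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    best = max(q_values)
    # Break ties between equally good arms at random.
    return rng.choice([a for a, q in enumerate(q_values) if q == best])


def ucb(q_values, counts, t, c=2.0):
    """Upper-confidence-bound selection at time step t (t >= 1)."""
    # Arms that were never pulled are tried first.
    for a, n in enumerate(counts):
        if n == 0:
            return a
    scores = [q + c * math.sqrt(math.log(t) / n)
              for q, n in zip(q_values, counts)]
    return scores.index(max(scores))


def update_mean(q, n, reward):
    """Sample-average update of an estimate q that was built from n rewards."""
    return q + (reward - q) / (n + 1)
```

Optimistic Q-value initialization fits the same sketch: starting `q_values` above any realistic reward makes the greedy branch try every arm early on.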

Note: Ex2 was a written-only assignment, so it has been left out.

Ex3 --- Dynamic Programming + Policy Iteration

Implementing Dynamic Programming policy iteration in a grid world environment --- value iteration, transition probabilities, policy evaluation + improvement.
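
The sketch below shows value iteration followed by greedy policy extraction on a tabular model. It is an illustration under assumptions, not the assignment's grid world: the model format `P[s][a] = [(prob, next_state, reward, done), ...]` and the three-state `CHAIN` environment are hypothetical.

```python
def action_value(P, V, s, a, gamma):
    """Expected one-step return of taking action a in state s under model P."""
    return sum(p * (r + gamma * (0.0 if done else V[s2]))
               for p, s2, r, done in P[s][a])


def value_iteration(P, gamma=0.9, theta=1e-8):
    """Sweep all states until the largest value change drops below theta."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            best = max(action_value(P, V, s, a, gamma) for a in P[s])
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            break
    # Policy improvement: act greedily with respect to the converged values.
    policy = {s: max(P[s], key=lambda a: action_value(P, V, s, a, gamma))
              for s in P}
    return V, policy


# Hypothetical chain 0 - 1 - 2, where state 2 is terminal and each step costs 1.
CHAIN = {
    0: {"left": [(1.0, 0, -1.0, False)], "right": [(1.0, 1, -1.0, False)]},
    1: {"left": [(1.0, 0, -1.0, False)], "right": [(1.0, 2, -1.0, True)]},
}

V, policy = value_iteration(CHAIN)
```

Policy iteration differs in that it alternates a full policy evaluation with the greedy improvement step, rather than folding the maximization into every sweep.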

Ex4 --- Monte Carlo Control

Implementing Monte Carlo policy iteration in Blackjack, four-rooms, and racetrack environments --- first-visit MC, exploring starts, MC policy iteration.
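
To make the first-visit idea concrete, here is a small sketch of first-visit Monte Carlo prediction. The episode format, a list of `(state, reward)` pairs with the reward received on leaving that state, is an assumption for this example and not the format used in the notebooks.

```python
from collections import defaultdict


def first_visit_mc(episodes, gamma=1.0):
    """Average the return that follows the first visit to each state."""
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    for episode in episodes:
        # Record the time step at which each state first appears.
        first_index = {}
        for t, (state, _) in enumerate(episode):
            first_index.setdefault(state, t)
        # Walk backwards so the return G can be accumulated incrementally.
        G = 0.0
        for t in range(len(episode) - 1, -1, -1):
            state, reward = episode[t]
            G = gamma * G + reward
            if first_index[state] == t:
                returns_sum[state] += G
                returns_count[state] += 1
    return {s: returns_sum[s] / returns_count[s] for s in returns_sum}
```

Monte Carlo control extends this by estimating action values instead of state values and improving the policy between episodes, for example with exploring starts.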

Ex5 --- Q-Learning, SARSA, Expected SARSA and Bias/Variance in Temporal-Difference and Monte Carlo Methods

Implementing the Q-Learning, SARSA and Expected SARSA algorithms in a windy grid world environment. Exploring the bias-variance trade-off between Temporal-Difference (TD) and Monte Carlo methods.
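
The three algorithms differ only in how they bootstrap from the next state. The sketch below puts the three TD targets side by side. The function names and the epsilon-greedy behaviour policy used for Expected SARSA are assumptions for illustration.

```python
def epsilon_greedy_probs(q_row, epsilon):
    """Action probabilities of an epsilon-greedy policy over one row of Q."""
    n = len(q_row)
    best = max(range(n), key=lambda a: q_row[a])
    probs = [epsilon / n] * n
    probs[best] += 1.0 - epsilon
    return probs


def td_target(method, reward, q_next, gamma,
              next_action=None, epsilon=0.1, done=False):
    """One-step TD target; q_next holds Q(s', a) for every action a."""
    if done:
        return reward
    if method == "q_learning":
        bootstrap = max(q_next)              # off-policy: best next action
    elif method == "sarsa":
        bootstrap = q_next[next_action]      # on-policy: action actually taken
    elif method == "expected_sarsa":
        probs = epsilon_greedy_probs(q_next, epsilon)
        bootstrap = sum(p * q for p, q in zip(probs, q_next))
    else:
        raise ValueError(f"unknown method: {method}")
    return reward + gamma * bootstrap


def td_update(q_sa, target, alpha):
    """Move the current estimate a fraction alpha towards the target."""
    return q_sa + alpha * (target - q_sa)
```

A Monte Carlo target would replace the bootstrap term with the full observed return. That removes the bias from using current estimates but raises the variance.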

Ex6 --- Dyna-Q and Dyna-Q+

Implementing the Dyna-Q and Dyna-Q+ algorithms in an adaptive blocking maze environment.
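
The following is a simplified sketch of the Dyna-Q loop: direct Q-Learning update, model learning, then planning from the model. The class name `DynaQ`, its parameters, and the deterministic model are assumptions for this example. Setting `kappa` above zero adds a Dyna-Q+ style exploration bonus for pairs not tried recently. This sketch only plans over pairs that have already been visited.

```python
import math
import random
from collections import defaultdict


class DynaQ:
    def __init__(self, n_actions, alpha=0.1, gamma=0.95, kappa=0.0, seed=0):
        self.Q = defaultdict(lambda: [0.0] * n_actions)
        self.model = {}        # (state, action) -> (reward, next_state)
        self.last_visit = {}   # (state, action) -> time step of last real visit
        self.alpha, self.gamma, self.kappa = alpha, gamma, kappa
        self.t = 0
        self.rng = random.Random(seed)

    def _update(self, s, a, r, s2):
        """One Q-Learning backup; unseen states default to zero values."""
        target = r + self.gamma * max(self.Q[s2])
        self.Q[s][a] += self.alpha * (target - self.Q[s][a])

    def observe(self, s, a, r, s2, n_planning=5):
        """Learn from one real transition, then replay simulated ones."""
        self.t += 1
        self._update(s, a, r, s2)              # direct RL
        self.model[(s, a)] = (r, s2)           # model learning (deterministic)
        self.last_visit[(s, a)] = self.t
        for _ in range(n_planning):            # planning
            ps, pa = self.rng.choice(list(self.model))
            pr, ps2 = self.model[(ps, pa)]
            # Dyna-Q+ bonus grows with the time since the pair was last tried.
            bonus = self.kappa * math.sqrt(self.t - self.last_visit[(ps, pa)])
            self._update(ps, pa, pr + bonus, ps2)
```

In a blocking maze, the bonus is what pushes the agent to re-test transitions whose stored model entry may have gone stale after the maze changes.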

Ex7 --- Semi-gradient SARSA, State Aggregation and Linear Function Approximation

Implementing semi-gradient SARSA learning with state aggregation techniques and linear function approximation methods.
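
State aggregation can be viewed as linear function approximation with one-hot features, so the semi-gradient SARSA update only touches one weight. The sketch below assumes integer states `0..n_states-1` grouped into equal-width bins. The function names and this binning scheme are hypothetical.

```python
def aggregate_index(state, action, n_states, n_bins, n_actions):
    """Index of the single active one-hot feature for (state, action)."""
    bin_index = state * n_bins // n_states
    return bin_index * n_actions + action


def semi_gradient_sarsa_step(w, idx, reward, idx_next, alpha, gamma, done):
    """Return updated weights after one transition.

    With one-hot features, q(s, a) = w[idx] and the gradient of q with
    respect to w is 1 at idx and 0 elsewhere. The target is treated as a
    constant, which is what makes the method semi-gradient.
    """
    q_next = 0.0 if done else w[idx_next]
    td_error = reward + gamma * q_next - w[idx]
    new_w = list(w)
    new_w[idx] += alpha * td_error
    return new_w


# Ten states in two bins with two actions gives four weights.
w = [0.0] * 4
idx = aggregate_index(3, 1, n_states=10, n_bins=2, n_actions=2)
idx_next = aggregate_index(7, 0, n_states=10, n_bins=2, n_actions=2)
w = semi_gradient_sarsa_step(w, idx, 1.0, idx_next,
                             alpha=0.5, gamma=0.9, done=False)
```

Richer linear features replace the single index with a feature vector, and the update becomes `w += alpha * td_error * x`.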

Ex8 --- Deep Q-Networks (DQNs)

Implementing DQNs using PyTorch for non-linear function approximation: epsilon schedules, replay buffers, optimization.
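
Two of the supporting pieces named above, the replay buffer and the epsilon schedule, can be sketched without any deep-learning dependency. This is a standard-library illustration and not the notebook's PyTorch code. The class and function names, the linear decay shape, and the default values are assumptions.

```python
import random
from collections import deque, namedtuple

Transition = namedtuple(
    "Transition", ["state", "action", "reward", "next_state", "done"])


class ReplayBuffer:
    """Fixed-size store of transitions; the oldest entries are dropped first."""

    def __init__(self, capacity, seed=None):
        self.memory = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.memory.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Sample without replacement and regroup the batch field by field."""
        batch = self.rng.sample(list(self.memory), batch_size)
        return Transition(*zip(*batch))

    def __len__(self):
        return len(self.memory)


def linear_epsilon(step, start=1.0, end=0.05, decay_steps=10_000):
    """Linearly anneal epsilon from start to end over decay_steps steps."""
    fraction = min(step / decay_steps, 1.0)
    return start + fraction * (end - start)
```

In a training loop, each field of the sampled batch would typically be converted to a tensor before computing the loss against a target network.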
