
imbizo2023rl

Tutorial for IBRO-Simons Computational Neuroscience Imbizo 2023 - Dongyan Lin

Getting started

Recommended path: clone the repository and run RL_tut_exercise.ipynb locally.

Alternatively, you can run COLAB_FRIENDLY_RL_tut_exercise.ipynb in Google Colab.

This tutorial lays out the major concepts of reinforcement learning and offers small practical exercises for testing intuition.

The tutorial covers the following:

  • Markov Decision Processes (MDPs)
  • Environments, specifying transition/reward functions
  • Agents (basic model-free, value-based agents)

The notebook RL_tut_exercises.ipynb leaves small segments of code to be completed within pre-written functions, and a few exercises leave the full implementation to the reader. Solutions are available in the RL_tut_solutions.ipynb notebook.

Section 1: Building up to a Markov Decision Process

Based on David Silver's RL lecture series, we explore a Markov Chain, a Markov Reward Process, and a Markov Decision Process. Functions to sample trajectories, collect rewards, and compute returns are defined in mdp_functions.py.
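For intuition, the sketch below shows one way a small Markov Reward Process can be sampled and its discounted return computed. The transition matrix, reward vector, and helper names here are illustrative assumptions and do not necessarily match the implementations in mdp_functions.py.

```python
import numpy as np

# Illustrative Markov Reward Process: 3 states, the last one absorbing.
P = np.array([[0.5, 0.5, 0.0],   # P[s, s'] = probability of moving from s to s'
              [0.0, 0.5, 0.5],
              [0.0, 0.0, 1.0]])
R = np.array([-1.0, -1.0, 0.0])  # reward associated with each state
gamma = 0.9                      # discount factor

def sample_trajectory(P, start_state=0, max_steps=20, rng=np.random.default_rng()):
    """Sample a sequence of states by repeatedly drawing the next state from P[s]."""
    states = [start_state]
    for _ in range(max_steps):
        s_next = rng.choice(len(P), p=P[states[-1]])
        states.append(int(s_next))
        if s_next == len(P) - 1:  # stop once the absorbing state is reached
            break
    return states

def discounted_return(states, R, gamma):
    """Compute G = sum_t gamma^t * R[s_t] along the sampled trajectory."""
    return sum(gamma**t * R[s] for t, s in enumerate(states))

traj = sample_trajectory(P)
print(traj, discounted_return(traj, R, gamma))
```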

Section 2: Understanding the Environment

In this section we specify the dynamics of the environment with transition and reward functions. We start with a linear track, defining a reward vector and a transition tensor that gives the probability of moving between states along the track. We then move on to a gridworld environment that follows the same pattern, with gridworld functions defined in env_functions.py. We also write a function to control the interaction between an "agent" (with a random policy) and the environment, and we explore this random agent in a few different versions of the same environment (open field, obstacles, an edge of terminal states), drawing on the example gridworlds in env_functions.py.
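As a rough sketch of this pattern (not the actual code in env_functions.py), the example below builds a transition tensor and reward vector for a small linear track and runs a random-policy interaction loop; the shapes and names are assumptions made for illustration.

```python
import numpy as np

n_states = 5                      # linear track: states 0..4, reward at the right end
n_actions = 2                     # 0 = move left, 1 = move right
T = np.zeros((n_actions, n_states, n_states))  # T[a, s, s'] = P(s' | s, a)
for s in range(n_states):
    T[0, s, max(s - 1, 0)] = 1.0              # left moves are clipped at the wall
    T[1, s, min(s + 1, n_states - 1)] = 1.0   # right moves are clipped at the wall
R = np.zeros(n_states)
R[-1] = 1.0                        # reward vector: +1 at the rightmost state

def step(state, action, rng=np.random.default_rng()):
    """One environment step: sample s' from T[a, s] and look up its reward."""
    next_state = int(rng.choice(n_states, p=T[action, state]))
    return next_state, R[next_state]

# Random-policy interaction loop
state, total_reward = 0, 0.0
for _ in range(20):
    action = np.random.randint(n_actions)
    state, reward = step(state, action)
    total_reward += reward
print(total_reward)
```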

Section 3: Understanding the Agent

We write a general tabular agent class where we define a Q-table to store state-action values. The general tabular agent class also has a choose_action method, which takes the state as an argument and returns an action from the $\epsilon$-greedy policy. This is the parent class for the Monte-Carlo, Q-Learning, and SARSA agents we explore in this section. For each of these agent types, we define update_q_table and navigate methods whose behaviour differs slightly between agents.
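The sketch below illustrates the general idea of a tabular agent with an $\epsilon$-greedy choose_action method and per-agent update_q_table methods; the class names, signatures, and hyperparameters are assumptions and may differ from those in the tutorial notebooks.

```python
import numpy as np

class TabularAgent:
    """Illustrative tabular agent with an epsilon-greedy policy over a Q-table."""
    def __init__(self, n_states, n_actions, epsilon=0.1, alpha=0.1, gamma=0.99):
        self.q_table = np.zeros((n_states, n_actions))  # state-action values
        self.epsilon, self.alpha, self.gamma = epsilon, alpha, gamma
        self.n_actions = n_actions

    def choose_action(self, state):
        # With probability epsilon explore; otherwise exploit current Q estimates.
        if np.random.rand() < self.epsilon:
            return np.random.randint(self.n_actions)
        return int(np.argmax(self.q_table[state]))

class QLearningAgent(TabularAgent):
    def update_q_table(self, state, action, reward, next_state):
        # Q-learning: bootstrap from the greedy action in the next state (off-policy).
        target = reward + self.gamma * np.max(self.q_table[next_state])
        self.q_table[state, action] += self.alpha * (target - self.q_table[state, action])

class SARSAAgent(TabularAgent):
    def update_q_table(self, state, action, reward, next_state, next_action):
        # SARSA: bootstrap from the action actually taken next (on-policy).
        target = reward + self.gamma * self.q_table[next_state, next_action]
        self.q_table[state, action] += self.alpha * (target - self.q_table[state, action])
```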

We explore the performance of each of these agents in the same 10x10 open-field gridworld environment. Comparing their performance across different types of environments is left as an exercise.

Miniprojects

Here we outline some projects for learners at different levels of familiarity and comfort with the subject. These projects should take roughly a week of work during afternoon free-work periods at the Imbizo.
