Gridworld - Reinforcement learning

Anaconda environment

conda env create --file requirement.yml

Problem description

The gridworld problem is a good way to illustrate the fundamental aspects of reinforcement learning. The goal is to find a policy (i.e. a mapping from each possible state to an action) that maximizes the payoff. The gridworld problem has two kinds of cost: a payoff associated with each of the two terminal states, and a cost per move in the grid. We thus have an agent moving in a grid, trying to reach the cell where the treasure is (payoff of 1 unit) without falling into a hole (payoff of -1), with a cost per step that can be thought of as fuel in the real world.
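As an illustration, the setting described above can be sketched as a small environment. The grid size, the positions of the treasure and the hole, the step cost of 0.04 and the deterministic moves are all assumptions made for this example and are not taken from the project code. The sketch also assumes that the terminal payoff replaces the step cost on the final move.

```python
# Hypothetical layout: a 3x4 grid with positions chosen for illustration only.
ROWS, COLS = 3, 4
TREASURE = (0, 3)    # terminal state, payoff +1
HOLE = (1, 3)        # terminal state, payoff -1
STEP_COST = -0.04    # assumed cost per move (the "fuel")

ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def is_terminal(state):
    """Return True for the treasure and hole cells."""
    return state in (TREASURE, HOLE)


def step(state, action):
    """Apply a deterministic move and return (next_state, payoff).

    Moving off the grid leaves the agent where it is, but still costs fuel.
    """
    d_row, d_col = ACTIONS[action]
    row, col = state[0] + d_row, state[1] + d_col
    if not (0 <= row < ROWS and 0 <= col < COLS):
        row, col = state
    next_state = (row, col)
    if next_state == TREASURE:
        return next_state, 1.0
    if next_state == HOLE:
        return next_state, -1.0
    return next_state, STEP_COST
```

A policy in this sketch is simply a dictionary from each non-terminal cell to one of the four action names.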

Since the payoff is known only for the terminal states, the first step is to evaluate the payoff for the other states. Those local payoffs depend on the policy chosen. We thus have a loop: we evaluate the payoffs to find a good strategy, then use that strategy to evaluate new payoffs, which give a new strategy, and so on.

Algorithm

Principal reference: Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto

The basic steps are:

  1. Initialisation
  2. Value function evaluation
  3. Policy Improvement
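The three steps can be combined into a policy iteration loop. The following is a minimal sketch, not the project's implementation. The 3x4 layout, the step cost, the discount factor `GAMMA` and every function name are assumptions chosen for illustration.

```python
ROWS, COLS = 3, 4
TERMINAL_PAYOFF = {(0, 3): 1.0, (1, 3): -1.0}   # treasure, hole (assumed positions)
STEP_COST = -0.04                                # assumed cost per move
GAMMA = 0.9                                      # assumed discount factor
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
STATES = [(r, c) for r in range(ROWS) for c in range(COLS)]


def step(state, action):
    """Deterministic move; leaving the grid keeps the agent in place."""
    row, col = state[0] + ACTIONS[action][0], state[1] + ACTIONS[action][1]
    nxt = (row, col) if 0 <= row < ROWS and 0 <= col < COLS else state
    return nxt, TERMINAL_PAYOFF.get(nxt, STEP_COST)


def q_value(state, action, values):
    """Payoff of the move plus the discounted value of the cell reached."""
    nxt, reward = step(state, action)
    return reward + GAMMA * values[nxt]


def evaluate(policy, tol=1e-10):
    """Step 2: sweep over the states until the values stop changing."""
    values = {s: 0.0 for s in STATES}   # terminal cells keep the value 0
    while True:
        delta = 0.0
        for s in policy:
            new = q_value(s, policy[s], values)
            delta = max(delta, abs(new - values[s]))
            values[s] = new
        if delta < tol:
            return values


def improve(policy, values):
    """Step 3: switch to the greedy action only when it is clearly better."""
    new_policy = {}
    for s, current in policy.items():
        best = max(ACTIONS, key=lambda a: q_value(s, a, values))
        # The 1e-6 margin ignores differences that are only numerical noise.
        better = q_value(s, best, values) > q_value(s, current, values) + 1e-6
        new_policy[s] = best if better else current
    return new_policy


def policy_iteration():
    # Step 1: initialisation with an arbitrary policy (always move right).
    policy = {s: "right" for s in STATES if s not in TERMINAL_PAYOFF}
    while True:
        values = evaluate(policy)
        new_policy = improve(policy, values)
        if new_policy == policy:        # stable policy: stop
            return policy, values
        policy = new_policy
```

Keeping the current action on ties is a design choice that prevents the loop from flipping forever between equally good actions.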

One can evaluate a policy by iteration with the pseudo code
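A minimal sketch of iterative policy evaluation is given here under assumptions that are not taken from the project: a hypothetical 3x4 grid, deterministic moves, a step cost of 0.04 and a discount factor of 0.9. The discount factor keeps the values finite even for a policy that never reaches a terminal state.

```python
ROWS, COLS = 3, 4
TERMINAL_PAYOFF = {(0, 3): 1.0, (1, 3): -1.0}   # treasure, hole (assumed positions)
STEP_COST = -0.04                                # assumed cost per move
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
STATES = [(r, c) for r in range(ROWS) for c in range(COLS)]


def step(state, action):
    """Deterministic move; leaving the grid keeps the agent in place."""
    row, col = state[0] + ACTIONS[action][0], state[1] + ACTIONS[action][1]
    nxt = (row, col) if 0 <= row < ROWS and 0 <= col < COLS else state
    return nxt, TERMINAL_PAYOFF.get(nxt, STEP_COST)


def evaluate_policy(policy, gamma=0.9, tol=1e-10):
    """Repeat V(s) <- payoff + gamma * V(s') until no value moves by more than tol."""
    values = {s: 0.0 for s in STATES}   # terminal cells keep the value 0
    while True:
        delta = 0.0
        for state, action in policy.items():
            nxt, reward = step(state, action)
            new_value = reward + gamma * values[nxt]
            delta = max(delta, abs(new_value - values[state]))
            values[state] = new_value   # in-place update, reused within the sweep
        if delta < tol:
            return values


# Usage: evaluate the policy that always moves right.
always_right = {s: "right" for s in STATES if s not in TERMINAL_PAYOFF}
values = evaluate_policy(always_right)
```

Under these assumptions the cell just left of the treasure is worth 1.0 and its left neighbour 0.86, while the bottom row, which only bumps into the wall, converges to -0.4.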



or by using the Monte Carlo method
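A first-visit Monte Carlo evaluation could look like the sketch below. It is one possible variant and not necessarily the one used in the project. The grid layout, the step cost, the discount factor, the random start states and the `max_steps` cap for episodes that never terminate are all assumptions.

```python
import random
from collections import defaultdict

ROWS, COLS = 3, 4
TERMINAL_PAYOFF = {(0, 3): 1.0, (1, 3): -1.0}   # treasure, hole (assumed positions)
STEP_COST = -0.04                                # assumed cost per move
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
STATES = [(r, c) for r in range(ROWS) for c in range(COLS)]


def step(state, action):
    """Deterministic move; leaving the grid keeps the agent in place."""
    row, col = state[0] + ACTIONS[action][0], state[1] + ACTIONS[action][1]
    nxt = (row, col) if 0 <= row < ROWS and 0 <= col < COLS else state
    return nxt, TERMINAL_PAYOFF.get(nxt, STEP_COST)


def monte_carlo_evaluate(policy, episodes=500, gamma=0.9, max_steps=200, seed=0):
    """Average the discounted return observed after the first visit to each state."""
    rng = random.Random(seed)
    starts = list(policy)                 # the non-terminal states
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _ in range(episodes):
        state = rng.choice(starts)        # random start so every state gets sampled
        trajectory = []
        for _ in range(max_steps):        # cap the length of looping episodes
            if state in TERMINAL_PAYOFF:
                break
            nxt, reward = step(state, policy[state])
            trajectory.append((state, reward))
            state = nxt
        # Walk backwards; the last overwrite for a state is its earliest visit.
        g = 0.0
        first_visit_return = {}
        for visited, reward in reversed(trajectory):
            g = reward + gamma * g
            first_visit_return[visited] = g
        for visited, ret in first_visit_return.items():
            totals[visited] += ret
            counts[visited] += 1
    return {s: totals[s] / counts[s] for s in counts}


# Usage: estimate the values of the policy that always moves right.
always_right = {s: "right" for s in STATES if s not in TERMINAL_PAYOFF}
estimates = monte_carlo_evaluate(always_right)
```

Because both the moves and the policy are deterministic in this sketch, every sampled return for a given state is almost identical. The averaging only matters once the moves or the policy become random.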



A short example

Initial policy

The initial policy for the example is:

We evaluate the value function by value iteration or Monte Carlo to obtain the expected payoff at each state:

Improving the policy

We can improve the policy greedily, using the expected payoff of each state, to obtain:
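A greedy improvement step can be sketched as follows: for each state, pick the action whose immediate payoff plus discounted value of the next state is largest. The grid layout, the step cost, the discount factor and the function names are assumptions made for this example.

```python
ROWS, COLS = 3, 4
TERMINAL_PAYOFF = {(0, 3): 1.0, (1, 3): -1.0}   # treasure, hole (assumed positions)
STEP_COST = -0.04                                # assumed cost per move
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
STATES = [(r, c) for r in range(ROWS) for c in range(COLS)]


def step(state, action):
    """Deterministic move; leaving the grid keeps the agent in place."""
    row, col = state[0] + ACTIONS[action][0], state[1] + ACTIONS[action][1]
    nxt = (row, col) if 0 <= row < ROWS and 0 <= col < COLS else state
    return nxt, TERMINAL_PAYOFF.get(nxt, STEP_COST)


def greedy_policy(values, gamma=0.9):
    """Return the policy that is greedy with respect to the given state values."""
    policy = {}
    for state in STATES:
        if state in TERMINAL_PAYOFF:
            continue                      # nothing to choose in a terminal state

        def score(action, state=state):
            nxt, reward = step(state, action)
            return reward + gamma * values[nxt]

        # max() keeps the first best action, so ties follow the order of ACTIONS.
        policy[state] = max(ACTIONS, key=score)
    return policy


# Usage: with all values at zero, only the cells next to a terminal state
# have a clear preference.
zero_values = {s: 0.0 for s in STATES}
policy = greedy_policy(zero_values)
```

With all values at zero, the cell left of the treasure chooses "right" and the cell left of the hole avoids "right". Everywhere else the choice is a tie, which is why the improved policy has to be evaluated again before the next improvement.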

and evaluate the new policy values (using Monte Carlo or Value Iteration) to obtain

Unittest

To run the tests:

python grid_test.py
