q_learning's Introduction

Introduction

The challenge was to build a Q-learning algorithm on top of existing code (see Credits). The Improvements section describes the changes I made.

Run

Run 'python3 Learner.py'

Improvements

I improved the code by changing the following:

  • Replaced unnecessary functions/code (lots of code cleaning).
  • Optimized by using list comprehensions (many for loops were used where they weren't necessary).
  • Used classes ('World' is now a class, which in my opinion is a cleaner way of coding).
  • Merged walls/specials into a single objects variable.
  • Walls give a negative reward of -1 to discourage the bot from going near them. I could have used a much larger penalty, but the goal is for the bot to avoid the red square entirely and to avoid the walls as much as possible. A huge penalty for touching a wall would keep the bot away from walls altogether, which is a problem because there are many of them. The bot should learn to steer clear of walls while still treating the red squares as the bigger loss.
  • After every reset the player position is changed so that the Q-matrix gets filled in more quickly. This gives a higher chance that every cell is visited.
  • Added a function so that the player/green square is not placed on a position already taken by an existing object.
  • Dynamic generation of the matrix (the user can enter their own dimensions).
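
The reset and placement items above can be sketched as follows. This is a minimal illustration, not the code in Learner.py: the name random_free_cell, the (x, y) tuple representation, and the sample objects set are all assumptions made for the example.

```python
import random

def random_free_cell(width, height, occupied, rng=random):
    """Return a random (x, y) cell that is not in `occupied`.

    Hypothetical helper: the real project may store positions differently.
    """
    # List comprehension over every cell, keeping only the unoccupied ones.
    free = [(x, y) for x in range(width) for y in range(height)
            if (x, y) not in occupied]
    if not free:
        raise ValueError("no free cell left on the grid")
    return rng.choice(free)

# Assumed layout: walls and special (red/green) cells share one collection.
objects = {(1, 1), (1, 2), (3, 0)}

# On every reset, the player starts somewhere new that is not an object.
player = random_free_cell(4, 4, objects)
```

Drawing the start position from the free cells only means a reset can never drop the player onto a wall or a goal, and over many resets every free cell gets a chance to be a starting state.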

I thought about generating a random matrix every time the game restarts. However, this wouldn't make much sense: the goal is to find the optimal path, and changing the matrix would change that path. So I decided not to make that change.

Example

Below is an example of a randomly generated 10x10 matrix. This is not the maximum size; the user can enter any dimensions they want.

(Image: sample_grid_q_learning)
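
How the grid is generated is not shown here, so the following is only a sketch of one way to do it for user-supplied dimensions. The function name generate_objects, the cell labels, and the wall and red-cell counts are assumptions for illustration.

```python
import random

WALL, RED, GREEN = "wall", "red", "green"

def generate_objects(width, height, n_walls, n_red, seed=None):
    """Randomly place walls, red cells and one green cell on a grid.

    Hypothetical helper: returns a dict mapping (x, y) -> object type.
    """
    rng = random.Random(seed)
    cells = [(x, y) for x in range(width) for y in range(height)]
    needed = n_walls + n_red + 1  # +1 for the single green cell
    if needed >= len(cells):
        # Keep at least one cell free so the player can be placed.
        raise ValueError("too many objects for this grid size")
    chosen = rng.sample(cells, needed)  # distinct cells, so no overlaps
    objects = {cell: WALL for cell in chosen[:n_walls]}
    objects.update({cell: RED for cell in chosen[n_walls:n_walls + n_red]})
    objects[chosen[-1]] = GREEN
    return objects

# A 10x10 grid like the example above; the counts are made up.
grid_objects = generate_objects(10, 10, n_walls=20, n_red=3, seed=0)
```

Passing a fixed seed keeps the layout identical across restarts, which matches the decision above not to regenerate the matrix while the agent is learning.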

Summary

A simple reinforcement learning example based on the Q-function.

  • Rules: The agent (yellow box) has to reach one of the goals to end the game (green or red cell).
  • Rewards: Each step gives a negative reward of -0.04. The red cell gives a negative reward of -5. The green one gives a positive reward of +5. The black walls give a negative reward of -1.
  • States: Each cell is a state the agent can be in.
  • Actions: There are only 4 actions. Up, Down, Right, Left.
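
The rules above map onto the standard tabular Q-learning update, Q(s, a) += alpha * (r + gamma * max Q(s', a') - Q(s, a)). The sketch below uses the reward values listed above; the learning rate ALPHA, the discount GAMMA, the function names, and the (x, y) state tuples are assumptions and are not taken from Learner.py. How the step cost combines with a cell's reward is also not stated here, so the caller passes in the reward directly.

```python
from collections import defaultdict

ACTIONS = ("up", "down", "left", "right")

# Reward values from the summary above.
STEP_REWARD = -0.04
WALL_REWARD = -1.0
RED_REWARD = -5.0
GREEN_REWARD = 5.0

ALPHA = 0.5  # assumed learning rate
GAMMA = 0.9  # assumed discount factor

def new_q_table():
    """Q-table that lazily creates a zeroed action row for each state."""
    return defaultdict(lambda: {a: 0.0 for a in ACTIONS})

def q_update(q, state, action, reward, next_state, terminal):
    """Apply one Q-learning update and return the new Q(state, action)."""
    # A terminal cell (green or red) has no future value to bootstrap from.
    best_next = 0.0 if terminal else max(q[next_state].values())
    target = reward + GAMMA * best_next
    q[state][action] += ALPHA * (target - q[state][action])
    return q[state][action]

def greedy_action(q, state):
    """Return the action with the highest Q-value in `state`."""
    return max(q[state], key=q[state].get)

q = new_q_table()
# Stepping right from (0, 0) reaches the green cell and ends the game.
first = q_update(q, (0, 0), "right", GREEN_REWARD, (1, 0), terminal=True)
# An ordinary step from (0, 1) up to (0, 0) only costs the step penalty.
second = q_update(q, (0, 1), "up", STEP_REWARD, (0, 0), terminal=False)
```

After these two updates the value of the green cell has already started to propagate back one step: the move into (0, 0) is rated positively even though it earned only the step penalty.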

Credits

Credit for the vast majority of code here goes to PhilippeMorere.

Credits for being awesome go to @Sirajology for enabling us to learn so much about ML!

q_learning's People

Contributors

mickvanhulst
