llsourcell / deep_q_learning

This is the Code for "Deep Q Learning - The Math of Intelligence #9" By Siraj Raval on Youtube

Jupyter Notebook 100.00%

deep_q_learning's Introduction

deep_q_learning

Coding Challenge - Due Date, Thursday August 17th 2017

This week's challenge is to use Q learning to train an agent for any game you'd like. You can use OpenAI's Gym or Universe as a simulation testbed, but don't use any libraries for the Q-Learning algorithm itself. Bonus points if you also build a deep convolutional network from scratch to learn from pixels, which means your agent generalizes to more than just one game (Deep Q Learning). Good luck!
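As a starting point for the challenge, here is a minimal sketch of tabular Q-learning written with only the Python standard library. The corridor environment, the step and train functions, and all hyperparameter values are assumptions made for illustration; none of them come from this repository or from Gym.

```python
import random

# Hypothetical toy environment (not from this repository): a corridor of
# N_STATES cells. The agent starts in cell 0 and receives reward 1.0 when it
# reaches the last cell. Actions: 0 = move left, 1 = move right.
N_STATES = 6
ACTIONS = (0, 1)


def step(state, action):
    """Return (next_state, reward, done) for the toy corridor."""
    if action == 0:
        next_state = max(0, state - 1)
    else:
        next_state = min(N_STATES - 1, state + 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a Q-table with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, otherwise act greedily
            # (ties between equal Q-values are broken at random).
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action for every non-terminal state under the learned Q-table."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
```

To use this with a Gym game, the step function would be replaced by calls to the environment, and the states would need to be discretized or fed to a network instead of indexing a table.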

Overview

This is the code for this video on Youtube by Siraj Raval as part of the Math of Intelligence series. We're going to recreate DeepMind's Deep Q Learner for a variety of games.

Dependencies

keras (http://www.pyimagesearch.com/2016/11/14/installing-keras-with-tensorflow-backend/)
tensorflow (https://www.tensorflow.org/install/)
gym (https://github.com/openai/gym)
collections (part of the Python standard library, no installation needed)

Use pip to install any dependencies.

Usage

Run jupyter notebook in a terminal, open the notebook, and run its cells. If you'd like to run this code on Super Mario, you need to install this additional dependency.

Credit

The credit for this code goes to PeterWittek. I've merely created a wrapper to get people started.

deep_q_learning's Issues

How can we make the bot play multiple instances?

I ran the code and see that the bot plays only once: the script ends when one game is completed (without completing the goal of the game). How can we make it run multiple instances while we observe its movements?
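One common way to get several games out of a single run is to wrap the play phase in a loop over episodes and reset the environment at the start of each one. The sketch below is not the notebook's code: StubEnv and play are hypothetical names, and the stub only imitates the classic Gym interface in which reset() returns an observation and step(action) returns (observation, reward, done, info).

```python
import random


class StubEnv:
    """Hypothetical stand-in for a Gym environment (classic interface)."""

    def __init__(self, max_steps=10):
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        """Start a new game and return the first observation."""
        self.t = 0
        return self.t

    def step(self, action):
        """Advance one step; the game ends after max_steps steps."""
        self.t += 1
        done = self.t >= self.max_steps
        return self.t, 1.0, done, {}


def play(env, choose_action, episodes=5):
    """Play several episodes and return the total reward of each one."""
    totals = []
    for _ in range(episodes):
        observation = env.reset()  # a fresh game for every episode
        done, total = False, 0.0
        while not done:
            action = choose_action(observation)
            observation, reward, done, _info = env.step(action)
            total += reward
        totals.append(total)
    return totals


# A random policy stands in for the trained model here.
episode_rewards = play(StubEnv(), choose_action=lambda obs: random.choice((0, 1)))
```

With a real Gym environment in place of StubEnv, calling the environment's render method inside the inner loop would show each game as it is played. Newer Gym releases changed the return values of reset and step, so the unpacking above may need adjusting for the installed version.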

Observation and training implementations don't appear to be correct.

The observation implementation seems broken. Random sampling is done from the game state with probability 'epsilon', which makes sense, but then the model is sampled with probability (1-epsilon); however, during this time, the model is completely untrained, is not updated during the observation process and is producing outputs that are not related to the problem at all.

Not only that, but in the phase where you're supposed to be training the model, you're taking predictions from the untrained model as your targets, basically teaching the model to reproduce random results based on its own uniform weight initialization. As far as I can tell, this behavior carries over to the final 'play' phase: running the stock implementation, the agent is never able to complete the game and behaves fairly randomly.

Perhaps I'm confused, but could you elaborate on what you think is happening in each phase of the notebook?
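For comparison, the usual Q-learning training target overwrites the prediction for the action that was taken with the observed reward plus the discounted best value of the next state, so the reward signal enters the targets even while the model is untrained. The sketch below illustrates that rule only; it is not the notebook's implementation. build_targets, untrained_model, the batch layout, and gamma=0.9 are all assumptions.

```python
def build_targets(batch, predict, gamma=0.9):
    """Build one target vector per transition.

    batch:   list of (state, action, reward, next_state, done) tuples.
    predict: callable mapping a state to a list of Q-values, one per action
             (a hypothetical stand-in for the model's predict call).
    """
    targets = []
    for state, action, reward, next_state, done in batch:
        # Start from the current predictions so that only the action
        # actually taken contributes to the training error.
        target = list(predict(state))
        if done:
            # Terminal transition: there is no future value to bootstrap from.
            target[action] = reward
        else:
            target[action] = reward + gamma * max(predict(next_state))
        targets.append(target)
    return targets


def untrained_model(state):
    """Hypothetical untrained model: the same Q-values for every state."""
    return [0.5, 0.5]


# Two illustrative transitions; the state labels are placeholders.
batch = [
    ("s0", 1, 0.0, "s1", False),
    ("s1", 0, 1.0, "s2", True),
]
targets = build_targets(batch, untrained_model)
```

Even with a constant model, the two target vectors differ because the rewards and terminal flags differ, which is what gives training something to learn from.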
