Flappy Bird With DQN

DQN (Deep Q-Network) is a reinforcement learning algorithm first proposed by DeepMind in NIPS 2013 (paper on arXiv). Its input is raw pixels and its output is a value function estimating future rewards. By using experience replay, the authors overcame the instability of training the network.
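As a rough sketch of the experience-replay idea (the class and parameter names below are illustrative, not taken from this repo): the agent stores transitions in a fixed-size buffer and trains on uniformly sampled mini-batches, which decorrelates consecutive frames.

import random
from collections import deque

class ReplayMemory:
    """Fixed-size buffer of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest transitions are dropped automatically

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling breaks the correlation between consecutive frames,
        # which is what stabilizes the Q-network updates.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)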

This demo uses DQN to train a convolutional neural network to play the Flappy Bird game. It is a practice project from when I was learning reinforcement learning, and it partly reuses songrotek's code, especially the game engine and the basic idea. Thanks for sharing, and thanks to the spirit and community of open source.

A video of the demo can be found on YouTube, or on Youku (优酷) if you don't have access to YouTube.

DQN implemented by PyTorch

PyTorch is an elegant framework published by Facebook. I implemented the neural network and the training/testing procedure using PyTorch, so you need to install PyTorch to run this demo. The pygame package is also needed by the game engine.
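As a rough illustration of what such a network can look like (the layer sizes below follow the DeepMind-style architecture commonly used for this game; they are assumptions, not necessarily this repo's exact network), a DQN that maps a stack of four preprocessed 80x80 frames to per-action Q-values might be:

import torch
import torch.nn as nn

class FlappyBirdDQN(nn.Module):
    """Maps a stack of 4 preprocessed 80x80 frames to Q-values for 2 actions."""

    def __init__(self, num_actions=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=8, stride=4, padding=2),   # -> 32 x 20 x 20
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, kernel_size=4, stride=2, padding=1),  # -> 64 x 10 x 10
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1),  # -> 64 x 10 x 10
            nn.ReLU(inplace=True),
        )
        self.head = nn.Sequential(
            nn.Linear(64 * 10 * 10, 512),
            nn.ReLU(inplace=True),
            nn.Linear(512, num_actions),  # one Q-value per action
        )

    def forward(self, x):
        x = self.features(x)
        return self.head(x.flatten(start_dim=1))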

How to run the demo

Play the game with pretrained model

To start with, you can play the game using a model I pretrained. Download the pretrained model from Google Drive (or Baidu Netdisk if Google Drive is not available) and run the following commands to play the game. Make sure that the pretrained model is in the root directory of this project.

chmod +x play.sh
./play.sh

For more detailed information about the meaning of the program's arguments, run python main.py --help or refer to the code in main.py.

Train DQN

You can use the following commands to train the model from scratch, or to fine-tune it if a pretrained weight file is provided.

chmod +x train.sh
./train.sh   # see `main.py` for details about the variables

Some tips for training:

  • Do not set memory_size too large (or too small); choose it according to the available memory on your machine.

  • Training takes a long time to complete. I fine-tuned the model several times, manually changing the epsilon of ϵ-greedy exploration each time.

  • When choosing an action randomly during training, I bias the choice toward 'do nothing' rather than 'up'. I think this accelerates convergence; see the get_action_randomly method in BrainDQN.py for details, and the sketch after this list.
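A minimal sketch of such a biased random policy (the 0.9 probability and the function's exact shape are illustrative assumptions; the repo's actual version lives in BrainDQN.py):

import random

def get_action_randomly(do_nothing_prob=0.9):
    """Return a one-hot random action biased toward 'do nothing' (index 0).

    A uniform choice would make the bird flap half the time and fly into the
    top of the screen, so exploring with mostly-glide actions lets the agent
    see more useful transitions early in training.
    """
    action = [0, 0]  # one-hot over [do nothing, up]
    action[0 if random.random() < do_nothing_prob else 1] = 1
    return action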

Disclaimer

This work is based on the repos songrotek/DRL-FlappyBird and yenchenlin1994/DeepLearningFlappyBird. Thanks to both authors!
