
chess-alpha-zero

About

Chess reinforcement learning by AlphaZero methods.

This project is based on the following resources:

  1. DeepMind's October 19th, 2017 publication: Mastering the Game of Go without Human Knowledge
  2. DeepMind's recent arXiv paper: Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
  3. @mokemokechicken's excellent Reversi implementation of the same DeepMind ideas: https://github.com/mokemokechicken/reversi-alpha-zero

Note: This project is still under construction!!

Environment

  • Python 3.6.3
  • tensorflow-gpu: 1.3.0
  • Keras: 2.0.8

Modules

Reinforcement Learning

This AlphaZero implementation consists of two workers, self and opt.

  • self plays the newest model against itself to generate self-play data for training.
  • opt trains on the most recent self-play data to produce new models; the two workers alternate as in the sketch below.
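
A minimal sketch of that loop, using placeholder functions rather than the repo's actual modules (in the real project the workers run as separate processes and exchange models and games through the data/ directory):

# Illustrative only: these names are placeholders, not this repo's API.
def self_play(model, num_games):
    """'self' worker: the newest model plays itself and records the games."""
    return ["game-%d-by-%s" % (i, model) for i in range(num_games)]

def optimize(model, play_data):
    """'opt' worker: train on the most recent self-play data, yielding a new model."""
    return "%s+trained-on-%d-games" % (model, len(play_data))

model = "model_0"
for generation in range(3):          # in practice the workers loop indefinitely
    play_data = self_play(model, num_games=10)
    model = optimize(model, play_data)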

Evaluation

Evaluation options are provided by eval and gui.

  • eval automatically tests the newest model by playing it against an older model (whose age can be specified).
  • gui allows you to personally play against the newest model.

Data

  • data/model/model_*: newest model.
  • data/model/old_models/*: archived old models.
  • data/play_data/play_*.json: generated training data.
  • logs/main.log: log file.

If you want to train a model from scratch, delete the directories and files listed above.
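
For example, a small reset script along these lines clears the generated state (paths taken from the list above; double-check them against your checkout before deleting anything):

import os
import shutil

# Paths assumed from the Data section above; adjust if your layout differs.
for path in ["data/model", "data/play_data", "logs/main.log"]:
    if os.path.isdir(path):
        shutil.rmtree(path)   # remove a whole directory tree
    elif os.path.isfile(path):
        os.remove(path)       # remove a single file (the log)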

How to use

Setup

Install libraries

pip install -r requirements.txt

If you want to use the GPU:

pip install tensorflow-gpu

Set environment variables

Create a .env file containing the following line:

KERAS_BACKEND=tensorflow
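
The entry point presumably reads this file with python-dotenv; setting the variable in Python before importing Keras has the same effect. The snippet below is a generic example, not the repo's startup code:

# Generic example; assumes python-dotenv is installed (pip install python-dotenv).
import os
from dotenv import load_dotenv, find_dotenv

load_dotenv(find_dotenv())                            # picks up KERAS_BACKEND=tensorflow from .env
os.environ.setdefault("KERAS_BACKEND", "tensorflow")  # fallback if no .env is present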

Basic Usage

To train a model from scratch or to continue training an existing one, run Self-Play and the Trainer.

Self-Play

python src/chess_zero/run.py self

When executed, self-play will start using the BestModel. If no BestModel exists, a new random model will be created and become the BestModel.

options

  • --new: create a new model from scratch
  • --type mini: use the mini config for testing (see src/chess_zero/configs/mini.py)
  • --type small: use the small config for commodity hardware (see src/chess_zero/configs/small.py)

Trainer

python src/chess_zero/run.py opt

When executed, training will start. The base model is loaded from the latest saved next-generation model; if none exists, the BestModel is used. The trained model is saved every 2000 steps (mini-batches).
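
The save cadence amounts to something like the sketch below; the constant, checkpoint path, and training call are illustrative stand-ins, not the repo's actual names:

SAVE_EVERY_N_STEPS = 2000  # mini-batches between checkpoints

def training_loop(model, batches, total_step=0):
    # 'model' is assumed to be a Keras model; 'batches' yields (x, y) pairs.
    for x, y in batches:
        model.train_on_batch(x, y)
        total_step += 1
        if total_step % SAVE_EVERY_N_STEPS == 0:
            model.save("checkpoint_%d.h5" % total_step)  # placeholder path
    return total_step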

options

  • --type mini: use the mini config for testing (see src/chess_zero/configs/mini.py)
  • --type small: use the small config for commodity hardware (see src/chess_zero/configs/small.py)
  • --total-step: specify a nonzero starting point for the total step (mini-batch) counter

Evaluator

python src/chess_zero/run.py eval

When executed, evaluation will start. It pits the BestModel against the latest next-generation model over roughly 200 games. If the next-generation model wins, it becomes the new BestModel.
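
The promotion rule boils down to a win-rate comparison like the sketch below; the 0.55 threshold is an illustrative assumption, so check the evaluation settings in the config files for the value actually used:

# Illustrative promotion check; the threshold is an assumption, not
# necessarily the project's configured value.
def should_promote(results, threshold=0.55):
    """results: 1.0 per next-generation win, 0.5 per draw, 0.0 per loss."""
    return sum(results) / len(results) >= threshold

print(should_promote([1.0] * 120 + [0.5] * 30 + [0.0] * 50))  # True (score 0.675)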

options

  • --type mini: use the mini config for testing (see src/chess_zero/configs/mini.py)
  • --type small: use the small config for commodity hardware (see src/chess_zero/configs/small.py)

Play Game

python src/chess_zero/run.py gui

When executed, an ordinary chess board is displayed in Unicode, and you can play against the newest model.
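
The Unicode rendering is what the python-chess library provides out of the box; here is a minimal standalone example of that display (not the project's gui code, and assuming python-chess is installed):

# Standalone python-chess example; not this repo's gui implementation.
import chess

board = chess.Board()
board.push_san("e4")       # play 1. e4
print(board.unicode())     # prints the position using Unicode chess glyphs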

options

  • --type mini: use the mini config for testing (see src/chess_zero/configs/mini.py)
  • --type small: use the small config for commodity hardware (see src/chess_zero/configs/small.py)

Tips and Memos

GPU Memory

Running out of GPU memory usually causes warnings rather than errors. If an error occurs, try changing per_process_gpu_memory_fraction in src/worker/{evaluate.py,optimize.py,self_play.py}:

tf_util.set_session_config(per_process_gpu_memory_fraction=0.2)
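
With TensorFlow 1.x and Keras, limiting the per-process GPU memory fraction looks roughly like this (a sketch of what such a helper typically does, not the repo's actual tf_util code):

# Sketch for TensorFlow 1.x / Keras 2.x; not the actual tf_util implementation.
import tensorflow as tf
from keras import backend as K

config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.2  # cap GPU memory at ~20%
config.gpu_options.allow_growth = True                    # and allocate it lazily
K.set_session(tf.Session(config=config))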

A smaller batch_size will reduce the memory usage of opt. Try changing TrainerConfig#batch_size in NormalConfig.

Tablebases

This implementation supports using the Gaviota tablebases for endgame evaluation. The tablebase files should be placed into the directory chess-alpha-zero/tablebases. The Gaviota bases can be generated from scratch (see the repository), or downloaded directly via torrent (see "Gaviota" on the Olympus Tracker).
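
For reference, probing Gaviota tables from Python looks like this with python-chess (an illustration of the mechanism, not this repo's evaluation code; it assumes the 3-piece tables are present in tablebases/):

# Illustration using python-chess; requires the 3-piece Gaviota tables.
import chess
import chess.gaviota

board = chess.Board("8/8/8/8/8/6k1/8/6KQ w - - 0 1")   # KQ vs K endgame
with chess.gaviota.open_tablebase("tablebases") as tb:
    print(tb.probe_dtm(board))   # distance to mate; positive means the side to move wins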
