piyushc1480 / chess-deep-rl-cpp Goto Github PK

View Code? Open in Web Editor NEW

This project forked from zjeffer/chess-deep-rl-cpp

0.0 0.0 0.0 2.84 MB

C++ version of my chess-deep-rl project. WIP

Shell 0.09% C++ 99.13% CMake 0.58% Dockerfile 0.20%

chess-deep-rl-cpp's Introduction

chess-deep-rl-cpp

C++ version of my chess-deep-rl project. WIP

TODO

How do normal chess engines work?

Contemporary chess engines use the minimax algorithm: for a given position, the engine tries to find the best move by creating a tree of all possible moves to a certain depth, and cutting down paths that lead to bad positions (alpha-beta pruning). It evaluates a position based on a pre-written evaluation function with the pieces as input. One player tries to maximize their score, and the other tries to minimize it.

Image source: By Jez9999, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=3708424

How does this chess engine work?

This chess engine is based on AlphaZero by Deepmind. It uses a neural network to predict the next best move. The neural network learns by playing against itself for a high amount of games, and using their results to train the network. The newly trained neural network is evaluated against the old network by playing many games against each other, and the best network is kept. This process is repeated for a long time.

To play one move, an Agent sends the environment's current position to the MCTS class to run hundreds of simulations of an adapted version of the MCTS algorithm. Every simulation, the input is fed to the neural network. Two outputs are returned: a tensor of probabilities representing the value of each possible move, and a value between -1 and 1 representing the value of the given input state.

After these simulations, the resulting MCTS tree is used to select the best move:

There are two ways to select these moves:

The move with greatest $N$ (deterministically)
- 200 in the above example
According to a distribution (stochastically): $\pi \sim N$
- 50% for action 1 (200/400)
- 10% for action 2 (40/400)
- 40% for action 3 (160/400)

Deterministic selection should be used when evaluating two models, or when playing against the model yourself. Stochastic selection is used when generating games through self-play.

Creating a training set

To train the network, you need a lot of data
You create this data through self-play: letting the AI play against a copy of itself for many games.
For every move, store:
- The state
- The search probabilities
- The winner, (added once the game is over)

Training the network

Sample a mini-batch from a high amount of positions (see training set)
Train the network on the mini-batch

Usage

Clone the repository recursively:

git clone --recursive [email protected]:zjeffer/chess-deep-rl-cpp.git

Build and open the devcontainer using Visual Studio Code. You can also use this engine without a devcontainer, provided you install all dependencies:
- cmake make gcc gdb opencv git vtk hdf5 glew openmpi fmt sudo qt5-base qt6 ninja
- g3log (which can be found in vendor/)
- All installation instructions can be found in .devcontainer/Dockerfile
It is necessary to run the following command on the host system to give the container access to the X11 display: xhost +
Build the executable with CMake
See a list of arguments: ./build/chess-rl --help

A GUI is included to create chessgames through selfplay. Example of the GUI running self-play with 2 games in parallel:

In the GUI, you can: * select the model to use for self-play * choose the folder to save the games in * set the number of games to play in parallel * the amount of MCTS simulations to run every move.

Recommend Projects

piyushc1480 / chess-deep-rl-cpp Goto Github PK

chess-deep-rl-cpp's Introduction

chess-deep-rl-cpp

TODO

How do normal chess engines work?

How does this chess engine work?

Creating a training set

Training the network

Usage

chess-deep-rl-cpp's People

Contributors

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent