fqjin / 2048nn
Train a neural network to play 2048
License: GNU General Public License v3.0
The current MCTS algorithm chooses the move with the highest average log score. A more conservative strategy would choose the move with the highest minimum score. This favors moves with a higher probability of near-term survival, so games should follow high-probability but low-reward lines rather than low-probability but high-reward lines.
An added benefit is that the MCTS process can be terminated as soon as a single game dies. I estimate that this could cut computation time by up to 2 orders of magnitude.
I hypothesize that the resulting games will likely not achieve high scores, but should be more consistent in terms of not dying early. If the goal is to achieve 2048 consistently but nothing higher, this might be the way to go.
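A minimal sketch of the two selection rules, to make the difference concrete. The function name `select_move`, the dict-of-playout-scores input, and the `+ 1` inside the log (to avoid log(0)) are assumptions for illustration, not the repo's actual mcts code. Early termination is not modeled here.

```python
import math


def select_move(scores_by_move, rule="mean_log"):
    """Pick a move from {move: [final scores of its playouts]}.

    rule="mean_log": highest average log score (the current behavior as described).
    rule="min":      highest worst-case score (the conservative proposal).
    """
    if rule == "mean_log":
        def key(move):
            scores = scores_by_move[move]
            # +1 is an assumption to keep log defined for a zero score
            return sum(math.log(s + 1) for s in scores) / len(scores)
    elif rule == "min":
        def key(move):
            return min(scores_by_move[move])
    else:
        raise ValueError(f"unknown rule: {rule}")
    return max(scores_by_move, key=key)


# Hypothetical playout results: "up" is risky, "left" is steady.
playouts = {
    "left": [100, 120, 110],
    "up": [10, 2000, 3000],
}
print(select_move(playouts, "mean_log"))  # up
print(select_move(playouts, "min"))       # left
```

The risky move wins on average log score because of its two long games, while the minimum rule rejects it because of its single early death.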
Ideally, training should sample from a set of many games. This is achievable once the game-generating process (batch MCTS) is accelerated (see #10).
Using make_data with mcts_nn and mcts_batch gives different results. Games generated with mcts_batch end earlier on average, and their scores are as little as half as high. I briefly tested both algorithms and do not yet have an explanation. Training cannot proceed until good-quality data can be generated.
Using PyTorch
Implement a method to convert boards to pretty graphics for display on the main page.
Monte Carlo search takes too long and may not converge to the best move due to the non-normal stopping behavior of the game. Neuroevolution is an alternative, where game playouts are used to evaluate a population, and the best players are selected. This allows direct selection of models that lead to high scores / move counts rather than trying to fit models to a proxy, the estimated best move.
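A rough sketch of the selection loop, assuming a model can be flattened into a list of weights. Everything here is hypothetical: `evolve`, its parameters, and especially `toy_fitness`, which stands in for "play games with this model and return the score or move count". Real playout fitness would be noisy, unlike this toy.

```python
import random

TARGET = [0.5, -1.0, 2.0]  # toy optimum, only used by the stand-in fitness


def toy_fitness(weights):
    """Stand-in for a game playout: higher is better, best at TARGET."""
    return -sum((w - t) ** 2 for w, t in zip(weights, TARGET))


def evolve(fitness, dim, pop_size=20, n_keep=5, sigma=0.1, generations=30, seed=0):
    """Truncation selection with Gaussian mutation and elitism."""
    rng = random.Random(seed)
    population = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluate everyone and keep the best players unchanged (elitism)
        parents = sorted(population, key=fitness, reverse=True)[:n_keep]
        # Refill the population with mutated copies of random parents
        children = [
            [w + rng.gauss(0, sigma) for w in rng.choice(parents)]
            for _ in range(pop_size - n_keep)
        ]
        population = parents + children
    return max(population, key=fitness)


start = evolve(toy_fitness, dim=3, generations=0)
best = evolve(toy_fitness, dim=3, generations=30)
print(toy_fitness(start), toy_fitness(best))
```

Because the top players are carried over unchanged, the best fitness in the population never decreases from one generation to the next when the fitness function is deterministic.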
Unit tests are helpful to make sure core mechanics are not broken.
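As an example of the kind of test meant here, a sketch using the standard `unittest` module. `merge_row_left` is a hypothetical reference implementation on raw tile values, written only so the tests have something to run against; the repo's own merge functions may use a different board encoding and signature.

```python
import unittest


def merge_row_left(row):
    """Slide tiles left and merge equal neighbors once each (0 = empty)."""
    tiles = [t for t in row if t != 0]
    merged = []
    i = 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            merged.append(tiles[i] * 2)
            i += 2  # both tiles are consumed by the merge
        else:
            merged.append(tiles[i])
            i += 1
    return merged + [0] * (len(row) - len(merged))


class TestMergeRow(unittest.TestCase):
    def test_simple_merge(self):
        self.assertEqual(merge_row_left([2, 2, 0, 0]), [4, 0, 0, 0])

    def test_merge_across_gap(self):
        self.assertEqual(merge_row_left([2, 0, 0, 2]), [4, 0, 0, 0])

    def test_no_chained_merge(self):
        # The new 4 must not merge again with the existing 4 in the same move
        self.assertEqual(merge_row_left([2, 2, 4, 0]), [4, 4, 0, 0])

    def test_two_pairs(self):
        self.assertEqual(merge_row_left([2, 2, 2, 2]), [4, 4, 0, 0])

    def test_full_row_unchanged(self):
        self.assertEqual(merge_row_left([2, 4, 8, 16]), [2, 4, 8, 16])
```

Saved as a test module, this can be run with `python -m unittest`.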
Boards can be scaled so that the maximum tile is unit-valued. However, this removes information about the absolute value of the smallest tile (2), which is transformed into a board-dependent fraction. A minimum scale value may need to be input as an extra channel. This can be tested by scaling boards right before training.
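A small sketch of what this could look like, assuming boards hold raw tile values (2, 4, 8, ...) with 0 for empty cells. The names `scale_board` and `to_channels` are hypothetical, and the repo may encode boards differently (for example as exponents).

```python
def scale_board(board):
    """Return (scaled_board, min_scale) with the maximum tile mapped to 1.0.

    min_scale is what a 2 tile becomes after scaling, i.e. 2 / max_tile.
    """
    max_tile = max(max(row) for row in board)
    if max_tile == 0:
        raise ValueError("cannot scale an empty board")
    scaled = [[tile / max_tile for tile in row] for row in board]
    return scaled, 2 / max_tile


def to_channels(board):
    """Build a 2-channel input: scaled tiles plus a constant min-scale plane."""
    scaled, min_scale = scale_board(board)
    plane = [[min_scale for _ in row] for row in board]
    return [scaled, plane]


early = [[2, 4], [8, 0]]
late = [[2, 4], [1024, 0]]
print(scale_board(early))  # ([[0.25, 0.5], [1.0, 0.0]], 0.25)
print(scale_board(late)[1])  # 0.001953125
```

The same 2 tile maps to 0.25 on the first board and about 0.002 on the second, which is the information the extra channel is meant to restore.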
Current selfplay speed is still too slow. The biggest own-time bottlenecks are:
CPU (7 seconds); for some reason, loading the model takes much longer here (+3 sec):
conv2d: 31%
move_batch: 14%
merge_row_batch: 13%
generate_tile: 12%
GPU (15 seconds):
move_batch: 32%
zeros: 18%
generate_tile: 15%
merge_row_batch: 15%
Add a generate_tile() at the end of each move so that it does not need to be called separately after each move in the mcts and play functions.
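One way this could look, as a sketch on plain nested lists with raw tile values. `move_left` and `move_and_spawn` are hypothetical helpers, and this `generate_tile` is a stand-in written for the example, not the repo's function. The 90% / 10% split between new 2 and 4 tiles follows the usual 2048 rule and is an assumption about this project.

```python
import random


def generate_tile(board, rng=random):
    """Place a 2 (90%) or 4 (10%) on a random empty cell; False if board is full."""
    empty = [(r, c) for r, row in enumerate(board) for c, t in enumerate(row) if t == 0]
    if not empty:
        return False
    r, c = rng.choice(empty)
    board[r][c] = 2 if rng.random() < 0.9 else 4
    return True


def move_left(board):
    """Slide and merge every row to the left in place; return True if anything changed."""
    changed = False
    for r, row in enumerate(board):
        tiles = [t for t in row if t != 0]
        out, i = [], 0
        while i < len(tiles):
            if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
                out.append(tiles[i] * 2)
                i += 2
            else:
                out.append(tiles[i])
                i += 1
        out += [0] * (len(row) - len(out))
        if out != row:
            board[r] = out
            changed = True
    return changed


def move_and_spawn(board, move_fn, rng=random):
    """Apply a move and, only if it changed the board, spawn the new tile."""
    moved = move_fn(board)
    if moved:
        generate_tile(board, rng)
    return moved


board = [[2, 2, 0, 0], [0, 0, 0, 0]]
move_and_spawn(board, move_left, random.Random(0))
print(board)
```

Callers then only need `move_and_spawn`, and an illegal move (one that changes nothing) does not spawn a tile.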