AlphaCats

AlphaCats was a failed attempt to solve the game of Exploding Kittens using Deep Counterfactual Regret Minimization (Deep CFR). It is built around the go-cfr package.

Because the game tree is so deep, external sampling is intractable. Other forms of MC-CFR sampling, such as outcome sampling, produced high-variance samples and a model that struggled to converge.
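To see why depth hurts outcome sampling, recall that each sampled terminal history is reweighted by the inverse of the probability of having sampled it. The sketch below is a toy illustration only, not code from AlphaCats or go-cfr; the function name `importance_weight` and the uniform two-action sampling policy are assumptions made for the example.

```python
def importance_weight(sampling_probs):
    """Return 1/q(z) for a path whose actions were sampled with the given probabilities."""
    q = 1.0
    for p in sampling_probs:
        q *= p
    return 1.0 / q


# Hypothetical setting: two equally likely actions at every decision point.
# The weight doubles with each additional level of depth.
shallow = importance_weight([0.5] * 4)   # 16.0
deep = importance_weight([0.5] * 30)     # 2**30, roughly 1.07e9
```

Regret estimates scaled by weights of this size swing widely from one sample to the next, which is one way to understand the variance problem described above.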

Future areas of investigation could include variance-reduction and improved sampling techniques.

No maintenance is intended for this project.

Usage

cmd/alphacats is the main driver binary. CFR iteration can be launched with:

./cmd/alphacats/alphacats -logtostderr \
    -decktype core -cfrtype deep -iter 10 \
    -sampling.num_sampling_threads 5000 \
    -sampling.max_num_actions 2 \
    -sampling.exploration_eps 1.0 \
    -deepcfr.traversals_per_iter 10000 \
    -deepcfr.buffer.size 10000000 \
    -deepcfr.model.num_encoding_workers 4 \
    -deepcfr.model.batch_size 10000 \
    -deepcfr.model.max_inference_batch_size 10000 \
    -output_dir output -v 1 2>&1 | tee run.log

This runs Deep CFR with a reservoir buffer of 10 million samples (-deepcfr.buffer.size) and samples the game tree using robust sampling with K=2 (-sampling.max_num_actions).
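The two ideas named above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the go-cfr implementation: `ReservoirBuffer` and `sample_actions` are hypothetical names, the buffer uses the classic reservoir sampling scheme, and robust sampling is modeled simply as choosing up to K distinct actions uniformly at random at a traverser's decision point.

```python
import random


class ReservoirBuffer:
    """Fixed-capacity buffer that keeps a uniform random sample of everything added."""

    def __init__(self, capacity, rng=None):
        self.capacity = capacity
        self.samples = []
        self.num_seen = 0
        self.rng = rng or random.Random()

    def add(self, sample):
        self.num_seen += 1
        if len(self.samples) < self.capacity:
            # Buffer not yet full: always keep the sample.
            self.samples.append(sample)
            return
        # Buffer full: keep the new sample with probability capacity / num_seen,
        # overwriting a uniformly chosen resident.
        slot = self.rng.randrange(self.num_seen)
        if slot < self.capacity:
            self.samples[slot] = sample


def sample_actions(actions, k, rng):
    """Choose min(k, len(actions)) distinct actions uniformly at random."""
    actions = list(actions)
    return rng.sample(actions, min(k, len(actions)))


# Usage: a tiny buffer and K=2 action sampling (action names are made up).
buf = ReservoirBuffer(capacity=3, rng=random.Random(0))
for i in range(100):
    buf.add(i)
picked = sample_actions(["draw", "skip", "attack"], 2, random.Random(1))
```

With K=2, each traversal follows at most two actions per decision point for the traversing player, which keeps the cost between that of outcome sampling (one action) and external sampling (all actions).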

Tabular CFR can also be launched with -cfrtype tabular. Because it requires a large amount of memory, a smaller test game can be selected with -decktype test. Tabular CFR is not thread-safe and must be run with -sampling.num_sampling_threads 1.

Model

The underlying model used in AlphaCats is an LSTM over the game history that feeds forward into a deep fully connected network.

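# Imports assume standalone Keras 2.x, which provides CuDNNLSTM; adjust for your installation.
# history_shape, hands_shape and N_OUTPUTS are not defined here and must be supplied
# by the surrounding script (see model/train.py).
from keras.layers import (BatchNormalization, Bidirectional, CuDNNLSTM, Dense,
                          Input, concatenate)
from keras.models import Model
from keras.optimizers import Adam

# The history (LSTM) arm of the model.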
history_input = Input(name="history", shape=history_shape)
lstm = Bidirectional(CuDNNLSTM(32, return_sequences=False))(history_input)

# The private hand arm of the model.
hands_input = Input(name="hands", shape=hands_shape)

# Concatenate and predict advantages.
merged_inputs = concatenate([lstm, hands_input])
merged_hidden_1 = Dense(128, activation='relu')(merged_inputs)
merged_hidden_2 = Dense(128, activation='relu')(merged_hidden_1)
merged_hidden_3 = Dense(128, activation='relu')(merged_hidden_2)
merged_hidden_4 = Dense(64, activation='relu')(merged_hidden_3)
merged_hidden_5 = Dense(64, activation='relu')(merged_hidden_4)
normalization = BatchNormalization()(merged_hidden_5)
advantages_output = Dense(N_OUTPUTS, activation='linear', name='output')(normalization)

model = Model(
    inputs=[history_input, hands_input],
    outputs=[advantages_output])
model.compile(
    loss='mean_squared_error',
    optimizer=Adam(clipnorm=1.0),
    metrics=['mean_absolute_error'])

See model/train.py for the training script. During training, samples are first generated using a go-cfr sampler, saved to *.npz files, and then loaded by the script in minibatches. The resulting model is saved in TensorFlow format, and loaded for inference (see model/lstm.go).
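In Deep CFR, the advantages predicted by a network like this one are typically turned into a strategy by regret matching. The sketch below shows that step in plain Python; it is an illustration under assumptions, not the code used by AlphaCats or go-cfr. The function name `regret_matching` is hypothetical, and the fallback for the case where no advantage is positive (play the highest-advantage action) is one common choice; plain regret matching uses a uniform strategy there instead. Masking of illegal actions is not handled.

```python
def regret_matching(advantages):
    """Turn a list of predicted advantages into a probability distribution over actions."""
    positive = [max(a, 0.0) for a in advantages]
    total = sum(positive)
    if total > 0:
        # Play each action in proportion to its positive advantage.
        return [p / total for p in positive]
    # No positive advantage: put all probability on the best action (assumed fallback).
    best = max(range(len(advantages)), key=lambda i: advantages[i])
    return [1.0 if i == best else 0.0 for i in range(len(advantages))]


strategy = regret_matching([1.0, 3.0, -2.0])   # [0.25, 0.75, 0.0]
fallback = regret_matching([-1.0, -0.5])       # [0.0, 1.0]
```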
