polygames's Introduction

Polygames

This README is a work in progress; please feel free to post issues, we are happy to help. To save computational power, pretrained checkpoints are listed here: http://dl.fbaipublicfiles.com/polygames/checkpoints/list.txt (feel free to open an issue to discuss which checkpoint you should use for which game or problem).

For Nix users: see this doc.

Requirements:

C++17 compatible compiler
miniconda3

Compilation Guide:

First install conda and pytorch

Create a fresh conda environment with python3.7, install pytorch and dependencies.

# create a fresh conda environment with python3
# you will need to have miniconda3 set up
conda create --name [your env name] python=3.7 pip

conda activate [your env name] # Or source activate [your env name], depending on conda version.

conda install numpy pyyaml mkl mkl-include setuptools cmake cffi typing
conda install pytorch cudatoolkit=10.1 -c pytorch
conda install -c conda-forge tensorboardx
conda install -c conda-forge openjdk  # optional
conda install -c conda-forge graphviz # optional

pip install visdom
pip install torchviz  # optional

Clone the repo and build

git clone --recursive https://github.com/facebookincubator/polygames
cd polygames

mkdir build
cd build

cmake .. -DCMAKE_BUILD_TYPE=relwithdebinfo -DPYTORCH15=ON
make -j

Ludii support can be disabled by appending -DWITH_LUDII=OFF to the cmake command (required if you don't have a JDK installed).

Content

The repo mainly contains the following folders:

  • the pypolygames python package, which serves as an entry point for the application
  • the src folder, containing all C++ source code and third party libraries
    • the src/games folder, containing the games coded in C++

How to use the application

The application is launched from the pypolygames python package, in one of the following modes:

  • pypolygames train (training mode): a game and a model (as well as several other options, see below) are chosen and the model is iteratively trained with MCTS
  • pypolygames eval (evaluation mode): the model plays against either a pure MCTS or another neural-network-powered MCTS. A training run can be evaluated either offline (from periodically saved checkpoints) or in real time; in the latter case, the evaluation considers only the most recent checkpoint so as to follow the training closely, skipping checkpoints when the eval computation takes longer than the time between consecutive checkpoints. Results are displayed through visdom.
  • pypolygames traineval (training + evaluation mode): it combines the two previous modes so that one command can be launched instead of two. With the real_time option, the two modes run in parallel instead of sequentially.
  • pypolygames human (human mode): a human player plays against the machine

When a training run is launched, it creates a directory named game_GAMENAME_model_MODELNAME_feat_FEATURIZATION_GMT_YYYYMMDDHHMMSS within save_dir, where it writes the relevant files:

  • model.pt
  • train.log
  • stat.tb
  • checkpoints_EPOCH.pt for checkpoints saved every saving_period epochs (e.g., if saving_period == 10, checkpoints_0.pt, checkpoints_10.pt, checkpoints_20.pt, checkpoints_30.pt)

This directory will be the checkpoint_save_dir directory used by evaluation to retrieve the checkpoints to perform eval computation.
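
As a minimal sketch (not part of pypolygames), the checkpoint files described above could be listed in epoch order by parsing the checkpoints_EPOCH.pt pattern. The helper names list_checkpoints and latest_checkpoint are hypothetical, and the code assumes that files follow exactly the naming shown in the list above.

```python
import re
from pathlib import Path

# Assumed naming, taken from the file list above: checkpoints_EPOCH.pt
CHECKPOINT_PATTERN = re.compile(r"^checkpoints_(\d+)\.pt$")


def list_checkpoints(run_dir):
    """Return (epoch, path) pairs found in run_dir, sorted by epoch."""
    found = []
    for path in Path(run_dir).iterdir():
        match = CHECKPOINT_PATTERN.match(path.name)
        if match:
            found.append((int(match.group(1)), path))
    return sorted(found, key=lambda pair: pair[0])


def latest_checkpoint(run_dir):
    """Return the path of the highest-epoch checkpoint, or None if there is none."""
    checkpoints = list_checkpoints(run_dir)
    return checkpoints[-1][1] if checkpoints else None
```

Sorting on the parsed integer rather than on the file name matters here: a plain string sort would place checkpoints_10.pt before checkpoints_2.pt.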

Parameters

The list of parameters for each mode is available with

python -m pypolygames {train,eval,traineval,human} --help

Threads

In train (resp. eval) mode, num_game * num_actor (resp. num_game * num_actor_eval * num_actor_opponent) is the total number of threads. The higher num_actor (and num_actor_eval, num_actor_opponent), the larger the MCTS for a given player.

In human mode, since num_game is set to one, for leveraging the computing power available on the platform, a rule-of-thumb is to set num_actor to 5 times the number of CPUs available (it is platform-dependent though, and performance tests should be done).
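
The arithmetic above can be sketched as follows. This is only an illustration: the function names are hypothetical, and the factor of 5 is the rule of thumb from the text, to be treated as a starting point for performance tests and not as a tuned value.

```python
import os


def train_thread_count(num_game, num_actor):
    """Total number of threads in train mode, per the formula above."""
    return num_game * num_actor


def suggested_human_num_actor(cpu_count=None, factor=5):
    """Rule-of-thumb starting value for num_actor in human mode."""
    if cpu_count is None:
        # os.cpu_count() may return None on some platforms; fall back to 1.
        cpu_count = os.cpu_count() or 1
    return factor * cpu_count
```

For example, train_thread_count(32, 8) gives 256 threads, and suggested_human_num_actor(4) gives 20 for a machine with 4 CPUs.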

Model zoo

All models can be found in pypolygames/model_zoo. They come with a set of sensible parameters that can be customized as well as default games.

Models usually come in pairs: MODELNAMEFCLogitModel and MODELNAMEConvLogitModel:

  • FCLogit models use a fully-connected layer for logit inference and are compatible with all games
  • ConvLogit models use a convolutional layer for logit inference and are only compatible with games whose action space has the same dimensions as their input space (an exception will be raised in case of an attempt to use an incompatible game)

So far the following models are implemented:

  • GenericModel: generic model compatible with all games, default when no model_name is specified
  • NanoFCLogitModel: a simple model with a logit-inference fully-connected layer
  • NanoConvLogitModel: a simple model with a logit-inference convolutional layer
  • ResConvFCLogitModel: resnets with a logit-inference fully-connected layer
  • ResConvConvLogitModel: resnets with a logit-inference convolutional layer
  • UConvFCLogitModel: unets (direct paths between first and last layers) with a logit-inference fully-connected layer
  • UConvConvLogitModel: unets (direct paths between first and last layers) with a logit-inference convolutional layer
  • AmazonsModel: only for the Amazons game

Depending on the actual model chosen, some parameters might not have any use.

Featurization

--out_features=True: the input to the NN includes a channel with 1 on the frontier.
--turn_features=True: the input to the NN includes a channel with the player index broadcasted.
--geometric_features=True: the input to the NN includes 4 geometric channels representing the position on the board.
--random_features=4: the input to the NN includes 4 random features.
--one_feature=True: the input to the NN includes a channel with 1 everywhere.
--history=3: the representation from the last 3 steps is added in the featurization.
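
To make the two constant channels concrete, here is a minimal pure-Python sketch of what "a channel with 1 everywhere" and "a channel with the player index broadcasted" could look like. It is not the Polygames featurization code: the function names, the list-of-planes representation, and the order in which channels are appended are all assumptions made for illustration.

```python
def constant_plane(value, height, width):
    """A height x width plane filled with a single value."""
    return [[float(value)] * width for _ in range(height)]


def add_simple_features(planes, player_index, turn_features=False, one_feature=False):
    """Append the optional constant channels to a list of board planes.

    planes: list of height x width planes (lists of lists of floats).
    The channel order used here is an assumption, not the order used by Polygames.
    """
    height, width = len(planes[0]), len(planes[0][0])
    extra = []
    if turn_features:
        # Player index repeated at every board position.
        extra.append(constant_plane(player_index, height, width))
    if one_feature:
        # A plane of ones.
        extra.append(constant_plane(1, height, width))
    return planes + extra
```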

Examples

Run the following command before running the code

export OMP_NUM_THREADS=1

Examples for the training mode

  • Launch the game Connect4 with the GenericModel
python -m pypolygames train --game_name="Connect4"
  • Launch a game with a specific model and specific parameters
python -m pypolygames train --game_name="Connect4" --out_features=True \
    --model_name="UConvFCLogitModel" \
    --nnsize=16 \
    --nnks=3 \
    --pooling
  • Save checkpoints every 20 epochs in a specific folder
python -m pypolygames train --game_name="Connect4" --model_name="UConvFCLogitModel" \
    --saving_period=20 \
    --save_dir="/checkpoints"
  • Run training on GPU for a max time
python -m pypolygames train --game_name="Connect4" --model_name="UConvFCLogitModel" \
    --device="cuda:0" \
    --max_time=3600
  • Resume training from a given epoch
python -m pypolygames train \
    --save_dir="/checkpoints/game_Connect4_model_GenericModel_feat..._GMT_20190717103728" \
    --init_epoch=42
  • Initiate from a pretrained model
python -m pypolygames train --init_checkpoint="path/to/pretrained_model.pt" \
    --lr=0.001

Note that any checkpoint can serve as a pretrained model

  • Train on multiple GPUs
python -m pypolygames train --init_checkpoint "path/to/pretrained_model.pt" \
    --device cuda:0 cuda:1 cuda:2 cuda:3 cuda:4

In this case cuda:0 will be used for training the model while the remaining devices (cuda:1 through cuda:4) will be used for generating games. If only one device is specified, it will be used for both purposes.
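
The device assignment described here can be sketched as a small helper. This is an illustration of the rule as stated in this README, not the code pypolygames uses; split_devices is a hypothetical name.

```python
def split_devices(devices):
    """Split a --device list into (training device, game-generation devices)."""
    if not devices:
        raise ValueError("at least one device is required")
    train_device = devices[0]
    # With a single device, that device serves both purposes.
    generation_devices = devices[1:] or [train_device]
    return train_device, generation_devices
```

For instance, split_devices(["cuda:0", "cuda:1", "cuda:2"]) returns ("cuda:0", ["cuda:1", "cuda:2"]), while split_devices(["cuda:0"]) returns ("cuda:0", ["cuda:0"]).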

Notes:

  • By default, the number of threads used for processing and the batch sizes for inference are set automatically. These can be overridden with num_thread and per_thread_batchsize respectively.
  • num_game specifies the number of "master" threads scheduling games, and the total number of games being run in parallel will be num_game * per_thread_batchsize. Since per_thread_batchsize is automatically determined by default, this could be a large number in some instances.

Examples for the evaluation mode

  • Run offline evaluation
python -m pypolygames eval \
    --checkpoint_dir="/checkpoints/game_Connect4_model_GenericModel_feat..._GMT_20190717103728"
  • Plot evaluation on http://localhost:10000 at the same time as training happens (training needs to be run from another process)
python -m pypolygames eval \
    --checkpoint_dir="/checkpoints/game_Connect4_model_GenericModel_feat..._GMT_20190717103728" \
    --real_time \
    --plot_enabled \
    --plot_port=10000
  • Run evaluation on cpu with 100 games per evaluation, the pure-MCTS opponent playing 1000 rollouts while the model plays 400 rollouts
python -m pypolygames eval \
    --checkpoint_dir="/checkpoints/game_Connect4_model_GenericModel_feat..._GMT_20190717103728" \
    --device_eval="cpu" \
    --num_game_eval=100 \
    --num_rollouts_eval=400 \
    --num_actor_eval=8 \
    --num_rollouts_opponent=1000 \
    --num_actor_opponent=8
  • A specific checkpoint plays against another neural-network-powered MCTS
python -m pypolygames eval \
    --checkpoint="/checkpoints/checkpoint_600.zip" \
    --num_rollouts_eval=400 \
    --num_actor_eval=8 \
    --checkpoint_opponent="/checkpoints/checkpoint_200.zip" \
    --num_rollouts_opponent=1000 \
    --num_actor_opponent=8
  • Four GPUs are used for evaluating the model, all for inference
python -m pypolygames eval \
    --checkpoint="/checkpoints/checkpoint_600.zip" \
    --device_eval cuda:0 cuda:1 cuda:2 cuda:3 \
    --num_rollouts_eval=400 \
    --num_actor_eval=8 \
    --num_rollouts_opponent=1000 \
    --num_actor_opponent=8

Notes:

  • num_actor_eval, num_rollouts_eval, num_actor_opponent and num_rollouts_opponent are independent from the values used during training; in particular for proper benchmarking num_actor_eval and num_rollouts_eval should be set to the values used in human mode
  • num_game_eval * num_actor_eval (resp. num_game_eval * num_actor_opponent) is the number of threads used by the model to be evaluated (resp. the opponent)
  • there is no per_thread_batchsize in this mode
  • the higher num_actor_eval (resp. num_actor_opponent), the larger the MCTS for a move in a given game will be, up to a limit where overheads between threads lead to diminishing returns. Empirically this limit seems to be around 8. This limit may be game/model/platform dependent and should be tuned for a given instance.
  • against a pure MCTS opponent, num_rollouts_opponent should be set significantly higher than num_rollouts_eval

Examples for the training+evaluation mode

  • Run training first, then evaluation on the last checkpoint
python -m pypolygames traineval --game_name="Connect4" \
    --save_dir="/checkpoints" \
    --num_epoch=1000
  • Plot evaluation on http://localhost:10000 at the same time as training happens
python -m pypolygames traineval --game_name="Connect4" \
    --save_dir="/checkpoints" \
    --real_time \
    --plot_enabled \
    --plot_port=10000

Examples for the human mode

  • Play Connect4 against a pure MCTS as the second player with 8 threads
python -m pypolygames human --game_name="Connect4" \
    --pure_mcts \
    --num_actor 8
  • Play Connect4 against a pretrained model as the first player
python -m pypolygames human \
    --init_checkpoint="/checkpoints/checkpoint_600.zip" \
    --human_first
  • Play with a timer, each side having 1800s in total, and the model playing each move with 0.07 of the remaining time
python -m pypolygames human \
    --init_checkpoint="/checkpoints/checkpoint_600.zip" \
    --total_time=1800 \
    --time_ratio=0.07
  • The model uses four GPUs, all for inference
python -m pypolygames human \
    --init_checkpoint "/checkpoints/checkpoint_600.zip" \
    --device cuda:0 cuda:1 cuda:2 cuda:3
  • The model uses four GPUs, all for inference, and uses the text protocol (actions are represented by x y z, each on one line):
python -m pypolygames tp \
    --init_checkpoint "/checkpoints/checkpoint_600.zip" \
    --device cuda:0 cuda:1 cuda:2 cuda:3

Notes:

  • in human mode, the model being fixed, the goal is to maximize performance given the platform running the model
  • the most effective way to improve model performance is to increase the MCTS size
  • as in training and evaluation, num_actor sets the number of threads; since only one game is played, num_actor is the total number of threads
  • the higher num_actor, the larger the MCTS, up to a limit where overheads between threads lead to diminishing returns. Empirically this limit seems to be around 8. This limit may be game/model/platform dependent and should be tuned for a given instance.
  • in a time-limited game num_rollouts should not be specified as it is maximized within each time_ratio * remaining time period
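
To illustrate how the time budget shrinks over a timed game, here is a minimal sketch. It assumes that each move spends exactly time_ratio times the time remaining before that move; how Polygames actually accounts for elapsed time is not specified here, and move_time_budgets is a hypothetical helper.

```python
def move_time_budgets(total_time, time_ratio, num_moves):
    """Seconds allotted to each of the first num_moves moves.

    Assumes every move spends exactly its budget, i.e.
    time_ratio * (time remaining before that move).
    """
    remaining = total_time
    budgets = []
    for _ in range(num_moves):
        budget = time_ratio * remaining
        budgets.append(budget)
        remaining -= budget
    return budgets
```

With total_time=1800 and time_ratio=0.07, the first move gets about 126 seconds and the second about 117.18 seconds; under this assumption the budget decays geometrically, so the clock never runs out entirely.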

Examples for converting models

Saved checkpoints of models also store details about the game for which they were trained, and can only be used directly for the game in which they were trained. This is why eval runs do not require the --game_name to be specified; this is inferred from the model. The pypolygames convert command can be used to convert models to different games.

  • Fully automated convert between games:
python -m pypolygames convert \
    --init_checkpoint "/checkpoints/checkpoint_600.pt.gz" \
    --game_name="LudiiGomoku.lud" \
    --out="/checkpoints/converted/XToGomoku.pt.gz"

This takes the previously-trained model stored in "/checkpoints/checkpoint_600.pt.gz", modifies it such that it can be used to play the Ludii implementation of Gomoku, and stores this modified version of the model in the new file "/checkpoints/converted/XToGomoku.pt.gz".

This works best when using neural network architectures that are compatible with arbitrary board shapes (such as ResConvConvLogitPoolModel), and source and target games that have identical numbers of channels for state and move tensors, as well as identical semantics for those channels. For instance, the Ludii implementation of Yavalath has the same number of channels with identical semantics (in the same order) as Gomoku. Therefore, if the source model in "/checkpoints/checkpoint_600.pt.gz" was trained using --model_name=ResConvConvLogitPoolModel and --game_name="LudiiYavalath.lud", this conversion can be performed directly without having to delete any parameters or add any new parameters.

  • Fully automated convert between game options:
python -m pypolygames convert \
    --init_checkpoint "/checkpoints/checkpoint_600.pt.gz" \
    --game_options="Board Size/19x19" \
    --out="/checkpoints/converted/Gomoku/15x15_to_19x19.pt.gz"

This example will convert the source checkpoint "/checkpoints/checkpoint_600.pt.gz" into a model that can be used in a game loaded with the additional --game_options="Board Size/19x19" argument. For example, --game_name=LudiiGomoku.lud is by default played on a 15x15 board, but can be played on a larger 19x19 board with the --game_options="Board Size/19x19" argument.

Note that the convert command only takes game options into account if some form of --game_options is explicitly provided among the command line arguments. This means that, if a model was first trained with --game_options="Board Size/19x19", and the goal is to convert it into one for the default board size of 15x15, it is still necessary to provide either --game_options (without any values after it) or --game_options="Board Size/15x15" to the convert script. This tells it that the goal is indeed to revert to default options, as opposed to keeping whichever options were baked into the source model.
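
When the convert command is driven from a script, the three cases above (flag omitted, bare flag, flag with values) are easy to mix up. The following is a minimal sketch of a hypothetical helper, build_convert_args, that makes the distinction explicit. It only uses the flags shown in this section and assumes the command line accepts space-separated values, as in the --init_checkpoint examples above.

```python
def build_convert_args(init_checkpoint, out, game_name=None, game_options=None):
    """Build the argument list for `python -m pypolygames convert`.

    game_options=None  -> flag omitted (options baked into the source model are kept)
    game_options=[]    -> bare --game_options (revert to the game's default options)
    game_options=[...] -> --game_options followed by the given values
    """
    args = ["python", "-m", "pypolygames", "convert",
            "--init_checkpoint", init_checkpoint]
    if game_name is not None:
        args += ["--game_name", game_name]
    if game_options is not None:
        args += ["--game_options", *game_options]
    args += ["--out", out]
    return args
```

The resulting list can be passed to subprocess.run(args, check=True). Using None and an empty list as distinct values mirrors the difference between leaving --game_options out and passing it without values.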

Examples for generating figures of models

If the optional graphviz and torchviz dependencies are installed, we can use torchviz to automatically generate figures of our models. This can be done using the draw_model script:

python -m pypolygames draw_model \
    --game_name="Hex5pie" \
    --model_name="ResConvConvLogitPoolModelV2" \
    --out="/private/home/$USER/ImageName"

This command will generate an image of the ResConvConvLogitPoolModelV2 architecture when playing Hex5pie, and save it to /private/home/$USER/ImageName.png (note that the .png extension will be automatically appended).

Any arguments that can be used to modify the game, or any aspect of the Neural Network architecture, can be used in this command.

Running games through Ludii

See detailed documentation on the Ludii integration here.

Contributing

We welcome contributions! Please check basic instructions here

Initial contributors

Contributors to the early version of Polygames (before open source release) include:

Tristan Cazenave, Univ. Dauphine; Yen-Chi Chen, National Taiwan Normal University; Guan-Wei Chen, National Dong Hwa University; Shi-Yu Chen, National Dong Hwa University; Xian-Dong Chiu, National Dong Hwa University; Julien Dehos, Univ. Littoral Cote d’Opale; Maria Elsa, National Dong Hwa University; Qucheng Gong, Facebook AI Research; Hengyuan Hu, Facebook AI Research; Vasil Khalidov, Facebook AI Research; Chen-Ling Li, National Dong Hwa University; Hsin-I Lin, National Dong Hwa University; Yu-Jin Lin, National Dong Hwa University; Xavier Martinet, Facebook AI Research; Vegard Mella, Facebook AI Research; Jeremy Rapin, Facebook AI Research; Baptiste Roziere, Facebook AI Research; Gabriel Synnaeve, Facebook AI Research; Fabien Teytaud, Univ. Littoral Cote d’Opale; Olivier Teytaud, Facebook AI Research; Shi-Cheng Ye, National Dong Hwa University; Yi-Jun Ye, National Dong Hwa University; Shi-Jim Yen, National Dong Hwa University; Sergey Zagoruyko, Facebook AI Research

License

polygames is released under the MIT license. See LICENSE for additional details about it. Third-party libraries are also included under their own license.

polygames's People

Contributors

dennissoemers, fteytaud, jrapin, juliendehos, reidsanders, remilacroix-idris, teytaud, tscmoo, zxkyjimmy

polygames's Issues

Connect Four: model currently not improving, best parameters for training.

Hey,

I'm trying to understand better the convergence speed of zero learning algorithms. The game I am using is Connect Four which should be simple enough, but currently the model isn't getting better. As you can see below it seems to be overfitting to a subpar policy.

Do you know what are the best parameters for training connect four? I used most of the default parameters as I thought that would suffice.
By the way, currently the hardware I'm using is two Tesla T4s and a CPU with 32 cores. Do you have any idea if the per_thread_batchsize=64 is ideal?

Steps to reproduce

  1. python -m pypolygames traineval --per_thread_batchsize=64 --num_game=32 --model_name=GenericModel --game_name=Connect4 --batchsize=128 --real_time --saving_period=5 --checkpoint_dir="checkpoints" --device cuda:0 cuda:1 --num_rollouts_opponent=1000

Observed Results

[Screenshot: 2020-03-16-112834_1366x768_scrot]

loading checkpoint #5...
Playing 100 games of Connect4:
- GenericModel player uses 400 rollouts per actor with 1 actor
- pure MCTS opponent uses 1000 rollouts per actor with 1 actor
@@@eval: win: 20.00, tie: 3.00, loss: 77.00, avg: 21.50
loading checkpoint #10...
@@@eval: win: 20.00, tie: 2.00, loss: 78.00, avg: 21.00
loading checkpoint #15...
@@@eval: win: 20.00, tie: 1.00, loss: 79.00, avg: 20.50
loading checkpoint #20...
@@@eval: win: 24.00, tie: 1.00, loss: 75.00, avg: 24.50
loading checkpoint #25...
@@@eval: win: 21.00, tie: 0.00, loss: 79.00, avg: 21.00
loading checkpoint #30...
@@@eval: win: 9.00, tie: 5.00, loss: 86.00, avg: 11.50
loading checkpoint #35...
@@@eval: win: 18.00, tie: 1.00, loss: 81.00, avg: 18.50
loading checkpoint #40...
@@@eval: win: 20.00, tie: 2.00, loss: 78.00, avg: 21.00
loading checkpoint #45...
@@@eval: win: 25.00, tie: 3.00, loss: 72.00, avg: 26.50
loading checkpoint #50...
@@@eval: win: 30.00, tie: 5.00, loss: 65.00, avg: 32.50
loading checkpoint #55...
@@@eval: win: 23.00, tie: 4.00, loss: 73.00, avg: 25.00
loading checkpoint #60...
@@@eval: win: 18.00, tie: 4.00, loss: 78.00, avg: 20.00
loading checkpoint #65...
@@@eval: win: 21.00, tie: 3.00, loss: 76.00, avg: 22.50
loading checkpoint #70...
@@@eval: win: 21.00, tie: 4.00, loss: 75.00, avg: 23.00
loading checkpoint #75...
@@@eval: win: 17.00, tie: 6.00, loss: 77.00, avg: 20.00
loading checkpoint #80...
@@@eval: win: 18.00, tie: 2.00, loss: 80.00, avg: 19.00
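
For anyone wanting to plot these numbers, here is a small sketch that extracts them from such a log. It assumes the exact "@@@eval:" line format shown above; parse_eval_lines is a hypothetical helper, not part of pypolygames. In every line of this log the avg value equals win + tie / 2 (for example 20 + 3 / 2 = 21.5).

```python
import re

# Matches lines such as: @@@eval: win: 20.00, tie: 3.00, loss: 77.00, avg: 21.50
EVAL_LINE = re.compile(
    r"@@@eval: win: ([\d.]+), tie: ([\d.]+), loss: ([\d.]+), avg: ([\d.]+)"
)


def parse_eval_lines(log_text):
    """Return a list of dicts with win/tie/loss/avg for each @@@eval line."""
    results = []
    for match in EVAL_LINE.finditer(log_text):
        win, tie, loss, avg = (float(group) for group in match.groups())
        results.append({"win": win, "tie": tie, "loss": loss, "avg": avg})
    return results
```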

missing "internal" test ?

.github/CONTRIBUTING.md suggests "testing the python tools" but it seems there is no such test.

Steps to reproduce

pytest internal pypolygames --durations=10 --verbose

Observed Results

=========================================================================== no tests ran in 0.00s ===========================================================================
ERROR: file not found: internal

Can't Build in Windows 10

Steps to reproduce

Everything works correctly until I try the cmake command.

Observed Results

cmake .. -DCMAKE_BUILD_TYPE=relwithdebinfo –DPYTORCH15=ON
-- The C compiler identification is MSVC 19.29.30037.0
-- The CXX compiler identification is MSVC 19.29.30037.0
-- Detecting C compiler ABI info
CMake Error: Generator: execution of make failed. Make command was /nologo cmTC_e285f\fast &&
-- Detecting C compiler ABI info - failed
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.29.30037/bin/Hostx64/x86/cl.exe
CMake Error: Generator: execution of make failed. Make command was /nologo cmTC_e285f\fast &&
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.29.30037/bin/Hostx64/x86/cl.exe - broken
CMake Error at C:/Users/glenn/miniconda3/Library/share/cmake-3.18/Modules/CMakeTestCCompiler.cmake:66 (message):
The C compiler

"C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.29.30037/bin/Hostx64/x86/cl.exe

is not able to compile a simple test program.

It fails with the following output:

Change Dir: C/Users/glenn/Documents/Polygames/polygames/build/CMakeFiles/CMakeTmp

Run Build Command(s):nmake /nologo cmTC_0b369\fast && The system cannot find the file specified
Generator: execution of make failed. Make command was: nmake /nolog cmTC_0b369\fast &&

CMake will not be able to correctly generate this project.
Call Stack (most recent call first):
CMakeLists.txt:2 (project)

-- Configuring incomplete, errors occurred!
See also "C:/Users/glenn/Documents/Polygames/polygames/build/CMakeFiles/CMakeOutput.log".
See also "C:/Users/glenn/Documents/Polygames/polygames/build/CMakeFiles/CMakeError.log".

------------------------------------------------------CMakeError.log file--------------------------------------------------

Compiling the C compiler identification source file "CMakeCCompilerId.c" failed.
Compiler:
Build flags:
Id flags:

The output was:
1
Microsoft (R) Build Engine version 16.10.1+2fd48ab73 for .NET Framework
Copyright (C) Microsoft Corporation. All rights reserved.

Build started 6/11/2021 12:29:01 AM.
Project "C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CompilerIdC.vcxproj" on node 1 (default targets).
PrepareForBuild:
Creating directory "Debug".
Creating directory "Debug\CompilerIdC.tlog".
InitializeBuildStatus:
Creating "Debug\CompilerIdC.tlog\unsuccessfulbuild" because "AlwaysCreate" was specified.
ClCompile:
C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30037\bin\HostX64\x64\CL.exe /c /nologo /W0 /WX- /diagnostics:column /Od /D _MBCS /Gm- /EHsc /RTC1 /MDd /GS /fp:precise /Zc:wchar_t /Zc:forScope /Zc:inline /Fo"Debug\" /Fd"Debug\vc142.pdb" /external:env:EXTERNAL_INCLUDE /external:W0 /Gd /TC /FC /errorReport:queue CMakeCCompilerId.c
CMakeCCompilerId.c
C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CMakeCCompilerId.c(20,52): error C2146: syntax error: missing ';' before identifier 'COMPILER_ID' [C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CompilerIdC.vcxproj]
C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CMakeCCompilerId.c(20,64): error C2143: syntax error: missing '{' before 'string' [C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CompilerIdC.vcxproj]
C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CMakeCCompilerId.c(20,64): error C2059: syntax error: 'string' [C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CompilerIdC.vcxproj]
Done Building Project "C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CompilerIdC.vcxproj" (default targets) -- FAILED.

Build FAILED.

"C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CompilerIdC.vcxproj" (default target) (1) ->
(ClCompile target) ->
C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CMakeCCompilerId.c(20,52): error C2146: syntax error: missing ';' before identifier 'COMPILER_ID' [C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CompilerIdC.vcxproj]
C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CMakeCCompilerId.c(20,64): error C2143: syntax error: missing '{' before 'string' [C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CompilerIdC.vcxproj]
C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CMakeCCompilerId.c(20,64): error C2059: syntax error: 'string' [C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CompilerIdC.vcxproj]

0 Warning(s)
3 Error(s)

Time Elapsed 00:00:02.89

Compiling the C compiler identification source file "CMakeCCompilerId.c" failed.
Compiler:
Build flags:
Id flags:

The output was:
1
Microsoft (R) Build Engine version 16.10.1+2fd48ab73 for .NET Framework
Copyright (C) Microsoft Corporation. All rights reserved.

Build started 6/11/2021 12:29:04 AM.
Project "C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CompilerIdC.vcxproj" on node 1 (default targets).
PrepareForBuild:
Creating directory "Debug".
Creating directory "Debug\CompilerIdC.tlog".
InitializeBuildStatus:
Creating "Debug\CompilerIdC.tlog\unsuccessfulbuild" because "AlwaysCreate" was specified.
ClCompile:
C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30037\bin\HostX64\x64\CL.exe /c /nologo /W0 /WX- /diagnostics:column /Od /D _MBCS /Gm- /EHsc /RTC1 /MDd /GS /fp:precise /Zc:wchar_t /Zc:forScope /Zc:inline /Fo"Debug\" /Fd"Debug\vc142.pdb" /external:env:EXTERNAL_INCLUDE /external:W0 /Gd /TC /FC /errorReport:queue CMakeCCompilerId.c
CMakeCCompilerId.c
C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CMakeCCompilerId.c(20,52): error C2146: syntax error: missing ';' before identifier 'COMPILER_ID' [C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CompilerIdC.vcxproj]
C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CMakeCCompilerId.c(20,64): error C2143: syntax error: missing '{' before 'string' [C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CompilerIdC.vcxproj]
C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CMakeCCompilerId.c(20,64): error C2059: syntax error: 'string' [C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CompilerIdC.vcxproj]
Done Building Project "C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CompilerIdC.vcxproj" (default targets) -- FAILED.

Build FAILED.

"C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CompilerIdC.vcxproj" (default target) (1) ->
(ClCompile target) ->
C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CMakeCCompilerId.c(20,52): error C2146: syntax error: missing ';' before identifier 'COMPILER_ID' [C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CompilerIdC.vcxproj]
C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CMakeCCompilerId.c(20,64): error C2143: syntax error: missing '{' before 'string' [C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CompilerIdC.vcxproj]
C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CMakeCCompilerId.c(20,64): error C2059: syntax error: 'string' [C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdC\CompilerIdC.vcxproj]

0 Warning(s)
3 Error(s)

Time Elapsed 00:00:01.34

Compiling the CXX compiler identification source file "CMakeCXXCompilerId.cpp" failed.
Compiler:
Build flags:
Id flags:

The output was:
1
Microsoft (R) Build Engine version 16.10.1+2fd48ab73 for .NET Framework
Copyright (C) Microsoft Corporation. All rights reserved.

Build started 6/11/2021 12:29:06 AM.
Project "C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CompilerIdCXX.vcxproj" on node 1 (default targets).
PrepareForBuild:
Creating directory "Debug".
Creating directory "Debug\CompilerIdCXX.tlog".
InitializeBuildStatus:
Creating "Debug\CompilerIdCXX.tlog\unsuccessfulbuild" because "AlwaysCreate" was specified.
ClCompile:
C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30037\bin\HostX64\x64\CL.exe /c /nologo /W0 /WX- /diagnostics:column /Od /D _MBCS /Gm- /EHsc /RTC1 /MDd /GS /fp:precise /Zc:wchar_t /Zc:forScope /Zc:inline /Fo"Debug\" /Fd"Debug\vc142.pdb" /external:env:EXTERNAL_INCLUDE /external:W0 /Gd /TP /FC /errorReport:queue CMakeCXXCompilerId.cpp
CMakeCXXCompilerId.cpp
C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CMakeCXXCompilerId.cpp(14,52): error C2146: syntax error: missing ';' before identifier 'COMPILER_ID' [C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CompilerIdCXX.vcxproj]
C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CMakeCXXCompilerId.cpp(14,64): error C2143: syntax error: missing ';' before 'string' [C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CompilerIdCXX.vcxproj]
C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CMakeCXXCompilerId.cpp(14,64): error C4430: missing type specifier - int assumed. Note: C++ does not support default-int [C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CompilerIdCXX.vcxproj]
Done Building Project "C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CompilerIdCXX.vcxproj" (default targets) -- FAILED.

Build FAILED.

"C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CompilerIdCXX.vcxproj" (default target) (1) ->
(ClCompile target) ->
C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CMakeCXXCompilerId.cpp(14,52): error C2146: syntax error: missing ';' before identifier 'COMPILER_ID' [C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CompilerIdCXX.vcxproj]
C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CMakeCXXCompilerId.cpp(14,64): error C2143: syntax error: missing ';' before 'string' [C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CompilerIdCXX.vcxproj]
C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CMakeCXXCompilerId.cpp(14,64): error C4430: missing type specifier - int assumed. Note: C++ does not support default-int [C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CompilerIdCXX.vcxproj]

0 Warning(s)
3 Error(s)

Time Elapsed 00:00:01.60

Compiling the CXX compiler identification source file "CMakeCXXCompilerId.cpp" failed.
Compiler:
Build flags:
Id flags:

The output was:
1
Microsoft (R) Build Engine version 16.10.1+2fd48ab73 for .NET Framework
Copyright (C) Microsoft Corporation. All rights reserved.

Build started 6/11/2021 12:29:08 AM.
Project "C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CompilerIdCXX.vcxproj" on node 1 (default targets).
PrepareForBuild:
Creating directory "Debug".
Creating directory "Debug\CompilerIdCXX.tlog".
InitializeBuildStatus:
Creating "Debug\CompilerIdCXX.tlog\unsuccessfulbuild" because "AlwaysCreate" was specified.
ClCompile:
C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30037\bin\HostX64\x64\CL.exe /c /nologo /W0 /WX- /diagnostics:column /Od /D _MBCS /Gm- /EHsc /RTC1 /MDd /GS /fp:precise /Zc:wchar_t /Zc:forScope /Zc:inline /Fo"Debug\" /Fd"Debug\vc142.pdb" /external:env:EXTERNAL_INCLUDE /external:W0 /Gd /TP /FC /errorReport:queue CMakeCXXCompilerId.cpp
CMakeCXXCompilerId.cpp
C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CMakeCXXCompilerId.cpp(14,52): error C2146: syntax error: missing ';' before identifier 'COMPILER_ID' [C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CompilerIdCXX.vcxproj]
C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CMakeCXXCompilerId.cpp(14,64): error C2143: syntax error: missing ';' before 'string' [C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CompilerIdCXX.vcxproj]
C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CMakeCXXCompilerId.cpp(14,64): error C4430: missing type specifier - int assumed. Note: C++ does not support default-int [C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CompilerIdCXX.vcxproj]
Done Building Project "C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CompilerIdCXX.vcxproj" (default targets) -- FAILED.

Build FAILED.

"C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CompilerIdCXX.vcxproj" (default target) (1) ->
(ClCompile target) ->
C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CMakeCXXCompilerId.cpp(14,52): error C2146: syntax error: missing ';' before identifier 'COMPILER_ID' [C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CompilerIdCXX.vcxproj]
C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CMakeCXXCompilerId.cpp(14,64): error C2143: syntax error: missing ';' before 'string' [C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CompilerIdCXX.vcxproj]
C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CMakeCXXCompilerId.cpp(14,64): error C4430: missing type specifier - int assumed. Note: C++ does not support default-int [C:\Users\glenn\polygames\build\CMakeFiles\3.19.6\CompilerIdCXX\CompilerIdCXX.vcxproj]

0 Warning(s)
3 Error(s)

Time Elapsed 00:00:01.27

Build fails on AArch64, Fedora 33

[jw@cn06 build]$ make VERBOSE=1 -j1
/usr/bin/cmake -S/data/jw/polygames -B/data/jw/polygames/build --check-build-system CMakeFiles/Makefile.cmake 0
/usr/bin/cmake -E cmake_progress_start /data/jw/polygames/build/CMakeFiles /data/jw/polygames/build//CMakeFiles/progress.marks
make -f CMakeFiles/Makefile2 all
make[1]: Entering directory '/data/jw/polygames/build'
make -f src/third_party/fmt/CMakeFiles/fmt.dir/build.make src/third_party/fmt/CMakeFiles/fmt.dir/depend
make[2]: Entering directory '/data/jw/polygames/build'
cd /data/jw/polygames/build && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /data/jw/polygames /data/jw/polygames/src/third_party/fmt /data/jw/polygames/build /data/jw/polygames/build/src/third_party/fmt /data/jw/polygames/build/src/third_party/fmt/CMakeFiles/fmt.dir/DependInfo.cmake --color=
make[2]: Leaving directory '/data/jw/polygames/build'
make -f src/third_party/fmt/CMakeFiles/fmt.dir/build.make src/third_party/fmt/CMakeFiles/fmt.dir/build
make[2]: Entering directory '/data/jw/polygames/build'
make[2]: Nothing to be done for 'src/third_party/fmt/CMakeFiles/fmt.dir/build'.
make[2]: Leaving directory '/data/jw/polygames/build'
[ 4%] Built target fmt
make -f src/tube/CMakeFiles/_tube.dir/build.make src/tube/CMakeFiles/_tube.dir/depend
make[2]: Entering directory '/data/jw/polygames/build'
cd /data/jw/polygames/build && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /data/jw/polygames /data/jw/polygames/src/tube /data/jw/polygames/build /data/jw/polygames/build/src/tube /data/jw/polygames/build/src/tube/CMakeFiles/_tube.dir/DependInfo.cmake --color=
make[2]: Leaving directory '/data/jw/polygames/build'
make -f src/tube/CMakeFiles/_tube.dir/build.make src/tube/CMakeFiles/_tube.dir/build
make[2]: Entering directory '/data/jw/polygames/build'
[ 5%] Building CXX object src/tube/CMakeFiles/_tube.dir/src_cpp/data_channel.cc.o
cd /data/jw/polygames/build/src/tube && /usr/lib64/ccache/c++ -I/data/jw/torch/install/include -I/data/jw/torch/install/include/TH -I/usr/lib/jvm/java/include -I/usr/lib/jvm/java/include/linux -I/data/jw/polygames/src -I/data/jw/polygames/src/third_party -I/data/jw/polygames/src/third_party/fmt/include -I/usr/include/python3.9 -fsized-deallocation -O3 -ffast-math -fPIC -std=gnu++17 -o CMakeFiles/_tube.dir/src_cpp/data_channel.cc.o -c /data/jw/polygames/src/tube/src_cpp/data_channel.cc
In file included from /data/jw/polygames/src/tube/src_cpp/data_channel.h:16,
from /data/jw/polygames/src/tube/src_cpp/data_channel.cc:8:
/data/jw/polygames/src/tube/src_cpp/data_block.h:10:10: fatal error: torch/extension.h: No such file or directory
10 | #include <torch/extension.h>
| ^~~~~~~~~~~~~~~~~~~
compilation terminated.
make[2]: *** [src/tube/CMakeFiles/_tube.dir/build.make:82: src/tube/CMakeFiles/_tube.dir/src_cpp/data_channel.cc.o] Error 1
make[2]: Leaving directory '/data/jw/polygames/build'
make[1]: *** [CMakeFiles/Makefile2:493: src/tube/CMakeFiles/_tube.dir/all] Error 2
make[1]: Leaving directory '/data/jw/polygames/build'
make: *** [Makefile:114: all] Error 2
[jw@cn06 build]$
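
The compile command above passes several -I directories, including /data/jw/torch/install/include, yet the compiler cannot find torch/extension.h. One way to check which include directory, if any, actually provides a header is to scan the -I flags of the failing command. The sketch below is a hypothetical helper (the names include_dirs and find_header are introduced here and are not part of Polygames or PyTorch); it only inspects the filesystem.

```python
import shlex
from pathlib import Path
from typing import List


def include_dirs(compile_command: str) -> List[str]:
    """Return the directories passed with -I in a compiler command line."""
    dirs = []
    tokens = shlex.split(compile_command)
    for i, tok in enumerate(tokens):
        if tok == "-I" and i + 1 < len(tokens):
            dirs.append(tokens[i + 1])  # form: -I dir
        elif tok.startswith("-I") and len(tok) > 2:
            dirs.append(tok[2:])        # form: -Idir
    return dirs


def find_header(compile_command: str, header: str) -> List[str]:
    """List the -I directories that actually contain `header`."""
    return [d for d in include_dirs(compile_command)
            if (Path(d) / header).is_file()]
```

Calling find_header with the full c++ command line from the log and "torch/extension.h" returns the matching directories. An empty result means none of the listed directories contains the header, which would point at the PyTorch install tree rather than at the Polygames sources.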

Using pytorch 1.5 fails with "undefined symbol: THPVariableClass"

Steps to reproduce

  1. Install exactly as in the non-devfair instructions, except instead of

conda install pytorch=1.1.0 cuda92 -c pytorch

run:

conda install pytorch=1.5 cuda100 -c pytorch

Then build as usual except use:

cmake -DPYTORCH15=ON ..

Then run connect4 test with:

python -m pypolygames train --game_name="Connect4"

Observed Results

Fails with:
$ python -m pypolygames train --game_name="Connect4" --num_games 4
Traceback (most recent call last):
  File "/home/rs/mirror/anaconda3/envs/polygames4/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/rs/mirror/anaconda3/envs/polygames4/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/rs/crypt/rl/polygames3/pypolygames/__main__.py", line 24, in <module>
    from .utils import CommandHistory
  File "/home/rs/crypt/rl/polygames3/pypolygames/utils/__init__.py", line 6, in <module>
    from .checkpoint import Checkpoint, save_checkpoint, load_checkpoint, gen_checkpoints
  File "/home/rs/crypt/rl/polygames3/pypolygames/utils/checkpoint.py", line 16, in <module>
    import tube
ImportError: /home/rs/crypt/rl/polygames3/build/torchRL/tube/tube.cpython-37m-x86_64-linux-gnu.so: undefined symbol: THPVariableClass

Expected Results

Standard training. Pytorch 1.1 does work as expected.

Attempted Fixes:

Tried all variations of the PYTORCH15 and PYTORCH12 flags.
Tried different pytorch-magma100 and pytorch-magma102 versions, as well as cuda92 and cuda100 (there is no cuda102).
Tried commit 71a6, where PyTorch 1.5 support was officially added.

g++ (GCC) 10.1.0

conda list:

Name Version Build Channel

_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 0_gnu conda-forge
blas 2.16 mkl conda-forge
bzip2 1.0.8 h516909a_2 conda-forge
ca-certificates 2020.6.20 hecda079_0 conda-forge
certifi 2020.6.20 py37hc8dfbb8_0 conda-forge
cffi 1.14.0 py37hd463f26_0 conda-forge
chardet 3.0.4 pypi_0 pypi
cmake 3.17.0 h28c56e5_0 conda-forge
cudatoolkit 10.2.89 hfd86e86_1
expat 2.2.9 he1b5a44_2 conda-forge
freetype 2.10.2 he06d7ca_0 conda-forge
idna 2.10 pypi_0 pypi
intel-openmp 2020.1 217
jpeg 9d h516909a_0 conda-forge
jsonpatch 1.26 pypi_0 pypi
jsonpointer 2.0 pypi_0 pypi
krb5 1.17.1 hfafb76e_1 conda-forge
lcms2 2.11 hbd6801e_0 conda-forge
ld_impl_linux-64 2.34 h53a641e_7 conda-forge
libblas 3.8.0 16_mkl conda-forge
libcblas 3.8.0 16_mkl conda-forge
libcurl 7.71.1 hcdd3856_2 conda-forge
libedit 3.1.20191231 h46ee950_1 conda-forge
libffi 3.2.1 he1b5a44_1007 conda-forge
libgcc-ng 9.2.0 h24d8f2e_2 conda-forge
libgfortran-ng 7.5.0 hdf63c60_6 conda-forge
libgomp 9.2.0 h24d8f2e_2 conda-forge
liblapack 3.8.0 16_mkl conda-forge
liblapacke 3.8.0 16_mkl conda-forge
libopenblas 0.3.10 pthreads_hb3c22a3_2 conda-forge
libpng 1.6.37 hed695b0_1 conda-forge
libprotobuf 3.12.3 h8b12597_1 conda-forge
libssh2 1.9.0 hab1572f_4 conda-forge
libstdcxx-ng 9.2.0 hdf63c60_2 conda-forge
libtiff 4.1.0 hc7e4089_6 conda-forge
libuv 1.38.0 h516909a_0 conda-forge
libwebp-base 1.1.0 h516909a_3 conda-forge
lz4-c 1.9.2 he1b5a44_1 conda-forge
magma-cuda100 2.5.2 1 pytorch
magma-cuda102 2.5.2 1 pytorch
mkl 2020.1 217
mkl-include 2020.1 219 conda-forge
ncurses 6.2 he1b5a44_1 conda-forge
ninja 1.10.0 hc9558a2_0 conda-forge
numpy 1.19.0 py37h8960a57_0 conda-forge
olefile 0.46 py_0 conda-forge
openssl 1.1.1g h516909a_0 conda-forge
pillow 7.2.0 py37h718be6c_1 conda-forge
pip 20.1.1 py_1 conda-forge
protobuf 3.12.3 py37h3340039_0 conda-forge
pycparser 2.20 pyh9f0ad1d_2 conda-forge
python 3.7.6 cpython_h8356626_6 conda-forge
python_abi 3.7 1_cp37m conda-forge
pytorch 1.5.1 py3.7_cuda10.2.89_cudnn7.6.5_0 pytorch
pyyaml 5.3.1 py37h8f50634_0 conda-forge
pyzmq 19.0.1 pypi_0 pypi
readline 8.0 he28a2e2_2 conda-forge
requests 2.24.0 pypi_0 pypi
rhash 1.3.6 h14c3975_1001 conda-forge
scipy 1.5.1 pypi_0 pypi
setuptools 49.2.0 py37hc8dfbb8_0 conda-forge
six 1.15.0 pyh9f0ad1d_0 conda-forge
sqlite 3.32.3 hcee41ef_1 conda-forge
tensorboardx 2.1 py_0 conda-forge
tk 8.6.10 hed695b0_0 conda-forge
torchfile 0.1.0 pypi_0 pypi
torchvision 0.6.1 py37_cu102 pytorch
tornado 6.0.4 pypi_0 pypi
typing 3.7.4.3 py37hc8dfbb8_0 conda-forge
urllib3 1.25.9 pypi_0 pypi
visdom 0.1.8.9 pypi_0 pypi
websocket-client 0.57.0 pypi_0 pypi
wheel 0.34.2 py_1 conda-forge
xz 5.2.5 h516909a_1 conda-forge
yaml 0.2.5 h516909a_0 conda-forge
zlib 1.2.11 h516909a_1006 conda-forge
zstd 1.4.5 h6597ccf_1 conda-forge

I'd much prefer to work with an up-to-date PyTorch, but I can find only one other reference to this error, pytorch/extension-cpp#6, which is old and doesn't seem to provide a relevant solution. Thanks!
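
The conda list output above shows pytorch 1.5.1 with a CUDA 10.2 build string (py3.7_cuda10.2.89_cudnn7.6.5_0) next to both magma-cuda100 and magma-cuda102, while the install step names cuda100. Whether that is related to the undefined symbol is not established here, but a small parser can make this kind of mix visible. The sketch below assumes whitespace-separated rows as pasted above; parse_conda_list and cuda_versions are hypothetical helper names, and the regular expression encodes an assumption about package naming, not a conda API.

```python
import re
from typing import Dict, List, NamedTuple, Optional


class Package(NamedTuple):
    name: str
    version: str
    build: str
    channel: Optional[str]  # some rows in the listing have no channel column


def parse_conda_list(text: str) -> List[Package]:
    """Parse whitespace-separated `conda list` rows, skipping headers and blanks."""
    packages = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[0].startswith("#") or fields[0] == "Name":
            continue
        channel = fields[3] if len(fields) > 3 else None
        packages.append(Package(fields[0], fields[1], fields[2], channel))
    return packages


def cuda_versions(packages: List[Package]) -> Dict[str, str]:
    """Map package name to the CUDA version hinted at by its version, build or name."""
    pattern = re.compile(r"cuda-?(\d+(?:\.\d+)*)")
    found = {}
    for pkg in packages:
        if pkg.name == "cudatoolkit":
            found[pkg.name] = pkg.version
            continue
        # e.g. build "py3.7_cuda10.2.89_cudnn7.6.5_0" or name "magma-cuda100"
        match = pattern.search(pkg.build) or pattern.search(pkg.name)
        if match:
            found[pkg.name] = match.group(1)
    return found
```

Feeding the pasted listing to parse_conda_list and then to cuda_versions gives one CUDA hint per relevant package, so differing values stand out at a glance.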

Hardware requirements

Hey there! I've read through the Polygames paper and browsed round the repo, but I can't find anything on the hardware requirements to successfully train these networks. To ask a concrete question, could you tell me how many GPU-hours this year's 13x13 Hex champion consumed?

What'd be even better would be the terminal commands and log files for each checkpoint, so I can get a feel for how much compute leads to how much performance, and what the best hyperparameters are. Are those readily available anywhere?

copyModelStateDict: Unknown state dict entry 'pi3.weight' -- could not find parameter/buffer 'weight'

Steps to reproduce

  1. Install PyTorch 1.2.0 and Polygames following the README instructions, adapted for our cluster (https://github.com/facebookincubator/Polygames#first-install-conda-and-pytorch).
  2. Run python -m pypolygames train --game_name Hex11

Observed Results

######################################################################
#                              TRAINING                              #
######################################################################
setting-up pseudo-random generator...
creating and saving the model...
creating a generic model
total #trainable params = 3218050
creating optimizer...
creating training environment...
Game generation devices: ['cuda:0']
is_server is  False
is_client is  False
 -- UPDATE MODEL --
copyModelStateDict: Unknown state dict entry 'pi3.weight' -- could not find parameter/buffer 'weight'
Aborted

Note:

  • The error is the same if I force using the CPU with --device cpu.
  • The error is similar if I change the model (e.g. copyModelStateDict: Unknown state dict entry 'pi_logit.weight' -- could not find parameter/buffer 'weight').
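
The wording of the error suggests that a dotted key such as 'pi3.weight' is resolved by walking to a submodule ('pi3') and then looking up the last component ('weight') among its parameters and buffers. The following is a plain-Python model of that kind of lookup, not the Polygames implementation: nested dicts stand in for modules, and resolve_entry and unknown_entries are hypothetical names. It can help reason about which keys in a state dict have no counterpart on the receiving side.

```python
from typing import Any, Dict, List


def resolve_entry(module: Dict[str, Any], key: str) -> Any:
    """Walk a dotted state-dict key through nested dicts standing in for modules."""
    *path, leaf = key.split(".")
    current = module
    for name in path:
        child = current.get("children", {}).get(name)
        if child is None:
            # Hypothetical message, modelled loosely on the one in the log.
            raise KeyError(f"'{key}': could not find submodule '{name}'")
        current = child
    tensors = {**current.get("parameters", {}), **current.get("buffers", {})}
    if leaf not in tensors:
        raise KeyError(f"'{key}': could not find parameter/buffer '{leaf}'")
    return tensors[leaf]


def unknown_entries(module: Dict[str, Any], keys: List[str]) -> List[str]:
    """Return the keys that cannot be resolved against the module tree."""
    missing = []
    for key in keys:
        try:
            resolve_entry(module, key)
        except KeyError:
            missing.append(key)
    return missing
```

In this model, a key fails at the last step when the submodule exists but exposes no entry with that name, which matches the shape of the message above.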

Running test_state does not fail gracefully if Ludii.jar is absent

Steps to reproduce

  1. Build Polygames following the instructions from the readme.
  2. Run ./build/test_state

Observed Results

The test fails when it reaches the Ludii integration tests (unless the user has also manually placed Ludii.jar in the expected location, ludii/Ludii.jar).

Expected Results

The test should probably skip testing the Ludii integration if the Ludii.jar file is missing (but test it if it's not missing!).
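
The requested behaviour amounts to a file-existence guard in front of the Ludii tests. test_state itself is a C++ binary, so the Python below is only a sketch of the logic, assuming the jar is expected at ludii/Ludii.jar relative to the repository root as stated above; ludii_jar_present and run_ludii_tests are hypothetical names.

```python
from pathlib import Path
from typing import Callable, Union


def ludii_jar_present(repo_root: Union[str, Path] = ".") -> bool:
    """True if ludii/Ludii.jar exists relative to the repository root."""
    return (Path(repo_root) / "ludii" / "Ludii.jar").is_file()


def run_ludii_tests(repo_root: Union[str, Path], tests: Callable[[], None]) -> str:
    """Run `tests` only when the jar is present; otherwise report a skip."""
    if not ludii_jar_present(repo_root):
        return "skipped: ludii/Ludii.jar not found"
    tests()
    return "ran"
```

Reporting the skip explicitly, instead of silently passing, keeps it visible that the Ludii integration was not exercised.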

Won't CMake with VS

Steps to reproduce

  1. Don't have Visual Studio (or any C compiler) installed
  2. Have Anaconda
  3. Follow steps for "compilation without modules"

Observed Results

Severity Code Description Project File Line Suppression State
Error CMake Error at .../Microsoft Visual Studio/2019/Community/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.17/Modules/CMakeTestCCompiler.cmake:60 (message):
The C compiler

".../Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.26.28801/bin/HostX64/x64/cl.exe"

is not able to compile a simple test program.

It fails with the following output:

Change Dir: .../Polygames/out/build/x64-Debug/CMakeFiles/CMakeTmp

Run Build Command(s):.../Microsoft Visual Studio/2019/Community/Common7/IDE/CommonExtensions/Microsoft/CMake/Ninja/ninja.exe cmTC_c6e64 && [1/2] Building C object CMakeFiles\cmTC_c6e64.dir\testCCompiler.c.obj
[2/2] Linking C executable cmTC_c6e64.exe
FAILED: cmTC_c6e64.exe 
cmd.exe /C "cd . && "...\Microsoft Visual Studio\2019\Community\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin\cmake.exe" -E vs_link_exe --intdir=CMakeFiles\cmTC_c6e64.dir --rc=rc --mt=CMAKE_MT-NOTFOUND --manifests  -- "...\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.26.28801\bin\Hostx64\x64\link.exe" /nologo CMakeFiles\cmTC_c6e64.dir\testCCompiler.c.obj  /out:cmTC_c6e64.exe /implib:cmTC_c6e64.lib /pdb:cmTC_c6e64.pdb /version:0.0  /machine:x64  /debug /INCREMENTAL /subsystem:console  kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib && cd ."
RC Pass 1: command "rc /fo CMakeFiles\cmTC_c6e64.dir/manifest.res CMakeFiles\cmTC_c6e64.dir/manifest.rc" failed (exit code 0) with the following output:
The system cannot find the file specified
ninja: build stopped: subcommand failed.

CMake will not be able to correctly generate this project. .../Microsoft Visual Studio/2019/Community/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.17/Modules/CMakeTestCCompiler.cmake 60

Expected Results

  • What did you expect to happen?
    I expected it to build so I could then make it and then run it.

Relevant Code

git clone ... 
cd polygames
mkdir build
cd build
cmake .. [gcc>=7]
make
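
In the log above, the link step is invoked with --rc=rc and --mt=CMAKE_MT-NOTFOUND, and the rc call fails with "The system cannot find the file specified". That suggests the resource compiler and manifest tool were not found on PATH. The sketch below checks which of these tools resolve from the current shell; locate_tools and missing_tools are hypothetical helper names, and the tool list is taken from the commands visible in the log.

```python
import shutil
from typing import Callable, Dict, Iterable, List, Optional

# Tools that appear in the failing step above: compiler, linker,
# resource compiler (rc) and manifest tool (mt).
DEFAULT_TOOLS = ("cl", "link", "rc", "mt")


def locate_tools(
    tools: Iterable[str] = DEFAULT_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, Optional[str]]:
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: which(tool) for tool in tools}


def missing_tools(found: Dict[str, Optional[str]]) -> List[str]:
    """Return the sorted names of tools that could not be resolved."""
    return sorted(name for name, path in found.items() if path is None)
```

Running missing_tools(locate_tools()) from the same shell that invokes CMake shows what is unavailable. On Windows, rc.exe and mt.exe are typically provided by the Windows SDK, so a result containing "rc" or "mt" would point at a missing or unconfigured SDK rather than at Polygames.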

Runtime error when initializing JVM

Steps to reproduce

  1. Build Polygames (branch ludiitest) and install Ludii.jar
  2. Run ./build/test_state

Observed Results

...
testing: Ludii Tic-Tac-Toe
intializing JVM
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f098403846d, pid=22505, tid=0x00007f097b39da00
#
# JRE version: OpenJDK Runtime Environment (8.0_222) (build 1.8.0_222-ga)
# Java VM: OpenJDK 64-Bit Server VM (25.222-bga mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V  [libjvm.so+0x63f46d]  get_method_id(JNIEnv_*, _jclass*, char const*, char const*, bool, Thread*) [clone .constprop.116]+0x7d
#
# Core dump written. Default location: /home/jd/depots/github/facebookincubator/Polygames/core or core.22505
#
# An error report file with more information is saved as:
# /home/jd/depots/github/facebookincubator/Polygames/hs_err_pid22505.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#
Aborted (core dumped)

Call stack (in hs_err_pid22505.log):

Stack: [0x00007fff5873e000,0x00007fff5883e000],  sp=0x00007fff5882beb0,  free space=951k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x63f46d]  get_method_id(JNIEnv_*, _jclass*, char const*, char const*, bool, Thread*) [clone .constprop.116]+0x7d
V  [libjvm.so+0x63f81a]  jni_GetMethodID+0x7a
C  [test_state+0xbf309]  Ludii::LudiiGameWrapper::LudiiGameWrapper(JNIEnv_*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)+0x59
C  0x0000000002b381e0

Relevant Code

It seems that the error occurs here (games/ludii/ludii_game_wrapper.cc:27):

  jmethodID ludiiGameWrapperConstructor =
      jenv->GetMethodID(ludiiGameWrapperClass, "<init>", "(Ljava/lang/String;)V");
