
symbolicmathematics's Introduction

Deep Learning for Symbolic Mathematics

PyTorch original implementation of Deep Learning for Symbolic Mathematics (ICLR 2020).

This repository contains code for:

  • Data generation
    • Functions F with their derivatives f
    • Functions f with their primitives F
      • Forward (FWD)
      • Backward (BWD)
      • Integration by parts (IBP)
    • Ordinary differential equations with their solutions
      • First order (ODE1)
      • Second order (ODE2)
  • Training
    • Half-precision (float16)
    • Multi-GPU
    • Multi-node
  • Evaluation
    • Greedy decoding
    • Beam search evaluation

We also provide:

  • Datasets
    • Train / Valid / Test sets for all tasks considered in the paper
  • Trained models
    • Models trained with different configurations of training data
  • Notebook
    • An ipython notebook with an interactive demo of the model on function integration

Dependencies

  • Python 3
  • NumPy
  • SymPy (to generate datasets and to verify model outputs)
  • PyTorch
  • Apex (optional, for float16 training)

Datasets and Trained Models

We provide datasets for each task considered in the paper:

| Dataset                        | #train | Link |
|--------------------------------|--------|------|
| Integration (FWD)              | 45M    | Link |
| Integration (BWD)              | 88M    | Link |
| Integration (IBP)              | 23M    | Link |
| Differential equations (ODE1)  | 65M    | Link |
| Differential equations (ODE2)  | 32M    | Link |

We also provide models trained on the above datasets, for integration:

| Model training data | Accuracy (FWD) | Accuracy (BWD) | Accuracy (IBP) | Link |
|---------------------|----------------|----------------|----------------|------|
| FWD                 | 97.2%          | 16.1%          | 89.2%          | Link |
| BWD                 | 31.6%          | 99.6%          | 60.0%          | Link |
| IBP                 | 55.3%          | 85.5%          | 99.3%          | Link |
| FWD + BWD           | 96.8%          | 99.6%          | 86.1%          | Link |
| BWD + IBP           | 56.7%          | 99.5%          | 98.7%          | Link |
| FWD + BWD + IBP     | 95.6%          | 99.5%          | 99.6%          | Link |

and for differential equations:

| Model training data | Accuracy (ODE1) | Accuracy (ODE2) | Link |
|---------------------|-----------------|-----------------|------|
| ODE1                | 97.2%           | -               | Link |
| ODE2                | -               | 88.2%           | Link |

All accuracies above are given using a beam search of size 10. Note that these datasets and models slightly differ from the ones used in the paper.

Data generation

If you want to use your own dataset / generator, it is possible to train a model by generating data on the fly. However, the generation process can take a while, so we recommend first generating data and exporting it into a dataset that can be used for training. This can easily be done by setting --export_data true:

python main.py --export_data true

## main parameters
--batch_size 32
--cpu true
--exp_name prim_bwd_data
--num_workers 20               # number of processes
--tasks prim_bwd               # task (prim_fwd, prim_bwd, prim_ibp, ode1, ode2)
--env_base_seed -1             # generator seed (-1 for random seed)

## generator configuration
--n_variables 1                # number of variables (x, y, z)
--n_coefficients 0             # number of coefficients (a_0, a_1, a_2, ...)
--leaf_probs "0.75,0,0.25,0"   # leaf sampling probabilities
--max_ops 15                   # maximum number of operators (at generation; expressions can become much longer after differentiation)
--max_int 5                    # max value of sampled integers
--positive true                # sign of sampled integers
--max_len 512                  # maximum length of generated equations

## considered operators, with (unnormalized) sampling probabilities
--operators "add:10,sub:3,mul:10,div:5,sqrt:4,pow2:4,pow3:2,pow4:1,pow5:1,ln:4,exp:4,sin:4,cos:4,tan:4,asin:1,acos:1,atan:1,sinh:1,cosh:1,tanh:1,asinh:1,acosh:1,atanh:1"

## other generation parameters can be found in `main.py` and `src/envs/char_sp.py`
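
The --operators weights above are unnormalized sampling probabilities. As a minimal sketch of how such weights translate into a sampling distribution (an illustration only, not the repository's code, which lives in src/envs/char_sp.py; the shortened operator string is just for brevity):

import numpy as np

operators = "add:10,sub:3,mul:10,div:5,ln:4,exp:4,sin:4,cos:4"
pairs = [token.split(":") for token in operators.split(",")]
names = [name for name, _ in pairs]
weights = np.array([float(w) for _, w in pairs])
probs = weights / weights.sum()  # normalize the unnormalized weights
print(np.random.choice(names, size=5, p=probs))  # draw 5 operators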

Data will be exported in the prefix and infix formats to:

  • ./dumped/prim_bwd_data/EXP_ID/data.prefix
  • ./dumped/prim_bwd_data/EXP_ID/data.infix

data.prefix and data.infix are two parallel files containing the same number of lines, with the same equations written in prefix and infix representations respectively. In these files, each line contains an input (e.g. the function to integrate) and the associated output (e.g. an integral), separated by a tab. In practice, the model only operates on prefix data. The infix data is optional but more human-readable, and can be used for debugging purposes.
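
Each line can therefore be parsed with a simple tab split. A minimal sketch for inspecting the exports, assuming the default paths above:

# Pair each prefix equation with its infix counterpart for inspection.
with open("data.prefix") as f_prefix, open("data.infix") as f_infix:
    for prefix_line, infix_line in zip(f_prefix, f_infix):
        src_prefix, tgt_prefix = prefix_line.rstrip("\n").split("\t")
        src_infix, tgt_infix = infix_line.rstrip("\n").split("\t")
        print(f"{src_infix}  ->  {tgt_infix}")  # human-readable check
        break  # only look at the first pair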

Note that some generators are very fast, such as prim_bwd, which only requires generating a random function and differentiating it. The others are significantly slower. For instance, the validity of differential equations is checked (symbolically and numerically) after generation, which can be expensive. In our case, we generated the data across a large number of CPUs to create a large training set. For reproducibility, we provide our training / validation / test datasets in the links above. Generators can be made faster by decreasing the generation timeout in char_sp.py, but this may slightly reduce the set of equations that the generator can produce.

If you generate your own dataset, you will notice that the generator generates a lot of duplicates (which is inevitable if you parallelize the generation). In practice, we remove duplicates using:

cat ./dumped/prim_bwd_data/*/data.prefix \
| awk 'BEGIN{PROCINFO["sorted_in"]="@val_num_desc"}{c[$0]++}END{for (i in c) printf("%i|%s\n",c[i],i)}' \
> data.prefix.counts
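
If awk is not available, a Python equivalent of this deduplication step (a minimal sketch) is:

from collections import Counter
import glob

# Count identical lines across all exported data.prefix files, then
# write them sorted by decreasing count, in the "count|line" format.
counts = Counter()
for path in glob.glob("./dumped/prim_bwd_data/*/data.prefix"):
    with open(path) as f:
        counts.update(line.rstrip("\n") for line in f)

with open("data.prefix.counts", "w") as out:
    for line, c in counts.most_common():
        out.write(f"{c}|{line}\n")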

The resulting format is the following:

count1|input1_prefix    output1_prefix
count2|input2_prefix    output2_prefix
...

where the input and output are separated by a tab, and equations are sorted by decreasing count. This is the format in which the data has to be given to the model. The counts are not used by the model, but were kept in case of potential curriculum learning. The last step consists in simply splitting the dataset into training / validation / test sets. This can be done with the split_data.py script:

# create a valid and a test set of 10k equations
python split_data.py data.prefix.counts 10000

# remove valid inputs that are in the train
mv data.prefix.counts.valid data.prefix.counts.valid.old
awk -F"[|\t]" 'NR==FNR { lines[$2]=1; next } !($2 in lines)' <(cat data.prefix.counts.train) data.prefix.counts.valid.old \
> data.prefix.counts.valid

# remove test inputs that are in the train
mv data.prefix.counts.test data.prefix.counts.test.old
awk -F"[|\t]" 'NR==FNR { lines[$2]=1; next } !($2 in lines)' <(cat data.prefix.counts.train) data.prefix.counts.test.old \
> data.prefix.counts.test

Training

To train a model, you first need data. You can either generate it using the scripts above, or download the data provided in this repository. For instance:

wget https://dl.fbaipublicfiles.com/SymbolicMathematics/data/prim_fwd.tar.gz
tar -xvf prim_fwd.tar.gz

Once you have a training / validation / test set, you can train using the following command:

python main.py

## main parameters
--exp_name first_train  # experiment name
--fp16 true --amp 2     # float16 training

## dataset location
--tasks "prim_fwd"                                                    # task
--reload_data "prim_fwd,prim_fwd.train,prim_fwd.valid,prim_fwd.test"  # data location
--reload_size 40000000                                                # training set size

## model parameters
--emb_dim 1024    # model dimension
--n_enc_layers 6  # encoder layers
--n_dec_layers 6  # decoder layers
--n_heads 8       # number of heads

## training parameters
--optimizer "adam,lr=0.0001"             # model optimizer
--batch_size 32                          # batch size
--epoch_size 300000                      # epoch size (number of equations per epoch)
--validation_metrics valid_prim_fwd_acc  # validation metric (when to save the model)

Additional training parameters can be found in main.py.

Evaluation

During training, the accuracy on the validation set is measured at the end of each epoch. However, at that point we only compare the model's output with the solution in the dataset, although the solution may not be unique. For instance, if the input is:

y'' + y = 0

and the expected solution in the dataset is a_0 * cos(x) + a_1 * sin(x), then if the model generates a_0 * sin(x) + a_1 * cos(x), the output will be considered invalid because it does not exactly match the one in the dataset. To properly verify the model output, we plug it into the input equation and check that it is a valid solution. However, verifying the model output this way can take a lot of time, so we only do it at the end of training, by setting --beam_eval true and using the following command:

python main.py

## main parameters
--exp_name first_eval     # experiment name
--eval_only true          # evaluation mode (do not load the training set)
--reload_model "fwd.pth"  # model to reload and evaluate

## dataset location
--tasks "prim_fwd"                                                    # task
--reload_data "prim_fwd,prim_fwd.train,prim_fwd.valid,prim_fwd.test"  # data location

--emb_dim 1024    # model dimension
--n_enc_layers 6  # encoder layers
--n_dec_layers 6  # decoder layers
--n_heads 8       # number of heads

## evaluation parameters
--beam_eval true            # beam evaluation (with false, outputs are only compared with dataset solutions)
--beam_size 10              # beam size
--beam_length_penalty 1.0   # beam length penalty (1.0 corresponds to average of log-probs)
--beam_early_stopping 1     # beam early stopping
--eval_verbose 1            # export beam results (set to 2 to evaluate with beam even when greedy was successful)
--eval_verbose_print false  # print detailed evaluation results

Evaluation with beam search can take some time, so we recommend using beams that are not too large (10 is a good value).
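
As described above, beam hypotheses are verified by plugging them back into the input equation rather than by exact string comparison. A minimal SymPy sketch of this kind of check, independent of the repository code:

import sympy as sp

x = sp.Symbol("x")
a0, a1 = sp.symbols("a0 a1")

# Two syntactically different but equally valid solutions of y'' + y = 0.
for y in (a0 * sp.cos(x) + a1 * sp.sin(x),
          a0 * sp.sin(x) + a1 * sp.cos(x)):
    residual = sp.simplify(sp.diff(y, x, 2) + y)  # plug y into y'' + y
    print(y, "->", residual)  # both residuals simplify to 0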

Frequently Asked Questions

How can I run experiments on multiple GPUs?

This code supports both multi-GPU and multi-node training, and was tested with up to 128 GPUs. To run an experiment with multiple GPUs on a single machine, simply replace python main.py in the commands above with:

export NGPU=8; python -m torch.distributed.launch --nproc_per_node=$NGPU main.py

Multi-node training is automatically handled by SLURM.

How can I use this code to train a model on a new task?

In src/envs/char_sp.py, inside the environment class CharSPEnvironment, you will find the functions gen_prim_fwd, gen_prim_bwd, gen_prim_ibp, gen_ode1, and gen_ode2, responsible for generating the 5 tasks we considered. If you want to try a new task, you just need to add a new function gen_NEW_TASK to the environment class.

For all the tasks we considered, the input is composed of an equation with a function y, which is the function to find. This procedure is compatible with both integration and differential equations. For instance, in the case of integration, the input will be of the form y' - F, where F is the function to integrate. In the case of differentiation, the input will be of the form y - F', where F is the function to differentiate. If the differential equation is y'' + y = 0, the input will simply be y'' + y. At test time, the y function in the input is replaced by the output of the model, which is considered valid if the resulting expression evaluates to 0 (see the sketch below). Depending on the task you consider, you may need to update the evaluator in evaluator.py accordingly.
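
For integration, for instance, this validity check amounts to something like the following SymPy sketch (an illustration only, not the repository's evaluator; F and the candidate output below are hypothetical):

import sympy as sp

x = sp.Symbol("x")
F = x * sp.cos(x)                      # function to integrate (hypothetical example)
candidate = x * sp.sin(x) + sp.cos(x)  # hypothetical model output for y

# The task input is y' - F; substituting the model output for y,
# the expression must simplify to 0 for the output to be valid.
residual = sp.simplify(sp.diff(candidate, x) - F)
print(residual)  # prints 0, so the candidate is a valid primitive of F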

References

Deep Learning for Symbolic Mathematics (ICLR 2020) - Guillaume Lample and François Charton

@article{lample2019deep,
  title={Deep learning for symbolic mathematics},
  author={Lample, Guillaume and Charton, Fran{\c{c}}ois},
  journal={arXiv preprint arXiv:1912.01412},
  year={2019}
}

License

See the LICENSE file for more details.


symbolicmathematics's Issues

Why does the predict result copy the dataset's answer?

I'm trying to evaluate the model, and I use beam evaluation.
I found that your answer in the third line is copied from the second line, which is the answer in the dataset.
Your code which copies the answer is here:
beam_log = {}
for i in range(len(len1)):
    src = idx_to_sp(env, x1[1:len1[i] - 1, i].tolist())
    tgt = idx_to_sp(env, x2[1:len2[i] - 1, i].tolist())
    if valid[i]:
        beam_log[i] = {'src': src, 'tgt': tgt, 'hyps': [(tgt, None, True)]}
The 'hyps' entries should be the predictions of the model, but you use 'tgt' as the output.
Why not give the actual prediction in the program? And can you provide code that outputs the model's answer?

ODE's computation

Hello,

First of all, thanks for your work and for sharing it!

I was wondering if there is a notebook about how to use the ODE prediction instead of the integral prediction, and if not, where can I find instructions on how to do so?

Thanks in advance :)

Best,

Arthur SERRES
student in applied mathematics.

Docker environment

We need a docker environment to quickly replicate the results. I would be interested in making one.

Symbolic solving

Hi everyone !

I played a bit with the solver, but there is something I tried without success: of course we can solve simple ODEs like f' + 12 = 0, but can we solve f' + c = 0 for c a fixed arbitrary real constant? And in the latter case, how to do so?

Many thanks in advance to those of you who spend some time on my question :)

DataLoader worker exited unexpectedly

Hello! Thank you so much for putting up this wonderful resource! I am a student trying to play with this code, and when I run python3 main.py for training, I end up getting an error about a DataLoader worker exiting unexpectedly (attached image). Do you know what causes this error? Even when I restart the computer and try again, I eventually run into the same error.

Note: I don't think the DataLoader exit problem is related to the ImportError exception, because there were many such ImportError exceptions on previous training equations with no DataLoader exit.

Many thanks, and have a good day! :)


I am afraid I have trouble understanding the training data and I cannot find any explanation in the README or paper.

For example, what does a line like this mean:
2|sub Y' add INT+ 4 mul INT+ 2 pow add x sqrt INT+ 3 INT- 1 add mul INT+ 2 ln add x sqrt INT+ 3 mul INT+ 4 x
What is the "sub Y'"? What is the operator "INT-" or "INT+"? Is it just integer addition/subtraction, and why do you differentiate between that and sub/add? Also, where do I get the training pairs? There is only one expression per line, not two.

Pickle Loading Problems (EOFError: Ran out of input)

Hi, I'm currently playing around with the symbolic mathematics code and I keep running into this error during data generation.

Running the following code (the example generation code provided):
python main.py --export_data true --batch_size 32 --cpu true --exp_name prim_bwd_data --num_workers 20 --tasks prim_bwd --env_base_seed -1 --n_variables 1 --n_coefficients 0 --leaf_probs "0.75,0,0.25,0" --max_ops 15 --max_int 5 --positive true --max_len 512 --operators "add:10,sub:3,mul:10,div:5,sqrt:4,pow2:4,pow3:2,pow4:1,pow5:1,ln:4,exp:4,sin:4,cos:4,tan:4,asin:1,acos:1,atan:1,sinh:1,cosh:1,tanh:1,asinh:1,acosh:1,atanh:1"

This yields an error message telling me that './dumped/prim_bwd_data\48t7888vh8\params.pkl' doesn't exist. I can't manually create the file, since that 10-character string constantly changes, so could someone explain why the code isn't correctly creating a directory for it?

But that's not a blocker, since I can set the dump path explicitly, so the run command looks like this (all I did was add a --dump_path): python main.py --export_data true --dump_path C:\Users\Toby\PycharmProjects\PDE\venv\dumped --batch_size 32 --cpu true --exp_name prim_bwd_data --num_workers 20 --tasks prim_bwd --env_base_seed -1 --n_variables 1 --n_coefficients 0 --leaf_probs "0.75,0,0.25,0" --max_ops 15 --max_int 5 --positive true --max_len 512 --operators "add:10,sub:3,mul:10,div:5,sqrt:4,pow2:4,pow3:2,pow4:1,pow5:1,ln:4,exp:4,sin:4,cos:4,tan:4,asin:1,acos:1,atan:1,sinh:1,cosh:1,tanh:1,asinh:1,acosh:1,atanh:1"

Doing this yields the following error message:
SLURM job: False
0 - Number of nodes: 1
0 - Node ID : 0
0 - Local rank : 0
0 - Global rank : 0
0 - World size : 1
0 - GPUs per node : 1
0 - Master : True
0 - Multi-node : False
0 - Multi-GPU : False
0 - Hostname : ChanPC
A subdirectory or file -p already exists.
Error occurred while processing: -p.
INFO - 06/25/20 14:02:53 - 0:00:00 - ============ Initialized logger ============
INFO - 06/25/20 14:02:53 - 0:00:00 - accumulate_gradients: 1
amp: -1
attention_dropout: 0
balanced: False
batch_size: 32
beam_early_stopping: True
beam_eval: False
beam_length_penalty: 1
beam_size: 1
clean_prefix_expr: True
clip_grad_norm: 5
command: python main.py --export_data true --dump_path 'C:\Users\Chan\PycharmProjects\PDE\venv\dumped' --batch_size 32 --cpu true --exp_name prim_bwd_data --num_workers 20 --tasks prim_bwd --env_base_seed '-1' --n_variables 1 --n_coefficients 0 --leaf_probs '0.75,0,0.25,0' --max_ops 15 --max_int 5 --positive true --max_len 512 --operators 'add:10,sub:3,mul:10,div:5,sqrt:4,pow2:4,pow3:2,pow4:1,pow5:1,ln:4,exp:4,sin:4,cos:4,tan:4,asin:1,acos:1,atan:1,sinh:1,cosh:1,tanh:1,asinh:1,acosh:1,atanh:1' --exp_id "9epgwherdq"
cpu: True
debug: False
debug_slurm: False
dropout: 0
dump_path: C:\Users\Chan\PycharmProjects\PDE\venv\dumped\prim_bwd_data\9epgwherdq
emb_dim: 256
env_base_seed: -1
env_name: char_sp
epoch_size: 300000
eval_only: False
eval_verbose: 0
eval_verbose_print: False
exp_id: 9epgwherdq
exp_name: prim_bwd_data
export_data: True
fp16: False
global_rank: 0
int_base: 10
is_master: True
is_slurm_job: False
leaf_probs: 0.75,0,0.25,0
local_rank: 0
master_port: -1
max_epoch: 100000
max_int: 5
max_len: 512
max_ops: 15
max_ops_G: 4
multi_gpu: False
multi_node: False
n_coefficients: 0
n_dec_layers: 4
n_enc_layers: 4
n_gpu_per_node: 1
n_heads: 4
n_nodes: 1
n_variables: 1
node_id: 0
num_workers: 20
operators: add:10,sub:3,mul:10,div:5,sqrt:4,pow2:4,pow3:2,pow4:1,pow5:1,ln:4,exp:4,sin:4,cos:4,tan:4,asin:1,acos:1,atan:1,sinh:1,cosh:1,tanh:1,asinh:1,acosh:1,atanh:1
optimizer: adam,lr=0.0001
positive: True
precision: 10
reload_checkpoint:
reload_data:
reload_model:
reload_size: -1
rewrite_functions:
same_nb_ops_per_batch: False
save_periodic: 0
share_inout_emb: True
sinusoidal_embeddings: False
stopping_criterion:
tasks: prim_bwd
validation_metrics:
world_size: 1
INFO - 06/25/20 14:02:53 - 0:00:00 - The experiment will be stored in C:\Users\Chan\PycharmProjects\PDE\venv\dumped\prim_bwd_data\9epgwherdq

INFO - 06/25/20 14:02:53 - 0:00:00 - Running command: python main.py --export_data true --dump_path 'C:\Users\Chan\PycharmProjects\PDE\venv\dumped' --batch_size 32 --cpu true --exp_name prim_bwd_data --num_workers 20 --tasks prim_bwd --env_base_seed '-1' --n_variables 1 --n_coefficients 0 --leaf_probs '0.75,0,0.25,0' --max_ops 15 --max_int 5 --positive true --max_len 512 --operators 'add:10,sub:3,mul:10,div:5,sqrt:4,pow2:4,pow3:2,pow4:1,pow5:1,ln:4,exp:4,sin:4,cos:4,tan:4,asin:1,acos:1,atan:1,sinh:1,cosh:1,tanh:1,asinh:1,acosh:1,atanh:1'

WARNING - 06/25/20 14:02:53 - 0:00:00 - Signal handler installed.
INFO - 06/25/20 14:02:53 - 0:00:00 - Unary operators: ['acos', 'acosh', 'asin', 'asinh', 'atan', 'atanh', 'cos', 'cosh', 'exp', 'ln', 'pow2', 'pow3', 'pow4', 'pow5', 'sin', 'sinh', 'sqrt', 'tan', 'tanh']
INFO - 06/25/20 14:02:53 - 0:00:00 - Binary operators: ['add', 'div', 'mul', 'sub']
INFO - 06/25/20 14:02:53 - 0:00:00 - words: {'': 0, '': 1, '': 2, '(': 3, ')': 4, '<SPECIAL_5>': 5, '<SPECIAL_6>': 6, '<SPECIAL_7>': 7, '<SPECIAL_8>': 8, '<SPECIAL_9>': 9, 'pi': 10, 'E': 11, 'x': 12, 'y': 13, 'z': 14, 't': 15, 'a0': 16, 'a1': 17, 'a2': 18, 'a3': 19, 'a4': 20, 'a5': 21, 'a6': 22, 'a7': 23, 'a8': 24, 'a9': 25, 'abs': 26, 'acos': 27, 'acosh': 28, 'acot': 29, 'acoth': 30, 'acsc': 31, 'acsch': 32, 'add': 33, 'asec': 34, 'asech': 35, 'asin': 36, 'asinh': 37, 'atan': 38, 'atanh': 39, 'cos': 40, 'cosh': 41, 'cot': 42, 'coth': 43, 'csc': 44, 'csch': 45, 'derivative': 46, 'div': 47, 'exp': 48, 'f': 49, 'g': 50, 'h': 51, 'inv': 52, 'ln': 53, 'mul': 54, 'pow': 55, 'pow2': 56, 'pow3': 57, 'pow4': 58, 'pow5': 59, 'rac': 60, 'sec': 61, 'sech': 62, 'sign': 63, 'sin': 64, 'sinh': 65, 'sqrt': 66, 'sub': 67, 'tan': 68, 'tanh': 69, 'I': 70, 'INT+': 71, 'INT-': 72, 'INT': 73, 'FLOAT': 74, '-': 75, '.': 76, '10^': 77, 'Y': 78, "Y'": 79, "Y''": 80, '0': 81, '1': 82, '2': 83, '3': 84, '4': 85, '5': 86, '6': 87, '7': 88, '8': 89, '9': 90}
INFO - 06/25/20 14:02:53 - 0:00:00 - 6 possible leaves.
INFO - 06/25/20 14:02:53 - 0:00:00 - Checking expressions in [0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1.1, 2.1, 3.1, -0.01, -0.1, -0.3, -0.5, -0.7, -0.9, -1.1, -2.1, -3.1]
INFO - 06/25/20 14:02:53 - 0:00:00 - Training tasks: prim_bwd
INFO - 06/25/20 14:02:53 - 0:00:00 - Number of parameters (encoder): 4231424
INFO - 06/25/20 14:02:53 - 0:00:00 - Number of parameters (decoder): 5286235
INFO - 06/25/20 14:02:53 - 0:00:00 - Found 177 parameters in model.
INFO - 06/25/20 14:02:53 - 0:00:00 - Optimizers: model
INFO - 06/25/20 14:02:53 - 0:00:00 - Data will be stored in prefix in: C:\Users\Chan\PycharmProjects\PDE\venv\dumped\prim_bwd_data\9epgwherdq\data.prefix ...
INFO - 06/25/20 14:02:53 - 0:00:00 - Data will be stored in infix in: C:\Users\Chan\PycharmProjects\PDE\venv\dumped\prim_bwd_data\9epgwherdq\data.infix ...
INFO - 06/25/20 14:02:53 - 0:00:00 - Creating train iterator for prim_bwd ...
Traceback (most recent call last):
  File "main.py", line 225, in <module>
    main(params)
  File "main.py", line 162, in main
    trainer = Trainer(modules, env, params)
  File "C:\Users\Chan\PycharmProjects\PDE\venv\src\trainer.py", line 140, in __init__
    self.dataloader = {
  File "C:\Users\Chan\PycharmProjects\PDE\venv\src\trainer.py", line 141, in <dictcomp>
    task: iter(self.env.create_train_iterator(task, params, self.data_path))
  File "D:\Anaconda\envs\pde\lib\site-packages\torch\utils\data\dataloader.py", line 279, in __iter__
    return _MultiProcessingDataLoaderIter(self)
  File "D:\Anaconda\envs\pde\lib\site-packages\torch\utils\data\dataloader.py", line 719, in __init__
    w.start()
  File "D:\Anaconda\envs\pde\lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
  File "D:\Anaconda\envs\pde\lib\multiprocessing\context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "D:\Anaconda\envs\pde\lib\multiprocessing\context.py", line 326, in _Popen
    return Popen(process_obj)
  File "D:\Anaconda\envs\pde\lib\multiprocessing\popen_spawn_win32.py", line 93, in __init__
    reduction.dump(process_obj, to_child)
  File "D:\Anaconda\envs\pde\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
_pickle.PicklingError: Can't pickle f: attribute lookup f on __main__ failed

(pde) C:\Users\Chan\PycharmProjects\PDE\venv>Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "D:\Anaconda\envs\pde\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "D:\Anaconda\envs\pde\lib\multiprocessing\spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

train.log

To me, it looks like it's trying to train at the same time or something, which is why it can't find an input (the pickle file doesn't exist yet). In the dump folder, there are the data.infix and data.prefix files, but they're empty.

Have I input the parameters wrong, or am I missing some kind of step? Any help would be much appreciated, as I am relatively new to coding. Thanks so much in advance!

About the attn_mask

Excuse me, could you tell me whether the attn_mask for masked multi-head attention only uses the sequence mask and ignores the padding mask? To my knowledge, a padding mask is necessary to eliminate the effect of padding.

coefficients

Hi everyone !

I need to use coefficients such as a_0, but I have some trouble figuring out how to do so. Of course I changed n_coefficients from 0 to 1, and also the leaf probabilities, but then I don't know how to declare my coefficient, say a_0. Many thanks in advance to those of you who can help me :)

AssertionError

Trying to use this the first time:

$ python main.py --export_data true

SLURM job: False
0 - Number of nodes: 1
0 - Node ID : 0
0 - Local rank : 0
0 - Global rank : 0
0 - World size : 1
0 - GPUs per node : 1
0 - Master : True
0 - Multi-node : False
0 - Multi-GPU : False
0 - Hostname : james-Bonobo-WS
INFO - 04/18/20 15:18:46 - 0:00:00 - ============ Initialized logger ============
INFO - 04/18/20 15:18:46 - 0:00:00 - accumulate_gradients: 1
amp: -1
attention_dropout: 0
balanced: False
batch_size: 32
beam_early_stopping: True
beam_eval: False
beam_length_penalty: 1
beam_size: 1
clean_prefix_expr: True
clip_grad_norm: 5
command: python main.py --export_data true --exp_id "imf0l5hfpl"
cpu: False
debug: False
debug_slurm: False
dropout: 0
dump_path: ./dumped/debug/imf0l5hfpl
emb_dim: 256
env_base_seed: 0
env_name: char_sp
epoch_size: 300000
eval_only: False
eval_verbose: 0
eval_verbose_print: False
exp_id: imf0l5hfpl
exp_name: debug
export_data: True
fp16: False
global_rank: 0
int_base: 10
is_master: True
is_slurm_job: False
leaf_probs: 0.75,0,0.25,0
local_rank: 0
master_port: -1
max_epoch: 100000
max_int: 10000
max_len: 512
max_ops: 10
max_ops_G: 4
multi_gpu: False
multi_node: False
n_coefficients: 0
n_dec_layers: 4
n_enc_layers: 4
n_gpu_per_node: 1
n_heads: 4
n_nodes: 1
n_variables: 1
node_id: 0
num_workers: 10
operators: add:2,sub:1
optimizer: adam,lr=0.0001
positive: False
precision: 10
reload_checkpoint:
reload_data:
reload_model:
reload_size: -1
rewrite_functions:
same_nb_ops_per_batch: False
save_periodic: 0
share_inout_emb: True
sinusoidal_embeddings: False
stopping_criterion:
tasks:
validation_metrics:
world_size: 1
INFO - 04/18/20 15:18:46 - 0:00:00 - The experiment will be stored in ./dumped/debug/imf0l5hfpl

INFO - 04/18/20 15:18:46 - 0:00:00 - Running command: python main.py --export_data true

WARNING - 04/18/20 15:18:46 - 0:00:00 - Signal handler installed.
INFO - 04/18/20 15:18:46 - 0:00:00 - Unary operators: []
INFO - 04/18/20 15:18:46 - 0:00:00 - Binary operators: ['add', 'sub']
INFO - 04/18/20 15:18:46 - 0:00:00 - words: {'': 0, '': 1, '': 2, '(': 3, ')': 4, '<SPECIAL_5>': 5, '<SPECIAL_6>': 6, '<SPECIAL_7>': 7, '<SPECIAL_8>': 8, '<SPECIAL_9>': 9, 'pi': 10, 'E': 11, 'x': 12, 'y': 13, 'z': 14, 't': 15, 'a0': 16, 'a1': 17, 'a2': 18, 'a3': 19, 'a4': 20, 'a5': 21, 'a6': 22, 'a7': 23, 'a8': 24, 'a9': 25, 'abs': 26, 'acos': 27, 'acosh': 28, 'acot': 29, 'acoth': 30, 'acsc': 31, 'acsch': 32, 'add': 33, 'asec': 34, 'asech': 35, 'asin': 36, 'asinh': 37, 'atan': 38, 'atanh': 39, 'cos': 40, 'cosh': 41, 'cot': 42, 'coth': 43, 'csc': 44, 'csch': 45, 'derivative': 46, 'div': 47, 'exp': 48, 'f': 49, 'g': 50, 'h': 51, 'inv': 52, 'ln': 53, 'mul': 54, 'pow': 55, 'pow2': 56, 'pow3': 57, 'pow4': 58, 'pow5': 59, 'rac': 60, 'sec': 61, 'sech': 62, 'sign': 63, 'sin': 64, 'sinh': 65, 'sqrt': 66, 'sub': 67, 'tan': 68, 'tanh': 69, 'I': 70, 'INT+': 71, 'INT-': 72, 'INT': 73, 'FLOAT': 74, '-': 75, '.': 76, '10^': 77, 'Y': 78, "Y'": 79, "Y''": 80, '0': 81, '1': 82, '2': 83, '3': 84, '4': 85, '5': 86, '6': 87, '7': 88, '8': 89, '9': 90}
INFO - 04/18/20 15:18:46 - 0:00:00 - 20001 possible leaves.
INFO - 04/18/20 15:18:46 - 0:00:00 - Checking expressions in [0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1.1, 2.1, 3.1, -0.01, -0.1, -0.3, -0.5, -0.7, -0.9, -1.1, -2.1, -3.1]
Traceback (most recent call last):
  File "main.py", line 232, in <module>
    main(params)
  File "main.py", line 167, in main
    env = build_env(params)
  File "/home/james/Data Science/SymbolicMathematics/src/envs/__init__.py", line 29, in build_env
    assert len(tasks) == len(set(tasks)) > 0
AssertionError

graphics card

I have an intel graphics card.

To verify that your GPU is CUDA-capable, go to your distribution's equivalent of System Properties, or, from the command line, enter:
$ lspci | grep -i nvidia

The result is empty. Do you have suggestions for what parts of the code I can still run? Is running on AWS an option?

Power of variables

Is there any parameter which can restrict the power of the variables being generated for an equation?

HELP ! RuntimeError: CUDA error: device-side assert triggered

I downloaded this repository containing the code, datasets, and trained models, and tried to run the commands in the ipython notebook given by Dr. Lample. But I get a bug that I cannot solve.
The first 10 inputs in the ipython notebook run well, but In [11], Decode with beam search, throws the following error:

_File "", line 109, in
_, _, beam = decoder.generate_beam(encoded, len1, beam_size=beam_size, length_penalty=1.0, early_stopping=1, max_len=200)

File "D:\LampleCharton2019\SymbolicMathematics-master\src\model\transformer.py", line 544, in generate_beam
cache[k] = (cache[k][0][beam_idx], cache[k][1][beam_idx])

RuntimeError: CUDA error: device-side assert triggered_

The environment on my computer is win10, anaconda3, python3.7.5, pytorch (gpu), torch.cuda.is_available() = true, and two Nvidia Quadro P4000 cards; they work well in other programs.

Process get killed

I run this model on the IBP dataset and it gets killed when loading the training data. My GPU is a 2070S and my environment is pytorch 1.3 devel. Does anybody know how to deal with this problem?
My command:
python main.py --exp_name first_train --fp16 true --amp 2 --tasks "prim_ibp" --reload_data "prim_ibp,prim_ibp.train,prim_ibp.valid,prim_ibp.test" --reload_size 40000000 --emb_dim 1024 --n_enc_layers 6 --n_dec_layers 6 --n_heads 8 --optimizer "adam,lr=0.0001" --batch_size 32 --epoch_size 300000 --validation_metrics valid_prim_fwd_acc
Reaction:
INFO - 07/02/20 06:56:43 - 0:00:00 - The experiment will be stored in ./dumped/first_train/dgv5zq039m

INFO - 07/02/20 06:56:43 - 0:00:00 - Running command: python main.py --exp_name first_train --fp16 true --amp 2 --tasks prim_ibp --reload_data 'prim_ibp,prim_ibp.train,prim_ibp.valid,prim_ibp.test' --reload_size '-1' --emb_dim 128 --n_enc_layers 1 --n_dec_layers 1 --n_heads 1 --optimizer 'adam,lr=0.0001' --batch_size 8 --epoch_size 300000 --validation_metrics valid_prim_fwd_acc

WARNING - 07/02/20 06:56:43 - 0:00:00 - Signal handler installed.
INFO - 07/02/20 06:56:43 - 0:00:00 - Unary operators: []
INFO - 07/02/20 06:56:43 - 0:00:00 - Binary operators: ['add', 'sub']
INFO - 07/02/20 06:56:43 - 0:00:00 - words: {'': 0, '': 1, '': 2, '(': 3, ')': 4, '<SPECIAL_5>': 5, '<SPECIAL_6>': 6, '<SPECIAL_7>': 7, '<SPECIAL_8>': 8, '<SPECIAL_9>': 9, 'pi': 10, 'E': 11, 'x': 12, 'y': 13, 'z': 14, 't': 15, 'a0': 16, 'a1': 17, 'a2': 18, 'a3': 19, 'a4': 20, 'a5': 21, 'a6': 22, 'a7': 23, 'a8': 24, 'a9': 25, 'abs': 26, 'acos': 27, 'acosh': 28, 'acot': 29, 'acoth': 30, 'acsc': 31, 'acsch': 32, 'add': 33, 'asec': 34, 'asech': 35, 'asin': 36, 'asinh': 37, 'atan': 38, 'atanh': 39, 'cos': 40, 'cosh': 41, 'cot': 42, 'coth': 43, 'csc': 44, 'csch': 45, 'derivative': 46, 'div': 47, 'exp': 48, 'f': 49, 'g': 50, 'h': 51, 'inv': 52, 'ln': 53, 'mul': 54, 'pow': 55, 'pow2': 56, 'pow3': 57, 'pow4': 58, 'pow5': 59, 'rac': 60, 'sec': 61, 'sech': 62, 'sign': 63, 'sin': 64, 'sinh': 65, 'sqrt': 66, 'sub': 67, 'tan': 68, 'tanh': 69, 'I': 70, 'INT+': 71, 'INT-': 72, 'INT': 73, 'FLOAT': 74, '-': 75, '.': 76, '10^': 77, 'Y': 78, "Y'": 79, "Y''": 80, '0': 81, '1': 82, '2': 83, '3': 84, '4': 85, '5': 86, '6': 87, '7': 88, '8': 89, '9': 90}
INFO - 07/02/20 06:56:43 - 0:00:00 - 20001 possible leaves.
INFO - 07/02/20 06:56:43 - 0:00:00 - Checking expressions in [0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1.1, 2.1, 3.1, -0.01, -0.1, -0.3, -0.5, -0.7, -0.9, -1.1, -2.1, -3.1]
INFO - 07/02/20 06:56:43 - 0:00:00 - Training tasks: prim_ibp
INFO - 07/02/20 06:56:43 - 0:00:00 - Number of parameters (encoder): 734464
INFO - 07/02/20 06:56:43 - 0:00:00 - Number of parameters (decoder): 800859
INFO - 07/02/20 06:56:47 - 0:00:03 - Found 51 parameters in model.
INFO - 07/02/20 06:56:47 - 0:00:03 - Optimizers: model
Selected optimization level O2: FP16 training with FP32 batchnorm and FP32 master weights.

Defaults for this optimization level are:
enabled : True
opt_level : O2
cast_model_type : torch.float16
patch_torch_functions : False
keep_batchnorm_fp32 : True
master_weights : True
loss_scale : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled : True
opt_level : O2
cast_model_type : torch.float16
patch_torch_functions : False
keep_batchnorm_fp32 : True
master_weights : True
loss_scale : dynamic
INFO - 07/02/20 06:56:47 - 0:00:03 - Creating train iterator for prim_ibp ...
INFO - 07/02/20 06:56:47 - 0:00:03 - Loading data from prim_ibp.train ...
Killed

Is it possible to do evaluation on multiple gpus?

Thanks for your great work!

I want to evaluate the pre-trained models with beam size 10 on multiple GPUs, because I cannot run the evaluation on a single GPU due to out-of-memory errors, even though the GPU is a TITAN X (Pascal) with 12GB of memory.

However, when I run the evaluation on multiple GPUs, it throws the following error:

AttributeError: 'DistributedDataParallel' object has no attribute 'generate_beam'

What can I do in this situation?

Thanks!

No such file or directory: './dump\first_eval\5mqaq8pt0a\params.pkl'

  • The command is as follows:
    python main.py --exp_name first_eval --eval_only true --reload_model "./fwd_bwd_ibp.pth" --reload_data "./prim_ibp.test" --beam_eval true --beam_size 10 --emb_dim 1024
    --n_enc_layers 6 --n_dec_layers 6 --n_heads 8 --dump_path ./dump

  • The error record is as follows:
    Traceback (most recent call last):
      File "main.py", line 232, in <module>
        main(params)
      File "main.py", line 156, in main
        logger = initialize_exp(params)
      File "D:\code\python\SymbolicMathematics\src\utils.py", line 57, in initialize_exp
        pickle.dump(params, open(os.path.join(params.dump_path, 'params.pkl'), 'wb'))
    FileNotFoundError: [Errno 2] No such file or directory: './dump\first_eval\5mqaq8pt0a\params.pkl'

multi-gpu

Hello, I want to run your code on a dual-GPU machine, but I encountered the following problems:


Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.


Traceback (most recent call last):
  File "main.py", line 256, in <module>
    check_model_params(params)
  File "/public/home/pw/workspace/symbolicmathematics/SymbolicMathematics-master/src/model/__init__.py", line 27, in check_model_params
    assert os.path.isfile(params.reload_model)
AssertionError
Traceback (most recent call last):
  File "main.py", line 256, in <module>
    check_model_params(params)
  File "/public/home/pw/workspace/symbolicmathematics/SymbolicMathematics-master/src/model/__init__.py", line 27, in check_model_params
    assert os.path.isfile(params.reload_model)
AssertionError
Traceback (most recent call last):
  File "/public/home/pw/anaconda3/envs/mathematics/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/public/home/pw/anaconda3/envs/mathematics/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/public/home/pw/anaconda3/envs/mathematics/lib/python3.7/site-packages/torch/distributed/launch.py", line 253, in <module>
    main()
  File "/public/home/pw/anaconda3/envs/mathematics/lib/python3.7/site-packages/torch/distributed/launch.py", line 249, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/public/home/pw/anaconda3/envs/mathematics/bin/python', '-u', 'main.py', '--local_rank=1', '--exp_name', 'first_eval', '--eval_only', 'true', '--reload_model', 'fwd.pth', '--tasks', 'prim_fwd', '--reload_data', 'prim_fwd,prim_fwd.train,prim_fwd.valid,prim_fwd.test', '--emb_dim', '1024', '--n_enc_layers', '6', '--n_dec_layers', '6', '--n_heads', '8', '--beam_eval', 'true', '--beam_size', '10', '--beam_length_penalty', '1.0', '--beam_early_stopping', '1', '--eval_verbose', '1', '--eval_verbose_print', 'false']' returned non-zero exit status 1.

It runs fine on a single graphics card. My complete command is as follows:

$NGPU = 2;
python -m torch.distributed.launch --nproc_per_node=$NGPU main.py --exp_name first_eval --eval_only true --reload_model "fwd.pth" --tasks "prim_fwd" --reload_data "prim_fwd,prim_fwd.train,prim_fwd.valid,prim_fwd.test" --emb_dim 1024 --n_enc_layers 6 --n_dec_layers 6 --n_heads 8 --beam_eval true --beam_size 10 --beam_length_penalty 1.0 --beam_early_stopping 1 --eval_verbose 1 --eval_verbose_print false

My Python version is 3.7.10, my PyTorch version is 1.3.0, and my torchvision version is 0.4.1. Maybe my version of PyTorch is wrong, or do I need to modify the default local_rank parameter?

Thank You.

What is leaf_probs?

In the configuration of the script, there is this setting:
--leaf_probs "0.75,0,0.25,0"   # leaf sampling probabilities

What is it?

When generating random binary tree, what's the meaning of empty nodes?

According to https://github.com/facebookresearch/SymbolicMathematics/blob/master/src/envs/char_sp.py,

D[e][n] represents the number of different binary trees with n nodes that
can be generated from e empty nodes, using the following recursion:
D(0, n) = 0
D(1, n) = C_n (n-th Catalan number)
D(e, n) = D(e - 1, n + 1) - D(e - 2, n + 1)

I understand D(1, n) as the number of different binary trees with n nodes, but what is the meaning of the e "empty nodes" in D(e, n)?

about model predict

I am a beginner in this area, so my questions may be incorrect.
When a model is trained, I want to use it to complete a certain "prediction task", so I have some inputs to be predicted.
The question is: are "eval_only" and "prediction" not the same thing? I found in the "eval_only" related code that I still need to feed the ground-truth results into the decoder, instead of feeding the previous step's prediction into the decoder as most tutorials describe.
Looking forward to your reply, and thank you!
