guillaumegenthial / im2latex
Image to LaTeX (Seq2seq + Attention with Beam Search) - Tensorflow
License: Apache License 2.0
Got OOM during training:
Environment:
tf: 1.4
GPU: Titan X
python 2.7
Ubuntu 16.04
Error:
2018-01-07 22:12:42.933166: W tensorflow/core/framework/op_kernel.cc:1192] Resource exhausted: OOM when allocating tensor with shape[34560,1]
Traceback (most recent call last):
File "train.py", line 61, in <module>
main()
File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "train.py", line 57, in main
model.train(config, train_set, val_set, lr_schedule)
File "/home/hope/im2latex-1/model/base.py", line 160, in train
lr_schedule)
File "/home/hope/im2latex-1/model/img2seq.py", line 173, in _run_epoch
feed_dict=fd)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 889, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1120, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1317, in _do_run
options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1336, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[34560,1]
[[Node: attn_cell/rnn/while/rnn/att_mechanism/MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](attn_cell/rnn/while/rnn/att_mechanism/Reshape, attn_cell/rnn/while/rnn/att_mechanism/MatMul/Enter)]]
[[Node: Mean/_85 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2674_Mean", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Caused by op u'attn_cell/rnn/while/rnn/att_mechanism/MatMul', defined at:
File "train.py", line 61, in <module>
main()
File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "train.py", line 56, in main
model.build_train(config)
File "/home/hope/im2latex-1/model/img2seq.py", line 41, in build_train
self._add_pred_op()
File "/home/hope/im2latex-1/model/img2seq.py", line 119, in _add_pred_op
self.dropout)
File "/home/hope/im2latex-1/model/decoder.py", line 60, in __call__
initial_state=attn_cell.initial_state())
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn.py", line 614, in dynamic_rnn
dtype=dtype)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn.py", line 777, in _dynamic_rnn_loop
swap_memory=swap_memory)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2816, in while_loop
result = loop_context.BuildLoop(cond, body, loop_vars, shape_invariants)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2640, in BuildLoop
pred, body, original_loop_vars, loop_vars, shape_invariants)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2590, in _BuildLoop
body_result = body(*packed_vars_for_body)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn.py", line 762, in _time_step
(output, new_state) = call_cell()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn.py", line 748, in <lambda>
call_cell = lambda: cell(input_t, state)
File "/home/hope/im2latex-1/model/components/attention_cell.py", line 109, in __call__
new_output, new_state = self.step(inputs, state)
File "/home/hope/im2latex-1/model/components/attention_cell.py", line 79, in step
c = self._attention_mechanism.context(new_h)
File "/home/hope/im2latex-1/model/components/attention_mechanism.py", line 83, in context
e = tf.matmul(att_flat, att_beta)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/math_ops.py", line 1898, in matmul
a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_math_ops.py", line 2437, in _mat_mul
name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2960, in create_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1473, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[34560,1]
[[Node: attn_cell/rnn/while/rnn/att_mechanism/MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](attn_cell/rnn/while/rnn/att_mechanism/Reshape, attn_cell/rnn/while/rnn/att_mechanism/MatMul/Enter)]]
[[Node: Mean/_85 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2674_Mean", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
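A note on reading the error above: the tensor it names, shape[34560,1] with type DT_FLOAT, is small. That suggests the GPU was already full from earlier allocations when this request arrived, so the named op is probably not the one using most of the memory. A quick check of the arithmetic, as a sketch (tensor_bytes is a helper written for this note, not part of the repository):

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Size in bytes of a dense tensor; DT_FLOAT is a 32-bit (4-byte) float."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


# Shape taken from the ResourceExhaustedError message above.
size = tensor_bytes([34560, 1])
print("%d bytes = %.1f KiB" % (size, size / 1024.0))
```

Since the failing allocation is only about 135 KiB, lowering the batch size in the training config is a reasonable first thing to try.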
Hi Guillaume,
Thanks for the great model!
I've trained the model, but every time I run prediction I get a different result. For example, for the attached image and the same model weights, I get the different results below.
Could you please help with that?
Thanks!
When I run the following command
python train.py --data=configs/data.json --vocab=configs/vocab.json --training=configs/training.json --model=configs/model.json --output=results/full/
the result is
Loaded 76322 formulas from data/train.formulas.norm.txt
Bucketing the dataset...
Traceback (most recent call last):
File "train.py", line 61, in <module>
main()
File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "train.py", line 36, in main
form_prepro=vocab.form_prepro)
File "/home/rootx/me/im2latex-master/model/utils/data_generator.py", line 76, in __init__
self._set_data_generator()
File "/home/rootx/me/im2latex-master/model/utils/data_generator.py", line 84, in _set_data_generator
self._data_generator = self.bucket(self._bucket_size)
File "/home/rootx/me/im2latex-master/model/utils/data_generator.py", line 104, in bucket
for idx, (img, formula, img_path, formula_id) in enumerate(self):
File "/home/rootx/me/im2latex-master/model/utils/data_generator.py", line 201, in __iter__
result, skip = self._process_instance(example)
File "/home/rootx/me/im2latex-master/model/utils/data_generator.py", line 173, in _process_instance
img = self._img_prepro(img)
File "/home/rootx/me/im2latex-master/model/utils/image.py", line 52, in greyscale
state = state[:, :, 0]*0.299 + state[:, :, 1]*0.587 + state[:, :, 2]*0.114
IndexError: index 2 is out of bounds for axis 2 with size 2
root@rootx-virtual-machine:/home/rootx/me/im2latex-master#
How to fix it?
Thanks a lot !
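The IndexError above says the image has only 2 channels on axis 2, while the greyscale line indexes channels 0, 1 and 2. A 2-channel image is typically grey plus alpha. One possible workaround is a greyscale step that checks the channel count first. This is a sketch, not the repository's function: the name to_greyscale and the (H, W, 1) return shape are assumptions, and only the RGB weights come from the traceback.

```python
import numpy as np


def to_greyscale(img):
    """Return an (H, W, 1) float32 array for 2-D, 1-, 2-, 3- or 4-channel input."""
    img = np.asarray(img, dtype=np.float32)
    if img.ndim == 2:
        # Already single-channel with no channel axis.
        grey = img
    elif img.shape[2] in (1, 2):
        # Grey, or grey + alpha: keep channel 0 and ignore alpha.
        grey = img[:, :, 0]
    else:
        # RGB or RGBA: same weighted sum as in the traceback, alpha ignored.
        grey = img[:, :, 0] * 0.299 + img[:, :, 1] * 0.587 + img[:, :, 2] * 0.114
    return grey[:, :, np.newaxis]


# A grey + alpha image, the case that raised the IndexError.
grey_alpha = np.zeros((4, 5, 2))
print(to_greyscale(grey_alpha).shape)
```

Another option, also untested against this dataset, is to convert the offending images to RGB once on disk so that the existing preprocessing never sees 2 channels.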
I have set model.train to use the GPU, but it does not work. How do I train the model on the GPU?
This isn't so much of an issue but rather a discussion of sorts.
First off, I'm very impressed with this project and I am looking to try and use this in one of my projects. This is where the "question" comes in.
(Not to be rude) but the main page isn't very clear on how to actually run this. The training steps were relatively clear (I have yet to try them), but how would we take, say, a picture's local path and "predict" the LaTeX formula from it? Sorry if my question isn't very clear; basically I just need some clarification on how to use this.
Thanks in advance
I have my data in the standard im2latex format from the original Torch implementation.
I have these final files:
formulas.lst
formulas.norm.lst
vocab.txt
final_images/ (images after all rendering and preprocessing)
train.lst, test.lst and validation.lst (after running through filtering)
Is the above data compatible with this project?
My training code uses 2 GPUs with batch size 20; however, when I run evaluation it only runs on the CPU. Is this normal?
What scores can this repo reach when trained on im2latex 100k (norm or tok)?
BLEU-4, exact match,
I set batch_size=256, but memory use is only 309 MB.
I notice that predict.py basically does this, but it seems to start an interactive session. For downstream tools that depend on this, a CLI would be useful. I imagine you would run it something like predict.py image.png and it would return some text string via stdout.
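A rough sketch of what such a non-interactive entry point could look like, using only argparse from the standard library. Everything model-related here is hypothetical: predict_formula is a placeholder that would have to be wired to the trained model, and none of these names come from the repository's predict.py.

```python
import argparse
import sys


def predict_formula(image_path):
    """Hypothetical placeholder: load the trained model and decode one image."""
    raise NotImplementedError("wire this to the trained model")


def main(argv=None, predict=predict_formula, out=sys.stdout):
    """Parse one image path and write the predicted LaTeX string to `out`."""
    parser = argparse.ArgumentParser(description="Print the LaTeX for one image.")
    parser.add_argument("image", help="path to a formula image, e.g. image.png")
    args = parser.parse_args(argv)
    out.write(predict(args.image) + "\n")
    return 0
```

Calling sys.exit(main()) under the usual if __name__ == "__main__": guard would turn this into a script. The predict and out parameters exist only so the argument handling can be exercised without a trained model.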
Training with this code on im2latex 100k, I get the metrics below, which are quite far from the EM of 76 reported in the paper. Has anyone else run into the same situation?
Eval: BLEU-4 88.89 - EM 37.86 - Edit 91.88 - perplexity -1.14
After training the model on the full data, I ran predict.py, but there isn't any result.
Thank you for providing the great source code.
I got BLEU-4 83.68 by training the full dataset.
How can I use this tool to convert a custom image to LaTeX?
I set
import os
os.environ['CUDA_VISIBLE_DEVICES']='0,1,2,3'
The program still trains on a single GPU.
The "body" function for the tf.while_loop extracts the final decoding results one time step at a time.
But the loop variable "parents" is never updated inside the body function!
    def body(time, outputs_ta, parents):
        ... (no update of parents) ...
        return (time + 1), outputs_ta, parents
This should instead be:
        return (time + 1), outputs_ta, input_t.parents
since the parents for the next step are stored in "input_t", which is extracted for the current time step.
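To illustrate why the carried parent indices have to change at every step, here is a minimal plain-Python sketch of backtracking through a beam. It is independent of the repository's TensorFlow code: backtrack, ids and parents are names chosen for this example, and it assumes parents[t][b] is the beam slot at step t-1 that slot b at step t extends.

```python
def backtrack(ids, parents, beam_size):
    """Recover full sequences from per-step beam outputs.

    ids[t][b]     -- token chosen at step t in beam slot b
    parents[t][b] -- slot at step t-1 that slot b at step t extends
    """
    cur = list(range(beam_size))  # slot each final hypothesis currently sits in
    seqs = [[] for _ in range(beam_size)]
    for t in range(len(ids) - 1, -1, -1):  # walk from the last step to the first
        for k in range(beam_size):
            seqs[k].append(ids[t][cur[k]])
        # The update in question: follow the parent pointers one step back.
        # If cur were left unchanged, every step would read the same slots.
        cur = [parents[t][c] for c in cur]
    return [s[::-1] for s in seqs]


ids = [[1, 2], [3, 4], [5, 6]]
parents = [[0, 0], [1, 0], [1, 1]]
print(backtrack(ids, parents, beam_size=2))  # [[1, 4, 5], [1, 4, 6]]
```

In this toy input both final hypotheses descend from slot 1 at the middle step, so skipping the update would give [1, 3, 5] and [2, 4, 6], which are not paths that the beam actually followed.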