jpuigcerver / Laia
Laia: A deep learning toolkit for HTR based on Torch
License: MIT License
I've been trying to train on the Spanish Numbers dataset and the CER stays at 100%. I've tried to replicate my earlier results on the Lope dataset, and after 8 epochs the CER is ~100% (before it was <90%).
I think a commit made after Sep 15 has broken something.
The estimation of confidence intervals using tables makes LuaJIT run out of memory in the "EpochSummarizer:summarize" function. The heap size is limited to 2 GB.
Hey @jpuigcerver
One more issue I found when I run:
./steps/prepare_iam_text.sh --partition aachen;
awk: line 2: syntax error at or near ,
awk: line 7: syntax error at or near else
awk: line 10: syntax error at or near }
ERROR: Creating file "data/lang/forms/word/aachen/tr.txt"!
I had this issue on my previous run as well, but I worked around it by not creating the tr.txt file for words. Since I am creating the data again, I would like to know why this issue happens.
Thanks!
Hi, @jpuigcerver
I am reproducing your work in PyTorch (it is easier for me to build on your work that way).
In my experiment, without batch normalization and image distortion, the CER on the validation set is 5.45% and the CER on the test set is 8.66%. This result is close to your baseline result. Now I am trying to add batch normalization to the model and image distortion to the image preprocessing.
My problem is how to use the same image distortion parameters. I think I can implement the affine transformation and the morphological operations with opencv-python, but I don't know how to map your parameters. Can you help me?
thanks
Hello and thanks for your great work!
I am going to train the 2D-LSTM model but the file "create-model-aachen.lua" is missing. Do you have any implementation of 2D-LSTM model for handwriting recognition?
Thanks
Dropout in cudnn is applied to the input of each layer (https://github.com/soumith/cudnn.torch/issues/197). Fix the code accordingly.
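To illustrate the semantics in question, here is a plain Python/numpy sketch of applying dropout explicitly between stacked layers. The helper names are mine, not Laia's; whether the first layer's input should also be dropped is exactly the point the linked issue discusses, so verify against your cudnn version:

```python
import numpy as np

def dropout(x, p, training=True, rng=None):
    """Inverted dropout: zero units with probability p, rescale the rest."""
    if not training or p == 0.0:
        return x
    rng = rng or np.random.default_rng()
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

def stacked_forward(x, layers, p=0.5, training=True):
    """Apply dropout to the input of each stacked layer except the first.
    (Sketch only; cudnn's own placement may differ per version.)"""
    for i, layer in enumerate(layers):
        if i > 0:
            x = dropout(x, p, training)
        x = layer(x)
    return x
```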
In my install/share/lua/5.1/laia/util directory I see the 13 files corresponding to:
["laia.util.argparse"] = "laia/util/argparse.lua",
["laia.util.base"] = "laia/util/base.lua",
["laia.util.cudnn"] = "laia/util/cudnn.lua",
["laia.util.decode"] = "laia/util/decode.lua",
["laia.util.format"] = "laia/util/format.lua",
["laia.util.io"] = "laia/util/io.lua",
["laia.util.log"] = "laia/util/log.lua",
["laia.util.math"] = "laia/util/math.lua",
["laia.util.rand"] = "laia/util/rand.lua",
["laia.util.string"] = "laia/util/string.lua",
["laia.util.table"] = "laia/util/table.lua",
["laia.util.torch"] = "laia/util/torch.lua",
["laia.util.types"] = "laia/util/types.lua"
but when I try to run laia-create-model for the Spanish Numbers experiment, it complains that it cannot find laia.util.mem
I am getting an error:
To try to solve it, I put the trepl folder into the steps folder, but the error persists.
./steps/train_lstm1d.sh train/aachen/lstm1d_h128.t7
/usr/bin/luajit: /usr/lib/torch-trepl/th:103: module 'trepl' not found:
no field package.preload['trepl']
no file '/home/suporte/Área de Trabalho/LAIA/Laia-master/egs/iam/../../trepl/init.lua'
no file '/home/suporte/Área de Trabalho/LAIA/Laia-master/egs/iam/../../trepl.lua'
no file './trepl.so'
no file '/usr/local/lib/lua/5.1/trepl.so'
no file '/usr/lib/x86_64-linux-gnu/lua/5.1/trepl.so'
no file '/usr/local/lib/lua/5.1/loadall.so'
stack traceback:
[C]: in function 'require'
/usr/lib/torch-trepl/th:103: in main chunk
[C]: at 0x55ce9144b1d0
Can somebody please help me?
Hi all, I've been trying to get access to the RIMES dataset and have signed and sent the NDA, with no luck so far. Is there an alternative repository available?
Currently, every time a commit is made, the Version.lua file is updated with the current date and time, even if no Lua code was modified.
Although it is not very important, I would suggest modifying this behavior and updating the timestamp only when some Lua file changes. I think we'll get cleaner diffs and better statistics with this change.
Laia/egs/iam/steps/iam_tokenize.py
Line 18 in c6bb8ab
Hi Joan,
To better understand the function of each regex in steps/iam_tokenize.py, I did the following:
For File a)
Diff file a and file b.
For some regexes, I understood that they make the following changes:
But some regexes are not making any change; for example:
next to STARTING_QUOTES, it is mentioned: # This line changes: do not replace "
But I am not able to understand the purpose of these regexes. Are they for the Brown and Wellington corpora?
Thanks,
Ashish
WidthBatcher is deprecated since it still assumes HDF5 files, which are no longer used. Maybe we can just remove it.
Hi,
I am using ImageMagick 6, and at the image-preparation step, running ./steps/prepare_images.sh, I got the following error. Do you have any idea? Thanks.
Failed image processing:
imgtxtenh: enhancing by Sauvola: width=80pixels, mfct=0.2, sfct=0.5
imgtxtenh: removed 1 small components from a total of 18 (min. area: 6 pixels^2)
convert: magick/cache.c:306: AddOpenCLEvent: Assertion `cache_info->opencl != (OpenCLCacheInfo *) ((void *)0)' failed.
./steps/prepare_images.sh: line 40: 1669 Done imgtxtenh -d 118.110 "$1" png:-
1670 Aborted (core dumped) | convert png:- -deskew 40% -bordercolor white -border 5 -trim -bordercolor white -border 20x0 +repage -strip "data/imgs/$partition/$bn.jpg"
ERROR: Processing image data/original/lines/a01-000u-00.png
When trying to run the IAM database example, I get the following error:
laia.ImageDistorter {
dilate_rrate = 1
translate_stdv = 0.02
shear_prec = 4
rotate_prob = 0.5
erode_prob = 0.5
translate_prob = 0.5
erode_srate = 0.8
scale_prob = 0.5
erode_rrate = 1.2
dilate_srate = 0.4
dilate_prob = 0.5
rotate_prec = 100
scale_stdv = 0.12
shear_prob = 0.5
}
[2019-02-27 17:06:43 INFO] /opt/torch/share/lua/5.1/laia/CTCTrainer.lua:98: CTCTrainer uses the weight regularizer:
laia.WeightDecayRegularizer {
weight_l2_decay = 0
weight_l1_decay = 0
}
[2019-02-27 17:06:43 INFO] /opt/torch/share/lua/5.1/laia/CTCTrainer.lua:88: CTCTrainer uses the adversarial regularizer:
laia.AdversarialRegularizer {
adversarial_weight = 0
adversarial_epsilon = 0.0019607843137255
}
/opt/torch/bin/luajit: /opt/torch/share/lua/5.1/nn/Container.lua:67:
In 7 module of nn.Sequential:
/opt/torch/share/lua/5.1/cudnn/init.lua:166: Error in CuDNN: CUDNN_STATUS_EXECUTION_FAILED (cudnnSetDropoutDescriptor)
stack traceback:
[C]: in function 'error'
/opt/torch/share/lua/5.1/cudnn/init.lua:166: in function 'errcheck'
/opt/torch/share/lua/5.1/cudnn/RNN.lua:130: in function 'resetDropoutDescriptor'
/opt/torch/share/lua/5.1/cudnn/RNN.lua:526: in function </opt/torch/share/lua/5.1/cudnn/RNN.lua:449>
[C]: in function 'xpcall'
/opt/torch/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
/opt/torch/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
/opt/torch/share/lua/5.1/laia/CTCTrainer.lua:453: in function '_fbPass'
/opt/torch/share/lua/5.1/laia/CTCTrainer.lua:371: in function '_trainBatch'
/opt/torch/share/lua/5.1/laia/CTCTrainer.lua:307: in function 'opfunc'
/opt/torch/share/lua/5.1/optim/rmsprop.lua:35: in function '_optimizer'
/opt/torch/share/lua/5.1/laia/CTCTrainer.lua:305: in function 'trainEpoch'
/opt/torch/lib/luarocks/rocks/laia/scm-1/bin/laia-train-ctc:303: in main chunk
[C]: at 0x00405d50
WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
[C]: in function 'error'
/opt/torch/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
/opt/torch/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
/opt/torch/share/lua/5.1/laia/CTCTrainer.lua:453: in function '_fbPass'
/opt/torch/share/lua/5.1/laia/CTCTrainer.lua:371: in function '_trainBatch'
/opt/torch/share/lua/5.1/laia/CTCTrainer.lua:307: in function 'opfunc'
/opt/torch/share/lua/5.1/optim/rmsprop.lua:35: in function '_optimizer'
/opt/torch/share/lua/5.1/laia/CTCTrainer.lua:305: in function 'trainEpoch'
/opt/torch/lib/luarocks/rocks/laia/scm-1/bin/laia-train-ctc:303: in main chunk
[C]: at 0x00405d50
Sadly, I could not find any information anywhere about what could be causing this error.
I was able to run the Spanish Numbers example, but I also encountered this same error when I tried to run the example with my own dataset.
Edit: As a clarification, I'm using the latest Docker image of Laia. I modified the scripts in the IAM steps folder to use the laia-docker commands instead of the normal ones.
We should add two methods to the layer modules (e.g. nn.* and cudnn.* modules). All these modules interact with our tools through two main methods, forward(input) and backward(input, gradOutput), which perform the forward and backpropagation passes through the module, respectively. It would be nice to have two similar methods that also receive the input and output sizes, for the case where different batch elements have different sizes. For instance, consider the LSTM layer from the cudnn package:
layer:forward_with_sizes(input, inputSizes): this would return the output tensor AND a tensor with the output sizes. If the sizes do not change, it can just return inputSizes. For instance, in the MaxPool layer, the output sizes would need to be computed according to the stride/size parameters of the pooling.
layer:backward_with_sizes(input, gradOutput, outputSizes): this would return the gradInput and gradInputSizes tensors, to be passed to the previous layers during backpropagation.
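The size bookkeeping for the MaxPool case can be sketched as follows (Python pseudocode; the function names are illustrative, not part of Laia):

```python
import math

def pooled_size(size, kernel, stride, pad=0):
    """Standard pooling output-size formula for one dimension."""
    return math.floor((size + 2 * pad - kernel) / stride) + 1

def forward_with_sizes_maxpool(input_sizes, kernel, stride, pad=0):
    """Per-sample output widths for a batch of variable-width inputs.
    Only the size bookkeeping is shown; the tensor operation itself is
    the usual max-pooling forward pass on the padded batch."""
    return [pooled_size(s, kernel, stride, pad) for s in input_sizes]
```

For layers that preserve sizes (e.g. an LSTM over the width axis), forward_with_sizes could simply return the input sizes unchanged.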
Relaunching train.lua on an already trained network throws an error regarding model_opt.
It would be very interesting to have a pixel batcher, so that instead of sampling a fixed number of images, it samples up to an upper bound of pixels. This way, we could make more efficient use of the GPU memory, and the batch size would not be limited by the largest image, as it is now (batches with bigger images would contain fewer samples).
Aachen's RETURNN supports this option and I think it's very interesting.
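A pixel-bounded batcher could be sketched as follows (Python, illustrative only; a real implementation would likely sort or bucket images by width first to reduce padding waste):

```python
def pixel_batches(samples, max_pixels):
    """Greedily pack (image_id, height, width) samples into batches whose
    padded pixel count (batch_size * max_h * max_w) stays under max_pixels.
    A single image larger than the budget still forms its own batch."""
    batch, max_h, max_w = [], 0, 0
    for sid, h, w in samples:
        nh, nw = max(max_h, h), max(max_w, w)
        if batch and (len(batch) + 1) * nh * nw > max_pixels:
            yield batch
            batch, max_h, max_w = [], 0, 0
            nh, nw = h, w
        batch.append(sid)
        max_h, max_w = nh, nw
    if batch:
        yield batch
```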
@jpuigcerver
I am running Laia with my own dataset, which is formatted as in the Bentham dataset. The training runs well, but when I run the decoding on the test and validation parts, I get an error:
hyp[line[0]] = line[1:]
IndexError: list index out of range
I checked, and it seems that decode/no_lm/char/va_.txt and decode/no_lm/char/te_.txt contain empty lines, which causes the error above!
I added these lines in compute-errors.py:
for line in args.hypothesis:
    line = line.split()
    if len(line) != 0:
        hyp[line[0]] = line[1:]
The code runs, but it does not show anything!
Hi! I'm trying to execute the RIMES recipe but there's a missing file when I get to the decoding step:
./utils/prepare_word_lexicon_from_boundaries.sh
This script is called inside the decode_lm.sh:
# Build lexicon from the boundaries file.
lexiconp=data/lang/forms/word/lexiconp.txt;
[ "$overwrite" = false -a -s "$lexiconp" ] ||
./utils/prepare_word_lexicon_from_boundaries.sh \
  data/lang/forms/word/tr_boundaries.txt > "$lexiconp" ||
{ echo "ERROR: Creating file \"$lexiconp\"!" >&2 && exit 1; }
Can I replace it with ./utils/build_word_lexicon.sh without any side effects?
Thank you!
./run.sh
Required tool imgtxtenh was not found!
Hi there
I have a question: what is the "eps" symbol you add to the syms.txt file? I know you add "ctc" for the blank and "space" for the whitespace, but I didn't see any explanation of "eps".
Thanks
It would be nice to have a Lua package that could be installed from luarocks or similar tools and that installs Laia in the default Torch directory. This would be useful to run the tools without needing to append the directory where Laia is installed to the LUA_PATH env variable.
Instead of from is until
In train.lua, provide the possibility to divide each mini-batch into N blocks, processing each block on the GPU sequentially. This way, training can be performed with large mini-batches (relative to the amount of GPU memory) on GPUs with limited memory, or leave room for other processes using the GPU.
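This proposal amounts to gradient accumulation. A minimal sketch (Python/numpy; forward_backward and update are placeholder callbacks, not Laia's API):

```python
import numpy as np

def train_step_blocked(batch, n_blocks, forward_backward, update):
    """Split a mini-batch into n_blocks, accumulate gradients block by
    block, then apply a single parameter update.  forward_backward(block)
    is assumed to return the SUMMED gradient over that block."""
    grad, total = None, 0
    for block in np.array_split(batch, n_blocks):
        if len(block) == 0:
            continue
        g = forward_backward(block)
        grad = g if grad is None else grad + g
        total += len(block)
    update(grad / total)  # average over the whole mini-batch
```

Only one block resides on the GPU at a time, so the effective mini-batch size is decoupled from GPU memory.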
For the IAM example, the first two commands complete successfully but the third command results in an error:
./steps/download.sh --iam_user "$IAM_USER" --iam_pass "$IAM_PASS";
./steps/prepare_images.sh;
./steps/prepare_iam_text.sh --partition aachen;
awk: cannot open data/lang/lines/char/all.txt (No such file or directory)
ERROR: Creating file data/lang/lines/char/all_wspace.txt
Running the installation command leads to a missing file error:
$ luarocks install https://raw.githubusercontent.com/jpuigcerver/Laia/master/rocks/laia-scm-1.rockspec
Using https://raw.githubusercontent.com/jpuigcerver/Laia/master/rocks/laia-scm-1.rockspec... switching to 'build' mode
Cloning into 'Laia'...
remote: Counting objects: 248, done.
remote: Compressing objects: 100% (212/212), done.
remote: Total 248 (delta 43), reused 133 (delta 19), pack-reused 0
Receiving objects: 100% (248/248), 1.13 MiB | 18.00 KiB/s, done.
Resolving deltas: 100% (43/43), done.
Checking connectivity... done.
cp: cannot stat 'laia/nn/MDRNN.lua': No such file or directory
Error: Build error: Failed installing laia/nn/MDRNN.lua in /home/andrew/torch/install/lib/luarocks/rocks/laia/scm-1/lua/laia/nn/MDRNN.lua
Which seems to be from this line: https://github.com/jpuigcerver/Laia/blob/master/rocks/laia-scm-1.rockspec#L58
I have had several problems dealing with Laia.
First of all, the problem of:
[2017-01-14 17:18:53 FATAL] /home/rparedes/torch/install/share/lua/5.1/cudnn/RNN.lua:252: dropout supported only in cudnn v5.1 and above
I have cuDNN 5.0.5, but rnn_dropout is set to zero. I modified the Lua files to force it to be zero. Finally, I had to modify RNN.lua and set self.dropout=0 there in order to pass the assert.
After that I can run run.sh:
Creating transcripts... Done.
Creating symbols table... Done.
Preprocessing images... Done.
ERROR: Kaldi's compute-wer was not found in your PATH!
and I get this error. Perhaps it is related to my modification and no model was created; I do not know. run.sh doesn't get past the th ../../laia-create-model step.
Any idea?
Meanwhile, I will update cuDNN to 5.1.
Hi @jpuigcerver,
Could you please point me to how I could modify the existing Laia framework to create a character-level language model which I could use for decoding the output of a word-level recognition system based on IAM? (similar to how a word-level language model is currently used with the line-level recognizer, say for IAM)
Thanks
Hey, what if I want to download the XML files of the words from IAM? Is there a way to do it?
Convenient debugging tools:
I started implementing some of them, but never finished. They are very useful for analyzing why CTC training does not make any progress.
Hi, thanks for creating these instructions. In README.md
, there is a line that says:
Your team probably has three text files named brown.txt, lob_excludealltestsets.txt and wellington.txt.
My group has access to Brown and LOB, but we don't know where to obtain lob_excludealltestsets.txt. Would you be able to provide some clarification? Thanks.
The Curriculum Learning batcher had many bugs; some were fixed, but it probably doesn't work properly yet.
There are several publications that mention the benefits of using it instead of pure (batched) Stochastic Gradient Descent, so we should fix the Curriculum Learning batcher and try to use it for training.
@jpuigcerver Thanks for your answer. Actually, I built your model in PyTorch, and I am trying to extract the weights from your pretrained Lua model and transfer them to my PyTorch model. I got the convolutional parts right, but I have a problem with cudnn.BLSTM in Lua versus nn.LSTM in PyTorch. It seems that in Lua all the parameters are flattened, but not in nn.LSTM (see my post https://stackoverflow.com/questions/55507905/cudnn-blstm-weight-and-biaises). How do I unflatten the Lua weights and put them in the right order and place in the PyTorch model?
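For reference, here is a sketch of how such a flat parameter vector could be split for one layer and direction (Python/numpy). The layout assumed here (4 input-to-hidden matrices, then 4 hidden-to-hidden matrices, then the 8 corresponding biases, gate order i, f, g, o) should be verified against the cuDNN documentation for your version before trusting any converted model:

```python
import numpy as np

def split_cudnn_flat(flat, input_size, hidden_size):
    """Split a cuDNN flat LSTM parameter vector (ONE layer, ONE direction)
    into PyTorch-style weight_ih, weight_hh, bias_ih, bias_hh.
    ASSUMED layout: 4 x (h, i) input weights, 4 x (h, h) recurrent
    weights, then 8 x (h,) biases; gate order i, f, g, o."""
    h, i = hidden_size, input_size
    pos = 0
    def take(n, shape):
        nonlocal pos
        out = flat[pos:pos + n].reshape(shape)
        pos += n
        return out
    w_ih = np.concatenate([take(h * i, (h, i)) for _ in range(4)], axis=0)
    w_hh = np.concatenate([take(h * h, (h, h)) for _ in range(4)], axis=0)
    b_ih = np.concatenate([take(h, (h,)) for _ in range(4)])
    b_hh = np.concatenate([take(h, (h,)) for _ in range(4)])
    assert pos == flat.size  # everything consumed
    return w_ih, w_hh, b_ih, b_hh
```

For a BLSTM, the same split would be applied once per direction (and per layer) at the appropriate offsets into the flat buffer.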
Currently, each tool (trainer, decoder, etc.) defines command-line options and passes them to the different objects that it uses (batcher, distorter, trainer, etc.).
This means that several options are duplicated across different tools, with potential inconsistencies in descriptions, default parameters, etc. Also, each tool has to keep track of all the options in each of the classes it uses.
It would be more appropriate if the classes had two default methods, register_options(cmd) and parse_options(opts), to register options with the command-line parser and to parse the options read from the command line.
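The proposed protocol could look like this (Python/argparse sketch; the class and option are illustrative, and Laia's actual classes are in Lua):

```python
import argparse

class ImageDistorter:
    """Each class registers and parses its own options, so tools stop
    duplicating option definitions (sketch of the proposed
    register_options/parse_options protocol)."""

    @staticmethod
    def register_options(cmd):
        cmd.add_argument("--rotate_prob", type=float, default=0.5,
                         help="probability of applying a random rotation")

    def parse_options(self, opts):
        self.rotate_prob = opts.rotate_prob
        return self

parser = argparse.ArgumentParser()
ImageDistorter.register_options(parser)  # the tool just delegates
opts = parser.parse_args(["--rotate_prob", "0.25"])
distorter = ImageDistorter().parse_options(opts)
```

With this pattern, a description or default value changes in exactly one place, and every tool that uses the class picks it up automatically.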
I am wondering how I can download the RIMES data, such as training_2011_gray.tar.
I cannot visit the web domain www.a2ialab.com.
Can anybody give me any hints?
Thank you.
Since run.sh failed, I followed the instructions in the README.
When running:
../../laia-train-ctc \
  --adversarial_weight 0.5 \
  --batch_size "$batch_size" \
  --log_also_to_stderr info \
  --log_level info \
  --log_file laia.log \
  --progress_table_output laia.dat \
  --use_distortions true \
  --early_stop_epochs 100 \
  --learning_rate 0.0005 \
  model.t7 data/lang/chars/symbs.txt \
  data/train.lst data/lang/chars/train.txt \
  data/test.lst data/lang/chars/test.txt;
there are several incorrect paths, because "/chars/" should be "/char/".
install.sh already creates (using luarocks) the following rocks: torch, cutorch, cunn and cudnn, so L38-L42 should be removed.
This can potentially lead to problems: if a particular Torch7 version is checked out (at L29), the most recent versions of the rocks will be installed instead of the matching ones.
Hi @jpuigcerver ,
I trained on the IAM dataset following the README instructions at https://github.com/jpuigcerver/Laia/tree/master/egs/iam. I am currently at epoch ~267 with ~6.2% CER on the validation set. Since the README says that 3.8% CER will be reached around epoch 80, I am just wondering if there is any change that I am not aware of.
IAM dataset: 6176 training samples, 976 val samples
/opt/torch/install/bin/luajit: ./src/TrainOptions.lua:62: attempt to call field 'addCmdOptions' (a nil value)
stack traceback:
./src/TrainOptions.lua:62: in function 'parse'
./train.lua:39: in main chunk
[C]: in function 'dofile'
/opt/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x004065d0
@jpuigcerver Hi, thanks for sharing this well-written code. Did you try to convert the pretrained model to a PyTorch model? Actually, I am trying to do it but am still having problems.