halfish / lstm-ctc-ocr

73.0 6.0 22.0 12.52 MB

Uses an RNN (LSTM or GRU) and CTC to convert a line image into text; based on Torch7 and warp-ctc.

License: Apache License 2.0

Python 7.40% Lua 92.60%

ctc recurrent-neural-networks lstm ocr

lstm-ctc-ocr's Issues

Bad argument #1 to 'set'

I get this output when I try to run 3_phonernn.lua:

train size = 8, valid size = 2
nn.Sequential {
  [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> output]
  (1): nn.SplitTable
  (2): nn.Sequencer @ nn.FastLSTM(36 -> 128)
  (3): nn.Sequencer @ nn.Recursor @ nn.ReLU
  (4): nn.Sequencer @ nn.Recursor @ nn.BatchNormalization
  (5): nn.Sequencer @ nn.Recursor @ nn.Dropout(0.5, busy)
  (6): nn.Sequencer @ nn.Recursor @ nn.Linear(128 -> 12)
  (7): nn.JoinTable
  (8): nn.View(255, 12)
}
[=================== 1/1 =====================>] Tot: 0ms | Step: 0ms
/home/ubuntu/torch/install/bin/luajit: /home/ubuntu/torch/install/share/lua/5.1/nn/Container.lua:67:
In 8 module of nn.Sequential:
/home/ubuntu/torch/install/share/lua/5.1/torch/Tensor.lua:458: bad argument #1 to 'set' (expecting number or torch.DoubleTensor or torch.DoubleStorage at /home/ubuntu/torch/pkg/torch/generic/Tensor.c:1125)

What's happening here? How do I solve this?
I've run 1_generateImage.py and 2_dump.lua successfully prior to this. Thanks in advance.

Query: How to work with variable-width input images?

@Halfish, the images of the numeral sequences you create all have the same dimensions (36 x 255). How can one use variable-width images (36 x var-width) instead?

I see that you feed a complete image to the model at a time (not as a sequence of 255 pixel strips, each 36 pixels high). Can you suggest how this could be modified to handle variable-length inputs?
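
One common way to approach the idea in this question is to treat each image as a sequence of pixel columns, pad the sequences in a batch to the widest image, and keep the true widths so that a CTC loss can be told how many time steps of each sample are real. The sketch below only illustrates that data layout. It is written in plain Python rather than the repository's Lua, it is not the author's method, and the names `image_to_strips` and `pad_batch` are hypothetical helpers introduced here.

```python
from typing import List, Tuple

Image = List[List[float]]  # image[row][col], i.e. height x width
Strip = List[float]        # one pixel column, length == height


def image_to_strips(image: Image) -> List[Strip]:
    """Turn a height x width image into `width` column strips of length `height`."""
    height = len(image)
    width = len(image[0])
    if any(len(row) != width for row in image):
        raise ValueError("all rows of an image must have the same width")
    return [[image[r][c] for r in range(height)] for c in range(width)]


def pad_batch(
    images: List[Image], pad_value: float = 0.0
) -> Tuple[List[List[Strip]], List[int]]:
    """Pad every strip sequence to the widest image in the batch.

    Returns the padded batch and the true (unpadded) sequence lengths.
    Assumes all images share the same height (36 in the question).
    """
    sequences = [image_to_strips(img) for img in images]
    lengths = [len(seq) for seq in sequences]
    height = len(sequences[0][0])
    max_len = max(lengths)
    padded = [
        seq + [[pad_value] * height for _ in range(max_len - len(seq))]
        for seq in sequences
    ]
    return padded, lengths


# Two images of height 36 with different widths (100 and 255 columns).
narrow = [[1.0] * 100 for _ in range(36)]
wide = [[2.0] * 255 for _ in range(36)]
batch, lengths = pad_batch([narrow, wide])
print(lengths)  # true widths, to be passed on as per-sample input lengths
```

Whether the padded steps are then ignored depends on the loss and model code consuming `lengths`; the layer that reshapes the output to a fixed 255 steps would also need to stop assuming a constant width.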
