halfish / lstm-ctc-ocr
Using RNN (LSTM or GRU) and CTC to convert a line image into text, based on torch7 and warp-ctc
License: Apache License 2.0
I get this output when I try to run 3_phonernn.lua:
train size = 8, valid size = 2
nn.Sequential {
  [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> output]
  (1): nn.SplitTable
  (2): nn.Sequencer @ nn.FastLSTM(36 -> 128)
  (3): nn.Sequencer @ nn.Recursor @ nn.ReLU
  (4): nn.Sequencer @ nn.Recursor @ nn.BatchNormalization
  (5): nn.Sequencer @ nn.Recursor @ nn.Dropout(0.5, busy)
  (6): nn.Sequencer @ nn.Recursor @ nn.Linear(128 -> 12)
  (7): nn.JoinTable
  (8): nn.View(255, 12)
}
[=================== 1/1 =====================>] Tot: 0ms | Step: 0ms
/home/ubuntu/torch/install/bin/luajit: /home/ubuntu/torch/install/share/lua/5.1/nn/Container.lua:67:
In 8 module of nn.Sequential:
/home/ubuntu/torch/install/share/lua/5.1/torch/Tensor.lua:458: bad argument #1 to 'set' (expecting number or torch.DoubleTensor or torch.DoubleStorage at /home/ubuntu/torch/pkg/torch/generic/Tensor.c:1125)
What's happening here? How do I solve this?
I've run 1_generateImage.py and 2_dump.lua successfully prior to this. Thanks in advance.
@Halfish the images of the numeral-sequences you create are all of the same dimensions (36 x 255). How can one go about using variable width images instead? (36 x var-width)
I see that you feed a complete image to the model at a time (not as a sequence of 255 pixel strips, each 36 pixels high). Can you suggest how this could be modified to handle variable-length inputs?
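One common way to approach this, sketched here in plain Python rather than Torch7 and not taken from this repository, is to treat each image as a sequence of column strips, pad every sequence in a batch to the widest image, and keep the true widths so that the CTC loss can ignore the padding (assuming the CTC binding in use accepts per-sample input lengths). The names `image_to_strips` and `pad_batch` are hypothetical helpers made up for this illustration.

```python
from typing import List, Tuple

Image = List[List[float]]  # rows of pixels: height x width


def image_to_strips(image: Image) -> List[List[float]]:
    """Turn an H x W image into W column strips, each of length H."""
    height = len(image)
    width = len(image[0])
    return [[image[r][c] for r in range(height)] for c in range(width)]


def pad_batch(
    images: List[Image], pad_value: float = 0.0
) -> Tuple[List[List[List[float]]], List[int]]:
    """Pad strip sequences to the widest image.

    Returns the padded batch (batch x max_width x height) and the
    true width of each image, which is its unpadded sequence length.
    Assumes all images share the same height.
    """
    sequences = [image_to_strips(img) for img in images]
    lengths = [len(seq) for seq in sequences]
    max_len = max(lengths)
    height = len(images[0])
    padded = [
        seq + [[pad_value] * height for _ in range(max_len - len(seq))]
        for seq in sequences
    ]
    return padded, lengths


if __name__ == "__main__":
    # Two toy images of height 36 with different widths (3 and 5).
    narrow = [[1.0] * 3 for _ in range(36)]
    wide = [[1.0] * 5 for _ in range(36)]
    batch, lengths = pad_batch([narrow, wide])
    print(len(batch), len(batch[0]), len(batch[0][0]), lengths)
```

With this layout, a fixed reshape such as nn.View(255, 12) would presumably need to be replaced by one that depends on the padded width of the current batch.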