crnn.mxnet's People

Contributors

kli-casia, novioleo

crnn.mxnet's Issues

How to train?

Hello, @novioleo! Your work is very nice. I tried the demo and the result is good. But when I use my own dataset (each picture contains two lines of characters), the result is not satisfactory. Could you give me some help with training, including preparing the dataset and setting the parameters? Or could you give me an example? I can't get started with this project. Thank you very much!

ADD TEXT LOCALIZATION

For now I use an algorithm from opencv_contrib to locate the text, that is, to detect the character areas. This algorithm is suited to non-Chinese text. I'll replace it with a simpler algorithm so that mobile devices can detect text areas much more efficiently.

[PROMOTION] Make the model much more generic

Recently I've changed the structure of the network to make it more generic, so that it adapts to more than just the generated datasets. The current model fits our generated data very well, but real-world data and backgrounds are too varied and too numerous, and we cannot collect enough of them. Therefore, to improve general OCR ability, I propose several solutions:
1. Background filtering: apply max-pooling minus avg-pooling at the first or second layer.
2. Reduce the number of pooling operations: do not compress the feature map down to a single row, and increase the number of input steps to the LSTM.
3. Reduce the number of convolution kernels: this can improve generalization, but it slows convergence and slightly reduces accuracy. It also reduces the size of the model.
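
As a rough illustration of proposal 1, the sketch below computes "max-pooling minus avg-pooling" on a single-channel feature map. It is only one possible reading of the idea, not the repository's implementation: the `FeatureMap` struct, the `max_minus_avg_pool` name, the non-overlapping k x k windows, and the row-major layout are all assumptions made for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical single-channel feature map, stored row-major.
struct FeatureMap {
    std::size_t rows;
    std::size_t cols;
    std::vector<float> data;  // expected size: rows * cols
};

// For each non-overlapping k x k window, emit (window max - window mean).
// Flat background regions give values near 0; windows containing a
// stroke on a flat background give larger values.
FeatureMap max_minus_avg_pool(const FeatureMap& in, std::size_t k) {
    if (k == 0) {
        throw std::invalid_argument("window size must be positive");
    }
    if (in.data.size() != in.rows * in.cols) {
        throw std::invalid_argument("data size does not match rows * cols");
    }
    // Trailing rows/columns that do not fill a whole window are dropped.
    FeatureMap out{in.rows / k, in.cols / k, {}};
    out.data.reserve(out.rows * out.cols);
    for (std::size_t r = 0; r < out.rows; ++r) {
        for (std::size_t c = 0; c < out.cols; ++c) {
            float max_v = in.data[(r * k) * in.cols + c * k];
            float sum = 0.0f;
            for (std::size_t dr = 0; dr < k; ++dr) {
                for (std::size_t dc = 0; dc < k; ++dc) {
                    const float v = in.data[(r * k + dr) * in.cols + (c * k + dc)];
                    max_v = std::max(max_v, v);
                    sum += v;
                }
            }
            out.data.push_back(max_v - sum / static_cast<float>(k * k));
        }
    }
    return out;
}
```

In an actual network this would be expressed with the framework's pooling operators applied to the same input and subtracted elementwise; the loop version here only makes the arithmetic explicit.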

Test Image on C++ Invoker main.cpp file

I am trying to test the C++ version using main.cpp, but I don't understand how to read an image file as bytes into a const std::vector<uchar> so that I can pass it to the predict function of the predictor class.

Please help. This is the main function I am using:

int main(int argc, char** argv) {
    predictor obj = predictor(argv[2], argv[3]);
    // read image here in bytes format
    try {
        std::string result = obj.predict(image);
    } catch (std::exception& e) {
        std::cout << e.what() << std::endl;
    }
    return 0;
}

And this is the prototype of my prediction function:

std::string predict(const std::vector<uchar> &byte_data);

If any of this is unclear, I can try to explain again. Thank you very much.
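
One common way to get a file's raw bytes into a vector with the standard library is sketched below. The helper name `read_file_bytes` is hypothetical and not part of the repository. The sketch uses `unsigned char`, which is what OpenCV's `uchar` typedef stands for; it assumes that `predict` expects the encoded file contents (for example a JPEG or PNG as stored on disk) rather than decoded pixels.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: read an entire file into a byte vector.
std::vector<unsigned char> read_file_bytes(const std::string& path) {
    // Binary mode prevents newline translation on platforms that do it.
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open file: " + path);
    }
    // istreambuf_iterator reads raw characters without skipping whitespace.
    std::vector<unsigned char> bytes((std::istreambuf_iterator<char>(in)),
                                     std::istreambuf_iterator<char>());
    return bytes;
}
```

With such a helper, the placeholder comment in the main function could become something like `std::vector<unsigned char> image = read_file_bytes(argv[1]);`, where using `argv[1]` for the image path is itself an assumption about how the program is invoked. Because the helper throws std::runtime_error when the file cannot be opened, the call would belong inside the existing try block.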

A small question about optimizer params

@novioleo Hi, I'm trying to train CRNN on Synth90k with your code, but it performs poorly. The original version reaches 93% accuracy on that dataset, while my model only reaches 60%. One small difference from the original version: I use the Adam optimizer, as you did, with the learning rate set to 1e-4. Could you give me some advice?
Thank you very much.

Check failed: op != nullptr Operator WarpCTC is not registered

When I run python predictor.py, I get the following errors:
[11:38:15] /home/xuting/ocr/mxnet/dmlc-core/include/dmlc/./logging.h:308: [11:38:15] src/core/op.cc:55: Check failed: op != nullptr Operator WarpCTC is not registered

Stack trace returned 10 entries:
[bt] (0) /usr/lib/python2.7/site-packages/mxnet-0.11.1-py2.7.egg/mxnet/libmxnet.so(_ZN4nnvm2Op3GetERKSs+0x329) [0x7f1833825449]
[bt] (1) /usr/lib/python2.7/site-packages/mxnet-0.11.1-py2.7.egg/mxnet/libmxnet.so(+0x28aecf8) [0x7f183380ccf8]
[bt] (2) /usr/lib/python2.7/site-packages/mxnet-0.11.1-py2.7.egg/mxnet/libmxnet.so(_ZN4dmlc20JSONObjectReadHelper13ReadAllFieldsEPNS_10JSONReaderE+0x100) [0x7f18338130a0]
[bt] (3) /usr/lib/python2.7/site-packages/mxnet-0.11.1-py2.7.egg/mxnet/libmxnet.so(+0x28adb86) [0x7f183380bb86]
[bt] (4) /usr/lib/python2.7/site-packages/mxnet-0.11.1-py2.7.egg/mxnet/libmxnet.so(_ZNSt17_Function_handlerIFN4nnvm5GraphES1_EPS2_E9_M_invokeERKSt9_Any_dataS1_+0x11f) [0x7f1832487d4f]
[bt] (5) /usr/lib/python2.7/site-packages/mxnet-0.11.1-py2.7.egg/mxnet/libmxnet.so(_ZN4nnvm11ApplyPassesENS_5GraphERKSt6vectorISsSaISsEE+0x501) [0x7f1833818a21]
[bt] (6) /usr/lib/python2.7/site-packages/mxnet-0.11.1-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet18LoadLegacyJSONPassEN4nnvm5GraphE+0x180) [0x7f1832480e60]
[bt] (7) /usr/lib/python2.7/site-packages/mxnet-0.11.1-py2.7.egg/mxnet/libmxnet.so(_ZNSt17_Function_handlerIFN4nnvm5GraphES1_EPS2_E9_M_invokeERKSt9_Any_dataS1_+0x11f) [0x7f1832487d4f]
[bt] (8) /usr/lib/python2.7/site-packages/mxnet-0.11.1-py2.7.egg/mxnet/libmxnet.so(_ZN4nnvm11ApplyPassesENS_5GraphERKSt6vectorISsSaISsEE+0x501) [0x7f1833818a21]
[bt] (9) /usr/lib/python2.7/site-packages/mxnet-0.11.1-py2.7.egg/mxnet/libmxnet.so(_ZN4nnvm9ApplyPassENS_5GraphERKSs+0x8e) [0x7f18327e7f7e]

Traceback (most recent call last):
  File "predictor.py", line 133, in <module>
    my_predictor = predict(images,(256,32),'model/digit',0,'./digit.txt',32,24,128,False)
  File "predictor.py", line 77, in __init__
    label_names=('label',)
  File "/usr/lib/python2.7/site-packages/mxnet-0.11.1-py2.7.egg/mxnet/module/module.py", line 143, in load
    sym, args, auxs = load_checkpoint(prefix, epoch)
  File "/usr/lib/python2.7/site-packages/mxnet-0.11.1-py2.7.egg/mxnet/model.py", line 394, in load_checkpoint
    symbol = sym.load('%s-symbol.json' % prefix)
  File "/usr/lib/python2.7/site-packages/mxnet-0.11.1-py2.7.egg/mxnet/symbol/symbol.py", line 1913, in load
    check_call(_LIB.MXSymbolCreateFromFile(c_str(fname), ctypes.byref(handle)))
  File "/usr/lib/python2.7/site-packages/mxnet-0.11.1-py2.7.egg/mxnet/base.py", line 143, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: Failed loading Op warpctc0 of type WarpCTC: [11:38:15] src/core/op.cc:55: Check failed: op != nullptr Operator WarpCTC is not registered

Stack trace returned 10 entries:
[bt] (0) /usr/lib/python2.7/site-packages/mxnet-0.11.1-py2.7.egg/mxnet/libmxnet.so(_ZN4nnvm2Op3GetERKSs+0x329) [0x7f1833825449]
[bt] (1) /usr/lib/python2.7/site-packages/mxnet-0.11.1-py2.7.egg/mxnet/libmxnet.so(+0x28aecf8) [0x7f183380ccf8]
[bt] (2) /usr/lib/python2.7/site-packages/mxnet-0.11.1-py2.7.egg/mxnet/libmxnet.so(_ZN4dmlc20JSONObjectReadHelper13ReadAllFieldsEPNS_10JSONReaderE+0x100) [0x7f18338130a0]
[bt] (3) /usr/lib/python2.7/site-packages/mxnet-0.11.1-py2.7.egg/mxnet/libmxnet.so(+0x28adb86) [0x7f183380bb86]
[bt] (4) /usr/lib/python2.7/site-packages/mxnet-0.11.1-py2.7.egg/mxnet/libmxnet.so(_ZNSt17_Function_handlerIFN4nnvm5GraphES1_EPS2_E9_M_invokeERKSt9_Any_dataS1_+0x11f) [0x7f1832487d4f]
[bt] (5) /usr/lib/python2.7/site-packages/mxnet-0.11.1-py2.7.egg/mxnet/libmxnet.so(_ZN4nnvm11ApplyPassesENS_5GraphERKSt6vectorISsSaISsEE+0x501) [0x7f1833818a21]
[bt] (6) /usr/lib/python2.7/site-packages/mxnet-0.11.1-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet18LoadLegacyJSONPassEN4nnvm5GraphE+0x180) [0x7f1832480e60]
[bt] (7) /usr/lib/python2.7/site-packages/mxnet-0.11.1-py2.7.egg/mxnet/libmxnet.so(_ZNSt17_Function_handlerIFN4nnvm5GraphES1_EPS2_E9_M_invokeERKSt9_Any_dataS1_+0x11f) [0x7f1832487d4f]
[bt] (8) /usr/lib/python2.7/site-packages/mxnet-0.11.1-py2.7.egg/mxnet/libmxnet.so(_ZN4nnvm11ApplyPassesENS_5GraphERKSt6vectorISsSaISsEE+0x501) [0x7f1833818a21]
[bt] (9) /usr/lib/python2.7/site-packages/mxnet-0.11.1-py2.7.egg/mxnet/libmxnet.so(_ZN4nnvm9ApplyPassENS_5GraphERKSs+0x8e) [0x7f18327e7f7e]

Language?

The repository's language is listed as C++, but the code is written in Python. Is there any problem with using Python to run this code?

Multi-GPU version?

Hi novioleo, thanks for your work. Could you please add support for multi-GPU training? Thanks.

Batch prediction

Is there any documentation on batch prediction with the model?
