crnn-with-stn's Issues

error while using the saved model

When I try to load the saved model to run predictions, it raises:

  File "/home/sgnbx/Downloads/projects/CRNN-with-STN-master/prediction.py", line 20, in <module>
    model = load_model('weightswithoutstnlrchanged.best.hdf5', custom_objects={"bknd": backend})
  File "/home/sgnbx/anaconda3/envs/tf_gpu/lib/python3.6/site-packages/keras/engine/saving.py", line 419, in load_model
    model = _deserialize_model(f, custom_objects, compile)
  File "/home/sgnbx/anaconda3/envs/tf_gpu/lib/python3.6/site-packages/keras/engine/saving.py", line 312, in _deserialize_model
    sample_weight_mode=sample_weight_mode)
  File "/home/sgnbx/anaconda3/envs/tf_gpu/lib/python3.6/site-packages/keras/engine/training.py", line 129, in compile
    loss_functions.append(losses.get(loss.get(name)))
  File "/home/sgnbx/anaconda3/envs/tf_gpu/lib/python3.6/site-packages/keras/losses.py", line 133, in get
    return deserialize(identifier)
  File "/home/sgnbx/anaconda3/envs/tf_gpu/lib/python3.6/site-packages/keras/losses.py", line 114, in deserialize
    printable_module_name='loss function')
  File "/home/sgnbx/anaconda3/envs/tf_gpu/lib/python3.6/site-packages/keras/utils/generic_utils.py", line 165, in deserialize_keras_object
    ':' + function_name)
ValueError: Unknown loss function:<lambda>

Do you have any idea what causes this?

This is my code:

model = load_model('weightswithoutstnlrchanged.best.hdf5', custom_objects={"bknd": backend})
sgd = SGD(lr=0.0001, decay=1e-6, momentum=0.9, nesterov=True, clipnorm=5)
adam = optimizers.Adam()

model.compile(loss={'ctc': lambda y_true, y_pred: y_pred}, optimizer=sgd)

I passed backend in custom_objects because at first load_model did not recognize it.
Now it raises an error for the loss. I tried adding the loss to custom_objects as well, but that did not work; maybe I need to try something else.

Could you please have a look at this?
Thank you
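
A possible workaround, sketched here rather than confirmed against this repository: the traceback shows the failure inside the compile step of load_model, where Keras tries to deserialize the saved loss by name and only finds `<lambda>`. Passing `compile=False` skips that step, which is usually enough for prediction. Registering a function under the key `"<lambda>"` in custom_objects is a commonly suggested alternative, included below as an assumption. The helper names `ctc_passthrough`, `build_custom_objects`, and `load_for_inference` are made up for illustration, and Keras is imported lazily inside the loader.

```python
def ctc_passthrough(y_true, y_pred):
    """Stand-in for the training loss: the 'ctc' output already is the loss."""
    return y_pred


def build_custom_objects(backend_module):
    """Map names stored in the HDF5 file to Python objects."""
    return {
        "bknd": backend_module,       # the name the question's code already registers
        "<lambda>": ctc_passthrough,  # assumed name Keras recorded for the anonymous loss
    }


def load_for_inference(path, backend_module):
    """Load a saved model without re-compiling it (assumes Keras 2.x is installed)."""
    from keras.models import load_model  # imported lazily on purpose

    return load_model(
        path,
        custom_objects=build_custom_objects(backend_module),
        compile=False,  # skip deserializing the optimizer and the loss
    )
```

If training has to resume afterwards, the loaded model can be compiled again with the same loss dictionary used in the question.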

Hello, I ran into some problems when loading the model

When I change the model path in config.py to load_model_path = '/home/user/WWY/CRNN-with-STN-master/weights_with_STN.hdf5',
the command line reports:
ValueError: Layer #10 (named "spatial_transformer_1" in the current model) was found to correspond to layer spatial_transformer_1 in the save file. However the new layer spatial_transformer_1 expects 16 weights, but the saved weights have 8 elements.

With load_model_path = '/home/user/WWY/CRNN-with-STN-master/weights_without_STN.hdf5',
the command line reports:
ValueError: axes don't match array

I have never run into either of these errors before and do not know how to fix them. Could the author please help clear this up? Thanks!

Learning rate

Hi!
I read in a different issue that you were able to reach 90% accuracy in 24 hours.
Which learning rate did you use for that, 0.0001 or 0.002? There are two values in config.py, so I was wondering which of the two was used.
I'm struggling to train the model: I only reach about 3% accuracy after 48 hours of training on 4 GPUs.

alternative way of concatenating two LSTM cells

Hello again :)

I am working with your code and am almost done, except that I need to change one line:
rnn2_merged = concatenate([rnn_2, rnn_2b])
which concatenates the two outputs.
Could you please help me with this? I want to keep the same structure but without concatenate.
To put it another way: what is an alternative way of merging them in Keras without using concatenate?

Thanks for taking the time.
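
For context, a sketch of what the merge step does, using plain Python lists instead of tensors and assuming rnn_2 and rnn_2b are the forward and backward outputs. Keras's `Bidirectional` wrapper offers the same choices through its `merge_mode` argument (`'concat'`, `'sum'`, `'ave'`, `'mul'`). Only `'concat'` keeps the doubled feature size that concatenate([rnn_2, rnn_2b]) produces, so any other mode changes the input width of the next layer. The function `merge_sequences` is hypothetical and not part of the repository.

```python
def merge_sequences(forward, backward, mode="concat"):
    """Merge two (timesteps x features) sequences one timestep at a time."""
    if len(forward) != len(backward):
        raise ValueError("both sequences need the same number of timesteps")
    merged = []
    for f_step, b_step in zip(forward, backward):
        if mode == "concat":
            merged.append(list(f_step) + list(b_step))
        elif mode == "sum":
            merged.append([f + b for f, b in zip(f_step, b_step)])
        elif mode == "ave":
            merged.append([(f + b) / 2 for f, b in zip(f_step, b_step)])
        elif mode == "mul":
            merged.append([f * b for f, b in zip(f_step, b_step)])
        else:
            raise ValueError("unknown mode: " + mode)
    return merged


rnn_2 = [[1.0, 2.0], [3.0, 4.0]]   # 2 timesteps x 2 features
rnn_2b = [[5.0, 6.0], [7.0, 8.0]]  # same shape

concat = merge_sequences(rnn_2, rnn_2b, "concat")  # 4 features per timestep
summed = merge_sequences(rnn_2, rnn_2b, "sum")     # 2 features per timestep
```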

about the input shape

I see that your model takes a fixed input size, so how can it recognize images of arbitrary size, such as a 64*500 image? Resizing the image may distort its aspect ratio and affect the result, right?
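
One common approach, sketched here as an assumption rather than as this repository's preprocessing, is to scale the image so that it fits inside the fixed input while keeping its aspect ratio, then pad the remainder. The 100x32 target size and the helper `fit_with_padding` are hypothetical; only the arithmetic is shown, not the actual image resize.

```python
def fit_with_padding(width, height, target_width, target_height):
    """Return (new_width, new_height, pad_right, pad_bottom) for a letterbox fit."""
    # Use the smaller scale factor so that both sides fit inside the target.
    scale = min(target_width / width, target_height / height)
    new_width = max(1, min(target_width, round(width * scale)))
    new_height = max(1, min(target_height, round(height * scale)))
    return (
        new_width,
        new_height,
        target_width - new_width,    # columns of padding on the right
        target_height - new_height,  # rows of padding at the bottom
    )


# A 500x64 image (width x height) fitted into a hypothetical 100x32 input.
plan = fit_with_padding(500, 64, 100, 32)
```

For very wide images the text becomes small after scaling, so splitting long lines before recognition may work better than padding alone.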

y_true (label) in CTC

Hi,
I've just learned about CTC loss, and as far as I know it allows labels of varying length as long as they are no longer than label_len. For that reason, I don't understand why you need to pad the labels with '-' (the comment on that code doesn't make sense to me, by the way):

# due to the explanation of ctc_loss, try to not add "-" for blank
while len(lexicon) < label_len:
     lexicon += "-"

and why you added the '-' symbol to your vocabulary (characters):

characters = '0123456789'+string.ascii_lowercase+'-'
label_classes = len(characters)+1

EDIT:
I found that the labels need to be padded for the code to run. One last question: for example, with label='12345---' and label_len=5, CTC only uses label[:label_len] to calculate the loss, right?
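
A sketch of the general mechanism, not a statement about how this repository calls the loss: padding only gives every label in a batch the same shape, while a separate per-sample length tells the loss where the real label ends. In Keras, `ctc_batch_cost` takes such a `label_length` argument and, as far as I can tell, ignores positions beyond it. The value `label_len = 8` and the helpers `encode_label` and `effective_label` are hypothetical.

```python
import string

characters = "0123456789" + string.ascii_lowercase + "-"
label_len = 8  # hypothetical maximum label length


def encode_label(text, pad_char="-"):
    """Return (padded index list, true length) for one label."""
    if len(text) > label_len:
        raise ValueError("label longer than label_len")
    padded = text + pad_char * (label_len - len(text))
    return [characters.index(c) for c in padded], len(text)


def effective_label(indices, true_length):
    """The part of a padded label that a length-aware CTC loss scores."""
    return indices[:true_length]


indices, true_length = encode_label("12345")
scored = effective_label(indices, true_length)
```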

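Hello, a problem after adding new data

After running successfully with the dataset you provided, I tried adding my own data. Whether I prepare lexicon, train, and vali files in the same way as the original dataset or simply overwrite the original dataset with the new data, I always get an error, and it is always the same one: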

(0) Invalid argument: Saw a non-null label (index >= num_classes - 1) following a null label, batch: 38 num_classes: 38 labels: 0,255,0,1,255,1,5,9,0,0,36,36,36,36,36,36 labels seen so far: 0
[[{{node ctc/CTCLoss}}]]
[[training/SGD/gradients/ctc/CTCLoss_grad/mul/_431]]
(1) Invalid argument: Saw a non-null label (index >= num_classes - 1) following a null label, batch: 38 num_classes: 38 labels: 0,255,0,1,255,1,5,9,0,0,36,36,36,36,36,36 labels seen so far: 0
[[{{node ctc/CTCLoss}}]]

How should I fix this?
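
One possible cause, offered as a guess from the error text: the labels contain 255, which is outside the valid range for 38 classes. A value of 255 is what -1 becomes when stored in an unsigned 8-bit array, and -1 is what `str.find` returns for a character that is missing from the vocabulary, for example an uppercase letter or a punctuation mark in the new lexicon. The check below is a sketch; `find_unknown_characters` is a hypothetical helper, the sample lexicon is invented, and the vocabulary is copied from the characters definition quoted in another issue on this page.

```python
import string

characters = "0123456789" + string.ascii_lowercase + "-"


def find_unknown_characters(words):
    """Map each offending word to the characters missing from the vocabulary."""
    problems = {}
    for word in words:
        unknown = sorted({c for c in word if c not in characters})
        if unknown:
            problems[word] = unknown
    return problems


lexicon = ["hello", "World", "a_b", "2019"]  # invented sample data
problems = find_unknown_characters(lexicon)

# str.find returns -1 for a missing character; modulo 256 mimics uint8 storage.
wrapped = characters.find("W") % 256
```

Lowercasing the new lexicon, or dropping entries that contain unsupported characters, before generating labels would remove such out-of-range indices.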

why decoding starts from 3rd position?

Hi,

I wonder what the theoretical basis is for starting decoding from the 3rd position. I'm referring to this line:
ctc_decode = bknd.ctc_decode(y_pred[:, 2:, :], input_length=np.ones(shape[0])*shape[1])[0][0]

In the image_ocr.py example in the Keras GitHub repository there's a comment:

# the 2 is critical here since the first couple outputs of the RNN
# tend to be garbage:

But why? And why does everyone use 2 regardless of dataset, image width, and text length?

concept of y_pred[:,2:,:] tensor?

Hi,
Why do you use y_pred[:,2:,:] instead of the whole y_pred[:,:,:] tensor? Why don't you use indices 0 and 1 along the second axis?

def evaluate(input_model):
    correct_prediction = 0
    generator = img_gen_val()

    x_test, y_test = next(generator)
    # print(" ")
    y_pred = input_model.predict(x_test)
    shape = y_pred[:, 2:, :].shape
    ctc_decode = bknd.ctc_decode(y_pred[:, 2:, :], input_length=np.ones(shape[0])*shape[1])[0][0]
    out = bknd.get_value(ctc_decode)[:, :label_len]
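
The slice follows the Keras image_ocr.py example, whose comment is quoted in another issue on this page, so the value 2 looks like an empirical convention rather than something derived from the data. Mechanically, the important part is that input_length matches the sliced sequence, which the code above does by taking the shape after slicing. Below is a pure-Python sketch of best-path decoding for one sample; `greedy_ctc_decode` is a hypothetical stand-in for greedy decoding, not the Keras implementation, and the scores are invented.

```python
def greedy_ctc_decode(probabilities, blank_index, skip_steps=2):
    """Best-path CTC decoding for one sample.

    probabilities: per-timestep class scores (timesteps x classes).
    Returns the decoded class indices and the number of timesteps used.
    """
    kept = probabilities[skip_steps:]  # same idea as y_pred[:, 2:, :]
    best = [max(range(len(step)), key=step.__getitem__) for step in kept]
    decoded = []
    previous = None
    for index in best:
        # Collapse repeated classes, then drop blanks.
        if index != previous and index != blank_index:
            decoded.append(index)
        previous = index
    return decoded, len(kept)


scores = [
    [0.9, 0.05, 0.05],  # t0: skipped
    [0.1, 0.1, 0.8],    # t1: skipped
    [0.1, 0.8, 0.1],    # t2: class 1
    [0.1, 0.8, 0.1],    # t3: class 1 again, collapsed
    [0.1, 0.1, 0.8],    # t4: blank (last class)
    [0.7, 0.2, 0.1],    # t5: class 0
]
labels, input_length = greedy_ctc_decode(scores, blank_index=2)
```

Calling the function with `skip_steps=0` on the same scores also decodes the first timestep, which shows how an unreliable early output can add a spurious leading character.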

Problem related to Text Detection

I tried training the EAST text detection algorithm, but it did not detect all the lines in a document.
I also tried the CTPN model on the Pascal VOC dataset.
It works really well for fonts of a certain size, but when the font in a document is too small, it fails to detect lines properly.
Any suggestions?

input and output name

Thanks again for sharing your code with us.
I enjoy working with your code and want to use it on my phone,
so I need to convert the Keras model to Core ML.
This is the function I need to call:


def convert(model,
            mode=None,
            image_input_names=[],
            preprocessing_args={},
            image_output_names=[],
            deprocessing_args={},
            class_labels=None,
            predicted_feature_name='classLabel',
            add_custom_layers = False,
            custom_conversion_functions = {})

Could you help me choose the arguments for your model, for example what image_input_names should be?

Thanks in advance for taking the time :)
