
crnn's People

Contributors

belval · cospel · dependabot[bot] · mrtpk · wangershi


crnn's Issues

ctc_beam_search_decoder

decoded, log_prob = tf.nn.ctc_beam_search_decoder(logits, seq_len)

Actually, I think you should set the parameter merge_repeated to False, so that the result keeps repeated characters: decoded, log_prob = tf.nn.ctc_beam_search_decoder(logits, seq_len, merge_repeated=False).

That way the ground-truth text 'semantically' in your readme.md will not be recognized as 'semanticaly'.
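For readers following along, here is a plain-Python sketch (not the repository's TensorFlow code) of why merging repeats in the decoder *output* loses genuine double letters. The '-' blank symbol is a stand-in for the CTC blank:

```python
BLANK = "-"  # hypothetical blank symbol for this sketch

def collapse_alignment(path):
    """Standard CTC collapse: drop consecutive duplicates, then drop blanks."""
    out = []
    prev = None
    for ch in path:
        if ch != prev:
            out.append(ch)
        prev = ch
    return "".join(c for c in out if c != BLANK)

def merge_repeated_output(text):
    """What merge_repeated=True effectively does to the decoded string."""
    out = []
    prev = None
    for ch in text:
        if ch != prev:
            out.append(ch)
        prev = ch
    return "".join(out)

# A valid alignment for "ally": the blank between the two 'l's keeps them distinct.
path = "al" + BLANK + "ly"
decoded = collapse_alignment(path)
print(decoded)                        # "ally"
print(merge_repeated_output(decoded)) # "aly" -- the double letter is gone
```

The first collapse is part of CTC itself and is correct; it is the second, output-level merge that turns "semantically" into "semanticaly".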

Test shows nothing and date detection

Hello,
I have two questions:

  1. When I run the command
    python run.py -ex ../samples/ -m ./save/ --test --restore it only shows the following pic... with 0. What does that mean?
  2. Can I use the CRNN to detect a date like the one I attached?
    Thanks
    num4

error

pretrained model

Hello, sorry to trouble you again. Can the pretrained model be tested on Chinese?
When I test on Chinese I get this error:
File "/Users/liufengnan/workspace/OCR/CRNN/CRNN/utils.py", line 48, in <listcomp> return [config.CHAR_VECTOR.index(x) for x in label] ValueError: substring not found
I then changed CHAR_VECTOR in config.py to use Chinese characters, and got a shape error:
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [512,3992] rhs shape= [512,70] [[Node: save/Assign = Assign[T=DT_FLOAT, _class=["loc:@W"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](W, save/RestoreV2)]]

I hope you can understand my English; it is not good.
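A rough sketch of where the [512, 70] vs. [512, 3992] mismatch comes from: the final projection layer has one output column per class (character set size plus one for the CTC blank), so changing CHAR_VECTOR changes the variable's shape and the old checkpoint no longer fits. The exact character counts below are assumptions for illustration only:

```python
# Hypothetical character sets: 69 Latin chars -> 70 classes,
# 3991 Chinese chars -> 3992 classes (matching the error message shapes).
english_char_vector = "x" * 69
chinese_char_vector = "x" * 3991

def num_classes(char_vector):
    # One output class per character, plus one for the CTC blank label.
    return len(char_vector) + 1

print(num_classes(english_char_vector))  # 70: the pretrained checkpoint's shape
print(num_classes(chinese_char_vector))  # 3992: needs training from scratch
```

So the pretrained weights cannot simply be restored into a graph built for a different CHAR_VECTOR; a Chinese model has to be trained from scratch.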

resize function bug

Question describe

  • Bug position: data_manager.py -> resize_image function

  • code:

    im_arr = imread(image, mode='L')
    r, c = np.shape(im_arr)
    if c > input_width:
        c = input_width
        ratio = float(input_width) / c
        final_arr = imresize(im_arr, (int(32 * ratio), input_width))
    else:
        final_arr = np.zeros((32, input_width))
        ratio = 32.0 / r
        im_arr_resized = imresize(im_arr, (32, int(c * ratio)))
        final_arr[:, 0:np.shape(im_arr_resized)[1]] = im_arr_resized
    return final_arr, c
  • detail:
    If the shape of im_arr is (22, 92), the program takes the else branch, so im_arr_resized has shape (32, int(c * (32 / r))), i.e. (32, int(92 * (32 / 22))) = (32, 133).

    At final_arr[:, 0:np.shape(im_arr_resized)[1]] = im_arr_resized you then assign into final_arr[:, 0:133], but final_arr is only input_width = 100 columns wide, so the assignment fails.
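One possible fix, sketched below with a minimal nearest-neighbour resize standing in for scipy's imresize (this is an illustration, not the repository's code): clamp the resized width to the canvas width before the assignment. The helper and function names here are hypothetical.

```python
import numpy as np

def _nn_resize(arr, shape):
    """Minimal nearest-neighbour resize standing in for scipy's imresize."""
    r, c = arr.shape
    rows = np.arange(shape[0]) * r // shape[0]
    cols = np.arange(shape[1]) * c // shape[1]
    return arr[rows][:, cols]

def resize_image_fixed(im_arr, input_width, height=32):
    r, c = im_arr.shape
    ratio = float(height) / r
    # Clamp the target width to the canvas width to avoid the overflow.
    target_width = min(int(c * ratio), input_width)
    resized = _nn_resize(im_arr, (height, target_width))
    final_arr = np.zeros((height, input_width), dtype=np.uint8)
    final_arr[:, :target_width] = resized
    return final_arr, target_width

# The failing case from the report: a (22, 92) image with input_width = 100.
im = np.ones((22, 92), dtype=np.uint8)
out, w = resize_image_fixed(im, 100)
print(out.shape, w)  # (32, 100) 100 -- the 133-wide result was clamped to 100
```

Note the original if-branch has a second problem: it sets c = input_width before computing ratio = float(input_width) / c, so that ratio is always 1.0; computing the ratio from the original dimensions, as above, avoids that too.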

The Error rate is always 1.0

Hello! I use your network to train on Chinese and English. The CHAR_VECTOR length is 1000+, e.g. '®·、。〇《》一七万三上下不专且世业东丝两个中丰串临丶主丽举久义乐乒乔九也习书买了事' and so on.

The training process is as follows:
---- 50 ----
GT: 水果捞
PREDICT:
---- 50 ----
GT: 品牌电脑
PREDICT:
---- 50 ----
[50] Iteration loss: 482.9037551879883 Error rate: 1.0
step: 51
Every PREDICT line is blank.
Where is the problem?

How to predict a picture

After training my model, I want to predict the characters in a picture. But when I use the command python3 run.py -ex ../data/test --test --restore, it shows the following results.

Restoring
Checkpoint is valid
0
Loading data
Testing

I want to know how to predict a picture. Thanks!

loss fall slow

Hello @Belval. I want to train a new model to recognize captchas. I use 3000 samples, but the loss falls slowly. Can you give me some advice and tricks? More epochs? Thank you!

Training time

Hi, can you estimate the training time for 100k examples on a GTX 1080? I started it and it seems very slow. Thanks.

fail to test

How do I use the pretrained model in the "save" folder? I have tried many times. Thank you for your reply.

sequence_length(0) <= 15

Hello, when I tried to train the model I got the following:
InvalidArgumentError (see above for traceback): sequence_length(0) <= 15. It happened at loss = tf.nn.ctc_loss(targets, logits, seq_len) when feeding data.

Could not find ../data/test

Hello, I wanted to try this command:
python3 run.py -ex ../data/test --test --restore
But there is no ../data/test folder in the CRNN directory.
Could you help me?

What does seq_len mean?

Thank you for sharing this wonderful project. It works quite well for my number-recognition problem.
But I have some confusion.

In feed_dict, self.__seq_len: [self.__max_char_count] * self.__data_manager.batch_size,
and
max_char_count = reshaped_cnn_output.get_shape().as_list()[1].
I don't quite understand that. Shouldn't seq_len be the per-image width after the CNN part, with shape (batch_size, ?)?

Sorry for my poor English.
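A sketch of why seq_len can be a single constant repeated per example: after the CNN, the feature map has shape (batch, time_steps, features), and since every image is padded to the same input_width, every example yields the same number of time steps. The downsampling factor below is an assumption for illustration, not the repository's exact value:

```python
input_width = 100
cnn_downsampling = 4   # hypothetical product of the CNN's horizontal strides
batch_size = 8

# Number of time steps the CNN produces for a fixed-width (padded) image.
max_char_count = input_width // cnn_downsampling
# CTC's sequence_length input: one entry per example, all identical here.
seq_len = [max_char_count] * batch_size

print(max_char_count)             # 25
print(len(seq_len), set(seq_len)) # 8 {25}
```

So seq_len is the *input* length to CTC (time steps after the CNN), not the label length, which is why it does not vary per image in this implementation.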

Can you give me some advice?:)

Hello Belval, it's me again~ My mentor wants me to implement a CRNN with TensorFlow, but I'm not good at coding and don't know how to start this project. Can you give me some advice? Thanks very, very much!

train result question

When I train the model, I use 200000W words, batch_size = 128, adam = 0.00001, and
epoch = 100.
The results look like this:
nonlister
nonlaterzzzzzzzz
neep
miepzzzzzzzz
I don't know why the results are not good.

Looking forward to your reply

Why is logits = tf.transpose(logits, (1, 0, 2)) needed? The original order is [batch, time, class].
And why self.__seq_len: [self.__max_char_count] * self.__data_manager.batch_size? I think seq_len should be a variable number equal to each target sequence's length.
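On the first question, a numpy sketch of what the transpose does: TF 1.x's tf.nn.ctc_loss expects time-major inputs [max_time, batch, num_classes] by default, so the batch-major CNN output [batch, time, classes] is permuted with (1, 0, 2). Sizes below are illustrative:

```python
import numpy as np

batch, time_steps, num_classes = 4, 25, 70  # illustrative sizes
logits = np.zeros((batch, time_steps, num_classes))

# Permute batch-major [batch, time, classes] to time-major [time, batch, classes].
time_major = np.transpose(logits, (1, 0, 2))
print(time_major.shape)  # (25, 4, 70)
```

On the second question: seq_len is the number of *input* time steps per example, not the target length; since all images are padded to the same width, it is the same constant for every example in the batch.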

Testing problem

When I test the result with python3 run.py -ex ../out --test --restore, I can't get any results. The out folder contains pictures generated by the code from your repo (TextRecognitionDataGenerator).
The console just outputs these.
Restoring
Checkpoint is valid
0
Loading data
Testing

Process finished with exit code 0

Thank you for helping me!

the resize_image function in utils should add "dtype=np.uint8"

def resize_image(image, input_width):
    im_arr = imread(image, mode='L')
    r, c = np.shape(im_arr)

    if c > input_width:
        c = input_width
        ratio = float(input_width) / c
        final_arr = imresize(im_arr, (int(32 * ratio), input_width))
    else:
        final_arr = np.zeros((32, input_width))

The last line should become:

    final_arr = np.zeros((32, input_width), dtype=np.uint8)

How to train a new model

Hi , Belval !
First, I want to say thanks; your code helped me a lot. But I have some problems.
I used your pretrained model to recognise some numbers and symbols like "+", but the results were not good, so I want to train my own model to do my work. I don't know how to train it; can you give me some guidance?
Hoping for your reply. Thanks!!

Questions about the graph

I have a little question about the part below. Does this mean you slice it along the first axis, i.e. the batch-size dimension? According to the paper, shouldn't it be sliced along the 'w' dimension?

        def MapToSequences(x):
            x = tf.squeeze(x, [1])
            x = tf.unstack(x)
            return x
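A numpy sketch of what the squeeze/unstack pair does: tf.unstack defaults to axis=0, so whichever dimension is first after the squeeze is the one that gets sliced. If the tensor is still batch-major at that point, the slices are per-example; slicing along 'w', as the paper describes, would need axis=1 or an earlier transpose. Shapes below are illustrative:

```python
import numpy as np

x = np.zeros((4, 1, 25, 512))  # (batch, height=1, width, channels)
x = np.squeeze(x, axis=1)      # (batch, width, channels)

# Equivalent of tf.unstack(x) with the default axis=0: per-example slices.
per_batch = [x[i] for i in range(x.shape[0])]
# Slicing along the 'w' axis instead: per-timestep slices.
per_width = [x[:, t] for t in range(x.shape[1])]

print(len(per_batch), per_batch[0].shape)  # 4 (25, 512)
print(len(per_width), per_width[0].shape)  # 25 (4, 512)
```

Whether the repository's code is correct therefore depends on whether the tensor has been transposed to width-major before MapToSequences is applied.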

TypeError: object of type 'numpy.int32' has no len()

When I run testing I get the error below:

(vinayak) C:\Users\vinayak\Documents\github\CRNN-1\CRNN>python run.py -ex ../samples --test --restore
Restoring
Checkpoint is valid
0
Loading data
examples 10
Traceback (most recent call last):
  File "run.py", line 118, in <module>
    main()
  File "run.py", line 111, in main
    args.restore
  File "C:\Users\vinayak\Documents\github\CRNN-1\CRNN\crnn.py", line 53, in __init__
    self.__data_manager = DataManager(batch_size, model_path, examples_path, max_image_width, train_test_ratio, self.__max_char_count)
  File "C:\Users\vinayak\Documents\github\CRNN-1\CRNN\data_manager.py", line 26, in __init__
    self.test_batches = self.__generate_all_test_batches()
  File "C:\Users\vinayak\Documents\github\CRNN-1\CRNN\data_manager.py", line 108, in __generate_all_test_batches
    (-1)
  File "C:\Users\vinayak\Documents\github\CRNN-1\CRNN\utils.py", line 17, in sparse_tuple_from
    indices.extend(zip([n]*len(seq), [i for i in range(len(seq))]))
TypeError: object of type 'numpy.int32' has no len()
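A minimal numpy reproduction sketch of this TypeError (label values below are hypothetical): if the label array is flattened with np.reshape(..., (-1)), each element handed to sparse_tuple_from is a numpy scalar, and len() on a scalar raises; passing the array of per-example label sequences keeps len() working.

```python
import numpy as np

# Hypothetical encoded labels: two examples of two characters each.
labels = np.array([[1, 2], [3, 4]])

flattened = np.reshape(labels, (-1))
try:
    len(flattened[0])  # a numpy integer scalar: len() raises TypeError
    scalar_has_len = True
except TypeError:
    scalar_has_len = False

print(scalar_has_len)                # False
print([len(seq) for seq in labels])  # [2, 2] -- each row still has a length
```

This matches the (-1) reshape visible in the traceback's data_manager.py frame, which is why sparse_tuple_from ends up calling len() on a scalar.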

Use effect

Can I see the actual recognition results of your program after training?

continuous repetition letters of the OCR

Hi Belval, when I retrained the CRNN code I found that if the ground truth contains consecutive repeated characters, the prediction often misses the repetition and outputs a single character. For example, for the ground truths '0870011' and '37075337' the predictions are '08701' and '3707537'. If there are no consecutive repeated characters, the result is correct. My training data contains only digits, generated with your project https://github.com/Belval/TextRecognitionDataGenerator.
Is this a problem with the CTC? How can I solve it?
Thanks!

datamanager.py

In def __generate_all_train_batches(self),
batch_dt = sparse_tuple_from(
np.reshape(
np.array(raw_batch_la),
(-1)
)
)
raises an error: an int has no len attribute.
Changing it to
batch_dt = sparse_tuple_from(
np.array(raw_batch_la)
)
makes training run, but training is extremely slow: after 10 hours the loss has not decreased at all.

testing the CRNN

I'm getting different output every time I run the code to test the pretrained model, and it's not correct.
Kindly help.
Thanks

python run.py -ex ../samples --test --restore

I just want to test it, so I changed ../data/test to ../samples; the samples folder also contains some images.
But I don't get anything except the following:

Restoring
Checkpoint is valid
0
Loading data
Testing

Then it exits.

Large loss with slow descent

Hi Belval! When I use your code to train on my dataset of 100,000 pictures, I get a large loss that descends slowly, as follows:

problem

How can I solve this problem?
Thank you!

The loss always be inf

I trained with the data generated by your tool TextRecognitionDataGenerator.
After 100 iterations the loss is still always inf. I'm wondering why; thanks a lot.
