belval / crnn
A TensorFlow implementation of https://github.com/bgshih/crnn
License: MIT License
Line 167 in 0633495
Actually, I think you should set the parameter merge_repeated to False, so the result keeps repeated characters: decoded, log_prob = tf.nn.ctc_beam_search_decoder(logits, seq_len, merge_repeated=False)
Otherwise the ground-truth text 'semantically' in your readme.md will be recognized as 'semanticaly'.
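To illustrate the difference, here is a minimal pure-Python sketch (not this repo's code) of the two behaviours. Standard CTC collapsing merges repeats only within the alignment path, so a blank between two identical labels keeps both; merging repeats in the already-decoded output (what merge_repeated=True effectively does) loses a legitimate double letter.

```python
def merge_repeated_labels(labels):
    """Merge adjacent identical labels in the decoded output:
    a legitimate double letter is lost."""
    out = []
    for ch in labels:
        if not out or out[-1] != ch:
            out.append(ch)
    return "".join(out)

def ctc_collapse(path, blank="-"):
    """Standard CTC collapsing: drop repeats within the alignment path,
    then drop blanks. A blank between two identical labels keeps both."""
    out = []
    prev = None
    for ch in path:
        if ch != prev and ch != blank:
            out.append(ch)
        prev = ch
    return "".join(out)

print(merge_repeated_labels("semantically"))    # the double 'l' is merged away
print(ctc_collapse("s-e-m-a-n-t-i-c-a-l-l-y"))  # blanks separate the two 'l's
```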
Hello, sorry to trouble you again. Can the pretrained model be used to test Chinese text?
When I test with Chinese I get this error:
File "/Users/liufengnan/workspace/OCR/CRNN/CRNN/utils.py", line 48, in <listcomp> return [config.CHAR_VECTOR.index(x) for x in label] ValueError: substring not found
I then changed CHAR_VECTOR in config.py to use Chinese characters, but got a shape error:
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [512,3992] rhs shape= [512,70] [[Node: save/Assign = Assign[T=DT_FLOAT, _class=["loc:@W"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](W, save/RestoreV2)]]
I hope you can understand my English; it is not very good.
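The shape mismatch comes from the output layer: its width depends on the number of characters in CHAR_VECTOR, so a checkpoint trained on one alphabet cannot be restored into a graph built for another. A sketch of the relationship, assuming (as is usual with TensorFlow's CTC ops, and consistent with the 70 vs. 3992 in the error) one extra class for the CTC blank:

```python
NUM_HIDDEN = 512  # width of the layer feeding the output, per the error's [512, ...]

def output_weight_shape(char_vector):
    """The output weight matrix W has shape [NUM_HIDDEN, num_classes],
    where num_classes = len(char_vector) + 1 (the +1 is the CTC blank)."""
    return (NUM_HIDDEN, len(char_vector) + 1)

# A checkpoint saved with a 69-character CHAR_VECTOR stores W as [512, 70];
# a graph built for ~3991 Chinese characters expects [512, 3992], hence the
# InvalidArgumentError. After changing CHAR_VECTOR you must train from scratch.
```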
Bug position: data_manager.py -> resize_image function
code:
im_arr = imread(image, mode='L')
r, c = np.shape(im_arr)
if c > input_width:
    c = input_width
    ratio = float(input_width) / c
    final_arr = imresize(im_arr, (int(32 * ratio), input_width))
else:
    final_arr = np.zeros((32, input_width))
    ratio = 32.0 / r
    im_arr_resized = imresize(im_arr, (32, int(c * ratio)))
    final_arr[:, 0:np.shape(im_arr_resized)[1]] = im_arr_resized
return final_arr, c
detail:
If the shape of im_arr is (22, 92), the program runs the 'else' branch, so the shape of im_arr_resized will be (32, int(c * (32 / r))), i.e. (32, int(92 * (32 / 22))) = (32, 133).
At the line final_arr[:, 0:np.shape(im_arr_resized)[1]] = im_arr_resized, you assign final_arr[:, 0:133] = im_arr_resized, but the maximum width of final_arr is only 100 (input_width), so this fails.
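One possible fix, sketched here with a numpy-only stand-in for the (now deprecated) scipy imresize: clamp the resized width to input_width before writing into final_arr. Function names and the nearest-neighbour resize are mine, not the repo's.

```python
import numpy as np

def nn_resize(arr, out_h, out_w):
    """Stand-in nearest-neighbour resize (the repo uses scipy's imresize)."""
    r, c = arr.shape
    rows = np.arange(out_h) * r // out_h
    cols = np.arange(out_w) * c // out_w
    return arr[rows][:, cols]

def resize_image_fixed(im_arr, input_width):
    """Sketch of resize_image with the overflow fixed: the resized width
    is clamped to input_width before the assignment."""
    r, c = im_arr.shape
    if c > input_width:
        c = input_width
        final_arr = nn_resize(im_arr, 32, input_width)
    else:
        final_arr = np.zeros((32, input_width), dtype=im_arr.dtype)
        ratio = 32.0 / r
        resized = nn_resize(im_arr, 32, int(c * ratio))
        w = min(resized.shape[1], input_width)  # clamp: avoids the broadcast error
        final_arr[:, :w] = resized[:, :w]
    return final_arr, c
```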
Hello! I use your network to train on Chinese and English. My char_vector is 1000+ characters long, like '®·、。〇《》一七万三上下不专且世业东丝两个中丰串临丶主丽举久义乐乒乔九也习书买了事' and so on.
The training process is as follows:
---- 50 ----
GT: 水果捞
PREDICT:
---- 50 ----
GT: 品牌电脑
PREDICT:
---- 50 ----
[50] Iteration loss: 482.9037551879883 Error rate: 1.0
step: 51
PREDICT is always blank.
Where is the problem?
It seems that it only uses the CPU to train the model...
When I have trained my model, I want to predict the characters in a picture. But when I use the command 'python3 run.py -ex ../data/test --test --restore', it shows the following results.
Restoring
Checkpoint is valid
0
Loading data
Testing
I want to know how to predict on a picture. Thanks!
What if the character counts of my training samples are different? How should I train?
Hello @Belval. I want to train a new model to recognize captchas. I use 3000 samples, but the loss falls slowly. Can you give me some advice and tricks? Should I add epochs? Thank you!
How can I modify the Connectionist Temporal Classification (CTC) layer of the network to also give a confidence score?
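One option, for what it's worth: tf.nn.ctc_beam_search_decoder already returns a log_prob score alongside the decoded paths, which can be exponentiated into a probability. For a greedy decode, a crude alternative is the product of the per-timestep maximum softmax probabilities. A numpy-only sketch of the greedy variant (not code from this repo):

```python
import numpy as np

def greedy_confidence(probs):
    """probs: array of shape (time_steps, num_classes), each row a softmax
    distribution. Returns the probability of the greedy path, usable as a
    rough confidence score for the prediction."""
    best = probs.max(axis=1)
    # sum of logs rather than a direct product, for numerical stability
    return float(np.exp(np.log(best).sum()))
```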
Hello @Belval. How could I solve an overfitting problem? Thanks!
Hi, can you give an approximate training time for 100k examples on a GTX 1080? I started it and it seems very slow. Thanks.
How do I use the pretrained model in the "save" folder? I have tried many times. Thank you for your reply.
Hello, when I tried to train the model, I got the following:
InvalidArgumentError (see above for traceback): sequence_length(0) <= 15. It happened at the line loss = tf.nn.ctc_loss(targets, logits, seq_len) when feeding data.
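This error usually means a label is longer than the model's number of time steps: CTC needs at least len(label) frames, plus one extra frame for every pair of adjacent identical characters (a blank must separate them). One way to guard against it, as a sketch with helper names of my own choosing (not the repo's):

```python
def min_time_steps(label):
    """Minimum CTC time steps for a label: its length plus one blank
    for every pair of adjacent identical characters."""
    repeats = sum(1 for a, b in zip(label, label[1:]) if a == b)
    return len(label) + repeats

def filter_fitting(samples, time_steps):
    """Drop (image, label) pairs whose label cannot fit in time_steps."""
    return [(img, lab) for img, lab in samples
            if min_time_steps(lab) <= time_steps]
```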
Hello, I want to try this command:
python3 run.py -ex ../data/test --test --restore
But I found there is no ../data/test in the CRNN folder.
Could you help me?
I want to know if there have been any developments in this LSTM+CTC project; I notice it was established two years ago.
Waiting for your response!
Thank you for sharing this wonderful project. It works quite well for my number recognition problem.
But I have some confusion.
In feed_dict, self.__seq_len: [self.__max_char_count] * self.__data_manager.batch_size,
and
max_char_count = reshaped_cnn_output.get_shape().as_list()[1].
I don't quite understand this. Shouldn't seq_len be the per-image width after the CNN part, i.e. of shape (batch_size, ?)?
Sorry for my poor English.
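For what it's worth: seq_len here is the number of time steps the CNN produces, and because every image is resized and padded to the same input_width, that count is identical for every item in a batch, so one constant repeated batch_size times is enough. A sketch of that construction (function name is mine, not the repo's):

```python
def make_seq_len(cnn_output_shape, batch_size):
    """cnn_output_shape: (batch, time_steps, features) after reshaping the
    CNN output. Since all images share input_width, time_steps is constant,
    so seq_len is that constant repeated once per batch item."""
    time_steps = cnn_output_shape[1]  # corresponds to max_char_count here
    return [time_steps] * batch_size
```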
Hello Belval, it's me again~ My mentor wants me to implement CRNN with TensorFlow... But I'm not good at coding and I don't know how to start this project. Can you give me some advice? Thanks very, very much!
When I train the model, I use 200000W words, batch_size = 128, Adam learning rate = 0.00001, and epoch = 100.
The results look like this:
nonlister
nonlaterzzzzzzzz
neep
miepzzzzzzzz
I don't know why the results are not good.
Hello, I want to run your project, but I don't know which dataset to choose. Can you help me? Thanks a lot @Belval
logits = tf.transpose(logits, (1, 0, 2)) — why? The original order is [batch, time, class].
self.__seq_len: [self.__max_char_count] * self.__data_manager.batch_size — why? I think seq_len should be a variable number equal to each individual target sequence length.
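On the first question: TF 1.x's CTC ops (tf.nn.ctc_loss and the beam-search decoder) expect time-major input of shape [max_time, batch_size, num_classes], which is why the [batch, time, class] logits are transposed. On the second: seq_len is the number of CNN time steps fed to the CTC loss (the input length), not the target length; target lengths are carried by the sparse targets tensor. A numpy shape check of the same permutation (sizes are examples only):

```python
import numpy as np

batch, time_steps, num_classes = 2, 24, 37  # example sizes, not the repo's
logits = np.zeros((batch, time_steps, num_classes))
time_major = np.transpose(logits, (1, 0, 2))  # same perm as tf.transpose(logits, (1, 0, 2))
assert time_major.shape == (time_steps, batch, num_classes)
```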
When I test with python3 run.py -ex ../out --test --restore, I can't get any results. The out folder contains pictures generated by the code from your repo (TextRecognitionDataGenerator).
The console just outputs this:
Restoring
Checkpoint is valid
0
Loading data
Testing
Process finished with exit code 0
Thank you for helping me!
def resize_image(image, input_width):
    im_arr = imread(image, mode='L')
    r, c = np.shape(im_arr)
    if c > input_width:
        c = input_width
        ratio = float(input_width) / c
        final_arr = imresize(im_arr, (int(32 * ratio), input_width))
    else:
        final_arr = np.zeros((32, input_width))

The last line above should be changed to:
        final_arr = np.zeros((32, input_width), dtype=np.uint8)
If, when we test, the name of the image already contains the result, why do I need to train it?
Hi, Belval!
First, I want to say thanks. Your code has helped me a lot, but I have some problems.
I used your pretrained model to recognize some numbers and symbols like "+", but the results are not good, so I want to train my own model for my task. However, I don't know how to train; can you give me some guidance?
Hoping for your reply. Thanks!!
I have a little question about the part below. Does this mean you slice it along the first axis, i.e. along the batch-size dimension? But according to the paper, shouldn't it be sliced along the 'w' dimension?
def MapToSequences(x):
x = tf.squeeze(x, [1])
x = tf.unstack(x)
return x
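For reference, tf.unstack defaults to axis=0, so after the squeeze this does slice along the batch dimension; slicing along 'w' would need axis=1 (or a transpose first). A numpy sketch of the shapes involved (sizes are examples, not the repo's):

```python
import numpy as np

x = np.zeros((4, 1, 24, 512))  # (batch, height=1, w, channels)
squeezed = x.squeeze(axis=1)   # (4, 24, 512), like tf.squeeze(x, [1])

# tf.unstack default axis=0: slices along batch -> 4 arrays of (24, 512)
per_batch = list(squeezed)
# slicing along 'w' instead would need axis=1 -> 24 arrays of (4, 512)
per_w = [squeezed[:, i] for i in range(squeezed.shape[1])]

assert len(per_batch) == 4 and per_batch[0].shape == (24, 512)
assert len(per_w) == 24 and per_w[0].shape == (4, 512)
```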
When I run testing I get the error below:
(vinayak) C:\Users\vinayak\Documents\github\CRNN-1\CRNN>python run.py -ex ../samples --test --restore
Restoring
Checkpoint is valid
0
Loading data
examples 10
Traceback (most recent call last):
File "run.py", line 118, in <module>
main()
File "run.py", line 111, in main
args.restore
File "C:\Users\vinayak\Documents\github\CRNN-1\CRNN\crnn.py", line 53, in __init__
self.__data_manager = DataManager(batch_size, model_path, examples_path, max_image_width, train_test_ratio, self.__max_char_count)
File "C:\Users\vinayak\Documents\github\CRNN-1\CRNN\data_manager.py", line 26, in __init__
self.test_batches = self.__generate_all_test_batches()
File "C:\Users\vinayak\Documents\github\CRNN-1\CRNN\data_manager.py", line 108, in __generate_all_test_batches
(-1)
File "C:\Users\vinayak\Documents\github\CRNN-1\CRNN\utils.py", line 17, in sparse_tuple_from
indices.extend(zip([n]*len(seq), [i for i in range(len(seq))]))
TypeError: object of type 'numpy.int32' has no len()
What result do you get on ICDAR2013? My result is lower than the paper's.
What is the format of your training data? I want to train your model from scratch. Thanks a lot.
max_width was 256 during training; can I use an input of 1024 at prediction time, or is this parameter fixed?
Can I see the actual recognition effect of your program after training?
I know you add a character count limit (24) when loading pictures, and after the CNN part the image becomes a tensor of size 24x512.
If I want to recognize longer text, what should I do?
I generated training data using TextRecognitionDataGenerator.
ValueError: substring not found
When running data_manager, a bug occurs caused by pictures named 'A&P_58395.jpg', 'R&D_14671.jpg', ...
I think that's because '&' is a special character in Linux shells.
Hi Belval, when I retrained the CRNN code, I found that if the text contained consecutive repeated letters, the result often dropped the repetition and output a single letter. For example, with ground truths '0870011' and '37075337', the predictions are '08701' and '3707537'. If there are no consecutive repeated letters, the result is correct. My training data only includes digits, generated with your project https://github.com/Belval/TextRecognitionDataGenerator.
Is this a problem with the CTC? How can I solve it?
Thanks!
The output shows:
Restoring
Checkpoint is valid
0
Loading data
Testing
Process finished with exit code 0
In def __generate_all_train_batches(self):
batch_dt = sparse_tuple_from(
np.reshape(
np.array(raw_batch_la),
(-1)
)
)
This raises an error: int object has no len attribute.
After changing it to
batch_dt = sparse_tuple_from(
np.array(raw_batch_la)
)
training works, but it is extremely slow: after 10 hours the loss has not decreased at all.
I'm getting a different output every time I run the code to test the pretrained model, and it's not correct.
Kindly help.
Thanks
Then it exits.
When there are two consecutive identical characters, one is dropped:
For example, 'aaby' is recognized as 'aby', and '887y' as '87y'.
I wanted to create some images with your generator with
python3 run.py -c 200000 -w 1 -t 8
as documented, but wasn't able to, since run.py is missing there. Did something change in the meantime?
My labels contain Arabic/Urdu text.
For example "اسلام آباد : چیئرمین رضابانی کی زیر صدارت سینیٹ کا اجلاس"
What changes are required to train the model with non-English labels?
I trained on data generated by your tool TextRecognitionDataGenerator.
After 100 iterations the loss is still always inf. I'm wondering why, thanks a lot.
This code is relatively old, uses a lot of deprecated APIs, and could use a refactor in order to be maintainable.