
a-crnn-model-for-text-recognition-in-keras's Introduction

A-CRNN-model-for-Text-Recognition-in-Keras

To understand the algorithm used in the model follow these blogs:

a-crnn-model-for-text-recognition-in-keras's Issues

Will you tell me what 'Lambda' does in your model architecture?

In the CRNN Model.ipynb notebook, you have this line of code in cell 7:

squeezed = Lambda(lambda x: K.squeeze(x, 1))(conv_7)

I couldn't understand what it is for or what it does. Could you please explain it?
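One way to see what the Lambda layer does is to trace tensor shapes through the convolutional stack. The sketch below is plain Python arithmetic, not Keras. The helper names (conv_same, pool, conv_valid, squeeze_axis) are made up for illustration, and the layer parameters are copied from the architecture quoted in a later issue on this page. After conv_7 the height axis has size 1, so K.squeeze(x, 1) drops it and leaves a (batch, 31, 512) sequence of the kind the LSTM layers expect.

```python
# Shape bookkeeping only: tuples are (batch, height, width, channels).

def conv_same(shape, filters):
    """3x3 convolution with padding='same' keeps height and width."""
    b, h, w, _ = shape
    return (b, h, w, filters)

def pool(shape, ph, pw):
    """Max pooling whose stride equals the pool size."""
    b, h, w, c = shape
    return (b, h // ph, w // pw, c)

def conv_valid(shape, filters, kh, kw):
    """Convolution without padding shrinks each axis by kernel - 1."""
    b, h, w, _ = shape
    return (b, h - kh + 1, w - kw + 1, filters)

def squeeze_axis(shape, axis):
    """Drop an axis of size 1, mirroring the effect of K.squeeze(x, axis) on the shape."""
    if shape[axis] != 1:
        raise ValueError("can only squeeze an axis of size 1")
    return shape[:axis] + shape[axis + 1:]

shape = ("batch", 32, 128, 1)       # inputs
shape = conv_same(shape, 64)        # conv_1
shape = pool(shape, 2, 2)           # pool_1
shape = conv_same(shape, 128)       # conv_2
shape = pool(shape, 2, 2)           # pool_2
shape = conv_same(shape, 256)       # conv_3
shape = conv_same(shape, 256)       # conv_4
shape = pool(shape, 2, 1)           # pool_4
shape = conv_same(shape, 512)       # conv_5 (batch norm keeps the shape)
shape = conv_same(shape, 512)       # conv_6
shape = pool(shape, 2, 1)           # pool_6
conv_7_shape = conv_valid(shape, 512, 2, 2)
squeezed_shape = squeeze_axis(conv_7_shape, 1)

print(conv_7_shape)    # ('batch', 1, 31, 512)
print(squeezed_shape)  # ('batch', 31, 512)
```

The 31 in the squeezed shape is the number of time steps, which matches the value 31 appended to train_input_length in the notebook's preprocessing loop.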

This is what I'm getting in out. Can anyone comment on this?
[-1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1]
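In the notebook's decoding step, K.ctc_decode pads each decoded row with -1, so a row made up only of -1 means that nothing was decoded for that image: the blank class won at every time step. The following is a simplified pure-Python sketch of greedy CTC decoding written for illustration. It is not the Keras implementation, and greedy_ctc_decode and its arguments are hypothetical names. It assumes the blank class is the last index, as with the model's Dense(len(char_list)+1) output.

```python
def greedy_ctc_decode(probs, blank, pad_to):
    """Best-path decoding: argmax per step, merge repeats, drop blanks, pad with -1."""
    best = [max(range(len(step)), key=step.__getitem__) for step in probs]
    decoded = []
    prev = None
    for idx in best:
        if idx != prev and idx != blank:
            decoded.append(idx)
        prev = idx
    return decoded + [-1] * (pad_to - len(decoded))

# Three classes: 0 and 1 are characters, 2 is the blank.
confident = [
    [0.9, 0.05, 0.05],
    [0.9, 0.05, 0.05],   # repeat of class 0, merged with the previous step
    [0.1, 0.1, 0.8],     # blank
    [0.1, 0.8, 0.1],
]
only_blank = [[0.1, 0.1, 0.8]] * 4

print(greedy_ctc_decode(confident, blank=2, pad_to=4))   # [0, 1, -1, -1]
print(greedy_ctc_decode(only_blank, blank=2, pad_to=4))  # [-1, -1, -1, -1]
```

If the real output looks like the second case for every image, the model has most likely collapsed to predicting blanks, which often points to too little training or to a problem in how the labels were prepared.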

Dataset

The Synth90k dataset is huge (10 GB). Can you share your copy of the dataset?

ValueError: Error when checking input: expected input_4 to have 4 dimensions, but got array with shape (3, 1)

ValueError Traceback (most recent call last)
in ()
1 batch_size = 256
2 epochs = 1
----> 3 model.fit(x=[training_img, train_padded_txt, train_input_length, train_label_length], y=np.zeros(len(training_img)), batch_size=batch_size, epochs = epochs, validation_data = ([valid_img, valid_padded_txt, valid_input_length, valid_label_length], [np.zeros(len(valid_img))]), verbose = 1, callbacks = callbacks_list)

2 frames
/usr/local/lib/python3.6/dist-packages/keras/engine/training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
133 ': expected ' + names[i] + ' to have ' +
134 str(len(shape)) + ' dimensions, but got array '
--> 135 'with shape ' + str(data_shape))
136 if not check_batch_axis:
137 data_shape = data_shape[1:]

Any reason why this error is occurring and how to solve it?

Infinite loss showing after first few iterations of each epoch

Hi,
I'm facing inf loss while training. Although the validation loss keeps coming down, the training loss always ends up as inf. I have also tried lowering the learning rate to 0.00001, but it happens anyway. What validation and training loss should we generally expect once the model is fully trained on your dataset?
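One known cause of an infinite CTC loss is a label that cannot be aligned to the available time steps: CTC needs at least one time step per character, plus one blank between each pair of identical adjacent characters. Whether that is what happens here is an assumption, since the issue does not include the data. The sketch below uses hypothetical helper names and the input length of 31 from the notebook to flag such samples.

```python
def min_ctc_steps(label):
    """Smallest number of time steps CTC needs to emit `label`."""
    repeats = sum(1 for a, b in zip(label, label[1:]) if a == b)
    return len(label) + repeats

def infeasible_samples(labels, input_length=31):
    """Return (index, label) pairs that cannot fit into `input_length` steps."""
    return [(i, lab) for i, lab in enumerate(labels)
            if min_ctc_steps(lab) > input_length]

labels = ["hello", "a" * 16, "x" * 31]
print([min_ctc_steps(lab) for lab in labels])  # [6, 31, 61]
print(infeasible_samples(labels))              # only the last label is flagged
```

Running a check like this over the training labels before calling model.fit would show whether any sample needs to be dropped or the image width increased.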

Error: Expect x to be a non-empty array or dataset.

batch_size = 256
epochs = 10
model.fit(x=[training_img, train_padded_txt, train_input_length, train_label_length], y=np.zeros(len(training_img)), batch_size=batch_size, epochs = epochs, validation_data = ([valid_img, valid_padded_txt, valid_input_length, valid_label_length], [np.zeros(len(valid_img))]), verbose = 1, callbacks = callbacks_list)

While running the above code, I get the error below. How can I resolve it?

Epoch 1/10

ValueError Traceback (most recent call last)

in ()
1 batch_size = 256
2 epochs = 10
----> 3 model.fit(x=[training_img, train_padded_txt, train_input_length, train_label_length], y=np.zeros(len(training_img)), batch_size=batch_size, epochs = epochs, validation_data = ([valid_img, valid_padded_txt, valid_input_length, valid_label_length], [np.zeros(len(valid_img))]), verbose = 1, callbacks = callbacks_list)

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
1108
1109 if logs is None:
-> 1110 raise ValueError('Expect x to be a non-empty array or dataset.')
1111 epoch_logs = copy.copy(logs)
1112

ValueError: Expect x to be a non-empty array or dataset.

How to run from saved model

@TheAILearner I have trained and saved the model as "best_model.hdf5". How do I run from this saved model without rerunning the entire script?

A request for model weights

Hi,
Sorry this is not an issue but I was not sure how to contact you.
Could you please somehow give me the best model weights for what you have trained as it would save me a lot of time if I could initialize with pretrained weights during training.
my email is [email protected]

ignore_longer_outputs_than_inputs should be set to True

While training the model on the IAM dataset, an error pops up in the ctc_loss function about an output size mismatch with the 31 time steps produced by the BiLSTM layers. To work around it, in tensorflow_core set ignore_longer_outputs_than_inputs to True in the ctc_loss function (https://www.tensorflow.org/api_docs/python/tf/nn/ctc_loss).

Why can't the model.fit method take images as x and labels as y? What is the reason for passing an array of zeroes to y?

In this line of code, in CRNN Model.ipynb notebook:

batch_size = 256
epochs = 10
model.fit(x=[training_img, train_padded_txt, train_input_length, train_label_length], y=np.zeros(len(training_img)), batch_size=batch_size, epochs = epochs, validation_data = ([valid_img, valid_padded_txt, valid_input_length, valid_label_length], [np.zeros(len(valid_img))]), verbose = 1, callbacks = callbacks_list)

From what I understand,
the x argument of the fit method is assigned an array of images, the indexed text padded with zeroes, an array of 31s with the same length as the input array, and the length of each target word.
The y argument of the fit method is assigned an array of zeroes.

My doubts:

Why are zeroes being sent to y and not train_padded_txt, since that is the array that contains the text or target labels?
Why is x being assigned all of those things? What is the reason? Why can't it just be:
model.fit(x=training_img, y=train_padded_txt,,,...)

Please help me understand these. Thanks in advance and very nice work and nice of you to open source this.
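The reason can be read off the training model defined in the notebook: the CTC loss is computed inside the graph by the 'ctc' Lambda layer, so the model's output already is the per-sample loss, and the model is compiled with loss={'ctc': lambda y_true, y_pred: y_pred}. That loss function ignores y_true, so Keras only needs a y of the right length. The labels have to reach the Lambda layer as model inputs, which is why they are passed in x. A toy illustration in plain Python (not Keras; the numbers are made up):

```python
def passthrough_loss(y_true, y_pred):
    """Same form as the notebook's compile-time lambda: y_true is never used."""
    return y_pred

# Pretend these are per-sample CTC losses produced by the 'ctc' layer.
ctc_layer_output = [12.5, 3.0, 7.25]

dummy_zeros = [0.0, 0.0, 0.0]
dummy_other = [99.0, -1.0, 42.0]

# Whatever is passed as y, the value being minimised is the same.
print(passthrough_loss(dummy_zeros, ctc_layer_output))
print(passthrough_loss(dummy_other, ctc_layer_output) == ctc_layer_output)
```

At prediction time the separate act_model, which maps images to character probabilities, is used instead.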

Please provide the trained model

It would be really helpful if you could provide the trained Keras model

All input arrays (x) should have the same number of samples.

maybe there is something wrong with this

`labels = Input(name='the_labels', shape=[max_label_len], dtype='float32')
input_length = Input(name='input_length', shape=[1], dtype='int64')
label_length = Input(name='label_length', shape=[1], dtype='int64')

def ctc_lambda_func(args):
    y_pred, labels, input_length, label_length = args

    return K.ctc_batch_cost(labels, y_pred, input_length, label_length)

loss_out = Lambda(ctc_lambda_func, output_shape=(1,), name='ctc')([outputs, labels, input_length, label_length])

#model to be used at training time
model = Model(inputs=[inputs, labels, input_length, label_length], outputs=loss_out)
model.summary()`

I don't know what is wrong. Can you help me?
I want to load my own data.
I forked the code, so you can see it.

this is the error that I get when

ValueError Traceback (most recent call last)
in
5 batch_size=batch_size, epochs = epochs,
6 validation_data = ([valid_img, valid_padded_txt, valid_input_length, valid_label_length], np.zeros(len(valid_img))),
----> 7 verbose = 1, callbacks = callbacks_list)

c:\users\yehya\appdata\local\programs\python\python36\lib\site-packages\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, **kwargs)
970 val_x, val_y,
971 sample_weight=val_sample_weight,
--> 972 batch_size=batch_size)
973 if self._uses_dynamic_learning_phase():
974 val_ins = val_x + val_y + val_sample_weights + [0.]

c:\users\yehya\appdata\local\programs\python\python36\lib\site-packages\keras\engine\training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, check_array_lengths, batch_size)
802 ]
803 # Check that all arrays have the same length.
--> 804 check_array_length_consistency(x, y, sample_weights)
805 if self._is_graph_network:
806 # Additional checks to avoid users mistakenly

c:\users\yehya\appdata\local\programs\python\python36\lib\site-packages\keras\engine\training_utils.py in check_array_length_consistency(inputs, targets, weights)
226 raise ValueError('All input arrays (x) should have '
227 'the same number of samples. Got array shapes: ' +
--> 228 str([x.shape for x in inputs]))
229 if len(set_y) > 1:
230 raise ValueError('All target arrays (y) should have '

ValueError: All input arrays (x) should have the same number of samples. Got array shapes: [(4500, 32, 200, 1), (500, 20), (500, 1), (500, 1)]
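The shapes in the last line of the traceback show the problem: the image array has 4500 samples while the label and length arrays have 500 each. Because the traceback passes through the validation branch of fit, these appear to be the validation arrays. A small check like the one below, run before model.fit, makes such mismatches obvious. It is a sketch with a hypothetical helper name; in the notebook one would pass len() of each real array.

```python
def check_same_length(named_lengths):
    """Raise a readable error if the arrays do not all have the same sample count."""
    counts = set(named_lengths.values())
    if len(counts) > 1:
        details = ", ".join(f"{name}={n}" for name, n in named_lengths.items())
        raise ValueError(f"sample counts differ: {details}")
    return counts.pop() if counts else 0

# Sample counts taken from the shapes reported in the traceback.
lengths = {
    "images": 4500,
    "padded_labels": 500,
    "input_length": 500,
    "label_length": 500,
}

try:
    check_same_length(lengths)
except ValueError as err:
    message = str(err)
    print(message)
```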

when I run your repo,

Error when checking input: expected input_1 to have 4 dimensions, but got array with shape (0, 1)

3 model.fit(x=[training_img, train_padded_txt, train_input_length, train_label_length], y=np.zeros(len(training_img)), batch_size=batch_size, epochs = epochs, validation_data = ([valid_img, valid_padded_txt, valid_input_length, valid_label_length], [np.zeros(len(valid_img))]), verbose = 1, callbacks = callbacks_list)

The only difference is that my images are 100x200. What is the problem?
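A likely explanation, assuming the preprocessing loop from the notebook is used unchanged (the same loop is quoted in the next issue): it skips every image taller than 32 pixels or wider than 128 pixels. With 100x200 images nothing survives that filter, training_img stays empty, and an empty array is what reaches model.fit. The sketch below replays only that size test on a list of (height, width) pairs; split_by_size is a hypothetical helper.

```python
def split_by_size(sizes, max_height=32, max_width=128):
    """Partition (height, width) pairs the way the notebook's size check does."""
    kept, skipped = [], []
    for height, width in sizes:
        if height > max_height or width > max_width:
            skipped.append((height, width))
        else:
            kept.append((height, width))
    return kept, skipped

sizes = [(100, 200)] * 5            # every image is 100x200
kept, skipped = split_by_size(sizes)
print(len(kept), len(skipped))      # 0 5
```

Resizing the images so that they fit within 32x128 before the loop (for example with cv2.resize) would let them through; whether to preserve the aspect ratio is a separate choice.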

Out value is empty

Hi, I am facing a problem when I execute the last step of this model: I don't get any CTC value or any predicted text. Any help, please?
this is the result:
original_text = Enuresis
predicted text = []
and this is my code:
char_list = string.ascii_letters + string.digits
print("Character List: ", char_list)

# function to encode the text into indices of the char list
def encode_to_labels(text):
    # We encode each output word into digits
    digit_list = []
    for index, character in enumerate(text):
        try:
            digit_list.append(char_list.index(character))
        except ValueError:
            print("Error in finding index for character ", character)
    # End For
    return digit_list

a = encode_to_labels('hola')
print(a)

#preprocess the data
#read the image from IAM Dataset
n_samples = len(os.listdir('/home/yosra/Desktop/imagetest'))

#Number of samples in xml file

xml_samples = len(dic)

#lists for training dataset

training_img = []
training_txt=[]
train_input_length = []
train_label_length = []
orig_txt = []

#lists for validation dataset
valid_img = []
valid_txt = []
valid_input_length = []
valid_label_length = []
valid_orig_txt = []

max_label_len = 0

# Training Variables

k=1

for i, pic in enumerate(os.listdir('/home/yosra/Desktop/imagetest')):
    # Read image as grayscale
    img = cv2.imread(os.path.join('/home/yosra/Desktop/imagetest', pic), cv2.IMREAD_GRAYSCALE)

    pic_target = pic[:-4]
    # convert each image to shape (32, 128, 1)
    # note: img.shape is (rows, cols), so w holds the height and h the width
    w, h = img.shape

    if h > 128 or w > 32:
        continue
    # endif

    # Process the images to bring them to scale
    if w < 32:
        add_zeros = np.ones((32-w, h))*255
        img = np.concatenate((img, add_zeros))
    # endif
    if h < 128:
        add_zeros = np.ones((32, 128-h))*255
        img = np.concatenate((img, add_zeros), axis=1)
    # endif    

    img = np.expand_dims(img , axis = 2)

    # Normalise the image
    img = img/255.

    # Get the text for the image
    txt = pic_target.split('_')[1]
            
    # compute maximum length of the text
    if len(txt) > max_label_len:
        max_label_len = len(txt)

    if k%10 == 0:     
        valid_orig_txt.append(txt)   
        valid_label_length.append(len(txt))
        valid_input_length.append(31)
        valid_img.append(img)
        valid_txt.append(encode_to_labels(txt))
    else:
        orig_txt.append(txt)   
        train_label_length.append(len(txt))
        train_input_length.append(31)
        training_img.append(img)
        training_txt.append(encode_to_labels(txt))
    k+=1

# pad each output label to maximum text length

train_padded_txt = pad_sequences(training_txt, maxlen=max_label_len, padding='post', value = len(char_list))
valid_padded_txt = pad_sequences(valid_txt, maxlen=max_label_len, padding='post', value = len(char_list))

# input with shape of height=32 and width=128

inputs = Input(shape=(32,128,1))

# convolution layer with kernel size (3,3)

conv_1 = Conv2D(64, (3,3), activation = 'relu', padding='same')(inputs)

# pooling layer with kernel size (2,2)

pool_1 = MaxPool2D(pool_size=(2, 2), strides=2)(conv_1)

conv_2 = Conv2D(128, (3,3), activation = 'relu', padding='same')(pool_1)
pool_2 = MaxPool2D(pool_size=(2, 2), strides=2)(conv_2)

conv_3 = Conv2D(256, (3,3), activation = 'relu', padding='same')(pool_2)

conv_4 = Conv2D(256, (3,3), activation = 'relu', padding='same')(conv_3)

# pooling layer with kernel size (2,1)

pool_4 = MaxPool2D(pool_size=(2, 1))(conv_4)

conv_5 = Conv2D(512, (3,3), activation = 'relu', padding='same')(pool_4)

# Batch normalization layer

batch_norm_5 = BatchNormalization()(conv_5)

conv_6 = Conv2D(512, (3,3), activation = 'relu', padding='same')(batch_norm_5)
batch_norm_6 = BatchNormalization()(conv_6)
pool_6 = MaxPool2D(pool_size=(2, 1))(batch_norm_6)

conv_7 = Conv2D(512, (2,2), activation = 'relu')(pool_6)

squeezed = Lambda(lambda x: K.squeeze(x, 1))(conv_7)

# bidirectional LSTM layers with units=128

blstm_1 = Bidirectional(LSTM(128, return_sequences=True, dropout = 0.2))(squeezed)
blstm_2 = Bidirectional(LSTM(128, return_sequences=True, dropout = 0.2))(blstm_1)

outputs = Dense(len(char_list)+1, activation = 'softmax')(blstm_2)

# model to be used at test time

act_model = Model(inputs, outputs)

act_model.summary()

# the CTC loss function is used to predict the output text; it is very helpful for
# text recognition.
labels = Input(name='the_labels', shape=[max_label_len], dtype='float32')
input_length = Input(name='input_length', shape=[1], dtype='int64')
label_length = Input(name='label_length', shape=[1], dtype='int64')

def ctc_lambda_func(args):
    y_pred, labels, input_length, label_length = args

    return K.ctc_batch_cost(labels, y_pred, input_length, label_length)

loss_out = Lambda(ctc_lambda_func, output_shape=(1,), name='ctc')([outputs, labels, input_length, label_length])

#model to be used at training time
model = Model(inputs=[inputs, labels, input_length, label_length], outputs=loss_out)

model.compile(loss={'ctc': lambda y_true, y_pred: y_pred}, optimizer = 'adam')

filepath="/home/yosra/Downloads/best_model.hdf5"
checkpoint = ModelCheckpoint(filepath=filepath, monitor='val_loss', verbose=1, save_best_only=True, mode='auto')
callbacks_list = [checkpoint]

#train the model

callbacks_list = [checkpoint]
training_img = np.array(training_img)
train_input_length = np.array(train_input_length)
train_label_length = np.array(train_label_length)

valid_img = np.array(valid_img)
valid_input_length = np.array(valid_input_length)
valid_label_length = np.array(valid_label_length)

batch_size = 256
epochs = 10

model.fit(x=[training_img, train_padded_txt, train_input_length, train_label_length], y=np.zeros(len(training_img)),
batch_size=batch_size,
epochs = epochs,
validation_data = ([valid_img, valid_padded_txt, valid_input_length, valid_label_length], [np.zeros(len(valid_img))]),
verbose = 1, callbacks = callbacks_list)

act_model.save(filepath)

#test the model
from keras.models import load_model

# load the saved best model weights

act_model.load_weights(filepath)

image=valid_img[:10]

# predict outputs on validation images

prediction = act_model.predict(image)

# use CTC decoder

out = K.get_value(K.ctc_decode(prediction, input_length=np.ones(prediction.shape[0])*prediction.shape[1],
                               greedy=True)[0][0])

# see the results

i = 0
for x in out:
    print("original_text = ", valid_orig_txt[i])
    print("predicted text = ", end = '')
    print(x)
    for p in x:
        if int(p) != -1:
            print(char_list[int(p)], end = '')
    print('\n')
    i+=1

Error when checking input: expected input_1 to have 4 dimensions, but got array with shape (0, 1)

I am getting this error while training on the images. I have checked that the dataset is successfully stored in the variable. Why is this error occurring?

`ValueError Traceback (most recent call last)
in
6 epochs = epochs,
7 validation_data = ([valid_img, valid_padded_txt, valid_input_length, valid_label_length],[np.zeros(len(valid_img))]),
----> 8 verbose = 1, callbacks = callbacks_list)

E:\Anaconda\envs\PythonCPU\lib\site-packages\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_freq, max_queue_size, workers, use_multiprocessing, **kwargs)
1173 val_x, val_y,
1174 sample_weight=val_sample_weight,
-> 1175 batch_size=batch_size)
1176 if self._uses_dynamic_learning_phase():
1177 val_inputs = val_x + val_y + val_sample_weights + [0]

E:\Anaconda\envs\PythonCPU\lib\site-packages\keras\engine\training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, check_array_lengths, batch_size)
577 feed_input_shapes,
578 check_batch_axis=False, # Don't enforce the batch size.
--> 579 exception_prefix='input')
580
581 if y is not None:

E:\Anaconda\envs\PythonCPU\lib\site-packages\keras\engine\training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
133 ': expected ' + names[i] + ' to have ' +
134 str(len(shape)) + ' dimensions, but got array '
--> 135 'with shape ' + str(data_shape))
136 if not check_batch_axis:
137 data_shape = data_shape[1:]

ValueError: Error when checking input: expected input_9 to have 4 dimensions, but got array with shape (0, 1)
`
Here's my Code:
print(training_img.shape)
print(train_padded_txt.shape)
print(train_input_length.shape)
print(train_label_length.shape)

(3, 32, 128, 1)
(3, 13)
(3,)
(3,)

batch_size = 8
epochs = 20
model.fit(x=[training_img, train_padded_txt, train_input_length, train_label_length],
y=np.zeros(len(training_img)),
batch_size=batch_size,
epochs = epochs,
validation_data = ([valid_img, valid_padded_txt, valid_input_length, valid_label_length],
[np.zeros(len(valid_img))]),
verbose = 1, callbacks = callbacks_list)
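The printed shapes show only 3 training samples, and the traceback passes through the validation branch of fit, so the empty array is probably valid_img. Assuming the notebook's split is used (every 10th kept image, counted from k=1, goes to validation), fewer than 10 images yield no validation data at all. A sketch replaying that counter (split_counts is a hypothetical name):

```python
def split_counts(n_kept):
    """Count (train, validation) samples for the notebook's k % 10 split, k starting at 1."""
    n_valid = sum(1 for k in range(1, n_kept + 1) if k % 10 == 0)
    return n_kept - n_valid, n_valid

for n in (3, 9, 10, 25):
    print(n, split_counts(n))
# 3 (3, 0)
# 9 (9, 0)
# 10 (9, 1)
# 25 (23, 2)
```

Printing valid_img.shape next to the training shapes would confirm whether this is the case.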

Predictions are blank

Hi TheAILearner, this is urgent. Could you please tell me why the predictions are blank, i.e. nothing is predicted when we train the model ourselves using your code, whereas the predictions are good when we use your trained model? (We tried training with 50000, 100000 and 150000 images, with validation images as 10% of the total.)
Thank you, it would be great if you could help ASAP.

Evaluation (Acc, Precision and Recall)

How can we get accuracy, precision and recall from this code? Thanks.
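For word recognition, precision and recall are not the usual metrics; exact-match word accuracy and character error rate (CER, the edit distance divided by the reference length) are more common. The sketch below is not part of the repository. It assumes two hypothetical lists of strings: the ground truth (for example valid_orig_txt) and the decoded predictions.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def word_accuracy(truths, preds):
    """Fraction of predictions that match the ground truth exactly."""
    return sum(t == p for t, p in zip(truths, preds)) / len(truths)

def character_error_rate(truths, preds):
    """Total edit distance divided by the total number of reference characters."""
    errors = sum(edit_distance(t, p) for t, p in zip(truths, preds))
    return errors / sum(len(t) for t in truths)

truths = ["hello", "world"]
preds = ["hello", "word"]
print(word_accuracy(truths, preds))         # 0.5
print(character_error_rate(truths, preds))  # 0.1
```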

Error when checking input: expected input_1 to have 4 dimensions, but got array with shape (0, 1)

I'm following your steps but getting the same error again and again
Error when checking input: expected input_1 to have 4 dimensions, but got array with shape (0, 1)

theailearner / a-crnn-model-for-text-recognition-in-keras
