I found that your model has a fixed input size, so how can you recognize images wi

about the input shape about crnn-with-stn HOT 11 OPEN

sbillburg commented on June 7, 2024

about the input shape

from crnn-with-stn.

Comments (11)

LiangHao92 commented on June 7, 2024 2

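@sbillburg Haha, thank you. I think the reason adding the STN did not beat the version without it is that the STN is placed late in the network. If the text line itself is not rotated much, the deformation is small. The later feature maps, especially those after max pooling, are already distilled features, so applying an affine STN transform to them may work less well than applying the STN directly to the input.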

sbillburg commented on June 7, 2024

The input size is set by you before training starts, and it is fixed. Once you train a model with one input shape, all other inputs must have the same size, in both the training dataset and the test dataset.

My method is to set a target aspect ratio, for example width:height = 5:1. Only a few inputs are wider than this ratio, and I resize those to 5:1. The network learns features from these resized images, and a very long image still contains distinctive features that help recognition.
For images narrower than this ratio, I add blank blocks (pure black, RGB(0, 0, 0)) on both sides of the image. In other words, I generate a pure black image with a 5:1 aspect ratio and place the narrower input image at its center.
You can find this method in CRNN-with-STN/Batch_Generator.py, lines 38 to 44.
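
As a rough illustration of the squeeze-or-pad idea described above, here is a minimal sketch in plain Python. It is not the code from Batch_Generator.py: the function name `fit_to_ratio`, the list-of-rows image representation, and the nearest-neighbour squeeze are assumptions made for this example, and 5.0 is just the example ratio from this comment.

```python
def fit_to_ratio(pixels, target_ratio=5.0, fill=0):
    """Return a copy of `pixels` (a list of rows) with width = height * target_ratio.

    Hypothetical helper for illustration only. `fill` is the padding value,
    e.g. 0 for grayscale or (0, 0, 0) for an RGB pixel.
    """
    height = len(pixels)
    width = len(pixels[0])
    target_w = int(round(height * target_ratio))

    if width >= target_w:
        # Too wide: squeeze horizontally (nearest neighbour), which distorts the image.
        return [[row[x * width // target_w] for x in range(target_w)] for row in pixels]

    # Too narrow: keep the original pixels and centre them between black blocks.
    left = (target_w - width) // 2
    right = target_w - width - left
    return [[fill] * left + list(row) + [fill] * right for row in pixels]


narrow = fit_to_ratio([[1, 2], [3, 4]])   # 2x2 image, padded to 10x2
wide = fit_to_ratio([list(range(10))])    # 10x1 image, squeezed to 5x1
```

In a real pipeline the squeeze step would normally use an image library's resize function; the padding branch is the part that preserves the original aspect ratio of the text.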

My explanation may not be clear. If you still have any questions, please tell me. My English is not very good, but I'd love to help you.

LiangHao92 commented on June 7, 2024

@sbillburg thanks a lot! I have got your point.

sbillburg commented on June 7, 2024

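I just noticed that you are Chinese, so I will explain it once more in Chinese.
Inputs with different aspect ratios do affect the recognition result after resizing.

So my approach is to resize as little as possible. For example, I set a width-to-height ratio of 5:1. When generating training batches from the dataset, every image whose ratio is above 5:1 (a very wide image) is squeezed directly to 5:1. This loses some information and distorts the image, but a high ratio means the word is long and its features are distinctive, so it is not hard for the network to recognize.

An image whose ratio is below 5:1 is relatively narrow, so I add pure black blocks on both sides to produce a 5:1 image. The aspect ratio of the original image content is unchanged; the extra padding alone brings the image to the required ratio. The network also learns to output nothing for the pure black blocks, so there is no need to worry about misrecognition.

The implementation is in CRNN-with-STN/Batch_Generator.py, lines 38 to 44. If anything is still unclear, feel free to ask me here or send me an email.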

qwzhong1988 commented on June 7, 2024

Line 38 of CRNN-with-STN/Batch_Generator.py:
if (img_size[1]/img_size[0]*1.0) < 6.4:
needs an extra pair of parentheses:
if (img_size[1]/(img_size[0]*1.0)) < 6.4:
Line 76 is similar.

sbillburg commented on June 7, 2024

Line 38 of CRNN-with-STN/Batch_Generator.py:
if (img_size[1]/img_size[0]*1.0) < 6.4:
needs an extra pair of parentheses:
if (img_size[1]/(img_size[0]*1.0)) < 6.4:
Line 76 is similar.

Can you tell me the difference? In Python 3 it seems to behave the same with or without the parentheses.

qwzhong1988 commented on June 7, 2024

There is no problem in Python 3, but in Python 2 the two forms behave differently, so it is a good habit to add the parentheses.
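
To make the difference concrete, here is a small sketch that runs under Python 3. It assumes the image dimensions are integers, and it uses `//` to emulate what Python 2 does for `int / int`. The function names and the 205x32 size are made up for illustration; only the 6.4 threshold comes from the quoted line.

```python
def ratio_py2_style(width, height):
    # In Python 2, `width / height * 1.0` with ints floors first: (width / height) * 1.0.
    # `//` reproduces that integer division in Python 3.
    return (width // height) * 1.0


def ratio_with_parens(width, height):
    # `width / (height * 1.0)` converts the divisor to float before dividing,
    # so the result is a true ratio in both Python 2 and Python 3.
    return width / (height * 1.0)


width, height = 205, 32  # hypothetical image size
print(ratio_py2_style(width, height) < 6.4)    # floored ratio is 6.0
print(ratio_with_parens(width, height) < 6.4)  # true ratio is 6.40625
```

With these numbers the two forms fall on opposite sides of the threshold, so under Python 2 the unparenthesized version would take the other branch for such an image.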

qwzhong1988 commented on June 7, 2024

I'd like to ask: is there any paper or theoretical basis for placing the STN at the batchnorm_7 position?

sbillburg commented on June 7, 2024

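I'd like to ask: is there any paper or theoretical basis for placing the STN at the batchnorm_7 position?

No. The whole STN is essentially a module, and I simply placed it between the CNN and the RNN. You can put this module anywhere in the network, and it might give better results. This project is only a Keras implementation of CRNN, plus some experiments with STN.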

jingwanli6666 commented on June 7, 2024

An error is raised when calling the loc_net function. How can I solve this? Thanks!

sbillburg commented on June 7, 2024

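It looks like the tensor format is wrong. Try to match the input and output formats used in the source code as closely as possible, and pay attention to how the loc_net function is called there and which arguments it takes.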

…

On November 28, 2019, at 3:33 PM, jingwanli6666 ***@***.***> wrote: loc_net
