about the input shape #3

LiangHao92 · 2018-10-22T01:58:57Z

I noticed that your model takes a fixed input size. How can it recognize images of varying sizes, for example a 64*500 image? If I resize the image, it may destroy the aspect ratio and affect the result, won't it?

sbillburg · 2018-10-23T08:30:03Z

The input size is set by you before training starts, and it is fixed. Once you train a model with one input shape, all other inputs must have the same size, in both the training and test datasets.

My method is to pick an aspect ratio such as width:height = 5:1. Only a few inputs are wider than this ratio, and I resize those to 5:1. The neural network learns features from these resized images, and a very long image contains distinctive features that make it easy to recognize.
For images narrower than this ratio, I add blank blocks (pure black, RGB(0, 0, 0)) on both sides of the image. In other words, I generate a pure black image with a 5:1 aspect ratio and place the input image, whose aspect ratio is smaller than 5:1, in its center.
You can find my method in CRNN-with-STN/Batch_Generator.py, line38~line44.
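
The padding step described above can be sketched as follows. This is a minimal illustration, not the code in Batch_Generator.py: the function name `pad_to_ratio`, the list-of-rows grayscale image representation, and the default 5:1 ratio are assumptions made for the example. Images that are already at or beyond the target ratio are returned unchanged here, since squeezing them would be done with an image library's resize function.

```python
def pad_to_ratio(image, ratio=5.0, fill=0):
    """Center a narrow image on a black canvas with the given width:height ratio.

    `image` is a list of rows, each row a list of grayscale pixel values
    (a hypothetical stand-in for a real image array).
    """
    height, width = len(image), len(image[0])
    wanted = int(round(height * ratio))  # canvas width for the target ratio
    if width >= wanted:
        # Already wide enough: this case is resized (squeezed), not padded.
        return [list(row) for row in image]
    left = (wanted - width) // 2       # black columns on the left
    right = wanted - width - left      # remaining black columns on the right
    return [[fill] * left + list(row) + [fill] * right for row in image]


small = [[1, 2, 3, 4],
         [5, 6, 7, 8]]          # 2 rows x 4 columns, ratio 2:1
padded = pad_to_ratio(small)    # canvas becomes 2 rows x 10 columns
print(padded[0])                # [0, 0, 0, 1, 2, 3, 4, 0, 0, 0]
```

The original pixels are copied untouched, so the text keeps its aspect ratio; only the canvas grows.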

My explanation may not be clear. If you still have any questions, please tell me. My English is not very good, but I'd love to help you.

LiangHao92 · 2018-10-23T08:36:22Z

@sbillburg thanks a lot! I have got your point.

sbillburg · 2018-10-23T08:37:23Z

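I just noticed that you are Chinese, so let me explain it once more directly in Chinese.
When inputs have different aspect ratios, resizing does affect the recognition result.

So my approach is to resize as little as possible. For example, I set a width-to-height ratio of 5:1. When generating training batches from the dataset, every image whose ratio is above 5:1 (meaning the image is very wide) is squeezed directly to 5:1. This causes some loss or distortion, but a very high ratio means the word is long and its features are distinctive, so it is not hard for the network to recognize.

Images with a ratio below 5:1 are relatively narrow, so I add pure black blocks on both sides to produce a 5:1 image. The aspect ratio of the original image is not changed; the extra padding is what brings the image to the required ratio. The network also learns to output nothing for the pure black blocks, so there is no need to worry about recognition errors.

The implementation can be found in CRNN-with-STN/Batch_Generator.py, line38~line44. If anything is still unclear, you can ask me directly or send me an email.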

LiangHao92 · 2018-10-25T02:11:49Z

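@sbillburg Haha, thank you. I think the reason adding the STN does not beat the version without it is that the STN is placed late in the network. If the text line itself is not rotated much, the deformation is small, and the later feature maps, especially those that have gone through max pooling, already hold distilled features. Applying the STN's affine transformation at that point may work less well than applying the STN directly at the input.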

qwzhong1988 · 2018-11-12T08:55:03Z

Line 38 of CRNN-with-STN/Batch_Generator.py:
if (img_size[1]/img_size[0]*1.0) < 6.4:
needs an extra pair of parentheses:
if (img_size[1]/(img_size[0]*1.0)) < 6.4:
Line 76 is similar.

sbillburg · 2018-11-12T18:08:27Z

Line 38 of CRNN-with-STN/Batch_Generator.py:
if (img_size[1]/img_size[0]*1.0) < 6.4:
needs an extra pair of parentheses:
if (img_size[1]/(img_size[0]*1.0)) < 6.4:
Line 76 is similar.

Can you tell me the difference? In Python 3 the result seems to be the same with or without the parentheses.

qwzhong1988 · 2018-11-13T03:01:12Z

There is no problem in Python 3, but there is a difference in Python 2. It is a good habit to add the parentheses.
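
To make the difference concrete, here is a small sketch. It runs on Python 3 and uses `//` to imitate how Python 2 divides two integers; the function names and the 220x32 example size are made up for illustration, and it assumes that `img_size[1]` is the width and `img_size[0]` the height.

```python
def ratio_python2_style(img_size):
    # In Python 2, int / int truncates, so the quotient is already an
    # integer before "* 1.0" converts it to float. "//" imitates that here.
    return img_size[1] // img_size[0] * 1.0


def ratio_with_parentheses(img_size):
    # Converting the denominator to float first gives true division
    # in both Python 2 and Python 3.
    return img_size[1] / (img_size[0] * 1.0)


img_size = (32, 220)  # assumed (height, width)
print(ratio_python2_style(img_size))           # 6.0
print(ratio_with_parentheses(img_size))        # 6.875
print(ratio_python2_style(img_size) < 6.4)     # True
print(ratio_with_parentheses(img_size) < 6.4)  # False
```

With the truncated ratio, this image would pass the `< 6.4` check even though its true ratio is above the threshold.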

qwzhong1988 · 2018-11-13T03:06:46Z

I'd like to ask: is there a paper or any theoretical basis for adding the STN at the batchnorm_7 position?

sbillburg · 2018-11-13T09:14:20Z

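I'd like to ask: is there a paper or any theoretical basis for adding the STN at the batchnorm_7 position?

No. The whole STN part works as a module, and I simply placed it between the CNN and the RNN. You can put this module anywhere in the network, and it might give better results. This project is only a Keras implementation of CRNN, plus some experiments with STN.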

jingwanli6666 · 2019-11-28T07:33:22Z

I get an error when calling the loc_net function. How can I fix it? Thanks!

sbillburg · 2019-11-28T10:05:15Z

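It looks like the tensor format is wrong. Try to match the input and output formats used in the source code as closely as possible. Pay attention to how the loc_net function is called in the source code and which arguments it is given.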

…

On Nov 28, 2019, at 3:33 PM, jingwanli6666 ***@***.***> wrote: loc_net
