Training with the author's pytorch-train/crnn.main.py: accuracy stays at 0 #168

Open
yyyyykp opened this issue Sep 3, 2021 · 2 comments

yyyyykp commented Sep 3, 2021

I trained with the author's dataset and source code. Environment: python3.6 + pytorch1.3.
I fixed three bugs along the way:

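  1. The line 'cpu_texts = [clean_txt(tx.decode('utf-8')) for tx in cpu_texts]' raised an error saying that str has no decode method. After looking it up, this seems to be a Python 2 vs Python 3 difference, so I changed the code to 'cpu_texts = [clean_txt(tx.encode('utf-8').decode('utf-8')) for tx in cpu_texts]';
  2. In the clean_texts function, the author replaces characters that are not in the dictionary with a space, but this causes an error during lookup, so I changed newTxt += u' ' in that function to newTxt += u'', i.e. characters that cannot be found are simply dropped;
  3. The predicted preds raised an error in later processing: IndexError: Dimension out of range (expected to be in range of [-2, 1], but got 2), so I deleted the line ‘preds = preds.squeeze(2)’;
    I don't know whether my code changes broke the feature dimensions or something else is wrong, but the accuracy stays at 0 throughout training. Could anyone help explain this? Much appreciated!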
    https://imgtu.com/i/hy8weP
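
On the first fix: in Python 3, calling `.encode('utf-8').decode('utf-8')` on a str returns the same string, so the workaround only silences the error. A minimal sketch of an alternative that handles both cases is shown here, assuming the labels may arrive either as bytes (for example when read back from LMDB) or as str. The helper name `to_text` and the sample labels are hypothetical and not part of the repository.

```python
def to_text(label):
    """Return the label as str, whether it arrives as bytes or as str.

    Hypothetical helper for illustration; not part of the original code.
    """
    if isinstance(label, bytes):
        return label.decode('utf-8')
    return label


# Sample labels (made up): one UTF-8 encoded bytes value, one plain str.
cpu_texts = [b'\xe4\xb8\xad\xe6\x96\x87', 'abc']
decoded = [to_text(tx) for tx in cpu_texts]
print(decoded)
```

With a helper like this, the list comprehension from the first fix could wrap each `tx` in `to_text(...)` before passing it to `clean_txt`, and it would behave the same under either input type.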

yyyyykp commented Sep 3, 2021

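The training dataset the author provides in data/lmdb/train has 200 samples, and val has 72. I wonder whether the dataset is simply too small for the network to learn -_-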


yyyyykp commented Sep 6, 2021

Finally found a problem:
CHINESE-OCR/train/pytorch-train/crnn_main.py Line 271
should be optimizer.zero_grad()
Ah~
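
For context on why this call matters: PyTorch accumulates gradients across `backward()` calls until they are cleared, so stale gradients from earlier batches get added to each update. The sketch below illustrates that behaviour on a single toy parameter. It is not the repository's training loop, and the parameter `w`, the toy loss, and the helper name are assumptions made for illustration.

```python
import torch

# Toy setup (assumed for illustration): one parameter and a plain SGD optimizer.
w = torch.nn.Parameter(torch.tensor([1.0]))
optimizer = torch.optim.SGD([w], lr=0.1)


def grad_after_backward():
    """Run one backward pass on a toy loss and return a copy of w.grad."""
    loss = (3.0 * w).sum()  # d(loss)/dw is 3 for any value of w
    loss.backward()
    return w.grad.clone()


first = grad_after_backward()   # fresh gradient
second = grad_after_backward()  # no zero_grad in between: adds to the first
optimizer.zero_grad()           # clear the accumulated gradient
third = grad_after_backward()   # fresh gradient again

print(first.item(), second.item(), third.item())
```

In a training loop, `optimizer.zero_grad()` is typically called once per batch, before `loss.backward()` and `optimizer.step()`.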
