Please provide the following information to quickly locate the problem
- System Environment: Ubuntu 20.04, running in the official Paddle Docker image
- Version: paddlepaddle/paddle:2.5.2-gpu-cuda11.2-cudnn8.2-trt8.0
- Paddle: paddlepaddle-gpu:2.5.2.post112
- PaddleOCR: release/2.7
- Related components: tools/train.py
- Command Code: python tools/train.py -c configs/rec/PP-OCRv4/ch_PP-OCRv4_rec_distill.yml
- Complete Error Message:
Traceback (most recent call last):
File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/root/.vscode-server/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 39, in <module>
cli.main()
File "/root/.vscode-server/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 430, in main
run()
File "/root/.vscode-server/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 284, in run_file
runpy.run_path(target, run_name="__main__")
File "/root/.vscode-server/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 322, in run_path
pkg_name=pkg_name, script_name=fname)
File "/root/.vscode-server/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 136, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "/root/.vscode-server/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 124, in _run_code
exec(code, run_globals)
File "/paddle/tools/train.py", line 227, in <module>
main(config, device, logger, vdl_writer)
File "/paddle/tools/train.py", line 202, in main
amp_dtype)
File "/paddle/tools/program.py", line 301, in train
preds = model(images, data=batch[1:])
File "/usr/local/lib/python3.7/dist-packages/paddle/nn/layer/layers.py", line 1254, in __call__
return self.forward(*inputs, **kwargs)
File "/paddle/ppocr/modeling/architectures/distillation_model.py", line 59, in forward
result_dict[model_name] = self.model_list[idx](x, data)
File "/usr/local/lib/python3.7/dist-packages/paddle/nn/layer/layers.py", line 1254, in __call__
return self.forward(*inputs, **kwargs)
File "/paddle/ppocr/modeling/architectures/base_model.py", line 100, in forward
x = self.head(x, targets=data)
File "/usr/local/lib/python3.7/dist-packages/paddle/nn/layer/layers.py", line 1254, in __call__
return self.forward(*inputs, **kwargs)
File "/paddle/ppocr/modeling/heads/rec_multi_head.py", line 92, in forward
ctc_encoder = self.ctc_encoder(x)
File "/usr/local/lib/python3.7/dist-packages/paddle/nn/layer/layers.py", line 1254, in __call__
return self.forward(*inputs, **kwargs)
File "/paddle/ppocr/modeling/necks/rnn.py", line 261, in forward
x = self.encoder(x)
File "/usr/local/lib/python3.7/dist-packages/paddle/nn/layer/layers.py", line 1254, in __call__
return self.forward(*inputs, **kwargs)
File "/paddle/ppocr/modeling/necks/rnn.py", line 208, in forward
z = self.conv1(z)
File "/usr/local/lib/python3.7/dist-packages/paddle/nn/layer/layers.py", line 1254, in __call__
return self.forward(*inputs, **kwargs)
File "/paddle/ppocr/modeling/backbones/rec_svtrnet.py", line 68, in forward
out = self.conv(inputs)
File "/usr/local/lib/python3.7/dist-packages/paddle/nn/layer/layers.py", line 1254, in __call__
return self.forward(*inputs, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/paddle/nn/layer/conv.py", line 722, in forward
use_cudnn=self._use_cudnn,
File "/usr/local/lib/python3.7/dist-packages/paddle/nn/functional/conv.py", line 141, in _conv_nd
data_format,
ValueError: (InvalidArgument) The input of Op(Conv) should be a 4-D or 5-D Tensor. But received: input's dimension is 3, input's shape is [8, 240, 256].
[Hint: Expected in_dims.size() == 4 || in_dims.size() == 5 == true, but received in_dims.size() == 4 || in_dims.size() == 5:0 != true:1.] (at ../paddle/phi/infermeta/binary.cc:475)
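In my image, the same data trains fine with the ch_PP-OCRv4_rec_hgnet.yml config and also with the v3 config; only the ch_PP-OCRv4_rec_distill.yml config raises this error.
The contents of the ch_PP-OCRv4_rec_distill.yml config I am using are as follows: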
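For context on the error message, here is a minimal pure-Python sketch of the rank check that fails. It assumes, purely for illustration, that the 240 positions in `[8, 240, 256]` are a flattened 3 x 80 feature map; `check_conv_input` and `to_nchw_shape` are hypothetical helper names, not Paddle or PaddleOCR APIs, and this is not a confirmed fix.

```python
def check_conv_input(shape):
    """Mimic the rank check quoted in the error: Conv accepts only 4-D or 5-D input."""
    if len(shape) not in (4, 5):
        raise ValueError(
            "The input of Op(Conv) should be a 4-D or 5-D Tensor, "
            f"but got {len(shape)}-D with shape {list(shape)}"
        )
    return True


def to_nchw_shape(shape, height):
    """Shape a [N, L, C] sequence would have as [N, C, H, W], assuming L == H * W.

    `height` is an assumption supplied by the caller; it is not read from the config.
    """
    n, length, channels = shape
    if length % height != 0:
        raise ValueError(f"length {length} is not divisible by height {height}")
    return (n, channels, height, length // height)


if __name__ == "__main__":
    seq_shape = (8, 240, 256)  # the shape reported in the traceback
    try:
        check_conv_input(seq_shape)
    except ValueError as err:
        print("rejected:", err)

    # Assumed layout: 3 rows of 80 columns (illustrative only).
    print("as NCHW:", to_nchw_shape(seq_shape, height=3))
```

The sketch only shows why a 3-D tensor is rejected and what a 4-D layout of the same data could look like; whether the neck should receive such a layout depends on the PaddleOCR code at release/2.7.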
Global:
  debug: false
  use_gpu: true
  epoch_num: 200
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec_dkd_400w_svtr_ctc_lcnet_blank_dkd0.1/
  save_epoch_step: 40
  eval_batch_step:
  - 0
  - 2000
  cal_metric_during_train: true
  pretrained_model: ./pre_train/rec/ch_PP-OCRv4_rec_train/student.pdparams
  checkpoints:
  save_inference_dir: doc/imgs_words/ch/
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ppocr/utils/ppocr_keys_v1.txt
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv3.txt
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 2
  regularizer:
    name: L2
    factor: 3.0e-05
Architecture:
  model_type: rec
  name: DistillationModel
  algorithm: Distillation
  Models:
    Teacher:
      pretrained:
      freeze_params: true
      return_all_feats: true
      model_type: rec
      algorithm: SVTR
      Transform: null
      Backbone:
        name: SVTRNet
        img_size:
        - 48
        - 320
        out_char_num: 40
        out_channels: 192
        patch_merging: Conv
        embed_dim:
        - 64
        - 128
        - 256
        depth:
        - 3
        - 6
        - 3
        num_heads:
        - 2
        - 4
        - 8
        mixer:
        - Conv
        - Conv
        - Conv
        - Conv
        - Conv
        - Conv
        - Global
        - Global
        - Global
        - Global
        - Global
        - Global
        local_mixer:
        - - 5
          - 5
        - - 5
          - 5
        - - 5
          - 5
        last_stage: false
        prenorm: true
      Head:
        name: MultiHead
        head_list:
        - CTCHead:
            Neck:
              name: svtr
              dims: 120
              depth: 2
              hidden_dims: 120
              kernel_size: [1, 3]
              use_guide: True
            Head:
              fc_decay: 0.00001
        - NRTRHead:
            nrtr_dim: 384
            max_text_length: *max_text_length
    Student:
      pretrained:
      freeze_params: false
      return_all_feats: true
      model_type: rec
      algorithm: SVTR
      Transform: null
      Backbone:
        name: PPLCNetV3
        scale: 0.95
      Head:
        name: MultiHead
        head_list:
        - CTCHead:
            Neck:
              name: svtr
              dims: 120
              depth: 2
              hidden_dims: 120
              kernel_size: [1, 3]
              use_guide: True
            Head:
              fc_decay: 0.00001
        - NRTRHead:
            nrtr_dim: 384
            max_text_length: *max_text_length
Loss:
  name: CombinedLoss
  loss_config_list:
  - DistillationDKDLoss:
      weight: 0.1
      model_name_pairs:
      - - Student
        - Teacher
      key: head_out
      multi_head: true
      alpha: 1.0
      beta: 2.0
      dis_head: gtc
      name: dkd
  - DistillationCTCLoss:
      weight: 1.0
      model_name_list:
      - Student
      key: head_out
      multi_head: true
  - DistillationNRTRLoss:
      weight: 1.0
      smoothing: false
      model_name_list:
      - Student
      key: head_out
      multi_head: true
  - DistillCTCLogits:
      weight: 1.0
      reduction: mean
      model_name_pairs:
      - - Student
        - Teacher
      key: head_out
PostProcess:
  name: DistillationCTCLabelDecode
  model_name:
  - Student
  key: head_out
  multi_head: true
Metric:
  name: DistillationMetric
  base_metric_name: RecMetric
  main_indicator: acc
  key: Student
  ignore_space: false
Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/lpd_rec
    label_file_list:
    - ./train_data/lpd_rec/train.txt
    ratio_list:
    - 1.0
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecAug:
    - MultiLabelEncode:
        gtc_encode: NRTRLabelEncode
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_gtc
        - length
        - valid_ratio
  loader:
    shuffle: true
    batch_size_per_card: 8
    drop_last: true
    num_workers: 2
    use_shared_memory: true
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/lpd_rec
    label_file_list:
    - ./train_data/lpd_rec/test.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - MultiLabelEncode:
        gtc_encode: NRTRLabelEncode
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_gtc
        - length
        - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 8
    num_workers: 2
profiler_options: null
How should I modify it?