Text recognition training fails with: RecursionError: maximum recursion depth exceeded while calling a Python object #11148

Open
great-wind opened this issue Oct 27, 2023 · 6 comments

@great-wind

Please provide the following information to quickly locate the problem

  • System Environment: Linux version 3.10.0-514.el7.x86_64 ([email protected]) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) )
  • Version: Paddle: 2.4.2, PaddleOCR: 2.7.0.3, Related components:
  • Command Code: python tools/train.py -c configs/rec/rec_mv3_none_bilstm_ctc.yml
  • Complete Error Message:
[2023/10/27 10:57:17] ppocr INFO: Architecture : 
[2023/10/27 10:57:17] ppocr INFO:     Backbone : 
[2023/10/27 10:57:17] ppocr INFO:         model_name : large
[2023/10/27 10:57:17] ppocr INFO:         name : MobileNetV3
[2023/10/27 10:57:17] ppocr INFO:         scale : 0.5
[2023/10/27 10:57:17] ppocr INFO:     Head : 
[2023/10/27 10:57:17] ppocr INFO:         fc_decay : 0
[2023/10/27 10:57:17] ppocr INFO:         name : CTCHead
[2023/10/27 10:57:17] ppocr INFO:     Neck : 
[2023/10/27 10:57:17] ppocr INFO:         encoder_type : rnn
[2023/10/27 10:57:17] ppocr INFO:         hidden_size : 96
[2023/10/27 10:57:17] ppocr INFO:         name : SequenceEncoder
[2023/10/27 10:57:17] ppocr INFO:     Transform : None
[2023/10/27 10:57:17] ppocr INFO:     algorithm : CRNN
[2023/10/27 10:57:17] ppocr INFO:     model_type : rec
[2023/10/27 10:57:17] ppocr INFO: Eval : 
[2023/10/27 10:57:17] ppocr INFO:     dataset : 
[2023/10/27 10:57:17] ppocr INFO:         data_dir : ./rec_seal_real_straighten/test/
[2023/10/27 10:57:17] ppocr INFO:         label_file_list : ['./rec_seal_real_straighten/test.txt']
[2023/10/27 10:57:17] ppocr INFO:         name : SimpleDataSet
[2023/10/27 10:57:17] ppocr INFO:         transforms : 
[2023/10/27 10:57:17] ppocr INFO:             DecodeImage : 
[2023/10/27 10:57:17] ppocr INFO:                 channel_first : False
[2023/10/27 10:57:17] ppocr INFO:                 img_mode : BGR
[2023/10/27 10:57:17] ppocr INFO:             CTCLabelEncode : None
[2023/10/27 10:57:17] ppocr INFO:             RecResizeImg : 
[2023/10/27 10:57:17] ppocr INFO:                 image_shape : [3, 32, 100]
[2023/10/27 10:57:17] ppocr INFO:             KeepKeys : 
[2023/10/27 10:57:17] ppocr INFO:                 keep_keys : ['image', 'label', 'length']
[2023/10/27 10:57:17] ppocr INFO:     loader : 
[2023/10/27 10:57:17] ppocr INFO:         batch_size_per_card : 1
[2023/10/27 10:57:17] ppocr INFO:         drop_last : False
[2023/10/27 10:57:17] ppocr INFO:         num_workers : 1
[2023/10/27 10:57:17] ppocr INFO:         shuffle : False
[2023/10/27 10:57:17] ppocr INFO:         use_shared_memory : False
[2023/10/27 10:57:17] ppocr INFO: Global : 
[2023/10/27 10:57:17] ppocr INFO:     cal_metric_during_train : True
[2023/10/27 10:57:17] ppocr INFO:     character_dict_path : ppocr/utils/en_dict.txt
[2023/10/27 10:57:17] ppocr INFO:     checkpoints : None
[2023/10/27 10:57:17] ppocr INFO:     distributed : False
[2023/10/27 10:57:17] ppocr INFO:     epoch_num : 30
[2023/10/27 10:57:17] ppocr INFO:     eval_batch_step : [0, 135]
[2023/10/27 10:57:17] ppocr INFO:     infer_img : doc/imgs_words_en/word_10.png
[2023/10/27 10:57:17] ppocr INFO:     infer_mode : False
[2023/10/27 10:57:17] ppocr INFO:     log_smooth_window : 20
[2023/10/27 10:57:17] ppocr INFO:     max_text_length : 25
[2023/10/27 10:57:17] ppocr INFO:     pretrained_model : ./pretrain_models/rec_mv3_none_bilstm_ctc_v2.0_train/best_accuracy
[2023/10/27 10:57:17] ppocr INFO:     print_batch_step : 10
[2023/10/27 10:57:17] ppocr INFO:     save_epoch_step : 3
[2023/10/27 10:57:17] ppocr INFO:     save_inference_dir : ./
[2023/10/27 10:57:17] ppocr INFO:     save_model_dir : ./output/rec/mv3_none_bilstm_ctc/
[2023/10/27 10:57:17] ppocr INFO:     save_res_path : ./output/rec/predicts_ic15.txt
[2023/10/27 10:57:17] ppocr INFO:     use_gpu : True
[2023/10/27 10:57:17] ppocr INFO:     use_space_char : False
[2023/10/27 10:57:17] ppocr INFO:     use_visualdl : False
[2023/10/27 10:57:17] ppocr INFO: Loss : 
[2023/10/27 10:57:17] ppocr INFO:     name : CTCLoss
[2023/10/27 10:57:17] ppocr INFO: Metric : 
[2023/10/27 10:57:17] ppocr INFO:     main_indicator : acc
[2023/10/27 10:57:17] ppocr INFO:     name : RecMetric
[2023/10/27 10:57:17] ppocr INFO: Optimizer : 
[2023/10/27 10:57:17] ppocr INFO:     beta1 : 0.9
[2023/10/27 10:57:17] ppocr INFO:     beta2 : 0.999
[2023/10/27 10:57:17] ppocr INFO:     lr : 
[2023/10/27 10:57:17] ppocr INFO:         learning_rate : 0.0001
[2023/10/27 10:57:17] ppocr INFO:     name : Adam
[2023/10/27 10:57:17] ppocr INFO:     regularizer : 
[2023/10/27 10:57:17] ppocr INFO:         factor : 0
[2023/10/27 10:57:17] ppocr INFO:         name : L2
[2023/10/27 10:57:17] ppocr INFO: PostProcess : 
[2023/10/27 10:57:17] ppocr INFO:     name : CTCLabelDecode
[2023/10/27 10:57:17] ppocr INFO: Train : 
[2023/10/27 10:57:17] ppocr INFO:     dataset : 
[2023/10/27 10:57:17] ppocr INFO:         data_dir : ./rec_seal_real_straighten/train/
[2023/10/27 10:57:17] ppocr INFO:         label_file_list : ['./rec_seal_real_straighten/train.txt']
[2023/10/27 10:57:17] ppocr INFO:         name : SimpleDataSet
[2023/10/27 10:57:17] ppocr INFO:         transforms : 
[2023/10/27 10:57:17] ppocr INFO:             DecodeImage : 
[2023/10/27 10:57:17] ppocr INFO:                 channel_first : False
[2023/10/27 10:57:17] ppocr INFO:                 img_mode : BGR
[2023/10/27 10:57:17] ppocr INFO:             CTCLabelEncode : None
[2023/10/27 10:57:17] ppocr INFO:             RecResizeImg : 
[2023/10/27 10:57:17] ppocr INFO:                 image_shape : [3, 32, 100]
[2023/10/27 10:57:17] ppocr INFO:             KeepKeys : 
[2023/10/27 10:57:17] ppocr INFO:                 keep_keys : ['image', 'label', 'length']
[2023/10/27 10:57:17] ppocr INFO:     loader : 
[2023/10/27 10:57:17] ppocr INFO:         batch_size_per_card : 9
[2023/10/27 10:57:17] ppocr INFO:         drop_last : True
[2023/10/27 10:57:17] ppocr INFO:         num_workers : 1
[2023/10/27 10:57:17] ppocr INFO:         shuffle : True
[2023/10/27 10:57:17] ppocr INFO:         use_shared_memory : False
[2023/10/27 10:57:17] ppocr INFO: profiler_options : None
[2023/10/27 10:57:17] ppocr INFO: train with paddle 2.4.2 and device Place(gpu:0)
[2023/10/27 10:57:17] ppocr INFO: Initialize indexs of datasets:['./rec_seal_real_straighten/train.txt']
[2023/10/27 10:57:17] ppocr INFO: Initialize indexs of datasets:['./rec_seal_real_straighten/test.txt']
[2023/10/27 10:57:19] ppocr INFO: train dataloader has 135 iters
[2023/10/27 10:57:19] ppocr INFO: valid dataloader has 224 iters
[2023/10/27 10:57:19] ppocr WARNING: The shape of model params head.fc.bias [96] not matched with loaded params head.fc.bias [37] !
[2023/10/27 10:57:19] ppocr WARNING: The shape of model params head.fc.weight [192, 96] not matched with loaded params head.fc.weight [192, 37] !
[2023/10/27 10:57:19] ppocr INFO: load pretrain successful from ./pretrain_models/rec_mv3_none_bilstm_ctc_v2.0_train/best_accuracy
[2023/10/27 10:57:19] ppocr INFO: During the training process, after the 0th iteration, an evaluation is run every 135 iterations
[2023/10/27 10:57:25] ppocr ERROR: When parsing line 0829_0.png	科技有限公司
, error happened with msg: Traceback (most recent call last):
  File "/home/liguobao/PaddleOCR/ppocr/data/simple_dataset.py", line 157, in __getitem__
    data['image'] = img
RecursionError: maximum recursion depth exceeded while calling a Python object

We provide AceIssueSolver to solve issues, do you want it? (Please write yes/no): no

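After searching, the answer I found is that the program exceeds the maximum recursion depth while running. This can be worked around by raising the default recursion limit, but that treats the symptom rather than the cause. How can this problem be fixed at the root? Any help would be appreciated.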

@PavloMyrotiuk

PavloMyrotiuk commented Dec 20, 2023

I have a similar error:

Exception in thread Thread-1 (_thread_loop):
Traceback (most recent call last):
  File "/Users/pmyrotiuk/Workspace/PaddleOCR/ppocr/data/simple_dataset.py", line 159, in __getitem__
    outs = transform(data, self.ops)
  File "/Users/pmyrotiuk/Workspace/PaddleOCR/ppocr/data/imaug/__init__.py", line 56, in transform
    data = op(data)
  File "/Users/pmyrotiuk/Workspace/PaddleOCR/ppocr/data/imaug/label_ops.py", line 1260, in __call__
    data_ctc = copy.deepcopy(data)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/paddle-train/lib/python3.10/copy.py", line 146, in deepcopy
    y = copier(x, memo)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/paddle-train/lib/python3.10/copy.py", line 231, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/paddle-train/lib/python3.10/copy.py", line 146, in deepcopy
    y = copier(x, memo)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/paddle-train/lib/python3.10/copy.py", line 206, in _deepcopy_list
    append(deepcopy(a, memo))
  File "/opt/homebrew/Caskroom/miniconda/base/envs/paddle-train/lib/python3.10/copy.py", line 146, in deepcopy
    y = copier(x, memo)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/paddle-train/lib/python3.10/copy.py", line 231, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/paddle-train/lib/python3.10/copy.py", line 177, in deepcopy
    _keep_alive(x, memo) # Make sure x lives at least as long as d
  File "/opt/homebrew/Caskroom/miniconda/base/envs/paddle-train/lib/python3.10/copy.py", line 254, in _keep_alive
    memo[id(memo)].append(x)
RecursionError: maximum recursion depth exceeded while calling a Python object

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/miniconda/base/envs/paddle-train/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/paddle-train/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/paddle-train/lib/python3.10/site-packages/paddle/io/dataloader/dataloader_iter.py", line 235, in _thread_loop
    batch = self._dataset_fetcher.fetch(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/paddle-train/lib/python3.10/site-packages/paddle/io/dataloader/fetcher.py", line 78, in fetch
    data.append(self.dataset[idx])
  File "/Users/pmyrotiuk/Workspace/PaddleOCR/ppocr/data/simple_dataset.py", line 169, in __getitem__
    return self.__getitem__(rnd_idx)
  File "/Users/pmyrotiuk/Workspace/PaddleOCR/ppocr/data/simple_dataset.py", line 169, in __getitem__
    return self.__getitem__(rnd_idx)
  File "/Users/pmyrotiuk/Workspace/PaddleOCR/ppocr/data/simple_dataset.py", line 169, in __getitem__
    return self.__getitem__(rnd_idx)
  [Previous line repeated 980 more times]
  File "/Users/pmyrotiuk/Workspace/PaddleOCR/ppocr/data/simple_dataset.py", line 163, in __getitem__
    data_line, traceback.format_exc()))
  File "/opt/homebrew/Caskroom/miniconda/base/envs/paddle-train/lib/python3.10/traceback.py", line 183, in format_exc
    return "".join(format_exception(*sys.exc_info(), limit=limit, chain=chain))
  File "/opt/homebrew/Caskroom/miniconda/base/envs/paddle-train/lib/python3.10/traceback.py", line 135, in format_exception
    te = TracebackException(type(value), value, tb, limit=limit, compact=True)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/paddle-train/lib/python3.10/site-packages/exceptiongroup/_formatting.py", line 96, in __init__
    self.stack = traceback.StackSummary.extract(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/paddle-train/lib/python3.10/traceback.py", line 383, in extract
    f.line
  File "/opt/homebrew/Caskroom/miniconda/base/envs/paddle-train/lib/python3.10/traceback.py", line 306, in line
    self._line = linecache.getline(self.filename, self.lineno)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/paddle-train/lib/python3.10/linecache.py", line 30, in getline
    lines = getlines(filename, module_globals)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/paddle-train/lib/python3.10/linecache.py", line 46, in getlines
    return updatecache(filename, module_globals)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/paddle-train/lib/python3.10/linecache.py", line 136, in updatecache
    with tokenize.open(fullname) as fp:
  File "/opt/homebrew/Caskroom/miniconda/base/envs/paddle-train/lib/python3.10/tokenize.py", line 394, in open
    buffer = _builtin_open(filename, 'rb')
RecursionError: maximum recursion depth exceeded while calling a Python object
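
The repeated `return self.__getitem__(rnd_idx)` frames in the traceback above suggest a likely mechanism: when a sample fails to load, `__getitem__` retries with another index by calling itself, so if every sample fails, the retries nest until Python's recursion limit is reached. The following is a minimal sketch of that pattern, not PaddleOCR's actual code. `ToyDataset`, `load`, and `hits_recursion_limit` are hypothetical names, and the toy catches only `ValueError` to keep the demonstration simple.

```python
import random


class ToyDataset:
    """Hypothetical stand-in for a dataset that retries failed samples by recursion."""

    def __init__(self, lines, delimiter="\t"):
        self.lines = lines
        self.delimiter = delimiter

    def load(self, line):
        # Splitting on the wrong delimiter yields a single field,
        # so unpacking into two names raises ValueError.
        path, label = line.split(self.delimiter, 1)
        return {"img_path": path, "label": label}

    def __getitem__(self, idx):
        try:
            return self.load(self.lines[idx])
        except ValueError:
            # Retry with a random index. If every line is bad,
            # each retry fails again and the calls keep nesting.
            rnd_idx = random.randrange(len(self.lines))
            return self.__getitem__(rnd_idx)


def hits_recursion_limit(dataset):
    """Return True if fetching item 0 ends in RecursionError."""
    try:
        dataset[0]
    except RecursionError:
        return True
    return False


space_separated = ["a.png hello", "b.png world"]    # no tab anywhere
tab_separated = ["a.png\thello", "b.png\tworld"]    # matches the delimiter

print(hits_recursion_limit(ToyDataset(space_separated)))
print(hits_recursion_limit(ToyDataset(tab_separated)))
```

Under these assumptions, raising the recursion limit would only delay the failure, because no sample can ever load successfully.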

@Haeppypuppy

Same here

iterations
[2024/02/16 20:08:38] ppocr ERROR: When parsing line icdar_c4_train_imgs/img_440.jpg [{"transcription": "Sprite", "points": [[817, 247], [872, 250], [867, 279], [811, 276]]}, {"transcription": "DARY", "points": [[806, 74], [858, 61], [860, 91], [808, 104]]}, {"transcription": "###", "points": [[749, 444], [763, 451], [761, 482], [747, 476]]}, {"transcription": "###", "points": [[761, 453], [778, 461], [777, 481], [760, 473]]}, {"transcription": "Sprite", "points": [[888, 565], [927, 548], [929, 566], [890, 583]]}, {"transcription": "###", "points": [[1048, 456], [1111, 485], [1101, 490], [1038, 460]]}, {"transcription": "###", "points": [[1131, 434], [1174, 451], [1148, 460], [1105, 443]]}, {"transcription": "Sottis", "points": [[834, 420], [855, 416], [856, 428], [835, 432]]}, {"transcription": "###", "points": [[251, 301], [284, 304], [278, 313], [245, 310]]}, {"transcription": "###", "points": [[257, 293], [286, 294], [283, 303], [254, 302]]}, {"transcription": "###", "points": [[466, 180], [516, 175], [519, 188], [469, 192]]}, {"transcription": "###", "points": [[1037, 458], [1108, 492], [1096, 498], [1026, 464]]}]
, error happened with msg: Traceback (most recent call last):
  File "C:\Users\nana\0216\PaddleOCR-release-2.6.1\ppocr\data\simple_dataset.py", line 151, in __getitem__
    data['image'] = img
RecursionError: maximum recursion depth exceeded while calling a Python object

Fatal Python error: Cannot recover from stack overflow.
Python runtime state: initialized

@RishabhSheoran

Hi! Did you find a solution?

@Aukeen

Aukeen commented Jan 8, 2025

I had the same problem today. Can I ask if anyone has a solution?

@nliaudat

I have the same error if I do NOT resize images

    - RecResizeImg:
        image_shape:
        - 3
        - 48
        - 246

@Xenrose

Xenrose commented Mar 24, 2025

@Haeppypuppy @RishabhSheoran @Aukeen @nliaudat

The error originates from the self.delimiter variable in ./ppocr/data/simple_dataset.py.

If the delimiter is not explicitly set in the config file under the dataset section, it defaults to \t.
This delimiter is used to split the image path and label data in the label.txt file.
(Refer to lines 35, 98, 124, 172, and 226 in ./ppocr/data/simple_dataset.py.)

Using the example provided by Haeppypuppy:

icdar_c4_train_imgs/img_440.jpg [{"transcription": "Sprite", ...

In this case, the image path and label are separated by a normal space (" ").
Therefore, you should either replace the space with a tab character (\t) in your label file,
or explicitly define the delimiter in the YAML config file under the dataset section as follows:

Train:
  dataset:
    delimiter: " "
    name: SimpleDataSet
    data_dir: ./dataset/rec_test/train_images/train
  • This issue applies to both detection and recognition tasks.
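
To find the offending lines before training, a small check like the one below may help. It is a sketch under stated assumptions, not part of PaddleOCR: `find_lines_missing_delimiter` is a hypothetical helper, the sample file is made up for the demonstration, and the check only verifies that each non-empty line contains the delimiter with text on both sides.

```python
import os
import tempfile


def find_lines_missing_delimiter(label_file, delimiter="\t"):
    """Return (line_number, text) pairs for lines that lack a usable delimiter."""
    bad = []
    with open(label_file, encoding="utf-8") as f:
        for number, raw in enumerate(f, start=1):
            line = raw.rstrip("\n")
            if not line:
                continue  # ignore blank lines
            # partition splits at the first delimiter only, so the
            # label itself may contain spaces or further delimiters.
            path, sep, label = line.partition(delimiter)
            if not sep or not path or not label:
                bad.append((number, line))
    return bad


# Demonstration on a made-up label file: line 2 uses a space instead of a tab.
sample = 'a.png\thello\nb.png world\n\nc.png\t[{"transcription": "Sprite"}]\n'
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write(sample)

bad_lines = find_lines_missing_delimiter(tmp.name)
os.remove(tmp.name)
print(bad_lines)
```

If the labels themselves contain spaces, as the JSON in Haeppypuppy's example does, setting the delimiter to a space may split inside the label, so replacing the separator with a tab is probably the safer of the two options.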
