Finetuning by following the documentation fails: styles = torch.LongTensor([[self.textnorm_int_dict[int(style)]] for style in text[:, 3]]).to(speech.device) IndexError: index 3 is out of bounds for dimension 1 with size 1 #158

Open
eatoncys opened this issue Nov 1, 2024 · 5 comments
Labels
bug Something isn't working

Comments

eatoncys commented Nov 1, 2024

Notice: In order to resolve issues more efficiently, please raise issue following the template.

🐛 Bug

To Reproduce

Steps to reproduce the behavior (always include the command you ran):

  1. Run cmd 'bash finetune.sh'
  2. See error

Traceback (most recent call last):

[2024-11-01 20:33:48,933][root][INFO] - rank: 0, dataloader start from step: 0, batch_num: 5, after: 5
[2024-11-01 20:33:48,963][root][INFO] - rank: 0, dataloader start from step: 0, batch_num: 5, after: 5
[2024-11-01 20:33:48,989][root][ERROR] - ERROR: data is empty!
[2024-11-01 20:33:51,222][root][ERROR] - ERROR: data is empty!
Error executing job with overrides: ['++model=/mnt/home/sensevoice/SenseVoiceSmall', '++trust_remote_code=true', '++train_data_set_list=/mnt/home/sensevoice/train_data/datasets/asr_dataset.jsonl', '++valid_data_set_list=/mnt/home/sensevoice/train_data/datasets/asr_val.jsonl', '++dataset_conf.data_split_num=1', '++dataset_conf.batch_sampler=BatchSampler', '++dataset_conf.batch_size=10', '++dataset_conf.sort_size=1024', '++dataset_conf.batch_type=token', '++dataset_conf.num_workers=1', '++train_conf.max_epoch=50', '++train_conf.log_interval=1', '++train_conf.resume=true', '++train_conf.validate_interval=2000', '++train_conf.save_checkpoint_interval=2000', '++train_conf.keep_nbest_models=20', '++train_conf.avg_nbest_model=10', '++train_conf.use_deepspeed=false', '++train_conf.deepspeed_config=/mnt/home/sensevoice/SenseVoice-main/deepspeed_conf/ds_stage1.json', '++optim_conf.lr=0.0002', '++output_dir=./outputs']
Traceback (most recent call last):
File "/mnt/home/sensevoice/FunASR-main/funasr/bin/train_ds.py", line 225, in <module>
main_hydra()
File "/usr/local/miniconda3/envs/svnew/lib/python3.10/site-packages/hydra/main.py", line 94, in decorated_main
_run_hydra(
File "/usr/local/miniconda3/envs/svnew/lib/python3.10/site-packages/hydra/_internal/utils.py", line 394, in _run_hydra
_run_app(
File "/usr/local/miniconda3/envs/svnew/lib/python3.10/site-packages/hydra/_internal/utils.py", line 457, in _run_app
run_and_report(
File "/usr/local/miniconda3/envs/svnew/lib/python3.10/site-packages/hydra/_internal/utils.py", line 223, in run_and_report
raise ex
File "/usr/local/miniconda3/envs/svnew/lib/python3.10/site-packages/hydra/_internal/utils.py", line 220, in run_and_report
return func()
File "/usr/local/miniconda3/envs/svnew/lib/python3.10/site-packages/hydra/_internal/utils.py", line 458, in <lambda>
lambda: hydra.run(
File "/usr/local/miniconda3/envs/svnew/lib/python3.10/site-packages/hydra/_internal/hydra.py", line 132, in run
_ = ret.return_value
File "/usr/local/miniconda3/envs/svnew/lib/python3.10/site-packages/hydra/core/utils.py", line 260, in return_value
raise self._return_value
File "/usr/local/miniconda3/envs/svnew/lib/python3.10/site-packages/hydra/core/utils.py", line 186, in run_job
ret.return_value = task_function(task_cfg)
File "/mnt/home/sensevoice/FunASR-main/funasr/bin/train_ds.py", line 56, in main_hydra
main(**kwargs)
File "/mnt/home/sensevoice/FunASR-main/funasr/bin/train_ds.py", line 173, in main
trainer.train_epoch(
File "/usr/local/miniconda3/envs/svnew/lib/python3.10/site-packages/funasr/train_utils/trainer_ds.py", line 603, in train_epoch
self.forward_step(model, batch, loss_dict=loss_dict)
File "/usr/local/miniconda3/envs/svnew/lib/python3.10/site-packages/funasr/train_utils/trainer_ds.py", line 670, in forward_step
retval = model(**batch)
File "/usr/local/miniconda3/envs/svnew/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/miniconda3/envs/svnew/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/mnt/home/sensevoice/SenseVoice-main/./model.py", line 680, in forward
encoder_out, encoder_out_lens = self.encode(speech, speech_lengths, text)
File "/mnt/home/sensevoice/SenseVoice-main/./model.py", line 733, in encode
styles = torch.LongTensor([[self.textnorm_int_dict[int(style)]] for style in text[:, 3]]).to(speech.device)
IndexError: index 3 is out of bounds for dimension 1 with size 1
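
The last frame reads column 3 of `text` for every row, so each row needs at least four token columns, while the error reports that the second dimension has size 1. A minimal sketch of a shape guard that would surface this earlier, using plain Python lists in place of tensors. `require_columns` is a hypothetical helper for illustration, not part of FunASR or SenseVoice:

```python
def require_columns(rows, needed):
    """Raise a descriptive error if any row has fewer than `needed` entries."""
    for i, row in enumerate(rows):
        if len(row) < needed:
            raise ValueError(
                f"row {i} has {len(row)} column(s), but column index "
                f"{needed - 1} is read downstream"
            )
    return rows


# A batch shaped like the failing one: one row holding a single token.
# The id 0 is a made-up placeholder.
bad_batch = [[0]]
try:
    require_columns(bad_batch, needed=4)
    guard_message = ""
except ValueError as err:
    guard_message = str(err)
print(guard_message)
```

Calling a check like this on the batch before the model's forward pass would point at the data pipeline rather than at `encode`.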

The dataset is the example data; after conversion with sensevoice2jsonl it looks like this:
{"key": "BAC009S0764W0121", "source": "/mnt/home/sensevoice/data_example/voice/BAC009S0764W0121.wav", "source_len": 420, "target": "甚至出现交易几乎停滞的情况", "target_len": 13, "with_or_wo_itn": "<|woitn|>", "text_language": "<|zh|>", "emo_target": "<|NEUTRAL|>", "event_target": "<|Speech|>"}
{"key": "BAC009S0916W0489", "source": "/mnt/home/sensevoice/data_example/voice/BAC009S0916W0489.wav", "source_len": 573, "target": "湖北一公司以员工名义贷款数十员工负债千万", "target_len": 20, "with_or_wo_itn": "<|woitn|>", "text_language": "<|zh|>", "emo_target": "<|NEUTRAL|>", "event_target": "<|Speech|>"}
{"key": "asr_example_cn_en", "source": "/mnt/home/sensevoice/data_example/voice/asr_example_cn_en.wav", "source_len": 1474, "target": "所有只要处理 data 不管你是做 machine learning 做 deep learning 做 data analytics 做 data science 也好 scientist 也好通通都要都做的基本功啊那 again 先先对有一些也许对", "target_len": 19, "with_or_wo_itn": "<|woitn|>", "text_language": "<|zh|>", "emo_target": "<|NEUTRAL|>", "event_target": "<|Speech|>"}
{"key": "ID0012W0014", "source": "/mnt/home/sensevoice/data_example/voice/asr_example_en.wav", "source_len": 222, "target": "he tried to think how it could be", "target_len": 8, "with_or_wo_itn": "<|woitn|>", "text_language": "<|en|>", "emo_target": "<|EMO_UNKNOWN|>", "event_target": "<|Speech|>"}
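
Before training, it may help to confirm that every line of the converted file carries the same fields as the records above. A minimal sketch using only the standard library; the field list is copied from these sample records, and treating all of them as required is an assumption, not something stated in the FunASR documentation:

```python
import json

# Field names copied from the sample records in this issue; treating all of
# them as required is an assumption made for this sketch.
EXPECTED_FIELDS = (
    "key", "source", "source_len", "target", "target_len",
    "with_or_wo_itn", "text_language", "emo_target", "event_target",
)


def find_incomplete_records(lines):
    """Return (line_number, missing_fields) for records lacking any field."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        missing = [name for name in EXPECTED_FIELDS if name not in record]
        if missing:
            problems.append((number, missing))
    return problems


# Two made-up lines: the first is complete, the second lacks most fields.
sample = [
    '{"key": "a", "source": "a.wav", "source_len": 420, "target": "x", '
    '"target_len": 1, "with_or_wo_itn": "<|woitn|>", "text_language": "<|zh|>", '
    '"emo_target": "<|NEUTRAL|>", "event_target": "<|Speech|>"}',
    '{"key": "b", "source": "b.wav", "target": "y"}',
]
print(find_incomplete_records(sample))
```

For a real file, pass an open file object (for example `open(path, encoding="utf-8")`) instead of the `sample` list.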

Code sample

Expected behavior

Environment

  • OS (e.g., Linux):
  • FunASR Version (e.g., 1.0.0):
  • ModelScope Version (e.g., 1.11.0):
  • PyTorch Version (e.g., 2.0.0):
  • How you installed funasr (pip, source):
  • Python version:
  • GPU (e.g., V100M32)
  • CUDA/cuDNN version (e.g., cuda11.7):
  • Docker version (e.g., funasr-runtime-sdk-cpu-0.4.1)
  • Any other relevant information:

Additional context

eatoncys added the bug label Nov 1, 2024
@JonneryR

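I ran into this problem too.
The code appears to expect the text to be padded, but so far I have not found where that padding code lives; it looks like you have to add it yourself.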

qiuqiu-879 commented Nov 18, 2024 via email

@JonneryR

Problem solved. It was a dataset selection issue: you need to select SenseVoiceCTCDataset, because only that dataset contains the code that pads the front of the text.
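
To illustrate what "padding the front of the text" could mean here, the following is a hypothetical sketch in plain Python. It is not the SenseVoiceCTCDataset implementation: the function name, the slot order, and every token id are assumptions. The only detail taken from the thread is that the model reads column 3 of `text` as the text-normalization style.

```python
def prepend_prefix(target_ids, language_id, emotion_id, event_id, textnorm_id):
    """Return token ids with four prefix slots ahead of the transcript."""
    # The slot order is an assumption; only "column 3 holds the text-norm
    # style" is implied by the traceback in this thread.
    return [language_id, emotion_id, event_id, textnorm_id] + list(target_ids)


# All ids below are made-up placeholders, not real vocabulary entries.
row = prepend_prefix([101, 102, 103], language_id=1, emotion_id=2,
                     event_id=3, textnorm_id=4)
print(row[3])    # the value a `text[:, 3]`-style lookup would read
print(len(row))  # four prefix slots plus three transcript tokens
```

With such a prefix in place, every row has at least four columns, so an index of 3 is always in range.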

sadlay commented Nov 28, 2024

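> Problem solved. It was a dataset selection issue: you need to select SenseVoiceCTCDataset, because only that dataset contains the code that pads the front of the text.

Below are my parameters. It looks like mine is already SenseVoiceCTCDataset, but I still get the error.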

[2024-11-28 19:33:47,484][root][INFO] - kwargs: {'encoder': 'SenseVoiceEncoderSmall', 'encoder_conf': {'output_size': 512, 'attention_heads': 4, 'linear_units': 2048, 'num_blocks': 50, 'tp_blocks': 20, 'dropout_rate': 0.1, 'positional_dropout_rate': 0.1, 'attention_dropout_rate': 0.1, 'input_layer': 'pe', 'pos_enc_class': 'SinusoidalPositionEncoder', 'normalize_before': True, 'kernel_size': 11, 'sanm_shfit': 0, 'selfattention_layer_type': 'sanm'}, 'model': 'SenseVoiceSmall', 'model_conf': {'length_normalized_loss': True, 'sos': 1, 'eos': 2, 'ignore_id': -1}, 'tokenizer': 'SentencepiecesTokenizer', 'tokenizer_conf': {'bpemodel': '/home/zhaogengs/.cache/modelscope/hub/iic/SenseVoiceSmall/chn_jpn_yue_eng_ko_spectok.bpe.model', 'unk_symbol': '<unk>', 'split_with_space': True}, 'frontend': 'WavFrontend', 'frontend_conf': {'fs': 16000, 'window': 'hamming', 'n_mels': 80, 'frame_length': 25, 'frame_shift': 10, 'lfr_m': 7, 'lfr_n': 6, 'cmvn_file': '/home/zhaogengs/.cache/modelscope/hub/iic/SenseVoiceSmall/am.mvn'}, 'dataset': 'SenseVoiceCTCDataset', 'dataset_conf': {'index_ds': 'IndexDSJsonl', 'batch_sampler': 'BatchSampler', 'data_split_num': 1, 'batch_type': 'token', 'batch_size': 100, 'max_token_length': 2000, 'min_token_length': 60, 'max_source_length': 2000, 'min_source_length': 60, 'max_target_length': 200, 'min_target_length': 0, 'shuffle': True, 'num_workers': 4, 'sos': 1, 'eos': 2, 'IndexDSJsonl': 'IndexDSJsonl', 'retry': 20, 'sort_size': 1024}, 'train_conf': {'accum_grad': 1, 'grad_clip': 5, 'max_epoch': 50, 'keep_nbest_models': 20, 'avg_nbest_model': 10, 'log_interval': 1, 'resume': True, 'validate_interval': 2000, 'save_checkpoint_interval': 2000, 'use_deepspeed': False, 'deepspeed_config': '/home/zhaogengs/workspace/SenseVoice/deepspeed_conf/ds_stage1.json'}, 'optim': 'adamw', 'optim_conf': {'lr': 0.0002}, 'scheduler': 'warmuplr', 'scheduler_conf': {'warmup_steps': 25000}, 'specaug': 'SpecAugLFR', 'specaug_conf': {'apply_time_warp': False, 'time_warp_window': 
5, 'time_warp_mode': 'bicubic', 'apply_freq_mask': True, 'freq_mask_width_range': [0, 30], 'lfr_rate': 6, 'num_freq_mask': 1, 'apply_time_mask': True, 'time_mask_width_range': [0, 12], 'num_time_mask': 1}, 'init_param': '/home/zhaogengs/.cache/modelscope/hub/iic/SenseVoiceSmall/model.pt', 'config': '/home/zhaogengs/.cache/modelscope/hub/iic/SenseVoiceSmall/config.yaml', 'is_training': True, 'trust_remote_code': True, 'train_data_set_list': '/home/zhaogengs/workspace/SenseVoice/train_data/output/train.jsonl', 'valid_data_set_list': '/home/zhaogengs/workspace/SenseVoice/train_data/output/val.jsonl', 'output_dir': './outputs', 'model_path': '/home/zhaogengs/.cache/modelscope/hub/iic/SenseVoiceSmall', 'device': 'cpu'}
[2024-11-28 19:33:47,484][root][INFO] - config.yaml is saved to: ./outputs/config.yaml
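
Both runs in this thread log `ERROR: data is empty!` shortly before the crash, so one thing worth ruling out is that every record is being dropped by the length limits in `dataset_conf`. The sketch below assumes the filter is a simple inclusive range check on `source_len` and `target_len`; that is a guess about FunASR's behaviour and is not confirmed anywhere in this thread. `surviving_records` is a hypothetical helper.

```python
def surviving_records(records, conf):
    """Keep the keys of records whose lengths fall inside the configured bounds."""
    kept = []
    for rec in records:
        source_ok = (conf["min_source_length"] <= rec["source_len"]
                     <= conf["max_source_length"])
        target_ok = (conf["min_target_length"] <= rec["target_len"]
                     <= conf["max_target_length"])
        if source_ok and target_ok:
            kept.append(rec["key"])
    return kept


# Bounds copied from the dataset_conf printed in the log above.
conf = {"min_source_length": 60, "max_source_length": 2000,
        "min_target_length": 0, "max_target_length": 200}

# Lengths copied from two sample records in the original post, plus one
# made-up record that is shorter than min_source_length.
records = [
    {"key": "BAC009S0764W0121", "source_len": 420, "target_len": 13},
    {"key": "ID0012W0014", "source_len": 222, "target_len": 8},
    {"key": "too_short_example", "source_len": 30, "target_len": 2},
]
print(surviving_records(records, conf))
```

If a check like this returns an empty list for the real training file, the limits or the data lengths would be the first place to look.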

The error message is as follows:


[2024-11-28 19:33:47,801][root][INFO] - rank: 0, dataloader start from step: 0, batch_num: 2, after: 2
[2024-11-28 19:33:47,858][root][INFO] - rank: 0, dataloader start from step: 0, batch_num: 2, after: 2
[2024-11-28 19:33:50,194][root][ERROR] - ERROR: data is empty!
[2024-11-28 19:33:50,328][root][ERROR] - ERROR: data is empty!
Error executing job with overrides: ['++model=iic/SenseVoiceSmall', '++trust_remote_code=true', '++train_data_set_list=/home/zhaogengs/workspace/SenseVoice/train_data/output/train.jsonl', '++valid_data_set_list=/home/zhaogengs/workspace/SenseVoice/train_data/output/val.jsonl', '++dataset_conf.data_split_num=1', '++dataset_conf.batch_sampler=BatchSampler', '++dataset_conf.batch_size=100', '++dataset_conf.sort_size=1024', '++dataset_conf.batch_type=token', '++dataset_conf.num_workers=4', '++train_conf.max_epoch=50', '++train_conf.log_interval=1', '++train_conf.resume=true', '++train_conf.validate_interval=2000', '++train_conf.save_checkpoint_interval=2000', '++train_conf.keep_nbest_models=20', '++train_conf.avg_nbest_model=10', '++train_conf.use_deepspeed=false', '++train_conf.deepspeed_config=/home/zhaogengs/workspace/SenseVoice/deepspeed_conf/ds_stage1.json', '++optim_conf.lr=0.0002', '++output_dir=./outputs']
Traceback (most recent call last):
  File "/home/zhaogengs/workspace/SenseVoice/FunASR/funasr/bin/train_ds.py", line 225, in <module>
    main_hydra()
  File "/home/zhaogengs/miniconda3/envs/SenseVoice/lib/python3.12/site-packages/hydra/main.py", line 94, in decorated_main
    _run_hydra(
  File "/home/zhaogengs/miniconda3/envs/SenseVoice/lib/python3.12/site-packages/hydra/_internal/utils.py", line 394, in _run_hydra
    _run_app(
  File "/home/zhaogengs/miniconda3/envs/SenseVoice/lib/python3.12/site-packages/hydra/_internal/utils.py", line 457, in _run_app
    run_and_report(
  File "/home/zhaogengs/miniconda3/envs/SenseVoice/lib/python3.12/site-packages/hydra/_internal/utils.py", line 223, in run_and_report
    raise ex
  File "/home/zhaogengs/miniconda3/envs/SenseVoice/lib/python3.12/site-packages/hydra/_internal/utils.py", line 220, in run_and_report
    return func()
           ^^^^^^
  File "/home/zhaogengs/miniconda3/envs/SenseVoice/lib/python3.12/site-packages/hydra/_internal/utils.py", line 458, in <lambda>
    lambda: hydra.run(
            ^^^^^^^^^^
  File "/home/zhaogengs/miniconda3/envs/SenseVoice/lib/python3.12/site-packages/hydra/_internal/hydra.py", line 132, in run
    _ = ret.return_value
        ^^^^^^^^^^^^^^^^
  File "/home/zhaogengs/miniconda3/envs/SenseVoice/lib/python3.12/site-packages/hydra/core/utils.py", line 260, in return_value
    raise self._return_value
  File "/home/zhaogengs/miniconda3/envs/SenseVoice/lib/python3.12/site-packages/hydra/core/utils.py", line 186, in run_job
    ret.return_value = task_function(task_cfg)
                       ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zhaogengs/workspace/SenseVoice/FunASR/funasr/bin/train_ds.py", line 56, in main_hydra
    main(**kwargs)
  File "/home/zhaogengs/workspace/SenseVoice/FunASR/funasr/bin/train_ds.py", line 173, in main
    trainer.train_epoch(
  File "/home/zhaogengs/workspace/SenseVoice/FunASR/funasr/train_utils/trainer_ds.py", line 603, in train_epoch
    self.forward_step(model, batch, loss_dict=loss_dict)
  File "/home/zhaogengs/workspace/SenseVoice/FunASR/funasr/train_utils/trainer_ds.py", line 670, in forward_step
    retval = model(**batch)
             ^^^^^^^^^^^^^^
  File "/home/zhaogengs/miniconda3/envs/SenseVoice/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zhaogengs/miniconda3/envs/SenseVoice/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zhaogengs/workspace/SenseVoice/model.py", line 680, in forward
    encoder_out, encoder_out_lens = self.encode(speech, speech_lengths, text)
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zhaogengs/workspace/SenseVoice/model.py", line 733, in encode
    styles = torch.LongTensor([[self.textnorm_int_dict[int(style)]] for style in text[:, 3]]).to(speech.device)
                                                                                 ~~~~^^^^^^
IndexError: index 3 is out of bounds for dimension 1 with size 1
E1128 19:33:55.141000 140491117365056 torch/distributed/elastic/multiprocessing/api.py:826] failed (exitcode: 1) local_rank: 0 (pid: 2095732) of binary: /home/zhaogengs/miniconda3/envs/SenseVoice/bin/python
Traceback (most recent call last):
  File "/home/zhaogengs/miniconda3/envs/SenseVoice/bin/torchrun", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/zhaogengs/miniconda3/envs/SenseVoice/lib/python3.12/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 347, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/home/zhaogengs/miniconda3/envs/SenseVoice/lib/python3.12/site-packages/torch/distributed/run.py", line 879, in main
    run(args)
  File "/home/zhaogengs/miniconda3/envs/SenseVoice/lib/python3.12/site-packages/torch/distributed/run.py", line 870, in run
    elastic_launch(
  File "/home/zhaogengs/miniconda3/envs/SenseVoice/lib/python3.12/site-packages/torch/distributed/launcher/api.py", line 132, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zhaogengs/miniconda3/envs/SenseVoice/lib/python3.12/site-packages/torch/distributed/launcher/api.py", line 263, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
/home/zhaogengs/workspace/SenseVoice/FunASR/funasr/bin/train_ds.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2024-11-28_19:33:55
  host      : 172-16-158-67-Debian-22
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 2095732)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================

sadlay commented Dec 2, 2024

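> Problem solved. It was a dataset selection issue: you need to select SenseVoiceCTCDataset, because only that dataset contains the code that pads the front of the text.

Could you explain how exactly you solved it?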
