IndexError in DataLoader Worker Process with Custom Dataset #48

yulrio · 2024-08-16T13:26:11Z

Hello,

I'm currently using your code from the repository [insert repository name] with my own dataset, but I'm encountering an IndexError during the training phase. Below is the traceback I received:

[ Fri Aug 16 10:18:36 2024 ] Parameters:
{'work_dir': './work_dir/baseline_res18/', 'config': './configs/baseline.yaml', 'random_fix': True, 'device': '3', 'phase': 'train', 'save_interval': 5, 'random_seed': 0, 'eval_interval': 1, 'print_log': True, 'log_interval': 50, 'evaluate_tool': 'sclite', 'feeder': 'dataset.dataloader_video.BaseFeeder', 'dataset': 'QSLR2024', 'dataset_info': {'dataset_root': './dataset/QSLR2024', 'dict_path': './preprocess/QSLR2024/gloss_dict.npy', 'evaluation_dir': './evaluation/slr_eval', 'evaluation_prefix': 'QSLR2024-groundtruth'}, 'num_worker': 10, 'feeder_args': {'mode': 'test', 'datatype': 'video', 'num_gloss': -1, 'drop_ratio': 1.0, 'prefix': './dataset/QSLR2024', 'transform_mode': False}, 'model': 'slr_network.SLRModel', 'model_args': {'num_classes': 65, 'c2d_type': 'resnet18', 'conv_type': 2, 'use_bn': 1, 'share_classifier': False, 'weight_norm': False}, 'load_weights': None, 'load_checkpoints': None, 'decode_mode': 'beam', 'ignore_weights': [], 'batch_size': 2, 'test_batch_size': 8, 'loss_weights': {'SeqCTC': 1.0}, 'optimizer_args': {'optimizer': 'Adam', 'base_lr': 0.0001, 'step': [20, 35], 'learning_ratio': 1, 'weight_decay': 0.0001, 'start_epoch': 0, 'nesterov': False}, 'num_epoch': 30}

0%| | 0/162 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/raid/data/m33221012/VAC_CSLR_QSLR/main.py", line 213, in <module>
    processor.start()
  File "/raid/data/m33221012/VAC_CSLR_QSLR/main.py", line 44, in start
    seq_train(self.data_loader['train'], self.model, self.optimizer,
  File "/raid/data/m33221012/VAC_CSLR_QSLR/seq_scripts.py", line 18, in seq_train
    for batch_idx, data in enumerate(tqdm(loader)):
  File "/home/m33221012/miniconda3/envs/py31012/lib/python3.10/site-packages/tqdm/std.py", line 1181, in __iter__
    for obj in iterable:
  File "/home/m33221012/miniconda3/envs/py31012/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
    data = self._next_data()
  File "/home/m33221012/miniconda3/envs/py31012/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1344, in _next_data
    return self._process_data(data)
  File "/home/m33221012/miniconda3/envs/py31012/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1370, in _process_data
    data.reraise()
  File "/home/m33221012/miniconda3/envs/py31012/lib/python3.10/site-packages/torch/_utils.py", line 706, in reraise
    raise exception
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/m33221012/miniconda3/envs/py31012/lib/python3.10/site-packages/torch/utils/data/_utils/worker.py", line 309, in _worker_loop
    data = fetcher.fetch(index) # type: ignore[possibly-undefined]
  File "/home/m33221012/miniconda3/envs/py31012/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/m33221012/miniconda3/envs/py31012/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 52, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/raid/data/m33221012/VAC_CSLR_QSLR/dataset/dataloader_video.py", line 48, in __getitem__
    input_data, label = self.normalize(input_data, label)
  File "/raid/data/m33221012/VAC_CSLR_QSLR/dataset/dataloader_video.py", line 80, in normalize
    video, label = self.data_aug(video, label, file_id)
  File "/raid/data/m33221012/VAC_CSLR_QSLR/utils/video_augmentation.py", line 24, in __call__
    image = t(image)
  File "/raid/data/m33221012/VAC_CSLR_QSLR/utils/video_augmentation.py", line 119, in __call__
    if isinstance(clip[0], np.ndarray):
IndexError: list index out of range

It seems the issue occurs within the video_augmentation.py script when accessing clip[0]. I suspect it might be related to the data augmentation process or the input data structure.

Since I'm using my own dataset, could you please let me know what specific adjustments or preprocessing steps are necessary to ensure compatibility with your code? Additionally, is there a possibility that this error is related to hardware settings, such as GPU configuration or memory limitations?

Any advice on how to resolve this error and properly integrate my dataset would be greatly appreciated.

Thank you in advance for your help!

RafaelAmauri · 2024-09-26T00:00:20Z

Did you run the preprocessing script on your training data before training? I was having this issue too when using a custom dataset, but after running the pre-processing script it worked out fine.

yulrio · 2024-09-29T09:04:13Z

Thank you for replying to my question.
May I know the configuration of the .yaml file?
Thanks in advance.

RafaelAmauri · 2024-09-29T13:52:33Z

I am using the default values. I haven't changed any configs.

Onestringlab · 2024-10-09T01:26:54Z

I just ran the following command:

!python main.py --load-weights resnet18_baseline_dev_23.80_epoch25_model.pt --phase test --device 0

and got the following result:

Loading model finished.
Loading data
train 5671
Apply training transform.

train 5671
Apply testing transform.

dev 540
Apply testing transform.

test 629
Apply testing transform.

Loading data finished.
Working tree is dirty. Patch:
diff --git a/.gitignore b/.gitignore
old mode 100755
new mode 100644

[ Tue Oct  8 22:35:41 2024 ] Model: slr_network.SLRModel.
[ Tue Oct  8 22:35:42 2024 ] Weights: /content/drive/MyDrive/MyResearch/pretrain/resnet18_baseline_dev_23.80_epoch25_model.pt.
100% 68/68 [1:09:24<00:00, 61.24s/it]
/content/drive/MyDrive/MyResearch/VAC_CSLR_ORI_OSL
preprocess.sh ./work_dir/baseline_res18/output-hypothesis-dev-conv.ctm ./work_dir/baseline_res18/tmp.ctm ./work_dir/baseline_res18/tmp2.ctm
Tue Oct 8 11:45:07 PM UTC 2024
Preprocess Finished.
Unexpected error: <class 'AttributeError'>
[ Tue Oct  8 23:45:07 2024 ] Epoch 6667, dev 100.00%
100% 79/79 [1:15:47<00:00, 57.56s/it]
/content/drive/MyDrive/MyResearch/VAC_CSLR_ORI_OSL
preprocess.sh ./work_dir/baseline_res18/output-hypothesis-test-conv.ctm ./work_dir/baseline_res18/tmp.ctm ./work_dir/baseline_res18/tmp2.ctm
Wed Oct 9 01:00:55 AM UTC 2024
Preprocess Finished.
Unexpected error: <class 'AttributeError'>
[ Wed Oct  9 01:00:55 2024 ] Epoch 6667, test 100.00%
[ Wed Oct  9 01:00:55 2024 ] Evaluation Done.

Can you explain why the error Unexpected error: <class 'AttributeError'> occurred and which part of the code needs to be corrected?

Also, why did I get 100% for both dev and test?

Thanks in advance!

RafaelAmauri · 2024-10-30T04:01:12Z

  File "/raid/data/m33221012/VAC_CSLR_QSLR/dataset/dataloader_video.py", line 48, in __getitem__
    input_data, label = self.normalize(input_data, label)
  File "/raid/data/m33221012/VAC_CSLR_QSLR/dataset/dataloader_video.py", line 80, in normalize
    video, label = self.data_aug(video, label, file_id)
  File "/raid/data/m33221012/VAC_CSLR_QSLR/utils/video_augmentation.py", line 24, in __call__
    image = t(image)
  File "/raid/data/m33221012/VAC_CSLR_QSLR/utils/video_augmentation.py", line 119, in __call__
    if isinstance(clip[0], np.ndarray):
IndexError: list index out of range

Just in case anyone else runs into this: the error happens when the dataloader fails to load any frames, so the augmentation step receives an empty clip and clip[0] raises the IndexError. I just hit it again because my dataset was laid out as dataset/features/train, test, dev. I had forgotten the 'fullFrame-256x256px' folder that belongs directly under features, so the dataloader could not find the train/test/dev folders. It is hard-coded to look specifically for a fullFrame-256x256px folder, and when it can't find one, nothing is loaded.

In other words, make sure the structure of your custom dataset exactly matches the one found inside phoenix2014. Any deviation could break the training script.
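
A quick way to catch this before training is to check the folder layout up front. The sketch below is only an illustration: the helper name check_layout is made up, the fullFrame-256x256px folder name comes from my layout described above, and the assumption that each split holds image frames (png/jpg) somewhere below it may not match your dataset.

```python
from pathlib import Path

# Folder name taken from the comment above; the rest of the layout is an assumption.
FRAME_DIR = "fullFrame-256x256px"
SPLITS = ("train", "dev", "test")
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def check_layout(dataset_root, frame_dir=FRAME_DIR, splits=SPLITS):
    """Return a list of problems found; an empty list means the layout looks fine."""
    problems = []
    base = Path(dataset_root) / "features" / frame_dir
    if not base.is_dir():
        # Without this folder none of the splits can be found, so stop here.
        problems.append(f"missing folder: {base}")
        return problems
    for split in splits:
        split_dir = base / split
        if not split_dir.is_dir():
            problems.append(f"missing folder: {split_dir}")
            continue
        # Assumed: frames are image files stored somewhere below the split folder.
        has_frames = any(
            p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
            for p in split_dir.rglob("*")
        )
        if not has_frames:
            problems.append(f"no image frames under: {split_dir}")
    return problems
```

Calling check_layout("./dataset/QSLR2024") (or whatever your dataset root is) and printing the returned list should point at the missing folder directly, instead of surfacing later as an IndexError inside a DataLoader worker.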

RafaelAmauri · 2024-10-30T04:10:14Z

I don't know how to fix the AttributeError, but you get 100% WER on the dev and test splits because the evaluation needs an 'evaluation' folder in the same folder as the main VAC code. Inside this evaluation folder you need the .stm files with the ground truth for the dev and test splits.

Luckily, the preprocessing step generates these automatically. After you run the preprocessing step, you should see a new folder created inside the preprocess folder with the name of your dataset. There you will find the .stm files with the groundtruth.

The phoenix dataset comes with this evaluation folder by default with a bunch of different files, not only the .stm files, so I don't know if it's only the .stms that you need or if you need the rest too. What I did was copy the entire 'evaluation' folder from phoenix and just replaced the .stms that come with phoenix with the ones generated by the preprocessing script for my custom dataset.
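
To see whether the ground-truth files are where the evaluation expects them, something like the sketch below may help. It is a guess at the naming scheme, not taken from the repository: I am assuming the files are named <evaluation_dir>/<evaluation_prefix>-<mode>.stm, using the evaluation_dir and evaluation_prefix settings that appear in the parameter dump at the top of this thread. The helper name missing_groundtruth is made up.

```python
from pathlib import Path


def missing_groundtruth(evaluation_dir, evaluation_prefix, modes=("dev", "test")):
    """List the ground-truth .stm files that are expected but absent."""
    # Assumed naming scheme: <evaluation_dir>/<evaluation_prefix>-<mode>.stm
    expected = [
        Path(evaluation_dir) / f"{evaluation_prefix}-{mode}.stm" for mode in modes
    ]
    return [path for path in expected if not path.is_file()]


if __name__ == "__main__":
    # Values copied from the parameter dump in the first post; adjust to your config.
    for path in missing_groundtruth("./evaluation/slr_eval", "QSLR2024-groundtruth"):
        print(f"missing ground truth: {path}")
```

If this prints anything, copy the .stm files generated by the preprocessing step into the evaluation folder and rerun the test phase.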

Good luck!

Onestringlab · 2024-10-31T06:24:46Z

Thank you for the answer.

Could you let me know which version of PyTorch you used for these experiments?

Thanks again!

RafaelAmauri · 2024-11-01T01:58:42Z

I'm using Python 3.8.10 and PyTorch 1.13.1.
