Hello,

I'm training the code in Docker, using `pytorch/pytorch:1.10.0-cuda11.3-cudnn8-devel` as the base image with Python 3.8. Setup worked fine, but when I tried to train the model, these errors came out:
```
/opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/cuda/IndexKernel.cu:91: operator(): block: [33,0,0], thread: [57,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/cuda/IndexKernel.cu:91: operator(): block: [16,0,0], thread: [52,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/cuda/IndexKernel.cu:91: operator(): block: [18,0,0], thread: [0,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/cuda/IndexKernel.cu:91: operator(): block: [16,0,0], thread: [93,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/cuda/IndexKernel.cu:91: operator(): block: [17,0,0], thread: [75,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/cuda/IndexKernel.cu:91: operator(): block: [17,0,0], thread: [87,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/cuda/IndexKernel.cu:91: operator(): block: [17,0,0], thread: [99,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/cuda/IndexKernel.cu:91: operator(): block: [17,0,0], thread: [110,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/cuda/IndexKernel.cu:91: operator(): block: [17,0,0], thread: [126,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/cuda/IndexKernel.cu:91: operator(): block: [18,0,0], thread: [92,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.

epochs:   0%|          | 0/6 [00:13<?, ?it/s]
Traceback (most recent call last):
  File "Models/SFD/tools/train.py", line 212, in <module>
    main()
  File "Models/SFD/tools/train.py", line 167, in main
    train_model(
  File "/workspace/Models/SFD/tools/train_utils/train_utils.py", line 86, in train_model
    accumulated_iter = train_one_epoch(
  File "/workspace/Models/SFD/tools/train_utils/train_utils.py", line 38, in train_one_epoch
    loss, tb_dict, disp_dict = model_func(model, batch)
  File "/workspace/OpenPCDet/pcdet/models/__init__.py", line 44, in model_func
    ret_dict, tb_dict, disp_dict = model(batch_dict)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/workspace/Models/SFD/pcdet_extensions/models/detectors/sfd.py", line 11, in forward
    batch_dict = cur_module(batch_dict)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/workspace/Models/SFD/pcdet_extensions/models/roi_heads/sfd_head.py", line 595, in forward
    self.roicrop3d_gpu(batch_dict, self.model_cfg.ROI_POINT_CROP.POOL_EXTRA_WIDTH)
  File "/workspace/Models/SFD/pcdet_extensions/models/roi_heads/sfd_head.py", line 554, in roicrop3d_gpu
    image[total_pts_features[:,7].long(), total_pts_features[:,6].long()] = global_index.to(device=total_pts_features.device)
RuntimeError: CUDA error: device-side assert triggered
```
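In case it matters: device-side asserts are reported asynchronously, so the traceback above may not point at the exact kernel that failed. One way to pin it down (assuming the environment variable is set before the first CUDA call) is:

```python
# Force synchronous kernel launches so the device-side assert is raised
# at the actual call site. Must be set before torch initializes CUDA.
import os
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch  # imported after setting the variable so it takes effect
```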
It looks like there is an indexing error in the function `roicrop3d_gpu`. Could you please take a look? I've been stuck on this for a few days already. Thank you!
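For what it's worth, the failing line writes `global_index` into `image` at per-point pixel coordinates, so the assert suggests some coordinates fall outside the image. A sketch of the kind of check I mean, assuming `image` is a 2-D `H x W` tensor and columns 7/6 of `total_pts_features` hold the row/column indices, exactly as in line 554 (names here are hypothetical, not from the repo):

```python
import torch

def check_pixel_indices(image: torch.Tensor, total_pts_features: torch.Tensor) -> None:
    """Hypothetical sanity check: verify the row/col indices used in
    image[rows, cols] = ... stay inside the image bounds."""
    h, w = image.shape[:2]
    rows = total_pts_features[:, 7].long()  # column 7 indexes the first image dim
    cols = total_pts_features[:, 6].long()  # column 6 indexes the second image dim
    bad = (rows < 0) | (rows >= h) | (cols < 0) | (cols >= w)
    if bad.any():
        print(f"{bad.sum().item()} / {bad.numel()} points out of bounds; "
              f"rows in [{rows.min().item()}, {rows.max().item()}], "
              f"cols in [{cols.min().item()}, {cols.max().item()}], "
              f"image is {h} x {w}")
```

Calling something like this right before line 554 should show whether the projected coordinates exceed the image size, which would explain the `index out of bounds` asserts.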