[DCNv4 ERROR] cuda op error when use internimage-L and internimage-LX with DCNv4, however internimage-B works well with DCNv4 #280

dou3516 · 2024-02-01T03:10:46Z

A CUDA op error occurs when using internimage-L and internimage-LX with DCNv4, whereas internimage-B works well with DCNv4. What is wrong?

Environments:
DCNv4: build from https://github.com/OpenGVLab/DCNv4/tree/main/DCNv4_op/make.sh
DCNv3: build from https://github.com/OpenGVLab/InternImage/tree/master/segmentation/ops_dcnv3/make.sh

internimage-L config:

    backbone=dict(
        _delete_=True,
        type='InternImage',
        core_op='DCNv3',
        channels=160,
        depths=[5, 5, 22, 5],
        groups=[10, 20, 40, 80],
        mlp_ratio=4.,
        drop_path_rate=0.5,
        norm_layer='LN',
        layer_scale=1.0,
        offset_scale=2.0,
        post_norm=True,
        with_cp=False,
        out_indices=(0, 1, 2, 3),
        dcn_output_bias=True,  # dcnv4
        mlp_fc2_bias=True,  # dcnv4
        dw_kernel_size=3,  # dcnv4
        use_dcn_v4_op=use_dcn_v4_op,  # dcnv4
        init_cfg=dict(type='Pretrained', checkpoint=pretrained)),

error log:

error in dcnv4_im2col_cuda: invalid configuration argument
launch arguments: gridDim=(1568, 1, 1), blockDim=(16, 80, 1), shm_size=5760
...
...
  File "/home/miniconda3/envs/dcnv4/lib/python3.9/site-packages/DCNv4-1.0.0.post2-py3.9-linux-x86_64.egg/DCNv4/functions/dcnv4_func.py", line 125, in backward
    ext.dcnv4_backward(*args)
RuntimeError: falseINTERNAL ASSERT FAILED at "/home/dbc/AIcode/DL/SS/mmsegmentation-dev1.x/DCNv4_op/src/cuda/dcnv4_col2im_cuda.cuh":470, please report a bug to PyTorch. kernel launch error
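
One way to read the log line above is to compare the reported launch arguments with two general CUDA limits: at most 1024 threads per block, and 48 KB of shared memory per block unless the kernel opts in to more. The sketch below does only that arithmetic. The helper names are hypothetical, and it does not inspect the DCNv4 kernel source, so treat the result as a hint about where to look rather than a confirmed root cause.

```python
from math import prod

# General CUDA limits (not DCNv4-specific):
MAX_THREADS_PER_BLOCK = 1024          # compute capability 2.0 and later
DEFAULT_MAX_SHARED_BYTES = 48 * 1024  # per block, without opting in to more


def threads_per_block(block_dim):
    """Total number of threads requested by a blockDim tuple."""
    return prod(block_dim)


def exceeds_thread_limit(block_dim, limit=MAX_THREADS_PER_BLOCK):
    """True if the block requests more threads than CUDA allows."""
    return threads_per_block(block_dim) > limit


# Values copied from the error log in this issue.
logged_block_dim = (16, 80, 1)
logged_shm_size = 5760

print(threads_per_block(logged_block_dim))         # 1280
print(exceeds_thread_limit(logged_block_dim))      # True
print(logged_shm_size > DEFAULT_MAX_SHARED_BYTES)  # False
```

By this check the logged block asks for 16 x 80 = 1280 threads, which is over the 1024 limit and would be consistent with "invalid configuration argument", while the logged shared memory size is well under 48 KB. The second block dimension equals the largest entry of `groups` in the config (80). If that is not a coincidence, and assuming internimage-B's largest group count is 56 (a value not stated in this thread), the same check on `(16, 56, 1)` gives 896 threads and passes, which would match the report that internimage-B works.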

zhiqi-li · 2024-03-11T09:03:04Z

Hi, what is the shape of your input tensor? Since DCNv4 uses shared memory to store tensors, tensors with extremely large shapes will cause errors.

dou3516 · 2024-03-26T07:23:42Z

> Hi, what is the shape of your input tensor? Since DCNv4 uses shared memory to store tensors, tensors with extremely large shapes will cause errors.

B x C x H x W = 8 x 3 x 448 x 448
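
To relate this input shape to the launch arguments in the error log, the per-stage feature-map sizes can be worked out from the config. The sketch below is a rough calculation, not taken from the InternImage code: it assumes the four stages run at strides 4, 8, 16, and 32 and that the channel width doubles at each stage, and `stage_summary` is a hypothetical helper.

```python
def stage_summary(batch, height, width, channels, groups,
                  strides=(4, 8, 16, 32)):
    """Per-stage sizes, assuming the given strides and channel doubling."""
    rows = []
    for i, (g, s) in enumerate(zip(groups, strides)):
        c = channels * 2 ** i          # assumed: width doubles per stage
        h, w = height // s, width // s
        rows.append({
            "stage": i,
            "channels": c,
            "groups": g,
            "channels_per_group": c // g,
            "positions": batch * h * w,  # batch x spatial locations
        })
    return rows


# Input shape from this comment, channels and groups from the config above.
for row in stage_summary(8, 448, 448, channels=160, groups=[10, 20, 40, 80]):
    print(row)
```

Under these assumptions the last stage has 8 x 14 x 14 = 1568 positions, 80 groups, and 16 channels per group, which lines up numerically with `gridDim=(1568, 1, 1)` and `blockDim=(16, 80, 1)` in the log. That suggests the failing launch comes from the final stage rather than from an unusually large input, although the thread does not confirm how the kernel maps work to blocks.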

SepJourney mentioned this issue Feb 1, 2024

flash_internimage_large problem OpenGVLab/DCNv4#26

Open
