[BUG]: Memory leak in fp8 kernel launch #51

@ZhangZhiPku

cuTile Version

1.0.0

CUDA Version

13.1

Which installation method(s) does this occur on?

Pip

Describe the bug.

Launching a kernel with an FP8 input tensor appears to leak PyTorch memory: the memory occupied by the input tensor is not released after the launch.

No leak has been observed when the input tensor uses any other data type.

Test Environment:
GPU: RTX 5090
CUDA Version: 13.1
cuTile Version: 1.0.0

Minimum reproducible example

Try the following script:

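import cuda.tile as ct
import torch

# Empty kernel: the leak reproduces even though the input is never read or written.
@ct.kernel
def kernel(A: ct.Array):
    pass

for i in range(100000):
    # Fresh FP8 tensor (about 20.7 MiB) each iteration; rebinding x should free the previous one.
    x = torch.zeros(size=[4025, 5394], device="cuda", dtype=torch.float8_e4m3fn)
    ct.launch(torch.cuda.current_stream(), (1, 1, 1), kernel, (x,))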
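
To see the growth before the process runs out of memory, one option is to record an allocator reading after every iteration. The sketch below is only an illustration: `sample_growth` and `looks_like_leak` are hypothetical helper names, not part of cuTile or PyTorch, and the helpers are kept free of GPU imports so they can be checked anywhere.

```python
from typing import Callable, List


def sample_growth(step: Callable[[], None],
                  read_bytes: Callable[[], int],
                  iters: int = 50) -> List[int]:
    """Call `step` repeatedly and record `read_bytes()` after each call."""
    readings = []
    for _ in range(iters):
        step()
        readings.append(read_bytes())
    return readings


def looks_like_leak(readings: List[int]) -> bool:
    """True if the reading never drops and ends higher than it started."""
    never_drops = all(b >= a for a, b in zip(readings, readings[1:]))
    return never_drops and readings[-1] > readings[0]
```

On a GPU machine, `step` would be a function wrapping the two lines inside the loop of the script above (so that `x` goes out of scope when it returns), and `read_bytes` could be `torch.cuda.memory_allocated`. If the tensor were released correctly, the readings should stay flat; a steady climb for `torch.float8_e4m3fn` but not for other dtypes would match the behaviour described here.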

Relevant log output

Traceback (most recent call last):
  File "/root/autodl-tmp/super.py", line 9, in <module>
    x = torch.zeros(size=[4025, 5394], device="cuda", dtype=torch.float8_e4m3fn)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 22.00 MiB. GPU 0 has a total capacity of 31.36 GiB of which 17.06 MiB is free. Including non-PyTorch memory, this process has 31.33 GiB memory in use. Of the allocated memory 28.96 GiB is allocated by PyTorch, and 1.81 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
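
The figures in this message can be cross-checked against the tensor size in the script. The following is a rough back-of-the-envelope sketch, not a measurement. It assumes that every live allocation is one of the script's tensors, and that PyTorch's caching allocator rounds requests above 10 MiB up to a multiple of 2 MiB.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

# float8_e4m3fn stores one byte per element.
tensor_bytes = 4025 * 5394 * 1

# Assumption: large requests are rounded up to a multiple of 2 MiB.
segment_bytes = -(-tensor_bytes // (2 * MIB)) * (2 * MIB)

# "28.96 GiB is allocated by PyTorch" in the log above.
allocated_bytes = 28.96 * GIB
leaked_tensors = allocated_bytes / tensor_bytes

print(f"tensor size: {tensor_bytes / MIB:.2f} MiB")
print(f"rounded request: {segment_bytes // MIB} MiB")
print(f"tensors still held at OOM: about {leaked_tensors:.0f}")
```

Under those assumptions the rounded request matches the 22.00 MiB in the error, and the allocated total corresponds to roughly 1,400 tensors still being held when the allocation failed. That is far more than the single tensor that should be alive at any point in the loop.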

Full env printout

Other/Misc.

No response

Contributing Guidelines

  • I agree to follow cuTile Python's contributing guidelines
  • I have searched the open bugs and have found no duplicates for this bug report
