Add stream context #5

cyx-6 · 2025-09-14T02:59:58Z

This PR adds a stream context to the FFI so that the FFI environment stream can be updated. `tvm_ffi.use_torch_stream` wraps a torch stream or CUDA graph context, and the lower-level `tvm_ffi.use_raw_stream` creates a context from an existing stream handle.

Example for tvm_ffi.use_torch_stream:

Case with a torch stream:

stream = torch.cuda.Stream()
stream_context = torch.cuda.stream(stream)
with tvm_ffi.use_torch_stream(stream_context):
  ...

Case with a torch CUDA graph:

graph = torch.cuda.CUDAGraph()
graph_context = torch.cuda.graph(graph)
with tvm_ffi.use_torch_stream(graph_context):
  ...

Case with the current stream by default:

stream = torch.cuda.Stream()
with torch.cuda.stream(stream):
  with tvm_ffi.use_torch_stream():
    ...

Example for tvm_ffi.use_raw_stream:

device = tvm_ffi.device(...)
stream_handle = ...
with tvm_ffi.use_raw_stream(device, stream_handle):
  ...
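The review snippets in this PR show the underlying setter returning the previous stream (`return <uint64_t>prev_stream`), which suggests a set-on-enter, restore-on-exit pattern. The following is a minimal pure-Python sketch of that pattern only, not the actual `tvm_ffi` implementation: `_ENV_STREAMS`, `set_current_stream`, `get_current_stream`, and `RawStreamContext` are hypothetical stand-ins, and stream handles are modeled as plain integers.

```python
from typing import Dict, Tuple

# Hypothetical stand-in for the FFI environment's per-device stream table.
_ENV_STREAMS: Dict[Tuple[int, int], int] = {}

KDLCUDA = 2  # DLPack device type code for CUDA


def set_current_stream(device_type: int, device_id: int, stream: int) -> int:
    """Install `stream` for the device and return the handle it replaced (0 if none)."""
    key = (device_type, device_id)
    prev = _ENV_STREAMS.get(key, 0)
    _ENV_STREAMS[key] = stream
    return prev


def get_current_stream(device_type: int, device_id: int) -> int:
    """Return the handle currently installed for the device (0 if none)."""
    return _ENV_STREAMS.get((device_type, device_id), 0)


class RawStreamContext:
    """Hypothetical context manager: set a stream on enter, restore the old one on exit."""

    def __init__(self, device_type: int, device_id: int, stream: int) -> None:
        self.device_type = device_type
        self.device_id = device_id
        self.stream = stream
        self._prev = 0

    def __enter__(self) -> "RawStreamContext":
        # Remember whatever was installed before so nesting works.
        self._prev = set_current_stream(self.device_type, self.device_id, self.stream)
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Restore even if the body raised; returning None does not suppress the exception.
        set_current_stream(self.device_type, self.device_id, self._prev)
```

With this sketch, nesting `RawStreamContext(KDLCUDA, 0, 0x1234)` inside another context restores the outer handle when the inner block exits, and the table returns to its original state after the outermost block, including when the body raises.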

tqchen · 2025-09-14T22:31:27Z

python/tvm_ffi/stream.py

+        Examples
+        --------
+        .. code-block:: python
+        s = torch.cuda.Stream()


This needs to be indented one level, with a blank line between the `.. code-block::` directive and the code.
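As an illustration of the layout requested here, the sketch below shows a numpydoc-style `Examples` section in which the code sits one level deeper than the `.. code-block:: python` directive and is separated from it by a blank line. The function name `docstring_layout_example` is hypothetical, and the docstring body only reuses the calls shown in the PR description.

```python
import inspect


def docstring_layout_example():
    """Hypothetical function used only to show the docstring layout.

    Examples
    --------
    .. code-block:: python

        s = torch.cuda.Stream()
        with tvm_ffi.use_torch_stream(torch.cuda.stream(s)):
            ...
    """


# inspect.getdoc removes the common leading indentation, so the relative
# indentation of the code under the directive is what remains visible.
RENDERED_DOC = inspect.getdoc(docstring_layout_example)
```

Without the blank line and the extra indentation, reStructuredText does not treat the following lines as the content of the directive.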

tqchen · 2025-09-14T22:31:50Z

python/tvm_ffi/cython/base.pxi

+    return <uint64_t>prev_stream
+
+
+class StreamContext:


Given that this is not Cython-dependent, move it out to stream.py.

tqchen · 2025-09-14T22:34:15Z

python/tvm_ffi/cython/base.pxi

                                  TVMFFIStreamHandle stream,
                                  TVMFFIStreamHandle* opt_out_original_stream) nogil

+cdef _env_set_current_stream(int device_type, int device_id, uint64_t stream):


We can expose this as `def` so that it can be called from the Python side.

tqchen · 2025-09-14T22:34:22Z

python/tvm_ffi/cython/base.pxi

    DLTensor* TVMFFITensorGetDLTensorPtr(TVMFFIObjectHandle obj) nogil
    DLDevice TVMFFIDLDeviceFromIntPair(int32_t device_type, int32_t device_id) nogil

+


new line not needed?


This PR addresses a memory corruption issue in Cython. The fix involves ensuring that the `ByteArrayArg` object, which holds the type key, is properly destructed after being passed to the `TVMFFITypeKeyToIndex` function. This prevents a potential read-after-free scenario, as reported by ASan.

## ASan Report

```
READ of size 9 at 0x604000420a30 thread T0
...
#5 0x7fdb57299506 in __pyx_f_4core__type_info_create_from_type_key /home/dolores/Projects/tvm-ffi/build/core.cpp:17732
...
0x604000420a30 is located 32 bytes inside of 42-byte region [0x604000420a10,0x604000420a3a)
freed by thread T0 here:
...
#4 0x7fdb572994e2 in __pyx_f_4core__type_info_create_from_type_key /home/dolores/Projects/tvm-ffi/build/core.cpp:17731
...
previously allocated by thread T0 here:
...
#8 0x7fdb57299366 in __pyx_f_4core__type_info_create_from_type_key /home/dolores/Projects/tvm-ffi/build/core.cpp:17718
```

<img width="1444" height="904" alt="image" src="https://github.com/user-attachments/assets/7a80d33d-dedf-41ca-ac77-108e63b8e57b" />

## Recommended ASan Options

To work properly with CPython, `libasan` must be preloaded, and `libstdc++` must be preloaded as well so that `__cxa_throw` is intercepted correctly. The paths to those two files can be found using:

```
ASAN="$(gcc -print-file-name=libasan.so)"
STDCXX="$(g++ -print-file-name=libstdc++.so.6)"
LD_PRELOAD="$ASAN $STDCXX"
```

Additionally, it might be helpful to set

```
PYTHONMALLOC=malloc
```

and run with the ASan options

```
ASAN_OPTIONS="detect_leaks=0:abort_on_error=1:symbolize=1:fast_unwind_on_malloc=0"
```

Notably, turning on `detect_leaks=1` leads to a large number of irrelevant, noisy reports, so it is better to leave it off.
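The environment described above can also be assembled from Python, for example when launching a test run as a subprocess. This is only a sketch under stated assumptions: the helper names `find_runtime_lib` and `asan_env` are hypothetical, and the variable values are copied from the settings quoted in the text.

```python
import os
import subprocess
from typing import Dict, Optional


def find_runtime_lib(compiler: str, lib_name: str) -> str:
    """Ask the compiler for a library path, e.g. `gcc -print-file-name=libasan.so`."""
    out = subprocess.check_output([compiler, f"-print-file-name={lib_name}"], text=True)
    return out.strip()


def asan_env(
    asan_lib: str, stdcxx_lib: str, base: Optional[Dict[str, str]] = None
) -> Dict[str, str]:
    """Return a copy of `base` (default: os.environ) with the ASan settings applied."""
    env = dict(os.environ if base is None else base)
    # Preload libasan for CPython and libstdc++ so that __cxa_throw is intercepted.
    env["LD_PRELOAD"] = f"{asan_lib} {stdcxx_lib}"
    # Route Python allocations through the system malloc.
    env["PYTHONMALLOC"] = "malloc"
    # Leak detection is disabled because its reports are mostly noise here.
    env["ASAN_OPTIONS"] = (
        "detect_leaks=0:abort_on_error=1:symbolize=1:fast_unwind_on_malloc=0"
    )
    return env
```

A typical call would pass `find_runtime_lib("gcc", "libasan.so")` and `find_runtime_lib("g++", "libstdc++.so.6")` to `asan_env`, then hand the resulting dictionary to `subprocess.run(..., env=...)`.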


cyx-6 added 4 commits September 14, 2025 02:42

Add stream context

e78788b

fix lint

817a8cd

fix lint

0ab7131

fix test

defd5f2

tqchen reviewed Sep 14, 2025


cyx-6 added 2 commits September 15, 2025 02:44

Merge commit 'c100338de52825097ddc44bbac3d03a92f45b33a' into stream-context

41985d1

upd

35e2834

tqchen approved these changes Sep 15, 2025


tqchen merged commit 3197cd0 into apache:dev Sep 15, 2025
6 checks passed

cyx-6 mentioned this pull request Oct 8, 2025

Expose get stream method #97

Merged

tqchen pushed a commit that referenced this pull request Oct 9, 2025

Expose get stream method (#97)

22c049b

This PR exposes the get env stream method as followup of #5.
