
Commit 34dfeca

Author: dlpack-gh-actions-bot
Commit message: Generate DLPack website
0 parents  commit 34dfeca


55 files changed: +27050 -0 lines changed

.nojekyll

Whitespace-only changes.

latest/.buildinfo

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: aa5259f4f73fb5580435e0671ae0c59c
tags: 645f666f9bcd5a90fca523b33c5a78b7

latest/.doctrees/c_api.doctree

141 KB, binary file not shown.
2.19 MB, binary file not shown.

latest/.doctrees/index.doctree

24.5 KB, binary file not shown.
46.7 KB, binary file not shown.

latest/_images/DLPack_diagram.png

174 KB

latest/_sources/c_api.rst.txt

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
.. _c_api:

C API (``dlpack.h``)
====================

Macros
~~~~~~

.. doxygendefine:: DLPACK_EXTERN_C

.. doxygendefine:: DLPACK_MAJOR_VERSION

.. doxygendefine:: DLPACK_MINOR_VERSION

.. doxygendefine:: DLPACK_DLL

.. doxygendefine:: DLPACK_FLAG_BITMASK_READ_ONLY

.. doxygendefine:: DLPACK_FLAG_BITMASK_IS_COPIED

.. doxygendefine:: DLPACK_FLAG_BITMASK_IS_SUBBYTE_TYPE_PADDED

Enumerations
~~~~~~~~~~~~

.. doxygenenum:: DLDeviceType

.. doxygenenum:: DLDataTypeCode

Structs
~~~~~~~

.. doxygenstruct:: DLPackVersion
   :members:

.. doxygenstruct:: DLDevice
   :members:

.. doxygenstruct:: DLDataType
   :members:

.. doxygenstruct:: DLTensor
   :members:

.. doxygenstruct:: DLManagedTensor
   :members:

.. doxygenstruct:: DLManagedTensorVersioned
   :members:

latest/_sources/index.rst.txt

Lines changed: 85 additions & 0 deletions
@@ -0,0 +1,85 @@
Welcome to DLPack's documentation!
==================================


Purpose
~~~~~~~

In order for an ndarray system to interact with a variety of frameworks, a
stable in-memory data structure is needed.

DLPack is one such data structure that allows exchange between major
frameworks. It is developed with input from many deep learning system core
developers. Highlights include:

* Minimal and stable: :ref:`simple header <c_api>`
* Designed for cross-hardware use: CPU, CUDA, OpenCL, Vulkan, Metal, VPI, ROCm,
  WebGPU, Hexagon
* Already a standard with wide community adoption and support:

  * `NumPy <https://numpy.org/doc/stable/release/1.22.0-notes.html#add-nep-47-compatible-dlpack-support>`_
  * `CuPy <https://docs.cupy.dev/en/stable/reference/generated/cupy.fromDlpack.html>`_
  * `PyTorch <https://pytorch.org/docs/stable/dlpack.html>`_
  * `Tensorflow <https://www.tensorflow.org/api_docs/python/tf/experimental/dlpack/from_dlpack>`_
  * `MXNet <https://mxnet.apache.org/versions/master/api/python/docs/_modules/mxnet/dlpack.html>`_
  * `TVM <https://tvm.apache.org/docs/reference/api/python/contrib.html#module-tvm.contrib.dlpack>`_
  * `mpi4py <https://mpi4py.readthedocs.io/en/stable/overview.html#support-for-gpu-aware-mpi>`_
  * `Paddle <https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/api/paddle/utils/dlpack/from_dlpack_cn.html>`_
  * `JAX <https://jax.readthedocs.io/en/latest/_autosummary/jax.dlpack.from_dlpack.html#jax.dlpack.from_dlpack>`_
  * `Hidet <https://hidet.org/docs/stable/python_api/root.html#hidet.from_dlpack>`_

* Clean C ABI compatibility:

  * This means you can create and access it from any language.
  * It is also essential for building JIT and AOT compilers to support these
    data types.


Scope
~~~~~

The main design rationale of DLPack is minimalism. DLPack drops the
consideration of allocators and device APIs and focuses on the minimal data
structure, while still considering the need for cross-hardware support
(e.g. the data field is opaque on platforms that do not support normal
addressing).

It also simplifies some of the design to remove legacy issues (e.g. everything
is assumed to be row major, strides can be used to support other cases, and
the complexity of considering more layouts is avoided).


Roadmap
~~~~~~~

* A C API that could be exposed as a new Python attribute ``__dlpack_info__``
  for returning API and ABI versions. (see `#34 <https://github.com/dmlc/dlpack/issues/34>`_,
  `#72 <https://github.com/dmlc/dlpack/pull/72>`_)
* Clarify alignment requirements. (see
  `data-apis/array-api#293 <https://github.com/data-apis/array-api/issues/293>`_,
  `numpy/numpy#20338 <https://github.com/numpy/numpy/issues/20338>`_,
  `data-apis/array-api#293 (comment) <https://github.com/data-apis/array-api/issues/293#issuecomment-964434449>`_)
* Add support for a boolean data type. (see `#75 <https://github.com/dmlc/dlpack/issues/75>`_)
* Add a read-only flag (ABI break), or make it a hard requirement in the spec that
  imported arrays should be treated as read-only. (see
  `data-apis/consortium-feedback#1 (comment) <https://github.com/data-apis/consortium-feedback/issues/1#issuecomment-675857753>`_,
  `data-apis/array-api#191 <https://github.com/data-apis/array-api/issues/191>`_)
* Standardize a C interface for stream exchange. (see `#74 <https://github.com/dmlc/dlpack/issues/74>`_,
  `#65 <https://github.com/dmlc/dlpack/issues/65>`_)


DLPack Documentation
~~~~~~~~~~~~~~~~~~~~

.. toctree::
   :maxdepth: 2

   c_api
   python_spec


Indices and tables
==================

* :ref:`genindex`
* :ref:`search`

latest/_sources/python_spec.rst.txt

Lines changed: 200 additions & 0 deletions
@@ -0,0 +1,200 @@
.. _python-spec:

Python Specification for DLPack
===============================

The Python specification for DLPack is a part of the
`Python array API standard <https://data-apis.org/array-api/latest/index.html>`_.
More details about the spec can be found under the :ref:`data-interchange` page.


Syntax for data interchange with DLPack
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The array API will offer the following syntax for data interchange:

1. A :func:`~array_api.from_dlpack` function, which accepts any (array) object with
   the two DLPack methods implemented (see below) and uses them to construct
   a new array containing the data from the input array.
2. :meth:`~array_api.array.__dlpack__` and :meth:`~array_api.array.__dlpack_device__` methods on the
   array object, which will be called from within :func:`~array_api.from_dlpack`, to query
   what device the array is on (may be needed to pass in the correct
   stream, e.g. in the case of multiple GPUs) and to access the data.
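
The sketch below is not part of the standard; it shows, in simplified Python, how a
hypothetical consumer could wire these two methods together. The device codes and the
``_array_from_capsule`` helper are illustrative stand-ins, not real APIs:

.. code-block:: python

   # Illustrative device type codes from ``dlpack.h`` (kDLCPU = 1, kDLCUDA = 2).
   KDL_CPU, KDL_CUDA = 1, 2

   def from_dlpack(x):
       """Minimal consumer-side sketch of the interchange protocol."""
       # 1. Ask the producer where the data lives: a (device_type, device_id)
       #    tuple such as (KDL_CUDA, 0).
       device_type, device_id = x.__dlpack_device__()
       if device_type not in (KDL_CPU, KDL_CUDA):
           # This consumer cannot handle the device the data lives on.
           raise BufferError(f"unsupported DLPack device type: {device_type}")

       # 2. Ask the producer for a capsule wrapping a DLManagedTensor; for CUDA
       #    the consumer would pass the stream it intends to use (the default
       #    stream is assumed here to keep the sketch short).
       capsule = x.__dlpack__(stream=None)

       # 3. Consumer-specific: take ownership of the DLManagedTensor inside the
       #    capsule and wrap it in an array object (zero-copy where possible).
       return _array_from_capsule(capsule)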

Semantics
~~~~~~~~~

DLPack describes the memory layout of dense, strided, n-dimensional arrays.
When a user calls ``y = from_dlpack(x)``, the library implementing ``x`` (the
"producer") will provide access to the data from ``x`` to the library
containing ``from_dlpack`` (the "consumer"). If possible, this must be
zero-copy (i.e. ``y`` will be a *view* on ``x``). If not possible, that library
may flag this and make a copy of the data. In both cases:

- The producer keeps owning the memory of ``x`` (and of ``y`` if a copy is made).
- ``y`` may or may not be a view, therefore the user must keep the recommendation
  to avoid mutating ``y`` in mind - see :ref:`copyview-mutability`.
- Both ``x`` and ``y`` may continue to be used just like arrays created in other ways.

If an array that is accessed via the interchange protocol lives on a device that
the requesting (consumer) library does not support, it is recommended to raise a
:obj:`BufferError`, unless an explicit copy is requested (see below) and the producer
can support the request.

Stream handling through the ``stream`` keyword applies to CUDA and ROCm (and perhaps
to other devices that have a stream concept as well, though those haven't been
considered in detail). The consumer must pass the stream it will use to the
producer; the producer must synchronize or wait on that stream when necessary.
In the common case of the default stream being used, synchronization is
unnecessary, so asynchronous execution is enabled.
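
For instance, a CUDA-based consumer could pass its stream as follows. This is an
illustrative fragment only: ``consumer_stream`` is assumed to be the consumer
library's own stream object, with ``.handle`` exposing the raw stream pointer as a
Python integer:

.. code-block:: python

   def export_on_stream(x, consumer_stream):
       """Sketch: ask the producer to make ``x`` safe to read on our stream."""
       # The producer either synchronizes or enqueues a wait on this stream, so
       # that the data is valid before the consumer launches kernels reading it.
       return x.__dlpack__(stream=consumer_stream.handle)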

Starting with the v2023 Python array API standard, a copy can be explicitly requested (or
disabled) through the new ``copy`` argument of ``from_dlpack()``. When a copy is
made, the producer must set the :c:macro:`DLPACK_FLAG_BITMASK_IS_COPIED` bit flag.
It is also possible to request cross-device copies through the new ``device``
argument, though the v2023 standard only mandates support for :c:enumerator:`kDLCPU`.
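
For example (keyword names as in the v2023 array API standard; ``from_dlpack``, ``x``
and ``cpu_device`` are placeholders for the consumer's function, any DLPack-exporting
array, and however the consumer library spells its CPU device):

.. code-block:: python

   # Always materialize a copy; the producer must then set the
   # DLPACK_FLAG_BITMASK_IS_COPIED bit flag on the exported tensor.
   y = from_dlpack(x, copy=True)

   # Never copy; if zero-copy exchange is impossible, the call is expected to
   # raise (e.g. BufferError) instead of silently duplicating the data.
   z = from_dlpack(x, copy=False)

   # Request the result on CPU; kDLCPU is the only target device the v2023
   # standard mandates producers to support.
   w = from_dlpack(x, device=cpu_device)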

Implementation
~~~~~~~~~~~~~~

*Note that while this API standard largely tries to avoid discussing
implementation details, some discussion and requirements are needed
here because data interchange requires coordination between
implementers on, e.g., memory management.*

.. image:: /_static/images/DLPack_diagram.png
   :alt: Diagram of DLPack structs

*DLPack diagram. Dark blue are the structs it defines, light blue the
struct members, and gray text the enum values of supported devices and
data types.*

Starting with the v2023 Python array API standard, a new ``max_version`` argument
is added to ``__dlpack__`` so the consumer can signal to the producer the
maximum DLPack version it supports. Starting with DLPack 1.0, the :c:struct:`DLManagedTensorVersioned`
struct should be used and the existing :c:struct:`DLManagedTensor` struct is considered
deprecated, though a library should try to support both during the transition
period if possible (a producer-side sketch follows the note below).

.. note::
   In the rest of this document, ``DLManagedTensorVersioned`` and ``DLManagedTensor``
   are treated as synonyms, assuming that ``max_version`` has been handled properly
   to choose the right struct. As far as the capsule name is concerned,
   when ``DLManagedTensorVersioned`` is in use the capsule names ``dltensor``
   and ``used_dltensor`` need a ``_versioned`` suffix.
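
A simplified producer-side sketch of this negotiation (not taken from any particular
library; real implementations construct the capsules in C) could look as follows:

.. code-block:: python

   class MyArray:
       """Hypothetical producer illustrating ``max_version`` handling."""

       def __dlpack__(self, *, stream=None, max_version=None):
           # ``max_version`` is a (major, minor) tuple, or None for consumers
           # that predate the v2023 standard.
           if max_version is None or max_version[0] < 1:
               # Legacy consumer: export a DLManagedTensor in a capsule
               # named "dltensor".
               return self._legacy_capsule(stream)
           # The consumer understands DLPack >= 1.0: export a
           # DLManagedTensorVersioned in a capsule named "dltensor_versioned".
           return self._versioned_capsule(stream)

       # Stand-ins for the producer's C-level capsule construction.
       def _legacy_capsule(self, stream):
           raise NotImplementedError

       def _versioned_capsule(self, stream):
           raise NotImplementedError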

The ``__dlpack__`` method will produce a :c:type:`PyCapsule` containing a
``DLManagedTensor``, which will be consumed immediately within
``from_dlpack`` - therefore it is consumed exactly once, and it will not be
visible to users of the Python API.

The producer must set the ``PyCapsule`` name to ``"dltensor"`` so that
it can be inspected by name, and set a :c:type:`PyCapsule_Destructor` that calls
the ``deleter`` of the ``DLManagedTensor`` when the ``"dltensor"``-named
capsule is no longer needed.

The consumer must transfer ownership of the ``DLManagedTensor`` from the
capsule to its own object. It does so by renaming the capsule to
``"used_dltensor"`` to ensure that ``PyCapsule_Destructor`` will not get
called (ensured if ``PyCapsule_Destructor`` calls ``deleter`` only for
capsules whose name is ``"dltensor"``), but the ``deleter`` of the
``DLManagedTensor`` will be called by the destructor of the consumer
library object created to own the ``DLManagedTensor`` obtained from the
capsule. Below is an example of the capsule deleter written in the Python
C API, which is called either when the refcount on the capsule named
``"dltensor"`` reaches zero or when the consumer decides to deallocate its array:

.. code-block:: C

   static void dlpack_capsule_deleter(PyObject *self) {
       if (PyCapsule_IsValid(self, "used_dltensor")) {
           return; /* Do nothing if the capsule has been consumed. */
       }

       DLManagedTensor *managed = (DLManagedTensor *)PyCapsule_GetPointer(self, "dltensor");
       if (managed == NULL) {
           PyErr_WriteUnraisable(self);
           return;
       }
       /* The spec says the deleter can be NULL if there is no way for the caller to provide a reasonable destructor. */
       if (managed->deleter) {
           managed->deleter(managed);
       }
   }

Note: the capsule names ``"dltensor"`` and ``"used_dltensor"`` must be
statically allocated.

The ``DLManagedTensor`` deleter must ensure that sharing beyond Python
boundaries is possible; this means that the GIL must be acquired explicitly
if the deleter uses Python objects or APIs.
In Python, the deleter usually needs to :c:func:`Py_DECREF` the original owner
and free the ``DLManagedTensor`` allocation.
For example, NumPy uses the following code to ensure that sharing with arbitrary
non-Python code is safe:

.. code-block:: C

   static void array_dlpack_deleter(DLManagedTensor *self)
   {
       /*
        * Leak the Python object if the Python runtime is not available.
        * This can happen if the DLPack consumer destroys the tensor late
        * after Python runtime finalization (for example in case the tensor
        * was indirectly kept alive by a C++ static variable).
        */
       if (!Py_IsInitialized()) {
           return;
       }

       PyGILState_STATE state = PyGILState_Ensure();

       PyObject *array = (PyObject *)self->manager_ctx;
       // This will also free the shape and strides as it's one allocation.
       PyMem_Free(self);
       Py_XDECREF(array);

       PyGILState_Release(state);
   }

When the :c:member:`~DLTensor.strides` field in the :c:struct:`DLTensor` struct is ``NULL``, it indicates a
row-major compact array. If the array is of size zero, the data pointer in
``DLTensor`` should be set to either ``NULL`` or ``0``.
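
As an illustration, a consumer receiving ``strides == NULL`` can reconstruct the
implied compact row-major strides from the shape (in number of elements, as in
:c:struct:`DLTensor`) with a helper along these lines:

.. code-block:: python

   def compact_row_major_strides(shape):
       """Strides (in elements) implied by ``strides == NULL`` in a DLTensor."""
       strides = [1] * len(shape)
       # Walk from the last dimension backwards: each stride is the product of
       # the sizes of all faster-varying dimensions.
       for i in range(len(shape) - 2, -1, -1):
           strides[i] = strides[i + 1] * shape[i + 1]
       return strides

   # Example: a compact 2 x 3 x 4 tensor has strides [12, 4, 1].
   assert compact_row_major_strides([2, 3, 4]) == [12, 4, 1]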

For further details on DLPack design and how to implement support for it,
refer to `github.com/dmlc/dlpack <https://github.com/dmlc/dlpack>`_.

.. warning::
   DLPack contains a :c:member:`~DLDevice.device_id`, which will be the device
   ID (an integer, ``0, 1, ...``) that the producer library uses. In
   practice this will likely be the same numbering as that of the
   consumer, however that is not guaranteed. Depending on the hardware
   type, it may be possible for the consumer library implementation to
   look up the actual device from the pointer to the data - this is
   possible for example for CUDA device pointers.

It is recommended that implementers of this array API consider and document
whether the :attr:`~array_api.array.device` attribute of the array returned from
``from_dlpack`` is guaranteed to be in a certain order or not.


Reference Implementations
~~~~~~~~~~~~~~~~~~~~~~~~~

Several Python libraries have adopted this standard using the Python C API, C++, Cython,
ctypes, cffi, etc.:

* NumPy: `Python C API <https://github.com/numpy/numpy/blob/main/numpy/_core/src/multiarray/dlpack.c>`__
* CuPy: `Cython <https://github.com/cupy/cupy/blob/master/cupy/_core/dlpack.pyx>`__
* Tensorflow: `C++ <https://github.com/tensorflow/tensorflow/blob/master/tensorflow/c/eager/dlpack.cc>`__,
  `Python wrapper using Python C API <https://github.com/tensorflow/tensorflow/blob/a97b01a4ff009ed84a571c138837130a311e74a7/tensorflow/python/tfe_wrapper.cc#L1562>`__,
  `XLA <https://github.com/tensorflow/tensorflow/blob/master/third_party/xla/xla/python/dlpack.cc>`__
* PyTorch: `C++ <https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/DLConvertor.cpp>`__,
  `Python wrapper using Python C API <https://github.com/pytorch/pytorch/blob/c22b8a42e6038ed2f6a161114cf3d8faac3f6e9a/torch/csrc/Module.cpp#L355>`__
* MXNet: `ctypes <https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/dlpack.py>`__
* TVM: `ctypes <https://github.com/apache/tvm/blob/main/python/tvm/_ffi/_ctypes/ndarray.py>`__,
  `Cython <https://github.com/apache/tvm/blob/main/python/tvm/_ffi/_cython/ndarray.pxi>`__
* mpi4py: `Cython <https://github.com/mpi4py/mpi4py/blob/master/src/mpi4py/MPI.src/asdlpack.pxi>`__
* Paddle: `C++ <https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/framework/tensor_util.cc#L901-L951>`__,
  `Python wrapper using Python C API <https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/pybind/pybind.cc#L1263-L1280>`__
* Hidet: `ctypes <https://github.com/hidet-org/hidet/blob/main/python/hidet/graph/impl/dlpack.py>`__
