From 06d9ec9041d66db10468edb7452f3076aaeb0531 Mon Sep 17 00:00:00 2001 From: "Mads R. B. Kristensen" Date: Tue, 12 Sep 2023 07:54:41 +0200 Subject: [PATCH] Docs (#268) Authors: - Mads R. B. Kristensen (https://github.com/madsbk) Approvers: - AJ Schmidt (https://github.com/ajschmidt8) - Benjamin Zaitlen (https://github.com/quasiben) - Vukasin Milovanovic (https://github.com/vuule) URL: https://github.com/rapidsai/kvikio/pull/268 --- README.md | 182 ++---------------- .../all_cuda-118_arch-x86_64.yaml | 4 +- .../all_cuda-120_arch-x86_64.yaml | 4 +- cpp/doxygen/main_page.md | 137 ++++++++++++- cpp/include/kvikio/defaults.hpp | 4 +- dependencies.yaml | 4 +- docs/source/api.rst | 1 - docs/source/conf.py | 143 ++++++++++++-- docs/source/index.rst | 32 +-- docs/source/install.rst | 71 +++++++ docs/source/quickstart.rst | 37 ++++ docs/source/runtime_settings.rst | 26 +++ docs/source/zarr.rst | 15 ++ python/examples/zarr_cupy_nvcomp.py | 2 +- 14 files changed, 455 insertions(+), 207 deletions(-) create mode 100644 docs/source/install.rst create mode 100644 docs/source/quickstart.rst create mode 100644 docs/source/runtime_settings.rst create mode 100644 docs/source/zarr.rst diff --git a/README.md b/README.md index 0422028dd9..4df538aa2d 100644 --- a/README.md +++ b/README.md @@ -1,181 +1,23 @@ -# KvikIO: C++ and Python bindings to cuFile +# KvikIO: High Performance File IO ## Summary -This provides C++ and Python bindings to cuFile, which enables GPUDirect Storage (GDS). -KvikIO also works efficiently when GDS isn't available and can read/write both host and -device data seamlessly. +KvikIO is a Python and C++ library for high performance file IO. It provides C++ and Python +bindings to [cuFile](https://docs.nvidia.com/gpudirect-storage/api-reference-guide/index.html), +which enables [GPUDirect Storage (GDS)](https://developer.nvidia.com/blog/gpudirect-storage/). +KvikIO also works efficiently when GDS isn't available and can read/write both host and device data seamlessly. 
+The C++ library is header-only making it easy to include in [existing projects](https://github.com/rapidsai/kvikio/blob/HEAD/cpp/examples/downstream/). + ### Features -* Object Oriented API. -* Exception handling. +* Object oriented API of [cuFile](https://docs.nvidia.com/gpudirect-storage/api-reference-guide/index.html) with C++/Python exception handling. +* A Python [Zarr](https://zarr.readthedocs.io/en/stable/) backend for reading and writing GPU data to file seamlessly. * Concurrent reads and writes using an internal thread pool. * Non-blocking API. -* Python Zarr reader. * Handle both host and device IO seamlessly. * Provides Python bindings to [nvCOMP](https://github.com/NVIDIA/nvcomp). -## Requirements - -To install users should have a working Linux machine with CUDA Toolkit -installed (v11.4+) and a working compiler toolchain (C++17 and cmake). - -### C++ - -The C++ bindings are header-only and depends on the CUDA Driver API. -In order to build and run the example code, CMake and the CUDA Runtime -API is required. - -### Python - -The Python package depends on the following packages: - -* cython -* pip -* setuptools -* scikit-build - -For nvCOMP, benchmarks, examples, and tests: - -* pytest -* numpy -* cupy - -## Install - -### Conda - -Install the stable release from the `rapidsai` channel like: - -``` -conda create -n kvikio_env -c rapidsai -c conda-forge kvikio -``` - -Install the `kvikio` conda package from the `rapidsai-nightly` channel like: - -``` -conda create -n kvikio_env -c rapidsai-nightly -c conda-forge python=3.10 cuda-version=11.8 kvikio -``` - -If the nightly install doesn't work, set `channel_priority: flexible` in your `.condarc`. 
- -In order to setup a development environment run: -``` -conda env create --name kvikio-dev --file conda/environments/all_cuda-118_arch-x86_64.yaml -``` - -### C++ (build from source) - -To build the C++ example run: - -``` -./build.sh libkvikio -``` - -Then run the example: - -``` -./examples/basic_io -``` - -### Python (build from source) - -To build and install the extension run: - -``` -./build.sh kvikio -``` - -One might have to define `CUDA_HOME` to the path to the CUDA installation. - -In order to test the installation, run the following: - -``` -pytest tests/ -``` - -And to test performance, run the following: - -``` -python benchmarks/single-node-io.py -``` - -## Examples - - -### Notebooks - - [How to read and write GPU memory directly to/from Zarr files](notebooks/zarr.ipynb) - - -### C++ - -```c++ -#include -#include -#include -using namespace std; - -int main() -{ - // Create two arrays `a` and `b` - constexpr std::size_t size = 100; - void *a = nullptr; - void *b = nullptr; - cudaMalloc(&a, size); - cudaMalloc(&b, size); - - // Write `a` to file - kvikio::FileHandle fw("test-file", "w"); - size_t written = fw.write(a, size); - fw.close(); - - // Read file into `b` - kvikio::FileHandle fr("test-file", "r"); - size_t read = fr.read(b, size); - fr.close(); - - // Read file into `b` in parallel using 16 threads - kvikio::default_thread_pool::reset(16); - { - kvikio::FileHandle f("test-file", "r"); - future future = f.pread(b_dev, sizeof(a), 0); // Non-blocking - size_t read = future.get(); // Blocking - // Notice, `f` closes automatically on destruction. 
- } -} -``` - -### Python - -```python -import cupy -import kvikio - -a = cupy.arange(100) -f = kvikio.CuFile("test-file", "w") -# Write whole array to file -f.write(a) -f.close() - -b = cupy.empty_like(a) -f = kvikio.CuFile("test-file", "r") -# Read whole array from file -f.read(b) -assert all(a == b) - -# Use contexmanager -c = cupy.empty_like(a) -with kvikio.CuFile(path, "r") as f: - f.read(c) -assert all(a == c) - -# Non-blocking read -d = cupy.empty_like(a) -with kvikio.CuFile(path, "r") as f: - future1 = f.pread(d[:50]) - future2 = f.pread(d[50:], file_offset=d[:50].nbytes) - future1.get() # Wait for first read - future2.get() # Wait for second read -assert all(a == d) -``` +### Documentation + * Python: + * C++: diff --git a/conda/environments/all_cuda-118_arch-x86_64.yaml b/conda/environments/all_cuda-118_arch-x86_64.yaml index f965809e50..989e854364 100644 --- a/conda/environments/all_cuda-118_arch-x86_64.yaml +++ b/conda/environments/all_cuda-118_arch-x86_64.yaml @@ -23,16 +23,18 @@ dependencies: - libcufile=1.4.0.31 - ninja - numpy>=1.21 +- numpydoc - nvcc_linux-64=11.8 - nvcomp==2.6.1 - packaging - pre-commit -- pydata-sphinx-theme - pytest - pytest-cov - python>=3.9,<3.11 - scikit-build>=0.13.1 - sphinx +- sphinx-click +- sphinx_rtd_theme - sysroot_linux-64=2.17 - zarr name: all_cuda-118_arch-x86_64 diff --git a/conda/environments/all_cuda-120_arch-x86_64.yaml b/conda/environments/all_cuda-120_arch-x86_64.yaml index 12caededb4..c335744993 100644 --- a/conda/environments/all_cuda-120_arch-x86_64.yaml +++ b/conda/environments/all_cuda-120_arch-x86_64.yaml @@ -23,15 +23,17 @@ dependencies: - libcufile-dev - ninja - numpy>=1.21 +- numpydoc - nvcomp==2.6.1 - packaging - pre-commit -- pydata-sphinx-theme - pytest - pytest-cov - python>=3.9,<3.11 - scikit-build>=0.13.1 - sphinx +- sphinx-click +- sphinx_rtd_theme - sysroot_linux-64=2.17 - zarr name: all_cuda-120_arch-x86_64 diff --git a/cpp/doxygen/main_page.md b/cpp/doxygen/main_page.md index 
5494d5c580..a6120024ce 100644 --- a/cpp/doxygen/main_page.md +++ b/cpp/doxygen/main_page.md @@ -1,4 +1,135 @@ -# libkvikio +# Welcome to KvikIO's C++ documentation! -libkvikio is a C++ header-only library providing bindings to -cuFile, which enables GPUDirectStorage (GDS). +KvikIO is a Python and C++ library for high performance file IO. It provides C++ and Python +bindings to [cuFile](https://docs.nvidia.com/gpudirect-storage/api-reference-guide/index.html), +which enables [GPUDirect Storage (GDS)](https://developer.nvidia.com/blog/gpudirect-storage/). +KvikIO also works efficiently when GDS isn't available and can read/write both host and device data seamlessly. + +KvikIO C++ is a header-only library that is part of the [RAPIDS](https://rapids.ai/) suite of open-source software libraries for GPU-accelerated data science. + +--- +**Notice** this is the documentation for the C++ library. For the Python documentation of KvikIO, see under **KvikIO**. + +--- + +## Features + +* Object Oriented API. +* Exception handling. +* Concurrent reads and writes using an internal thread pool. +* Non-blocking API. +* Handle both host and device IO seamlessly. + +## Installation + +KvikIO is a header-only library and as such doesn't need installation. +However, for convenience we release Conda packages that make it easy +to include KvikIO in your CMake projects. + +### Conda/Mamba + +We strongly recommend using `mamba `_ in place of conda, which we will do throughout the documentation. 
+ +Install the **stable release** from the ``rapidsai`` channel with the following: +```sh +# Install in existing environment +mamba install -c rapidsai -c conda-forge libkvikio +# Create new environment (CUDA 11.8) +mamba create -n libkvikio-env -c rapidsai -c conda-forge cuda-version=11.8 libkvikio +# Create new environment (CUDA 12.0) +mamba create -n libkvikio-env -c rapidsai -c conda-forge cuda-version=12.0 libkvikio +``` + +Install the **nightly release** from the ``rapidsai-nightly`` channel with the following: + +```sh +# Install in existing environment +mamba install -c rapidsai-nightly -c conda-forge libkvikio +# Create new environment (CUDA 11.8) +mamba create -n libkvikio-env -c rapidsai-nightly -c conda-forge python=3.10 cuda-version=11.8 libkvikio +# Create new environment (CUDA 12.0) +mamba create -n libkvikio-env -c rapidsai-nightly -c conda-forge python=3.10 cuda-version=12.0 libkvikio +``` + +--- +**Notice** if the nightly install doesn't work, set ``channel_priority: flexible`` in your ``.condarc``. + +--- + +### Include KvikIO in a CMake project +An example of how to include KvikIO in an existing CMake project can be found here: . + + +### Build from source + +To build the C++ example run: + +``` +./build.sh libkvikio +``` + +Then run the example: + +``` +./examples/basic_io +``` + +## Runtime Settings + +#### Compatibility Mode (KVIKIO_COMPAT_MODE) +When KvikIO is running in compatibility mode, it doesn't load `libcufile.so`. Instead, reads and writes are done using POSIX. Note that this is not the same as the compatibility mode in cuFile. That is, cuFile can run in compatibility mode while KvikIO does not. + +Set the environment variable `KVIKIO_COMPAT_MODE` to enable/disable compatibility mode. By default, compatibility mode is enabled: + - when `libcufile.so` cannot be found. + - when running in Windows Subsystem for Linux (WSL). 
+ - when `/run/udev` isn't readable, which typically happens when running inside a docker image not launched with `--volume /run/udev:/run/udev:ro`. + +#### Thread Pool (KVIKIO_NTHREADS) +KvikIO can use multiple threads for IO automatically. Set the environment variable `KVIKIO_NTHREADS` to the number of threads in the thread pool. If not set, the default value is 1. + +#### Task Size (KVIKIO_TASK_SIZE) +KvikIO splits parallel IO operations into multiple tasks. Set the environment variable `KVIKIO_TASK_SIZE` to the maximum task size (in bytes). If not set, the default value is 4194304 (4 MiB). + +#### GDS Threshold (KVIKIO_GDS_THRESHOLD) +In order to improve performance of small IO, `.pread()` and `.pwrite()` implement a shortcut that circumvents the thread pool and uses the POSIX backend directly. Set the environment variable `KVIKIO_GDS_THRESHOLD` to the minimum size (in bytes) to use GDS. If not set, the default value is 1048576 (1 MiB). + + +## Example + +```cpp +#include <cstddef> +#include <cuda_runtime.h> +#include <kvikio/file_handle.hpp> +using namespace std; + +int main() +{ + // Create two arrays `a` and `b` + constexpr std::size_t size = 100; + void *a = nullptr; + void *b = nullptr; + cudaMalloc(&a, size); + cudaMalloc(&b, size); + + // Write `a` to file + kvikio::FileHandle fw("test-file", "w"); + size_t written = fw.write(a, size); + fw.close(); + + // Read file into `b` + kvikio::FileHandle fr("test-file", "r"); + size_t read = fr.read(b, size); + fr.close(); + + // Read file into `b` in parallel using 16 threads + kvikio::default_thread_pool::reset(16); + { + kvikio::FileHandle f("test-file", "r"); + future<size_t> future = f.pread(b, size, 0); // Non-blocking + size_t read = future.get(); // Blocking + // Notice, `f` closes automatically on destruction. + } +} +``` + +For a full runnable example see . 
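As a rough illustration of the runtime settings documented in the `main_page.md` hunk above, the following Python sketch resolves the three numeric environment variables using the fallback values stated in the text (1 thread, 4 MiB task size, 1 MiB GDS threshold). The helper `read_kvikio_settings` is hypothetical and not part of KvikIO; the library does its own parsing, which may differ in validation and error handling, so this only shows which values would apply for a given environment.

```python
import os

# Fallback values as stated in the documentation above.
DEFAULT_NTHREADS = 1
DEFAULT_TASK_SIZE = 4 * 1024 * 1024  # 4194304 bytes (4 MiB)
DEFAULT_GDS_THRESHOLD = 1024 * 1024  # 1048576 bytes (1 MiB)


def read_kvikio_settings(env=None):
    """Hypothetical helper: resolve the documented settings from `env`.

    `env` is any mapping of variable names to strings; it defaults to
    the process environment.
    """
    env = os.environ if env is None else env
    return {
        "nthreads": int(env.get("KVIKIO_NTHREADS", DEFAULT_NTHREADS)),
        "task_size": int(env.get("KVIKIO_TASK_SIZE", DEFAULT_TASK_SIZE)),
        "gds_threshold": int(
            env.get("KVIKIO_GDS_THRESHOLD", DEFAULT_GDS_THRESHOLD)
        ),
    }


if __name__ == "__main__":
    # Example: 8 threads, everything else left at the documented defaults.
    print(read_kvikio_settings({"KVIKIO_NTHREADS": "8"}))
```

In practice the variables would typically be set in the shell before the process starts, for example `KVIKIO_NTHREADS=8 python benchmarks/single-node-io.py`.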
diff --git a/cpp/include/kvikio/defaults.hpp b/cpp/include/kvikio/defaults.hpp index e515297408..d2ee6b8d91 100644 --- a/cpp/include/kvikio/defaults.hpp +++ b/cpp/include/kvikio/defaults.hpp @@ -218,8 +218,8 @@ class defaults { * In order to improve performance of small IO, `.pread()` and `.pwrite()` implement a shortcut * that circumvent the threadpool and use the POSIX backend directly. * - * Set the default value using `kvikio::default::task_size_reset()` or by setting the - * `KVIKIO_TASK_SIZE` environment variable. If not set, the default value is 1 MiB. + * Set the default value using `kvikio::defaults::gds_threshold_reset()` or by setting the + * `KVIKIO_GDS_THRESHOLD` environment variable. If not set, the default value is 1 MiB. * * @return The default GDS threshold size in bytes. */ diff --git a/dependencies.yaml b/dependencies.yaml index c90408f79e..0feb43f58c 100644 --- a/dependencies.yaml +++ b/dependencies.yaml @@ -236,8 +236,10 @@ dependencies: common: - output_types: [conda, requirements] packages: - - pydata-sphinx-theme + - numpydoc - sphinx + - sphinx-click + - sphinx_rtd_theme - output_types: conda packages: - doxygen=1.9.1 # pre-commit hook needs a specific version. diff --git a/docs/source/api.rst b/docs/source/api.rst index 5973ac8f29..4d19c09bbb 100644 --- a/docs/source/api.rst +++ b/docs/source/api.rst @@ -18,7 +18,6 @@ Zarr .. autoclass:: GDSStore :members: - Defaults -------- .. currentmodule:: kvikio.defaults diff --git a/docs/source/conf.py b/docs/source/conf.py index efd375e1f4..d36282c096 100644 --- a/docs/source/conf.py +++ b/docs/source/conf.py @@ -1,5 +1,5 @@ -#!/usr/bin/env python3 -# Copyright (c) 2022-2023, NVIDIA CORPORATION. +# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved. +# See file LICENSE for terms. # # Configuration file for the Sphinx documentation builder. 
# @@ -17,11 +17,10 @@ # import sys # sys.path.insert(0, os.path.abspath('.')) - # -- Project information ----------------------------------------------------- project = "kvikio" -copyright = "2022, NVIDIA" +copyright = "2023, NVIDIA" author = "NVIDIA" # The short X.Y version. @@ -37,40 +36,152 @@ # ones. extensions = [ "sphinx.ext.autodoc", + "sphinx.ext.mathjax", + "sphinx.ext.viewcode", + "sphinx.ext.githubpages", + "sphinx.ext.autosummary", + "sphinx.ext.intersphinx", + "sphinx.ext.extlinks", + "numpydoc", + "sphinx_click", + "sphinx_rtd_theme", ] +numpydoc_show_class_members = False + # Add any paths that contain templates here, relative to this directory. templates_path = ["_templates"] +# The suffix(es) of source filenames. +# You can specify multiple suffix as a list of string: +# +# source_suffix = ['.rst', '.md'] +source_suffix = ".rst" + +# The master toctree document. +master_doc = "index" + +# The language for content autogenerated by Sphinx. Refer to documentation +# for a list of supported languages. +# +# This is also used if you do content translation via gettext catalogs. +# Usually you set "language" from the command line for these cases. +language = "en" + # List of patterns, relative to source directory, that match files and # directories to ignore when looking for source files. # This pattern also affects html_static_path and html_extra_path. exclude_patterns = [] +# The name of the Pygments (syntax highlighting) style to use. +pygments_style = None + # -- Options for HTML output ------------------------------------------------- # The theme to use for HTML and HTML Help pages. See the documentation for # a list of builtin themes. 
# -html_theme = "pydata_sphinx_theme" -html_logo = "_static/RAPIDS-logo-purple.png" - -html_theme_options = { - "external_links": [], - # https://github.com/pydata/pydata-sphinx-theme/issues/1220 - "icon_links": [], - "github_url": "https://github.com/rapidsai/kvikio", - "twitter_url": "https://twitter.com/rapidsai", - "show_toc_level": 1, - "navbar_align": "right", -} +html_theme = "sphinx_rtd_theme" + +# Theme options are theme-specific and customize the look and feel of a theme +# further. For a list of options available for each theme, see the +# documentation. +# +# html_theme_options = {} # Add any paths that contain custom static files (such as style sheets) here, # relative to this directory. They are copied after the builtin static files, # so a file named "default.css" will overwrite the builtin "default.css". html_static_path = ["_static"] +# Custom sidebar templates, must be a dictionary that maps document names +# to template names. +# +# The default sidebars (for documents that don't match any pattern) are +# defined by theme itself. Builtin themes are using these templates by +# default: ``['localtoc.html', 'relations.html', 'sourcelink.html', +# 'searchbox.html']``. +# +# html_sidebars = {} + + +# -- Options for HTMLHelp output --------------------------------------------- + +# Output file base name for HTML help builder. +htmlhelp_basename = "kvikiodoc" + + +# -- Options for LaTeX output ------------------------------------------------ + +latex_elements = { + # The paper size ('letterpaper' or 'a4paper'). + # + # 'papersize': 'letterpaper', + # The font size ('10pt', '11pt' or '12pt'). + # + # 'pointsize': '10pt', + # Additional stuff for the LaTeX preamble. + # + # 'preamble': '', + # Latex figure (float) alignment + # + # 'figure_align': 'htbp', +} + +# Grouping the document tree into LaTeX files. List of tuples +# (source start file, target name, title, +# author, documentclass [howto, manual, or own class]). 
+latex_documents = [ + (master_doc, "kvikio.tex", "kvikio Documentation", "NVIDIA", "manual") +] + + +# -- Options for manual page output ------------------------------------------ + +# One entry per manual page. List of tuples +# (source start file, name, description, authors, manual section). +man_pages = [(master_doc, "kvikio", "kvikio Documentation", [author], 1)] + + +# -- Options for Texinfo output ---------------------------------------------- + +# Grouping the document tree into Texinfo files. List of tuples +# (source start file, target name, title, author, +# dir menu entry, description, category) +texinfo_documents = [ + ( + master_doc, + "kvikio", + "kvikio Documentation", + author, + "kvikio", + "One line description of project.", + "Miscellaneous", + ) +] + + +# -- Options for Epub output ------------------------------------------------- + +# Bibliographic Dublin Core info. +epub_title = project + +# The unique identifier of the text. This can be a ISBN number +# or the project homepage. +# +# epub_identifier = '' + +# A unique identification for the text. +# +# epub_uid = '' + +# A list of files that should not be packed into the epub file. +epub_exclude_files = ["search.html"] + + +# -- Extension configuration ------------------------------------------------- + def setup(app): app.add_css_file("https://docs.rapids.ai/assets/css/custom.css") diff --git a/docs/source/index.rst b/docs/source/index.rst index 31754db736..86c08a78df 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -1,18 +1,28 @@ -Welcome to KvikIO's documentation! -================================== +Welcome to KvikIO's Python documentation! +========================================= -KvikIO is a Python library providing bindings to `cuFile `_, which enables `GPUDirectStorage `_ (GDS). +KvikIO is a Python and C++ library for high performance file IO. It provides C++ and Python +bindings to `cuFile `_, +which enables `GPUDirect Storage `_ (GDS). 
+KvikIO also works efficiently when GDS isn't available and can read/write both host and device data seamlessly. -.. toctree:: - :maxdepth: 2 - :caption: Contents: +KvikIO is a part of the `RAPIDS `_ suite of open-source software libraries for GPU-accelerated data science. - api +.. note:: + This is the documentation for the Python library. For the C++ documentation of KvikIO, see under **libkvikio**. -Indices and tables -================== +Contents +-------- -* :ref:`genindex` -* :ref:`search` +.. toctree:: + :maxdepth: 1 + :caption: Getting Started + + install + quickstart + zarr + runtime_settings + api + genindex diff --git a/docs/source/install.rst b/docs/source/install.rst new file mode 100644 index 0000000000..c6f11a7a93 --- /dev/null +++ b/docs/source/install.rst @@ -0,0 +1,71 @@ +Installation +============ + +KvikIO can be installed using Conda/Mamba or from source. + + +Conda/Mamba +----------- + +We strongly recommend using `mamba `_ in place of conda, which we will do throughout the documentation. + +Install the **stable release** from the ``rapidsai`` channel with the following: + +.. code-block:: + + # Install in existing environment + mamba install -c rapidsai -c conda-forge kvikio + # Create new environment (CUDA 11.8) + mamba create -n kvikio-env -c rapidsai -c conda-forge python=3.10 cuda-version=11.8 kvikio + # Create new environment (CUDA 12.0) + mamba create -n kvikio-env -c rapidsai -c conda-forge python=3.10 cuda-version=12.0 kvikio + +Install the **nightly release** from the ``rapidsai-nightly`` channel with the following: + +.. code-block:: + + # Install in existing environment + mamba install -c rapidsai-nightly -c conda-forge kvikio + # Create new environment (CUDA 11.8) + mamba create -n kvikio-env -c rapidsai-nightly -c conda-forge python=3.10 cuda-version=11.8 kvikio + # Create new environment (CUDA 12.0) + mamba create -n kvikio-env -c rapidsai-nightly -c conda-forge python=3.10 cuda-version=12.0 kvikio + + +.. 
note:: + + If the nightly install doesn't work, set ``channel_priority: flexible`` in your ``.condarc``. + +Build from source +----------------- + +To set up a development environment, run: + +.. code-block:: + + # CUDA 11.8 + mamba env create --name kvikio-dev --file conda/environments/all_cuda-118_arch-x86_64.yaml + # CUDA 12.0 + mamba env create --name kvikio-dev --file conda/environments/all_cuda-120_arch-x86_64.yaml + +To build and install the extension run: + +.. code-block:: + + ./build.sh kvikio + + +You might have to set ``CUDA_HOME`` to the path of the CUDA installation. + +In order to test the installation, run the following: + +.. code-block:: + + pytest tests/ + + +And to test performance, run the following: + +.. code-block:: + + python benchmarks/single-node-io.py diff --git a/docs/source/quickstart.rst b/docs/source/quickstart.rst new file mode 100644 index 0000000000..448b3101f5 --- /dev/null +++ b/docs/source/quickstart.rst @@ -0,0 +1,37 @@ +Quickstart +========== + +KvikIO can be used in place of Python's built-in `open() `_ function with the caveat that a file is always opened in binary (``"b"``) mode. +In order to open a file, use KvikIO's file handle :py:meth:`kvikio.cufile.CuFile`. + +.. 
code-block:: python + + import cupy + import kvikio + + a = cupy.arange(100) + f = kvikio.CuFile("test-file", "w") + # Write whole array to file + f.write(a) + f.close() + + b = cupy.empty_like(a) + f = kvikio.CuFile("test-file", "r") + # Read whole array from file + f.read(b) + assert all(a == b) + + # Use a context manager + c = cupy.empty_like(a) + with kvikio.CuFile("test-file", "r") as f: + f.read(c) + assert all(a == c) + + # Non-blocking read + d = cupy.empty_like(a) + with kvikio.CuFile("test-file", "r") as f: + future1 = f.pread(d[:50]) + future2 = f.pread(d[50:], file_offset=d[:50].nbytes) + future1.get() # Wait for first read + future2.get() # Wait for second read + assert all(a == d) diff --git a/docs/source/runtime_settings.rst b/docs/source/runtime_settings.rst new file mode 100644 index 0000000000..2d03eb2f87 --- /dev/null +++ b/docs/source/runtime_settings.rst @@ -0,0 +1,26 @@ +Runtime Settings +================ + +Compatibility Mode ``KVIKIO_COMPAT_MODE`` +----------------------------------------- +When KvikIO is running in compatibility mode, it doesn't load ``libcufile.so``. Instead, reads and writes are done using POSIX. Note that this is not the same as the compatibility mode in cuFile. That is, cuFile can run in compatibility mode while KvikIO does not. +Set the environment variable ``KVIKIO_COMPAT_MODE`` to enable/disable compatibility mode. By default, compatibility mode is enabled: + + * when ``libcufile.so`` cannot be found. + * when running in Windows Subsystem for Linux (WSL). + * when ``/run/udev`` isn't readable, which typically happens when running inside a docker image not launched with ``--volume /run/udev:/run/udev:ro``. + + +Thread Pool ``KVIKIO_NTHREADS`` +------------------------------- +KvikIO can use multiple threads for IO automatically. Set the environment variable ``KVIKIO_NTHREADS`` to the number of threads in the thread pool. If not set, the default value is 1. 
+ + +Task Size ``KVIKIO_TASK_SIZE`` +------------------------------ +KvikIO splits parallel IO operations into multiple tasks. Set the environment variable ``KVIKIO_TASK_SIZE`` to the maximum task size (in bytes). If not set, the default value is 4194304 (4 MiB). + + +GDS Threshold ``KVIKIO_GDS_THRESHOLD`` +-------------------------------------- +In order to improve performance of small IO, ``.pread()`` and ``.pwrite()`` implement a shortcut that circumvents the thread pool and uses the POSIX backend directly. Set the environment variable ``KVIKIO_GDS_THRESHOLD`` to the minimum size (in bytes) to use GDS. If not set, the default value is 1048576 (1 MiB). diff --git a/docs/source/zarr.rst b/docs/source/zarr.rst new file mode 100644 index 0000000000..f2a697d525 --- /dev/null +++ b/docs/source/zarr.rst @@ -0,0 +1,15 @@ +Zarr +==== + +`Zarr `_ is a binary file format for chunked, compressed, N-dimensional arrays. It is used throughout the PyData ecosystem and especially for climate and biological science applications. + + +`Zarr-Python `_ is the official Python package for reading and writing Zarr arrays. Its main feature is a NumPy-like array that translates array operations into file IO seamlessly. +KvikIO provides a GPU backend to Zarr-Python that enables `GPUDirect Storage (GDS) `_ seamlessly. + +The following is an example of how to use the convenience function :py:meth:`kvikio.zarr.open_cupy_array` +to create a new Zarr array and how to open an existing Zarr array. + + +.. literalinclude:: ../../python/examples/zarr_cupy_nvcomp.py + :language: python diff --git a/python/examples/zarr_cupy_nvcomp.py b/python/examples/zarr_cupy_nvcomp.py index 03d96b21ef..766139b442 100644 --- a/python/examples/zarr_cupy_nvcomp.py +++ b/python/examples/zarr_cupy_nvcomp.py @@ -29,7 +29,7 @@ def main(path): # Normally, we cannot assume that GPU and CPU compressors are compatible. 
# E.g., `open_cupy_array()` uses nvCOMP's Snappy GPU compression by default, # which, as far as we know, isn’t compatible with any CPU compressor. Thus, - # let’s re-write our Zarr array using a CPU and GPU compatible compressor. + # let's re-write our Zarr array using a CPU and GPU compatible compressor. z = kvikio.zarr.open_cupy_array( store=path, mode="w",