jfc4050/fast-ops

fast-ops is a personal project library containing efficient PyTorch operators, usually targeting (NVIDIA) GPUs.

Generally, we focus on operators that aren't already implemented in other high-performance operator libraries, unless we feel we can beat them on performance, features, or usability. Some other places you can go "shopping" for operators are:

Operators

Development

Dependencies

This project's Python dependencies are managed with Poetry.

You can install dependencies (or subsets for development and testing) using:

> poetry install --no-root

This will create a new virtual environment with all dependencies installed, which can be activated using:

> source $(poetry env info --path)/bin/activate

Testing

Some test files support using pytest-xdist to parallelize tests across GPUs. After installing it (you would have gotten it from poetry install), you can run your tests like:

> pytest -n 8

to utilize 8 devices. You can sometimes get away with more workers than devices, but other times you'll run into out-of-memory (OOM) errors.
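
As a rough illustration of how a test file might spread pytest-xdist workers across GPUs, here is a minimal sketch. It relies on the `PYTEST_XDIST_WORKER` environment variable that pytest-xdist sets in each worker (values like `gw0`, `gw1`, ...). The helper names `device_index_for_worker` and `current_device_index` are hypothetical and are not taken from this repository, whose tests may choose devices differently.

```python
import os


def device_index_for_worker(worker_id: str, num_devices: int) -> int:
    """Map a pytest-xdist worker id such as 'gw3' to a device index.

    Hypothetical helper: workers are assigned round-robin, so running more
    workers than devices places several workers on the same device.
    """
    # pytest-xdist reports "master" when tests are not distributed.
    if worker_id == "master" or num_devices <= 0:
        return 0
    # Worker ids have the form "gw<N>"; strip the "gw" prefix.
    return int(worker_id[2:]) % num_devices


def current_device_index(num_devices: int) -> int:
    """Device index for the current process, based on its xdist worker id."""
    worker_id = os.environ.get("PYTEST_XDIST_WORKER", "master")
    return device_index_for_worker(worker_id, num_devices)
```

A test could then build its device string from `current_device_index(...)`, passing the number of visible GPUs. With more workers than devices, the round-robin mapping shares a device between workers, which is one way the OOMs mentioned above can arise.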

Language Server Support

We use Bear to generate the compile_commands.json file that is used for language servers. If you need to update this file you can run:

> bear python setup.py develop

or

> bear pytest
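
After regenerating the file, it can be handy to check which sources Bear actually captured. The sketch below assumes the standard JSON compilation database layout (a list of objects, each with a `file` key) and that `compile_commands.json` sits in the current directory. The function names are hypothetical and are not part of this project.

```python
import json
from pathlib import Path


def load_entries(path="compile_commands.json"):
    """Parse a compilation database into a list of entry dicts."""
    return json.loads(Path(path).read_text())


def source_files(entries):
    """Return the sorted, de-duplicated source files named in the entries."""
    return sorted({entry["file"] for entry in entries})
```

Calling `source_files(load_entries())` from the project root lists the captured files. An empty result suggests that nothing was compiled during the run, for example because a previous build was already up to date.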

Formatting

Run scripts/fmt to reformat all project files using clang-format and black.

Resources
