fast-ops is a personal-project library of efficient PyTorch operators, usually targeting (NVIDIA) GPUs.
Generally, we focus on operators that aren't already implemented in other high-performance operator libraries, unless we feel we can beat them on performance, features, or usability. Some other places you can go "shopping" for operators are:
- NVIDIA Apex
- Facebook xFormers
- ByteDance LightSeq
- FlashAttention - There are lots of other optimized operators in there besides FlashAttention itself.
- bitsandbytes - Various operations related to low precision (8-bit) training and inference.
Operators currently included:
- (Flash) Multi-Head Attention: Algorithm from FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness. Significantly faster than vanilla attention due to the fused implementation, and has $O(n)$ rather than $O(n^2)$ memory complexity (a naive reference for comparison is sketched after this list).
- (Fused) Lion Optimizer: Optimizer described in Symbolic Discovery of Optimization Algorithms. Claims improved convergence properties, and the optimizer state consists only of half-precision momentum, meaning large memory savings over commonly used optimizers like AdamW (see the update-rule sketch below).
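For a sense of what the fused attention kernel replaces, here is a naive reference implementation (a sketch for comparison, not this library's code). It materializes the full attention score matrix, which is exactly the $O(n^2)$ memory cost FlashAttention avoids by computing attention in tiles:

```python
import torch
import torch.nn.functional as F

def naive_attention(q, k, v):
    # q, k, v: (batch, heads, seq_len, head_dim)
    scale = q.shape[-1] ** -0.5
    # Materializes a (batch, heads, seq_len, seq_len) score matrix:
    # this is the O(n^2) memory that the fused kernel never allocates.
    scores = torch.matmul(q, k.transpose(-2, -1)) * scale
    return torch.matmul(F.softmax(scores, dim=-1), v)
```

The Lion update itself, per the paper, is a single sign-based step with decoupled weight decay. Below is an unfused, single-parameter sketch (the names lr, beta1, beta2, and wd are illustrative defaults, not this library's API); the fused operator presumably performs the same math in one kernel, with the momentum m kept in half precision:

```python
import torch

@torch.no_grad()
def lion_step(p, grad, m, lr=1e-4, beta1=0.9, beta2=0.99, wd=0.0):
    # Update direction is the sign of an interpolation between
    # the momentum and the current gradient.
    update = (beta1 * m + (1 - beta1) * grad).sign()
    # Decoupled weight decay, as in AdamW.
    p.mul_(1 - lr * wd).add_(update, alpha=-lr)
    # Momentum is the only optimizer state, so storing it in
    # fp16/bf16 roughly halves its footprint vs. one fp32 buffer.
    m.mul_(beta2).add_(grad, alpha=1 - beta2)
```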
This project's Python dependencies are managed with Poetry.
You can install dependencies (or subsets for development and testing) using:
> poetry install --no-root
This will create a new virtual environment with all dependencies installed, which can be activated using:
> source $(poetry env info --path)/bin/activate
Some test files support using pytest-xdist to parallelize tests across GPUs. After installing it (it comes with the poetry install above), you can run tests like:
> pytest -n 8
to utilize 8 devices. Sometimes you can get away with more workers than devices, but other times you'll get OOMs.
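How each worker picks a GPU is not spelled out here, but one common pattern (an assumption, not necessarily what this repo's tests do) is a conftest.py fixture that reads the PYTEST_XDIST_WORKER environment variable (gw0, gw1, ...) that pytest-xdist sets for each worker, and pins that worker to a device:

```python
# conftest.py -- hypothetical sketch of per-worker GPU pinning.
import os

import pytest
import torch

@pytest.fixture(autouse=True)
def pin_gpu_per_worker():
    # pytest-xdist sets PYTEST_XDIST_WORKER to "gw0", "gw1", ...;
    # the variable is unset when running without -n.
    worker = os.environ.get("PYTEST_XDIST_WORKER", "gw0")
    if torch.cuda.is_available():
        index = int(worker.lstrip("gw")) % torch.cuda.device_count()
        torch.cuda.set_device(index)
    yield
```

With 8 workers on 8 GPUs each worker gets its own device; with more workers than devices, workers share GPUs, which is where the OOMs mentioned above tend to come from.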
We use Bear to generate the compile_commands.json file used by language servers. If you need to update this file, you can run:
> bear python setup.py develop
or
> bear pytest
Run scripts/fmt to reformat all project files using clang-format and black.