In theory, Binarized Neural Networks promise 32x memory compression and roughly 10x acceleration on GPUs. In practice, however, PyTorch stores and manipulates all tensors as 32-bit floats, so implementing genuine 1-bit networks in PyTorch requires a custom kernel.
This repository contains wrapped CUDA and CPU kernels that support both 1-bit storage and 1-bit computation in PyTorch.
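To see why 1-bit storage needs more than a dtype change, consider bit packing. The following is a minimal sketch of the idea only, not the repository's kernel; `pack_signs` and its bit layout are invented here purely for illustration.

```python
import torch

# Sign-binarize float weights, then pack each group of 32 bits into one
# int32 word -- 32x smaller than the float32 original.
# (Illustrative only; the repository's kernels use their own packed layout.)
def pack_signs(w):
    bits = (w.reshape(-1) >= 0).to(torch.int64)        # 1 bit per weight
    pad = (-bits.numel()) % 32                         # pad to a multiple of 32
    bits = torch.cat([bits, bits.new_zeros(pad)])
    pows = torch.tensor([1 << i for i in range(32)], dtype=torch.int64)
    words = (bits.reshape(-1, 32) * pows).sum(dim=1)   # assemble 32-bit words
    return words.to(torch.int32)                       # cast keeps the low 32 bits

w = torch.randn(1024)
packed = pack_signs(w)
print(w.numel() * 4, "bytes as float32 ->", packed.numel() * 4, "bytes packed")
```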
## Requirements

- GCC >= 5.0
- PyTorch < 1.0
- Python >= 3.5
- CUDA >= 8.0
## Installation

Build the GPU (CUDA) kernel:

```bash
cd ./csrc/binop
make
```

Build the CPU kernel:

```bash
cd ./csrc/binop_cpu
make
```
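After running `make`, a quick way to confirm the builds succeeded is to try importing both extensions from Python, started somewhere the compiled modules are importable (an assumption about your path setup):

```python
import torch  # import torch before the extension modules

# Report which of the two kernels built successfully.
try:
    import binop
    print("GPU kernel loaded")
except ImportError as err:
    print("GPU kernel unavailable:", err)

try:
    import binop_cpu
    print("CPU kernel loaded")
except ImportError as err:
    print("CPU kernel unavailable:", err)
```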
## Usage

GPU:

```python
import torch
import binop
...
```

CPU:

```python
import torch
import binop_cpu
```
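The callable kernels are whatever the compiled extension exports, so rather than guessing at signatures, one way to explore the API before wiring the ops into a model is to enumerate the module's public names:

```python
import torch
import binop  # use `import binop_cpu` instead on a machine without CUDA

# List the extension's exported (non-underscore) names to discover the API.
print([name for name in dir(binop) if not name.startswith("_")])
```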
## Reference

- PyTorch Custom CUDA Kernel Tutorial
## Examples

- The GPU example is in `test`.
- The CPU example is in `test`.