SwekeR-463/kernels

Kernels in CUDA || Triton

Kernels for common deep learning functions.

activation

  • ELU (fp32, fp16, fp16x2, fp16x8_packed)
  • GeLU (fp32, fp16, fp16x4_packed)
  • Sigmoid (fp32, fp16, fp16x8_packed)
  • ReLU (fp32, fp16)
  • Swish (fp32, fp16)
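
As an illustration of what the variant names above usually mean, here is a minimal CUDA sketch of ELU in fp32 (one element per thread) and fp16x2 (one `half2`, i.e. two fp16 values, per thread). The kernel and helper names (`elu_f32`, `elu_fp32_kernel`, `elu_fp16x2_kernel`) are hypothetical and not taken from this repository. The fp16x2 kernel assumes an even element count and 4-byte-aligned pointers, and it does its math in fp32, which the actual kernels may or may not do.

```cuda
#include <cmath>
#include <cstdio>
#include <cuda_fp16.h>
#include <cuda_runtime.h>

// Hypothetical helper: ELU(x) = x if x > 0, else alpha * (exp(x) - 1).
__device__ __forceinline__ float elu_f32(float x, float alpha) {
    return x > 0.0f ? x : alpha * (expf(x) - 1.0f);
}

// fp32 variant: each thread handles one element.
__global__ void elu_fp32_kernel(const float* x, float* y, float alpha, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = elu_f32(x[i], alpha);
}

// fp16x2 variant: each thread loads one half2 (two fp16 values),
// computes in fp32, and stores one half2.
// Assumption: n is even and x, y are 4-byte aligned.
__global__ void elu_fp16x2_kernel(const half* x, half* y, float alpha, int n) {
    int i = 2 * (blockIdx.x * blockDim.x + threadIdx.x);
    if (i + 1 < n) {
        half2 v = *reinterpret_cast<const half2*>(x + i);
        float2 f = __half22float2(v);
        f.x = elu_f32(f.x, alpha);
        f.y = elu_f32(f.y, alpha);
        *reinterpret_cast<half2*>(y + i) = __float22half2_rn(f);
    }
}

int main() {
    const int n = 1024;          // even, as the fp16x2 kernel assumes
    const float alpha = 1.0f;
    float *x32, *y32;
    half *x16, *y16;
    cudaMallocManaged(&x32, n * sizeof(float));
    cudaMallocManaged(&y32, n * sizeof(float));
    cudaMallocManaged(&x16, n * sizeof(half));
    cudaMallocManaged(&y16, n * sizeof(half));

    // Inputs spread over [-4, 4) so both branches of ELU are exercised.
    for (int i = 0; i < n; ++i) {
        float v = -4.0f + 8.0f * i / n;
        x32[i] = v;
        x16[i] = __float2half(v);
    }

    const int threads = 256;
    elu_fp32_kernel<<<(n + threads - 1) / threads, threads>>>(x32, y32, alpha, n);
    elu_fp16x2_kernel<<<(n / 2 + threads - 1) / threads, threads>>>(x16, y16, alpha, n);
    cudaDeviceSynchronize();

    // Compare both outputs against a host fp32 reference.
    float err32 = 0.0f, err16 = 0.0f;
    for (int i = 0; i < n; ++i) {
        float v = x32[i];
        float ref = v > 0.0f ? v : alpha * expm1f(v);
        err32 = fmaxf(err32, fabsf(y32[i] - ref));
        err16 = fmaxf(err16, fabsf(__half2float(y16[i]) - ref));
    }
    printf("max abs error: fp32 = %g, fp16x2 = %g\n", err32, err16);

    cudaFree(x32); cudaFree(y32); cudaFree(x16); cudaFree(y16);
    return 0;
}
```

The packed variants (for example fp16x8_packed) typically extend the same idea by moving eight fp16 values per thread with a single 128-bit load and store; that reading of the name is an assumption here.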

embedding

  • a kernel similar to torch.nn.functional.embedding, in fp32 and fp16
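
For readers unfamiliar with the operation, the lookup performed by torch.nn.functional.embedding amounts to copying row `idx[t]` of a weight table into row `t` of the output. The sketch below shows one way to write that in CUDA; it is not the repository's kernel. The name `embedding_kernel`, the one-block-per-token launch layout, and the use of `int` indices (PyTorch uses integer tensors, commonly int64) are all assumptions made for illustration.

```cuda
#include <cstdio>
#include <cuda_fp16.h>
#include <cuda_runtime.h>

// Hypothetical kernel: one block per token; threads stride over the embedding dim.
// T can be float or half, since the kernel only copies values.
template <typename T>
__global__ void embedding_kernel(const int* idx, const T* weight, T* out, int dim) {
    int token = blockIdx.x;
    const T* src = weight + static_cast<size_t>(idx[token]) * dim;
    T* dst = out + static_cast<size_t>(token) * dim;
    for (int d = threadIdx.x; d < dim; d += blockDim.x) {
        dst[d] = src[d];
    }
}

int main() {
    const int vocab = 8, dim = 4, n_tokens = 3;
    float *weight, *out;
    int* idx;
    cudaMallocManaged(&weight, vocab * dim * sizeof(float));
    cudaMallocManaged(&out, n_tokens * dim * sizeof(float));
    cudaMallocManaged(&idx, n_tokens * sizeof(int));

    // Fill the table so that each entry encodes its own flat position.
    for (int i = 0; i < vocab * dim; ++i) weight[i] = static_cast<float>(i);
    idx[0] = 5; idx[1] = 0; idx[2] = 2;

    embedding_kernel<float><<<n_tokens, 32>>>(idx, weight, out, dim);
    cudaDeviceSynchronize();

    // Every output row must equal the table row selected by its index.
    int mismatches = 0;
    for (int t = 0; t < n_tokens; ++t)
        for (int d = 0; d < dim; ++d)
            if (out[t * dim + d] != weight[idx[t] * dim + d]) ++mismatches;
    printf("mismatches: %d\n", mismatches);

    cudaFree(weight); cudaFree(out); cudaFree(idx);
    return 0;
}
```

Instantiating the template as `embedding_kernel<half>` would give an fp16 variant under the same assumptions; no bounds check on the indices is performed in this sketch.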

About

Learning and writing kernels in CUDA / Triton.
