
Releases: fanshiqing/grouped_gemm

v1.1.4

11 Jul 08:51

Fix moe_permute_topK for the token-drop case.

Token drop support for permute & unpermute ops

07 May 08:23

Token drop support for permute & unpermute ops.

Add streams sync to multi-stream cublas.

25 Apr 13:37
  • Add stream synchronization to multi-stream cuBLAS.

Optimized permute/unpermute kernels for topk router.

19 Apr 07:43

Optimized permute/unpermute kernels for the top-k router.
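For readers unfamiliar with the operation, here is a minimal plain-Python reference sketch of what a top-k permute/unpermute pair conceptually does in mixture-of-experts routing. It is not this library's API or kernel code: the function names permute_topk and unpermute_topk, their arguments, and the row_map bookkeeping are all hypothetical, and token dropping is not modeled.

```python
def permute_topk(tokens, expert_ids):
    """Group token copies by expert (hypothetical reference helper).

    tokens:     list of rows, one per token.
    expert_ids: per-token list of the k experts chosen by the router.
    Returns (permuted_rows, row_map), where row_map[i] = (token, slot)
    records which token and which top-k slot produced permuted row i.
    """
    # One (expert, token, slot) entry per routed copy of a token.
    pairs = [(e, t, s)
             for t, ids in enumerate(expert_ids)
             for s, e in enumerate(ids)]
    # Stable sort by expert so each expert's rows become contiguous.
    pairs.sort(key=lambda p: p[0])
    permuted = [list(tokens[t]) for _, t, _ in pairs]
    row_map = [(t, s) for _, t, s in pairs]
    return permuted, row_map


def unpermute_topk(permuted, row_map, probs, num_tokens):
    """Restore the original token order, summing the k copies of each
    token weighted by the router probability probs[token][slot]."""
    hidden = len(permuted[0]) if permuted else 0
    out = [[0.0] * hidden for _ in range(num_tokens)]
    for row, (t, s) in zip(permuted, row_map):
        w = probs[t][s]
        for j, v in enumerate(row):
            out[t][j] += w * v
    return out


# Three tokens, hidden size 2, top-2 routing over three experts.
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
expert_ids = [[1, 0], [0, 2], [2, 1]]
probs = [[0.5, 0.5], [0.25, 0.75], [1.0, 0.0]]

permuted, row_map = permute_topk(tokens, expert_ids)
restored = unpermute_topk(permuted, row_map, probs, len(tokens))
```

In this sketch the experts are the identity function, so when each token's probabilities sum to 1 the unpermuted result equals the input. In a real pipeline the expert computation would run on the permuted rows between the two calls.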

Initial release

01 Apr 09:51
Pre-release
  • MegaBlocks-based gmm;
  • Multi-stream cuBLAS GEMM for sm90;
  • Permute/unpermute kernels;
  • Sinkhorn kernel.