For non-block-scaled Blackwell GEMMs, every multiple of 16 should in theory be supported as CtaN = {16, 32, 48, 64, 80, ...}, according to the UMMA specification.
However, I get a compilation error for every CtaN other than 16, 32, 64, 96, 128, 192, and 256.
For example, when I set the tile shape in the example 70_blackwell_fp16_gemm.cu to <_64, _48, _64>, I get a long, cryptic error that appears to be related to shape divisibility.
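To make the gap concrete, here is a minimal standalone C++ sketch that enumerates the candidate CtaN values and marks which ones are reported above as compiling. It does not use CUTLASS at all. The helper names are hypothetical, and the upper bound of 256 is an assumption taken from the largest value in the working list.

```cpp
#include <cstdio>

// Hypothetical helper: a candidate CtaN is any multiple of 16 in [16, 256].
// The upper bound of 256 is an assumption based on the list in this issue.
constexpr bool is_candidate_cta_n(int n) {
    return n >= 16 && n <= 256 && n % 16 == 0;
}

// CtaN values reported in this issue as compiling today.
constexpr int kReportedWorking[] = {16, 32, 64, 96, 128, 192, 256};

constexpr bool is_reported_working(int n) {
    for (int w : kReportedWorking) {
        if (w == n) return true;
    }
    return false;
}

// Counts candidate CtaN values that are NOT in the reported working list.
constexpr int count_reported_failing() {
    int count = 0;
    for (int n = 16; n <= 256; n += 16) {
        if (!is_reported_working(n)) ++count;
    }
    return count;
}

// Prints one line per candidate CtaN with its status as reported in this issue.
inline void print_cta_n_table() {
    for (int n = 16; n <= 256; n += 16) {
        std::printf("CtaN=%3d  %s\n", n,
                    is_reported_working(n) ? "compiles" : "fails to compile");
    }
}
```

Calling `print_cta_n_table()` from `main` lists the values that currently fail, including the 48 from the tile shape above.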
PR #2220 addresses this kind of issue for Hopper kernels, so I will come up with a similar approach for this issue soon.