Will the kernel affect the sparsity? When I read Chapter 3 of the paper and the intuitive code implementation of index building, I thought the vertical/slash lines were selected token by token, i.e., with a 1×1 token mask. But the attached image says "Slash lines use 64 × 64 blocks, while vertical lines use 1 × 64 blocks." Doesn't that mean the sparsity has decreased a lot?
Could this be why @iofu728 said that replacing all patterns with "vertical_and_slash" works better, since the sparsity is different? #17 (comment)
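To make the density difference concrete, here is a toy sketch (not the actual MInference kernel; the function name and setup are illustrative) comparing how many entries of the attention matrix are kept when a single slash (diagonal) line is covered token by token versus with 64×64 blocks:

```python
import numpy as np

def slash_mask_density(seq_len: int, block: int) -> float:
    """Fraction of the seq_len x seq_len attention matrix kept when the
    main diagonal (a slash line) is covered at the given block granularity."""
    mask = np.zeros((seq_len, seq_len), dtype=bool)
    for i in range(seq_len):
        if block == 1:
            mask[i, i] = True  # exact token on the slash line
        else:
            # keep the whole block x block tile containing entry (i, i)
            b = (i // block) * block
            mask[b:b + block, b:b + block] = True
    return mask.mean()

print(slash_mask_density(1024, 1))   # 1/1024: token-level diagonal
print(slash_mask_density(1024, 64))  # 1/16: 64x64 blocks keep 64x more entries
```

So per slash line, 64×64 blocks do keep many more entries than a 1×1 mask, which is why the effective kernel sparsity can differ from the token-level pattern the index-building code suggests.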
Hi @susu1210, thank you for your question. The block-wise computation is designed with a kernel-oriented approach.
To align different patterns with their corresponding sparsity, we adjust the offline search space. You can find the details in Section 3.2 and Appendix C.2. Additionally, the actual sparsity in the kernel is approximately 80%-95%, as shown in Figure 12. This level of sparsity is necessary to achieve a high end-to-end speedup.
Furthermore, I believe that replacing all patterns with "vertical_and_slash" works well mainly because it retains more dynamism than the A-shape pattern while still capturing most of the block-sparse information. We will continue to analyze this aspect further.