High error in 50% sparsity #14

Open
simlaharma opened this issue Dec 11, 2024 · 0 comments

Hello,
I have already read the issue about the total error reported at the end, and I understand that the errors are fairly low in that particular case. I ran the same configuration and got the same error, but when I decrease the sparsity ratio to 50%, the error becomes very high and there are large mismatches between cuBLAS and FlashLLM:

First 10 Mismatches between Cublas and MySpMM: NOTHING PRINTED
******************************************Problem Size******************************************
M: 7168 N: 8 K: 7168 Pruning Rate: 90 SplitK: 7
CuBlas_SIMT      ->              Time/ms: 0.454          Performance/TFLOPs: 1.81        TotalError: 0.00
CuBlas_TC        ->              Time/ms: 0.224          Performance/TFLOPs: 3.67        TotalError: 0.00
FlashLLM_v1      ->              Time/ms: 0.064          Performance/TFLOPs: 12.84       TotalError: 408.53
FlashLLM_v2      ->              Time/ms: 0.064          Performance/TFLOPs: 12.85       TotalError: 408.53

------

First 10 Mismatches between Cublas and MySpMM: NOTHING PRINTED
******************************************Problem Size******************************************
M: 7168 N: 8 K: 7168 Pruning Rate: 70 SplitK: 7
CuBlas_SIMT      ->              Time/ms: 0.454          Performance/TFLOPs: 1.81        TotalError: 0.00
CuBlas_TC        ->              Time/ms: 0.224          Performance/TFLOPs: 3.67        TotalError: 0.00
FlashLLM_v1      ->              Time/ms: 0.136          Performance/TFLOPs: 6.05        TotalError: 1099.75
FlashLLM_v2      ->              Time/ms: 0.136          Performance/TFLOPs: 6.05        TotalError: 1099.75

------


First 10 Mismatches between Cublas and MySpMM: NOTHING PRINTED
******************************************Problem Size******************************************
M: 7168 N: 8 K: 7168 Pruning Rate: 60 SplitK: 7
CuBlas_SIMT      ->              Time/ms: 0.454          Performance/TFLOPs: 1.81        TotalError: 0.00
CuBlas_TC        ->              Time/ms: 0.224          Performance/TFLOPs: 3.67        TotalError: 0.00
FlashLLM_v1      ->              Time/ms: 0.178          Performance/TFLOPs: 4.62        TotalError: 1699.12
FlashLLM_v2      ->              Time/ms: 0.178          Performance/TFLOPs: 4.62        TotalError: 1699.12

------


First 10 Mismatches between Cublas and MySpMM:
(128,0) CuBlas=-340.000000 MySpMM=-290.750000
(128,1) CuBlas=-343.250000 MySpMM=-299.000000
(128,2) CuBlas=-363.500000 MySpMM=-299.500000
(128,3) CuBlas=-377.250000 MySpMM=-317.000000
(128,4) CuBlas=-342.250000 MySpMM=-297.250000
(128,5) CuBlas=-372.500000 MySpMM=-318.250000
(128,6) CuBlas=-333.000000 MySpMM=-288.250000
(128,7) CuBlas=-337.500000 MySpMM=-279.500000
(129,0) CuBlas=-330.750000 MySpMM=-271.000000
(129,1) CuBlas=-333.000000 MySpMM=-287.000000
******************************************Problem Size******************************************
M: 7168 N: 8 K: 7168 Pruning Rate: 50 SplitK: 7
CuBlas_SIMT      ->              Time/ms: 0.454          Performance/TFLOPs: 1.81        TotalError: 0.00
CuBlas_TC        ->              Time/ms: 0.224          Performance/TFLOPs: 3.67        TotalError: 0.00
FlashLLM_v1      ->              Time/ms: 0.223          Performance/TFLOPs: 3.69        TotalError: 64917.88
FlashLLM_v2      ->              Time/ms: 0.223          Performance/TFLOPs: 3.69        TotalError: 72974.00
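
To put the mismatches printed for the 50% run in perspective, here is a minimal Python sketch that parses those ten lines and computes the relative error of each entry. It is a standalone helper, not part of the FlashLLM test harness: the names `parse_mismatches` and `relative_errors` are hypothetical, and the sketch makes no assumption about how the benchmark itself computes TotalError.

```python
import re

# The ten mismatch lines copied verbatim from the 50% pruning-rate log.
LOG = """\
(128,0) CuBlas=-340.000000 MySpMM=-290.750000
(128,1) CuBlas=-343.250000 MySpMM=-299.000000
(128,2) CuBlas=-363.500000 MySpMM=-299.500000
(128,3) CuBlas=-377.250000 MySpMM=-317.000000
(128,4) CuBlas=-342.250000 MySpMM=-297.250000
(128,5) CuBlas=-372.500000 MySpMM=-318.250000
(128,6) CuBlas=-333.000000 MySpMM=-288.250000
(128,7) CuBlas=-337.500000 MySpMM=-279.500000
(129,0) CuBlas=-330.750000 MySpMM=-271.000000
(129,1) CuBlas=-333.000000 MySpMM=-287.000000
"""

# Matches one line of the form "(row,col) CuBlas=<float> MySpMM=<float>".
PATTERN = re.compile(
    r"\((\d+),(\d+)\) CuBlas=(-?\d+\.\d+) MySpMM=(-?\d+\.\d+)"
)


def parse_mismatches(log):
    """Return a list of (row, col, cublas_value, spmm_value) tuples."""
    rows = []
    for m in PATTERN.finditer(log):
        row, col = int(m.group(1)), int(m.group(2))
        ref, got = float(m.group(3)), float(m.group(4))
        rows.append((row, col, ref, got))
    return rows


def relative_errors(rows):
    """Relative error of each SpMM value, using cuBLAS as the reference."""
    return [abs(got - ref) / abs(ref) for _, _, ref, got in rows]


rows = parse_mismatches(LOG)
rel = relative_errors(rows)
abs_sum = sum(abs(got - ref) for _, _, ref, got in rows)

for (r, c, ref, got), e in zip(rows, rel):
    print(f"({r},{c}) cublas={ref} spmm={got} rel_err={e:.3f}")
print(f"min={min(rel):.3f} max={max(rel):.3f} sum_abs_diff={abs_sum}")
```

For these ten entries the relative error is roughly 13% to 18%, and the MySpMM value is smaller in magnitude than the cuBLAS value in every case.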