High error in 50% sparsity #14
Open
Hello,
I already read the issue about the total error reported at the end, and I understand that the errors are fairly low in that particular case. I ran the same configuration and got the same error, but when I decrease the sparsity ratio to 50%, the error becomes very high and there are large mismatches between cuBLAS and FlashLLM:
First 10 Mismatches between Cublas and MySpMM: NOTHING PRINTED
******************************************Problem Size******************************************
M: 7168 N: 8 K: 7168 Pruning Rate: 90 SplitK: 7
CuBlas_SIMT -> Time/ms: 0.454 Performance/TFLOPs: 1.81 TotalError: 0.00
CuBlas_TC -> Time/ms: 0.224 Performance/TFLOPs: 3.67 TotalError: 0.00
FlashLLM_v1 -> Time/ms: 0.064 Performance/TFLOPs: 12.84 TotalError: 408.53
FlashLLM_v2 -> Time/ms: 0.064 Performance/TFLOPs: 12.85 TotalError: 408.53
------
First 10 Mismatches between Cublas and MySpMM: NOTHING PRINTED
******************************************Problem Size******************************************
M: 7168 N: 8 K: 7168 Pruning Rate: 70 SplitK: 7
CuBlas_SIMT -> Time/ms: 0.454 Performance/TFLOPs: 1.81 TotalError: 0.00
CuBlas_TC -> Time/ms: 0.224 Performance/TFLOPs: 3.67 TotalError: 0.00
FlashLLM_v1 -> Time/ms: 0.136 Performance/TFLOPs: 6.05 TotalError: 1099.75
FlashLLM_v2 -> Time/ms: 0.136 Performance/TFLOPs: 6.05 TotalError: 1099.75
------
First 10 Mismatches between Cublas and MySpMM: NOTHING PRINTED
******************************************Problem Size******************************************
M: 7168 N: 8 K: 7168 Pruning Rate: 60 SplitK: 7
CuBlas_SIMT -> Time/ms: 0.454 Performance/TFLOPs: 1.81 TotalError: 0.00
CuBlas_TC -> Time/ms: 0.224 Performance/TFLOPs: 3.67 TotalError: 0.00
FlashLLM_v1 -> Time/ms: 0.178 Performance/TFLOPs: 4.62 TotalError: 1699.12
FlashLLM_v2 -> Time/ms: 0.178 Performance/TFLOPs: 4.62 TotalError: 1699.12
------
First 10 Mismatches between Cublas and MySpMM:
(128,0) CuBlas=-340.000000 MySpMM=-290.750000
(128,1) CuBlas=-343.250000 MySpMM=-299.000000
(128,2) CuBlas=-363.500000 MySpMM=-299.500000
(128,3) CuBlas=-377.250000 MySpMM=-317.000000
(128,4) CuBlas=-342.250000 MySpMM=-297.250000
(128,5) CuBlas=-372.500000 MySpMM=-318.250000
(128,6) CuBlas=-333.000000 MySpMM=-288.250000
(128,7) CuBlas=-337.500000 MySpMM=-279.500000
(129,0) CuBlas=-330.750000 MySpMM=-271.000000
(129,1) CuBlas=-333.000000 MySpMM=-287.000000
******************************************Problem Size******************************************
M: 7168 N: 8 K: 7168 Pruning Rate: 50 SplitK: 7
CuBlas_SIMT -> Time/ms: 0.454 Performance/TFLOPs: 1.81 TotalError: 0.00
CuBlas_TC -> Time/ms: 0.224 Performance/TFLOPs: 3.67 TotalError: 0.00
FlashLLM_v1 -> Time/ms: 0.223 Performance/TFLOPs: 3.69 TotalError: 64917.88
FlashLLM_v2 -> Time/ms: 0.223 Performance/TFLOPs: 3.69 TotalError: 72974.00
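To put these numbers in perspective, here is a minimal Python sketch (not part of the benchmark itself) that computes the relative error of each printed mismatch and the growth of the FlashLLM_v1 TotalError from one pruning rate to the next. The values are copied from the logs above; the helper names `relative_errors` and `growth_factors` are hypothetical and exist only for this illustration.

```python
import re

# Mismatch lines copied verbatim from the 50% pruning run above.
MISMATCH_LOG = """\
(128,0) CuBlas=-340.000000 MySpMM=-290.750000
(128,1) CuBlas=-343.250000 MySpMM=-299.000000
(128,2) CuBlas=-363.500000 MySpMM=-299.500000
(128,3) CuBlas=-377.250000 MySpMM=-317.000000
(128,4) CuBlas=-342.250000 MySpMM=-297.250000
(128,5) CuBlas=-372.500000 MySpMM=-318.250000
(128,6) CuBlas=-333.000000 MySpMM=-288.250000
(128,7) CuBlas=-337.500000 MySpMM=-279.500000
(129,0) CuBlas=-330.750000 MySpMM=-271.000000
(129,1) CuBlas=-333.000000 MySpMM=-287.000000
"""

# FlashLLM_v1 TotalError per pruning rate, copied from the runs above.
TOTAL_ERROR_V1 = {90: 408.53, 70: 1099.75, 60: 1699.12, 50: 64917.88}

LINE_RE = re.compile(
    r"\((\d+),(\d+)\) CuBlas=(-?\d+\.\d+) MySpMM=(-?\d+\.\d+)"
)


def relative_errors(log):
    """Return {(row, col): |cublas - spmm| / |cublas|} for each mismatch line."""
    result = {}
    for match in LINE_RE.finditer(log):
        row, col = int(match.group(1)), int(match.group(2))
        reference, observed = float(match.group(3)), float(match.group(4))
        result[(row, col)] = abs(reference - observed) / abs(reference)
    return result


def growth_factors(errors):
    """Return TotalError ratios between consecutive pruning rates, high to low."""
    rates = sorted(errors, reverse=True)
    return {(a, b): errors[b] / errors[a] for a, b in zip(rates, rates[1:])}


if __name__ == "__main__":
    for (row, col), rel in relative_errors(MISMATCH_LOG).items():
        print(f"({row},{col}) relative error: {rel:.1%}")
    for (a, b), factor in growth_factors(TOTAL_ERROR_V1).items():
        print(f"{a}% -> {b}%: TotalError x{factor:.1f}")
```

On these values, the ten printed mismatches come out at roughly 13% to 18% relative error, and TotalError grows by about 38x between 60% and 50% pruning, compared with less than 3x for each of the earlier steps.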