Thank you for your excellent work!
I am currently trying to reproduce KVQuant but have run into an error. Any help with this would be appreciated.
1. Reproduce the bug
I followed the provided instructions and set up the environment for gradient/quant/deployment. The gradient and quantization steps worked: I successfully computed the gradients and built the quantizer. However, when I tested the deployment code using the following instructions, I got the error message "CUDA error: an illegal memory access was encountered."
2. Error logs
The detailed error logs are shown as follows:
As far as I can tell, the error is related to the CUDA kernel "vecquant4appendvecKsparse", which modifies the variable "outliers_rescaled".
3. Environment
Due to hardware constraints, I want to run a quick test on the smaller model weights indicated above. I expect KVQuant to work with them, since the smaller model differs from Llama-7B only in weight size and shares a similar architecture.
4. Related solutions that I have tried
As suggested in the discussion related to this CUDA error on https://github.com/pytorch/pytorch/issues/21819 , I have updated CUDA, torch, and other relevant components to the latest versions. However, I am still encountering the same error.
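Beyond upgrading versions, one step that may help confirm which kernel actually faults is to force synchronous kernel launches. CUDA errors are reported asynchronously by default, so the Python stack trace can point at a later, unrelated call. The sketch below only sets the standard `CUDA_LAUNCH_BLOCKING` environment variable. The helper name `blocking_launch_enabled` is hypothetical, and the snippet assumes it runs at the very top of the deployment script, before `torch` is imported.

```python
import os

# Assumption: this runs before `import torch`, so that the setting is
# already in place when the CUDA context is created.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


def blocking_launch_enabled() -> bool:
    """Hypothetical helper: report whether synchronous launches were requested."""
    return os.environ.get("CUDA_LAUNCH_BLOCKING") == "1"


# With synchronous launches, the reported stack trace should point at the
# call that launched the failing kernel rather than at a later operation.
print("CUDA_LAUNCH_BLOCKING enabled:", blocking_launch_enabled())
```

This slows execution noticeably, so it is meant for debugging runs only.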
What could be causing this error, and how can I solve it?
Thanks in advance!
Hi, I ran into the same problem. In my case, the cause was that my tensors were not all on the same device.
After I moved the tensors to the same device (CPU or CUDA), the problem was solved. Maybe this helps in your case.
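The device check described above can be sketched as a small guard that runs before a custom kernel is called. This is a minimal illustration, not part of KVQuant: the function names `find_device_mismatch` and `assert_same_device` are hypothetical, and the code assumes only that each input exposes a `.device` attribute, as PyTorch tensors do.

```python
def find_device_mismatch(**named_tensors):
    """Return {name: device} if the inputs span more than one device, else {}.

    Assumption: every value has a `.device` attribute (true for torch tensors).
    """
    devices = {name: str(t.device) for name, t in named_tensors.items()}
    if len(set(devices.values())) > 1:
        return devices
    return {}


def assert_same_device(**named_tensors):
    """Raise a readable error instead of letting a kernel touch foreign memory."""
    mismatch = find_device_mismatch(**named_tensors)
    if mismatch:
        raise RuntimeError(f"tensors are on different devices: {mismatch}")
```

For example, calling `assert_same_device(keys=keys, outliers_rescaled=outliers_rescaled)` right before the kernel launch would fail early with the offending names if one of them is still on the CPU. The argument names here are placeholders for whatever the kernel actually receives.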
Thanks for your suggestions, I will give it a try!