Add logit softcapping to GQA #876
ORT 1.19.2 does not support
The |
@kunal-vaishnavi Thank you, makes sense.
### Description
This PR adds the `softcap` attribute to the `GroupQueryAttention` op.

### Motivation and Context
This PR helps resolve the `NaN` output issue with Gemma-2 raised in [this issue](#692).
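For readers unfamiliar with logit softcapping, here is a minimal sketch of the idea in plain Python. It is not the kernel code from this PR. It assumes the common formulation `softcap * tanh(score / softcap)` applied to attention scores before softmax, and it assumes a value of `0.0` means softcapping is disabled. The function names and the example value `50.0` are illustrative only.

```python
import math


def softcap_scores(scores, softcap):
    """Squash attention logits into the open interval (-softcap, softcap).

    Assumption: softcap == 0.0 means "disabled", so scores pass through unchanged.
    """
    if softcap == 0.0:
        return list(scores)
    return [softcap * math.tanh(s / softcap) for s in scores]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw attention logits, including some very large magnitudes.
raw = [1.0, 250.0, -400.0, 30.0]

# Illustrative cap value; the real value comes from the model configuration.
capped = softcap_scores(raw, softcap=50.0)
probs = softmax(capped)

print(capped)
print(probs)
```

Because `tanh` is monotonic, the relative ordering of the scores is preserved while their magnitude stays bounded by the cap.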