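Hello, I have read quite a few implementations, but they do not handle Q, K, and V consistently. Some pass the same x as Q, K, and V directly in the forward pass, while others, like yours, apply separate linear layers to x first and then compute attention. So which approach is the right one? Or are both fine, with little effect on the final result?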
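To make the two variants in the question concrete, here is a minimal sketch in plain Python with no framework. It is not taken from this repository: the function names (`self_attention_shared`, `self_attention_projected`) and the weight matrices `w_q`, `w_k`, `w_v` are hypothetical, and the scaled dot-product form is assumed as the attention function.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Swap rows and columns."""
    return [list(col) for col in zip(*a)]


def softmax_rows(a):
    """Apply a numerically stable softmax to each row."""
    out = []
    for row in a:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def attention(q, k, v):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(q[0])
    scores = matmul(q, transpose(k))
    scaled = [[s / math.sqrt(d) for s in row] for row in scores]
    return matmul(softmax_rows(scaled), v)


def self_attention_shared(x):
    """Variant 1 (hypothetical name): x is used directly as Q, K, and V."""
    return attention(x, x, x)


def self_attention_projected(x, w_q, w_k, w_v):
    """Variant 2 (hypothetical name): Q, K, V come from separate linear maps of x."""
    return attention(matmul(x, w_q), matmul(x, w_k), matmul(x, w_v))


def random_matrix(rows, cols, rng):
    """Illustrative random matrix standing in for learned weights or inputs."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def identity(n):
    """n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    x = random_matrix(3, 4, rng)  # 3 tokens, 4 features each
    shared = self_attention_shared(x)
    projected = self_attention_projected(
        x,
        random_matrix(4, 4, rng),
        random_matrix(4, 4, rng),
        random_matrix(4, 4, rng),
    )
    print(len(shared), len(shared[0]))
    print(len(projected), len(projected[0]))
```

With identity matrices for `w_q`, `w_k`, and `w_v`, the projected variant reduces to the shared one, so the first can be read as a parameter-free special case of the second. In the shared variant the pre-softmax scores are x xᵀ and therefore symmetric; separate projections remove that constraint and add learnable parameters. Whether this changes final accuracy for a particular model is an empirical question that the sketch does not answer.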