how to optimize opencl gemm #62

Closed
liao0028 opened this issue Mar 7, 2024 · 1 comment

Comments

liao0028 commented Mar 7, 2024

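Following your issue #55, I implemented an OpenCL version that runs on a mobile GPU. I tried changing m_tile and n_tile to 4 and 8, which was faster than the previous m_tile=8, n_tile=4. Are there any other optimization approaches you could suggest?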
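One rough way to reason about the two tile shapes is to count the work per loaded element in a simple register-tiling model. The sketch below assumes each work-item keeps an m_tile x n_tile accumulator tile and, per step along k, loads m_tile values of A and n_tile values of B; the function name `tile_intensity` is hypothetical and this is not the code from issue #55.

```python
def tile_intensity(m_tile, n_tile):
    """FMAs per loaded element for one k-step of an m_tile x n_tile register tile.

    Assumed model: per k-step a work-item loads m_tile values of A and
    n_tile values of B, then does an outer-product update of its accumulators.
    """
    loads = m_tile + n_tile   # elements fetched per k-step
    fmas = m_tile * n_tile    # multiply-adds per k-step
    return fmas / loads


if __name__ == "__main__":
    for shape in [(8, 4), (4, 8), (8, 8)]:
        print(shape, tile_intensity(*shape))
```

Under this model 8x4 and 4x8 have the same ratio, so a measured difference between them would have to come from something the model ignores, for example how the loads map to vector instructions or memory layout. That would need to be confirmed by profiling on the target device.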

ysh329 (Owner) commented Mar 19, 2024

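You can refer to the other tuning-related issues. But first, check whether the current kernel reaches the peak compute throughput and memory bandwidth you expect; for those numbers you need to consult the hardware documentation.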
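The check described above can be sketched as a small roofline-style calculation. This is a minimal sketch under stated assumptions: the function name `gemm_roofline` is hypothetical, the traffic estimate assumes each matrix is touched exactly once (A and B read, C written), and the peak figures in the usage example are placeholders, not values for any real GPU.

```python
def gemm_roofline(m, n, k, seconds, peak_gflops, peak_gbps, bytes_per_elem=4):
    """Compare a measured GEMM time against a simple roofline bound.

    Returns (achieved_gflops, roof_gflops, fraction_of_roof).
    """
    flops = 2.0 * m * n * k                               # one multiply + one add per term
    min_bytes = bytes_per_elem * (m * k + k * n + m * n)  # ideal traffic: A, B read once, C written once
    achieved_gflops = flops / seconds / 1e9
    intensity = flops / min_bytes                         # FLOPs per byte
    roof_gflops = min(peak_gflops, intensity * peak_gbps) # compute-bound or bandwidth-bound limit
    return achieved_gflops, roof_gflops, achieved_gflops / roof_gflops


if __name__ == "__main__":
    # Placeholder numbers: take the real peaks from the hardware documentation
    # and the time from your own kernel measurement.
    achieved, roof, frac = gemm_roofline(1024, 1024, 1024, seconds=0.01,
                                         peak_gflops=500.0, peak_gbps=30.0)
    print(f"achieved {achieved:.1f} GFLOPS, roof {roof:.1f} GFLOPS, {frac:.0%} of roof")
```

If the fraction is already close to 1, further tile tuning is unlikely to help much; if it is far below, the gap indicates how much headroom remains.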

ysh329 closed this as completed Mar 19, 2024