The Midgard Shader Core #44
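Mali Midgard GPU architecture optimization details
The Midgard architecture covers the T600, T700, and T800 series. Arm has published optimization details for this architecture; below I go through them one by one, adding my own understanding. Most of this content comes from Arm's official OpenCL documentation for Midgard GPUs.
Ensure that all threads in a kernel finish at the same time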
Make your kernel code as simple as possible
Use vector operations in kernel code
Vectorize your code
Vectorize incrementally
Vectorize in incremental steps. For example, start processing one pixel at a time, then two, then four.
Avoid processing single values
Avoid writing kernels that operate on single bytes or other small values. Write kernels that work on vectors.
Use 128-bit vectors
Vector sizes of 128 bits are optimal. Vector sizes greater than 128 bits are broken into 128-bit parts and operated on separately. For example, adding two 256-bit vectors takes twice as long as adding two 128-bit vectors. You can use vector sizes less than 128 bits without issue. The disadvantage of using vectors greater than 128 bits is that they can increase code size. Increased code size uses more instruction cache space and this can reduce performance.
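The 128-bit rule and the incremental approach above can be sketched in plain C. This is a host-side illustration only, not OpenCL kernel code and not taken from Arm's documentation: the function names are hypothetical, and each loop iteration stands in for the work done by one work-item. In a real OpenCL kernel the four-wide step would typically use a built-in vector type such as `float4`.

```c
#include <stddef.h>

/* Hypothetical helper: number of 128-bit operations needed for one vector
   of `bits` width, following the rule that wider vectors are split into
   128-bit parts (so 256 bits costs two operations). */
static unsigned ops_for_vector_bits(unsigned bits) {
    return (bits + 127u) / 128u;
}

/* Hypothetical helper: how many elements of `elem_bytes` bytes fit in one
   128-bit (16-byte) vector, e.g. 4 floats or 16 single bytes. */
static unsigned lanes_per_128bit(unsigned elem_bytes) {
    return 16u / elem_bytes;
}

/* Step 1: one value per iteration (one value per work-item). */
static void add_scalar(const float *a, const float *b, float *out, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}

/* Step 2: four floats (128 bits) per iteration. The second loop handles
   the leftover elements when n is not a multiple of 4. */
static void add_vec4(const float *a, const float *b, float *out, size_t n) {
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        out[i + 0] = a[i + 0] + b[i + 0];
        out[i + 1] = a[i + 1] + b[i + 1];
        out[i + 2] = a[i + 2] + b[i + 2];
        out[i + 3] = a[i + 3] + b[i + 3];
    }
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

Keeping the scalar version around while moving to the four-wide one makes it easy to check at each step that both produce the same output, which is the point of vectorizing incrementally.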
https://developer.arm.com/solutions/graphics-and-gaming/developer-guides/learn-the-basics/the-midgard-shader-core/single-page
https://developer.arm.com/documentation/100614/0314/OpenCL-optimizations-list/Mali-Midgard-GPU-specific-optimizations