Overall architecture of the Paddle-Lite OpenCL backend #53
Comments
OpenCL model conversion
OpenCL Predictor creation
OpenCL memory management
AutoTune
cl::NDRange CLContext::DefaultLocalWorkSize(
    const cl::NDRange &gws,
    register size_t max_ws,
    const int &divisor /*=2*/,
    const bool &reverse /*=false*/,
    const size_t &user_def_max_ws /*=0*/) {
  register size_t lx = reverse ? gws[2] : gws[0];
  register size_t ly = gws[1];
  register size_t lz = reverse ? gws[0] : gws[2];
  // Prefer the user-defined limit when it is set and does not exceed max_ws.
  max_ws = (user_def_max_ws > 0 && user_def_max_ws <= max_ws) ? user_def_max_ws
                                                              : max_ws;
  // Shrink the work-item budget by the divisor.
  max_ws = divisor > 1 ? max_ws / divisor : max_ws;
  if (max_ws > 0) {
    while (ly > max_ws) {
      // Use a bit test instead of a modulo: odd values drop to 1, even values are halved.
      ly = (ly & 0x01) ? 1 : ly >> 1;
    }
    while (ly * lz > max_ws) {
      lz = (lz & 0x01) ? 1 : lz >> 1;
    }
    while (ly * lz * lx > max_ws) {
      lx = (lx & 0x01) ? 1 : lx >> 1;
    }
  }
  return reverse ? cl::NDRange{lz, ly, lx} : cl::NDRange{lx, ly, lz};
}
Precision-setting API architecture
The precision API provides three precision settings:
This is similar to AutoTune, except that the implementation is placed in
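To see how the DefaultLocalWorkSize routine quoted in the AutoTune part above shrinks a work-group, here is a minimal standalone sketch. It is not the Paddle-Lite implementation: `Range3` (a `std::array` standing in for `cl::NDRange`) and the free function `DefaultLocalWorkSizeSketch` are hypothetical names introduced only for illustration, so that the logic compiles without OpenCL headers.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for cl::NDRange (assumption, not a Paddle-Lite type).
using Range3 = std::array<std::size_t, 3>;

// Hypothetical free-function version of the quoted member function.
// It mirrors the quoted logic: fix ly first, then lz, then lx, halving each
// value (or dropping odd values to 1) until the product fits the budget.
Range3 DefaultLocalWorkSizeSketch(const Range3 &gws,
                                  std::size_t max_ws,
                                  int divisor = 2,
                                  bool reverse = false,
                                  std::size_t user_def_max_ws = 0) {
  std::size_t lx = reverse ? gws[2] : gws[0];
  std::size_t ly = gws[1];
  std::size_t lz = reverse ? gws[0] : gws[2];

  // A user-defined limit wins only if it is set and not above max_ws.
  if (user_def_max_ws > 0 && user_def_max_ws <= max_ws) {
    max_ws = user_def_max_ws;
  }
  // The divisor reduces the work-item budget.
  if (divisor > 1) {
    max_ws /= static_cast<std::size_t>(divisor);
  }

  if (max_ws > 0) {
    while (ly > max_ws) ly = (ly & 0x01) ? 1 : ly >> 1;
    while (ly * lz > max_ws) lz = (lz & 0x01) ? 1 : lz >> 1;
    while (ly * lz * lx > max_ws) lx = (lx & 0x01) ? 1 : lx >> 1;
  }
  return reverse ? Range3{lz, ly, lx} : Range3{lx, ly, lz};
}
```

Under these assumptions, a global size of {64, 32, 8} with max_ws = 256 and the default divisor of 2 gives a budget of 128 work-items and a local size of {1, 32, 4}: ly stays at 32, lz is halved from 8 to 4, and lx is halved down to 1. An odd dimension collapses straight to 1, so {7, 5, 3} with max_ws = 16 yields {1, 5, 1}.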
Image2D data layout
static std::map<std::string, size_t> InitImageDimInfoWith(
    const DDim& tensor_dim) {
  // Right-align the tensor dims into NCHW, padding leading dims with 1.
  size_t new_dims[] = {1, 1, 1, 1};
  for (size_t j = 0; j < tensor_dim.size(); ++j) {
    new_dims[4 - tensor_dim.size() + j] = tensor_dim[j];
  }
  size_t N, C, H, W;
  N = new_dims[0];
  C = new_dims[1];
  H = new_dims[2];
  W = new_dims[3];
  // Channels are grouped in blocks of 4, rounded up, along the image width.
  size_t width = W * ((C + 3) / 4);
  size_t height = H * N;
  return std::map<std::string, size_t>({{"width", width}, {"height", height}});
}
If the tensor has fewer than 4 dimensions, for example only 2, the missing leading dimensions are padded with 1.
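The width and height formulas in InitImageDimInfoWith are easier to check with concrete shapes. The sketch below repeats the same arithmetic in a standalone form; `Dims` (a `std::vector` standing in for `DDim`) and `ImageDimsSketch` are hypothetical names used only for illustration, and the check for more than 4 dimensions is an added assumption that is not in the quoted code.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for Paddle-Lite's DDim (assumption).
using Dims = std::vector<std::size_t>;

// Hypothetical standalone version of the quoted width/height computation.
std::map<std::string, std::size_t> ImageDimsSketch(const Dims &tensor_dim) {
  // Added guard (assumption): the quoted code does not check this case.
  if (tensor_dim.size() > 4) {
    throw std::invalid_argument("expected at most 4 dimensions");
  }
  // Right-align the dims into NCHW; missing leading dims default to 1.
  std::size_t nchw[] = {1, 1, 1, 1};
  for (std::size_t j = 0; j < tensor_dim.size(); ++j) {
    nchw[4 - tensor_dim.size() + j] = tensor_dim[j];
  }
  const std::size_t N = nchw[0], C = nchw[1], H = nchw[2], W = nchw[3];
  // Every group of up to 4 channels occupies one pixel column block.
  const std::size_t width = W * ((C + 3) / 4);
  const std::size_t height = H * N;
  return {{"width", width}, {"height", height}};
}
```

With these assumptions, a {2, 32, 16, 16} tensor maps to an image of width 16 * 8 = 128 and height 16 * 2 = 32. A 2-dimensional {10, 20} tensor is treated as {1, 1, 10, 20}, giving width 20 and height 10, and a channel count of 5 rounds up to 2 blocks of 4.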
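The Paddle-Lite OpenCL backend consists of the following four main parts:
creation of cl::Program;
creation of cl::Kernel, setting LocalWorkSize and GlobalWorkSize, AutoTune, and so on;
framework-level wrapping of some of CLWrapper's OpenCL API functions, such as Malloc and Free for Image2D and Buffer.
Other miscellaneous points: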