Commit

#5 Added padding for all dimensions, as this is a requirement for the optimized 1D diffusions for each axis.
carljohnsen committed Sep 10, 2024
1 parent 8c90bdf commit ed3c47f
Showing 1 changed file with 5 additions and 1 deletion.
6 changes: 5 additions & 1 deletion src/lib/cpp/gpu/diffusion.cc
@@ -567,7 +567,11 @@ namespace gpu {

 void diffusion_in_memory(const uint8_t *__restrict__ voxels, const shape_t &N, const float *__restrict__ kernel, const int64_t kernel_size, const int64_t repititions, uint16_t *__restrict__ output) {
     constexpr int32_t veclen = 32; // TODO
-    const shape_t P = { N.z, N.y, (N.x + veclen-1) / veclen * veclen };
+    const shape_t P = {
+        ((N.z + veclen-1) / veclen) * veclen,
+        ((N.y + veclen-1) / veclen) * veclen,
+        ((N.x + veclen-1) / veclen) * veclen
+    };
     const int64_t
         padded_size = P.z*P.y*P.x,
         total_size = N.z*N.y*N.x,
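
The change above rounds each axis up to the next multiple of `veclen` using integer division. As a minimal sketch of that arithmetic in isolation: `round_up` and `padded_shape` are hypothetical helper names, and the `shape_t` struct below is an assumed stand-in, since the project's actual definition is not part of this diff.

```cpp
#include <cstdint>

// Assumed stand-in for the project's shape_t (not shown in the diff).
struct shape_t { int64_t z, y, x; };

// Hypothetical helper: round n up to the nearest multiple of m.
// Relies on truncating integer division, so it assumes n >= 0 and m > 0.
constexpr int64_t round_up(int64_t n, int64_t m) {
    return (n + m - 1) / m * m;
}

// Hypothetical helper: pad every axis to a multiple of veclen,
// mirroring the expression the commit applies to z, y, and x.
constexpr shape_t padded_shape(const shape_t &N, int64_t veclen) {
    return { round_up(N.z, veclen), round_up(N.y, veclen), round_up(N.x, veclen) };
}
```

For example, with `veclen = 32` a volume of shape `{100, 200, 300}` would be padded to `{128, 224, 320}`, while an axis that is already a multiple of 32 is left unchanged.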