Should CUDA's uint3 be converted to sycl::marray<unsigned int, 3>? See https://github.com/intel/llvm/issues/20555
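
To make the question concrete, here is a minimal sketch of what such a conversion changes for calling code. It deliberately avoids CUDA and SYCL headers so it compiles with any C++17 compiler: `Uint3Like` is a hypothetical stand-in for CUDA's `uint3` (three `unsigned int` members named `x`, `y`, `z`), and `MArray3Like` is a `std::array` stand-in for `sycl::marray<unsigned int, 3>`, which, as far as I know, is accessed by index rather than by named members. The helper `to_marray_like` is also an assumption made for illustration, not anything from the linked issue or from a migration tool.

```cpp
#include <array>

// Hypothetical stand-in for CUDA's uint3: three named unsigned int members.
struct Uint3Like {
    unsigned int x, y, z;
};

// Stand-in for sycl::marray<unsigned int, 3>. std::array is used only so this
// sketch builds without a SYCL toolchain; it is not the real SYCL type.
using MArray3Like = std::array<unsigned int, 3>;

// Illustrative helper (an assumption, not an existing API): map the named
// members x, y, z onto indices 0, 1, 2.
constexpr MArray3Like to_marray_like(const Uint3Like& v) {
    return {v.x, v.y, v.z};
}

// On common ABIs both stand-ins hold three unsigned ints with no padding.
// This checks the stand-ins only; it says nothing definitive about the
// actual CUDA or SYCL types.
static_assert(sizeof(Uint3Like) == 3 * sizeof(unsigned int));
static_assert(sizeof(MArray3Like) == 3 * sizeof(unsigned int));
```

The practical consequence, if the mapping is adopted, is that source code written as `v.x`, `v.y`, `v.z` would have to become `v[0]`, `v[1]`, `v[2]`, so the conversion is not a drop-in type substitution.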