Release 0.4.0: Multi-Platform Support, Logo and License Change · alpaka-group/alpaka

Compatibility Changes:

added support for CUDA 10.0, 10.1 and 10.2
dropped support for CUDA 7.0 and 7.5
added official support for Visual Studio 2017 on Windows with CUDA 10 (now built on Travis CI instead of AppVeyor)
added support for Xcode 10.2 to 11.3 (no official CUDA support yet)
added support for Ubuntu 18.04
added support for gcc 9
added support for clang 7.0, 8.0 and 9.0
dropped support for clang 3.5, 3.6, 3.7, 3.8 and 3.9
added support for CMake 3.13, 3.14, 3.15 and 3.16
dropped support for CMake 3.11.3 and lower; 3.11.4 is now the lowest supported version
added support for Boost 1.69, 1.70 and 1.71
added support for usage of libc++ instead of libstdc++ for clang builds
removed dependency on Boost.MPL and BOOST_CURRENT_FUNCTION
replaced Boost.Test with Catch2; an internal version of Catch2 is used by default, but an external one can be used instead

Bug Fixes:

fixed some incorrect host/device function attributes
fixed warning about comparison unsigned < 0
it is no longer necessary to disable all other back-ends manually when using ALPAKA_ACC_GPU_CUDA_ONLY_MODE
fixed static block shared memory of types with alignment higher than defaultAlignment
fixed race condition in HIP/NVCC queue
fixed data races when a GPU updates host memory by always aligning host memory buffers to 4 KiB
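
The 4 KiB alignment mentioned in the last fix can be illustrated in plain C++17. This is a minimal sketch and not alpaka's actual implementation: hostBufAlignment, AlignedDeleter, HostBuf, allocHostBuf and isAligned are hypothetical names introduced here, and the aligned operator new is only assumed to be a reasonable stand-in for whatever allocator alpaka uses internally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>

// Hypothetical constant for this sketch: 4 KiB, the alignment named in the release note.
constexpr std::size_t hostBufAlignment = 4096;

// True if `p` starts on a 4 KiB boundary.
inline bool isAligned(void const* p)
{
    return reinterpret_cast<std::uintptr_t>(p) % hostBufAlignment == 0;
}

// Deleter that releases memory obtained from the aligned operator new.
struct AlignedDeleter
{
    void operator()(void* p) const noexcept
    {
        ::operator delete(p, std::align_val_t{hostBufAlignment});
    }
};

// Owning handle for an aligned host buffer (hypothetical type, not an alpaka buffer).
using HostBuf = std::unique_ptr<void, AlignedDeleter>;

// Allocate `bytes` bytes of host memory starting on a 4 KiB boundary.
inline HostBuf allocHostBuf(std::size_t bytes)
{
    void* p = ::operator new(bytes, std::align_val_t{hostBufAlignment});
    assert(isAligned(p)); // sanity check on the allocator's result
    return HostBuf{p};
}
```

A caller would hold the result of allocHostBuf(bytes) for the lifetime of the buffer; the deleter passes the same alignment back to operator delete, which the aligned allocation functions require.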

New Features:

Added a new alpaka Logo!
the whole alpaka code has been relicensed to MPL2 and the examples to ISC
added ALPAKA_CXX_STANDARD CMake option, which selects the C++ standard to be used
added ALPAKA_CUDA_NVCC_SEPARABLE_COMPILATION option to enable separable compilation for nvcc
added ALPAKA_CUDA_NVCC_EXPT_EXTENDED_LAMBDA and ALPAKA_CUDA_NVCC_EXPT_RELAXED_CONSTEXPR CMake options to enable/disable those nvcc options (they were always ON before)
added headers for standalone usage without CMake (alpaka/standalone/GpuCudaRt.h, ...) which set the backend defines
added experimental HIP back-end using nvcc (HIP >= 1.5.1 required, latest rocRand). More on HIP setup: doc/markdown/user/implementation/mapping/HIP.md
added sincos math function implementations
allowed copy and move construction of ViewPlainPtr
added support for CUDA atomics using "unsigned long int"
added compile-time error for atomic CUDA ops which are not available due to sm restrictions
added explicit errors for unsupported types/operations for CUDA atomics
replaced usages of assert with ALPAKA_ASSERT
replaced BOOST_VERIFY by ALPAKA_CHECK and returned success from all test kernels
added alpaka::ignore_unused as replacement for boost::ignore_unused
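
To show what a helper like ignore_unused is typically for, here is a minimal sketch in standard C++. It assumes nothing about how alpaka::ignore_unused is actually implemented: the namespace sketch and the function checkedHalf are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

namespace sketch
{
    // Accepts any number of arguments by const reference and does nothing with them.
    // Naming a variable as an argument counts as a use, which silences
    // unused-variable warnings.
    template<typename... Ts>
    constexpr void ignore_unused(Ts const&...) noexcept
    {
    }
}

// Hypothetical example function: `isEven` is only read inside the assert.
inline int checkedHalf(int value)
{
    bool const isEven = (value % 2) == 0;
    assert(isEven);                 // compiled out when NDEBUG is defined
    sketch::ignore_unused(isEven);  // keeps the variable "used" in that case
    return value / 2;
}
```

The typical situation is the one shown: a value that is only consumed by an assertion becomes unused in release builds, and the helper avoids the resulting warning without conditional compilation.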

Breaking changes:

renamed Queue*Async to Queue*NonBlocking and Queue*Sync to Queue*Blocking
renamed alpaka::size::Size to alpaka::idx::Idx, alpaka::size::SizeType to alpaka::idx::IdxType (and TSize to TIdx internally)
replaced ALPAKA_FN_ACC_NO_CUDA by ALPAKA_FN_HOST
replaced ALPAKA_FN_ACC_CUDA_ONLY by direct usage of __device__
renamed ALPAKA_STATIC_DEV_MEM_CONSTANT to ALPAKA_STATIC_ACC_MEM_CONSTANT and ALPAKA_STATIC_DEV_MEM_GLOBAL to ALPAKA_STATIC_ACC_MEM_GLOBAL
renamed alpaka::kernel::createTaskExec to alpaka::kernel::createTaskKernel
QueueCpuSync now correctly blocks when called from multiple threads
** This broke some previous use-cases (e.g. usage within existing OpenMP parallel regions)
** This use case can now be handled with the support for external CPU queues, as can be seen in the example QueueCpuOmp2CollectiveImpl
previously it was possible to have kernels return values even though they were always ignored. Now kernels are checked to always return void
renamed all files with *Stl suffix to *StdLib
renamed BOOST_ARCH_CUDA_DEVICE to BOOST_ARCH_PTX
executors have been renamed due to the upcoming standard C++ feature with a different meaning. All files within alpaka/exec/ have been moved to alpaka/kernel/ and the files and classes have been renamed from Exec* to TaskKernel*. This should not affect users of alpaka but will affect extensions.
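
The requirement that kernels return void can be enforced at compile time. The following is a minimal sketch of one way to do this in C++17, not the check alpaka itself performs: the namespace sketch, returnsVoid, launch, FillKernel and BadKernel are hypothetical names, and the accelerator argument that real alpaka kernels receive is omitted for brevity.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

namespace sketch
{
    // True if calling TKernel with TArgs... yields void.
    template<typename TKernel, typename... TArgs>
    constexpr bool returnsVoid = std::is_void<std::invoke_result_t<TKernel, TArgs...>>::value;

    // Hypothetical launcher: rejects kernels with a non-void return type
    // at compile time, then simply calls the kernel on the host.
    template<typename TKernel, typename... TArgs>
    void launch(TKernel const& kernel, TArgs&&... args)
    {
        static_assert(returnsVoid<TKernel const&, TArgs...>, "kernels must return void");
        kernel(std::forward<TArgs>(args)...);
    }
}

// Accepted: the call operator returns void.
struct FillKernel
{
    void operator()(int* out, int value) const
    {
        assert(out != nullptr);
        *out = value;
    }
};

// Rejected by sketch::launch: the call operator returns int.
struct BadKernel
{
    int operator()(int* out, int value) const
    {
        *out = value;
        return 0;
    }
};
```

With these definitions, sketch::launch(FillKernel{}, &x, 7) compiles and writes 7 to x, while the same call with BadKernel{} stops compilation at the static_assert.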
