v2024.02.0

@rhornung67 rhornung67 released this 14 Feb 20:43
· 492 commits to main since this release
82d1b92

This release contains several RAJA improvements and submodule updates.

Please download the RAJA-v2024.02.0.tar.gz file below. The others, generated by GitHub, may not work for you due to RAJA's dependencies on git submodules.

Notable changes include:

  • New features / API changes:

    • BREAKING CHANGE (ALMOST): The loop_exec and associated policies such as loop_atomic, loop_reduce, etc. were deprecated in the v2023.06.0 release (please see the release notes for that version for details). Users should replace these with seq_exec and associated policies for sequential CPU execution. The code behavior will be identical to what you observed with loop_exec, etc. However, due to a request from some users with special circumstances, the loop_* policies still exist in this release as type aliases to their seq_* analogues. The loop_* policies will be removed in a future release.
    • BREAKING CHANGE: RAJA TBB back-end support has been removed. It was not feature complete and the TBB API has changed so that the code no longer compiles with newer Intel compilers. Since we know of no project that depends on it, we have removed it.
    • An IndexLayout concept was added, which allows accessing elements of a RAJA View via a collection of indices and using a different indexing strategy along each dimension of a multi-dimensional View. Please see the RAJA User Guide for more information.
    • Add support for SYCL reductions using the new RAJA reduction API.
    • Add support for new reduction API for all back-ends in RAJA::launch.
  • Build changes/improvements:

    • Update BLT submodule to v0.6.1 and incorporate its new macros for managing TPL targets in CMake.
    • Update camp submodule to v2024.02.0, which contains changes to support ROCm 6.x compilers.
    • Update desul submodule to afbd448.
    • Replace internal use of HIP and CUDA platform macros with their newer versions to support the latest compilers.
  • Bug fixes/improvements:

    • Change internal memory allocation for HIP to use coarse-grained pinned memory, which improves performance because it can be cached on a device.
    • Fix compilation error resulting from incorrect namespacing of OpenMP execution policy.
    • Several fixes to internal implementation of Reducers and Operators.
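
Migration note for the loop_* deprecation described above: because the loop_* policies are now type aliases of their seq_* analogues, code written against either name instantiates the same templates and behaves identically. The sketch below illustrates that property only. It uses a hypothetical `mock` namespace with a toy `forall`, not RAJA's headers or implementation, and the helper names `doubled` and `check_migration` are assumptions made for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a library namespace. This is NOT RAJA's code;
// it only mimics the "deprecated name kept as a type alias" arrangement.
namespace mock {

struct seq_exec {};          // sequential CPU execution policy tag
using loop_exec = seq_exec;  // deprecated name, now just an alias

// Toy forall: calls body(i) for each i in [begin, end), in order.
template <typename Policy, typename Body>
void forall(std::size_t begin, std::size_t end, Body&& body) {
  static_assert(std::is_same<Policy, seq_exec>::value,
                "this sketch only models sequential execution");
  for (std::size_t i = begin; i < end; ++i) {
    body(i);
  }
}

}  // namespace mock

// Fills a vector with 2*i using whichever policy name the caller picks.
template <typename Policy>
std::vector<int> doubled(std::size_t n) {
  std::vector<int> out(n);
  mock::forall<Policy>(0, n, [&](std::size_t i) {
    out[i] = static_cast<int>(2 * i);
  });
  return out;
}

// Old and new spellings name the same type, so results must match.
inline void check_migration() {
  static_assert(std::is_same<mock::loop_exec, mock::seq_exec>::value,
                "alias and target are the same type");
  assert(doubled<mock::loop_exec>(5) == doubled<mock::seq_exec>(5));
}
```

In a real code base the migration is a rename of the policy type at each call site (loop_exec to seq_exec, and likewise for the associated loop_* policies). Doing it now avoids a compile error once the aliases are removed in a future release.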