Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Kokkos execution policies on DDC algorithms #254

Closed
tpadioleau opened this issue Jan 8, 2024 · 6 comments
Closed

Support Kokkos execution policies on DDC algorithms #254

tpadioleau opened this issue Jan 8, 2024 · 6 comments
Assignees
Labels
enhancement New feature or request

Comments

@tpadioleau
Copy link
Member

No description provided.

@tpadioleau tpadioleau added the enhancement New feature or request label Jan 8, 2024
@tpadioleau tpadioleau added this to the Align with Kokkos milestone Jan 8, 2024
@tpadioleau
Copy link
Member Author

No new execution policy for this issue

@tpadioleau tpadioleau mentioned this issue Jan 10, 2024
@thierryantoun
Copy link
Member

thierryantoun commented Jan 31, 2024

I added few modifications on a first file for_each_kokkos.hpp.
Perhaps someone could tell me if I'm on the right track or not to determine whether I should continue to work on this issue in this manner.

Main changes on the file

No more namespace policies

namespace policies {

inline constexpr serial_host_policy serial_host;
inline constexpr parallel_host_policy parallel_host;
inline constexpr parallel_device_policy parallel_device;

template <typename ExecSpace>
constexpr auto policy([[maybe_unused]] ExecSpace exec_space)
{
    if constexpr (std::is_same_v<ExecSpace, Kokkos::Serial>) {
        return ddc::policies::serial_host;
#ifdef KOKKOS_ENABLE_OPENMP
    } else if constexpr (std::is_same_v<ExecSpace, Kokkos::OpenMP>) {
        return ddc::policies::parallel_host;
#endif
#if defined(KOKKOS_ENABLE_CUDA) || defined(KOKKOS_ENABLE_HIP)
    } else {
        return ddc::policies::parallel_device;
#endif
    }
}

} // namespace policies

The goal for me is to replace this namespace and to explicit the Kokkos execution spaces.

"for_each" functions

In the original version, there are 3 for_each functions. Each one is executed on a different execution space:

  • Serial
  • Host
  • Device
    And the original code used the policy to explicit which execution space to use.
    In my code, I just have one for_each function:
template <class ExecSpace, class... DDims, class Functor>
inline void for_each(
        DiscreteDomain<DDims...> const& domain,
        Functor&& f) noexcept
{
    detail::for_each_kokkos<ExecSpace>(domain, std::forward<Functor>(f));
}

With the ExecSpace template, the user will just have to use:

  • Kokkos::Serial
  • Kokkos::DefaultExecutionSpace
  • Kokkos::DefaultHostExecutionSpace

I also deleted the for_each_serial function because for me it is the same as for_each_kokkos with the use of Kokkos::Serial as a template. Have to check with somebody on that. We see a consequence of this change in the next change I present.

"for_each_n" function

The initial function:

template <class... DDims, class Functor>
inline void for_each_n(
        serial_host_policy,
        DiscreteVector<DDims...> const& extent,
        Functor&& f) noexcept
{
    DiscreteVector<DDims...> const ddc_begin {};
    DiscreteVector<DDims...> const ddc_end = extent;
    std::array const begin = detail::array(ddc_begin);
    std::array const end = detail::array(ddc_end);
    detail::for_each_serial<DiscreteVector<DDims...>>(begin, end, std::forward<Functor>(f));
}

My function:

template <class... DDims, class Functor>
inline void for_each_n(
        DiscreteVector<DDims...> const& extent,
        Functor&& f) noexcept
{
    DiscreteVector<DDims...> const ddc_begin {};
    DiscreteVector<DDims...> const ddc_end = extent;
    std::array const begin = detail::array(ddc_begin);
    std::array const end = detail::array(ddc_end);
    detail::for_each_kokkos<Kokkos::Serial, DiscreteVector<DDims...>>(begin, end, std::forward<Functor>(f));
}

I was thinking the "for_each_serial" function could be replaced like that.

thierryantoun added a commit that referenced this issue Jan 31, 2024
@jbigot
Copy link
Member

jbigot commented Feb 1, 2024

Regarding the difference between for_each & for_each_n you can have a look at the standard, this is the same difference as std::for_each vs. std::for_each_n one iterates over the elements of a container while the other iterates over integers.

To be discussed with @tpadioleau , but I would keep a specialized ddc::policies::serial (Cf #174 ) that is different from Kokkos::Serial. The former is just a simple loop that stays on the current execution space and can be nested at will. The later is a specific case of Kokkos loop with the same constraints as the parallel one: i.e. no nesting for example.

Maybe the best would be to split it in two versions:

  • for_each[_n] that is sequential, and
  • par_for_each[_n] that takes a kokkos execution space as template parameter (and we should add an execution space instance as optional function parameter)

jbigot pushed a commit that referenced this issue Feb 8, 2024
@tpadioleau
Copy link
Member Author

Actually the only difference between std::for_each and std::for_each_n is the way you describe the end of the algorithm. DDC does not really follow this, it should be updated at some point.
Also, looking at the new constrained versions of these two algorithms, std::ranges::for_each has both an iterator based and a range based version. On the other hand std::ranges::for_each_n only has an iterator based version. Because DDC is closer to the range based versions because of DiscreteDomain/DiscreteVector, I tend to think for_each_n should be removed or only be an overload of for_each ?

@jbigot
Copy link
Member

jbigot commented Feb 8, 2024

Totally agree

jbigot pushed a commit that referenced this issue Feb 8, 2024
@tpadioleau
Copy link
Member Author

Done through different MR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants