Conversation
|
Hi @LLNL/raja-core, in collaboration with AMD staff we found a key optimization for nested loops within RAJA::launch. |
|
I wonder if we can't use the same policies but with an extra template argument to choose between the global and context versions of the variables. |
|
I don't think it's a problem to keep the old policies and have a lot of alternatives for folks to try. We do need to work on documenting policies better and have a comprehensive cookbook of examples that clearly show the differences between policies choices, including usage, performance, and how to choose. |
|
Do we want to make populating the context variables optional in case we find any overhead there? |
|
Any thoughts on blockIdx and gridDim? |
oh for register heavy kernels? that makes sense |
I see pro and cons, pro - for completeness could be handy, con - takes up more registers. Maybe we can do partial specializations of the launch context or something like that. This may be less common use cases though. |
|
Ya, I'm imagining the context and some policies both having a switch. Then in the loop implementation it checks that if the policy uses the switch then the context must have the same switch. template < bool switch >
struct Policy;
template < bool switch >
struct Context;
template < bool policy_switch, bool context_switch >
void loop(Policy<policy_switch>, Context<context_switch>)
{
static_assert(!policy_switch || (policy_switch && context_switch),
"If policy has switch then context must have switch");
} |
include/RAJA/policy/cuda/launch.hpp
Outdated
| for (int i = ::RAJA::internal::CudaDimHelper<DIM>::get(ctx.thread_id); | ||
| i < len; i += ::RAJA::internal::CudaDimHelper<DIM>::get(ctx.block_dim)) |
There was a problem hiding this comment.
If constexpr to get the values based on StoreDim3?
There was a problem hiding this comment.
See line 165 above if constexpr (LaunchContextT<LaunchContextPolicy>::hasDim3)
There was a problem hiding this comment.
I wanted to echo that I think this is a good idea
There was a problem hiding this comment.
Not sure if possible though because ctx is a function parameter
There was a problem hiding this comment.
We were able to get the argument type if there is one operator() that is not templated and has at least one argument. If that isn't true then it will use the default.
Co-authored-by: Jason Burmark <[email protected]>
|
@MrBurmark , do you have time to take a look? I think I pushed up the ideas we had yesterday |
| //is pointer a type just pass it through otherwise do give me the operator | ||
| //static error if not a function type | ||
|
|
||
| //template <typename Lambda> |
There was a problem hiding this comment.
looks like there might be some cruft here, we should delete before merging
| // - If T is a function pointer, use function_traits<T> directly | ||
| // - Otherwise, assume it is a callable object and use &T::operator() | ||
| template <typename T> | ||
| using lambda_traits = |
include/RAJA/policy/cuda/launch.hpp
Outdated
| for (int i = ::RAJA::internal::CudaDimHelper<DIM>::get(ctx.thread_id); | ||
| i < len; i += ::RAJA::internal::CudaDimHelper<DIM>::get(ctx.block_dim)) |
There was a problem hiding this comment.
I wanted to echo that I think this is a good idea
include/RAJA/policy/cuda/launch.hpp
Outdated
| for (int i = ::RAJA::internal::CudaDimHelper<DIM>::get(ctx.thread_id); | ||
| i < len; i += ::RAJA::internal::CudaDimHelper<DIM>::get(ctx.block_dim)) |
There was a problem hiding this comment.
Not sure if possible though because ctx is a function parameter
include/RAJA/policy/cuda/launch.hpp
Outdated
| ctx.shared_mem_ptr = raja_shmem_ptr; | ||
|
|
||
| RAJA::expt::invoke_body(reduce_params, body, ctx); | ||
| if constexpr (LaunchContextT<LaunchContextPolicy>::hasDim3) |
There was a problem hiding this comment.
I think we need to use your new type traits here as well? similar to the HIP backend
| template<named_dim DIM, typename Dim3Like> | ||
| RAJA_INLINE RAJA_DEVICE int get_dim(Dim3Like const& d) | ||
| { | ||
| if constexpr (DIM == named_dim::x) |
There was a problem hiding this comment.
@MrBurmark @johnbowen42 , is this the usage of if constexpr you all had in mind?
There was a problem hiding this comment.
No, I was thinking about doing if constexcpr (LaunchContextType::hasDim3) in the exec functions instead of adding new policies. Also there is already a function to get the named dims from a dim3.
There was a problem hiding this comment.
ohh.. that makes sense now
This PR is a collaboration space for exploring optimization within RAJA launch and the loop abstraction