Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/kokkos #188

Open
wants to merge 143 commits into
base: develop
Choose a base branch
from
Open

Feature/kokkos #188

wants to merge 143 commits into from

Conversation

ajpowelsnl
Copy link
Contributor

Summary

@davidbeckingsale and @rhornung67 have been our main points of contact and collaboration in this work

The proposed code implements RAJAPerf Suite (RPS) - based performance testing for Kokkos.

Main new code and changes include:

  • Bringing Kokkos in as a RPS (tpl) submodule
  • Kokkos implementations (with Kokkos_Lambda as a kernel variant) of the basic, stream, lcals , apps and algorithm kernel groups (polybench will be completed in FY22)
  • Changes to the build system / cmake
  • Significant changes to the infrastructure for registering new kernel groups and new kernels:

common/Executor.hpp: kernelID registerKernel(std::string, KernelBase*);
common/Executor.cpp: exec->registerKernel(groupName, kernel);
RAJAPerfSuiteDriver.cpp: //executor.registerKernel

Configure Kokkos build and runtime in RPS:

cmake
-DENABLE_KOKKOS=ON
-DENABLE_OPENMP=ON
-DCMAKE_BUILD_TYPE=Debug .. \

Please do not hesitate to contact @ajpowelsnl or @DavidPoliakoff for any questions and / or further discussion.

@ajpowelsnl
Copy link
Contributor Author

@ajpowelsnl did you branch from an older version of the persuite. I was interested in looking at the kokkos version of the MAT_MAT_SHARED kernel, but don't see it.

Hi @artv3 -- Thanks for the question. I started working on the project late last year (first commit was Nov. 16, 2020), so it probably is an "older" version of the Perf Suite. I did not work on anything called "MAT_MAT_SHARED." Instead, I provided Kokkos translations of RAJAPerf Suite kernels (that existed around that date). Does that make sense?

@ajpowelsnl
Copy link
Contributor Author

ajpowelsnl commented Oct 26, 2021

@davidbeckingsale , @DavidPoliakoff -- I'm also removing or suppressing this debugging output in this commit:

Running specified kernels and variants...
Value is 3.14159
Value is 3.14159
Value is 3.14159
Value is 3.14159
Value is 3.14159
Value is 3.14159
Value is 3.14159
FIX ME STREAM DOT -- GET DATA FROM VIEWS

@artv3
Copy link
Member

artv3 commented Oct 27, 2021

@ajpowelsnl did you branch from an older version of the persuite. I was interested in looking at the kokkos version of the MAT_MAT_SHARED kernel, but don't see it.

Hi @artv3 -- Thanks for the question. I started working on the project late last year (first commit was Nov. 16, 2020), so it probably is an "older" version of the Perf Suite. I did not work on anything called "MAT_MAT_SHARED." Instead, I provided Kokkos translations of RAJAPerf Suite kernels (that existed around that date). Does that make sense?

Yes, do you plan to add Kokkos kernels for newer kernels in RAJAPerf?

@ajpowelsnl
Copy link
Contributor Author

@ajpowelsnl did you branch from an older version of the persuite. I was interested in looking at the kokkos version of the MAT_MAT_SHARED kernel, but don't see it.

Hi @artv3 -- Thanks for the question. I started working on the project late last year (first commit was Nov. 16, 2020), so it probably is an "older" version of the Perf Suite. I did not work on anything called "MAT_MAT_SHARED." Instead, I provided Kokkos translations of RAJAPerf Suite kernels (that existed around that date). Does that make sense?

Yes, do you plan to add Kokkos kernels for newer kernels in RAJAPerf?

Morning @artv3 -- Thanks again for your question, and your interest. @DavidPoliakoff, @davidbeckingsale and I have briefly discussed providing Kokkos translations for the new kernels , but that area of work has not been discussed yet with our project leads / folks who control the purse strings. Do you have a particular interest in Kokkos versions of the new kernels? If so, then it might help me strengthen the argument for doing the Kokkos translations of the new kernels. Does it benefit both groups to understand performance behavior over time?

@artv3
Copy link
Member

artv3 commented Oct 27, 2021

@ajpowelsnl did you branch from an older version of the persuite. I was interested in looking at the kokkos version of the MAT_MAT_SHARED kernel, but don't see it.

Hi @artv3 -- Thanks for the question. I started working on the project late last year (first commit was Nov. 16, 2020), so it probably is an "older" version of the Perf Suite. I did not work on anything called "MAT_MAT_SHARED." Instead, I provided Kokkos translations of RAJAPerf Suite kernels (that existed around that date). Does that make sense?

Yes, do you plan to add Kokkos kernels for newer kernels in RAJAPerf?

Morning @artv3 -- Thanks again for your question, and your interest. @DavidPoliakoff, @davidbeckingsale and I have briefly discussed providing Kokkos translations for the new kernels , but that area of work has not been discussed yet with our project leads / folks who control the purse strings. Do you have a particular interest in Kokkos versions of the new kernels? If so, then it might help me strengthen the argument for doing the Kokkos translations of the new kernels. Does it benefit both groups to understand performance behavior over time?

Yes absolutely! Those kernels express hierarchical parallelism and for our applications we have found that by taking advantage of the memory hierachy on the GPU (using shared memory [CUDA/HIP] or device local memory [SYCL]) we can really improve kernel performance. Recognizing that I always wonder how do you expose these device feature features in a portable and friendly way across programing models and of course minimize abstraction layerover head. As new programming models come online how do we maintain these features?

CMakeLists.txt Outdated Show resolved Hide resolved
list(APPEND RAJA_PERFSUITE_DEPENDS hip)
endif()

set(RAJAPERF_BUILD_SYSTYPE $ENV{SYS_TYPE})
set(RAJAPERF_BUILD_HOST $ENV{HOSTNAME})

if (ENABLE_CUDA)
set(CMAKE_CUDA_STANDARD 11)
if (ENABLE_CUDA AND ENABLE_KOKKOS)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ENABLE_CUDA here is redundant (since we are already inside an if (ENABLE_CUDA) block.

CMakeLists.txt Outdated Show resolved Hide resolved
#set_target_properties(kokkoscore kokkoscontainers PROPERTIES LANGUAGE CUDA)
endif()
get_property(KOKKOS_INCLUDE_DIRS DIRECTORY tpl/kokkos PROPERTY INCLUDE_DIRECTORIES)
include_directories(${KOKKOS_INCLUDE_DIRS})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this required? I think that having kokkos in the RAJA_PERFSUITE_DEPENDS should be enough.

lcals-kokkos/INT_PREDICT-Kokkos.cpp
lcals-kokkos/PLANCKIAN-Kokkos.cpp
lcals-kokkos/TRIDIAG_ELIM-Kokkos.cpp
#polybench/POLYBENCH_2MM.cpp
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these all commented?

Copy link
Contributor Author

@ajpowelsnl ajpowelsnl Dec 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Polybench kernels are commented because we do not yet have Kokkos translations for them. I believe the situation calls for additional logic to deal with the relative immaturity of the Kokkos design. Does this make sense? It's a choice (commenting) David P and I made in the course of prototyping.

//rajaperf::RunParams params(argc, argv);
//executor.registerGroup("Sparse");

//executor.registerKernel("Sparse", rajaperf::make_kernel_base(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commented code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleted all debugging code.

@@ -20,184 +20,175 @@ namespace rajaperf
namespace apps
{

//
//COUPLE::COUPLE(const RunParams& params)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots more commented code?

add_subdirectory(apps)
add_subdirectory(apps-kokkos)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these should be guarded with ENABLE_KOKKOS

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these should be guarded with ENABLE_KOKKOS

Hi David, if I am understanding your suggestion, guarding all things Kokkos in this file will require significant reorganization of this file. Is that what you would like for me to do, i.e., cleanly separate RAJAPerf Suite content from Kokkos content?

add_subdirectory(common)

if(NOT INFRASTRUCTURE_ONLY)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does the INFRASTRUCTRUE_ONLY option come from?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does the INFRASTRUCTRUE_ONLY option come from?

Hi David -- Taking a step back to explain the "big picture", remember that we are using RAJAPerf Suite (RPS) in two distinct manners: 1) Provide Kokkos translations of RPS kernels that can be run along side native RPS kernels, and 2) Using RPS as the infrastructure driver for Kokkos Kernels and Kokkos performance tests that we have written. To be clear, this second case has nothing to do with RPS kernels & their Kokkos translations. This second case is tied to the "INFRASTRUCTURE_ONLY" option. Does that make sense?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The why makes sense, I meant that INFRASTRUCTURE_ONLY should be explicitly defined as a CMake option (https://cmake.org/cmake/help/latest/command/option.html) with a default value of False, that you can override when you are using the perfsuite as the infrastructure driver.

DEPENDS_ON ${RAJA_PERFSUITE_DEPENDS}
)

else()
if(NOT INFRASTRUCTURE_ONLY)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks unnecessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi David -- Please see my comment above on this point (i.e., there are two distinct scenarios in which we're using RAJAPerf Suite). Would it be possible for us to meet, and work through at least some / most of your suggested changes? I just want to make sure I'm doing what you want me to do for the merge.

@@ -158,15 +165,18 @@ void Executor::setupSuite()
run_params.setInvalidExcludeFeatureInput(invalid);

//
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: @rhornung67 , @davidbeckingsale , there is functionality we have not yet implemented in Kokkos, but that is important for RAJAPerf Suite. Starting at line 167, and ending at ~ line 342 in Executor.cpp, we have commented this functionality. Rich, David B. suggested that I work with you on how to handle integrating this (currently) commented functionality with Kokkos design (i.e., dynamic addition of new kernels and kernel groups).

str << std::endl;
} // loop over features
str.flush();
}

// TODO for Kokkos Team: Commenting function body, because this infrastructure
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: @rhornung67 , @davidbeckingsale - commented lines here, 654-674, also represent important functionality for RAJAPerf Suite that has not yet been implemented for Kokkos. Would you guys please help me work through integrating these bits?

@crtrott
Copy link
Contributor

crtrott commented Jan 11, 2022

Hey guys, I didn't yet read all the comments but I think we should try and split the infrastructure changes from the Kokkos kernel variant changes.

So essentially something like this:
PR1: Infrastructure changes (how to add Kernel groups etc. + let perf suite serve as a perf testing infrastructure in other projects [i.e. Kokkos Core/Kokkos Kernels])
PR2: Add optional Kokkos dependency with all the kernels variants.

@davidbeckingsale what do you think about this?

Note that PR2 are not really dependent on PR1 so we could also do those two first and then tackle the infrastructure change. We could also split PR2 into doing kernel groups individually I guess?

@davidbeckingsale
Copy link
Member

@davidbeckingsale what do you think about this?

I agree that splitting the changes up into two distinct pieces would be best, and I suggested that approach before. I think the two main sticking points on this PR right now are 1) ensuring that the infrastructure changes maintain all the current capability, and 2) deciding how to handle cases where a variant is "missing".

@ajpowelsnl
Copy link
Contributor Author

@davidbeckingsale what do you think about this?

I agree that splitting the changes up into two distinct pieces would be best, and I suggested that approach before. I think the two main sticking points on this PR right now are 1) ensuring that the infrastructure changes maintain all the current capability, and 2) deciding how to handle cases where a variant is "missing".

Hi David & Christian -- Many thanks for responding. As for sticking point 2, in the most recent push, I did stub in Kokkos variants for the polybench kernels, somewhat alleviating that issue. But we are still left with sticking point 1. I will work with Christian on this, and see if we can come up with a satisfactory solution that preserves current infrastructure capability.

@ajpowelsnl ajpowelsnl mentioned this pull request Jan 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants