-
Notifications
You must be signed in to change notification settings - Fork 110
Add initial unit tests for jit gpu kernels #4278
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds initial unit tests for JIT GPU kernels, providing a new test framework specifically designed for testing GPU kernel code. The main purpose is to improve test coverage for GPU kernel functionality, which previously had no unit testing infrastructure.
- Creates a new GPU kernel testing framework with test driver and helper utilities
- Adds comprehensive unit tests for array and algorithm functionality in GPU kernels
- Implements a GPU-specific test harness that can compile and execute kernel tests on GPU
Reviewed Changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 6 comments.
Show a summary per file
File | Description |
---|---|
test/include/test.hpp | Adds callback mechanism for test case selection and refactors test execution |
test/gpu/kernels/main.cpp | Main test driver for GPU kernel tests with compilation and execution logic |
test/gpu/kernels/basic.cpp | Basic sanity tests for the GPU testing framework |
test/gpu/kernels/array.cpp | Comprehensive tests for GPU array functionality |
test/gpu/kernels/algorithm.cpp | Extensive tests for GPU algorithm implementations |
test/gpu/kernels/CMakeLists.txt | Build configuration for GPU kernel tests |
test/CMakeLists.txt | Integration of GPU kernel tests into main test suite |
src/targets/gpu/kernels/include/migraphx/kernels/test.hpp | GPU-specific testing framework and macros |
Multiple other files | Supporting changes for compilation, kernel management, and utilities |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
template <class... Ts, | ||
MIGRAPHX_REQUIRES(sizeof...(Ts) == N and (is_convertible<Ts, T>{} and ...))> | ||
constexpr array(Ts... xs) : d{xs...} | ||
constexpr array(Ts... xs) : d{static_cast<value_type>(xs)...} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The explicit cast to value_type
in the variadic constructor could cause silent truncation or precision loss. Consider using a concept or SFINAE to ensure safe conversions, or document the potential for data loss in conversions.
Copilot uses AI. Check for mistakes.
{ | ||
auto failures = migraphx::gpu::write_to_gpu(int32_t{0}, true); | ||
compile(); | ||
k.launch(nullptr, options.global, options.local)(test_cases.at(case_name), failures.get()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using test_cases.at(case_name)
could throw an exception if the case name is not found. Since this is in a test context, consider using find()
with proper error handling or add a check to ensure the case exists before calling at()
.
k.launch(nullptr, options.global, options.local)(test_cases.at(case_name), failures.get()); | |
auto it = test_cases.find(case_name); | |
if(it == test_cases.end()) | |
{ | |
std::cerr << "Test case not found: " << case_name << std::endl; | |
CHECK(false); // Fail the test | |
return; | |
} | |
k.launch(nullptr, options.global, options.local)(it->second, failures.get()); |
Copilot uses AI. Check for mistakes.
if(suites.count(name) == 0) | ||
continue; | ||
run_suites.insert(suites.at(name)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code performs two lookups in the suites
map: first with count()
and then with at()
. Consider using find()
once and checking the iterator, or use suites.find(name)
to avoid the double lookup.
if(suites.count(name) == 0) | |
continue; | |
run_suites.insert(suites.at(name)); | |
auto it = suites.find(name); | |
if(it == suites.end()) | |
continue; | |
run_suites.insert(it->second); |
Copilot uses AI. Check for mistakes.
auto it = content.cbegin(); | ||
while(std::regex_search(it, content.cend(), m, case_re)) | ||
{ | ||
test_names.push_back(m[1].str()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Creating a temporary string with m[1].str()
and then copying it into the vector could be inefficient. Consider using emplace_back(m[1].str())
or directly constructing the string in place if possible.
test_names.push_back(m[1].str()); | |
test_names.emplace_back(m[1].str()); |
Copilot uses AI. Check for mistakes.
friend constexpr Stream& operator<<(Stream& s, const lhs_expression& self) | ||
{ | ||
const char* op = Operator::as_string(); | ||
if(op != nullptr or *op != '\0') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition op != nullptr or *op != '\0'
has a logic error. If op
is nullptr
, dereferencing it with *op
will cause undefined behavior. The condition should use and
instead of or
: if(op != nullptr and *op != '\0')
.
if(op != nullptr or *op != '\0') | |
if(op != nullptr && *op != '\0') |
Copilot uses AI. Check for mistakes.
file(GLOB KERNELS_TESTS CONFIGURE_DEPENDS *.cpp) | ||
list(REMOVE_ITEM KERNELS_TESTS ${CMAKE_CURRENT_SOURCE_DIR}/main.cpp) | ||
|
||
message("KERNELS_TESTS: ${KERNELS_TESTS}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debug message statements should typically be removed or made conditional before merging to production. Consider removing this message or wrapping it in a conditional check for debug builds.
file(GLOB KERNELS_TESTS CONFIGURE_DEPENDS *.cpp) | |
list(REMOVE_ITEM KERNELS_TESTS ${CMAKE_CURRENT_SOURCE_DIR}/main.cpp) | |
message("KERNELS_TESTS: ${KERNELS_TESTS}") | |
option(MIGRAPHX_VERBOSE "Enable verbose output for kernel tests" OFF) | |
file(GLOB KERNELS_TESTS CONFIGURE_DEPENDS *.cpp) | |
list(REMOVE_ITEM KERNELS_TESTS ${CMAKE_CURRENT_SOURCE_DIR}/main.cpp) | |
if(MIGRAPHX_VERBOSE) | |
message("KERNELS_TESTS: ${KERNELS_TESTS}") | |
endif() |
Copilot uses AI. Check for mistakes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. How would we test larger kernels like TopK as a unit test?
{ | ||
// There is no way to easily exit with no error. We can terminate the | ||
// current wavefront without an error, but if there is more wavefronts | ||
// than we need to fallback to a trap which throws an error in HSA |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// than we need to fallback to a trap which throws an error in HSA | |
// then we need to fallback to a trap which throws an error in HSA |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM to me too. Copilot begin a tad pedantic here but overall lets get this in and ensure we have test coverage
Motivation
Add initial unit tests for jit gpu kernels as we dont have any unit tests. This should cover the tests for #4194.
Technical Details
This adds a test driver to run unit tests. Currently it only works with one lane or all lanes being uniform, but we could extend it in the future to check across all lanes within a wavefront and show which lanes failed. I didnt implement this yet because its not needed right now.
Changelog Category
Improve test coverage(there isnt a category for this below).