Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more tests for clang transpiler #7

Open
wants to merge 1 commit into
base: clang_transpiler_integration
Choose a base branch
from

Conversation

thilinarmtb
Copy link

Description

I get the following error when the vectorDot() kernel is run.

terminate called after throwing an instance of 'occa::exception'
  what():  
---[ Error ]--------------------------------------------------------------------
    File     : /lustre/orion/fus166/scratch/thilina/Workspace/anl/occa-transpiler/occa/src/dtype/dtype.cpp
    Line     : 528
    Function : fromJson
    Message  : Unknown dtype builtin [unsigned int]
    Stack
      15 occa::error(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
      14 occa::dtype_t::fromJson(occa::json const&)
      13 occa::lang::argMetadata_t::fromJson(occa::json const&)
      12 occa::lang::kernelMetadata_t::fromJson(occa::json const&)
      11 occa::transpiler::makeMetadata(occa::lang::sourceMetadata_t&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
      10 /lustre/orion/fus166/scratch/thilina/Workspace/anl/occa-transpiler/occa/build/lib/libocca.so(+0x2de573)
       9 occa::transpiler::Transpiler::run(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, occa::json const&)
       8 occa::serial::v3::transpileFile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, occa::json const&, occa::lang::sourceMetadata_t&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
       7 occa::serial::device::buildKernel(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, occa::hash_t, occa::json const&, bool)
       6 occa::serial::device::buildKernel(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, occa::hash_t, occa::json const&)
       5 occa::device::buildKernel(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, occa::json const&) const
       4 ./examples_cpp_oklt_v3_moving_avg()
       3 ./examples_cpp_oklt_v3_moving_avg()
       2 /lib64/libc.so.6(__libc_start_main+0xef)
       1 ./examples_cpp_oklt_v3_moving_avg()
================================================================================

I think OCCA supports unsigned int (I may be wrong). I am assuming this is a
transpiler issue since the error is from the transpiler).

@YuraCobain
Copy link
Collaborator

@thilinarmtb according to the log error occurs in OCCA itself. The new transpiler is a separate library..
It's OCCA issue because of missing support of this data type.
Please refer this file for details

namespace dtype {

Also you could confirm it by using legacy bultin OCCA transpiler by removing transpiler version option from JSON properly.

@thilinarmtb
Copy link
Author

thilinarmtb commented Sep 24, 2024

@YuraCobain, with the following diff (to enable original OCCA), the original OCCA passes the test.

diff --git a/examples/cpp/31_oklt_v3_moving_avg/main.cpp b/examples/cpp/31_oklt_v3_moving_avg/main.cpp
index 538f85c6..b5e594bb 100644
--- a/examples/cpp/31_oklt_v3_moving_avg/main.cpp
+++ b/examples/cpp/31_oklt_v3_moving_avg/main.cpp
@@ -124,11 +124,11 @@ int main(int argc, const char **argv) {
   occa::device device(deviceOpts);
 
   occa::json buildProps({
-      {"transpiler-version", 3}
+  //    {"transpiler-version", 3}
   });
 
   int failure = 0;
-  failure |= runMovingAverageTest(device, buildProps);
+  // failure |= runMovingAverageTest(device, buildProps);
   failure |= runVectorDotTest(device, buildProps);
 
   return failure;
diff --git a/examples/cpp/31_oklt_v3_moving_avg/vectorDot.okl b/examples/cpp/31_oklt_v3_moving_avg/vectorDot.okl
index becf10a7..e1251641 100644
--- a/examples/cpp/31_oklt_v3_moving_avg/vectorDot.okl
+++ b/examples/cpp/31_oklt_v3_moving_avg/vectorDot.okl
@@ -1,4 +1,4 @@
-#include "constants.h"
+#define THREADS_PER_BLOCK 1024
 
 @kernel void vectorDot(double *temp, const unsigned int n, const double *a,
     const double *b) {

PS: For some reason, it was unable to find constants.h when I commented out the transpiler-version.

@YuraCobain
Copy link
Collaborator

I see, the root cause is probably due to legacy transpiler transform this data type in JSON for launch and kernel into something that is mapped in file I shared with you.
Then we need to add similar transformation for a new transpiler.
I will validate it tomorrow morning and let you know.

@IuriiKobein
Copy link
Collaborator

Hi @thilinarmtb

The root cause is found.
Actually there are 2 bugs in OCCA framework that a new transpiler helped to locate:

  1. Legacy OCCA transpiler performs narrowing conversions from "unsigned int" to "int" while generates JSON metadata file for launcher and kernel.
  2. OCCA misses builtin type mapping and JSON parsing for "unsigned int" aka "uint_". That leads to error you reported.

So there are two ways to fix it:

  1. Open issue in OCCA repo to fix both bugs:
    • incorrect type narrowing conversions
    • add mapping for uint_.
  2. Add mapping for uint_ in scope of this PR to pass your test but still open the issue in OCCA repo to fix narrowing conversions in legacy transpiler.

I prefer option 1 because both problems are coupled and related to OCCA core itself not new transpiler.
However option 2 helps us to merge this PR faster.

@thilinarmtb
Copy link
Author

@IuriiKobein : Thank you for checking it !

Seems like this issue is fixed for unsigned long in libocca #756 (ref. include/occa/dtype/builtins.hpp,
src/dtype/builtins.cpp, src/dtype/dtype.cpp)?

Maybe we can do the same for unsigned int in the same PR? @kris-rowe What do you think?

@YuraCobain
Copy link
Collaborator

Ok, I fix missing mapping on 'unsigned int'.
But for type narrowing conversion by legacy transpiler separate issue under OCCA repo should be opened.

@IuriiKobein
Copy link
Collaborator

@thilinarmtb the fix is pushed in scope of the PR.
Could you please test it and confirm that it works?

@YuraCobain
Copy link
Collaborator

Hi @thilinarmtb
Could you have a chance to test fix?

@thilinarmtb
Copy link
Author

@YuraCobain @IuriiKobein : I can run the test successfully with the latest fix.

@IuriiKobein
Copy link
Collaborator

@thilinarmtb greate news. Are we ready to merge the PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants