The error named in the title (a missing definition for ldexpf and a hipErrorTbd run-time error) was found when running a HeCBench benchmark (bn-cuda) compiled with the -O3 flag. The error appears even though the program does not call the ldexpf function, and chipStar does have an ldexpf definition that uses the OpenCL ldexp function.
The trace shows that zeModuleCreate returns ZE_RESULT_SUCCESS even though the build log contains an error (unresolved external symbol ldexpf). Because kernel creation then fails with ZE_RESULT_ERROR_INVALID_MODULE_UNLINKED, the kernel launch fails at run time.
After digging in, the error seems to come from the powf call in the program (please see the reproducer at the end of this issue for reference), and the __builtin_powf in one of the two powf definitions seems to be the cause:
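Since zeModuleCreate can report success while the build log still contains a link error, one way to catch this early is to inspect the log text itself. The sketch below assumes the log has already been copied into a string (for example via zeModuleBuildLogGetString on the build log handle returned by zeModuleCreate). The helper name buildLogHasUnresolvedSymbol is hypothetical and is not part of chipStar or Level Zero.

```cpp
#include <string>

// Hypothetical helper (not a chipStar or Level Zero API): returns true when
// the text of a module build log reports an unresolved external symbol.
// The log string is assumed to have been fetched by the caller beforehand.
bool buildLogHasUnresolvedSymbol(const std::string &log) {
  return log.find("unresolved external symbol") != std::string::npos;
}
```

A caller could treat a true result as a build failure even when the module creation call itself returned ZE_RESULT_SUCCESS, instead of waiting for the kernel launch to fail.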
// TODO This function is affected by the --use_fast_math compiler flag
return ::pow(x, y);
}
#endif
When the __builtin_powf definition was commented out and the OpenCL pow function was forced as the powf definition, the error disappeared. However, a reproducer that only calls powf shows no error, so __builtin_powf does not seem to be the only source of the error. Also, the program has to be compiled with an optimization flag for the error to show up (tested -O, -O1, -O2, and -O3, all of which produce the error).
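One possible explanation, which is an assumption and not something confirmed in this issue, is that the optimizer rewrites a power-of-two call such as powf(2.0, i) with an integer exponent into ldexpf(1.0, i), since the two are mathematically equivalent. That would explain why ldexpf shows up only with optimization enabled, although the program never calls it. The host-side sketch below only demonstrates the equivalence; the function name powMatchesLdexp is made up for illustration.

```cpp
#include <cmath>

// Illustration only: for an integer exponent i, 2^i computed with pow equals
// 1.0 scaled by 2^i with ldexp. An optimizer may use this identity to replace
// the pow call with an ldexp call.
bool powMatchesLdexp(int i) {
  float viaPow = std::pow(2.0f, static_cast<float>(i));
  float viaLdexp = std::ldexp(1.0f, i);
  return viaPow == viaLdexp;
}
```

If this rewrite is what happens here, the unresolved symbol would come from the compiler-generated ldexpf call not being mapped to the device library definition, which is consistent with the error disappearing when the OpenCL pow function is used directly.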
[Reproducer]
Clone and build chipStar
Create a reproducer.cu file and paste the following code:
#include <cstdio>  // printf

__global__ void kernel() {
  float lsinblock[10000] = { 0 };
  int t = 0;
  //int a = 0; // used for the following tests
  for (int i=0; i<10; i++) {
    t = (int)lsinblock[(int)powf(2.0, i)+t]; // error
    //powf(2.0, i); // works
    //a = (int)powf(2.0, i); // works
    //a = (int)lsinblock[(int)powf(2.0,i)+t]; // works
    //a = (int)lsinblock[(int)powf(2.0,i)+0]; // works
    //t = (int)lsinblock[(int)powf(2.0,i)+0]; // works
    //t = (int)lsinblock[(int)powf(2.0,i)+5]; // works
  }
}

int main(int argc, char** argv) {
  int N = 1<<20;
  kernel<<<(N+255)/256, 256, 256 * sizeof(float)>>>();
  printf("done\n");
}
Compile the code with nvcc -O3 reproducer.cu
Run the program with ./a.out
The error shown above should pop up.
[Notes on the reproducer]
The lines with // works were individually tested to run without errors.
jjennychen
changed the title
"Missing definition for ldexpf" and hipErrorTbd run-time errors
"Missing definition for ldexpf" and hipErrorTbd run-time errors in benchmarks involves __builtin_powf
Jun 30, 2024
jjennychen
changed the title
"Missing definition for ldexpf" and hipErrorTbd run-time errors in benchmarks involves __builtin_powf
"Missing definition for ldexpf" and hipErrorTbd run-time errors in benchmark involving __builtin_powf
Jun 30, 2024
The powf definition quoted above is from chipStar/include/hip/devicelib/single_precision/sp_math.hh, lines 439 to 449 in 4edbcb6.