Notes about the conversion

  • The memory conventions adopted for NPB-CPP improve the performance of the C++ code by reducing execution time and memory consumption. For some benchmarks, such as the BT pseudo-application, these conventions let NPB-CPP perform even better than the original Fortran NPB3.4.1.

    • Every global array is allocated with dynamic memory and as a single dimension.

    • In the kernels, a cast is made inside the functions so that the arrays can be accessed with multi-dimension indexing. For example, a function can receive an array such as matrix_aux[NY*NY] and access it as matrix_aux[j][i] instead of through a single-dimension index (this cast inside the functions follows the original NPB3.4.1 approach).

    • In the pseudo-applications, the cast is done directly in the array declarations. NPB3.4.1 does not use single-dimension arrays in the pseudo-applications, so casting at the declaration means the structure of the functions does not need to change.

    • To disable this convention (dynamic memory and a single dimension), set the flag -DDO_NOT_ALLOCATE_ARRAYS_WITH_DYNAMIC_MEMORY_AND_AS_SINGLE_DIMENSION at compilation.

    • We also keep the structure of every loop the same as in the original Fortran. However, because Fortran stores arrays in column-major order and C/C++ stores them in row-major order, we invert the dimensions of the arrays and, consequently, every array access. An access array[i][j][k] becomes array[k][j][i] (with the organization of the loops unchanged), and the dimensions array[NX][NY][NZ] become array[NZ][NY][NX].
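
As a rough illustration of the conventions above, the sketch below casts a single-dimension allocation inside a function so that it can be indexed as [k][j][i], with the dimensions inverted to [NZ][NY][NX]. The problem sizes and the names grid_fill and flat_offset are hypothetical and are not taken from NPB-CPP.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical problem sizes, chosen only for this sketch.
constexpr int NX = 4, NY = 3, NZ = 2;

// Offset of element [k][j][i] inside the single-dimension allocation
// when the dimensions are ordered [NZ][NY][NX] (NX varies fastest).
std::size_t flat_offset(int k, int j, int i) {
    return (static_cast<std::size_t>(k) * NY + j) * NX + i;
}

// Receives the single-dimension array and casts it inside the function,
// so that the body can use multi-dimension accesses.
void grid_fill(void* pointer_u) {
    double (*u)[NY][NX] = static_cast<double (*)[NY][NX]>(pointer_u);

    // The loop nest keeps the k, j, i order of a Fortran loop nest whose
    // access would read u(i,j,k); only the index order of the access changes.
    for (int k = 0; k < NZ; k++) {
        for (int j = 0; j < NY; j++) {
            for (int i = 0; i < NX; i++) {
                u[k][j][i] = 100.0 * k + 10.0 * j + i;
            }
        }
    }
}
```

A caller would allocate the array once, for example with malloc(sizeof(double) * NZ * NY * NX), and pass the pointer to grid_fill. The innermost loop index i then walks contiguous memory, which is the access pattern the dimension inversion is meant to preserve.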

  • The original NPB has one file to print the results of IS (c_print_results.c) and another to print the results of the other benchmarks (print_results.f), that is, one file for the C code and one for the Fortran code (IS is the only benchmark written in C). Because all of NPB-CPP is in C++, we merged these two files into a single file, c_print_results.cpp, which prints the results of all benchmarks.

  • Every goto in the benchmarks was replaced with equivalent code that uses loops and keeps the same logic.

  • There are some small differences in loop indexes and ranges, which are inherent to converting from Fortran to C/C++ (for example, Fortran arrays are indexed from 1 by default, while C/C++ arrays are indexed from 0).

  • The file common/npb-CPP.hpp defines additional items, such as the structure and operations for complex numbers.

  • FT kernel

    • Instead of converting the code directly from serial FT 3.4.1, we converted the Fortran code from the FT OpenMP version, whose format matches the serial FT versions before NPB 3.0.

    • None of the NPB versions with OpenMP are based on the serial code presented in versions 3.0 and 3.4.1.

    • In version 3.4 (the most recent at the time of writing), the NPB authors state that the sequential code will no longer be available. The sequential code will be the OpenMP code compiled without OpenMP.

    • In global.hpp, the constants FFTBLOCK_DEFAULT and FFTBLOCKPAD_DEFAULT have historically received values that change the cache behavior of the application, so performance can be better or worse on each processor depending on the values chosen. We define these constants with the value 1 (DEFAULT_BEHAVIOR), which gives a default behavior independent of the processor on which the application runs.

    • The size of the matrices in the original NPB is [NZ][NY][NX+1], but we changed it to [NZ][NY][NX] because the additional positions generated by NX+1 are not used by the application and only consume more memory.

    • In the original NPB, the auxiliary matrices y1, y2, and u have size NX. However, NX is the correct size only in the cffts1 routine; in cffts2 the correct size is NY, and in cffts3 it is NZ. This is a problem when NX is not the largest dimension. To fix it, we set the size of these matrices to MAXDIM, which is the size of the largest dimension. Consequently, MAXDIM is also passed as an argument to the function fft_init, which initializes the values of the matrix u.
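
The sizing problem described above can be sketched as follows. This is only an illustration under assumed sizes: the values of NX, NY, and NZ and the helper pass_along are hypothetical, and pass_along merely stands in for a one-dimensional pass such as the ones in cffts1, cffts2, and cffts3, without performing any FFT.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical grid sizes in which NX is NOT the largest dimension.
constexpr int NX = 64, NY = 128, NZ = 32;

// Size of the largest dimension, used to size the auxiliary arrays.
constexpr int MAXDIM = std::max(NX, std::max(NY, NZ));

// Stand-in for a 1-D pass along one dimension: it writes `length`
// values into the auxiliary buffer and returns how many were written.
// at() throws std::out_of_range if the buffer is too small.
int pass_along(int length, std::vector<double>& scratch) {
    for (int n = 0; n < length; n++) {
        scratch.at(n) = static_cast<double>(n);
    }
    return length;
}
```

With a buffer of MAXDIM elements, passes of length NX, NY, and NZ all fit. With a buffer of only NX elements, the pass of length NY would run past the end of the buffer, which is the situation the MAXDIM sizing avoids.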

  • IS

    • In the original NPB, IS includes functions such as randlc in its own source. Our version of NPB does not need this, because we already have these functions implemented. In the original NPB, these functions are needed in the IS code because IS is written in C, while these functions were written in Fortran for the other benchmarks.