Axel Huebl edited this page Aug 9, 2018 · 36 revisions


Build Type

Change your build type from Release to Debug: pic-configure -c "-DCMAKE_BUILD_TYPE=Debug" [...]

Add Additional Log Information

You can get more verbose output by passing -DPIC_VERBOSE=<N> and -DPMACC_VERBOSE=<M> to your cmake options at compile time (or by running ccmake . after pic-configure [...]).

To activate multiple levels, add up their values and pass the sum.

Example:

# PHYSICS (1) + CRITICAL(4) + SIMULATION_STATE(16)
pic-configure -c "-DCMAKE_BUILD_TYPE=Debug -DPIC_VERBOSE=21" ../paramSets/lwfa

PIConGPU Log Levels

From src/picongpu/include/debug/PIConGPUVerbose.hpp:

DEFINE_LOGLVL(0,NOTHING);
DEFINE_LOGLVL(1,PHYSICS);
DEFINE_LOGLVL(2,DOMAINS);
DEFINE_LOGLVL(4,CRITICAL);
DEFINE_LOGLVL(8,MEMORY);
DEFINE_LOGLVL(16,SIMULATION_STATE);
DEFINE_LOGLVL(32,INPUT_OUTPUT);
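
Because every level is a distinct power of two, a summed value such as 21 can be split back into its levels with a bitwise AND. The following is a minimal sketch of that arithmetic only. The namespace verboseSketch, the enum LogLvl and the function isActive are hypothetical names for illustration; the real definitions are produced by DEFINE_LOGLVL and may look different.

```cpp
#include <cstdint>

namespace verboseSketch
{
    // Hypothetical mirror of the values listed above (not the real PIConGPU types).
    enum LogLvl : std::uint32_t
    {
        NOTHING = 0u,
        PHYSICS = 1u,
        DOMAINS = 2u,
        CRITICAL = 4u,
        MEMORY = 8u,
        SIMULATION_STATE = 16u,
        INPUT_OUTPUT = 32u
    };

    // True if lvl is contained in the summed value given as -DPIC_VERBOSE=<N>.
    constexpr bool isActive(std::uint32_t verbose, LogLvl lvl)
    {
        return (verbose & lvl) != 0u;
    }

    // PHYSICS (1) + CRITICAL (4) + SIMULATION_STATE (16), as in the example above.
    constexpr std::uint32_t exampleVerbose = PHYSICS + CRITICAL + SIMULATION_STATE;

    static_assert(exampleVerbose == 21u, "1 + 4 + 16 must be 21");
    static_assert(isActive(exampleVerbose, CRITICAL), "CRITICAL is part of 21");
    static_assert(!isActive(exampleVerbose, MEMORY), "MEMORY is not part of 21");
}
```

The checks run at compile time, so the snippet can be pasted into any C++11 translation unit to verify a planned -DPIC_VERBOSE value.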

libPMacc Log Levels

From src/libPMacc/include/debug/PMaccVerbose.hpp:

DEFINE_LOGLVL(0,NOTHING);
DEFINE_LOGLVL(1,MEMORY);
DEFINE_LOGLVL(2,INFO);
DEFINE_LOGLVL(4,CRITICAL);
DEFINE_LOGLVL(8,MPI);
DEFINE_LOGLVL(16,CUDA_RT);
DEFINE_LOGLVL(32,COMMUNICATION);
DEFINE_LOGLVL(64,EVENT);
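
When reading someone else's configure line, it can help to go the other way and list the level names hidden in a -DPMACC_VERBOSE=<M> value. The sketch below does that for the libPMacc table above. The namespace pmaccSketch and the functions activeLevels and selfCheck are hypothetical helpers and are not part of libPMacc.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

namespace pmaccSketch
{
    struct Level
    {
        std::uint32_t value;
        const char* name;
    };

    // Returns the names of all levels contained in the summed value.
    // NOTHING (0) carries no bit and is therefore not listed.
    inline std::vector<std::string> activeLevels(std::uint32_t verbose)
    {
        // Values copied from the libPMacc listing above.
        static const Level levels[] = {
            {1u, "MEMORY"},
            {2u, "INFO"},
            {4u, "CRITICAL"},
            {8u, "MPI"},
            {16u, "CUDA_RT"},
            {32u, "COMMUNICATION"},
            {64u, "EVENT"}};

        std::vector<std::string> names;
        for(const Level& lvl : levels)
            if((verbose & lvl.value) != 0u)
                names.push_back(lvl.name);
        return names;
    }

    // Self-check: MPI (8) + COMMUNICATION (32) = 40.
    inline void selfCheck()
    {
        const std::vector<std::string> expected = {"MPI", "COMMUNICATION"};
        assert(activeLevels(8u + 32u) == expected);
    }
}
```

Calling pmaccSketch::selfCheck() from a small main() confirms the decoding for the value 40.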

Show types during compile time

A very useful tool to find out the resolved type of an object that produces a compile time error is: PMACC_CASSERT_MSG_TYPE(pmacc_msg,pmacc_typeInfo,...), defined in libPMacc/include/static_assert.hpp
Therein:

  • pmacc_msg can be a self-defined message but must be a valid C++ class name
  • pmacc_typeInfo is the type to be resolved
  • ... must be a condition that evaluates to true or false

The static assert fails if the condition is false and then reports the message together with the resolved type.

Example:
PMACC_CASSERT_MSG_TYPE(
                        This_is_the_resolved_type_of_MyObjectType,
                        MyObjectType,
                        1==2
                      );
MyObjectType myObject; // this declaration crashed for you previously
myObject(arg1, arg2, ...);
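
To illustrate the general idea outside of PMacc, the following minimal sketch uses a common plain C++ idiom: a template that is declared but never defined, so that using it forces the compiler to name the resolved type in its error message. This is not how PMACC_CASSERT_MSG_TYPE is implemented. The names typeSketch, ShowType and SKETCH_SHOW_TYPE are hypothetical, and MyObjectType is only a stand-in alias here.

```cpp
#include <type_traits>
#include <vector>

namespace typeSketch
{
    // Declared but intentionally never defined: creating an object of
    // ShowType<T> is an error, and the diagnostic spells out T.
    template<typename T_Type>
    struct ShowType;

    // Stand-in for a type whose resolved form is not obvious at first glance.
    using MyObjectType = std::vector<int>::value_type;

#ifdef SKETCH_SHOW_TYPE
    // Compile with -DSKETCH_SHOW_TYPE to provoke the error on purpose;
    // the message mentions ShowType<int>, which reveals the resolved type.
    ShowType<MyObjectType> reveal;
#endif

    // Once the type is known, it can be pinned down with a regular check.
    static_assert(std::is_same<MyObjectType, int>::value, "MyObjectType resolves to int");
}
```

Without the SKETCH_SHOW_TYPE define the snippet compiles cleanly, so the probe can stay in the code while you debug.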

Add Debug Flags to the Code

The following tools will profit from additional information in your compiled binaries, such as code lines.

Consider activating at least the following cmake flags with ccmake . after the configure step:

  • -DCUDA_SHOW_CODELINES=ON: makes source code lines known to cuda-gdb and embeds them in the PTX code (if kept via -DCUDA_KEEP_FILES=ON)
  • -DPMACC_BLOCKING_KERNEL=ON: kernels no longer run in parallel, so cudaGetLastError() reports an error right at the kernel that crashed
  • -DCUPLA_STREAM_ASYNC_ENABLE=OFF: disable asynchronous streams (requires PIConGPU 0.4.0+)
  • -DCUDA_NVCC_FLAGS_DEBUG="-g;-G": adds full in-device symbols, very long compile time, heavy RT overhead

For more information on these flags or on how to use ccmake, see our documentation on available cmake flags.

A warning on debug flags: using -g/-G usually implies no code optimization (-O0). That can change the runtime behavior of your code and make race conditions hard to track down.


Parallel Debugging

This page collects some useful hints about how to debug a hybrid (host + device) parallel (MPI) application.

MPI + Valgrind

Use the OpenMPI suppressions list:

mpiexec <mpi flags> valgrind --suppressions=$MPI_ROOT/share/openmpi/openmpi-valgrind.supp picongpu ...

See also: Valgrind Manual, section 4.9 - Debugging MPI Parallel Programs with Valgrind

MPI + GDB

Multi-Node Host-Side

Log in to an interactive shell/batch session with X-forwarding (ssh -X). Launch PIConGPU with gdb and trigger the start and a backtrace automatically:

mpiexec <mpi flags> xterm -e gdb -ex r -ex bt --args picongpu ...

MPI + CUDA-MEMCHECK

mpiexec <mpi flags> cuda-memcheck --tool <memcheck|racecheck> picongpu ...

CUDA-GDB

Single-Node device-side

See the CUDA-GDB manual.

(!) Compile with nvcc -g -G <...> if you want to set device-side breakpoints.

cd <path>/simOutput
cuda-gdb --args <path2picongpu> -d 1 1 1 -g <...> -s 100 <...> 

in cuda-gdb:

# set breakpoints before running the code (if the code line was not optimized out)
b <FileName>:<LineNumber>
# run program
r
# alternatively: start next step
# print a variable when code stopped or crashed
print <var>
# backtrace: where is the current line of code in the program
bt
# surrounding code lines
list