How to decompile autovectorized binaries? #6045
Replies: 2 comments
-
Here's a better example of the problem. It occurs more often than other autovectorizations, and with more variability.
typedef struct { char c[16]; } c16;
/* copy a fixed 128 bits of memory when we don't know the alignment of source or destination */
void cpymem_3(c16 *a, c16 *b)
{
    *a = *b;
}
Compile this with gcc-14 and -O2 on a riscv64 toolchain, or possibly on other toolchains where word and doubleword operations can trigger an alignment exception. Ghidra 11.0 can't disassemble or decompile this simple structure copy. The gcc compiler's processing makes this common pattern harder to recognize.
The desirable Ghidra decompilation for the function would be something like:
void function_xxx(void *a, void *b)
{
    __builtin_memcpy(a, b, 16);
}
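To back up the claim that a 16-byte memcpy is the right rendering, here is a minimal sketch showing that the structure assignment and the memcpy form produce identical bytes, even when both pointers are deliberately misaligned. The names cpymem_ref and copies_match are hypothetical helpers added for illustration. They are not part of the example above.

```c
#include <string.h>

typedef struct { char c[16]; } c16;

/* The structure copy from the example above. */
void cpymem_3(c16 *a, c16 *b)
{
    *a = *b;
}

/* Hypothetical reference: the form we would like the decompiler to show. */
void cpymem_ref(void *a, const void *b)
{
    memcpy(a, b, 16);
}

/* Returns 1 if both copies leave identical bytes in their destinations. */
int copies_match(void)
{
    char src[32], dst1[32], dst2[32];

    for (int i = 0; i < 32; i++) {
        src[i] = (char)(i * 7 + 3);   /* arbitrary recognizable pattern */
        dst1[i] = 0;
        dst2[i] = 0;
    }

    /* Offset by one byte so neither pointer is word or doubleword aligned.
       c16 contains only chars, so it has no alignment requirement. */
    cpymem_3((c16 *)(dst1 + 1), (c16 *)(src + 1));
    cpymem_ref(dst2 + 1, src + 1);

    /* Compare the whole buffers, so stray writes outside the 16 bytes
       would also be caught. */
    return memcmp(dst1, dst2, sizeof dst1) == 0;
}
```

Since the two forms are observably the same, nothing is lost if the decompiler collapses the vector load and store pair into a single fixed-size copy.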
-
One way to approach this problem is to borrow from our AI friends and generate a training set of source code and autovectorized binaries, which can be used in recognizing vector instruction sequences. The GCC RISCV autovec compiler test suite provides over a thousand source code examples, which can be easily cross-compiled for a number of different machine architectures. Perhaps the existing Ghidra BSim capabilities can be applied here.
There are a lot of ways compilers can apply vector and other instruction set extensions to optimize code. These will vary with compiler releases, and especially with the performance quirks of specific evolving RISCV microarchitectures. Control code that has nothing to do with data vectors gets optimized just as often as vector math code, disrupting manual Ghidra analysis. Short loops over arrays of structures can be especially hard to understand once vector instructions and huge vector registers are available to the compiler.
That argues for working up Ghidra models and compiler mockups together, tuning the compiler configurations to generate assembly code that aligns with critical sections of the binary being reviewed. Other approaches look more complicated, like teaching SLEIGH how to generate pcode based on run-time values of vector context registers or teaching the Ghidra decompiler how to recognize the 10K+ riscv vector intrinsic C functions.
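As a concrete instance of the "short loops over arrays of structures" case, here is a hypothetical training-set input. The names point, scale_points, and scale_points_check are illustrative only and are not taken from the GCC test suite. Whether a given compiler actually vectorizes this loop, and with which instructions, depends on the release and the target flags, so treat it as a sketch of the pattern rather than a guaranteed reproducer.

```c
#include <stddef.h>

/* Two interleaved fields: a vectorizing compiler has to de-interleave
   x and y (or use strided/segmented accesses) to process them in bulk. */
typedef struct { int x; int y; } point;

/* Apply a different operation to each field of every element. */
void scale_points(point *p, size_t n, int k)
{
    for (size_t i = 0; i < n; i++) {
        p[i].x *= k;
        p[i].y += k;
    }
}

/* Returns 1 if the loop produced the expected values on a small input. */
int scale_points_check(void)
{
    point pts[5] = { {1, 2}, {3, 4}, {5, 6}, {7, 8}, {9, 10} };

    scale_points(pts, 5, 3);

    for (int i = 0; i < 5; i++) {
        /* Element i started as {2i+1, 2i+2}. */
        if (pts[i].x != (2 * i + 1) * 3 || pts[i].y != (2 * i + 2) + 3)
            return 0;
    }
    return 1;
}
```

Compiling the same file at -O3 for several targets, for example with -march=rv64gcv on a riscv64 toolchain and -march=x86-64-v4 on an x86_64 one, gives matched source and binary pairs of the kind described above.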
-
Has any thought gone into helping the Ghidra decompiler make sense of code autovectorized by the compiler?
For example, compile and build this C file with gcc-13 or gcc-14, -O3, and a machine architecture flag indicating vector instructions are supported:
For the x86_64 platform the decompiler results for a simple loop are very hard to interpret. For the RISCV-64 platform and gcc-14, the decompiler bails out completely. This gets worse when calls to memcpy or strlen are inlined by the compiler and autovectorized, as they can be in gcc-14 toolchains.
Example:
Compile the following under gcc-14 with an x86_64 toolchain and -march=sapphirerapids or -march=x86-64-v4, then pass the binary to Ghidra.
gcc-14 will replace memcpy with inlined vector instructions optimized for sapphirerapids Intel processors, which are apparently not recognized by Ghidra 11.
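A minimal sketch of the kind of input that can trigger this, not necessarily the exact file used above. The names copy_block and copy_block_check and the 64-byte size are assumptions chosen for illustration. A constant-size memcpy like this is one the compiler is free to expand inline instead of calling the library routine.

```c
#include <string.h>

/* Assumed size: 64 bytes, i.e. the width of one 512-bit vector register. */
enum { BLOCK_SIZE = 64 };

/* Fixed-size copy. With optimization and a vector-capable -march, the
   compiler may replace the memcpy call with inline vector loads and stores. */
void copy_block(void *dst, const void *src)
{
    memcpy(dst, src, BLOCK_SIZE);
}

/* Returns 1 if the copy reproduced the source bytes exactly. */
int copy_block_check(void)
{
    unsigned char src[BLOCK_SIZE], dst[BLOCK_SIZE];

    for (int i = 0; i < BLOCK_SIZE; i++) {
        src[i] = (unsigned char)i;
        dst[i] = 0;
    }

    copy_block(dst, src);

    return memcmp(dst, src, BLOCK_SIZE) == 0;
}
```

Inspecting copy_block in the compiler's assembly output (gcc -S) before loading the binary into Ghidra shows which instructions the decompiler would have to recognize for that particular toolchain.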