-
Function signatures are very useful when analyzing programs, especially ones that link against libraries. When the target is a statically linked ELF binary, a considerable portion of its code consists of library functions. For instance, if a program statically links glibc, I have to create function signatures for that glibc build manually. However, the signature tooling seems to support generating and importing signatures from only a single file. If I generate signatures from just the glibc .so file, then even when non-exported functions in glibc are matched successfully, their names are meaningless, because the .so file only contains names for exported functions.

So, is there a way to generate function signatures directly from a static library, that is, libc.a? If I import it in Batch mode, libc.a is split into multiple .o files for analysis and signature generation. This is not only cumbersome but also makes it impossible to map the results back to the static library itself.

I have found that WARP does support analyzing libc.a directly. However, as far as I know, WARP's matching is rather strict. Often we do not have the original static library that the program was linked against, so we can only look for similar substitutes, and those may produce different WARP GUIDs because of external factors such as the compiler.
-
You are referring to SigKit, correct? I assume so, since that is the only other provider of "signature" information in Binary Ninja besides WARP. We do not plan to continue supporting SigKit and instead want to focus on developing WARP, which does support .a files directly, as you said.

You mention that WARP is too strict when matching, but I think it's important to have some reference points when talking about matching:

1. Do you have a specific signature function that you are looking at with a different GUID than the target function? If so, can you enable the WARP render layer? This will show masked / blacklisted instructions.
2. What other tools have you used that have the behavior you want? For example, given the function you identified in 1, what tool are you currently using that matches it? Or is there currently no tool that does what you want? That is not to say that your issue isn't valid; I just want a better frame of reference for when you expect WARP to work.

Function matching tools fall into three categories:
We do not intend to offer partial matching, as it makes managing a dataset tedious. However, we do plan to offer some fuzzy matching capabilities in the future, which would make it possible to match functions with differing function GUIDs. If you are familiar with any of the tools mentioned above (or any other!), please mention them so we can better understand your expectations for WARP.
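To make the "strict matching" point concrete, here is a toy sketch of exact, GUID-style matching. It is not WARP's actual algorithm: the namespace, the `function_guid` helper, and the byte sequences are all made up for illustration. It only shows the general idea that masking relocatable instructions tolerates address changes, while any other byte difference (for example, from a different compiler) yields a different ID.

```python
import uuid

# Hypothetical namespace for this sketch only; not WARP's real namespace.
SKETCH_NAMESPACE = uuid.UUID("00000000-0000-0000-0000-000000000000")


def function_guid(instructions):
    """Derive a deterministic ID from a function's instruction bytes.

    `instructions` is a list of (raw_bytes, is_relocatable) tuples.
    Relocatable instructions are zeroed out so that addresses fixed up
    at link time do not affect the resulting ID.
    """
    masked = b"".join(
        b"\x00" * len(raw) if relocatable else raw
        for raw, relocatable in instructions
    )
    return uuid.uuid5(SKETCH_NAMESPACE, masked.hex())


# Made-up byte sequences standing in for three builds of one function.
library_build = [
    (b"\x55", False),
    (b"\x48\x89\xe5", False),
    (b"\xe8\x10\x00\x00\x00", True),   # call with a link-time displacement
    (b"\xc3", False),
]
same_compiler_relinked = [
    (b"\x55", False),
    (b"\x48\x89\xe5", False),
    (b"\xe8\x44\x33\x22\x11", True),   # only the masked displacement differs
    (b"\xc3", False),
]
other_compiler = [
    (b"\x55", False),
    (b"\x48\x8b\xec", False),          # different encoding of the same move
    (b"\xe8\x10\x00\x00\x00", True),
    (b"\xc3", False),
]

print(function_guid(library_build) == function_guid(same_compiler_relinked))
print(function_guid(library_build) == function_guid(other_compiler))
```

In this sketch the relinked build still matches because its only difference is masked, while the build with a different instruction encoding does not, which is the situation described above when only a similar static library is available.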
-
I think BSIM is quite useful in some cases. As is well known, it is based on function behavior. Perhaps WARP could start from function behavior as well, which would bring two benefits: cross-platform matching, and a relatively high match rate with user intervention.

Although BSIM requires us to click through matches manually one by one, in my experience the match with the highest BSIM score is indeed very accurate. Moreover, this is basically a feasible solution for C++ static libraries; in my experience, techniques like FLIRT are far inferior to BSIM for C++ static library matching.

When reverse engineering, we often do not have the target static library, because it depends on factors such as the GCC version, the build environment, and the library version. Matching based on instruction features therefore has a fairly low ceiling. I think BSIM has solved this problem quite well.

What I want to say is: if WARP is going to relax its matching restrictions slightly in the future, could it also match on the "semantics" or "behavior" of HLIL? That way, exact matching covers the case where the target static library is available, while semantic or behavioral matching provides looser but still accurate matching. As far as I know, matching in other commercial decompilers is currently still limited to characteristics of instruction sequences.
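As a rough illustration of what I mean by behavior-based matching, here is a minimal sketch that scores candidates by cosine similarity over feature counts and ranks them for the user to review. This is only my assumption of the general shape of such a scheme, not BSIM's or WARP's implementation; the feature names, function names, and counts are invented for the example.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse feature-count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_candidates(target: Counter, library: dict) -> list:
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [
        (name, cosine_similarity(target, features))
        for name, features in library.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Invented behavioral features (e.g. counts of IL operation kinds).
target = Counter({"load": 4, "store": 2, "cmp_branch": 3, "call": 1})
library = {
    "candidate_a": Counter({"load": 4, "store": 4, "cmp_branch": 2}),
    "candidate_b": Counter({"load": 4, "store": 1, "cmp_branch": 3, "call": 1}),
    "candidate_c": Counter({"call": 3, "load": 1}),
}

ranked = rank_candidates(target, library)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Unlike an exact GUID comparison, every candidate gets a score, so a build from a slightly different compiler or library version can still appear near the top of the list and be confirmed by hand.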
WARP's approach is to build a common format for this sort of thing. We do intend to add a fuzzy matching system as you describe. However, note that WARP was intended to replace SigKit, so it needs to run the entire available dataset against all functions automatically, which limits the amount of time we can spend on matching. Fuzzy matching will likely require user interaction (at least with the default settings), like the network component of WARP.
You will want to follow issue #6105 for updates; it is definitely high on our to-do list, as indicated by the number of 👍 reactions and the user feedback.