Background
As SAILR continues to gain popularity, more decompiler devs have been asking which deoptimizations are the least aggressive. Indeed, some deoptimizations are more aggressive than others, and some still have room to be ironed out. In this issue, I order the deoptimizations we know of from least to most aggressive, link to their implementations, and point to ways others can verify them.
Below is each deoptimization, starting with the least aggressive and moving towards the more aggressive ones. You can test that you've done a deoptimization correctly by consulting the binaries and their source shown in #7 (we have examples for each below).
If you use this documentation in any way, please cite us in your code or paper :).
1. Switch Lowering & Clustering
Background
Switch Lowering & Clustering is by far the most traditional and least aggressive deoptimization we have implemented in SAILR. Don't let that fool you, though: switch case reconstruction is a big improvement to the readability of decompilation. This deoptimization is described in Section 5.3 of the paper and deals with fixing switches.
The gist of this deoptimization is that you do a fine-grained pattern match on assignments with a specific schema and delete some unneeded if-statements from the final output. Additionally, you merge switches that can be put back together (reverting the clustering).
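To make the schema concrete, below is a minimal, hypothetical sketch (plain Python, not our actual angr pass) of recognizing the if-else chain a lowered switch leaves behind: a run of equality tests against one variable, each guarding a case body:

```python
# Hypothetical AST shape: ("if", ("eq", var, const), then_blk, else_blk).
# A lowered switch appears as a chain of such ifs all testing the same var.

def collect_lowered_switch(node, cases=None, var=None):
    """Walk an if-else chain; return (var, {const: body}) if it looks
    like a lowered switch, else None."""
    cases = {} if cases is None else cases
    if not (isinstance(node, tuple) and node[0] == "if"):
        # End of the chain: whatever remains becomes the default case.
        if cases:
            cases["default"] = node
            return var, cases
        return None
    _, (op, v, const), then_blk, else_blk = node
    if op != "eq" or (var is not None and v != var):
        return None  # not an equality chain on a single variable
    cases[const] = then_blk
    return collect_lowered_switch(else_blk, cases, v)

chain = ("if", ("eq", "x", 1), "case1",
         ("if", ("eq", "x", 2), "case2", "fallthrough"))
print(collect_lowered_switch(chain))
# ('x', {1: 'case1', 2: 'case2', 'default': 'fallthrough'})
```

The real pass must also verify that each case body is a valid switch target (and handle the clustering merge), but the chain match above is the core of the pattern.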
Implementation
You can find our implementation for Switch Lowering reversion here:
https://github.com/angr/angr/blob/0c2837abad59f9a99958a518557143a7860b2940/angr/analyses/decompiler/optimization_passes/lowered_switch_simplifier.py#L160
You can find our implementation for Switch Cluster reversion here:
https://github.com/angr/angr/blob/0c2837abad59f9a99958a518557143a7860b2940/angr/analyses/decompiler/region_simplifiers/switch_cluster_simplifier.py#L70
Together, these implement some nice recovery for switches. For test cases, see #7.
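If you want to watch the pass fire on a real binary, something like the following should work with a recent angr. The class name is taken from the linked file; treat the exact import path and keyword arguments as assumptions to check against your angr version:

```python
import angr
from angr.analyses.decompiler.optimization_passes.lowered_switch_simplifier import (
    LoweredSwitchSimplifier,  # class name assumed from the linked source file
)

proj = angr.Project("./true", auto_load_libs=False)
cfg = proj.analyses.CFGFast(normalize=True)
func = proj.kb.functions["main"]

# Run the decompiler with the switch reverter enabled explicitly
# (passing optimization_passes overrides the default pass list).
dec = proj.analyses.Decompiler(func, cfg=cfg.model,
                               optimization_passes=[LoweredSwitchSimplifier])
print(dec.codegen.text)
```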
Re-verify
We based these implementations on two sources. First, this talk by a GCC developer: "Switch lowering improvements". Second, the GCC Switch Lowering code itself, from GCC 9.
Recently, we've also found this blog post by Xoranth to be a great breakdown of the Switch Lowering code and its heuristics for understanding when to lower a switch.
Together, you should be able to rediscover what we did and perhaps find an even better algorithm.
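As a flavor of what those sources cover: GCC decides between a jump table and a compare chain by weighing the case count and the density of the case values. The sketch below uses illustrative thresholds, not GCC's exact parameters (those live in its switch lowering code):

```python
def would_use_jump_table(case_values, min_cases=5, min_density=0.4):
    """Illustrative GCC-style check: a switch becomes a jump table only
    if it has enough cases and they are dense enough over their range."""
    if len(case_values) < min_cases:
        return False  # few cases: a compare chain (or bit test) is cheaper
    span = max(case_values) - min(case_values) + 1
    return len(case_values) / span >= min_density

print(would_use_jump_table([1, 2, 3, 4, 5]))          # True: dense range
print(would_use_jump_table([1, 9, 100, 1000, 5000]))  # False: sparse range
```

Sparse switches are the ones that get lowered to compare chains, which is exactly the shape the deoptimization above has to undo.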
2. ISD Deoptimizations (Jump Threading motivated)
Background
Our ISD deoptimizations were initially motivated by the Jump Threading optimization alone. However, after implementing our first prototype, we found the approach could be extended to other optimizations. This deoptimization is discussed in Section 5.1 of the paper.
This deoptimization is less conservative than the last because it actually deletes code from the decompilation output. The deletion is safe, and it's done by merging two calls duplicated from the source. However, some may not like this since the output no longer explicitly matches the assembly you see (two calls to foo in the asm, but only one in the decompilation).
Additionally, you need knowledge of gotos to do this deoptimization. That is, you need to run a schema-based structuring algorithm at least once, or otherwise know where gotos will appear.
This is the most complicated deoptimization in the paper to implement, since it requires knowledge of reaching conditions, block similarity (exact matching at the IR level), and a good amount of KMP-based graph matching. However, it is more conservative than the ISC deoptimizations.
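To illustrate the block-similarity requirement, here is a toy sketch (our illustration, not the angr code) where blocks are lists of IR-like statement strings; the real pass compares AIL statements and uses sequence matching over larger regions:

```python
# Minimal sketch of the "exact match at the IR level" similarity test the
# ISD reverter needs before it may merge duplicated code.

def blocks_identical(blk_a, blk_b):
    """Two blocks are merge candidates only if every statement matches."""
    return len(blk_a) == len(blk_b) and all(
        a == b for a, b in zip(blk_a, blk_b)
    )

# Jump threading duplicated the same call onto both sides of a branch:
then_blk = ["t0 = arg0 + 1", "call foo(t0)"]
else_blk = ["t0 = arg0 + 1", "call foo(t0)"]

if blocks_identical(then_blk, else_blk):
    # Safe to delete one copy and redirect both predecessors to the
    # survivor, which is why the decompilation shows one call to foo
    # where the assembly has two.
    merged = then_blk
    print("merged:", merged)
```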
Implementation
This deoptimization is still being ported to angr master; this link will be updated with the improved version once the port is done. For now, here is the frozen version from the paper:
https://github.com/mahaloz/angr-sailr/blob/be3855762a84983137696aa14efe2431a86a7e97/angr/analyses/decompiler/optimization_passes/duplication_reverter.py#L1247
Re-verify
This deoptimization was implemented by reading this blog, reading the GCC 9 source, and testing on the true_localcharset binary (see the test cases).
3. Return Duplication (ISC 1/2)
Background
Similar to the ISD deoptimizations, this deoptimization is more aggressive because its decompilation output differs from the assembly. Instead of having 1 return statement, you may have 3. Since special functions that cause an exit are treated as returns, you may duplicate those as well.
Implemented carelessly, this can produce more returns than the original source code had. To combat this, we implemented a series of heuristic checks that reduce the likelihood of this happening: how many calls sit next to the return, how far the return is from the function head, how many incoming edges it has, whether it is inside a region, and so on.
Currently, some decompilers implement a much simpler version of this deoptimization:
"when you see a return with two edges to it, with the right amount of scopes, duplicate"
In our implementation, we take it much further and instead use goto information to decide whether to apply it or not.
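To make the simple rule concrete, here is a toy CFG sketch of it (our illustration, ignoring the scope checks): any return block with two or more predecessors gets a private copy per predecessor:

```python
# Toy CFG: block name -> list of successor names. "ret" is a shared
# return block reached from two places; structuring would turn at least
# one of those paths into a goto.
cfg = {"entry": ["a", "b"], "a": ["ret"], "b": ["ret"], "ret": []}

def duplicate_returns(cfg):
    """The naive rule: if a return block has 2+ predecessors, give each
    predecessor its own private copy of it."""
    preds = {n: [m for m, succs in cfg.items() if n in succs] for n in cfg}
    for node, succs in list(cfg.items()):
        if not succs and len(preds[node]) > 1:  # shared return block
            for i, p in enumerate(preds[node]):
                copy = f"{node}_copy{i}"
                cfg[copy] = []
                cfg[p] = [copy if s == node else s for s in cfg[p]]
            del cfg[node]
    return cfg

print(duplicate_returns(cfg))
# {'entry': ['a', 'b'], 'a': ['ret_copy0'], 'b': ['ret_copy1'],
#  'ret_copy0': [], 'ret_copy1': []}
```

Our version replaces the bare "2+ predecessors" test with the goto-aware heuristics described above, so it only duplicates when doing so actually removes a goto.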
Implementation
The best way to understand all of these checks, and how we balance duplication against goto reduction, is to take a look at our code here:
https://github.com/angr/angr/blob/0c2837abad59f9a99958a518557143a7860b2940/angr/analyses/decompiler/optimization_passes/return_duplicator_low.py#L15
Re-verify
We based many of our heuristics and our implementation on the tail-merge optimization code from GCC 9. Additionally, we based some other metrics on optimizations seen in the Coreutils dataset, also found in #7.
4. Single-successor Duplication (ISC 2/2)
Background
This deoptimization is the most aggressive one and is thus run last in our pipeline. In the event that we have some gotos that jump to a node with only a single successor, we may duplicate that node, and every other node matching that pattern in the same chain.
Implementing more heuristics (like a duplication limit and type checks) reduces how often this fires, as was the case with the first ISC deoptimization.
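Here is a hypothetical sketch of the core move, with a bare duplication limit standing in for our heuristics: follow the single-successor chain from the goto target and collect the statements to inline at the goto site:

```python
# Toy model: each node has a body and at most one successor.
nodes = {
    "cleanup": {"body": "free(p);", "succ": "log"},
    "log":     {"body": 'puts("done");', "succ": "exit"},
    "exit":    {"body": "return 0;", "succ": None},
}

def inline_goto_chain(nodes, target, limit=3):
    """Follow the single-successor chain from a goto target, collecting
    bodies to duplicate at the goto site. `limit` mimics a duplication
    cap; the real pass layers many more heuristics on top."""
    copied = []
    while target is not None and len(copied) < limit:
        copied.append(nodes[target]["body"])
        target = nodes[target]["succ"]
    return copied

# A goto that jumped to "cleanup" can be replaced by these statements:
print(inline_goto_chain(nodes, "cleanup"))
# ['free(p);', 'puts("done");', 'return 0;']
```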
Implementation
You can find our implementation here:
https://github.com/angr/angr/blob/0c2837abad59f9a99958a518557143a7860b2940/angr/analyses/decompiler/optimization_passes/cross_jump_reverter.py#L13
It may also be useful to note when this pass runs relative to the other passes:
https://github.com/angr/angr/blob/0c2837abad59f9a99958a518557143a7860b2940/angr/analyses/decompiler/optimization_passes/__init__.py#L50
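For orientation, the ordering boils down to "least aggressive first, most aggressive last". The class names below are inferred from the linked file names; treat this as a sketch and check the linked __init__.py for the authoritative ordering:

```python
# Order matters: each pass assumes the gotos left behind by the previous one.
SAILR_PASS_ORDER = [
    "LoweredSwitchSimplifier",  # 1. switch lowering & clustering reversion
    "DuplicationReverter",      # 2. ISD (jump threading) reversion
    "ReturnDuplicatorLow",      # 3. ISC 1/2: return duplication
    "CrossJumpReverter",        # 4. ISC 2/2: single-successor duplication
]
```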
Re-verify
This deoptimization was mostly based on observation and the cross jumping optimization code found in GCC 9.
In total, that is all the deoptimizations we implemented for SAILR. It is likely that a few more deoptimization cases exist out there, and there are many ways to make our prototype implementations better. I hope you find the above info useful :). Feel free to email me at [email protected].