Skip to content
/ CDSSA Public

This repo contains the implementation of a number of algorithms for computing control dependencies, weak control closure, and static single assignment (SSA) program transformation.

Notifications You must be signed in to change notification settings

anm-spa/CDSSA

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

60 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Control dependency, SSA transformation, and their duality

This repository contains the following two subdirectories:

  1. WCCPhiAnal - It is a Clang-based program analysis tool that computes control dependencies, weak control closure, and phi nodes (required for SSA computation)
  2. SSA - It is implemented as an LLVM pass to transform any LLVM IR program to the SSA form

Both are implemented and tested in LLVM/Clang version 11.

Setup and Installation of SSA and WCCPhiAnal

WCCPhiAnal can run as a standalone tool built using the Libtooling library support. To build WCCPhiAnal, do the following:

  1. Copy WCCPhiAnal in (your llvm directory)/clang/tools directory,
  2. Add the instruction add_clang_subdirectory(WCCPhiAnal) in the CMakeLists.txt file in the tools directory,
  3. Build the WCCPhiAnal project by building LLVM (or selectively build projects in LLVM) which will generate the wccphigen executable. See the details of how to build Clang/LLVM.

SSA is an LLVM pass which can be built as follows:

  1. Copy SSA to the directory (your llvm directory)/llvm/lib/Transforms directory and rename SSA by SmartSSA (just to resolve naming conflict),
  2. Add the instruction add_subdirectory(SmartSSA) in the CMakeLists.txt file in the Transforms directory,
  3. Build SSA by building LLVM (or selectively build projects in LLVM) which will generate the SmartSSA.dylib library in the Debug/lib/ directory of your LLVM's build directory.

Usage of WCCPhiAnal (wccphigen executable)

This tool performs intraprocedural analysis of source code written in C language and can be used to compute the following:
  1. Compute the weak control closure of each procedure with respect to a given subset of CFG nodes. These subset of CFG nodes can be taken randomly or given as input. WCC can be computed using various state-of-the-art methods and a novel duality-based algorithm.
    • Compute WCC using the method of Danicic et al.[1]. Usage option: -alg=CDDanicic
    • Computte WCC using reaching definition-based analysis [2,3]. Usage option: -alg=CDSAS
    • Computte WCC using novel duality-based algorithm [4]. Usage option: -alg=CDDual
    • Compare the computation of WCC using all the above three methods. Usage option: -alg=CompCD
  2. Compute the set of PHI nodes from the CFG of each procedure and for each program variable.
    • Compute Phi nodes using the dominance frontier based method (Cytron's algorithm [5]). Usage option: -alg=PhiDF
    • Compute Phi nodes using RD-based method discussed in SCAM 2019 [6] and JSS 2020 [7]. Usage option: -alg=PhiRD
    • Compute Phi nodes using the duality-based fixpoint method [4]. Usage option: -alg=PhiDual
    • Compare the computation of Phi nodes using all the above methods. Usage option: -alg=CompPhi
  3. Use the option -proc=(Procedure name) if you want to analyze a specific procedure. Then, you can use -cfg or -cdg to see the CFG or the control dependency information of the analyzed procedure.
  4. Usually, WCC is computed with respect to a set of CFG nodes. By default, this set is obtained randomly from the set of CFG nodes. However, this option can be overridden by using the option -np which will then allow to get a subset of CFG nodes from console.
  5. Dominance frontier based computation (such as [5]) of Phi nodes assumes that all program variables are defined at the beginning and introduces spuriousness in computing phi nodes. However, theis assumption is relaxed in [4,6,7] by assuming that only global variables and formal parameters are defined at the beginning. Thus, the implementation of the methods in [4,6,7] produces precise phi nodes. However, the option -defs-at-start forces these methods to assume that all program variables are definied at the beginning, and then the set of phi nodes in all computations are same.

Limitations:

  • The computation of phi nodes does not consider the liveness of program variables.
  • It generates only phi nodes, but does not transform high-level code to its SSA form. However, the above limitations are resolved in SmartSSA which transforms any LLVM IR code to its SSA form.

Usage of SmartSSA

SmartSSA is implemented as an LLVM pass that can be invoked using the LLVM's opt tool. It extends the original mem2reg pass of LLVM and include the implementation of duality-based phi node computation. This tool can be used using the following command line option:

opt -load (path to your ...Debug/lib/)SmartSSA.dylib -S -smartssa source.ll -o sourceOutput.ll Options

where, source.ll is the source file in LLVM IR format (generated by Clang) and sourceOutput.ll is the LLVM IR output in SSA form. The Options in the above command include the following:
  • -m2rlive Convert source.ll to sourceOutput.ll using the mem2reg pass of LLVM and use the liveness of variables into account to compute pruned SSA form

  • -duallive Convert source.ll to sourceOutput.ll using the SmartSSA pass and use the liveness of variables into account to compute pruned SSA form

  • -m2r Convert source.ll to sourceOutput.ll using the mem2reg pass of LLVM and without using the liveness of variables into account ( thus, it computes minimal SSA form)

  • -dual Convert source.ll to sourceOutput.ll using the SmartSSA pass and without using the liveness of variables into account to compute pruned SSA form

  • -defstart option can be used with -duallive or -dual to state that all program variables are defined at the beginning. The effect will be that the generated LLVM IR will be equal to the LLVM IR of mem2reg pass.

Reference

[1] Danicic, Sebastian et al. “A unifying theory of control dependence and its application to arbitrary program structures.” Theor. Comput. Sci. 412 (2011): 6809-6842.

[2] Abu Naser Masud. Efficient computation of minimal weak and strong control closure. J. Syst. Softw. 184, C (Feb 2022). DOI:https://doi.org/10.1016/j.jss.2021.111140

[3] Abu Naser Masud. Simple and Efficient Computation of Minimal Weak Control Closure. in proceeding of the 27th Intl. conf. in static analysis (SAS) (2020).

[4] Abu Naser Masud. The Duality in Computing SSA Programs and Control Dependency, under review in transactions on software engineering (TSE).

[5] Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark N. Wegman, and F. Kenneth Zadeck. Efficiently computing static single as- signment form and the control dependence graph. ACM Trans. Program. Lang. Syst., 13(4):451–490, October 1991.

[6] A. N. Masud and F. Ciccozzi. Towards constructing the ssa form using reaching definitions over dominance frontiers. In 2019 19th International Working Conference on Source Code Analysis and Manipulation (SCAM), pages 23–33, 2019.

[7] Abu Naser Masud and Federico Ciccozzi. More precise con- struction of static single assignment programs using reaching definitions. Journal of Systems and Software, 166:110590, 2020.

About

This repo contains the implementation of a number of algorithms for computing control dependencies, weak control closure, and static single assignment (SSA) program transformation.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published