You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Currently we manually instrument recursive functions to collect the problem size and depth of recursion at the beginning of each iteration.
We also instrument the recursive function and helper functions (if available) to collect the size and number of iterations per recursive step as shown in the example below:
Descriptive example
def bubble_sort(arr, n, depth, trace_file):
# L1: trace problem size and depth of recursion
with open(trace_file, 'a') as f:
print("{};{}".format(depth, n), file=f)
if n==1:
return arr
for i in range(n-1):
if arr[i]>arr[i+1]:
# L2: trace loop iterations
if depth == 0: #we only count iterations for one recursion step.
global counter
counter = counter + 1
arr[i], arr[i+1] = arr[i+1], arr[i]
return bubble_sort(arr, n-1, depth+1, trace_file)
#Driver to collect all traces
counter = 0
def main():
global counter
counter = 0
size = random.randint(1, 500)
arr = random_list(size)
depth = 0
path = "./bubble_sort"
try:
os.mkdir(path)
except OSError as error:
pass
file = "./bubble_sort/output-{}".format(size) #naming convention
bubble_sort(arr, size, depth, file)
# L3: trace total loop iteration and per problem size
with open("./bubble_sort/traces", 'a') as f:
print("{};{}".format(size, counter), file=f)
counter = 0
if __name__ == '__main__':
for i in range(100): #run the bubblesort a few times to collect enough traces
main()
In the example above, we manually collect bubblesort execution traces by manually instrumenting bubblesort function at L1 to collect size of input list (i.e. problem size) and depth of recursion and store these traces in a trace_file. We also track loop iteration at L2 by incrementing the a loop counter and storing this loop counter and original size of input list in a different trace_file at L3 following some file naming convention (explained in the README).
Problem Description
In this issue we would like to improve dynaplex in two ways: automate the code instrumentation process and parametrize what kind of traces to be collected.
Automated code instrumentation
We can automate the above process by inserting code statements into the Abstract Syntax Tree (AST) of the program (ex: bubblesort). There are different libraries or framework to obtain and edit the AST of a program depending on the programming language. In this example, I will suggest using CIL which helps to access and manipulate the AST of a C program. However, this is not mandatory as there are other frameworks that accomplish the same task for different programming languages you are comfortable with (ex: ast module for python programs).
In order to use CIL you need some knowledge of Ocaml. Fortunately, I have a few examples to jump start this task. The APR-input-rectification repo contains my PhD qualifying exam from 2019. In this repo, I attempt to automatically repair C programs by rectifying the inputs. The repo contains different Ocaml programs to insert, edit or remove program statement using CIL. We can modify these programs to insert statement to track necessary traces as explained above. To simplify this problem there is a trace collection library (kshlib.h) that provides API for trace collection (problem size, loop counters, depth).
Parametrize trace collection
So far Dynaplex only infers bounds on time complexity therefore it collects traces related to problem size. We would like to expand Dynaplex to also infer bounds on other arbitrary resource consumption. For instance, we would like Dynaplex to infer the upper bounds on specific API calls. In order to accomplish this task we augment our code instrumentation above to allow the user to specify a feature (time, API or space) on which to infer an upper bound.
In the example below, we would like to infer an upper bound on the number of times API1 is called in terms of input size.
def foo(arr, depth, api1_counter, trace_file):
# L1: trace api1_counter and depth of recursion
with open(trace_file, 'a') as f:
print("{};{}".format(depth, api1_counter), file=f)
if len(arr)==1:
return arr
for i in range(len(arr)-1):
# L2: trace loop iterations
if depth == 0: #we only count iterations for one recursion step.
counter = counter + 1
api1(arr)
api1_counter += 1
return foo(arr, depth+1, api1_counter, trace_file)
We introduce a counter variable to track API1 calls and instrument foo() to trace API1_counter instead of problem size and infer bounds on API1_counter Note: We still need to create a list of features that can be handled by Dynaplex and devise a way to collect necessary traces for the bound inference.
The text was updated successfully, but these errors were encountered:
Trace collection
Currently we manually instrument recursive functions to collect the problem size and depth of recursion at the beginning of each iteration.
We also instrument the recursive function and helper functions (if available) to collect the size and number of iterations per recursive step as shown in the example below:
Descriptive example
In the example above, we manually collect bubblesort execution traces by manually instrumenting bubblesort function at L1 to collect size of input list (i.e. problem size) and depth of recursion and store these traces in a trace_file. We also track loop iteration at L2 by incrementing the a loop counter and storing this loop counter and original size of input list in a different trace_file at L3 following some file naming convention (explained in the README).
Problem Description
In this issue we would like to improve dynaplex in two ways: automate the code instrumentation process and parametrize what kind of traces to be collected.
Automated code instrumentation
We can automate the above process by inserting code statements into the Abstract Syntax Tree (AST) of the program (ex: bubblesort). There are different libraries or framework to obtain and edit the AST of a program depending on the programming language. In this example, I will suggest using CIL which helps to access and manipulate the AST of a C program. However, this is not mandatory as there are other frameworks that accomplish the same task for different programming languages you are comfortable with (ex: ast module for python programs).
In order to use CIL you need some knowledge of Ocaml. Fortunately, I have a few examples to jump start this task. The APR-input-rectification repo contains my PhD qualifying exam from 2019. In this repo, I attempt to automatically repair C programs by rectifying the inputs. The repo contains different Ocaml programs to insert, edit or remove program statement using CIL. We can modify these programs to insert statement to track necessary traces as explained above. To simplify this problem there is a trace collection library (kshlib.h) that provides API for trace collection (problem size, loop counters, depth).
Parametrize trace collection
So far Dynaplex only infers bounds on time complexity therefore it collects traces related to problem size. We would like to expand Dynaplex to also infer bounds on other arbitrary resource consumption. For instance, we would like Dynaplex to infer the upper bounds on specific API calls. In order to accomplish this task we augment our code instrumentation above to allow the user to specify a feature (time, API or space) on which to infer an upper bound.
In the example below, we would like to infer an upper bound on the number of times API1 is called in terms of input size.
We introduce a counter variable to track API1 calls and instrument foo() to trace API1_counter instead of problem size and infer bounds on API1_counter
Note: We still need to create a list of features that can be handled by Dynaplex and devise a way to collect necessary traces for the bound inference.
The text was updated successfully, but these errors were encountered: