dynaroars / dynaplex Goto Github PK

A tool for inferring program complexity, e.g., O(nlogn), O(n^2), O(2^n). Dynaplex uses dynamic analysis to infer recurrence relations and then solve them for closed form solutions representing program complexity bounds.

License: MIT License

OCaml 3.37% Makefile 0.41% Python 20.70% Shell 1.18% C 10.43% C++ 63.90%

complexity dynamic-analysis invariant-generation

dynaplex's People

Contributors

Stargazers

Watchers

Forkers

wcphkust

dynaplex's Issues

Complexity Analysis Benchmark

We want to perform a systematic survey of complexity analysis papers. From each paper, we collect all the programs used to evaluate the resulting tool. We will convert those programs into SV-COMP format and discard duplicates.
We can start with papers used in Didier's comprehensive exam report. Note that some tools only work on cost models or intermediate languages instead of working implementations. If the cost models aren't automatically generated from working implementations by the tool we will ignore their benchmark programs. We want to focus on papers which explicitly aims at inferring complexity bounds as opposed to numerical invariant in general.
Preliminary list of complexity analysis papers

As an example program in figure 4.4 of SPEED was transformed into this example for 2019 Complexity Analysis Competition in SV-COMP. More examples can be found at the same SV-COMP.

Automated Trace Collection

Trace collection

Currently we manually instrument recursive functions to collect the problem size and depth of recursion at the beginning of each iteration.
We also instrument the recursive function and helper functions (if available) to collect the size and number of iterations per recursive step as shown in the example below:

Descriptive example

def bubble_sort(arr, n, depth, trace_file):
    # L1: trace problem size and depth of recursion
    with open(trace_file, 'a') as f:
        print("{};{}".format(depth, n), file=f)
    if n==1:
        return arr
    for i in range(n-1):
        if arr[i]>arr[i+1]:
            # L2: trace loop iterations
            if depth == 0: #we only count iterations for one recursion step.
                global counter
                counter = counter + 1
            arr[i], arr[i+1] = arr[i+1], arr[i]
        return bubble_sort(arr, n-1, depth+1, trace_file) 

#Driver to collect all traces
counter = 0
def main():
    global counter
    counter = 0
    size = random.randint(1, 500)
    arr = random_list(size)
    depth = 0
    path = "./bubble_sort"
    try:
        os.mkdir(path)
    except OSError as error:
        pass
    file = "./bubble_sort/output-{}".format(size) #naming convention 
    bubble_sort(arr, size, depth, file)
   # L3: trace total loop iteration and per problem size
    with open("./bubble_sort/traces", 'a') as f:
        print("{};{}".format(size, counter), file=f)
    counter = 0
    
if __name__ == '__main__':
    for i in range(100): #run the bubblesort a few times to collect enough traces
       main()

In the example above, we manually collect bubblesort execution traces by manually instrumenting bubblesort function at L1 to collect size of input list (i.e. problem size) and depth of recursion and store these traces in a trace_file. We also track loop iteration at L2 by incrementing the a loop counter and storing this loop counter and original size of input list in a different trace_file at L3 following some file naming convention (explained in the README).

Problem Description

In this issue we would like to improve dynaplex in two ways: automate the code instrumentation process and parametrize what kind of traces to be collected.

Automated code instrumentation

We can automate the above process by inserting code statements into the Abstract Syntax Tree (AST) of the program (ex: bubblesort). There are different libraries or framework to obtain and edit the AST of a program depending on the programming language. In this example, I will suggest using CIL which helps to access and manipulate the AST of a C program. However, this is not mandatory as there are other frameworks that accomplish the same task for different programming languages you are comfortable with (ex: ast module for python programs).

In order to use CIL you need some knowledge of Ocaml. Fortunately, I have a few examples to jump start this task. The APR-input-rectification repo contains my PhD qualifying exam from 2019. In this repo, I attempt to automatically repair C programs by rectifying the inputs. The repo contains different Ocaml programs to insert, edit or remove program statement using CIL. We can modify these programs to insert statement to track necessary traces as explained above. To simplify this problem there is a trace collection library (kshlib.h) that provides API for trace collection (problem size, loop counters, depth).

Parametrize trace collection

So far Dynaplex only infers bounds on time complexity therefore it collects traces related to problem size. We would like to expand Dynaplex to also infer bounds on other arbitrary resource consumption. For instance, we would like Dynaplex to infer the upper bounds on specific API calls. In order to accomplish this task we augment our code instrumentation above to allow the user to specify a feature (time, API or space) on which to infer an upper bound.
In the example below, we would like to infer an upper bound on the number of times API1 is called in terms of input size.

def foo(arr, depth, api1_counter, trace_file):
      # L1: trace api1_counter and depth of recursion
      with open(trace_file, 'a') as f:
          print("{};{}".format(depth, api1_counter), file=f)
      if len(arr)==1:
          return arr
      for i in range(len(arr)-1):
           # L2: trace loop iterations
           if depth == 0: #we only count iterations for one recursion step.
               counter = counter + 1
           api1(arr)
           api1_counter += 1
          return foo(arr, depth+1, api1_counter, trace_file)

We introduce a counter variable to track API1 calls and instrument foo() to trace API1_counter instead of problem size and infer bounds on API1_counter
Note: We still need to create a list of features that can be handled by Dynaplex and devise a way to collect necessary traces for the bound inference.

Converting SV-COMP (complexity competition) programs.

Converting 2019 Complexity competition programs to recursive structure.

There are around 500 imperative programs with nontrivial loops in the complexity competition. We want to convert these iterative programs to recursive programs.
Most of these functions with loops have the following structure:

foo_iterative(params){
    header
    while(condition){
        loop_body
    }
    return tail
}

the header: statements before the loop. This may declare some new variables
the loop condition: which indicates when to stop the loop
the loop body: which changes some header variables and/or function parameters. This is the main part we want to eliminate or convert into recursion.
the tail; statements after the loop.

We transform the above foo_iterative function to recursion as follows:

foo_recursive(params){
    header
    return foo_recursion(params, header_vars)
}

foo_recursion(params, header_vars){
    if(!condition){
        return tail
    }

    loop_body
    return foo_recursion(params, modified_header_vars)
}

Example: Bangalore_v4_true-termination.c

int foo(int x, int y){
if (y > x) {
	while (x >= 0) {
	    	x = x - y;
    	}
	}
return 0;
}

Converting foo() into a recursive function:

header: if (y > x)
loop condition: (x >= 0)
loop body: x = x - y;
tail: return 0;

Resulting recursive foo program:

int foo(int x, int y){
    if (y > x){
      return rec_foo(x, y);
    }
    return 0;
}

int rec_foo(int x, int y){
  rec_trace(x);
  if(!(x >= 0)){
      return 0;
  }
  x = x - y;
  return rec_foo(x, y);
}

Remarks:
Although these steps help to transform a lot of iterative functions into recursion it is not comprehensive. There are examples where it would be difficult to follow the above steps. Fortunately, termination competition have a lot of programs and many of them can be converted by just using the above technique. We want to start with those easier programs. We are not trying to remove all loops, just the main loops. For example, in case of nested loops we want to transform the outer loop and consider the inner loop as loop body (step 3). More examples are given in recursive_programs/c (dev branch); the file names matches the file names given in the complexity termination to avoid confusion.
Note: I haven't figured out how to handle sequential loops. Do not commit any changes to the master branch.

Recommend Projects

React

A declarative, efficient, and flexible JavaScript library for building user interfaces.
Vue.js

🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
Typescript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
TensorFlow

An Open Source Machine Learning Framework for Everyone
Django

The Web framework for perfectionists with deadlines.
Laravel

A PHP framework for web artisans
D3

Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

javascript

JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
web

Some thing interesting about web. New door for the world.
server

A server is a program made to process requests and deliver data to clients.
Machine learning

Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Visualization

Some thing interesting about visualization, use data art
Game

Some thing interesting about game, make everyone happy.

Recommend Org

Facebook

We are working to build community through open source technology. NB: members must have two-factor auth.
Microsoft

Open source projects and samples from Microsoft.
Google

Google ❤️ Open Source for everyone.
Alibaba

Alibaba Open Source for everyone
D3

Data-Driven Documents codes.
Tencent

China tencent open source team.