ftsrg / gazer Goto Github PK

An LLVM-based formal verification frontend for C programs.

CMake 0.93% Dockerfile 0.13% C++ 75.02% Python 0.79% LLVM 0.74% Shell 0.03% C 21.01% Perl 1.36%

gazer's Introduction

About

Gazer is formal a verification frontend for C programs. It provides a user-friendly end-to-end verification workflow, with support for multiple verification engines.

Currently we support two verification backends:

gazer-theta leverages the power of the theta model checking framework.
- Currently, v2.10.0 is tested, but newer releases might also work.
gazer-bmc is gazer's built-in bounded model checking engine.

Furthermore, it is also possible to run multiple backends with different options as a portfolio. See doc/Portfolio.md for more information.

Gazer also participates in SV-COMP, see general report and our publication.

Usage

Consider the following C program (example.c):

#include <stdio.h>

extern int ioread32(void);

int main(void) {
    int k = ioread32();
    int i = 0;
    int j = k + 5;
    while (i < 3) {
        i = i + 1;
        j = j + 3;
    }

    k = k / (i - j);

    printf("%d\n", k);

    return 0;
}

The program above may attempt to divide by zero for certain values of k. We can verify this program by using either verification backend:

$ gazer-bmc example.c

$ gazer-theta example.c

The verifier will return the following verification verdict:

Verification FAILED.
  Divison by zero in example.c at line 15 column 11.

Traces

Detailed counterexample traces may be acquired by using the -trace option:

$ gazer-bmc -trace example.c
...
Verification FAILED.
  Divison by zero in example.c at line 15 column 11.
Error trace:
------------
#0 in function main():
  call ioread32() returned -11		 at 7:13
  k := -11	(0b11111111111111111111111111110101)	 at 7:13
  i := 0	(0b00000000000000000000000000000000)
  j := -11	(0b11111111111111111111111111110101)	 at 7:13
  j := ???
  i := 0	(0b00000000000000000000000000000000)
  i := 1	(0b00000000000000000000000000000001)	 at 11:15
  j := ???
  j := ???
  i := 1	(0b00000000000000000000000000000001)
  i := 2	(0b00000000000000000000000000000010)	 at 11:15
  j := ???
  j := ???
  i := 2	(0b00000000000000000000000000000010)
  i := 3	(0b00000000000000000000000000000011)	 at 11:15
  j := ???
  j := ???
  i := 3	(0b00000000000000000000000000000011)
  i := 3	(0b00000000000000000000000000000011)
  j := 3	(0b00000000000000000000000000000011)	 at 10:5

The trace lists each step along an errenous execution path, with the values of each program variable.

Note: Gazer attempts to speed up verification by optimizing and reducing the input program, and in doing so, it may strip away some of the variables it proves to be irrelevant to the verification task. However, such removed variables may still show up in the trace as undefined values (indicated by ???). This behavior can be turned off using the -no-optimize and -elim-vars=off flags.

As we can see, the errenous behavior may occur if ioread32 returns -11. We can request an executable test harness by using the -test-harness=<file> option:

gazer-bmc -trace -test-harness=harness.ll example.c

This generated test harness contains the definition of each function which was declared but not defined within the input module. Linking this test harness against the original module yields an executable which may be used to replay the counterexample.

$ clang example.c harness.ll -o example_test
$ ./example_test
[1]    19948 floating point exception (core dumped)  ./example_test

Citation

To cite Gazer as a tool, please use the following paper.

@incollection{gazer-tacas2021,
    author     = {\'Adam, Zs\'ofia and Sallai, Gyula and Hajdu, \'Akos},
    title      = {{G}azer-{T}heta: {LLVM}-based Verifier Portfolio with {BMC}/{CEGAR} (Competition Contribution)},
    year       = {2021},
    booktitle  = {Tools and Algorithms for the Construction and Analysis of Systems},
    editor     = {Jan Friso Groote and Kim G. Larsen},
    series     = {Lecture Notes in Computer Science},
    volume     = {12652},
    publisher  = {Springer},
    pages      = {435--439},
    doi        = {10.1007/978-3-030-72013-1_27},
}

gazer's People

Contributors

Stargazers

Watchers

Forkers

zuxichen sallaigy leventebajczi adamzsofi radl97

gazer's Issues

Travis build environment is not C++17 compliant

In CMakeLists.txt the C++ standard is set to C++17.

gazer/CMakeLists.txt

Line 9 in ec1994d

set(CMAKE_CXX_STANDARD 17)

However, when building the code with Travis, the building process might fail with an error (header <charconv> is part of the C++17 standard):
https://travis-ci.com/github/ftsrg/gazer/builds/183224034

[ 87%] Building CXX object tools/gazer-theta/CMakeFiles/GazerBackendTheta.dir/lib/ThetaVerifier.cpp.o
/home/travis/build/ftsrg/gazer/tools/gazer-theta/lib/ThetaVerifier.cpp:33:10: fatal error: 'charconv' file not found
#include <charconv>
         ^~~~~~~~~~
1 error generated.

Add support for the SV-Comp witness format

https://github.com/sosy-lab/sv-witnesses

Support SV-COMP ReachSafety-Floats with Theta

Since BMC already supports it with quite good results, this is not a top priority. Rather, #60 should be prioritized to eliminate incorrect results.

Support SV-COMP ReachSafety-Combined with BMC

I'll need to take a look at exactly what the problem is here.
It could be a performance issue, but it probably can be handled so that this issue is resolved.
It may also have something to do with inlining.

Discussion: Upgrading LLVM framework?

Someone mentioned that we could upgrade LLVM framework (current version is 9). There are many different aspects to this, so I think it is useful to start a discussion to even consider using this.

On the pro side, I think newer versions might have

newer passes added,
Passes are better optimized,
Modern C++ versions are probably better supported.
Easier install... Most Linux distributions have moved to newer versions of LLVM, and older versions are hard to wire to the system. (More so, if one wants to install other LLVM-based tools). For example ArchLinux users have a hard time installing the old LLVM, but it is possible nonetheless.

On the con side, we should consider what changes will it cost. I haven't yet done any research. But this might mean

handling new instruction, if any (thinking of instructions like extract-value which are very specific and only helpful for lowering to machine code), or
Some feature API changed (We are already using a legacy pass manager... It could be that it was removed).

One might even consider creating LLVM version-independent code (I did not find anything in the LLVM documentations that were changed across 9.0 and latest. This might mean it just works with any LLVM). Note that the code is independent, but I'm not speaking of binary-compatibility.

What are your thoughts?

@AdamZsofi and @sallaigy (if you have time).

Intrinsics using pointless special pointer operations

Pointers are not only alloca'd and loaded and stored on simple programs, which would be good for a simple implementation (see 4b44803 ).

Intrinsics use _point_er casts on function entry, which at this point seems _point_less. (hehe)
I think this can be good if there is at least one cast already for the pointer, but until then, we should use the original type overloaded intrinsics (it seems that gazer.function.entry is already overloaded).

This later would introduce special case for analysis if the simple memory model can handle a value.

is it alloca'd or global? Yes
is it GEP'd? No
is it bitcasted? Yes
is the bitcast only in intrinsic functions? yes
then proceed to pass to simple memory model (else fail or handover to another memory model, etc.)

Portfolio script as entry point of the docker file

Since the portfolio seems to work quite well, why don't we make it as an entry point to the docker image? This way, the docker image could be called more easily (see cfa-cli in Theta).

Support 'malloc' in flat memory model

Currently we do not have support 'malloc' (or dynamic memory allocation in general) in the flat memory model. Even though parts of the required API exist, we should add full, user-level support for it.

Support SV-COMP ReachSafety-Sequentialized with Theta

FIsNaN is translated incorrectly

Currently FIsNaN queries are translated incorrectly in the frontend - they are translated to FIsInf this leads to false-positives and false-negatives.

LLVM IR test case to reproduce:

; This test failed due to an incorrect translation of FIsNaN queries to FIsInf

; RUN: %bmc -bound 1 -skip-pipeline "%s" | FileCheck "%s"
; CHECK: Verification {{SUCCESSFUL|BOUND REACHED}}

define dso_local i32 @main() local_unnamed_addr {
bb:
  %b = alloca float, align 4
  store float 0.000000e+00, float* %b, align 4
  %tmp2 = bitcast float* %b to i32*
  store i32 2143289344, i32* %tmp2, align 4
  br i1 true, label %bb8, label %bb4

bb4:                                              ; preds = %bb
  br label %bb8

bb8:                                              ; preds = %bb, %bb4
  %tmp9 = load float, float* %b, align 4
  %tmp10 = fcmp uno float %tmp9, 0.000000e+00
  %tmp11 = zext i1 %tmp10 to i32
  br i1 %tmp10, label %bb12, label %error

bb12:                                             ; preds = %bb8
  ret i32 0

error:                                            ; preds = %bb8
  call void @gazer.error_code(i16 2)
  unreachable
}

declare void @gazer.error_code(i16) local_unnamed_addr

Store float values in the flat memory model

When using the flat memory model, float values are not stored in the memory array. Attempt to retrieve floats results in an undefined value, leading to false-positives. We should add support for storing and loading floating-point values.

LLVM IR test case:

; This test failed due to unsupported memory writes for floats in the float memory model

; RUN: %bmc -bound 1 -skip-pipeline -memory=flat "%s" | FileCheck "%s"
; CHECK: Verification SUCCESSFUL

define i32 @main() {
bb:
  %b = alloca float, align 4
  store float 0.000000e+00, float* %b, align 4
  br i1 false, label %bb2, label %bb6

bb2:                                              ; preds = %bb
  br label %bb6

bb6:                                              ; preds = %bb2, %bb
  %tmp7 = load float, float* %b, align 4
  %tmp8 = fadd float %tmp7, -4.251000e+03
  %tmp9 = fcmp une float %tmp8, 0.000000e+00
  br i1 %tmp9, label %bb10, label %error

bb10:                                             ; preds = %bb6
  ret i32 0

error:                                            ; preds = %bb6
  call void @gazer.error_code(i16 2)
  unreachable
}

declare void @gazer.error_code(i16)

Test harnesses should mock unused functions

Consider the following code:

#include <assert.h>

int nondet1();
int nondet2();

int nevercalled() {
    return nondet2();
}

int main(void) {
    int x = nondet1();
    assert(x != 0);

    return 0;
}

The analysis produces a test harness which is incomplete and cannot be linked against the original program. This is due to the way the analysis works: as main is the entry point, nevercalled will never be executed and thus the nondeterministic call to nondet2 is not part of the counterexample, so it is not part of the trace. However, it should be part of the generated test harness, otherwise we will not be able to compile it and link it against the original program.

Task with floats: bmc incorrect false, with somehow correct test harness

Describe the bug
gazer-bmc produces an incorrect false result when run on the following SV-Comp task (besides other, similar tasks) :
float-no-simp8.i
float-no-simp8.c
float-no-simp8.yml

As stated in the YAML task definition, the task is true for the unreach error property, which means, that the function reach_error() should not be reachable.
Somehow bmc is able to produce a test harness
float-no-simp8.i.ll.txt (should be .ll, but github only supports txt), which, if compiled together with float-no-simp8.i outputs:

float-no-simp8: float-no-simp8.c:3: void reach_error(): Assertion `0' failed.
Aborted (core dumped)

(The body of reach_error contains an assert(0))

It is possible, that the task definition is wrong, but if it is, we should prove it somehow (but this is unlikely).
I should note, that clang produces lots of warnings (-Wunknown-attributes), but this is probably not connected to the issue.

To Reproduce
Gazer version used: 1.2.0
Running gazer to get the test harness: gazer/scripts/gazer_starter.py float-no-simp8.i
Compiling and running: clang float-no-simp8.i.ll float-no-simp8.i -o float-no-simp8 && ./float-no-simp8

Expected behavior
The verification should be reported as successful

Version:
Gazer v1.2.0

Add global variables to AutomataSystem

As far as I can understand, static global variables, which is not taken an address of (or an address of any supertype) could be used as a single variable, and no memory model magic is needed. (Which is a pain at the moment when preparing gazer-theta for XCFA formalism)
Note that recursive functions would need a non-local variable.

int global_variable; // not static -> other code might use the variable through extern int global_variable;

// if the code contains &global, then variable x cannot be inlined
struct X {
  int x;
} global_struct;

// if the code contains &array[3] or array+i, etc., then "array[5]" should not be inlined(?)
int array[10];

I'm thinking of writing it as an LLVM pass + a modification to AutomataSystem, but I don't know:
1) ... whether this property (the variable is not taken an address of) is enough. Am I missing something?
e.g. SROA can actually help, probably only primitive types will be interesting for me
2) ... how to prevent memory modelling pass to "kick in" for these vars (annotate the memory object?), which is probably better than undoing the MemSSA. Is there, by chance, something already doing this?
3) ... how hard it is for AutomataSystem to support globals (a variable used in more than one function). Are there anything to keep in mind? (I'm thinking that adding a variable store directly to the AutomataSystem should be enough)

Can someone help me with answering these questions?

Segmentation fault in a program with return in a nested loop

Translating the following program results in a segmentation fault:

int *a;
int b;
int c(int e, int *g) {
  for (;;)
    for (b = 1; e; b++)
      if (a[b]) {
        *g = b;
        return 1;
      }
}
int main() { int d, f = c(d, &f); }

This is caused by the value of variable being null in ModuleToAutomata.cpp:448

Support SV-COMP ReachSafety-Sequentialized with BMC

Add an option to generate C programs as test harnesses

Currently we can only write test harnesses in the LLVM bitcode or LLVM IR assembly formats. We could extend this with a more human-readable C program output.

Test Docker image with GH Actions

It would be nice to have a CI action that actually builds the docker image and runs it on a simple example. This could help us detect cases where we forget to update something in the docker image.

For Theta, we have a separate workflow. It consists of 4 jobs because Theta has 4 docker images, but here we'd need only one.

Support SV-COMP ReachSafety-Heap with Theta

This might be a longer-term plan. First step would be to have some partial support where at least we can parse and don't give wrong results (or harnesses can filter them).

Erroneous CFA generation vol 2

For input problem:

extern void abort(void); 
void reach_error(){}

extern char __VERIFIER_nondet_char(void);
void __VERIFIER_assert(int cond) {
  if (!(cond)) {
    ERROR: {reach_error();abort();}
  }
  return;
}
signed char gcd_test(signed char a, signed char b)
{
    signed char t;

    if (a < 0) a = -a;
    if (b < 0) b = -b;

    while (b != 0) {
        t = b;
        b = a % b;
        a = t;
    }
    return a;
}


int main()
{
    signed char x = __VERIFIER_nondet_char();
    signed char y = __VERIFIER_nondet_char();
    signed char g;

    if (y > 0 && x % y == 0) {
        g = gcd_test(x, y);

        __VERIFIER_assert(g == y);
    }

    return 0;
}

GAZER generates an erroneous CFA with command (that uses GAZER version in #39 ):

gazer-theta --domain PRED_CART --refinement NWT_IT_WP <file>

The slip-up happens during the translation of the variable swapping

t = b;
b = a % b;
a = t;

The corresponding generated CFA is:

loc24 -> loc18 {
        main_main_bb13__12_i_inlined0 := (((main_main_bb13__1_i_inlined0 bv_sign_extend bv[32]) bvsrem (main_main_bb13__12_i_inlined0 bv_sign_extend bv[32])))[8:0]
        main_main_bb13__1_i_inlined0 := main_main_bb13__12_i_inlined0
}

Which corresponds to:

b = a % b;
a = b (!!!!!!!!!!);

Detect irreducible control flow

Our CFA construction algorithm assumes that each function has a reducible control flow, i.e. each loop has a single dominating loop header, otherwise the CFA generation process will crash.

While a large majority of programs are reducible, it is possible to remove irreducible control flow using LLVM's StructurizeCFGPass, but this pass is rather expensive. Currently the pass is opt-in (through --structurize), but we could write a lightweight analysis which checks the program and applies StructurizeCFG if it is irreducible.

Error when using CMake with ninja

When CMake uses ninja as backend instead of UNIX Makefiles, the building process fails because it cannot download Z3.

The Z3 download process is configured as:

gazer/src/SolverZ3/CMakeLists.txt

Lines 7 to 16 in 0215081

 ExternalProject_Add(z3_download 

 PREFIX vendor 

 # Download 

 URL https://github.com/Z3Prover/z3/releases/download/z3-4.8.6/z3-4.8.6-x64-ubuntu-16.04.zip 

 URL_HASH SHA1=1663a7825c45f24642045dfcc5e100d65c1b6b1e 

 CONFIGURE_COMMAND "" 

 BUILD_COMMAND "" 

 UPDATE_COMMAND "" 

 INSTALL_COMMAND "" 

 )

It is a known issue, that countermeasures has to be made due to ninja's dependency scanning mechanism: https://stackoverflow.com/questions/50400592/using-an-externalproject-download-step-with-ninja

This issue should either be resolved, or marked as wontfix, and the documentation extended accordingly.

Use environment variables for configuration-related options

It is hard to set up a development environment for Gazer and Theta. Of course, one must do it only once, but it would be nice to improve this :)

The problem I've come across is: When running make check-functional, there is no way of passing where are the libraries and jar file. If, as it is for $GAZER_TOOLS_DIR, an environment variable was used, one would easily be able to use make check-functional and tools/gazer-theta without a bloated parameter list.

Developing theta and gazer side-by-side is a pain (as one must use that --theta-path parameter, which is impossible sometimes).

Create environment variables that are passed to Theta, instead of passing them as arguments.
This removes coupling to the certain versions of Theta (which might become a problem), and also helps configuring a workspace for developing Theta and Gazer.

This is related to only configuration, e.g. where libz3 resides or where to find the jar file.

Also, I'm okay with an let's do both way, that configuration passed as environment variables are overriden by the parameters. (Setting both --theta-path and (e.g.) THETA_JAR envvar the --theta-path takes precedence).

Limitation due to semantics of undef causes SVComp incorrect false results

I came across more than one SVComp task, that gives back incorrect false values, because the call of the error function depends on the value of a global variable, which wasn't explicitly initialized to any value.
The initialization of these static storage values is in the C11 standard, under 6.7.9/10:

"If an object that has automatic storage duration is not initialized explicitly, its value is
indeterminate. If an object that has static or thread storage duration is not initialized
explicitly, then:
— if it has pointer type, it is initialized to a null pointer;
— if it has arithmetic type, it is initialized to (positive or unsigned) zero;
— if it is an aggregate, every member is initialized (recursively) according to these rules,
and any padding is initialized to zero bits;
— if it is a union, the first named member is initialized (recursively) according to these
rules, and any padding is initialized to zero bits;"

Here's an example, which should be handled as a correct program, but gazer-bmc gives back a fail with the trace below:

int a;
void reach_error();
int main() {
  int b;
  if (b)
    b = a;
  if (b)
    reach_error();
}

Error trace:
------------
#0 in function main():
  Undefined behavior (read of an undefined value: #0bv32)	
  b := 1	(0b00000000000000000000000000000001)	 at 5:7
  Undefined behavior (read of an undefined value: #1bv32)

The other examples I've seen use global char arrays, but the issue is the same as here.

Start versioning and releasing

We should consider starting some versioning for Gazer. I suggest semantic versioning (e.g., v1.2.3), similarly to Boogie or Theta. What do you think @sallaigy?

Requesting traces with floats results in a crash

Consider the following program:

int b;
void __VERIFIER_error();
float a();
int main() {
    float c = a();
    b = c != c; // CHECK: b := 0
    if (!b)
        __VERIFIER_error();
}

Running it with gazer-bmc -trace results in a crash due to an unhandled expression in ExprEvaluator:

Iteration 1                                                                                                                                                                                                                                  
  Under-approximating.                                                                                                                                                                                                                       
    Transforming formula...                                                                                                                                                                                                                  
    Running solver...                                                                                                                                                                                                                        
      Elapsed time: 0 s                                                                                                                                                                                                                      
  Under-approximated formula is SAT.
Unhandled expression type in ExprEvaluatorBase
UNREACHABLE executed at /home/gyula/projects/gazer/src/Core/Expr/ExprEvaluator.cpp:48!

Basic format of the configuration description

Currently, the portfolio (scripts/gazer_starter.py) is hardcoded to 3 analyses. It would be a really nice feature to have a generic portfolio executor that can read the components of the portfolio from a configuration file (e.g. yml). For example, the current hardcoded analysis could be described something like this:

- bmc
  - timeout: 150
  - flags: "--inline all", "--bound 1000000"
  - harness: true
- theta
  - timeout: 100
  - flags: "--domain EXPL", "--search ERR", ...
  - harness: true
- theta
  - flags: "--domain PRED", "--inline all", ...
  - harness: true

The syntax can of course be improved, but you get the main idea. This tells the portfolio engine to start with a BMC with 150s timeout and the given flags, and in the end, also validate the harness. Then, if the result is inconclusive, run Theta with 100s timeout, and so on.

CPAchecker can also be a good example to look at.

LLVM optimization breaking Gazer with --math-int

Describe the bug

a < 0 || b < 0 becomes BvSlt (BvOr(a,b), 0) (in C it would be (a|b) < 0), which --math-int cannot handle too well...

Haven't tested Theta, but BMC gives a bad result.

To Reproduce

tools/gazer-bmc/gazer-bmc gcd1.c --memory=havoc --math-int --trace --print-final-module=output

#include <assert.h>
int gcd(int a, int b) {
        while (b != 0) {
                int t = b;
                b = a%b;
                a = t;
        }
        return a;
}

int __VERIFIER_nondet_int(void);

int main(void) {
        int a = __VERIFIER_nondet_int();
        int b = __VERIFIER_nondet_int();
        if (a < 0) {
                return 0;
        }
        if (b < 0) {
                return 0;
        }
        assert(gcd(a,b) == gcd(b,a));
}

module output:

define dso_local i32 @main() local_unnamed_addr #0 !dbg !27 {
bb:
  call void @gazer.function.entry(metadata !27)
  %tmp = call i32 @__VERIFIER_nondet_int() #4
  call void @llvm.dbg.value(metadata i32 %tmp, metadata !31, metadata !DIExpression())
  %tmp3 = call i32 @__VERIFIER_nondet_int() #4
  call void @llvm.dbg.value(metadata i32 %tmp3, metadata !32, metadata !DIExpression())
  %tmp4 = or i32 %tmp, %tmp3, !dbg !36 ; !!!
  %tmp5 = icmp slt i32 %tmp4, 0, !dbg !36
  br i1 %tmp5, label %bb10, label %bb6, !dbg !36

bb6:
  %tmp7 = call fastcc i32 @gcd(i32 %tmp, i32 %tmp3)
  call void @gazer.function.call_returned(metadata !27)
  %tmp8 = call fastcc i32 @gcd(i32 %tmp3, i32 %tmp)
  call void @gazer.function.call_returned(metadata !27)
  %tmp9 = icmp eq i32 %tmp7, %tmp8
  br i1 %tmp9, label %bb10, label %error

bb10:
  call void @gazer.function.return_value.i32(metadata !27, i32 0)
  ret i32 0

error:
  %error_phi = phi i16 [ 2, %bb6 ]
  call void @gazer.error_code(i16 %error_phi)
  unreachable
}

BMC sets a=-1 and b=1, which should have meant a successful result (a<0, so return).

This happens only without the --no-optimize flag.

Expected behavior

LLVM shouldn't merge the two if's... a<0 || b<0 is equivalent to (a|b) < 0, but with mathematical integers, the latter does not make sense...

Version (please complete the following information):

Tested on Gazer master branch

Assignment line number in trace wrong in a special case

In the case of expressions similar to this:
if (p2 != 0) {
lk2 = 1;
} else {}
the line number in trace will be the line number of the condition instead of the assignment.

Support SV-COMP ReachSafety-Heap with BMC

This might be tricky, but Gazer does have some sort of memory models

Erroneous Theta CFA generation

For input verif/memory/globals2.c (the relevant excerpt):

int a = __VERIFIER_nondet_int();
__VERIFIER_assume(a < INT_MAX - 1);

the generated CFA is the following (the relevant excerpt):

loc2 -> loc8 {
        assume (main_tmp bvslt 32'b01111111111111111111111111111110)
        main_Memory_0_mem := [<bv[64]>default <- 8'd0]
        havoc main_tmp
}

The command used to invoke gazer-theta: tools/gazer-theta/gazer-theta --domain EXPL --refinement NWT_IT_WP --maxenum 10 -no-inline-globals test/theta/verif/memory/globals2.c --trace

As it can be seen, the in the CFA the order of __VERIFIER_assume and __VERIFIER_nondet_int is swapped, which alters the meaning of the source.

Upgrade to Theta v1.6.0

Theta v1.6.0 contains some bugfixes and a new value for the --initprec argument. There are no breaking changes so it should be an easy upgrade.

Rethink building procedure

Linking with LLVM is heavy operation and for development, it would be good to not link LLVM to GazerLLVM module, rather only link (of course, still dynamically) to the main binary.
Plus this allows us to use LLVM's opt utility which is useful for debugging. (opt already links LLVM: the error is similar to this: ziglang/zig#67 )

Gitignore for portfolio

Executing Portfolio test created a file that is not ignored by git.

It seems that the file test/portfolio/0 is supposed to be a logfile of some kind. Maybe this file is created in test/portfolio/ instead of inside test/portfolio/Output?

Target type and integer cast error on several tasks

Describe the bug
Gazer is unable to produce test harnesses on several SSH SVComp tasks. Fails on the following assert:
gazer/src/LLVM/Trace/TestHarnessGenerator.cpp:32: llvm::Constant* exprToLLVMValue(gazer::ExprRef<gazer::AtomicExpr>&, llvm::LLVMContext&, llvm::Type*): Assertion targetTy->isIntegerTy() || (!expr->getType().isBvType() && !expr->getType().isIntType() && !expr->getType().isBoolType()) && "Bitvectors, integers, and booleans may only have integer as their target type!"' failed.

When using --no-optimize with the same command the run fails earlier when generating the CFA with an Invalid integer cast type!

Output from both:
no-optimize-output.txt
test_harness_output.txt

My only guess is that it has to do something with really big unsigned integer values, as those are frequently used in these tasks.

To Reproduce
Run gazer/tools/gazer-theta/gazer-theta --inline all --search ERR --domain PRED_CART --refinement BW_BIN_ITP --initprec EMPTY --trace -test-harness=s3_clnt.blast.01.i.cil-2.c.ll sv-benchmarks/c/ssh/s3_clnt.blast.01.i.cil-2.c
using this task.
(Take into account, that the sv-benchmarks repo is huge, so cloning it only for this task might not be practical.)

Expected behavior
Generating a test harness.

Version:
Gazer master branch, Theta 2.5

Additional context
I only included one example, but this happens with most of these tasks and with tasks from other sets as well.

Support SV-COMP ReachSafety-Combinations with Theta

See #61

Support SV-COMP ReachSafety-Arrays with BMC

I don't know if it actually works or not, we should try first.

Enable SonarCloud analysis

Using constant zero-aggregate initializers results in a crash

LLVM represents zero-initialized aggregate values with the ConstantAggregateZero class, for which do not have support yet in the memory models, causing a crash.

LLVM IR to reproduce:

; RUN: %bmc -skip-pipeline -bound 1 "%s" | FileCheck "%s"
; CHECK: Verification SUCCESSFUL

@a = internal unnamed_addr global [2 x double] zeroinitializer, align 16

define dso_local i32 @main() {
bb:
  %tmp = load i64, i64* bitcast ([2 x double]* @a to i64*), align 16
  store i64 %tmp, i64* bitcast (double* getelementptr inbounds ([2 x double], [2 x double]* @a, i64 0, i64 1) to i64*), align 8
  ret i32 0
}

Exhibited behavior:

Unhandled value type
UNREACHABLE executed at /home/gyula/projects/gazer/src/LLVM/Automaton/InstToExpr.cpp:898!

Support large int literals in Theta backend

Previously Theta could only parse 32-bit integer literals, but with 1.7 (see #29) it should not be a limitation. We should check if Gazer has to be modified internally to be able to produce larger literals.

Missing support for Rem expressions during Theta CFA generation

The expression transformer in tools/gazer-theta/lib/ThetaExpr.cpp does not handle Rem expressions, resulting in warnings and invalid Theta CFAs, e.g.:

Unhandled expr Int Rem(Int main/main/bb1165/a17.0_inlined0,71)
Unhandled expr Int Rem(Int main/main/bb1165/a17.0_inlined0,299872)
Unhandled expr Int Rem(Int Add(Int main/main/bb1165/a2.0_inlined0,252809),45)
[...]
Verification INTERNAL ERROR.
  Theta returned unrecognizable output. Raw output is:
Exception occurred, message: Could not parse CFA: Line 10006 col 48: Identifier '__UNHANDLED_EXPR__' cannot be resolved

Download fixed Theta binary instead of clone and build

Since Theta is now versioned and binary releases are published, we can just wget the (jar and so) binary files instead of cloning and building Theta. This applies both to Travis CI and the Docker image. Besides shorter build time, this would also have the advantage that from now, Gazer would use a fixed version of Theta, avoiding breaking Gazer if there is a breaking change in the current master status of Theta. What do you think @sallaigy? This should be an easy fix, I can do it, but first I'd like your opinion.

Upgrade to Theta v1.3.0

Theta v1.3.0 has a new flag --prunestrategy.

Support SV-COMP ReachSafety-BitVectors with Theta

See #39

Boost fs not found in docker build

Building the docker image gives the following error:

[ 70%] Building CXX object tools/gazer-theta/CMakeFiles/gazer-theta.dir/gazer-theta.cpp.o
/home/user/gazer/tools/gazer-theta/gazer-theta.cpp:107:9: error: no member named 'fs' in namespace 'boost::dll'; did you mean 'llvm::sys::fs'?
        boost::dll::fs::error_code ec;
        ^~~~~~~~~~~~~~
        llvm::sys::fs
/usr/lib/llvm-9/include/llvm/Support/raw_ostream.h:34:11: note: 'llvm::sys::fs' declared here
namespace fs {
          ^
/home/user/gazer/tools/gazer-theta/gazer-theta.cpp:107:25: error: no type named 'error_code' in namespace 'llvm::sys::fs'
        boost::dll::fs::error_code ec;
                    ~~~~^
2 errors generated.
make[2]: *** [tools/gazer-theta/CMakeFiles/gazer-theta.dir/gazer-theta.cpp.o] Error 1
tools/gazer-theta/CMakeFiles/gazer-theta.dir/build.make:62: recipe for target 'tools/gazer-theta/CMakeFiles/gazer-theta.dir/gazer-theta.cpp.o' failed
CMakeFiles/Makefile2:825: recipe for target 'tools/gazer-theta/CMakeFiles/gazer-theta.dir/all' failed
make[1]: *** [tools/gazer-theta/CMakeFiles/gazer-theta.dir/all] Error 2
make: *** [all] Error 2
Makefile:129: recipe for target 'all' failed
The command '/bin/sh -c cmake -DCMAKE_CXX_COMPILER=clang++-9 -DGAZER_ENABLE_UNIT_TESTS=On -DCMAKE_BUILD_TYPE=Debug -DCMAKE_EXPORT_COMPILE_COMMANDS=On . && make' returned a non-zero code: 2

Support SV-COMP ReachSafety-Arrays with Theta

Theta does have some support for arrays thanks to @as3810t, but I'm not sure if translation in Gazer is implemented.

Jumping out of a loop from two different blocks to a same exit block breaks the translation of PHI values

If a loop has two exiting blocks (that is, blocks that are still inside the loop) branching to a single exit block (which is outside), the CFA translation process does not handle PHI nodes properly in this case, leading to faulty verification engine behavior.

LLVM IR test case to reproduce:

; RUN: %bmc -bound 1 -skip-pipeline "%s" | FileCheck
; CHECK: Verification {{SUCCESSFUL|BOUND REACHED}}

define internal void @a(i32 %arg) {
bb:
  %tmp = icmp ne i32 %arg, 0
  call void @llvm.assume(i1 %tmp)
  ret void
}

define i32 @main() {
bb:
  br label %bb3

bb3:                                              ; preds = %.split.split, %bb
  %b.0 = phi float [ 1.600000e+01, %bb ], [ %tmp4, %.split.split ]
  %tmp = fmul float %b.0, 3.000000e+00
  %tmp4 = fmul float %tmp, 2.500000e-01
  %tmp5 = call i1 @gazer.dummy_nondet.i1()
  br i1 %tmp5, label %a_fail, label %.split

.split:                                           ; preds = %bb3
  call void @a(i32 12)
  %tmp6 = fptosi float %tmp4 to i32
  %tmp7 = call i1 @gazer.dummy_nondet.i1()
  br i1 %tmp7, label %a_fail, label %.split.split

.split.split:                                     ; preds = %.split
  call void @a(i32 %tmp6)
  br label %bb3

a_fail:                                           ; preds = %.split, %bb3
  %tmp8 = phi i32 [ 12, %bb3 ], [ %tmp6, %.split ]
  %tmp9 = icmp eq i32 %tmp8, 0
  br i1 %tmp9, label %error1, label %bb10

bb10:                                             ; preds = %a_fail
  br label %UnifiedUnreachableBlock

error1:                                           ; preds = %a_fail
  call void @gazer.error_code(i16 1)
  br label %UnifiedUnreachableBlock

UnifiedUnreachableBlock:                          ; preds = %error1, %bb10
  unreachable
}

declare void @gazer.error_code(i16) local_unnamed_addr

declare void @gazer.dummy.void() local_unnamed_addr

declare i1 @gazer.dummy_nondet.i1() local_unnamed_addr

declare void @llvm.assume(i1)

Can I get the generated CFA file?

Dear Developer.
I noticed your tool by reading the paper "Gazer-Theta: LLVM-based Verier Portfolio with BMC/CEGAR (Competition Contribution)" and I found that the LLVM IR is converted into a CFA.
I would like to know how do I get the CFA file?
Also, why does your tool not support C++? Isn't it based on IR?

Move CI to GH Actions

We could move to GH Actions from Travis, as it provides a tighter integration

	ExternalProject_Add(z3_download
	PREFIX vendor
	# Download
	URL https://github.com/Z3Prover/z3/releases/download/z3-4.8.6/z3-4.8.6-x64-ubuntu-16.04.zip
	URL_HASH SHA1=1663a7825c45f24642045dfcc5e100d65c1b6b1e
	CONFIGURE_COMMAND ""
	BUILD_COMMAND ""
	UPDATE_COMMAND ""
	INSTALL_COMMAND ""
	)