Giter VIP home page Giter VIP logo

sea-dsa's Introduction

SeaDsa: A Points-to Analysis for Verification of Low-level C/C++

SeaDsa is a context-, field-, and array-sensitive unification-based points-to analysis for LLVM bitcode inspired by DSA. SeaDsa is an order of magnitude more scalable and precise than Dsa and a previous implementation of SeaDsa thanks to improved handling of context sensitivity, addition of partial flow-sensitivity, and type-awareness.

Although SeaDsa can analyze arbitrary LLVM bitcode, it has been tailored for use in program verification of C/C++ programs. It can be used as a stand-alone tool or together with the SeaHorn verification framework and its analyses.

This branch supports LLVM 14.

Requirements

SeaDsa is written in C++ and uses the Boost library. The main requirements are:

  • C++ compiler supporting c++14
  • Boost >= 1.65
  • LLVM 14

To run tests, install the following packages:

  • sudo pip install lit OutputCheck
  • sudo easy_install networkx
  • sudo apt-get install libgraphviz-dev
  • sudo easy_install pygraphviz

Project Structure

  1. The main Points-To Graph data structures, Graph, Cell, and Node, are defined in include/Graph.hh and src/Graph.cc.
  2. The Local analysis is in include/Local.hh and src/DsaLocal.cc.
  3. The Bottom-Up analysis is in include/BottomUp.hh and src/DsaBottomUp.cc.
  4. The Top-Down analysis is in include/TopDown.hh and src/DsaTopDown.cc.
  5. The interprocedural node cloner is in include/Cloner.hh and src/Clonner.cc.
  6. Type handling code is in include/FieldType.hh, include/TypeUtils.hh, src/FieldType.cc, and src/TypeUtils.cc.
  7. The allocator function discovery is in include/AllocWrapInfo.hh and src/AllocWrapInfo.cc.

Compilation and Usage

Program Verification benchmarks

Instructions on running program verification benchmarks, together with recipes for building real-world projects and our results, can be found in tea-dsa-extras.

Integration in other C++ projects (for users)

SeaDsa contains two directories: include and src. Since SeaDsa analyzes LLVM bitcode, LLVM header files and libraries must be accessible when building with SeaDsa.

If your project uses cmake then you just need to add in your project's CMakeLists.txt:

 include_directories(seadsa/include)
 add_subdirectory(seadsa)

Standalone (for developers)

If you already installed llvm-14 on your machine:

mkdir build && cd build
cmake -DCMAKE_INSTALL_PREFIX=run -DLLVM_DIR=__here_llvm-14__/share/llvm/cmake  ..
cmake --build . --target install

Otherwise:

mkdir build && cd build
cmake -DCMAKE_INSTALL_PREFIX=run ..
cmake --build . --target install

To run tests:

cmake --build . --target test-sea-dsa

Visualizing Memory Graphs and Complete Call Graphs

Consider a C program called tests/c/simple.c:

#include <stdlib.h>

typedef struct S {
  int** x;
  int** y;  
} S;

int g;

int main(int argc, char** argv){

  S s1, s2;

  int* p1 = (int*) malloc(sizeof(int));
  int* q1 = (int*) malloc(sizeof(int));  
  s1.x = &p1;
  s1.y = &q1;    
  *(s1.x) = &g;
  
  return 0;
}   
  1. Generate bitcode:

     clang -O0 -c -emit-llvm -S tests/c/simple.c -o simple.ll
    

The option -O0 is used to disable clang optimizations. In general, it is a good idea to enable clang optimizations. However, for trivial examples like simple.c, clang simplifies too much so nothing useful would be observed. The options -c -emit-llvm -S generate bitcode in human-readable format.

  1. Run sea-dsa on the bitcode and print memory graphs to dot format:

     seadsa -sea-dsa=butd-cs -sea-dsa-type-aware -sea-dsa-dot  simple.ll
    

The options -sea-dsa=butd-cs -sea-dsa-type-aware enable the analysis implemented in our FMCAD'19 paper (see References). This command will generate a FUN.mem.dot file for each function FUN in the bitcode program. In our case, the only function is main and thus, there is one file named main.mem.dot. The file is generated in the current directory. If you want to store the .dot files in a different directory DIR then add the option -sea-dsa-dot-outdir=DIR

  1. Visualize main.mem.dot by transforming it to a pdf file:

     dot -Tpdf main.mem.dot -o main.mem.pdf
     open main.mem.pdf  // replace with you favorite pdf viewer 
    

Example of a memory graph

In our memory model, a pointer value is represented by a cell which is a pair of a memory object and offset. Memory objects are represented as nodes in the memory graph. Edges are between cells.

Each node field represents a cell (i.e., an offset in the node). For instance, the node fields <0,i32**> and <8,i32**> pointed by %6 and %15, respectively are two different cells from the same memory object. The field <8,i32**> represents the cell at offset 8 in the corresponding memory object and its type is i32**. Black edges represent points-to relationships between cells. They are labeled with a number that represents the offset in the destination node. Blue edges connect formal parameters of the function with a cell. Purple edges connect LLVM pointer variables with cells. Nodes can have markers such as S (stack allocated memory), H (heap allocate memory), M (modified memory), R (read memory), E (externally allocated memory), etc. If a node is red then it means that the analysis lost field sensitivity for that node. The label {void} is used to denote that the node has been allocated but it has not been used by the program.

sea-dsa can also resolve indirect calls. An indirect call is a call where the callee is not known statically. sea-dsa identifies all possible callees of an indirect call and generates a LLVM call graph as output.

Consider this example in tests/c/complete_callgraph_5.c:

struct class_t;
typedef int (*FN_PTR)(struct class_t *, int);
typedef struct class_t {
  FN_PTR m_foo;
  FN_PTR m_bar;
} class_t;

int foo(class_t *self, int x)
{
  if (x > 10) {
    return self->m_bar(self, x + 1);
  } else
    return x;
}

int bar (class_t *self, int y) {
  if (y < 100) {
    return y + self->m_foo(self, 10);
  } else
    return y - 5;
}

int main(void) {
  class_t obj;
  obj.m_foo = &foo;
  obj.m_bar = &bar;
  int res;
  res = obj.m_foo(&obj, 42);
  return 0;
}

Type the commands:

clang -c -emit-llvm -S tests/c/complete_callgraph_5.c  -o ex.ll
sea-dsa --sea-dsa-callgraph-dot ex.ll

It generates a .dot file called callgraph.dot in the current directory. Again, the .dot file can be converted to a .pdf file and opened with the commands:

dot -Tpdf callgraph.dot -o callgraph.pdf
open callgraph.pdf  

Example of a call graph

sea-dsa can also print some statistics about the call graph resolution process (note that you need to call clang with -g to print file,line, and column information):

sea-dsa --sea-dsa-callgraph-stats ex.ll


=== Sea-Dsa CallGraph Statistics === 
** Total number of indirect calls 0
** Total number of resolved indirect calls 3

%16 = call i32 %12(%struct.class_t* %13, i32 %15) at tests/c/complete_callgraph_5.c:14:12
RESOLVED
Callees:
  i32 bar(%struct.class_t*,i32)
  
%15 = call i32 %13(%struct.class_t* %14, i32 10) at tests/c/complete_callgraph_5.c:23:16
RESOLVED
Callees:
  i32 foo(%struct.class_t*,i32)
  
%11 = call i32 %10(%struct.class_t* %2, i32 42) at tests/c/complete_callgraph_5.c:36:9
RESOLVED
Callees:
  i32 foo(%struct.class_t*,i32)

Dealing with C/C++ library and external calls

The pointer semantics of external calls can be defined by writing a wrapper that calls any of these functions defined in seadsa/seadsa.h:

  • extern void seadsa_alias(const void *p, ...);
  • extern void seadsa_collapse(const void *p);
  • extern void seadsa_mk_seq(const void *p, unsigned sz);

seadsa_alias unifies all argument's cells, seadsa_collapse tells sea-dsa to collapse (i.e., loss of field-sensitivity) the cell pointed by p, and seadsa_mk_seq tells sea-dsa to mark as sequence the node pointed by p with size sz.

For instance, consider an external call foo defined as follows:

extern void* foo(const void*p1, void *p2, void *p3);

Suppose that the returned pointer should be unified to p2 but not to p1. In addition, we would like to collapse the cell corresponding to p3. Then, we can replace the above prototype of foo with the following definition:

#include "seadsa/seadsa.h"
void* foo(const void*p1, void *p2, void*p3) {
	void* r = seadsa_new();
	seadsa_alias(r,p2);
	seadsa_collapse(p3);
	return r;
}

References

  1. "A Context-Sensitive Memory Model for Verification of C/C++ Programs" by A. Gurfinkel and J. A. Navas. In SAS'17. (Paper) | (Slides)

  2. "Unification-based Pointer Analysis without Oversharing" by J. Kuderski, J. A. Navas and A. Gurfinkel. In FMCAD'19. (Paper)

sea-dsa's People

Contributors

adrianherrera avatar agurfinkel avatar caballa avatar danblitzhou avatar ianamason avatar igcontreras avatar kuhar avatar pietroborrello avatar priyasiddharth avatar pwaller avatar sei-eschwartz avatar shaobo-he avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

sea-dsa's Issues

Segmentation fault when running examples on the dev9 branch

Hi sea-dsa developers,

I ran into a segmentation fault when running the example in the README on the dev9 branch. The error message I got is,

Pass 'SeaHorn Dsa graph printer' is not initialized.
Verify if there is a pass dependency cycle.
Required Passes:
/usr/lib/llvm-9/lib/libLLVM-9.so.1(_ZN4llvm3sys15PrintStackTraceERNS_11raw_ostreamE+0x1f)[0x7f167d78035f]
/usr/lib/llvm-9/lib/libLLVM-9.so.1(_ZN4llvm3sys17RunSignalHandlersEv+0x22)[0x7f167d77e752]
/usr/lib/llvm-9/lib/libLLVM-9.so.1(+0xa38761)[0x7f167d780761]
/lib/x86_64-linux-gnu/libc.so.6(+0x3efd0)[0x7f167c60cfd0]
/usr/lib/llvm-9/lib/libLLVM-9.so.1(_ZN4llvm17PMTopLevelManager12schedulePassEPNS_4PassE+0x19b)[0x7f167d87a53b]
run/bin/seadsa(+0x13e57)[0x563387f99e57]
Stack dump:
0.      Program arguments: run/bin/seadsa -sea-dsa=butd-cs -sea-dsa-type-aware -sea-dsa-dot simple.ll
Segmentation fault (core dumped)

Null graph node in context insensitive analysis

Hello,

I am using SeaDsa's context-insensitive analysis in a project. When I analyze the program below, a null node in the graph from the local analysis pass is produced for the function @fun. This causes a segmentation fault in the context-insensitive analysis pass.

The crash occurs because of the i8* undef in the returned value.

Is there a reason this pass should produce a null graph?

Thanks,
Mark

; Original function
define internal { i8*, i8* } @fun(i8* %0) {
  ret { i8*, i8* } { i8* null, i8* undef }
}

; As transformed by tool
; define internal { i8*, i8* } @fun(i8* %0) {
;   %2 = alloca { i8*, i8* }
;   %3 = getelementptr { i8*, i8* }, { i8*, i8* }* %2, i32 0, i32 0
;   store i8* null, i8** %3
;   %4 = getelementptr { i8*, i8* }, { i8*, i8* }* %2, i32 0, i32 1
;   store i8* undef, i8** %4
;   %5 = getelementptr { i8*, i8* }, { i8*, i8* }* %2, i32 0, i32 0
;   %6 = load i8*, i8** %5
;   %7 = insertvalue { i8*, i8* } undef, i8* %6, 0
;   %8 = getelementptr { i8*, i8* }, { i8*, i8* }* %2, i32 0, i32 1
;   %9 = load i8*, i8** %8
;   %10 = insertvalue { i8*, i8* } %7, i8* %9, 1
;   ret { i8*, i8* } %10
; }

define dso_local i32 @main() {
  %1 = call { i8*, i8* } @fun(i8* null)
  ret i32 0
}

Oversharing with offset collapsed but types known

Program: oversharing.txt -- uploaded as txt because they dont allow .c or .i.

Steps to reproduce:

$ mv oversharing.txt oversharing.i
$ sea smt -O2 --horn-nondet-undef --sea-dsa-type-aware --dsa=sea-cs --horn-stats --horn-singleton-aliases=true --horn-ignore-calloc=true --horn-ignore-memset=true --horn-shadow-mem-optimize=false oversharing.i --oll=oversharing.ll
$ seadsa --sea-dsa-type-aware --sea-dsa=cs --sea-dsa-color-func-sim-dot --sea-dsa-dot oversharing.ll 
$ dot -Tpdf tulip_down.TD.mem.dot -o tulip_down.TD.mem.dot.pdf
$ dot -Tpdf tulip_down.BU.mem.dot -o tulip_down.BU.mem.dot.pdf

In the 2 pdfs produced, nodes with the same color in the BU are simulated by a node with the same color in the TD graph. The gray nodes in the TD graphs are not reachable if exploring it via the BU graph with the simulation.

Bottom-up graph:
tulip_down.BU.mem.dot.pdf

Top-down graph:
tulip_down.TD.mem.dot.pdf

Better support for external calls

The current assumption is that an external call cannot allocate memory or unify pointers. If the external call returns a pointer, the address is a non-deterministic value.

We propose to add support for two c functions:

  • sea_dsa_alias(p1,...,pn)
  • sea_dsa_collapse(p)

To change the above default assumption for an external call to foo, the user must provide a c file where foo is only defined in terms of the two above functions (sea_dsa_alias and sea_dsa_collapse) along with the input file(s).

In that way, the front-end will link together the bc file with the definition of foo and the input file(s) where foo is called so that foo will be no longer an external call.

  • Support default implementations for common C/C++ functions without modifying the user files. Look at projects like musl and uClibc

Upgrading to LLVM 11?

Hello sea-dsa developers,

I'm seeing a need to upgrade SMACK to LLVM 11 which would require sea-dsa to do it as well. So I was wondering if you have plans to address this matter. If not, I'm interested in creating a PR.

Btw, based on the experiments I've done so far, the most relevant change is to replace LLVM's CallSite with some other class (presumably CallBase) since the former has been removed.

Question about the result on one program

Hello SeaDsa developers,

Sorry to bother you again. I also attempted to use SeaDsa (the llvm-8.0 branch) on another program (this time I compiled it with -O0) and the resulting graphs seem spurious, too. For example,

#include <assert.h> 

int* A;

typedef struct {
  int* a;
  long b;
} S;

S foo() {
  S x = {A, 2L};
  assert(1);
  return x;
}

int main() {
  int a = 1;
  A = &a;
  S y = foo();
  assert(*(y.a) == 1);
  return 0;
}

The graph is obtained using the following commands,

clang -O0 -c -emit-llvm -S test2.c
~/sea-dsa/build/run/bin/seadsa -sea-dsa=butd-cs -sea-dsa-type-aware -sea-dsa-dot test2.ll  -sea-dsa-dot-outdir=test2
(cd test2 ; dot -Tpdf main.mem.dot -o main.mem.pdf)

main.mem.pdf

It seems *A and the pointer extracted from the return value of foo point to different memory objects. If I understand it correctly, they should point to the same object.

Please let me know if I missed anything or this is expected or not.

Imprecise call graph

Hello sea-dsa developers,

I used the dev10 branch to produce the call graph for the following program. I think it appears incorrect since main should call either one of the two functions indirectly.

int __incr(int x) { return ++x; }

int __decr(int x) { return --x; }

#ifdef __MACH__
int (*incr)(int) = __incr;
int (*decr)(int) = __decr;
#else
int incr(int) __attribute__((alias("__incr")));
int decr(int) __attribute__((alias("__decr")));
#endif

int main(void) {
  int (*fp)(int);
  int x = 1, y = 1;

  if (y > 0) {
    fp = incr;
  } else {
    fp = decr;
  }
  x = fp(x);

  return (x == 2);
}

To reproduce, please use the following commands,

clang func_ptr_alias1.c -c -emit-llvm -S
seadsa --sea-dsa-callgraph-dot func_ptr_alias1.ll

This is the call graph I got,

callgraph.pdf

FlatGraph does not support offset collapsing

Under FlatGraph, there is only one dsa node. (i.e., mkNode() always returns the same object).

However, Node::collapseOffsets() calls Node::pointTo, and pointTo requires two distinct nodes. Thus every attempt to call collapseOffsets with FlatGraph causes an assertion failure.

Missing nodes for pointer variables

Hello sea-dsa developers,

While I was debugging a soundness issue when using sea-dsa for SMACK, I couldn't find a node that corresponds to a pointer variable. In demo.ll, %24 of function _ZN9lru_cacheC2Ev is used as the pointer argument to a store instruction,

%24 = getelementptr inbounds %"class.std::__1::__libcpp_compressed_pair_imp.4", %"class.std::__1::__libcpp_compressed_pair_imp.4"* %23, i32 0, i32 0, !dbg !11275, !verifier.code !6341
%25 = bitcast %"struct.std::__1::__hash_node"*** %24 to i8*, !dbg !11275      
%26 = bitcast i8* inttoptr (i64 8 to i8*) to i8*, !dbg !11275                 
call void @__SMACK_check_memory_safety(i8* %25, i8* %26), !dbg !11275         
store %"struct.std::__1::__hash_node"** null, %"struct.std::__1::__hash_node"*** %24, align 8, !dbg !11275, !verifier.code !6341

I would expect it to show up in the memory graph (i.e., sgx_fopen.mem.dot),

% grep "%24 = " sgx_fopen.mem.dot
	Node0x1c4baa0 [shape=plaintext, label ="  %24 = call i8* @strncpy(i8* %22, i8* %2, i64 %23) #15, !dbg !6375, !verifier.code !6339"];
	Node0x1c6e738 [shape=plaintext, label ="  %24 = getelementptr inbounds %struct._key_request_t, %struct._key_request_t* %0, i32 0, i32 9, !dbg !6379, !verifier.code !6344"];
	Node0x201e0e8 [shape=plaintext, label ="  %24 = getelementptr inbounds %\"struct.std::__1::__list_node_base\", %\"struct.std::__1::__list_node_base\"* %23, i32 0, i32 1, !dbg !6374, !verifier.code !6339"];
	Node0x1dd98f8 [shape=plaintext, label ="  %24 = getelementptr inbounds %class.protected_fs_file, %class.protected_fs_file* %0, i32 0, i32 13, !dbg !6349, !verifier.code !6341"];

I'm using the dev10 branch and the following command to generate the memory graph (I had to change lib/seadsa/DsaPrinter.cc to let it find the graph since the entrypoint is sgx_fopen instead of `main).

seadsa -sea-dsa=ci -sea-dsa-dot demo.ll

Please let me know if it's expected or I'm missing anything. Thank you.

debug.zip

compilation issues with llvm 3.8 and ubuntu trusty

I am using fresh ubuntu trusty (14.04 x86_64) on vagrant.

Install the necessary dependencies.

$ sudo aptitude install build-essential bash-completion git subversion libboost-all-dev cmake3 libtinfo-dev

The need for subversion is a bug in the build system

Get our pre-build LLVM. If not using the binaries, make sure to follow
the compilation recipe in our CMakeLists.txt!

$ wget https://ece.uwaterloo.ca/~agurfink/llvm/LLVM-3.8.1-Linux.tar.gz
$ tar xvf https://ece.uwaterloo.ca/~agurfink/llvm/LLVM-3.8.1-Linux.tar.gz

Get sea-dsa from github

$ git clone [email protected]:seahorn/sea-dsa.git

Configure and build debug and release modes

$ cd sea-dsa
$ mkdir build && cd build
$ cmake -DLLVM_DIR=[YOURPATH]/LLVM-3.8.1-Linux/share/llvm/cmake/ -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=run ../
$ make install
$ cd ../
$ mkdir debug && cd debug
$ cmake -DLLVM_DIR=[YOURPATH]/LLVM-3.8.1-Linux/share/llvm/cmake/ -DCMAKE_BUILD_TYPE=Debug -DCMAKE_INSTALL_PREFIX=run ../
$ make install
$ cd ..

I know that it is often a bad idea to use different build parameters
for an LLVM project (like sea-dsa) and LLVM itself. In particular,
the include headers would be interpreted with debug info enabled even
though the libraries are compiled without debug information
enabled. However, I believe we have tested it with clean debug builds
as well so this should not be an issue here (but it could!).

Run release version

$ ./build/run/bin/seadsa --sea-dsa=ci -dsa-stats ./tests/test-1.ci.ll

Output (bad)

WARNING: f has not Dsa graph
WARNING: g has not Dsa graph
WARNING: main has not Dsa graph
 ========== Begin SeaHorn Dsa info  ==========
 --- Memory access information
	0 number of read or modified nodes.
	0 number of non-trivial memory accesses (load/store/memcpy/memset/memmove).
	 Summary of the 5 nodes with more accesses:
 --- Type information
	0 number of typed nodes.
	0 number of untyped nodes.
	0 number of collapsed nodes.
 --- Allocation site information
	0 number of allocation sites
	   0 max number of allocation sites in a node
	0 number of nodes without allocation site
	0 number of memory accesses without allocation site
	0 number of nodes with multi-typed allocation sites
 ========== End SeaHorn Dsa info  ==========
$ ./debug/run/bin/seadsa --sea-dsa=ci -dsa-stats ./tests/test-1.ci.ll

Output (good)

========== Begin SeaHorn Dsa info  ==========
 --- Memory access information
	1 number of read or modified nodes.
	2 number of non-trivial memory accesses (load/store/memcpy/memset/memmove).
	 Summary of the 5 nodes with more accesses:
	  [Node Id 11] 2 (100%) of total memory accesses
 --- Type information
	1 number of typed nodes.
	0 number of untyped nodes.
	0 number of collapsed nodes.
 --- Allocation site information
	4 number of allocation sites
	   4 max number of allocation sites in a node
	0 number of nodes without allocation site
	0 number of memory accesses without allocation site
	0 number of nodes with multi-typed allocation sites
 ========== End SeaHorn Dsa info  ==========

The problem seems to be at DsaGlobal.cc:93 since the SCC appears
empty in release build and non-empty in debug.

Need to create a simpler example that only constructs a call graph and
illustrates the problem with less dependencies on our code.

Missing the case when constant expression gep with an inttoptr as base pointer

Assertion violation:

Cell not found for gep:	i8* getelementptr (i8, i8* inttoptr (i64 256 to i8*), i64 -2401263026318606336)
	ptr: i8* inttoptr (i64 256 to i8*)Assertion failed: (false && "No cell for gep'd ptr"), function visitGep, file ../src/DsaLocal.cc, line 770.

Test case provided by Jack Huang.

Command to reproduce:

/seadsa -sea-dsa=butd-cs -sea-dsa-type-aware -sea-dsa-dot nvme_block.bc

nvme_block.bc.zip

Print the dot file for a graph

Hello sea-dsa developers,

I'd like to print the dot file for a graph in an internal pass of SMACK. It seems that the only proper API to do this is createDsaPrinterPass. Using this function requires the pass manager to be available, which I don't know how to get within an LLVM pass. So I was wondering if you could share some pointers about how I should proceed. I think that declaring the pass createDsaPrinterPass returns in a header file should work.

Question about external nodetype

Hello sea-dsa developers,

Consider the following program,

#include <stdlib.h>
#include <stdbool.h>

void __VERIFIER_assert(bool);

void foo(int *);
int *bar();

int main() {
  int *x = malloc(4);
  int *y;

  foo(x);
  y = bar();

  *x = 1;
  *y = 2;

  __VERIFIER_assert(x != y);
}
clang-8 -O0 -c -emit-llvm -S -Xclang -disable-O0-optnone extern_mem_fail.c
opt-8 extern_mem_fail.ll -mem2reg -S -o extern_mem_fail.ll
seadsa -sea-dsa=ci -sea-dsa-type-aware -sea-dsa-dot extern_mem_fail.ll

main.mem.pdf

The malloced node is not labeled as an external node as opposed to what llvm-dsa does. I was wondering if it should be since x is passed to an external function foo, which may store this pointer elsewhere such that bar can obtain it and return it.

empty labels in the complete call graph

Hi all, I using the sea-dsa analysis framework as a library and it works fine. I came across an issue where when printing CompleteCallGraph into a dot format, it creates empty labels. This does not seem to be the case in the given examples in the "tests/expected_graphs" directory. I am using MacOs and LLVM 8 to run the analysis. Is this problem related to this settings?

Selecting program entry point

Hello sea-dsa developers,

I noticed that the entry point of the program to analyze is supposed to be main. For example, the following lines indicate that global variables are only analyzed when main is found,

  if (F.getName().equals("main")) {
    GlobalBuilder globalBuilder(F, g, m_dl, tli, m_allocInfo);
    globalBuilder.initGlobalVariables();
  }

However, the applications I'm working on don't have main as the entry points. So I was wondering if it's possible to add an option like --entry-point to select the program entry point.

Question about building sea-dsa

Hello sea-dsa developers,

I encountered errors in building sea-dsa inside SMACK. I simply added add_subdirectory(sea-dsa) to SMACK's CMakeLists.txt file and cmake will complain something like Unknown CMake command "add_llvm_library".. I think this is because sea-dsa requires definitions from LLVM, which are provided using these commands (https://github.com/seahorn/sea-dsa/blob/master/CMakeLists.txt#L87-L92). However, these commands only take effects when sea-dsa is built at the top level. So I was wondering if these lines should be moved below the if block where they are currently in.

ShadowMem is creating a badref

Cmd: seadsa --sea-dsa-shadow-mem --log=shadow_verbose ../issue104.ll

.split:                                           ; preds = %22
  %23 = load i32, i32* <badref>
  call void @shadow.mem.out(i32 19, i32 %23, i32 0, i8* null), !shadow.mem !5, !shadow.mem.use !5
  call void @shadow.mem.in(i32 11, i32 %sm6, i32 1, i8* bitcast (i32* @y to i8*)), !shadow.mem !5, !shadow.mem.use !5
  call void @shadow.mem.out(i32 11, i32 %shadow.mem.11.0, i32 1, i8* bitcast (i32* @y to i8*)), !shadow.mem !5, !shadow.mem.use !5
  call void @shadow.mem.in(i32 15, i32 %sm7, i32 2, i8* bitcast (i32* @x to i8*)), !shadow.mem !5, !shadow.mem.use !5
  call void @shadow.mem.out(i32 15, i32 %shadow.mem.15.0, i32 2, i8* bitcast (i32* @x to i8*)), !shadow.mem !5, !shadow.mem.use !5
  call void @shadow.mem.out(i32 8, i32 %sm, i32 3, i8* null), !shadow.mem !5, !shadow.mem.use !5
  call void @shadow.mem.out(i32 9, i32 %sm1, i32 4, i8* null), !shadow.mem !5, !shadow.mem.use !5
  call void @shadow.mem.out(i32 10, i32 %sm2, i32 5, i8* null), !shadow.mem !5, !shadow.mem.use !5
  ret i32 %.0

issue104.zip

What is the status of LLVM 9 and LLVM 10 support?

We are looking to upgrade SMACK to LLVM 9 now, and also to LLVM 10 in a few months. I started working on it and updating SeaDSA along the way, but then @shaobo-he pointed out your dev9 branch.

Is dev9 branch where I can find stable support for LLVM 9?

More generally, what is your strategy in terms of LLMV updates?
For example, at this point your master branch is really outdated it seems. Any plans to merge e.g. dev8 into master at some point?

Btw, we've been really happy with SeaDSA since we made the switch! So thanks again for developing it and for your help.

Unnecessary nodes created when finding the cell of a global in a graph

Obtaining a node in a graph that corresponds to a global variable introduces an empty node in the graph that is not linked to anything.

short2.pp.ms.o.bc.zip

To reproduce this:

$ seadsa --sea-dsa-type-aware --sea-dsa=cs --sea-dsa-color-func-sim-dot --sea-dsa-dot short2.pp.ms.o.bc
$ dot -Tpdf main.TD.mem.dot -o main.TD.mem.pdf
$ dot -Tpdf main.mem.dot -o main.mem.pdf

main.TD.pdf and main.pdf represent the same graph, main.TD.pdf is colored according to the simulation relation of the bottom-up graph. When coloring, the same graph is used for both:

colorGraphsFunction(F, *buG, *G, colorBuG, colorG);
But main.pdf is printed before the simulation and main.TD.pdf after.

main.pdf
main.TD.pdf

Port to LLVM 5.0

  • ported code in branch llvm-5.0
  • pass all tests
  • Update .travis to download llvm/clang 5.0

Crashes on SV-COMP benchmarks

Hello sea-dsa developers,

I'm testing our sea-dsa integration on SV-COMP benchmarks and encountered some crashes. I'll list them gradually in this issue.

To reproduce these crashes, please run,

seadsa -sea-dsa=ci -sea-dsa-type-aware -sea-dsa-dot <.ll file>
  1. Cell not found for gep

This crash manifests on the following .ll file.
https://drive.google.com/open?id=1Xqi8l9ZmekSPqSUfvPTNCHlBYXRgl4B0

The output is as follows,

WARNING: inttoptr @ addr 0x55f1b5d8ea68 is allocating a new cell.
Sea-Dsa type aware!
WARNING: inttoptr @ addr 0x55f1b5dd42f8 is allocating a new cell.
Cell not found for gep:	  %369 = getelementptr inbounds [100 x %struct._TRACK_DATA], [100 x %struct._TRACK_DATA]* getelementptr inbounds (%struct._CDROM_TOC, %struct._CDROM_TOC* null, i32 0, i32 3), i32 0, i32 %368, !dbg !3203
		in CdAudioNECDeviceControl

	ptr: [100 x %struct._TRACK_DATA]* getelementptr inbounds (%struct._CDROM_TOC, %struct._CDROM_TOC* null, i32 0, i32 3)No cell for gep'd ptr
UNREACHABLE executed at /mnt/local/smack-project/smack/sea-dsa/src/DsaLocal.cc:708!
/usr/lib/llvm-8/lib/libLLVM-8.so.1(_ZN4llvm3sys15PrintStackTraceERNS_11raw_ostreamE+0x1f)[0x7f363d2ac59f]
/usr/lib/llvm-8/lib/libLLVM-8.so.1(_ZN4llvm3sys17RunSignalHandlersEv+0x22)[0x7f363d2aa9c2]
/usr/lib/llvm-8/lib/libLLVM-8.so.1(+0x9719a8)[0x7f363d2ac9a8]
/lib/x86_64-linux-gnu/libc.so.6(+0x3ef20)[0x7f363bc49f20]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xc7)[0x7f363bc49e97]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x141)[0x7f363bc4b801]
/usr/lib/llvm-8/lib/libLLVM-8.so.1(+0x8e04b1)[0x7f363d21b4b1]
smack-project/smack/sea-dsa/build/run/bin/seadsa(+0xd93a2)[0x55f1b4e393a2]
smack-project/smack/sea-dsa/build/run/bin/seadsa(+0xd9b8b)[0x55f1b4e39b8b]
smack-project/smack/sea-dsa/build/run/bin/seadsa(+0xeb275)[0x55f1b4e4b275]
smack-project/smack/sea-dsa/build/run/bin/seadsa(+0xe8333)[0x55f1b4e48333]
smack-project/smack/sea-dsa/build/run/bin/seadsa(+0xe53af)[0x55f1b4e453af]
smack-project/smack/sea-dsa/build/run/bin/seadsa(+0xe1b00)[0x55f1b4e41b00]
smack-project/smack/sea-dsa/build/run/bin/seadsa(+0xde3ba)[0x55f1b4e3e3ba]
smack-project/smack/sea-dsa/build/run/bin/seadsa(+0xf3422)[0x55f1b4e53422]
smack-project/smack/sea-dsa/build/run/bin/seadsa(+0x251c5)[0x55f1b4d851c5]
/usr/lib/llvm-8/lib/libLLVM-8.so.1(_ZN4llvm6legacy15PassManagerImpl3runERNS_6ModuleE+0x2ee)[0x7f363d39fcae]
smack-project/smack/sea-dsa/build/run/bin/seadsa(+0x177e8)[0x55f1b4d777e8]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7f363bc2cb97]
smack-project/smack/sea-dsa/build/run/bin/seadsa(+0x14b0a)[0x55f1b4d74b0a]
Stack dump:
0.	Program arguments: smack-project/smack/sea-dsa/build/run/bin/seadsa -sea-dsa=ci -sea-dsa-type-aware -sea-dsa-dot cdaudio.i.cil-1.ll 
1.	Running pass 'SeaHorn Dsa analysis: entry point for all clients' on module 'cdaudio.i.cil-1.ll'.
Aborted

Combination of O3 and UBSan makes seadsa crash on input analysis

It seems a bitcode file compiled with the Undefined Behavior Sanitizer and highest optimization level (-O3) makes seadsa analysis to crash.

Environment

  • llvm 10
  • sea-dsa: current dev10 branch

To Reproduce

main.cc:

#include <unistd.h>
#include <stdint.h>
#include <stdlib.h>

// Function extracted from https://github.com/GrokImageCompression/grok/blob/7e232bfecc9efd13c4fc8872cff7cc11c3e72af9/src/lib/jp2/point_transform/mct.cpp#L659
bool decompress_custom(uint8_t* mct_matrix, uint64_t n, uint8_t** pData, uint32_t num_comps,
							uint32_t is_signed)
{
	auto data = (float**)pData;

	(is_signed);

	auto pixel = new float[2 * num_comps];
	auto current_pixel = pixel + num_comps;

	for(uint64_t i = 0; i < n; ++i)
	{
		auto Mct = (float*)mct_matrix;
		for(uint32_t j = 0; j < num_comps; ++j)
		{
			pixel[j] = (float)(*(data[j]));
		}
		for(uint32_t j = 0; j < num_comps; ++j)
		{
			current_pixel[j] = 0;
			for(uint32_t k = 0; k < num_comps; ++k)
				current_pixel[j] += *(Mct++) * pixel[k];
			*(data[j]++) = (float)(current_pixel[j]);
		}
	}
	delete[] pixel;
	return true;
}

int main(int argc) {
	auto data = new uint8_t*[10];
	if (data == nullptr) return 0;
	float *matrix = (float*) malloc(8);
	if (matrix == nullptr) return 0;
	return decompress_custom((uint8_t*)matrix, 3, data , 10, 0);
}

And then to compile and analyze with:

$> clang++ -fsanitize=null -O3 -emit-llvm -c main.cc -o main.bc
$> seadsa -sea-dsa=butd-cs --sea-dsa-aa-eval ./main.bc

seadsa crashes with the following stack trace:

 #0 0x0000000000e1f64f llvm::sys::PrintStackTrace(llvm::raw_ostream&) (../../sea-dsa/build/run/bin/seadsa+0xe1f64f)
 #1 0x0000000000e1d922 llvm::sys::RunSignalHandlers() (../../sea-dsa/build/run/bin/seadsa+0xe1d922)
 #2 0x0000000000e1fb85 SignalHandler(int) (../../sea-dsa/build/run/bin/seadsa+0xe1fb85)
 #3 0x00007f4f0e6cb980 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12980)
 #4 0x0000000000a95c82 seadsa::Node::isOffsetCollapsed() const [...]sea-dsa/include/seadsa/Graph.hh:682:54
 #5 0x0000000000a95c82 seadsa::Node::addAccessedType(unsigned int, llvm::Type*) [...]sea-dsa/lib/seadsa/Graph.cc:182:9
 #6 0x0000000000aad695 (anonymous namespace)::IntraBlockBuilder::visitLoadInst(llvm::LoadInst&) [...]sea-dsa/lib/seadsa/DsaLocal.cc:701:8
 #7 0x0000000000aad695 llvm::InstVisitor<(anonymous namespace)::IntraBlockBuilder, void>::visitLoad(llvm::LoadInst&) /usr/lib/llvm-10/include/llvm/IR/Instruction.def:172:1
 #8 0x0000000000aad695 llvm::InstVisitor<(anonymous namespace)::IntraBlockBuilder, void>::visit(llvm::Instruction&) /usr/lib/llvm-10/include/llvm/IR/Instruction.def:172:1
 #9 0x0000000000aa89a7 void llvm::InstVisitor<(anonymous namespace)::IntraBlockBuilder, void>::visit<llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::Instruction, false, false, void>, false, false> >(llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::Instruction, false, false, void>, false, false>, llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::Instruction, false, false, void>, false, false>) /usr/lib/llvm-10/include/llvm/IR/InstVisitor.h:0:37
#10 0x0000000000aa89a7 llvm::InstVisitor<(anonymous namespace)::IntraBlockBuilder, void>::visit(llvm::BasicBlock&) /usr/lib/llvm-10/include/llvm/IR/InstVisitor.h:106:5
#11 0x0000000000aa89a7 seadsa::LocalAnalysis::runOnFunction(llvm::Function&, seadsa::Graph&) [...]sea-dsa/lib/seadsa/DsaLocal.cc:1865:18
#12 0x0000000000ac3f8a seadsa::BottomUpAnalysis::runOnModule(llvm::Module&, llvm::DenseMap<llvm::Function const*, std::shared_ptr<seadsa::Graph>, llvm::DenseMapInfo<llvm::Function const*>, llvm::detail::DenseMapPair<llvm::Function const*, std::shared_ptr<seadsa::Graph> > >&) [...]sea-dsa/lib/seadsa/DsaBottomUp.cc:0:0
#13 0x0000000000ab5d8d seadsa::BottomUpTopDownGlobalAnalysis::runOnModule(llvm::Module&) [...]sea-dsa/lib/seadsa/DsaGlobal.cc:651:7
#14 0x0000000000a8f98b std::__uniq_ptr_impl<seadsa::BottomUpTopDownGlobalAnalysis, std::default_delete<seadsa::BottomUpTopDownGlobalAnalysis> >::_M_ptr() const /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/unique_ptr.h:147:42
#15 0x0000000000a8f98b std::unique_ptr<seadsa::BottomUpTopDownGlobalAnalysis, std::default_delete<seadsa::BottomUpTopDownGlobalAnalysis> >::get() const /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/unique_ptr.h:332:21
#16 0x0000000000a8f98b std::unique_ptr<seadsa::BottomUpTopDownGlobalAnalysis, std::default_delete<seadsa::BottomUpTopDownGlobalAnalysis> >::operator bool() const /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/unique_ptr.h:346:16
#17 0x0000000000a8f98b seadsa::SeaDsaAAResult::alias(llvm::MemoryLocation const&, llvm::MemoryLocation const&, llvm::AAQueryInfo&) [...]sea-dsa/lib/seadsa/SeaDsaAliasAnalysis.cc:157:8
#18 0x0000000000453f33 llvm::AAResults::alias(llvm::MemoryLocation const&, llvm::MemoryLocation const&, llvm::AAQueryInfo&) (../../sea-dsa/build/run/bin/seadsa+0x453f33)
#19 0x0000000000466d27 llvm::BasicAAResult::aliasCheck(llvm::Value const*, llvm::LocationSize, llvm::AAMDNodes, llvm::Value const*, llvm::LocationSize, llvm::AAMDNodes, llvm::AAQueryInfo&, llvm::Value const*, llvm::Value const*) (../../sea-dsa/build/run/bin/seadsa+0x466d27)
#20 0x000000000046626b llvm::BasicAAResult::alias(llvm::MemoryLocation const&, llvm::MemoryLocation const&, llvm::AAQueryInfo&) (../../sea-dsa/build/run/bin/seadsa+0x46626b)
#21 0x0000000000453e92 llvm::AAResults::alias(llvm::MemoryLocation const&, llvm::MemoryLocation const&) (../../sea-dsa/build/run/bin/seadsa+0x453e92)
#22 0x000000000045b4f6 llvm::AAEvaluator::runInternal(llvm::Function&, llvm::AAResults&) (../../sea-dsa/build/run/bin/seadsa+0x45b4f6)
#23 0x000000000045ea13 llvm::AAEvalLegacyPass::runOnFunction(llvm::Function&) (../../sea-dsa/build/run/bin/seadsa+0x45ea13)
#24 0x00000000008fd1e6 llvm::FPPassManager::runOnFunction(llvm::Function&) (../../sea-dsa/build/run/bin/seadsa+0x8fd1e6)
#25 0x00000000008fd463 llvm::FPPassManager::runOnModule(llvm::Module&) (../../sea-dsa/build/run/bin/seadsa+0x8fd463)
#26 0x00000000008fd910 llvm::legacy::PassManagerImpl::run(llvm::Module&) (../../sea-dsa/build/run/bin/seadsa+0x8fd910)
#27 0x0000000000450eb5 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::empty() const /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/basic_string.h:1022:29
#28 0x0000000000450eb5 main [...]sea-dsa/tools/seadsa.cc:234:26
#29 0x00007f4f0d0d3bf7 __libc_start_main /build/glibc-S9d2JN/glibc-2.27/csu/../csu/libc-start.c:344:0
#30 0x00000000004502fa _start (../../sea-dsa/build/run/bin/seadsa+0x4502fa)
Stack dump:
0.	Program arguments: ../../sea-dsa/build/run/bin/seadsa -sea-dsa=butd-cs --sea-dsa-aa-eval ./main.bc 
1.	Running pass 'Function Pass Manager' on module './main.bc'.
2.	Running pass 'Exhaustive Alias Analysis Precision Evaluator' on function '@_Z17decompress_customPhmPS_jj'
[1]    16382 segmentation fault  ../../sea-dsa/build/run/bin/seadsa -sea-dsa=butd-cs --sea-dsa-aa-eval 

This is a release version, but the tool would have crashed on:

assert(!base.isNull());

Analysis

It seems that the crash is caused by the tool analysing a load instruction that accesses a %23 = getelementptr inbounds i8*, i8** null, i64 %22 that has a null pointer as base. This instruction is caused by a combination of the NULL checker in UBSan (that checks the Use of a null pointer or creation of a null reference) and the (LoopUnswitch)[https://llvm.org/docs/Passes.html#loop-unswitch-unswitch-loops] pass executed by -O3 that creates the alternative slow path in the loop with the GEP using null. Notice that the program has no path in which pData may be null, but the check is still present due to UBSan.

Fail to run an example

I have successfully built seadsa with LLVM-3.8, and got the executable program.
I want to run the example is tests/test-ext-1.ll, but it fails:

$ seadsa -dsa-stats test-ext-1.ll 
#0 0x0000000000e868f5 llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/usr/local/bin/seadsa+0xe868f5)
#1 0x0000000000e84916 llvm::sys::RunSignalHandlers() (/usr/local/bin/seadsa+0xe84916)
#2 0x0000000000e84b15 SignalHandler(int) (/usr/local/bin/seadsa+0xe84b15)
#3 0x00007ff1f98ec340 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x10340)
#4 0x00000000009ade60 llvm::ValueHandleBase::AddToExistingUseList(llvm::ValueHandleBase**) (/usr/local/bin/seadsa+0x9ade60)
#5 0x000000000068e0f7 llvm::ValueHandleBase::ValueHandleBase(llvm::ValueHandleBase::HandleBaseKind, llvm::ValueHandleBase const&) (/usr/local/bin/seadsa+0x68e0f7)
#6 0x00000000006f732d llvm::WeakVH::WeakVH(llvm::WeakVH const&) (/usr/local/bin/seadsa+0x6f732d)
#7 0x00000000006ff58b std::pair<llvm::WeakVH, llvm::CallGraphNode*>::pair(std::pair<llvm::WeakVH, llvm::CallGraphNode*> const&) (/usr/local/bin/seadsa+0x6ff58b)
#8 0x00000000006ff5c9 llvm::mapped_iterator<__gnu_cxx::__normal_iterator<std::pair<llvm::WeakVH, llvm::CallGraphNode*>*, std::vector<std::pair<llvm::WeakVH, llvm::CallGraphNode*>, std::allocator<std::pair<llvm::WeakVH, llvm::CallGraphNode*> > > >, std::pointer_to_unary_function<std::pair<llvm::WeakVH, llvm::CallGraphNode*>, llvm::CallGraphNode*> >::operator*() const (/usr/local/bin/seadsa+0x6ff5c9)
#9 0x00000000006fdbb1 llvm::scc_iterator<llvm::CallGraph*, llvm::GraphTraits<llvm::CallGraph*> >::DFSVisitChildren() (/usr/local/bin/seadsa+0x6fdbb1)
#10 0x00000000006faeb3 llvm::scc_iterator<llvm::CallGraph*, llvm::GraphTraits<llvm::CallGraph*> >::GetNextSCC() (/usr/local/bin/seadsa+0x6faeb3)
#11 0x00000000006fd709 llvm::scc_iterator<llvm::CallGraph*, llvm::GraphTraits<llvm::CallGraph*> >::scc_iterator(llvm::CallGraphNode*) (/usr/local/bin/seadsa+0x6fd709)
#12 0x00000000006fab8b llvm::scc_iterator<llvm::CallGraph*, llvm::GraphTraits<llvm::CallGraph*> >::begin(llvm::CallGraph* const&) (/usr/local/bin/seadsa+0x6fab8b)
#13 0x00000000006f84f8 llvm::scc_iterator<llvm::CallGraph*, llvm::GraphTraits<llvm::CallGraph*> > llvm::scc_begin<llvm::CallGraph*>(llvm::CallGraph* const&) (/usr/local/bin/seadsa+0x6f84f8)
#14 0x00000000007068ec sea_dsa::BottomUpAnalysis::runOnModule(llvm::Module&, llvm::DenseMap<llvm::Function const*, std::shared_ptr<sea_dsa::Graph>, llvm::DenseMapInfo<llvm::Function const*>, llvm::detail::DenseMapPair<llvm::Function const*, std::shared_ptr<sea_dsa::Graph> > >&) (/usr/local/bin/seadsa+0x7068ec)
#15 0x00000000006f5411 sea_dsa::ContextSensitiveGlobalAnalysis::runOnModule(llvm::Module&) (/usr/local/bin/seadsa+0x6f5411)
#16 0x00000000006e6074 sea_dsa::DsaAnalysis::runOnModule(llvm::Module&) (/usr/local/bin/seadsa+0x6e6074)
#17 0x0000000000990fa0 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/usr/local/bin/seadsa+0x990fa0)
#18 0x000000000068ce04 main (/usr/local/bin/seadsa+0x68ce04)
#19 0x00007ff1f8d18ec5 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21ec5)
#20 0x000000000068bf6c _start (/usr/local/bin/seadsa+0x68bf6c)
Stack dump:
0.	Program arguments: seadsa -dsa-stats test-ext-1.ll 
1.	Running pass 'SeaHorn Dsa analysis: entry point for all Dsa clients' on module 'test-ext-1.ll'.
Segmentation fault (core dumped)

How to solve it?

By the way, can sea-dsa analyze the LLVM bytecode of a Linux kernel driver?
I want to use sea-dsa to print the function referenced by a function pointer in the driver code. Can sea-dsa do it?

Thanks in advance!

Caller not simulated by callee fails on llvm-dis

Error message:

ERROR 3: callee is not simulated by caller at   call fastcc void @_ZNK4llvm6DIFile14getRawChecksumEv(%"class.llvm::Optional.296"* sret %_call3, %"class.llvm::DIFile"* %0) #2
	Formal param %"class.llvm::Optional.296"* %0
	Actual param   %_call3 = getelementptr inbounds %"struct.llvm::MDNodeKeyImpl.445", %"struct.llvm::MDNodeKeyImpl.445"* %_3, i64 0, i32 2
	Callee cell=<24, Node 0xc6a8af0: flags=[SMR] size=41 types={40:i8} links=[]  alloca sites=[]>
	Caller cell=<24, Node 0xc6cd560: flags=[SAMR] size=8 types={0:i32} links=[<0, i8*>->(0,0xc6ccf10)]  alloca sites=[  %_3 = alloca %"struct.llvm::MDNodeKeyImpl.445", align 8]>

Repro (seahorn branch brunch):

cd points-to-bitcode/5.0/clang
unpack llvm-dis.zip
seapp ./temps/llvm-dis.n.pp.bc -o ./temps/llvm-dis.n.pp.smc.bc --smc --sea-dsa-type-aware=true --smc-analyze-only --smc-check-threshold=400 --sea-dsa=butd-cs --smc-pta=dsa --print-smc-summary --log=brunch --log=brunch-progress.off --log=smc.off --log=dsa.off --log=alloc-sites.off

llvm-dis.zip

Question about inconsistency before/after LLVM's mem2reg optimization

Hello sea-dsa developers,

I still observed some crashes in SV-COMP benchmarks even after the recent fixes. So I used c-reduce to produce a minimal C program for debugging. I observed some inconsistency with respect to LLVM's mem2reg optimization. Consider the following C program that c-reduce generates (let's temporary ignore the undefineness of this program and focus on the relevant part)

struct a {
  long b
};
union c {
  struct a d
};
e;
struct f {
  union c g
} h() {
  struct f *i;
  if (i->g.d.b)
    e = h;
}

The following LLVM-IR program is produced by the command: clang-8 -O0 -c -emit-llvm -S -Xclang -disable-O0-optnone floppy_true.i.cil.c,

; ModuleID = 'floppy_true.i.cil.c'
source_filename = "floppy_true.i.cil.c"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-linux-gnu"

%struct.f = type { %union.c }
%union.c = type { %struct.a }
%struct.a = type { i64 }

@e = common dso_local global i32 0, align 4

; Function Attrs: noinline nounwind uwtable
define dso_local i64 @h() #0 {
  %1 = alloca %struct.f, align 8
  %2 = alloca %struct.f*, align 8
  %3 = load %struct.f*, %struct.f** %2, align 8
  %4 = getelementptr inbounds %struct.f, %struct.f* %3, i32 0, i32 0
  %5 = bitcast %union.c* %4 to %struct.a*
  %6 = getelementptr inbounds %struct.a, %struct.a* %5, i32 0, i32 0
  %7 = load i64, i64* %6, align 8
  %8 = icmp ne i64 %7, 0
  br i1 %8, label %9, label %10

; <label>:9:                                      ; preds = %0
  store i32 ptrtoint (i64 ()* @h to i32), i32* @e, align 4
  br label %10

; <label>:10:                                     ; preds = %9, %0
  %11 = getelementptr inbounds %struct.f, %struct.f* %1, i32 0, i32 0
  %12 = getelementptr inbounds %union.c, %union.c* %11, i32 0, i32 0
  %13 = getelementptr inbounds %struct.a, %struct.a* %12, i32 0, i32 0
  %14 = load i64, i64* %13, align 8
  ret i64 %14
}

attributes #0 = { noinline nounwind uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.module.flags = !{!0}
!llvm.ident = !{!1}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{!"clang version 8.0.1- (branches/release_80)"}

sea-dsa's release build works fine for this program. However, if I enable the mem2reg optimization using the command: opt-8 floppy_true.i.cil.ll -mem2reg -S -o floppy_true.i.cil.ll, sea-dsa crashes on the following optimized program,

; ModuleID = 'floppy_true.i.cil.ll'
source_filename = "floppy_true.i.cil.c"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-linux-gnu"

%struct.f = type { %union.c }
%union.c = type { %struct.a }
%struct.a = type { i64 }

@e = common dso_local global i32 0, align 4

; Function Attrs: noinline nounwind uwtable
define dso_local i64 @h() #0 {
  %1 = alloca %struct.f, align 8
  %2 = getelementptr inbounds %struct.f, %struct.f* undef, i32 0, i32 0
  %3 = bitcast %union.c* %2 to %struct.a*
  %4 = getelementptr inbounds %struct.a, %struct.a* %3, i32 0, i32 0
  %5 = load i64, i64* %4, align 8
  %6 = icmp ne i64 %5, 0
  br i1 %6, label %7, label %8

; <label>:7:                                      ; preds = %0
  store i32 ptrtoint (i64 ()* @h to i32), i32* @e, align 4
  br label %8

; <label>:8:                                      ; preds = %7, %0
  %9 = getelementptr inbounds %struct.f, %struct.f* %1, i32 0, i32 0
  %10 = getelementptr inbounds %union.c, %union.c* %9, i32 0, i32 0
  %11 = getelementptr inbounds %struct.a, %struct.a* %10, i32 0, i32 0
  %12 = load i64, i64* %11, align 8
  ret i64 %12
}

attributes #0 = { noinline nounwind uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.module.flags = !{!0}
!llvm.ident = !{!1}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{!"clang version 8.0.1- (branches/release_80)"}

It appears to me mem2reg does a fairly reasonable optimization here by converting access to uninitialized location into undef. So I'd expect sea-dsa not to crash on the resulting program.

Handling of programs compiled with higher optimization levels

Hello SeaDsa developers,

I attempted to use SeaDsa (the llvm-8.0 branch) on a program that I compiled with -O1 and the resulting graphs seem spurious. For example,

#include <assert.h> 

int* A;

typedef struct {
  int* a;
  long b;
} S;

S* foo() {
  S* x = (S*)malloc(sizeof(S));
  x->a = A;
  assert(1);
  return x;
}

int main() {
  int a = 1;
  A = &a;
  S* y = foo();
  assert(*(y->a) == 1);
  return 0;
}

The graph is obtained using the following commands

clang -O1 -c -emit-llvm -S test1.c
~/sea-dsa/build/run/bin/seadsa -sea-dsa=butd-cs -sea-dsa-type-aware -sea-dsa-dot test1.ll  -sea-dsa-dot-outdir=test1
(cd test1 ; dot -Tpdf main.mem.dot -o main.mem.pdf)

main.mem.pdf

If I understand it correctly, *A, %1, and %5 in the main function should point to the same memory object whereas they are not in the graph.

If I compile the file using O0, then the graph generated seems correct as *A and the GEPed pointer returned from foo both points to the same object.
main.mem.pdf

Please let me know if you can reproduce it or this is expected.

Wrong handling of global pointers in constant structs

Consider the following llvm program:

source_filename = "/tmp/st.c"
target datalayout = "e-m:o-p:32:32-f64:32:64-f80:128-n8:16:32-S128"
target triple = "i386-apple-macosx10.13.0"

%struct.s = type { i16*, i32 } ; struct with a ptr to i16 and a i32 int
@g.ptr =  internal unnamed_addr global i16 42, align 4
declare void @verifier.assume.not(i1)
declare void @seahorn.fail()


; Function Attrs: noreturn
declare void @verifier.error() #0

; Function Attrs: noinline nounwind ssp uwtable
define i32 @main() local_unnamed_addr #2 {
entry:
  %v1 = load i16, i16* @g.ptr
  %v2.p = extractvalue %struct.s {i16* @g.ptr, i32 0}, 0
  %v2 = load i16, i16* %v2.p
  %cond = icmp eq i16 %v1, %v2
  call void @verifier.assume.not(i1 %cond)
  br label %verifier.error

verifier.error:                                   ; preds = %entry
  call void @seahorn.fail()
  br label %verifier.error.split

verifier.error.split:                             ; preds = %verifier.error
  ret i32 42
}

attributes #0 = { noreturn }
attributes #1 = { argmemonly nounwind }
attributes #2 = { noinline nounwind ssp uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="penryn" "target-features"="+cx16,+fxsr,+mmx,+sse,+sse2,+sse3,+sse4.1,+ssse3,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #3 = { nounwind }

!llvm.module.flags = !{!0, !1, !2}
!llvm.ident = !{!3}

!0 = !{i32 1, !"NumRegisterParameters", i32 0}
!1 = !{i32 1, !"wchar_size", i32 4}
!2 = !{i32 7, !"PIC Level", i32 2}
!3 = !{!"clang version 5.0.2 (tags/RELEASE_502/final)"}

g.ptr must alias v2.p, but sea-dsa says it does not.
Seems like the initialization of constant struct is ignored.
For convenience, the memory graph looks like this:

digraph unnamed {
        graph [center=true, ratio=true, bgcolor=lightgray, fontname=Helvetica];
        node  [fontname=Helvetica, fontsize=11];

        Node0x1516560 [shape=record,label="{\{0:i16\}:R}"];
        Node0x152a240 [shape=record,label="{\{0:i16*\}:R|{<s0>\<0, i16*\>}}"];
        Node0x152af40 [shape=record,label="{\{0:i16\}:SR}"];
        Node0x15120e8 [shape=plaintext, label ="v2.p"];
        Node0x15120e8 -> Node0x152af40[arrowtail=tee,label="0",fontsize=8,color=purple];
        Node0x1511ea0 [shape=plaintext, label ="%struct.s \{ i16* @g.ptr, i32 0 \}"];
        Node0x1511ea0 -> Node0x152a240[arrowtail=tee,label="0",fontsize=8,color=purple];
        Node0x1510f68 [shape=plaintext, label ="g.ptr"];
        Node0x1510f68 -> Node0x1516560[arrowtail=tee,label="0",fontsize=8,color=purple];
        Node0x152a240:s0 -> Node0x152af40[arrowtail=tee,label="0, i16*",fontsize=8];
}

shadow mem instruction missing for read-only nodes when only reachable from ret cell

$ seahorn --sea-dsa-stats --horn-nondet-undef --keep-shadows=true --sea-dsa-type-aware --sea-dsa=cs -horn-inter-proc -horn-sem-lvl=mem --horn-step=large --horn-solve --horn-global-constraints=true --horn-stats --horn-singleton-aliases=true --horn-pdr-contexts=600 --horn-ignore-calloc=true --horn-ignore-memset=true --horn-shadow-mem-optimize=false --horn-iuc=1 --horn-iuc-arith=1 --horn-weak-abs=true --horn-use-mbqi=true only-ret.bc --sim-dot
$ dot -Tpdf find_anchor_wl_entry.mem.dot -o find_anchor_wl_entry.mem.dot.pdf

In that graph you can see a node only reachable from the return cell that is only read in function find_anchor_wl_entry.
Screenshot 2020-09-10 at 09 46 00

But no shadow mem annotation:

_1502:                                            ; preds = %_1496
  %_1503 = getelementptr inbounds %struct.ubi_device, %struct.ubi_device* %.0460, i64 0, i32 36
  %_1504 = tail call fastcc %struct.ubi_wl_entry* @find_anchor_wl_entry() #9
  %_1505 = icmp eq %struct.ubi_wl_entry* %_1504, null
  br i1 %_1505, label %ubi_wl_get_fm_peb.exit969, label %_1506

Sources: only-ret.zip

Turn recent repros into test cases

We have recently improved handling of corner cases in sea-dsa. These included dealing with inttoptr and ptrtoint instructions, working around memcpy introduced by clang for copy-by-value of aggregate types, and working around memcpy introduced to move large fields (i.e., fixed sized array fields).

For each one, we had a small example that illustrated the problem. These should be added as test cases to DSA.

Question about ptrtoint nodetype

Hello sea-dsa developers,

Consider the following example,

#include <stdint.h>
#include <stdlib.h>
#include <stdbool.h>

void __VERIFIER_assert(bool);

int main(void) {
  int *a = (int *)malloc(sizeof(int));
  intptr_t b;

  b = (intptr_t)a;
  *((int *)b) = 1;
  __VERIFIER_assert(*a == 1);
  free(a);
  return 0;
}

The attached graph is generated using the following commands,

clang-8 -O0 -c -emit-llvm -S -Xclang -disable-O0-optnone pointers5.c
opt-8 pointers5.ll -mem2reg -S -o pointers5.ll
seadsa -sea-dsa=ci -sea-dsa-type-aware -sea-dsa-dot pointers5.ll

main.mem.pdf

In this graph, the node %4 points to is labeled as SMP whereas the node %1 points to is not considered as a node having the ptrtoint type. Although I know that we should probably try to eliminate ptrtoint/inttoptr instructions first, I was wondering if the node %1 points to should have ptrtoint type.

Add a way to assign a label for sequential memory in memory graph

When I created a memory graph for a program that uses strcat, the memory graph looks different when the definition for strcat is linked and unlinked. strcat takes two strings (s1 and s2), adds s2 to s1 and outputs s1. As a result, in the memory graph, the output points to the same box as the input and the box is labeled "Sequence SM". I was able to get %2(s1) and %8(output) to point to the same box by creating the following stub function:

char *strcat_stub(char *restrict dest, const char *restrict src){
	void* s = sea_dsa_new();
	sea_dsa_alias(s,(void*)dest);
	return (char*) s;
}

However, the box is still labeled SM instead of Sequence SM. There should be a way to assign this label so that the stub can properly model the way strcat uses memory.

Here are the memory graphs:
linked.pdf
unlinked.pdf

stub.pdf
Corresponding .ll file for stub.pdf:
strcat_stub.zip

Disable warning messages

Hello sea-dsa developers,

I'm seeing warning messages like WARNING: inttoptr @ addr 0x27b7cc8 is (unsoundly) assumed to point to a fresh memory region. printed out. I was wondering if it's possible to disable them since it's not very useful for non-developers.

Sea-Dsa specs compilation does not work on a mac

Mac uses AppleClang and requires special switches to compile C code.
The current recipe in tools/CMakeLists.txt embeds Linux specific compilation options and fails on other architectures.

CMake already knows what are the right switches. It should be used to set the compilation command

sea-dsa build failed with llvm-3.8

Hi
I have failed to build sea-dsa with llvm-3.8. Here is my system configuration:

OS: 4.4.0-112-generic #135-Ubuntu SMP x86_64 x86_64 x86_64 GNU/Linux

clang/llvm: http://releases.llvm.org/3.8.0/llvm-3.8.0.src.tar.xz, http://releases.llvm.org/3.8.0/cfe-3.8.0.src.tar.xz

They have been build and install successfully.
image

Next, I have done:
cmake -DCMAKE_INSTALL_PREFIX=\~/static-cfg/llvm-dsa/sea-dsa/install -DLLVM_DIR=~/static-cfg/build/share/llvm/cmake ..
and when try the following:
cmake --build . --target install
It failed as following:
image
Here is the error:
**undefined reference to typeinfo for llvm::cl::GenericOptionValue'**`

details:
CMakeFiles/seadsa.dir/seadsa.cc.o:(.data.rel.ro._ZTIN4llvm2cl3optIN7sea_dsa12SeaDsaLogOptELb1ENS0_6parserINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEEE[_ZTIN4llvm2cl3optIN7sea_dsa12SeaDsaLogOptELb1ENS0_6parserINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEEE]+0x18): undefined reference to typeinfo for llvm::cl::Option' CMakeFiles/seadsa.dir/seadsa.cc.o:(.data.rel.ro._ZTIN4llvm2cl15OptionValueBaseIN7sea_dsa12SeaDsaLogOptELb1EEE[_ZTIN4llvm2cl15OptionValueBaseIN7sea_dsa12SeaDsaLogOptELb1EEE]+0x10): undefined reference to typeinfo for llvm::cl::GenericOptionValue'
CMakeFiles/seadsa.dir/seadsa.cc.o:(.data.rel.ro._ZTIN4llvm2cl15OptionValueCopyIbEE[_ZTIN4llvm2cl15OptionValueCopyIbEE]+0x10): undefined reference to typeinfo for llvm::cl::GenericOptionValue' CMakeFiles/seadsa.dir/seadsa.cc.o:(.data.rel.ro._ZTIN4llvm2cl15OptionValueCopyINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEE[_ZTIN4llvm2cl15OptionValueCopyINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEE]+0x10): undefined reference to typeinfo for llvm::cl::GenericOptionValue'
../lib/libSeaDsaAnalysis.a(DsaStats.cc.o):(.data.rel.ro._ZTIN7sea_dsa17DsaPrintStatsPassE[_ZTIN7sea_dsa17DsaPrintStatsPassE]+0x10): undefined reference to typeinfo for llvm::ModulePass' ../lib/libSeaDsaAnalysis.a(DsaPrinter.cc.o):(.data.rel.ro._ZTIN7sea_dsa9DsaViewerE[_ZTIN7sea_dsa9DsaViewerE]+0x10): undefined reference to typeinfo for llvm::ModulePass'
../lib/libSeaDsaAnalysis.a(DsaPrinter.cc.o):(.data.rel.ro._ZTIN7sea_dsa10DsaPrinterE[_ZTIN7sea_dsa10DsaPrinterE]+0x10): undefined reference to typeinfo for llvm::ModulePass' ../lib/libSeaDsaAnalysis.a(DsaInfo.cc.o):(.data.rel.ro._ZTIN7sea_dsa11DsaInfoPassE[_ZTIN7sea_dsa11DsaInfoPassE]+0x10): undefined reference to typeinfo for llvm::ModulePass'
../lib/libSeaDsaAnalysis.a(DsaAnalysis.cc.o):(.data.rel.ro._ZTIN4llvm2cl3optIN7sea_dsa18GlobalAnalysisKindELb0ENS0_6parserIS3_EEEE[_ZTIN4llvm2cl3optIN7sea_dsa18GlobalAnalysisKindELb0ENS0_6parserIS3_EEEE]+0x18): undefined reference to typeinfo for llvm::cl::Option' ../lib/libSeaDsaAnalysis.a(DsaAnalysis.cc.o):(.data.rel.ro._ZTIN4llvm2cl6parserIN7sea_dsa18GlobalAnalysisKindEEE[_ZTIN4llvm2cl6parserIN7sea_dsa18GlobalAnalysisKindEEE]+0x10): undefined reference to typeinfo for llvm::cl::generic_parser_base'
../lib/libSeaDsaAnalysis.a(DsaAnalysis.cc.o):(.data.rel.ro._ZTIN4llvm2cl15OptionValueCopyIN7sea_dsa18GlobalAnalysisKindEEE[_ZTIN4llvm2cl15OptionValueCopyIN7sea_dsa18GlobalAnalysisKindEEE]+0x10): undefined reference to typeinfo for llvm::cl::GenericOptionValue' ../lib/libSeaDsaAnalysis.a(DsaAnalysis.cc.o):(.data.rel.ro._ZTIN7sea_dsa11DsaAnalysisE[_ZTIN7sea_dsa11DsaAnalysisE]+0x10): undefined reference to typeinfo for llvm::ModulePass'
../lib/libSeaDsaAnalysis.a(NameValues.cc.o):(.data.rel.ro._ZTIN7sea_dsa10NameValuesE[_ZTIN7sea_dsa10NameValuesE]+0x10): undefined reference to typeinfo for llvm::ModulePass' ../lib/libSeaDsaAnalysis.a(RemovePtrToInt.cc.o):(.data.rel.ro._ZTIN7sea_dsa14RemovePtrToIntE[_ZTIN7sea_dsa14RemovePtrToIntE]+0x10): undefined reference to typeinfo for llvm::FunctionPass'
../lib/libSeaDsaAnalysis.a(DsaGlobal.cc.o):(.data.rel.ro._ZTIN7sea_dsa13DsaGlobalPassE[_ZTIN7sea_dsa13DsaGlobalPassE]+0x10): undefined reference to typeinfo for llvm::ModulePass' ../lib/libSeaDsaAnalysis.a(DsaBottomUp.cc.o):(.data.rel.ro._ZTIN7sea_dsa8BottomUpE[_ZTIN7sea_dsa8BottomUpE]+0x10): undefined reference to typeinfo for llvm::ModulePass'
../lib/libSeaDsaAnalysis.a(DsaLocal.cc.o):(.data.rel.ro._ZTIN7sea_dsa5LocalE[_ZTIN7sea_dsa5LocalE]+0x10): undefined reference to `typeinfo for llvm::ModulePass'
collect2: error: ld returned 1 exit status
tools/CMakeFiles/seadsa.dir/build.make:121: recipe for target 'tools/seadsa' failed
make[2]: *** [tools/seadsa] Error 1
CMakeFiles/Makefile2:1213: recipe for target 'tools/CMakeFiles/seadsa.dir/all' failed
make[1]: *** [tools/CMakeFiles/seadsa.dir/all] Error 2
Makefile:129: recipe for target 'all' failed
make: *** [all] Error 2
Any suggestion will be appreciated.

sea-dsa crashes on a small example

command line:

seadsa ./oggenc2.ll --oll=/tmp/out.ll --dsa-stats --sea-dsa=cs --sea-dsa-stats=true -log=dsa-global --log=dsa-bu

Input file oggenc2.ll:

; ModuleID = './oggenc.ll'
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

@optarg = external global i8*, align 8
@nextchar = internal unnamed_addr global i32* null, align 8
@llvm.used = appending global [4 x i8*] [i8* bitcast (void ()* @seahorn.fail to i8*), i8* bitcast (void (i1)* @verifier.assume to i8*), i8* bitcast (void (i1)* @verifier.assume.not to i8*), i8* bitcast (void ()* @verifier.error to i8*)], section "llvm.metadata"

; Function Attrs: nounwind uwtable
define internal fastcc i32 @fn1() unnamed_addr #0 {
_1:
  %_2 = call i32 @verifier.nondet.1() #3
  %_3 = load i8*, i8** @optarg, align 8, !tbaa !1
  %_ret = tail call i32 (i8*, ...) bitcast (i32 (...)* @optstrdup to i32 (i8*, ...)*)(i8* %_3) #3
  ret i32 %_2
}

declare i32 @optstrdup(...) #1

; Function Attrs: nounwind uwtable
define internal fastcc i32 @fn2() unnamed_addr #0 {
_1:
  %_2 = call i32 @verifier.nondet.1() #3
  %_3 = tail call i32 (...) @fn3() #3
  %_4 = load i32*, i32** @nextchar, align 8, !tbaa !1
  %_5 = sext i32 %_3 to i64
  %_6 = getelementptr inbounds i32, i32* %_4, i64 %_5
  %_store = getelementptr inbounds i32, i32* %_6, i64 1
  store i32* %_store, i32** @nextchar, align 8, !tbaa !1
  store i32* %_store, i32** bitcast (i8** @optarg to i32**), align 8, !tbaa !1
  ret i32 %_2
}

declare i32 @fn3(...) #1

define i32 @main() {
_store:
  store i32* null, i32** @nextchar, align 8
  %_2 = call fastcc i32 @fn1()
  %_ret = call fastcc i32 @fn2()
  ret i32 42
}

declare void @verifier.assume(i1)

declare void @verifier.assume.not(i1)

declare void @seahorn.fail()

; Function Attrs: noreturn
declare void @verifier.error() #2

declare i32 @verifier.nondet.1()

attributes #0 = { nounwind uwtable "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #2 = { noreturn }
attributes #3 = { nounwind }

!llvm.ident = !{!0}

!0 = !{!"clang version 3.8.1 (tags/RELEASE_381/final)"}
!1 = !{!2, !2, i64 0}
!2 = !{!"any pointer", !3, i64 0}
!3 = !{!"omnipotent char", !4, i64 0}
!4 = !{!"Simple C/C++ TBAA"}

Segment fault on GlobalValue.h:575:32

Hi, for the following bitcode (LLVM 10)

error.bc.zip

sea-dsa throws a segment fault

[]$ /home/legend/sea-dsa/build/bin/seadsa --sea-dsa=butd-cs  --sea-dsa-aa-eval err.bc
 #0 0x00005627bc485fab llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/home/legend/sea-dsa/build/bin/seadsa+0xe23fab)
 #1 0x00005627bc483b54 llvm::sys::RunSignalHandlers() (/home/legend/sea-dsa/build/bin/seadsa+0xe21b54)
 #2 0x00005627bc483ca9 SignalHandler(int) (/home/legend/sea-dsa/build/bin/seadsa+0xe21ca9)
 #3 0x00007efd458840f0 __restore_rt (/usr/lib/libpthread.so.0+0x140f0)
 #4 0x00005627bbf97881 llvm::GlobalValue::getParent() /usr/include/llvm/IR/GlobalValue.h:575:32
 #5 0x00005627bbf97881 getModuleFromQuery /home/legend/sea-dsa/lib/seadsa/SeaDsaAliasAnalysis.cc:50:33
 #6 0x00005627bbf97881 seadsa::SeaDsaAAResult::alias(llvm::MemoryLocation const&, llvm::MemoryLocation const&, llvm::AAQueryInfo&) /home/legend/sea-dsa/lib/seadsa/SeaDsaAliasAnalysis.cc:144:39
 #7 0x00005627bb7448ab llvm::AAResults::alias(llvm::MemoryLocation const&, llvm::MemoryLocation const&, llvm::AAQueryInfo&) (/home/legend/sea-dsa/build/bin/seadsa+0xe28ab)
 #8 0x00005627bb760b5e llvm::BasicAAResult::aliasCheck(llvm::Value const*, llvm::LocationSize, llvm::AAMDNodes, llvm::Value const*, llvm::LocationSize, llvm::AAMDNodes, llvm::AAQueryInfo&, llvm::Value const*, llvm::Value const*) (/home/legend/sea-dsa/build/bin/seadsa+0xfeb5e)
 #9 0x00005627bb763243 llvm::BasicAAResult::alias(llvm::MemoryLocation const&, llvm::MemoryLocation const&, llvm::AAQueryInfo&) (/home/legend/sea-dsa/build/bin/seadsa+0x101243)
#10 0x00005627bb7448ab llvm::AAResults::alias(llvm::MemoryLocation const&, llvm::MemoryLocation const&, llvm::AAQueryInfo&) (/home/legend/sea-dsa/build/bin/seadsa+0xe28ab)
#11 0x00005627bb744bcf llvm::AAResults::alias(llvm::MemoryLocation const&, llvm::MemoryLocation const&) (/home/legend/sea-dsa/build/bin/seadsa+0xe2bcf)
#12 0x00005627bb74f5f0 llvm::AAEvaluator::runInternal(llvm::Function&, llvm::AAResults&) (/home/legend/sea-dsa/build/bin/seadsa+0xed5f0)
#13 0x00005627bb751e14 llvm::AAEvalLegacyPass::runOnFunction(llvm::Function&) (/home/legend/sea-dsa/build/bin/seadsa+0xefe14)
#14 0x00005627bbd64ca8 llvm::FPPassManager::runOnFunction(llvm::Function&) (/home/legend/sea-dsa/build/bin/seadsa+0x702ca8)
#15 0x00005627bbd65e2d llvm::FPPassManager::runOnModule(llvm::Module&) (/home/legend/sea-dsa/build/bin/seadsa+0x703e2d)
#16 0x00005627bbd661c0 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/legend/sea-dsa/build/bin/seadsa+0x7041c0)
#17 0x00005627bb7066f3 main /home/legend/sea-dsa/tools/seadsa.cc:234:3
#18 0x00007efd452f4152 __libc_start_main (/usr/lib/libc.so.6+0x28152)
#19 0x00005627bb73dabe _start (/home/legend/sea-dsa/build/bin/seadsa+0xdbabe)
Stack dump:
0.	Program arguments: /home/legend/sea-dsa/build/bin/seadsa --sea-dsa=butd-cs --sea-dsa-aa-eval /tmp/diffpts/crash/diff_input-1_4.c.bc
1.	Running pass 'Function Pass Manager' on module '/tmp/diffpts/crash/diff_input-1_4.c.bc'.
2.	Running pass 'Exhaustive Alias Analysis Precision Evaluator' on function '@func_29'
Segmentation fault (core dumped)

Question: sea-dsa does not produce memory graph when main function has internal linkage type.

Hello sea-dsa developers,

I tried to run sea-dsa on a file that is processed with the LLVM internalization pass (opt flag -internalize). No memory graphs are produced. When I removed the internal keyword in the definition of main function, it can generate a correct one. So I was wondering if sea-dsa should reject this type of program since something is probably wrong with main function having internal linkage type. SMACK seems to have a need for the internalization pass so we hope there's a way to let sea-dsa process this type of program.

Problem installing sea-dsa

I've been following the instructions (at least I think so). I installed LLVM 3.8 from their official package.
Then I run the following:

cmake -DCMAKE_INSTALL_PREFIX=/home/vagrant/sea-dsa/install/ -DLLVM_DIR=/usr/share/llvm-3.8/cmake ..

This leads to the following error:

-- The C compiler identification is GNU 4.8.4
-- The CXX compiler identification is GNU 4.8.4
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Compiling sea-dsa without sanity checks
-- Boost version: 1.55.0
-- Could NOT find Subversion (missing:  Subversion_SVN_EXECUTABLE) 
CMake Error at /usr/share/cmake-2.8/Modules/ExternalProject.cmake:1241 (message):
  error: could not find svn for checkout of llvm
Call Stack (most recent call first):
  /usr/share/cmake-2.8/Modules/ExternalProject.cmake:1805 (_ep_add_download_command)
  CMakeLists.txt:51 (ExternalProject_Add)


-- Configuring incomplete, errors occurred!

It seems that LLVM was not found for some reason. Any ideas why?

Problematic copy of metadata in RemovePtrToIntPass

Hello sea-dsa developers,

It seems the copying of metadata in RemovePtrToIntPass is problematic. Specifically, load instructions can carry range metadata (https://llvm.org/docs/LangRef.html#range-metadata). So, if the source of the inttoptr instruction is a load instruction with such metadata, this pass would generate an invalid load instruction since the metadata is intended for an integer type. I have no idea why a compiler would produce such code but rustc does it.

How to query DsaInfo for Context Sentitive information

Hello,
what is the correct way to query context-sensitive information from DsaInfo in an llvm pass?
I'm currently querying DsaInfo through a wrapper similar to seahorn::SeaDsa in SimpleMemoryCheck.cc

For each pointer ptr accessing memory in a function F, I call DsaInfo::getDsaGraph(F), and then I obtain the corresponding Cell with getCell(ptr) from the graph, which gives me the Node and finally the AllocSites.

However, I would like to be able to query pointer information of ptr in F with different contexts. But I didn't find a way to specify the caller of F under which I would like to retrieve pointer information.

sample input

int glob1 = 0;
int glob2 = 0;

__attribute_noinline__ int foo(int* ptr) {
    return *ptr + 1;
}

__attribute_noinline__ int func1(int* ptr) {
    return foo(ptr);
}

__attribute_noinline__ int func2(int* ptr) {
    return foo(ptr);
}

int main(int argc, char** argv) {
    return func1(&glob1) + func2(&glob2);
}

Here for example I would be able to differentiate that the memory access in foo points to glob1 when called from func1, and to glob2 when called from func2.

However, compiling the example with -O1 -flto -g and analysing it in the custom pass (which either uses BottomUpTopDownGlobalAnalysis or ContextSensitiveGlobalAnalysis) printing the node that corresponds to *ptr in foo I get:

Node 0x...: flags=[tCSR] size=4 types={0:i32} links=[]  alloca sites=[@glob2 = internal global i32 0, align 4, !dbg !6,@glob1 = internal global i32 0, align 4, !dbg !0]

that contains both glob1 and glob2, which I think is expected.

But if I try to get the Graph of func1 or func2 with DsaInfo::getDsaGraph, there is no cell that corresponds to the llvm value of ptr. So how should I specify the callsite information while querying pointer information?

version

  • sea-dsa: current dev10 branch
  • llvm: 10

callee not simulated by caller

These examples produce a callee not simulated by caller error (each with a different number) callee-caller-errs.zip

seahorn options:
seahorn --sea-dsa-stats --horn-nondet-undef --keep-shadows=true --sea-dsa-type-aware --sea-dsa=cs -horn-inter-proc -horn-sem-lvl=mem --horn-step=large --horn-solve --horn-global-constraints=true --horn-stats --horn-singleton-aliases=true --horn-pdr-contexts=600 --horn-ignore-calloc=true --horn-ignore-memset=true --horn-shadow-mem-optimize=false --horn-iuc=1 --horn-iuc-arith=1 --horn-weak-abs=true --horn-use-mbqi=true --horn-fp-internal-writer=false <errX.bc>

Upgrade sea-dsa to support LLVM 12

Hello sea-dsa developers,

I'm trying to upgrade SMACK to support LLVM 12. So I'm also working on upgrading sea-dsa to support LLVM 12. Is it possible that you create a dev12 branch based on dev11 so I can submit a PR?

Thanks,
Shaobo

Behavior of `accessedTypes` with respect to memset/memcpy operations

Hello sea-dsa developers,

I encountered an issue where accessedTypes does not contain i8 for node that involves store a i64 and memset (please see the C program below). I was wondering if it's expected. The accessedTypes only contains a 'i64'.

#include <string.h>
#include "smack.h"

int main(void) {
  unsigned long long x = 0;
  memset(&x, 1, sizeof(x));
  assert(x == 1);
  return 0;
}

I also attached a bitcode file generated for this program. I used llvm8.

memset1.txt

Thanks,
Shaobo

Imprecise call graph

Hello sea-dsa developers,

sea-dsa of dev10 branch appears to generate an imprecise call graph for the attached C file. Specifically, parse_input should call at least c1. But the call graph produced and attached only shows that it calls strcmp.

To reproduce the call graph, please use the following commands,

clang -c -emit-llvm -S bftpd_2.i
seadsa --sea-dsa-callgraph-dot bftpd_2.ll
dot -Tpdf callgraph.dotdot -o callgraph.pdf

bftpd_2.txt
callgraph.pdf

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.