Comments (26)

ivg commented on May 8, 2024

We do not support OCaml versions older than 4.01.0. As for the build errors, please follow the instructions in the README. In particular, you need to perform these steps:

$ oasis setup
$ ./configure --prefix=$(opam config var prefix)
$ make
$ make install

from bap.

belluru commented on May 8, 2024

Hi,

Thanks for your help.
I have installed oasis and followed these steps:

oasis setup
./configure
make

When I run the make command, I get an error:

${llvm_mainlib}  variable is not set.

Kindly let me know what I might be missing.
I installed llvm-3.4 using the following command:

"sudo yum install llvm-3.4"

ivg commented on May 8, 2024

This "error" means that the build system can't find an LLVM installation (I'm currently working on improving the actual message).

The build system relies on the llvm-config executable to find out where LLVM is installed. This executable can be named differently on different systems, so the default value of llvm-config-3.4 doesn't always work. I don't know what llvm-config is called on your system, but if it is just llvm-config, then you can try:

$ ./configure --with-llvm-config=llvm-config
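
If you are not sure which name is present, the lookup can be scripted. A minimal sketch, assuming a POSIX shell; the helper name first_available is made up for illustration, and the candidate list in the usage comment only contains the two names mentioned above:

```shell
# Print the first candidate name that resolves to a command on PATH.
# Returns non-zero if none of the candidates is found.
first_available() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Possible usage (not run here):
# llvm_config=$(first_available llvm-config-3.4 llvm-config) &&
#     ./configure --with-llvm-config="$llvm_config"
```

The candidates are tried in the order given, so the versioned name can be listed first.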

belluru commented on May 8, 2024

Hi,

Your patch worked for me.
Now when I run make, I get the following error. I have searched on Google, but I was not able to understand the proposed solution. Could you please help me resolve it?

make
ocaml setup.ml -quiet -build 
Finished, 0 targets (0 cached) in 00:00:00.
File "_tags", line 560, characters 36-44:
Warning: the tag "pkg_core" is not used in any flag declaration, so it will have no effect; it may be a typo. Otherwise use `mark_tag_used` in your myocamlbuild.ml to disable this warning.
File "_tags", line 561, characters 36-50:
Warning: the tag "pkg_core_bench" is not used in any flag declaration, so it will have no effect; it may be a typo. Otherwise use `mark_tag_used` in your myocamlbuild.ml to disable this warning.
File "_tags", line 586, characters 25-33:
Warning: the tag "pkg_core" is not used in any flag declaration, so it will have no effect; it may be a typo. Otherwise use `mark_tag_used` in your myocamlbuild.ml to disable this warning.
File "_tags", line 587, characters 25-39:
Warning: the tag "pkg_core_bench" is not used in any flag declaration, so it will have no effect; it may be a typo. Otherwise use `mark_tag_used` in your myocamlbuild.ml to disable this warning.
+ /bin/c++ -std=c++11 -I/usr/local/include  -DNDEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -O3 -fomit-frame-pointer -std=c++11 -fvisibility-inlines-hidden -fno-exceptions -fno-rtti -fPIC -ffunction-sections -fdata-sections -Wcast-qual -c lib/bap_disasm/llvm_disasm.cpp -o lib/bap_disasm/llvm_disasm.o
lib/bap_disasm/llvm_disasm.cpp:1:32: fatal error: llvm/ADT/OwningPtr.h: No such file or directory
 #include <llvm/ADT/OwningPtr.h>
                                ^
compilation terminated.
Command exited with code 1.
Compilation unsuccessful after building 288 targets (278 cached) in 00:00:01.
make: *** [build] Error 1

maverickwoo commented on May 8, 2024

At present, we support LLVM 3.4 only. (OwningPtr.h has been removed in LLVM 3.5.)

ivg commented on May 8, 2024

It looks like you have llvm-3.5 or later. We only support llvm-3.4 for now.
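
A quick way to confirm which LLVM the build will see is to compare the reported version against the 3.4 series. A minimal sketch, assuming a POSIX shell and a version string of the form major.minor or major.minor.patch; is_llvm_3_4 is a hypothetical helper name:

```shell
# Succeed only for version strings in the 3.4 series, e.g. "3.4" or "3.4.2".
is_llvm_3_4() {
    case "$1" in
        3.4|3.4.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible usage (not run here; assumes the tool is named llvm-config):
# is_llvm_3_4 "$(llvm-config --version)" || echo "unsupported LLVM version" >&2
```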

belluru commented on May 8, 2024

Hi,

Thanks for your help. I was able to install BAP successfully.

I am trying to lift a binary file using the toil command, but I get the following error.
My aim is to lift a binary file to IL and then generate a CFG.
Kindly guide me.

toil
bash: toil: command not found

ivg commented on May 8, 2024

There is no toil command in the new BAP. There is a bap-mc command that can currently disassemble binary data, but at this time it will not output BIL. If you're programming in OCaml, you can use the Disasm.Arm module to work with BIL directly. Otherwise, you need to wait until issue #4 is resolved (you may also be interested in #40 and the wiki for discussion). Also, there is a readbin application, shipped with BAP, that you can play with. It will output BIL for all functions in an ARM binary.

nanzano commented on May 8, 2024

Hi,
I am new to binary analysis, so I followed all the steps indicated here https://github.com/BinaryAnalysisPlatform/bap to install Opam, OCaml, OCaml-top (editor), and bap, and everything installed successfully. My goal is to write some simple code in OCaml, generate the binary code (the executable file), use bap to disassemble it, and generate a data flow graph. But I have the following issues:
1) When I compile the simple code given,

open Bap.Std 
let main project = print_endline "Hello, World"
let () = Project.register_pass "hello-world" main

I get the error "Unbound module Bap", and when I use the plugin command $ bapbuild hello_world.plugin the following message appears:

<< + ocamlfind ocamlc -c -g -annot -bin-annot -short-paths -thread -syntax camlp4o -package re.posix -package fileutils -package cmdliner -package ezjsonm -package ocamlgraph -package variantslib.syntax -package fieldslib.syntax -package comparelib.syntax -package sexplib.syntax -package core_kernel -package bap.plugins -package bap -o hello_world.cmo hello_world.ml
File "hello_world.ml", line 3, characters 45-49:
Error: This expression has type project -> unit
       but an expression was expected of type project -> project
       Type unit is not compatible with type project 
Command exited with code 2.
Hint: Recursive traversal of subdirectories was not enabled for this build,
  as the working directory does not look like an ocamlbuild project (no
  '_tags' or 'myocamlbuild.ml' file). If you have modules in subdirectories,
  you should add the option "-r" or create an empty '_tags' file.
  To enable recursive traversal for some subdirectories only, you can use the
  following '_tags' file:
      true: -traverse
      <dir1> or <dir2>: traverse      
Compilation unsuccessful after building 2 targets (1 cached) in 00:00:04. >>

But by using the code

open Bap.Std 
let main project =print_endline "Hello, World"

I still get the "Unbound module Bap" error message, but I am able to execute the commands $ bapbuild hello_world.plugin and $ bap /bin/ls -lhello-world without any error message.
2) Another problem: it says that we have immediate access to the decompiled binary by using the command $ bap /bin/ls -lhello-world. My question is, how? I actually went to the /bin/ls file and disassembled it with an online disassembler, "ODA", but couldn't get any information from it regarding my file.

3) I wrote another simple program with just
print_string "Hello world!\n";;
and compiled it successfully with $ ocamlc -o hello hello.ml. The problem is that I tried to disassemble the executable hello file, but bap-mc reads strings, not files, and readbin just gives information about the file. Is there a proper way to do this with bap, since the toil tool described in the documentation no longer exists in the new version of BAP?

Sorry for the long comment, I really need some answers regarding this.
Thank you

ivg commented on May 8, 2024

1) When I compile the simple code given,

open Bap.Std 
let main project = print_endline "Hello, World"
let () = Project.register_pass "hello-world" main

First of all, this is a plugin, and it should be compiled using bapbuild hello_world.plugin. There was a small typo in the example (fixed now); the correct version is:

open Bap.Std 
let main project = print_endline "Hello, World"
let () = Project.register_pass' "hello-world" main

Note the ' (prime) symbol. Basically, when you register a pass, you can either register it just for a side effect or as a real transformation. In the former case, you use the function that ends with a prime (') and your function must return a value of type unit. This is described in more detail here.

2) Another problem: it says that we have immediate access to the decompiled binary by using the command $ bap /bin/ls -lhello-world. My question is, how? I actually went to the /bin/ls file and disassembled it with an online disassembler, "ODA", but couldn't get any information from it regarding my file.

Can you rephrase the question? I'm not sure that I understand you.

3) I wrote another simple program with just
print_string "Hello world!\n";;
and compiled it successfully with $ ocamlc -o hello hello.ml. The problem is that I tried to disassemble the executable hello file, but bap-mc reads strings, not files, and readbin just gives information about the file. Is there a proper way to do this with bap, since the toil tool described in the documentation no longer exists in the new version of BAP?

  1. To compile plugins, you need to use the bapbuild tool.
  2. bap-mc is a different tool; you just need to use bap.

My goal is to write some simple code in OCaml, generate the binary code (the executable file), use bap to disassemble it, and generate a data flow graph

Here are some tips on how to proceed. You need to write a function that accepts a value of type project and returns a value of type unit. Everything you need is stored in the project data structure. It contains the disassembled program, all memory regions, the program lifted into IR, and much more. Start by reading the documentation, paying special attention to the Project module, which describes the interface to the project data type. (You don't need to create an instance yourself; it is provided as an input argument to the function that you register with Project.register_pass'.) Compile your code with

$ bapbuild yourcode.plugin

(assuming that you stored your code in yourcode.ml). Run it over your prepared binary with

$ bap -lyourcode yourbinary 

When run like this, the function that you've registered with Project.register_pass' will be called on a value of type project that is the result of disassembly, lifting, and semantic analysis of the binary stored in the file named yourbinary.

You can see examples of plugins in our plugin repository: https://github.com/BinaryAnalysisPlatform/bap-plugins
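
The two commands above follow a simple naming pattern: the plugin target and the -l argument are both derived from the source file name. A minimal dry-run sketch in POSIX shell that only prints the commands (plugin_commands is a hypothetical helper, not part of BAP):

```shell
# Print the build and run commands for a plugin source file and a target binary.
# Usage: plugin_commands yourcode.ml yourbinary
plugin_commands() {
    source_file=$1
    target=$2
    name=$(basename "$source_file" .ml)   # yourcode.ml -> yourcode
    printf 'bapbuild %s.plugin\n' "$name"
    printf 'bap -l%s %s\n' "$name" "$target"
}

plugin_commands yourcode.ml yourbinary
```

Printing instead of executing keeps the sketch safe to run on a machine without BAP installed.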

nanzano commented on May 8, 2024

Thank you very much for the quick answer. I am a little bit sick, but I will reply to you as soon as I feel better and have worked on it.
Thank you

nanzano commented on May 8, 2024

Hi,
I have been able to successfully run the small program above and get the output "Hello, world" in the shell by using the commands $ bapbuild hello_world.plugin and $ bap /bin/ls -lhello-world.
Now I created a binary file hello_worldb.bin and tried the command $ bap -lhello_world hello_worldb.bin
and got the following message:
Aborting because ("create image" (llvm (path (hello_worldb.bin)))).
Even giving it the same name didn't change anything. I'm not sure I really understood what you meant by "Run it over your prepared binary with $ bap -lyourcode yourbinary".
I also tried the print_sections.ml example and successfully got binary output in the shell, but got the same error as above when creating a simple print_sections.bin file.

ivg commented on May 8, 2024

By "binary" I meant a compiled program. This message means that bap cannot recognize the format of hello_worldb.bin. It should be a binary file, i.e., an executable.

nanzano commented on May 8, 2024

Hi,
Is there a command in bap to view the plugin file like this?
https://lh3.googleusercontent.com/rwABqBJ7t2tN72sLD2PGiMFBtL3aC0xvNbGLdW9Et9U=w1455-h818-no
(I suppose that "camlHello_world__code_begin" in the picture represents the disassembly of my hello world code?)

ivg commented on May 8, 2024

There are --dump and --output options. The latter may be named --phoenix if your version is not very recent.

Also, I assume that you understand the difference between a plugin and a target binary? The plugin is written by you and contains the code that does the analysis. The target of the analysis is the executable file that you are trying to analyze.

ghaffarinia commented on May 8, 2024

Hey ivg,

I am starting to work with bap. I can retrieve the BIL code of a binary. It would be a lot more useful for me if I could retrieve the OCaml encoding of the given BIL code. Is there any way to do so?

There is also an issue with lifting code to BIL. I wrote a stupid binary containing just a single instruction, "lea (%eax), %eax", and it seems that bap will not lift it. Could you please tell me what the problem is? If the problem is with disassembling, is it possible to give bap the actual assembly instruction rather than the binary code?

Thanks in advance.

ivg commented on May 8, 2024

I am starting to work with bap. I can retrieve the BIL code of a binary. It would be a lot more useful for me if I could retrieve the OCaml encoding of the given BIL code. Is there any way to do so?

I'm not sure that I understood you. Do you mean that you would like to process a binary using your own OCaml program, instead of just looking at the output? If so, this is covered in the documentation.

In short, you can write a plugin and do arbitrary transformations. Or you can write a standalone project.

There is also an issue with lifting code to BIL. I wrote a stupid binary containing just a single instruction, "lea (%eax), %eax", and it seems that bap will not lift it. Could you please tell me what the problem is? If the problem is with disassembling, is it possible to give bap the actual assembly instruction rather than the binary code?

By default, the bap utility assumes that a binary is a container in ELF, COFF, or Mach-O format, i.e., a file that is produced and packed by a compiler. If you assembled a small binary yourself, or used some other assembler, then you have a so-called raw binary, i.e., a file that doesn't contain any meta information, only code. In that case, you need to use the --binary option to specify that the input is raw, e.g.,

bap --binary=x86 raw.bin -dbil
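
To tell which case applies, you can look at the first bytes of the file. A minimal sketch, assuming a POSIX shell with od and tr; it only recognizes ELF (magic bytes 7f 45 4c 46), not COFF or Mach-O, and is_elf is a hypothetical helper name:

```shell
# Succeed if the file starts with the 4-byte ELF magic: 0x7f 'E' 'L' 'F'.
is_elf() {
    magic=$(od -An -tx1 -N4 "$1" | tr -d '[:space:]')
    [ "$magic" = "7f454c46" ]
}

# Possible usage (not run here):
# if is_elf raw.bin; then bap raw.bin -dbil; else bap --binary=x86 raw.bin -dbil; fi
```

A file that fails this check is not necessarily raw; it may simply be in one of the other container formats.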

ghaffarinia commented on May 8, 2024

Thanks for your quick response.
For the first part, I found that what I meant was IR code. Thanks again for your help.

For the second part, it was not a raw file. Actually, I used gcc to compile assembly code with only one instruction, and after linking I have an ELF file. However, bap will produce an empty function chunk, like the following:

begin(sub_400078)
begin(sub_400078_ENTRY) {

}
end(sub_400078_ENTRY)
end(sub_400078)

KennethAdamMiller commented on May 8, 2024

Are you looking to work with IR in a language other than OCaml?
On Oct 27, 2015 6:52 PM, "Masoud Ghaffarinia" wrote:

Thanks for your quick response.
For the first part, I found that what I meant was IR code. Thanks again for your help.

For the second part, it was not a raw file. Actually, I used gcc to compile assembly code with only one instruction, and after linking I have an ELF file. However, bap will produce an empty function chunk, like the following:

begin(sub_400078)
begin(sub_400078_ENTRY) {

}
end(sub_400078_ENTRY)
end(sub_400078)

ghaffarinia commented on May 8, 2024

Yes. Not too far, Coq.

ivg commented on May 8, 2024

However, bap will produce an empty function chunk.

It may be because bap was unable to find code in your file. By default, it uses an algorithm that finds function starts and marks them as roots of the recursive disassembling process. If something doesn't look like a function start, then it will be skipped as data. lea (%eax), %eax looks like data, so it is skipped. You can either direct bap to this point by specifying your own set of initial rules, or compile a normal binary from C code and use the -g option, so that bap can easily find and correctly name your functions.

If you want to play at a low level, like specifying your own instructions, then you can use bap-mc or just use the Disasm module directly. The linear_sweep function can help you.

ghaffarinia commented on May 8, 2024

It sounds. Thanks a lot.

ghaffarinia commented on May 8, 2024

or compile a normal binary from C code and use the -g option, so that bap can easily find and correctly name your functions.

I tried it, but it did not work. The output of objdump is as follows:

00000000004004b6 <asm_func>:
4004b6:    55            push   %rbp
4004b7:    48 89 e5      mov    %rsp,%rbp
4004ba:    cc            int3
4004bb:    67 8d 00      lea    (%eax),%eax
4004be:    cc            int3
4004bf:    5d            pop    %rbp
4004c0:    c3            retq

However, bap produces the following for the asm_func function when using bap <object_code> --dump=bil | sed -n '/begin(asm_func)/,/end(asm_func)/p':

begin(asm_func) 
  begin(asm_func_ENTRY) {
    RSP := RSP - 0x8:64
    mem64 := mem64 with [RSP, el]:u64 <- RBP
    RBP := RSP
    special (int3)
    special (int3)
    RBP := mem64[RSP, el]:u64
    RSP := RSP + 0x8:64
    ra_113 := mem64[RSP, el]:u64
    RSP := RSP + 0x8:64
    jmp ra_113
  }
  end(asm_func_ENTRY)
 end(asm_func)

As you can see, there is no IR code for the lea instruction between the int3 instructions.

ivg commented on May 8, 2024

Yep, the x86 lifter is not very complete, with lots of instructions missing. We're addressing this issue by lifting the most common instructions. lea is common, of course, but a 32-bit lea in x86-64 code is uncommon, so it is not lifted. You can create a feature request, of course, if you really expect to see it often in your problems.

$ echo "\x67\x8d\x00" | bap-mc --show-insn --show-bil
leal (%eax), %eax
{
  special (Lifter: ("Bap_disasm_x86_utils.Arch_exception(-583167065, \"disasm x86: unimplemented feature: to_ir: Lea\")"))
}

If you use the 64-bit version of lea, e.g., lea (%rax), %rax, then it will work:

$ echo "\x48\x8d\x00" | bap-mc --show-insn --show-bil
leaq (%rax), %rax
{
  RAX := low:64[RAX]
}

My personal advice (although it may not suit you, since I don't know your tasks) is to stick to writing test programs in C. BAP is tailored for real-world programs, not for handcrafted examples.

Anyway, we have an ongoing project that will increase the completeness of the x86 lifters significantly. But I can't give any estimate of when it will mature into something real.

Also, the ARM lifter is much more complete. And honestly speaking, playing with ARM binaries is much easier.
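
Since bap-mc takes the instruction bytes as text, a small raw file has to be converted first. A minimal sketch, assuming a POSIX shell with od, tr, sed, and paste; hex_bytes is a hypothetical helper that prints the bytes in the comma-separated 0x.. form that bap-mc is also given elsewhere in this thread:

```shell
# Render the bytes of a file as comma-separated hex literals,
# e.g. a file holding 67 8d 00 becomes 0x67,0x8d,0x00
hex_bytes() {
    od -An -v -tx1 "$1" |
        tr -s '[:space:]' '\n' |        # one byte per line
        sed -e '/^$/d' -e 's/^/0x/' |   # drop blank lines, add the 0x prefix
        paste -s -d, -                  # join the lines with commas
}

# Possible usage (not run here):
# hex_bytes raw.bin | bap-mc --show-insn --show-bil
```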

ghaffarinia commented on May 8, 2024

Hi ivg,

I tried several programs and also read the BAP Handbook; however, I could not find how BAP captures the side effect that the EIP register changes when any instruction is executed. I can understand that labels handle the part of the program which is expected to be executed after a specific instruction, but if EIP is used in an instruction like "add eax, eip", then I think the EIP value is the value that the program started with.

Could you please explain how bap captures this side effect?

ivg commented on May 8, 2024

Hi Masoud,

First of all, I would like to stress that the BAP Handbook was written for legacy BAP and is outdated, so you shouldn't rely on information that you get from there.

Next, the answer to your question highly depends on whom you're asking. From a static analysis perspective, the program counter is just the address of an instruction. From a dynamic analysis perspective, it is part of the environment.

In modern BAP, the program counter register is implicit and hidden. Whenever you see an instruction like lea (%rip),%rax, the value of rip will be substituted with the current value of the program counter. For example:

$ echo '0x48,0x8d,0x05,0x00,0x00,0x00,0x00' | bap-mc --arch=x86-64 --show-bil --show-insn --addr=0x400000
leaq (%rip), %rax
{
  RAX := low:64[(0x0:64 + 0x400007:64) + 0x0:64]
}

So, we load the constant 0x400007 into %rax (I definitely need to add a simplification pass on this output, it is too unreadable). As you can see, the dynamic point of view starts to diverge from the static one: the value of the program counter is not equal to the address of the instruction. The reason is that on x86 the ip holds the address of the next instruction.

In contrast, on ARM the value stored in the program counter is equal to the address of the current instruction plus 8, i.e., pc points two instructions ahead of the current one.

$ echo '0x0f,0x10,0xa0,0xe1' | bap-mc --arch=arm --show-bil --show-insn --addr=0x400000
mov r1, pc
{
  R1 := 0x400008:32
}

Basically, each architecture defines the program counter differently, so we need to resolve this issue at the lifter level (because if pc escaped the lifter as is, then code that works on BIL would depend on the architecture, i.e., the pc would have different values on different architectures).
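
The constants in the two outputs above follow from simple address arithmetic. A minimal sketch in POSIX shell, assuming ARM (not Thumb) mode for the +8 offset; x86_pc and arm_pc are hypothetical helper names:

```shell
# Value of rip seen by an x86-64 instruction: the address of the next
# instruction, i.e. the instruction address plus its encoded length.
x86_pc() {
    printf '0x%x\n' $(( $1 + $2 ))
}

# Value of pc seen by an ARM-mode instruction: instruction address plus 8.
arm_pc() {
    printf '0x%x\n' $(( $1 + 8 ))
}

x86_pc 0x400000 7   # 7-byte leaq at 0x400000, prints 0x400007
arm_pc 0x400000     # mov r1, pc at 0x400000, prints 0x400008
```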

So %eip is actually hidden in BAP. Whenever you want to get the address of the current instruction, you can use the Insn.addr function. All relative jumps are properly resolved to jmp <dst>, where <dst> is an absolute address that is properly calculated with respect to each architecture's caveats, so that from a user perspective everything looks nice.

This was the low-level picture. From a higher-level perspective, the graph representation of a program uses an abstract label type to reify control flow.

We can go even higher and look at the formal semantics of BIL (section 4). This semantics is implemented in the BIL interpreter.

I hope that this sheds some light on the topic :)
