Comments (13)

ameily avatar ameily commented on May 30, 2024

I'm seeing an example of this in the cat coreutils program, in the fdadvise function:

C Code

void fdadvise (int fd, off_t offset, off_t len, fadvice_t advice)
{
#if HAVE_POSIX_FADVISE
  // ignore_value is a macro that expands to: (void)(X)
  ignore_value (posix_fadvise (fd, offset, len, advice));
#endif
}

Disassembly

   0x080495eb <main+859>:	call   0x804a3d0 <fdadvise>
   ; .....
   0x804a3d0 <fdadvise>:	jmp    0x8049200 <posix_fadvise64@plt>

Recovered Bitcode

; linked.ll

; fdadvise function
define i64 @Func_804A3D0(%struct.CPUX86State* %0) #3 {
entry:
  store i32 134521808, i32* @PC, align 4, !inststart !11
  store i32 134517248, i32* @PC, align 4, !inststart !11  ; address of posix_fadvise64
  ret i64 0, !lastpc !77
}
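
The constants stored to @PC in the recovered bitcode are decimal addresses. As a reading aid, here is a minimal Python sketch that converts them back to hex so they can be compared against the disassembly. The helper name `pc_stores` is made up for illustration and is not part of binrec.

```python
import re

# Matches constant stores to the virtual program counter, e.g.
#   store i32 134521808, i32* @PC, align 4, !inststart !11
STORE_PC = re.compile(r"store i32 (\d+), i32\* @PC")


def pc_stores(ir_text):
    """Return the hex form of every constant stored to @PC, in order."""
    return [hex(int(m.group(1))) for m in STORE_PC.finditer(ir_text)]


ir = """
  store i32 134521808, i32* @PC, align 4, !inststart !11
  store i32 134517248, i32* @PC, align 4, !inststart !11
  ret i64 0, !lastpc !77
"""

print(pc_stores(ir))  # ['0x804a3d0', '0x8049200']
```

The first value is the address of fdadvise and the second is the posix_fadvise64 PLT entry, matching the jmp in the disassembly.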

from binrec-tob.

ameily avatar ameily commented on May 30, 2024

This has been confirmed to be a limitation of binrec: jmp-based indirect function calls are not implemented.

ameily avatar ameily commented on May 30, 2024

The ln sample fails to lift when performing a hard link because it makes an indirect function call to linkat from rpl_linkat. These functions appear to be provided by gnulib.

Dump of assembler code for function rpl_linkat:
   # ... instructions ....
   0x0804f12f <+479>:	pop    ebx
   0x0804f130 <+480>:	pop    esi
   0x0804f131 <+481>:	pop    edi
   0x0804f132 <+482>:	pop    ebp
   0x0804f133 <+483>:	jmp    0x8049160 <linkat@plt>

ameily avatar ameily commented on May 30, 2024

I'm seeing this on dd as well, with the rpl_fclose function:

Dump of assembler code for function rpl_fclose:
   # ... instructions ...
   0x08050b34 <+132>:	pop    ebx
   0x08050b35 <+133>:	pop    esi
   0x08050b36 <+134>:	pop    edi
   0x08050b37 <+135>:	jmp    0x80490c0 <fclose@plt>

ameily avatar ameily commented on May 30, 2024

This is also occurring on expand:

Dump of assembler code for function rpl_fflush:
   # ... instructions ...
   0x0804bc93 <+35>:	pop    ebx
   0x0804bc94 <+36>:	jmp    0x8048c80 <fflush@plt>
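
All of the failing samples so far share the same shape: a function ends with a jmp to a PLT entry instead of a call followed by ret. A rough Python sketch for spotting that pattern in gdb-style disassembly text is below. The function name `plt_tail_jumps` and the regular expression are illustrative assumptions, not binrec tooling, and the pattern only covers listings formatted like the ones in this thread.

```python
import re

# One gdb disassembly line of the form:
#    0x0804bc94 <+36>:	jmp    0x8048c80 <fflush@plt>
PLT_JMP = re.compile(
    r"^\s*(0x[0-9a-f]+) <\+\d+>:\s+jmp\s+(0x[0-9a-f]+) <(\w+)@plt>"
)


def plt_tail_jumps(disassembly):
    """Return (jmp_address, plt_address, symbol) for each jmp to a PLT entry."""
    hits = []
    for line in disassembly.splitlines():
        m = PLT_JMP.match(line)
        if m:
            hits.append((int(m.group(1), 16), int(m.group(2), 16), m.group(3)))
    return hits


listing = """Dump of assembler code for function rpl_fflush:
   0x0804bc93 <+35>:\tpop    ebx
   0x0804bc94 <+36>:\tjmp    0x8048c80 <fflush@plt>
"""

for site, target, symbol in plt_tail_jumps(listing):
    print(f"{site:#x} -> {symbol}@plt ({target:#x})")
# prints: 0x804bc94 -> fflush@plt (0x8048c80)
```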

ameily avatar ameily commented on May 30, 2024

Interestingly, with all the updates to S2E and binrec, the insert_calls pass is now correctly identifying indirect function calls and bailing early.

binrec.errors.BinRecLiftingError: failed to perform initial lifting of LLVM bitcode:
  cat: [insert_calls] false && "This is not implemented yet.
  Also, is metadata set correctly with indirect calls?"

Previously, this wasn't being detected and, instead, the recovered binary would crash.

michaelbrownuc avatar michaelbrownuc commented on May 30, 2024

Probably because we replaced all the assert statements with honest to goodness errors.

ameily avatar ameily commented on May 30, 2024

It looks like the function being indirectly called is removed in the recover_functions pass. My initial thought was to update insert_calls, which runs after recover_functions, to unconditionally branch to the target function / basic block. However, the target function/BB does not exist anymore.

michaelbrownuc avatar michaelbrownuc commented on May 30, 2024

Do you know why it's being removed? It may be because there is no trace info linking the calling block to the indirectly called function.

If so, we would need to figure out why the S2E plugins are not catching the indirect call (or, if they are, why they are not adding it to the TraceInfo).

ameily avatar ameily commented on May 30, 2024

> Do you know why it's being removed?

It looks like the function that is being called indirectly is merged into the body of the function performing the indirect call.

; 0x804a3d0 is the wrapper function, which performs an indirect call
define internal void @Func_804A3D0() {
BB_804A3D0:
  store i32 134521808, i32* @PC, align 4, !inststart !26
  store i32 134517248, i32* @PC, align 4, !inststart !26
  ret void, !lastpc !278, !succs !279

; 0x8049200 is the target function that is indirectly called
; in this scenario, this function is posix_fadvise
BB_8049200:                                       ; No predecessors!
  store i32 134517248, i32* @PC, align 4, !inststart !26
  store i32 134517248, i32* @PC, align 4, !inststart !26
  %tmp0_v.i = call i32 @helper_ldl_mmu(%struct.CPUX86State* null, i32 134566148, i32 33, i8* null)
  store i32 %tmp0_v.i, i32* @PC, align 4
  ret void, !lastpc !280, !succs !72, !extern_symbol !281

BB_8049200_join:                                  ; No predecessors!
}

ameily avatar ameily commented on May 30, 2024

I think the cat sample is very specific because the Func_8049200 function is actually a library function (posix_fadvise). So, the recover_functions pass is incorrectly merging the stub function, which performs an indirect function call, and the library function being indirectly called.

ameily avatar ameily commented on May 30, 2024

I've confirmed that recover_functions appears to be treating jmp-based indirect function calls as part of the function body, which is an issue when the function being indirectly called is in an external library. The cat sample has 3 instances of this:

  • rpl_fclose @ 0x0804fbc7 <+135>: jmp 0x8048ec0 <fclose@plt>
  • rpl_fflush @ 0x0804fc1f <+79>: jmp 0x8048e50 <fflush@plt>
  • fdadvise @ 0x0804a3d0 <+0>: jmp 0x8049200 <posix_fadvise64@plt>

The jmp target is listed in the function info entry_pc_to_bb_pcs list. For example, the rpl_fclose function has a BB list of:

Func_804FB40  # entry point
Func_804FB40
Func_804FB50
Func_804FB57
Func_804FB60
Func_804FB67
Func_804FB70
Func_804FBC0  # ends with "jmp fclose"
Func_8048EC0  # fclose in libc.so

The result is that the entire rpl_fclose function is recovered, including the body of the library fclose function. So, instead of a call to fclose, it's essentially statically compiled into the binary (albeit in a broken way because the function will fail to lift).

My hunch is that I need to first update recover_functions so that it does not merge functions when one is external and the other is not, which I believe would indicate a jmp-based function "call".
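
The idea of not merging external blocks into an internal function can be sketched in Python as below. This is only an illustration under stated assumptions: `split_at_external` and the `plt_entries` set are hypothetical, and it is assumed that the set of external (PLT) addresses is available in this form at that point in the pipeline. The real pass operates on LLVM bitcode and trace info rather than on plain lists.

```python
def split_at_external(bb_pcs, external_pcs):
    """Separate a function's basic block PCs into internal blocks and
    external targets that were reached via a jmp (tail call candidates)."""
    internal = [pc for pc in bb_pcs if pc not in external_pcs]
    tail_targets = [pc for pc in bb_pcs if pc in external_pcs]
    return internal, tail_targets


# Basic block PCs recorded for rpl_fclose, taken from the list above.
rpl_fclose_bbs = [
    0x804FB40, 0x804FB50, 0x804FB57, 0x804FB60,
    0x804FB67, 0x804FB70, 0x804FBC0, 0x8048EC0,
]

# Hypothetical set of PLT entry addresses: the three jmp targets listed above.
plt_entries = {0x8048EC0, 0x8048E50, 0x8049200}

internal, tail_targets = split_at_external(rpl_fclose_bbs, plt_entries)
print([hex(pc) for pc in tail_targets])  # ['0x8048ec0']
```

Filtering the block list this way only keeps the library body out of the recovered function. It does not by itself turn the jmp into a call or preserve the arguments that the callee expects.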

ameily avatar ameily commented on May 30, 2024

Binrec does not support lifting tail calls to external functions and, for the time being, this issue will not be addressed.

Example

The coreutils cat sample has a tail call to the libc function posix_fadvise.

C Code

void fdadvise (int fd, off_t offset, off_t len, fadvice_t advice) {
  posix_fadvise (fd, offset, len, advice);
}

Disassembly

Dump of assembler code for function fdadvise:
   0x0804a3d0 <+0>:	jmp    0x8049200 <posix_fadvise64@plt>
End of assembler dump.

Captured Bitcode

; linked.ll
; Function Attrs: alwaysinline
define i64 @Func_804A3D0(%struct.CPUX86State* %0) #3 {
entry:
  ; internal function: fdadvise()
  store i32 134521808, i32* @PC, align 4, !inststart !12
  ; 134517248 == 0x8049200
  ; jmp 0x8049200 (posix_fadvise)
  store i32 134517248, i32* @PC, align 4, !inststart !12
  ret i64 0, !lastpc !78
}

; Function Attrs: alwaysinline
define i64 @Func_8049200(%struct.CPUX86State* %0) #3 {
entry:
  ; libc posix_fadvise function
  store i32 134517248, i32* @PC, align 4, !inststart !12
  store i32 134517248, i32* @PC, align 4, !inststart !12
  %tmp0_v = call i32 @helper_ldl_mmu(%struct.CPUX86State* %0, i32 134566148, i32 33, i8* null)
  store i32 %tmp0_v, i32* @PC, align 4
  ret i64 0, !lastpc !79
}

After merging, the function looks like:

define i64 @Func_804A3D0(%struct.CPUX86State* %0) #3 {
entry:
  ; internal function: fdadvise()
  store i32 134521808, i32* @PC, align 4, !inststart !12
  ; 134517248 == 0x8049200
  ; jmp 0x8049200 (posix_fadvise)
  store i32 134517248, i32* @PC, align 4, !inststart !12
  ret i64 0, !lastpc !78

BB_8049200:
  ; libc posix_fadvise function
  store i32 134517248, i32* @PC, align 4, !inststart !12
  store i32 134517248, i32* @PC, align 4, !inststart !12
  %tmp0_v = call i32 @helper_ldl_mmu(%struct.CPUX86State* %0, i32 134566148, i32 33, i8* null)
  store i32 %tmp0_v, i32* @PC, align 4
  ret i64 0, !lastpc !79
}

The merged function fails to lift for multiple reasons.

Design Limitations and Gaps

There are multiple limitations and gaps within Binrec that would need to be addressed to support tail calls into libraries:

  1. S2E does not create a new translation block when it encounters a jmp to a library function, so the body of the target function is traced in this scenario.
  2. Binrec inlines functions early in the lifting process, which causes the original call site to lose track of the arguments that would be forwarded to the target function.
  3. The target function would eventually be lifted to an extern, and LLVM does not support branching to an extern function; the function must be called. However, the original arguments would have been inlined and potentially lost.
  4. The target function is not a successor in the recorded CFG.
