Comments (13)
I'm seeing an example of this in the cat coreutils program, in the fdadvise function:
C Code
void fdadvise (int fd, off_t offset, off_t len, fadvice_t advice)
{
#if HAVE_POSIX_FADVISE
// ignore_value is a macro that expands to: (void)(X)
ignore_value (posix_fadvise (fd, offset, len, advice));
#endif
}
Disassembly
0x080495eb <main+859>: call 0x804a3d0 <fdadvise>
; .....
0x804a3d0 <fdadvise>: jmp 0x8049200 <posix_fadvise64@plt>
Recovered Bitcode
; linked.ll
; fdadvise function
define i64 @Func_804A3D0(%struct.CPUX86State* %0) #3 {
entry:
store i32 134521808, i32* @PC, align 4, !inststart !11
store i32 134517248, i32* @PC, align 4, !inststart !11 ; address of posix_fadvise64
ret i64 0, !lastpc !77
}
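The PC stores in the recovered bitcode use decimal constants, which makes them hard to match against the disassembly. A minimal sketch for decoding them, assuming the IR is available as text (the helper name pc_stores is hypothetical, not part of binrec):

```python
import re

# Matches the constant stored to @PC, e.g. "store i32 134521808, i32* @PC"
PC_STORE = re.compile(r"store i32 (\d+), i32\* @PC")


def pc_stores(ir_text):
    """Return every constant stored to @PC, formatted as a hex address."""
    return [hex(int(m.group(1))) for m in PC_STORE.finditer(ir_text)]


# The body of Func_804A3D0 as quoted above
ir = """
store i32 134521808, i32* @PC, align 4, !inststart !11
store i32 134517248, i32* @PC, align 4, !inststart !11
ret i64 0, !lastpc !77
"""

print(pc_stores(ir))  # ['0x804a3d0', '0x8049200']
```

The first store is the address of fdadvise itself and the second is the jmp target, the posix_fadvise64 PLT stub.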
from binrec-tob.
This has been confirmed to be a limitation of binrec because jmp-based indirect function calls are not implemented.
from binrec-tob.
The ln sample is failing to lift when performing a hard link because it does an indirect function call to linkat from rpl_linkat. These functions appear to be provided by gnulib.
Dump of assembler code for function rpl_linkat:
# ... instructions ....
0x0804f12f <+479>: pop ebx
0x0804f130 <+480>: pop esi
0x0804f131 <+481>: pop edi
0x0804f132 <+482>: pop ebp
0x0804f133 <+483>: jmp 0x8049160 <linkat@plt>
from binrec-tob.
I'm seeing this on dd as well, with the rpl_fclose function:
Dump of assembler code for function rpl_fclose:
# ... instructions ...
0x08050b34 <+132>: pop ebx
0x08050b35 <+133>: pop esi
0x08050b36 <+134>: pop edi
0x08050b37 <+135>: jmp 0x80490c0 <fclose@plt>
from binrec-tob.
This is also occurring on expand:
Dump of assembler code for function rpl_fflush:
# ... instructions ...
0x0804bc93 <+35>: pop ebx
0x0804bc94 <+36>: jmp 0x8048c80 <fflush@plt>
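The dumps above share one pattern: the function restores its registers and then ends with a jmp into the PLT instead of a call. A rough sketch for spotting that pattern in gdb-style disassembly text follows; the regular expression and the helper name plt_tail_jumps are assumptions for illustration and only cover the line format shown in this thread:

```python
import re

# gdb-style line: "0x0804f133 <+483>: jmp 0x8049160 <linkat@plt>"
TAIL_JMP = re.compile(
    r"^\s*(0x[0-9a-fA-F]+) <\+\d+>:\s+jmp\s+(0x[0-9a-fA-F]+) <(\w+)@plt>"
)


def plt_tail_jumps(disassembly):
    """Return (jmp address, PLT address, symbol) for each jmp into the PLT."""
    hits = []
    for line in disassembly.splitlines():
        m = TAIL_JMP.match(line)
        if m:
            hits.append((int(m.group(1), 16), int(m.group(2), 16), m.group(3)))
    return hits


# The tail of rpl_fflush from the expand sample
dump = """\
0x0804bc93 <+35>: pop ebx
0x0804bc94 <+36>: jmp 0x8048c80 <fflush@plt>
"""

for jmp_pc, plt_pc, symbol in plt_tail_jumps(dump):
    print(f"{jmp_pc:#x}: tail jump to {symbol} ({plt_pc:#x})")
```

Running this over a full disassembly would give a quick list of the functions likely to hit this limitation.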
from binrec-tob.
Interestingly, with all the updates to S2E and binrec, the insert_calls pass is now correctly identifying indirect function calls and bailing early:
binrec.errors.BinRecLiftingError: failed to perform initial lifting of LLVM bitcode:
cat: [insert_calls] false && "This is not implemented yet.
Also, is metadata set correctly with indirect calls?"
Previously, this wasn't being detected and, instead, the recovered binary would crash.
from binrec-tob.
Probably because we replaced all the assert statements with honest-to-goodness errors.
from binrec-tob.
It looks like the function being indirectly called is removed in the recover_functions pass. My initial thought was to update insert_calls, which runs after recover_functions, to unconditionally branch to the target function / basic block. However, the target function/BB does not exist anymore.
from binrec-tob.
Do you know why it's being removed? It may be because there is no trace info linking the calling block to the indirectly called function.
If so, we would need to figure out why the S2E plugins are not catching the indirect call (or, if they are, why they are not adding it to the TraceInfo).
from binrec-tob.
Do you know why it's being removed?
It looks like the function that is being called indirectly is merged into the body of the function performing the indirect call.
; 0x804a3d0 is the wrapper function which performs an indirect call
define internal void @Func_804A3D0() {
BB_804A3D0:
store i32 134521808, i32* @PC, align 4, !inststart !26
store i32 134517248, i32* @PC, align 4, !inststart !26
ret void, !lastpc !278, !succs !279
; 0x8049200 is the target function that is indirectly called
; in this scenario, this function is posix_fadvise
BB_8049200: ; No predecessors!
store i32 134517248, i32* @PC, align 4, !inststart !26
store i32 134517248, i32* @PC, align 4, !inststart !26
%tmp0_v.i = call i32 @helper_ldl_mmu(%struct.CPUX86State* null, i32 134566148, i32 33, i8* null)
store i32 %tmp0_v.i, i32* @PC, align 4
ret void, !lastpc !280, !succs !72, !extern_symbol !281
BB_8049200_join: ; No predecessors!
}
from binrec-tob.
I think the cat sample is very specific because the Func_8049200 function is actually a library function (posix_fadvise). So, the recover_functions pass is incorrectly merging the stub function, which performs an indirect function call, and the library function being indirectly called.
from binrec-tob.
I've confirmed that recover_functions appears to be treating jmp-based indirect function calls as part of the function body, which is an issue when the function being indirectly called is in an external library. The cat sample has 3 instances of this:
rpl_fclose @ 0x0804fbc7 <+135>: jmp 0x8048ec0 <fclose@plt>
rpl_fflush @ 0x0804fc1f <+79>: jmp 0x8048e50 <fflush@plt>
fdadvise @ 0x0804a3d0 <+0>: jmp 0x8049200 <posix_fadvise64@plt>
The jmp target is listed in the function info entry_pc_to_bb_pcs list. For example, the rpl_fclose function has a BB list of:
Func_804FB40 # entry point
Func_804FB40
Func_804FB50
Func_804FB57
Func_804FB60
Func_804FB67
Func_804FB70
Func_804FBC0 # ends with "jmp fclose"
Func_8048EC0 # fclose in libc.so
The result is the entire rpl_fclose function being recovered, including the body of the library fclose function. So, instead of a call to fclose, it's essentially statically compiled into the binary (albeit in a broken way because the function will fail to lift).
My hunch is that I need to first update recover_functions to not merge functions when one is external and one is not, which I believe would indicate a jmp-based function "call".
from binrec-tob.
Binrec does not support lifting tail calls to external functions and, for the time being, this issue will not be addressed.
Example
The coreutils cat sample has a tail call to the libc function posix_fadvise.
C Code
void fdadvise (int fd, off_t offset, off_t len, fadvice_t advice) {
posix_fadvise (fd, offset, len, advice);
}
Disassembly
Dump of assembler code for function fdadvise:
0x0804a3d0 <+0>: jmp 0x8049200 <posix_fadvise64@plt>
End of assembler dump.
Captured Bitcode
; linked.ll
; Function Attrs: alwaysinline
define i64 @Func_804A3D0(%struct.CPUX86State* %0) #3 {
entry:
; internal function: fdadvise()
store i32 134521808, i32* @PC, align 4, !inststart !12
; 134517248 == 0x8049200
; jmp 0x8049200 (posix_fadvise)
store i32 134517248, i32* @PC, align 4, !inststart !12
ret i64 0, !lastpc !78
}
; Function Attrs: alwaysinline
define i64 @Func_8049200(%struct.CPUX86State* %0) #3 {
entry:
; libc posix_fadvise function
store i32 134517248, i32* @PC, align 4, !inststart !12
store i32 134517248, i32* @PC, align 4, !inststart !12
%tmp0_v = call i32 @helper_ldl_mmu(%struct.CPUX86State* %0, i32 134566148, i32 33, i8* null)
store i32 %tmp0_v, i32* @PC, align 4
ret i64 0, !lastpc !79
}
After merging, the function looks like:
define i64 @Func_804A3D0(%struct.CPUX86State* %0) #3 {
entry:
; internal function: fdadvise()
store i32 134521808, i32* @PC, align 4, !inststart !12
; 134517248 == 0x8049200
; jmp 0x8049200 (posix_fadvise)
store i32 134517248, i32* @PC, align 4, !inststart !12
ret i64 0, !lastpc !78
BB_8049200:
; libc posix_fadvise function
store i32 134517248, i32* @PC, align 4, !inststart !12
store i32 134517248, i32* @PC, align 4, !inststart !12
%tmp0_v = call i32 @helper_ldl_mmu(%struct.CPUX86State* %0, i32 134566148, i32 33, i8* null)
store i32 %tmp0_v, i32* @PC, align 4
ret i64 0, !lastpc !79
}
The merged function fails to lift for multiple reasons.
Design Limitations and Gaps
There are multiple limitations and gaps within Binrec that would need to be addressed to support tail calls into libraries:
- S2E does not create a new translation block when encountering a jmp to a library function, so the target function body is being traced in this scenario.
- Binrec inlines functions early in the lifting process, which causes the original callsite to lose track of the arguments passed in that would be forwarded to the target function.
- The target function would eventually be lifted to an extern, and LLVM does not support branching to an extern function; the function must be called. However, the original arguments would've been inlined and potentially lost.
- The target function is not a successor in the recorded CFG.
from binrec-tob.