Consider this program (getone.c):
static int ONE = 1;
int getzero(void) {
return 0;
}
int getone(void) {
return ONE + getzero();
}
If you compile it to a shared library and strip it:
$ gcc -nostdlib -shared getone.c -o libgetone-stripped.so
$ strip libgetone-stripped.so
The disassembly looks like this:
$ objdump -d libgetone-stripped.so
libgetone-stripped.so: file format elf64-x86-64
Disassembly of section .plt:
0000000000001000 <getzero@plt-0x10>:
1000: ff 35 02 30 00 00 pushq 0x3002(%rip) # 4008 <getone+0x2fdd>
1006: ff 25 04 30 00 00 jmpq *0x3004(%rip) # 4010 <getone+0x2fe5>
100c: 0f 1f 40 00 nopl 0x0(%rax)
0000000000001010 <getzero@plt>:
1010: ff 25 02 30 00 00 jmpq *0x3002(%rip) # 4018 <getzero+0x2ff8>
1016: 68 00 00 00 00 pushq $0x0
101b: e9 e0 ff ff ff jmpq 1000 <getzero@plt-0x10>
Disassembly of section .text:
0000000000001020 <getzero>:
1020: 55 push %rbp
1021: 48 89 e5 mov %rsp,%rbp
1024: b8 00 00 00 00 mov $0x0,%eax
1029: 5d pop %rbp
102a: c3 retq
000000000000102b <getone>:
102b: 55 push %rbp
102c: 48 89 e5 mov %rsp,%rbp
102f: e8 dc ff ff ff callq 1010 <getzero@plt>
1034: 8b 15 e6 2f 00 00 mov 0x2fe6(%rip),%edx # 4020 <getone+0x2ff5>
103a: 01 d0 add %edx,%eax
103c: 5d pop %rbp
103d: c3 retq
Note that there are two function entry points here, one for getzero (at address 0x1020) and another for getone (at address 0x102b). macaw-loader, on the other hand, only discovers the entry point for getzero. This is due to a limitation in how entryPoints is defined:
x86EntryPoints :: (X.MonadThrow m)
               => BL.LoadedBinary MX.X86_64 (E.ElfHeaderInfo 64)
               -> m (NEL.NonEmpty (MM.MemSegmentOff 64))
x86EntryPoints loadedBinary = do
  case BLE.resolveAbsoluteAddress mem addrWord of
    -- n.b. no guarantee of uniqueness, and in particular, entryPoint is probably in symbols somewhere
    Just entryPoint -> return (entryPoint NEL.:| mapMaybe (BLE.resolveAbsoluteAddress mem) symbolWords)
    Nothing -> X.throwM (InvalidEntryPoint addrWord)
  where
    offset = fromMaybe 0 (LC.loadOffset (BL.loadOptions loadedBinary))
    mem = BL.memoryImage loadedBinary
    addrWord = MM.memWord (offset + (fromIntegral (E.headerEntry (E.header (elf (BL.binaryFormatData loadedBinary))))))
    elfData = elf (BL.binaryFormatData loadedBinary)
    symbolWords = [ MM.memWord (fromIntegral (offset + (E.steValue entry)))
                  | Just (Right st) <- [E.decodeHeaderSymtab elfData]
                  , entry <- F.toList (E.symtabEntries st)
                  , E.steType entry == E.STT_FUNC
                  ]
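This implementation uses decodeHeaderSymtab, which only consults the static symbol table. Because the library has been stripped, that table contributes nothing here, so symbolWords is empty. The result contains the address for getzero only because it happens to be the main entry point address in the ELF header of the shared library: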
$ readelf -h libgetone-stripped.so | grep "Entry point address:"
Entry point address: 0x1020
However, libgetone-stripped.so also contains dynamic symbols:
$ readelf --dyn-syms libgetone-stripped.so
Symbol table '.dynsym' contains 3 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000001020 11 FUNC GLOBAL DEFAULT 7 getzero
2: 000000000000102b 19 FUNC GLOBAL DEFAULT 7 getone
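The Type and Bind columns above are both decoded from the one-byte st_info field of each symbol table entry (binding in the high four bits, type in the low four bits, per the ELF gABI). The following minimal sketch uses only base and shows the same kind of STT_FUNC filter that the comprehension in entryPoints applies. The st_info byte values in rows are an assumption reconstructed from the readelf columns (GLOBAL FUNC is 0x12, LOCAL NOTYPE is 0x00), not read from the file, and stBind, stType, sttFunc, and rows are illustrative names rather than elf-edit APIs.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Word (Word8)

-- | Binding: the high four bits of st_info.
stBind :: Word8 -> Word8
stBind info = info `shiftR` 4

-- | Symbol type: the low four bits of st_info.
stType :: Word8 -> Word8
stType info = info .&. 0x0f

-- | STT_FUNC as defined by the ELF gABI.
sttFunc :: Word8
sttFunc = 2

-- | (name, st_info) pairs mirroring the three .dynsym rows above.
-- The byte values are assumed from the Type/Bind columns.
rows :: [(String, Word8)]
rows = [("", 0x00), ("getzero", 0x12), ("getone", 0x12)]

-- | Names of the entries that pass the STT_FUNC filter.
functionNames :: [String]
functionNames = [ name | (name, info) <- rows, stType info == sttFunc ]

main :: IO ()
main = mapM_ putStrLn functionNames
```

Both getzero and getone pass the filter, so a lookup that reads .dynsym has what it needs to report both functions.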
If the entryPoints function consulted the dynamic symbols, similarly to how it is done in macaw, it would be able to find the address for getone.
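To make the suggestion concrete, here is a minimal, self-contained sketch of the proposed behavior. It deliberately does not use the real elf-edit or macaw-loader types: Sym, SymType, and entryPointAddrs are hypothetical stand-ins, and plain Word64 addresses take the place of MM.MemSegmentOff. It only illustrates that appending the dynamic function symbols to the static ones is enough for the address of getone to appear.

```haskell
import           Data.List.NonEmpty (NonEmpty (..))
import qualified Data.List.NonEmpty as NEL
import           Data.Word (Word64)
import           Numeric (showHex)

-- | Hypothetical, simplified symbol type (not the elf-edit one).
data SymType = SttFunc | SttObject | SttNoType
  deriving (Eq, Show)

-- | Hypothetical, simplified symbol table entry.
data Sym = Sym
  { symName  :: String
  , symValue :: Word64
  , symType  :: SymType
  } deriving (Show)

-- | The ELF header entry point, followed by the address of every
-- function symbol from the static table and then the dynamic table.
-- As in the existing code, no attempt is made to remove duplicates.
entryPointAddrs
  :: Word64   -- ^ load offset
  -> Word64   -- ^ entry point from the ELF header
  -> [Sym]    -- ^ static symbols (empty for a stripped binary)
  -> [Sym]    -- ^ dynamic symbols
  -> NonEmpty Word64
entryPointAddrs offset headerEntry staticSyms dynSyms =
  (offset + headerEntry) :|
    [ offset + symValue s
    | s <- staticSyms ++ dynSyms
    , symType s == SttFunc
    ]

main :: IO ()
main = do
  -- Values copied from the readelf output for libgetone-stripped.so.
  let dynSyms =
        [ Sym ""        0x0    SttNoType
        , Sym "getzero" 0x1020 SttFunc
        , Sym "getone"  0x102b SttFunc
        ]
      addrs = entryPointAddrs 0 0x1020 [] dynSyms
  mapM_ (\a -> putStrLn ("0x" ++ showHex a "")) (NEL.toList addrs)
```

With the values from the readelf output, this prints 0x1020, 0x1020, and 0x102b. The repeated 0x1020 is expected: as the n.b. comment in the existing code notes, there is no guarantee of uniqueness.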
This example uses x86, but the same limitation applies to AArch32 and PPC32 as well, which use identical implementations of entryPoints.