trailofbits / maat
Open-source symbolic execution framework: https://maat.re
License: Other
Maat version: master branch at commit 485b2c6
Issue description:
Attempting to build the Docker container yields the following error:
Step 9/9 : RUN cmake -S . -B /tmp/maat/build -DCMAKE_BUILD_TYPE=RelWithDebInfo "-DCMAKE_INSTALL_PREFIX=$(python3 -m site --user-base)" -Dmaat_USE_EXTERNAL_SLEIGH=OFF && cmake --build /tmp/maat/build -j $(nproc) && cmake --install /tmp/maat/build
---> Running in b5eb127d2a17
-- The C compiler identification is GNU 9.3.0
-- The CXX compiler identification is GNU 9.3.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found GMP: /usr/lib/x86_64-linux-gnu/libgmp.so
CMake Error at CMakeLists.txt:108 (add_subdirectory):
The source directory
/src/maat/src/third-party/sleigh/sleigh-cmake
does not contain a CMakeLists.txt file.
-- Found Z3: /usr/lib/x86_64-linux-gnu/libz3.so
-- Found LIEF: /usr/local/lib/libLIEF.a (found version "0.11.5")
CMake Error at CMakeLists.txt:150 (sleigh_compile):
Unknown CMake command "sleigh_compile".
Call Stack (most recent call first):
CMakeLists.txt:159 (maat_sleigh_compile)
-- Configuring incomplete, errors occurred!
The issue seems to be a missing CMakeLists.txt at /src/maat/src/third-party/sleigh/sleigh-cmake.
I'm using Docker version 20.10.12, build e91ed5707e.
Steps to reproduce:
git clone https://github.com/trailofbits/maat
cd maat
docker build .
We should expose the Arch class in Python bindings so that users can use the generic pc and sp getters, and so that they can get the size of an arbitrary register.
Contrary to the native C++ Arch API, the Python wrapper should use strings to identify registers instead of reg_t, since that integrates better with the way the CPU is exposed in bindings (with a dynamic attribute getter that uses the register names, rather than their reg_t numbers).
Acceptance criteria:
- Arch class is exposed in bindings with methods:
  - pc(): return the name of the program counter
  - sp(): return the name of the stack pointer
  - reg_size(<reg>): return the size in bits of the register
- MaatEngine has an arch attribute accessible from Python bindings

It turns out that making ir::CPU a templated class doesn't bring any additional value while making its usage more difficult. We should make it a regular class, switch from using array<> internally to vector<>, and reserve as many slots as there are arch registers, initialising them to empty values.
Hi,
I would like to use maat with simple projects but I have trouble running, for instance, id. My code is the following:
from maat import *
m = MaatEngine(ARCH.X64, OS.LINUX)
m.load("/bin/id", BIN.ELF64, libdirs=["/usr/lib/x86_64-linux-gnu/"])
m.run()
and the output is the following
➜ maat python3 id.py
[Info] Adding object 'ld-linux-x86-64.so.2' to virtual fs at '/usr/lib/ld-linux-x86-64.so.2'
[Info] Adding object 'libc.so.6' to virtual fs at '/usr/lib/libc.so.6'
[Info] Adding object 'libdl.so.2' to virtual fs at '/usr/lib/libdl.so.2'
[Info] Adding object 'libpcre2-8.so.0' to virtual fs at '/usr/lib/libpcre2-8.so.0'
[Info] Adding object 'libpthread.so.0' to virtual fs at '/usr/lib/libpthread.so.0'
[Info] Adding object 'libselinux.so.1' to virtual fs at '/usr/lib/libselinux.so.1'
[Info] Adding object 'id' to virtual fs at '/id'
[Error] Exception in CALLOTHER handler: SYSCALL: EnvEmulator: syscall '218' not supported for emulation
[Error] Unexpected error when processing IR instruction, aborting...
➜ maat
Would it be hard to add support for the missing system call?
Thanks
Currently when serialising FileSystem, we dump the entire symbolic filesystem. When the filesystem contains many shared library files, this results in a lot of overhead when dumping/loading states.
Since many shared library files will likely only be read by the program, and never modified, we could save a lot of time and storage by caching their contents in memory and avoiding serialising them when possible.
The best solution will probably be to implement a dynamic cache for env files that keeps track of the PhysicalFile internal buffers with a copy-on-write mechanism.
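The copy-on-write idea above can be sketched in a few lines of Python. This is an illustrative design only, not Maat's actual PhysicalFile implementation: untouched files share one cached buffer and need not be serialised; the first write detaches a private copy that must be serialised with the state.

```python
class CowFile:
    """Copy-on-write wrapper around a shared, cached file buffer (sketch)."""

    _cache = {}                      # path -> shared immutable contents

    def __init__(self, path: str, contents: bytes):
        self._cache.setdefault(path, contents)
        self.path = path
        self.private = None          # set on first write

    @property
    def data(self):
        # Reads hit the private copy if one exists, else the shared cache
        return self.private if self.private is not None else self._cache[self.path]

    def write(self, offset: int, buf: bytes):
        if self.private is None:     # copy-on-write: detach from shared cache
            self.private = bytearray(self._cache[self.path])
        self.private[offset:offset + len(buf)] = buf

    def needs_serialisation(self) -> bool:
        # Only modified files must be dumped with the state
        return self.private is not None


f = CowFile("/usr/lib/libc.so.6", b"\x7fELF" + b"\x00" * 12)
assert not f.needs_serialisation()
f.write(4, b"\x01")
assert f.needs_serialisation()
assert bytes(f.data[:5]) == b"\x7fELF\x01"
```

A state manager built on this only serialises files whose `needs_serialisation()` is true; pristine shared libraries are reloaded from the cache instead.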
I have written partial bindings for the emulated symbolic filesystem API, but those are not yet documented. We should write the corresponding documentation and push it before the public release, since the examples will likely use the filesystem for stdin input.
We need to make installation easier on macOS; that includes:
- pkgconfig for deps installed with Homebrew (z3, ...)
- sla files needed by Sleigh

I'm getting a crash trying to simply run a binary with an input concolic buffer, following the tutorial here. I'm running against the binary AIS-Lite from the CGC dataset, and I've attached a .tar.gz of the directory with the binary and libraries. A minimized version of my script is below to reproduce the issue.
Any help would be appreciated to figure out what I'm doing wrong here, it would also be super helpful to have a more detailed error explanation or traceback on errors like this!
Thanks for the hard work folks, this system looks really promising once it matures a bit!
from pathlib import Path
from argparse import ArgumentParser
from typing import List, Optional
from maat import MaatEngine, OS, ARCH, BIN
from angr import Project
from logging import basicConfig, getLogger

basicConfig()
logger = getLogger(__name__)


class TestMaat:
    """Can ma'at replace Triton?"""

    BASE = 0x400000

    def __init__(
        self,
        binary: Path,
        input_file: Path,
        args: Optional[List[str]] = None,
        libdirs: Optional[List[str]] = None,
    ) -> None:
        """
        Set up the maat engine with our binary.

        :param binary: Path to the binary to run
        :param args: Arguments to pass to the binary
        """
        self.binary = binary
        self.input_file = input_file
        self.args = args
        self.libdirs = (
            list(map(lambda l: str(l.resolve()), libdirs))
            if libdirs is not None
            else []
        )
        self.engine = MaatEngine(ARCH.X64, OS.LINUX)
        assert binary.is_file(), f"{binary} is not a file or doesn't exist."
        logger.info(f"Loading binary {self.binary} with libdirs {self.libdirs}")
        self.engine.load(
            str(binary.resolve()),
            BIN.ELF64,
            base=self.BASE,
            args=args if args is not None else [],
            libdirs=self.libdirs,
        )
        self.set_input()

    def run(self) -> None:
        """Run the binary."""
        self.engine.run()

    def set_input(self) -> None:
        """Set up the input for the binary."""
        stdin = self.engine.env.fs.get_fa_by_handle(0)  # Get stdin simfile
        contents = self.input_file.read_bytes()
        input_buffer = self.engine.vars.new_concolic_buffer(
            "stdin",
            contents,
            len(contents),
        )
        stdin.write_buffer(input_buffer)


if __name__ == "__main__":
    parser = ArgumentParser(prog="repro")
    parser.add_argument(
        "--binary", type=Path, required=True, help="Path to binary to execute."
    )
    parser.add_argument(
        "--input", type=Path, required=True, help="Path to input file to read."
    )
    parser.add_argument("--args", nargs="*", help="Arguments to pass to binary.")
    parser.add_argument(
        "--libdirs", nargs="*", type=Path, help="Library directories to load."
    )
    cli_args = parser.parse_args()
    stage1 = TestMaat(
        cli_args.binary,
        cli_args.input,
        cli_args.args,
        cli_args.libdirs + [Path("/lib/x86_64-linux-gnu/")],
    )
    stage1.run()
Brain dumping about ways to improve runtime performance. The current bottlenecks are most likely:
- Expr creation: they require dynamic allocation and are thus costly to create.
- Number class initialisation: if the mpz part has a non-trivial constructor, it could hinder the perf gains that we expect from using Number for concrete values.
- Expr canonisation: currently every expression is canonised upon creation. It certainly induces overhead, especially for memory operations that require using Expr. Computing expression hashes also adds overhead!
Some ideas to address them:
- Reduce Expr usage as much as possible. In particular, we should not enforce the use of Expr for memory operations and be able to use Number too. Maybe we could consider creating a Value class which would be an std::variant<Expr, Number>.
- If mpz has non-trivial initialisation, consider wrapping it in a class that allows skipping the init (std::optional maybe?).
- Change ExprObject::eq() to compare Expr raw pointers for quick expression equality (deep expr equality can be done with a recursive method on the arguments...).

We should switch from storing PCODE representations of basic blocks to storing PCODE representations of individual instructions. Reasons are mainly:
We want to replace the ir::Block class by something like ir::AsmInstruction, which holds the PCODE, address, length, ...
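The raw-pointer equality idea above works best when structurally equal expressions are guaranteed to be the same object, i.e. when expressions are interned (hash-consed) at creation. A minimal Python sketch of the concept (illustrative only, not Maat's C++ implementation):

```python
class Expr:
    """Interned expression node: equal (op, args) yields the same object."""

    _cache = {}  # (op, args) -> canonical Expr instance

    def __new__(cls, op, *args):
        key = (op, args)
        cached = cls._cache.get(key)
        if cached is None:
            cached = super().__new__(cls)
            cached.op, cached.args = op, args
            cls._cache[key] = cached
        return cached


a = Expr("add", 1, 2)
b = Expr("add", 1, 2)
assert a is b                      # identity check replaces deep equality
assert Expr("mul", 1, 2) is not a  # different expressions stay distinct
```

With interning in place, eq() degenerates to a pointer comparison, and the expensive recursive comparison is only needed for expressions created before canonisation.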
Maat version: master branch at commit 1d1c0d3
Issue description:
Same as the title: I'm consistently getting a SIGABRT signal when running Value.as_int() from Python.
Steps to reproduce:
Run the following:
from maat import *
m = MaatEngine(ARCH.X64, OS.LINUX)
m.cpu.rax = Var(64, "a") # Variable "a" on 64 bits
m.cpu.rax.as_int()
Currently, we have a vendored version of the x86 sleigh specs based on Ghidra 9.2.3 (as evidenced by the download script src/third-party/sleigh/native/sleigh_download.sh that was removed in this PR).
The following is a diff between Maat's x86 sleigh spec files (src/third-party/sleigh/processors/x86/data/languages) and Ghidra tag Ghidra_9.2.3_build. The diff was generated by copying Maat's src/third-party/sleigh/processors/x86/data/languages directory into Ghidra's equivalent and running git diff in the Ghidra repo:
diff --git a/Ghidra/Processors/x86/data/languages/ia.sinc b/Ghidra/Processors/x86/data/languages/ia.sinc
index 4e7e69d3f..f47be7806 100644
--- a/Ghidra/Processors/x86/data/languages/ia.sinc
+++ b/Ghidra/Processors/x86/data/languages/ia.sinc
@@ -704,7 +704,7 @@ addr64: [Base64 + Index64*ss] is mod=0 & r_m=4; Index64 & Base64 & ss
addr64: [Base64] is mod=0 & r_m=4; rexXprefix=0 & index64=4 & Base64 { export Base64; }
addr64: [simm32_64 + Index64*ss] is mod=0 & r_m=4; Index64 & base64=5 & ss; simm32_64 { local tmp=simm32_64+Index64*ss; export tmp; }
addr64: [Index64*ss] is mod=0 & r_m=4; Index64 & base64=5 & ss; imm32=0 { local tmp=Index64*ss; export tmp; }
-addr64: [imm32_64] is mod=0 & r_m=4; rexXprefix=0 & index64=4 & base64=5; imm32_64 { export *[const]:8 imm32_64; }
+addr64: [simm32_64] is mod=0 & r_m=4; rexXprefix=0 & index64=4 & base64=5; simm32_64 { export *[const]:8 simm32_64; }
addr64: [Base64 + Index64*ss + simm8_64] is mod=1 & r_m=4; Index64 & Base64 & ss; simm8_64 { local tmp=simm8_64+Base64+Index64*ss; export tmp; }
addr64: [Base64 + Index64*ss] is mod=1 & r_m=4; Index64 & Base64 & ss; simm8=0 { local tmp=Base64+Index64*ss; export tmp; }
addr64: [Base64 + simm8_64] is mod=1 & r_m=4; rexXprefix=0 & index64=4 & Base64; simm8_64 { local tmp=simm8_64+Base64; export tmp; }
@@ -2737,9 +2737,10 @@ enterFrames: low5 is low5 { tmp:1 = low5; export tmp; }
:INSD^rep^reptail eseDI4,DX is vexMode=0 & rep & reptail & opsize=1 & byte=0x6d & eseDI4 & DX { eseDI4 = in(DX); }
:INSD^rep^reptail eseDI4,DX is vexMode=0 & rep & reptail & opsize=2 & byte=0x6d & eseDI4 & DX { eseDI4 = in(DX); }
-:INT1 is vexMode=0 & byte=0xf1 { tmp:1 = 0x1; intloc:$(SIZE) = swi(tmp); call [intloc]; return [0:1]; }
-:INT3 is vexMode=0 & byte=0xcc { tmp:1 = 0x3; intloc:$(SIZE) = swi(tmp); call [intloc]; return [0:1]; }
-:INT imm8 is vexMode=0 & byte=0xcd; imm8 { tmp:1 = imm8; intloc:$(SIZE) = swi(tmp); call [intloc]; }
+# Removed the call [intloc]; from INT* instructions to make callother processing easier
+:INT1 is vexMode=0 & byte=0xf1 { tmp:1 = 0x1; intloc:$(SIZE) = swi(tmp); return [0:1]; }
+:INT3 is vexMode=0 & byte=0xcc { tmp:1 = 0x3; intloc:$(SIZE) = swi(tmp); return [0:1]; }
+:INT imm8 is vexMode=0 & byte=0xcd; imm8 { tmp:1 = imm8; intloc:$(SIZE) = swi(tmp); }
:INTO is vexMode=0 & byte=0xce & bit64=0
{
tmp:1 = 0x4;
@@ -3155,8 +3156,11 @@ define pcodeop swap_bytes;
:NEG rm64 is vexMode=0 & opsize=2 & byte=0xf7; rm64 & reg_opcode=3 ... { negflags(rm64); rm64 = -rm64; resultflags(rm64); }
@endif
-:NOP is vexMode=0 & opsize=0 & byte=0x90 { }
-:NOP is vexMode=0 & opsize=1 & byte=0x90 { }
+# For simple NOPs rexprefix=0 is necessary to avoid the XCHG R8D, EAX and
+# XCHG R8W, AX instructions to be wrongly interpreted as REX-prefixed NOPs
+
+:NOP is vexMode=0 & opsize=0 & byte=0x90 & rexprefix=0 { }
+:NOP is vexMode=0 & opsize=1 & byte=0x90 & rexprefix=0 { }
:NOP rm16 is vexMode=0 & mandover & opsize=0 & byte=0x0f; high5=3; rm16 ... { }
:NOP rm32 is vexMode=0 & mandover & opsize=1 & byte=0x0f; high5=3; rm32 ... { }
:NOP^"/reserved" rm16 is vexMode=0 & mandover & opsize=0 & byte=0x0f; byte=0x18; rm16 & reg_opcode_hb=1 ... { }
@@ -3907,6 +3911,8 @@ define pcodeop xend;
:XCHG RAX,Rmr64 is vexMode=0 & opsize=2 & row = 9 & page = 0 & RAX & Rmr64 { local tmp = RAX; RAX = Rmr64; Rmr64 = tmp; }
@endif
+
+
:XCHG rm8,Reg8 is vexMode=0 & byte=0x86; rm8 & Reg8 ... { local tmp = rm8; rm8 = Reg8; Reg8 = tmp; }
:XCHG rm16,Reg16 is vexMode=0 & opsize=0 & byte=0x87; rm16 & Reg16 ... { local tmp = rm16; rm16 = Reg16; Reg16 = tmp; }
:XCHG rm32,Reg32 is vexMode=0 & opsize=1 & byte=0x87; rm32 & check_rm32_dest ... & Reg32 ... & check_Reg32_dest ... { local tmp = rm32; rm32 = Reg32; build check_rm32_dest; Reg32 = tmp; build check_Reg32_dest;}
@@ -5690,19 +5696,12 @@ define pcodeop movmskps;
:MOVUPS XmmReg, m128 is vexMode=0 & byte=0x0F; byte=0x10; m128 & XmmReg ...
{
- local m:16 = m128;
- XmmReg[0,32] = m[0,32];
- XmmReg[32,32] = m[32,32];
- XmmReg[64,32] = m[64,32];
- XmmReg[96,32] = m[96,32];
+ XmmReg = m128;
}
:MOVUPS XmmReg1, XmmReg2 is vexMode=0 & byte=0x0F; byte=0x10; xmmmod = 3 & XmmReg1 & XmmReg2
{
- XmmReg1[0,32] = XmmReg2[0,32];
- XmmReg1[32,32] = XmmReg2[32,32];
- XmmReg1[64,32] = XmmReg2[64,32];
- XmmReg1[96,32] = XmmReg2[96,32];
+ XmmReg1 = XmmReg2;
}
:MOVUPS m128, XmmReg is vexMode=0 & mandover=0 & byte=0x0F; byte=0x11; m128 & XmmReg ...
@@ -5712,10 +5711,7 @@ define pcodeop movmskps;
:MOVUPS XmmReg2, XmmReg1 is vexMode=0 & mandover=0 & byte=0x0F; byte=0x11; xmmmod = 3 & XmmReg1 & XmmReg2
{
- XmmReg1[0,32] = XmmReg2[0,32];
- XmmReg1[32,32] = XmmReg2[32,32];
- XmmReg1[64,32] = XmmReg2[64,32];
- XmmReg1[96,32] = XmmReg2[96,32];
+ XmmReg2 = XmmReg1;
}
:MULPD XmmReg, m128 is vexMode=0 & $(PRE_66) & byte=0x0F; byte=0x59; m128 & XmmReg ...
@@ -6670,10 +6666,75 @@ define pcodeop pminub;
:PMINUB XmmReg1, XmmReg2 is vexMode=0 & $(PRE_66) & byte=0x0F; byte=0xDA; xmmmod = 3 & XmmReg1 & XmmReg2 { XmmReg1 = pminub(XmmReg1, XmmReg2); }
define pcodeop pmovmskb;
-:PMOVMSKB Reg32, mmxreg2 is vexMode=0 & mandover=0 & byte=0x0F; byte=0xD7; Reg32 & mmxreg2 { Reg32 = pmovmskb(Reg32, mmxreg2); }
-:PMOVMSKB Reg32, XmmReg2 is vexMode=0 & $(PRE_66) & byte=0x0F; byte=0xD7; Reg32 & XmmReg2 { Reg32 = pmovmskb(Reg32, XmmReg2); }
-@ifdef IA64
-:PMOVMSKB Reg64, mmxreg2 is vexMode=0 & opsize=2 & mandover=0 & byte=0x0F; byte=0xD7; Reg64 & mmxreg2 { Reg64 = pmovmskb(Reg64, mmxreg2); }
+:PMOVMSKB Reg32, mmxreg2 is vexMode=0 & mandover=0 & byte=0x0F; byte=0xD7; Reg32 & mmxreg2
+{
+ TempA:4 = 0;
+ TempA[0, 1] = mmxreg2[7, 1];
+ TempA[1, 1] = mmxreg2[15, 1];
+ TempA[2, 1] = mmxreg2[23, 1];
+ TempA[3, 1] = mmxreg2[31, 1];
+ TempA[4, 1] = mmxreg2[39, 1];
+ TempA[5, 1] = mmxreg2[47, 1];
+ TempA[6, 1] = mmxreg2[55, 1];
+ TempA[7, 1] = mmxreg2[63, 1];
+ Reg32 = TempA;
+}
+:PMOVMSKB Reg32, XmmReg2 is vexMode=0 & $(PRE_66) & byte=0x0F; byte=0xD7; Reg32 & XmmReg2
+{
+ TempA:4 = 0;
+ TempA[0, 1] = XmmReg2[7, 1];
+ TempA[1, 1] = XmmReg2[15, 1];
+ TempA[2, 1] = XmmReg2[23, 1];
+ TempA[3, 1] = XmmReg2[31, 1];
+ TempA[4, 1] = XmmReg2[39, 1];
+ TempA[5, 1] = XmmReg2[47, 1];
+ TempA[6, 1] = XmmReg2[55, 1];
+ TempA[7, 1] = XmmReg2[63, 1];
+ TempA[8, 1] = XmmReg2[71, 1];
+ TempA[9, 1] = XmmReg2[79, 1];
+ TempA[10, 1] = XmmReg2[87, 1];
+ TempA[11, 1] = XmmReg2[95, 1];
+ TempA[12, 1] = XmmReg2[103, 1];
+ TempA[13, 1] = XmmReg2[111, 1];
+ TempA[14, 1] = XmmReg2[119, 1];
+ TempA[15, 1] = XmmReg2[127, 1];
+ Reg32 = TempA;
+}
+@ifdef IA64
+:PMOVMSKB Reg64, mmxreg2 is vexMode=0 & opsize=2 & mandover=0 & byte=0x0F; byte=0xD7; Reg64 & mmxreg2
+{
+ TempA:8 = 0;
+ TempA[0, 1] = mmxreg2[7, 1];
+ TempA[1, 1] = mmxreg2[15, 1];
+ TempA[2, 1] = mmxreg2[23, 1];
+ TempA[3, 1] = mmxreg2[31, 1];
+ TempA[4, 1] = mmxreg2[39, 1];
+ TempA[5, 1] = mmxreg2[47, 1];
+ TempA[6, 1] = mmxreg2[55, 1];
+ TempA[7, 1] = mmxreg2[63, 1];
+ Reg64 = TempA;
+}
+:PMOVMSKB Reg64, XmmReg2 is vexMode=0 & $(PRE_66) & opsize=2 & byte=0x0F; byte=0xD7; Reg64 & XmmReg2
+{
+ TempA:8 = 0;
+ TempA[0, 1] = XmmReg2[7, 1];
+ TempA[1, 1] = XmmReg2[15, 1];
+ TempA[2, 1] = XmmReg2[23, 1];
+ TempA[3, 1] = XmmReg2[31, 1];
+ TempA[4, 1] = XmmReg2[39, 1];
+ TempA[5, 1] = XmmReg2[47, 1];
+ TempA[6, 1] = XmmReg2[55, 1];
+ TempA[7, 1] = XmmReg2[63, 1];
+ TempA[8, 1] = XmmReg2[71, 1];
+ TempA[9, 1] = XmmReg2[79, 1];
+ TempA[10, 1] = XmmReg2[87, 1];
+ TempA[11, 1] = XmmReg2[95, 1];
+ TempA[12, 1] = XmmReg2[103, 1];
+ TempA[13, 1] = XmmReg2[111, 1];
+ TempA[14, 1] = XmmReg2[119, 1];
+ TempA[15, 1] = XmmReg2[127, 1];
+ Reg64 = TempA;
+}
@endif
define pcodeop pmulhrsw;
@@ -6851,7 +6912,10 @@ define pcodeop psignd;
:PSIGND XmmReg1, XmmReg2 is vexMode=0 & $(PRE_66) & byte=0x0F; byte=0x38; byte=0x0a; xmmmod = 3 & XmmReg1 & XmmReg2 { XmmReg1=psignd(XmmReg1,XmmReg2); }
define pcodeop pslldq;
-:PSLLDQ XmmReg2, imm8 is vexMode=0 & $(PRE_66) & byte=0x0F; byte=0x73; xmmmod = 3 & reg_opcode=7 & XmmReg2; imm8 { XmmReg2 = pslldq(XmmReg2, imm8:8); }
+:PSLLDQ XmmReg2, imm8 is vexMode=0 & $(PRE_66) & byte=0x0F; byte=0x73; xmmmod = 3 & reg_opcode=7 & XmmReg2; imm8
+{
+ XmmReg2 = XmmReg2 << (imm8 * 8); # Boyan: The shift is in bytes, not bits !
+}
define pcodeop psllw;
:PSLLW mmxreg, m64 is vexMode=0 & mandover=0 & byte=0x0F; byte=0xF1; mmxreg ... & m64 ... { mmxreg = psllw(mmxreg, m64); }
@@ -7614,9 +7678,31 @@ define pcodeop rsqrtss;
:RSQRTSS XmmReg, m32 is vexMode=0 & $(PRE_F3) & byte=0x0F; byte=0x52; XmmReg ... & m32 { XmmReg = rsqrtss(XmmReg, m32); }
:RSQRTSS XmmReg1, XmmReg2 is vexMode=0 & $(PRE_F3) & byte=0x0F; byte=0x52; xmmmod = 3 & XmmReg1 & XmmReg2 { XmmReg1 = rsqrtss(XmmReg1, XmmReg2); }
+
define pcodeop shufpd;
-:SHUFPD XmmReg, m128, imm8 is vexMode=0 & $(PRE_66) & byte=0x0F; byte=0xC6; XmmReg ... & m128; imm8 { XmmReg = shufpd(XmmReg, m128, imm8:8); }
-:SHUFPD XmmReg1, XmmReg2, imm8 is vexMode=0 & $(PRE_66) & byte=0x0F; byte=0xC6; xmmmod=3 & XmmReg1 & XmmReg2; imm8 { XmmReg1 = shufpd(XmmReg1, XmmReg2, imm8:8); }
+:SHUFPD XmmReg, m128, imm8 is vexMode=0 & $(PRE_66) & byte=0x0F; byte=0xC6; XmmReg ... & m128; imm8
+{
+ shifted:16 = XmmReg >> ((imm8 & 0x1)*64);
+ tempA:8 = shifted:8;
+
+ shifted = m128 >> ((imm8 & 0x2)*64);
+ tempB:8 = shifted:8;
+
+ XmmReg[0, 64] = tempA;
+ XmmReg[64, 64] = tempB;
+}
+
+:SHUFPD XmmReg1, XmmReg2, imm8 is vexMode=0 & $(PRE_66) & byte=0x0F; byte=0xC6; xmmmod=3 & XmmReg1 & XmmReg2; imm8
+{
+ shifted:16 = XmmReg1 >> ((imm8 & 0x1)*64);
+ tempA:8 = shifted:8;
+
+ shifted = XmmReg2 >> ((imm8 & 0x2)*64);
+ tempB:8 = shifted:8;
+
+ XmmReg1[0, 64] = tempA;
+ XmmReg1[64, 64] = tempB;
+}
:SHUFPS XmmReg, m128, imm8 is vexMode=0 & mandover=0 & byte=0x0F; byte=0xC6; (m128 & XmmReg ...); imm8 & Order0 & Order1 & Order2 & Order3
{
This diff should be merged with a more recent version of Ghidra's x86 sleigh spec, and any fixes should potentially be pushed upstream as a PR to Ghidra.
I don't think there's too much harm in maintaining a vendored version of the sleigh specs for further fixes later, but the patches should be well documented and include the base version of Ghidra used.
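The PMOVMSKB patch in the diff above replaces an opaque pcodeop with explicit bit extractions: the destination collects the most significant bit of each byte of the source register. The intended semantics can be checked against a small Python model (illustrative only):

```python
def pmovmskb(value: int, nbytes: int) -> int:
    """Collect the top bit of each byte of `value` into a bit mask,
    mirroring the per-bit assignments (TempA[i, 1] = src[8*i + 7, 1])
    in the sleigh spec."""
    mask = 0
    for i in range(nbytes):
        top_bit = (value >> (8 * i + 7)) & 1
        mask |= top_bit << i
    return mask


# All 16 bytes of an XMM value have their top bit set -> mask of 16 ones
assert pmovmskb((1 << 128) - 1, 16) == 0xFFFF
# Only byte 0 has its top bit set -> bit 0 of the mask
assert pmovmskb(0x80, 16) == 1
# MMX variant operates on 8 bytes
assert pmovmskb(0xFF00FF00FF00FF00, 8) == 0b10101010
```

The same trick of modeling instruction semantics in a few lines of Python is handy for sanity-checking other hand-written p-code, such as the PSLLDQ byte-shift fix.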
from maat import *
m = MaatEngine(ARCH.X64, OS.LINUX)
m.mem.map(0x410000, 0x411000, PERM.RX)
m.mem.write(0x410000, b"\xeb\xfe", ignore_flags=True)
m.run_from(0x410000, 1)
# tries to lift OOB bytes and fails
print(m.info)
The above script results in a failure from Sleigh attempting to decode bytes past the end of the mapped memory.
$ python3 example.py
FATAL: Error in sleigh translate(): Sleigh raised a bad data exception: r0x004120b0: Unable to resolve constructor
[Fatal] Lifter error: MaatEngine::run(): failed to lift instructions
Stop: fatal error in Maat
I have also seen other sleigh errors, all referencing memory past what should be mapped.
FATAL: Error in sleigh translate(): Can not lift instruction at 0x4110c9: IN AL, 0x3d (unsupported callother occurence)
I experience this consistently if I have only mapped a page or less. It will still happen occasionally if I map a larger amount (e.g. m.mem.map(0x410000, 0x620000)).
I think this is because the call to lift_block specifies an arbitrary code size of 0xfffffff, as seen below:
lifters[_current_cpu_mode]->lift_block(
*ir_map,
addr,
mem->raw_mem_at(addr),
0xfffffff,
0xffffffff,
nullptr, // is_symbolic
nullptr, // is_tainted
true
)
But I am not sure why this issue is not encountered when loading via the lief loader.
In addition to the unit tests written in C++, we should add a couple of tests in Python. Maintaining comprehensive unit tests in Python is definitely not what we want, but we could add scripts solving small CTF or hand-crafted challenges, using pytest for example.

Currently snapshots record every single memory write event to be able to restore a past state. This won't scale well on very long, memory-intensive traces. An alternative would be to snapshot memory on a per-page basis: save the whole page when it gets written for the first time, and don't record subsequent operations affecting this page.
We should probably allow both memory snapshotting strategies to be selected as a setting for advanced users.
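The per-page strategy above can be sketched as follows. This is an illustrative design, not Maat's API: the first write to a page saves one full copy of it, and later writes to the same page need no extra bookkeeping.

```python
PAGE = 0x1000  # assumed page size


class PageSnapshot:
    """Per-page memory snapshot (sketch): one saved copy per dirty page."""

    def __init__(self, mem: bytearray):
        self.mem = mem
        self.saved = {}                      # page base -> original contents

    def record_write(self, addr: int):
        """Call before every write; O(1) after the first write to a page."""
        base = addr & ~(PAGE - 1)
        if base not in self.saved:
            self.saved[base] = bytes(self.mem[base:base + PAGE])

    def restore(self):
        for base, data in self.saved.items():
            self.mem[base:base + PAGE] = data


mem = bytearray(2 * PAGE)
snap = PageSnapshot(mem)
snap.record_write(0x10); mem[0x10] = 0xAA
snap.record_write(0x20); mem[0x20] = 0xBB    # same page: no second copy
assert len(snap.saved) == 1                  # one page saved, two writes
snap.restore()
assert mem[0x10] == 0 and mem[0x20] == 0
```

The trade-off versus per-write recording is a fixed 4 KiB cost per dirty page against an unbounded log on write-heavy traces, which is why exposing both as a setting makes sense.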
When computing the ITE expression for a symbolic read, we should only consider values from memory pages that have the R flag set.
Hello everyone!
Today I tried to install PyMaat and it instantly failed, with pip not finding a version to install.
$ python3 -m pip install pymaat
Defaulting to user installation because normal site-packages is not writeable
ERROR: Could not find a version that satisfies the requirement pymaat (from versions: none)
ERROR: No matching distribution found for pymaat
Also tried with sudo but got the same errors:
$ sudo -H python3 -m pip install pymaat
ERROR: Could not find a version that satisfies the requirement pymaat (from versions: none)
ERROR: No matching distribution found for pymaat
I'm using pip 22.0.3 and Python 3.8.9.
Regards
At the moment the only specificity of LifterX86 vs its abstract parent class Lifter is that it initialises the sleigh interface with the correct .sla and .pspec files.
This is a legacy from using a custom lifter, but since we now rely on sleigh for lifting there is no need for specialised Lifter<arch> classes. We could just factorise the logic into Lifter and have it handle all architectures.
Currently, we use the Arg class to specify command line arguments to pass to the symbolically executed program. This allows creating concrete, concolic, and symbolic arguments. However, it is not possible to mix concrete and symbolic/concolic bytes in the same argument string, which could sometimes be useful.
We should drop Arg and allow passing either string/uint8_t* or vector<Expr> to the loader for command line arguments. This way we can still easily create fully concrete and fully symbolic arguments (using the VarContext::new_concolic_buffer and VarContext::new_symbolic_buffer methods), but also mix them if needed.
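From the user's side, a mixed argument could look something like the sketch below. SymByte and make_arg are hypothetical illustrative names, not Maat API: the argument is flattened into per-byte cells, each either concrete or symbolic, which is essentially what a vector<Expr> of one-byte expressions would carry to the loader.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SymByte:
    """A single symbolic byte, identified by its variable name (hypothetical)."""
    name: str


def make_arg(spec):
    """Flatten a mix of concrete bytes objects and SymByte markers into
    a list of per-byte cells for the loader (sketch)."""
    cells = []
    for part in spec:
        if isinstance(part, SymByte):
            cells.append(part)
        else:
            cells.extend(bytes([b]) for b in part)
    return cells


# "--key=" stays concrete, the two trailing bytes are symbolic
arg = make_arg([b"--key=", SymByte("k0"), SymByte("k1")])
assert len(arg) == 8
assert arg[0] == b"-" and isinstance(arg[6], SymByte)
```

This keeps the common cases (all-concrete, all-symbolic) trivial while letting users symbolise only the interesting bytes of an argument.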
from maat import *
engine = MaatEngine(ARCH.X86, OS.LINUX)
snap = engine.take_snapshot()
engine.mem.map(0x10000, 0x11000)
engine.mem.write(0x10000, b'\x90'*0x1000)
engine.restore_snapshot(snap)
The above script will result in an abort with the error:
terminate called after throwing an instance of 'maat::runtime_exception'
what(): Trying to restore from concrete-snapshot at address 0x10ff8 not mapped int memory
Aborted (core dumped)
I think that when calling record_mem_write you may want to ensure the snapshot's map has a place for the memory, or skip writes to pages not in the maps when restoring.
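The second option suggested above (skipping unmapped pages at restore time) can be sketched like this; names and shapes are illustrative, not Maat's internals:

```python
def restore_writes(writes, is_mapped):
    """Replay recorded (addr, old_bytes) pairs, skipping addresses that are
    no longer (or never were) mapped, instead of aborting (sketch)."""
    restored = []
    for addr, old in writes:
        if not is_mapped(addr):
            continue  # page was unmapped since the snapshot: skip it
        restored.append((addr, old))
    return restored


# 0x10ff8 falls outside the mapped range, as in the reported abort
writes = [(0x10FF8, b"\x00"), (0x9000, b"\x90")]
out = restore_writes(writes, lambda a: a < 0x10000)
assert out == [(0x9000, b"\x90")]
```

The alternative (validating in record_mem_write) catches the problem earlier but pays a map lookup on every recorded write.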
Originally posted by novafacing February 24, 2022
Is there a way to either explicitly set or access the address libraries are loaded at? I'm using angr
to extract some PLT information and trying to set a callback in maat
on the PLT stub address in the loaded library, but the addresses don't match up.
I'm using the Python interface. Thanks!
#48 explains that memory mappings are currently accessible only from the C++ API. While @novafacing posted a workaround script to get the mappings programmatically, we should eventually add Python bindings for the MemEngine.mappings attribute and the MemMapManager class.
Add systematic, fast checks before notifying the event manager, to avoid the overhead of building the info field when no event callback is active.
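The fast-path idea can be sketched as follows (illustrative design, not Maat's event manager): a cheap membership test guards the expensive construction of the info record, so instrumentation-free runs pay almost nothing.

```python
class EventManager:
    """Toy event manager with a cheap has_hooks() fast path (sketch)."""

    def __init__(self):
        self.callbacks = {}            # event name -> list of callbacks

    def register(self, event, cb):
        self.callbacks.setdefault(event, []).append(cb)

    def has_hooks(self, event) -> bool:
        return bool(self.callbacks.get(event))


def on_mem_write(mgr, addr, value):
    if not mgr.has_hooks("mem_write"):
        return None                    # fast path: no info record built
    info = {"addr": addr, "value": value}  # only built when someone listens
    for cb in mgr.callbacks["mem_write"]:
        cb(info)
    return info


mgr = EventManager()
assert on_mem_write(mgr, 0x1000, 42) is None  # no hooks: early exit

seen = []
mgr.register("mem_write", seen.append)
on_mem_write(mgr, 0x2000, 7)
assert seen == [{"addr": 0x2000, "value": 7}]
```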
We should write tutorials on how to properly do path exploration with Maat. This will surely immensely help people trying to do more advanced analysis as the exploration strategy is a bit different than with other tools.
We can re-use the Python test scripts that we have for x86 and use a different target program.
We should re-add introspection stats in the engine. Some interesting things to add:
- number of Value instances created

The emulator prints out:
[Error] Exception in CALLOTHER handler: SYSCALL: EnvEmulator: syscall '231' not supported for emulation
[Error] Unexpected error when processing IR instruction, aborting...
upon executing an exit_group syscall, which is somewhat opaque given that it happens quite frequently and is not really an error. Is there a nicer way we could implement exit-like syscalls, to make it clearer to the user that execution has ended (possibly with the exit code), instead of showing an error?
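One possible design (an assumption, not Maat's current behaviour) is to map exit-like syscalls to a dedicated stop reason that carries the exit code, reserving the error path for genuinely unsupported syscalls:

```python
# x86-64 Linux syscall numbers for the exit family
EXIT_SYSCALLS = {60: "exit", 231: "exit_group"}


def handle_syscall(num: int, arg0: int):
    """Dispatch a syscall: exit-like syscalls become a clean stop carrying
    the exit code; truly unsupported ones still raise (sketch)."""
    if num in EXIT_SYSCALLS:
        return ("STOP.EXIT", arg0)     # engine stops; user sees the exit code
    raise NotImplementedError(f"syscall '{num}' not supported for emulation")


assert handle_syscall(231, 0) == ("STOP.EXIT", 0)
assert handle_syscall(60, 1) == ("STOP.EXIT", 1)
```

The caller of run() would then see a STOP.EXIT status with the code attached, rather than the "Unexpected error when processing IR instruction" message.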
Maybe it doesn't make sense right now since Maat builds via Make. But if you have future plans to move to CMake, using our SLEIGH wrapper here would clean things up a bit. I'd be happy to help with that.
Amazing work by the way. I'm really excited about your project. 🎉
In MaatEngine::load() users can specify a load address using the base argument. The requested base is added to the binary's base address (which is 0 for most relocatable binaries). That can be confusing and can result in the binary being loaded at the wrong address if base is non-zero for a non-relocatable binary. See this comment in #49.
We should simply detect this and throw an error when users try to use base on a non-relocatable binary.
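The proposed check is simple; here is a sketch of the logic (illustrative names, not Maat's loader code):

```python
def effective_base(binary_base: int, requested_base: int, relocatable: bool) -> int:
    """Compute the load address; reject a user-supplied base for binaries
    that cannot be relocated (sketch of the proposed check)."""
    if requested_base and not relocatable:
        raise ValueError("'base' can only be used with relocatable binaries")
    return binary_base + requested_base


# PIE binary with preferred base 0: loads at the requested address
assert effective_base(0, 0x400000, True) == 0x400000
# No base requested: non-relocatable binary keeps its own base
assert effective_base(0x8048000, 0, False) == 0x8048000
```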
Description
It would be a really nice feature to be able to dump the engine state to disk and later reload it into an existing/new engine. It would allow to:
Implementation
Implementing engine serialisation/deserialisation requires a huge engineering effort. In particular, it requires serialising many different object types, many of which share one or more of the following characteristics:
We will most likely need to write serialisation methods for many of Maat's classes and objects. We should see to what extent we could take advantage of existing C++ serialisation libraries, the main criterion being support for standard types (vector, list, optional, map, ...).

The state manager fails while trying to deserialize if the directory passed to the SimpleStateManager does not exist. I'd expect either an error when creating the SimpleStateManager, or for the directory to be created. At a minimum, I think Deserializer's constructor should check whether the stream is valid.
import os
from maat import *
eng = MaatEngine(ARCH.X64, OS.LINUX)
path = "/tmp/directory_does_not_exist"
state_manager = SimpleStateManager(path)
state_manager.enqueue_state(eng)
print(f"Enqueued state, exists = {os.path.exists(path)}")
# abort() during dequeue_state
if not state_manager.dequeue_state(eng):
print("Did not dequeue state")
print("Done")
/home/jay/libs/maat/build/maat.cpython-39-x86_64-linux-gnu.so
Enqueued state, exists = False
terminate called after throwing an instance of 'maat::serialize_exception'
what(): Deserializer::Factory::new_object: unsupported class uid: 0
Aborted (core dumped)
Not a big deal, but just figured I'd post an issue for it to have it on record.
At the moment the API of MemEngine mixes methods that map memory (map, allocate, unmap) with lower-level methods that manage MemSegments (new_segment, allocate_segment, delete_segment). The latter should be made protected, or at least be given some _ prefix indicating that they are not the preferred way to manage memory mappings.
- Use map, allocate, and unmap instead of the 'segment' methods
- Split is_free into two methods:
  - public is_free(start, end): checks that a range is not mapped (i.e. no overlapping MemSegment)
  - private is_segment_intersecting_with(start, end): checks whether an existing MemSegment intersects with a range
- Hide the underlying MemSegment list; make MemSegment an internal class
- Add a MemMap class (which holds just start, end, perms, and name), and a MemMapManager that updates them on map/unmap
- Replace new_segment with map/alloc/unmap
Hello there!
I have been playing with pymaat, and when I try to execute the instructions pushf / popf I get the following:
FATAL: Error in sleigh translate(): X86: Register translation from SLEIGH to MAAT missing for register eflags
[Fatal] Lifter error: MaatEngine::run(): failed to lift instructions
Looks like maat::ExprITE::hash() can get into infinite recursion and crash here. Here is a backtrace:
(gdb) where
#0 0x00007f8f62c8d9cb in maat::ExprITE::hash() () from /root/.cache/pypoetry/virtualenvs/reface-mFqyHumy-py3.9/lib/python3.9/site-packages/maat.cpython-39-x86_64-linux-gnu.so
#1 0x00007f8f62c8d9ed in maat::ExprITE::hash() () from /root/.cache/pypoetry/virtualenvs/reface-mFqyHumy-py3.9/lib/python3.9/site-packages/maat.cpython-39-x86_64-linux-gnu.so
<...snip...>
#7475 0x00007f8f62c8d9ed in maat::ExprITE::hash() () from /root/.cache/pypoetry/virtualenvs/reface-mFqyHumy-py3.9/lib/python3.9/site-packages/maat.cpython-39-x86_64-linux-gnu.so
#7476 0x00007f8f62c8d9ed in maat::ExprITE::hash() () from /root/.cache/pypoetry/virtualenvs/reface-mFqyHumy-py3.9/lib/python3.9/site-packages/maat.cpython-39-x86_64-linux-gnu.so
#7477 0x00007f8f62c8d9ed in maat::ExprITE::hash() () from /root/.cache/pypoetry/virtualenvs/reface-mFqyHumy-py3.9/lib/python3.9/site-packages/maat.cpython-39-x86_64-linux-gnu.so
#7478 0x00007f8f62c8d9ed in maat::ExprITE::hash() () from /root/.cache/pypoetry/virtualenvs/reface-mFqyHumy-py3.9/lib/python3.9/site-packages/maat.cpython-39-x86_64-linux-gnu.so
#7479 0x00007f8f62c8c32b in maat::ExprObject::eq(std::shared_ptr<maat::ExprObject>) () from /root/.cache/pypoetry/virtualenvs/reface-mFqyHumy-py3.9/lib/python3.9/site-packages/maat.cpython-39-x86_64-linux-gnu.so
#7480 0x00007f8f62cd1ebf in maat::MemSegment::symbolic_ptr_read(maat::Value&, std::shared_ptr<maat::ExprObject> const&, maat::ValueSet&, unsigned int, std::shared_ptr<maat::ExprObject> const&) () from /root/.cache/pypoetry/virtualenvs/reface-mFqyHumy-py3.9/lib/python3.9/site-packages/maat.cpython-39-x86_64-linux-gnu.so
#7481 0x00007f8f62cd24c4 in maat::MemEngine::symbolic_ptr_read(maat::Value&, std::shared_ptr<maat::ExprObject>, maat::ValueSet const&, unsigned int, maat::Settings const&) () from /root/.cache/pypoetry/virtualenvs/reface-mFqyHumy-py3.9/lib/python3.9/site-packages/maat.cpython-39-x86_64-linux-gnu.so
#7482 0x00007f8f62c39927 in maat::MaatEngine::resolve_addr_param(maat::ir::Param const&, maat::ir::ProcessedInst::Param&) () from /root/.cache/pypoetry/virtualenvs/reface-mFqyHumy-py3.9/lib/python3.9/site-packages/maat.cpython-39-x86_64-linux-gnu.so
#7483 0x00007f8f62c39d3a in maat::MaatEngine::process_load(maat::ir::Inst const&, maat::ir::ProcessedInst&) () from /root/.cache/pypoetry/virtualenvs/reface-mFqyHumy-py3.9/lib/python3.9/site-packages/maat.cpython-39-x86_64-linux-gnu.so
#7484 0x00007f8f62c3c2ac in maat::MaatEngine::run(int) () from /root/.cache/pypoetry/virtualenvs/reface-mFqyHumy-py3.9/lib/python3.9/site-packages/maat.cpython-39-x86_64-linux-gnu.so
#7485 0x00007f8f62bf8468 in maat::py::MaatEngine_run(_object*, _object*) () from /root/.cache/pypoetry/virtualenvs/reface-mFqyHumy-py3.9/lib/python3.9/site-packages/maat.cpython-39-x86_64-linux-gnu.so
#7486 0x00000000005310fd in ?? ()
#7487 0x0000000000512192 in _PyEval_EvalFrameDefault ()
#7488 0x0000000000528b63 in _PyFunction_Vectorcall ()
#7489 0x0000000000512192 in _PyEval_EvalFrameDefault ()
#7490 0x00000000005106ed in ?? ()
#7491 0x0000000000510497 in _PyEval_EvalCodeWithName ()
#7492 0x00000000005f5be3 in PyEval_EvalCode ()
#7493 0x0000000000619de7 in ?? ()
#7494 0x0000000000615610 in ?? ()
#7495 0x0000000000619d79 in ?? ()
#7496 0x0000000000619816 in PyRun_SimpleFileExFlags ()
#7497 0x000000000060d4e3 in Py_RunMain ()
#7498 0x00000000005ea6e9 in Py_BytesMain ()
#7499 0x00007f8f63881d0a in __libc_start_main (main=0x5ea6b0, argc=6, argv=0x7fff4abfac48, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fff4abfac38) at ../csu/libc-start.c:308
#7500 0x00000000005ea5ea in _start ()
I'm not sure what's causing the issue. I can email a test script and binaries if needed to reproduce, but I would prefer not to post them publicly.
Add the `munmap` syscall once #10 is implemented.
While ideally sleigh would be able to lift all instructions, there are many occasions where it cannot.
When attempting to use Maat to follow a debugger trace, it would be nice if there were a way to ignore a failure to lift an instruction. I would prefer being able to set an option on an engine that lets the lifter lift as far as it can, and only return an error if no instructions could be lifted at all.
Maybe a simpler alternative to the above would be to limit the number of instructions lifted to no more than the number of instructions passed to `run`?
That way it would be possible to do something like:
```python
eng, dbg = init()
while True:
    dbg.step()
    try:
        stop = eng.run(1)
        if stop != STOP.INST_COUNT:
            # handle stops ...
            pass
    except LifterError:
        # fix up registers if maat gets out of sync with dbg
        # possible loss of fidelity and symbolic info if we can't lift an instruction
        fixup_registers(dbg, eng)
```
In order to implement the `munmap` system call, we need the ability to arbitrarily truncate existing segments, since some of their contents might get unmapped.
Just as we have `extend_before` and `extend_after`, we should implement a `truncate(start, end)` that truncates the `MemSegment`. The function signature for `truncate` will likely be similar to `vector<MemSegment*> MemSegment::truncate(addr_t start, addr_t end)`, and it will be called by a higher-level `MemManager::clear_segments(addr_t start, addr_t end)`. There are 3 main cases to consider:
- `start == 0`: we can truncate easily and return a single segment
- `end == segment_end`: truncating is a bit more tricky, but it's still possible to return a single segment
- `[start:end]` is in the middle of the segment: truncating shall return two segments (the upper and lower parts of the original one)

`clear_segments()` will first go through all existing segments, call `truncate()`, and get the new segment pointers. Then it will replace its old segments list with the new ones. Page permissions shall also be updated accordingly.
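The three cases above can be sketched with a toy Python model (the `Segment` class and `truncate` function here are stand-ins of my own, not Maat's actual C++ types):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Segment:
    """Toy stand-in for maat's MemSegment: an inclusive [start, end] byte range."""
    start: int
    end: int

def truncate(seg: Segment, start: int, end: int) -> List[Segment]:
    """Remove [start, end] from seg and return the surviving pieces.

    Mirrors the three cases from the issue:
      1. start == seg.start -> one segment (the upper part) remains
      2. end == seg.end     -> one segment (the lower part) remains
      3. [start, end] in the middle -> two segments remain
    """
    assert seg.start <= start <= end <= seg.end
    pieces = []
    if start > seg.start:
        pieces.append(Segment(seg.start, start - 1))  # lower remainder
    if end < seg.end:
        pieces.append(Segment(end + 1, seg.end))      # upper remainder
    return pieces  # empty list == the whole segment was unmapped
```

`clear_segments()` would then just concatenate the pieces returned for every overlapping segment and swap them into the segment list.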
Following comments about improving performance in #5: if using GNU MP turns out to be a bottleneck, here are some alternatives to consider.
Random thoughts:
- Using `uint512_t` instead of true multiprecision numbers could be useful if it allows the library to use stack-allocated objects instead of dynamically allocating them
- The `Number` class: `Number` objects are created all the time, so the cost of creating an MP object is by far the most important factor

We should have a setting that allows providing a list of address ranges, and forces possible values for symbolic pointers to be within those ranges. This setting should be compatible with `symptr_refine_range` and `symptr_limit_range`.
The rationale behind such an option is to let the symbolic pointer analysis target only specific memory areas, just as was needed for the `zehn` challenge from hxp CTF 2021.
Implementation idea: just compute the possible value range as before, then refine it using the allowed ranges. If there are intersections, keep them as possible values for the pointer; if there are no intersections, use the pointer's concrete value.
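The refinement step could look like the following pure-Python sketch (the function name and interval representation are mine, for illustration only):

```python
from typing import List, Tuple

Range = Tuple[int, int]  # inclusive [min, max] address range

def refine_with_allowed(computed: Range, allowed: List[Range], concrete: int) -> List[Range]:
    """Intersect the computed value range of a symbolic pointer with the
    user-provided allowed ranges. If nothing intersects, fall back to the
    pointer's concrete value, as the issue suggests."""
    lo, hi = computed
    hits = []
    for a_lo, a_hi in allowed:
        i_lo, i_hi = max(lo, a_lo), min(hi, a_hi)
        if i_lo <= i_hi:  # non-empty intersection
            hits.append((i_lo, i_hi))
    return hits if hits else [(concrete, concrete)]
```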
The function `maat::env::emulated::sys_linux_fstatat` will use a null pointer for the `file` argument to `_stat` if the input filepath is absolute.
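For reference, the POSIX `fstatat` contract is that an absolute pathname ignores `dirfd` entirely, while a relative one is resolved against the directory `dirfd` refers to. A tiny Python model of that resolution rule (representing `dirfd` by its directory path is an assumption of this sketch):

```python
import os

def fstatat_path(dirfd_dir: str, pathname: str) -> str:
    """Which path fstatat should actually stat, per POSIX semantics."""
    if os.path.isabs(pathname):
        return pathname                     # dirfd must not influence the result
    return os.path.join(dirfd_dir, pathname)  # relative: resolve against dirfd
```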
We should provide the value set computed for symbolic pointers in the memory access information. Moreover, we should allow the user to manually tamper with it in an event callback, and take the modifications into account before performing the memory access. Concretely:
- expose `mem_access.range`, with attributes `range.min`, `range.max`, `range.stride`
- allow callbacks to modify `mem_access.range`, and use the modified value set to perform the mem access
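A user callback under this proposal might look like the following sketch (all class and attribute names here model the *proposed* API, not anything Maat currently exposes):

```python
from dataclasses import dataclass

@dataclass
class Range:
    """Toy model of the proposed mem_access.range value set."""
    min: int
    max: int
    stride: int

@dataclass
class MemAccess:
    addr_concrete: int
    range: Range

def narrowing_callback(mem_access: MemAccess) -> None:
    """Example user callback: clamp the pointer's value set to the one
    4 KiB page containing its concrete value before the access happens."""
    page = mem_access.addr_concrete & ~0xFFF
    mem_access.range.min = max(mem_access.range.min, page)
    mem_access.range.max = min(mem_access.range.max, page + 0xFFF)
```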
Currently the unit tests are compiled as two binaries, `unit-tests` and `adv-tests`.
We should set up a very basic CI job that runs those binaries.
As discussed in #49, the issue of libraries not being recursively loaded has (I think!) been fixed, but there is still the problem of libraries being placed in the virtual filesystem at paths different from their original location. For example using the same setup as in #49 with AIS-Lite and modifying the code a bit:
```python
from collections import defaultdict
from pathlib import Path
from argparse import ArgumentParser
from bisect import insort
from typing import Dict, List, Optional, Tuple

from maat import MaatEngine, OS, ARCH, BIN, EVENT, WHEN
from angr import Project


class Stage1Maat:
    """Test using maat to run stage 1 instead of triton..."""

    BASE = 0x400000

    def __init__(
        self,
        binary: Path,
        input_file: Path,
        args: Optional[List[str]] = None,
        libdirs: Optional[List[Path]] = None,
    ) -> None:
        """
        Set up the maat engine with our binary.

        :param binary: Path to the binary to run
        :param args: Arguments to pass to the binary
        """
        self.binary = binary
        self.input_file = input_file
        self.args = args
        self.libdirs = (
            list(map(lambda l: str(l.resolve()), libdirs))
            if libdirs is not None
            else []
        )
        self.project = Project(
            str(self.binary.resolve()),
            load_options={
                "auto_load_libs": True,
                "ld_path": self.libdirs,
                "use_system_libs": False,
                "skip_libs": ["libc.so.6", "libm.so.6"],
            },
        )
        self.cfg = self.project.analyses.CFGFast(normalize=True)
        self.engine = MaatEngine(ARCH.X64, OS.LINUX)
        assert binary.is_file(), f"{binary} is not a file or doesn't exist."
        print(f"Loading binary {self.binary} with libdirs {self.libdirs}")
        self.engine.load(
            str(binary.resolve()),
            BIN.ELF64,
            args=args if args is not None else [],
            libdirs=self.libdirs,
        )
        self.set_callbacks()
        self.maps = {}

    def run(self) -> None:
        """Run the binary."""
        self.engine.run()

    def get_mappings(self, engine: MaatEngine) -> Dict[str, List[Tuple[int, int]]]:
        """
        Get the mappings of the binary.

        :param engine: Maat engine
        """
        mappings = defaultdict(list)
        rawmaps = list(map(lambda l: l.strip(), str(engine.mem).splitlines()))
        maps = map(
            lambda l: l.split(),
            filter(
                lambda l: l and l.startswith("0x"),
                rawmaps[: rawmaps.index("Page permissions:")],
            ),
        )
        for mp in maps:
            insort(mappings[mp[2]], (int(mp[0], 16), int(mp[1], 16)))
        return dict(mappings)

    def loader_callback(self, engine: MaatEngine) -> None:
        """
        Callback when the loader is running.

        :param engine: Maat engine
        """
        maps = self.get_mappings(engine)
        if maps != self.maps:
            print(f"Maps changed: {maps}")
            self.maps = maps

    def got_to_main(self, _: MaatEngine) -> None:
        """Callback when we get to main."""
        print("Got to main!")

    def set_callbacks(self) -> None:
        """
        Set a callback to...set the rest of our callbacks (lol) once the program
        is loaded.
        """
        main_addr = self.project.loader.find_symbol("main").rebased_addr
        print(f"Setting callback at {main_addr:#016x}")
        maps = self.get_mappings(self.engine)
        print(f"Initial maps {maps}")
        self.engine.hooks.add(
            EVENT.EXEC,
            WHEN.BEFORE,
            filter=(
                maps["ld-linux-x86-64.so.2"][0][0],
                maps["ld-linux-x86-64.so.2"][-1][1],
            ),
            callbacks=[self.loader_callback],
            group="loader",
        )
        self.engine.hooks.add(
            EVENT.EXEC,
            WHEN.BEFORE,
            filter=main_addr,
            callbacks=[self.got_to_main],
            group="setup",
        )


if __name__ == "__main__":
    parser = ArgumentParser(prog="repro")
    parser.add_argument(
        "--binary", type=Path, required=True, help="Path to binary to execute."
    )
    parser.add_argument(
        "--input", type=Path, required=True, help="Path to input file to read."
    )
    parser.add_argument("--args", nargs="*", help="Arguments to pass to binary.")
    parser.add_argument(
        "--libdirs", nargs="*", type=Path, help="Library directories to load."
    )
    cli_args = parser.parse_args()
    repro = Stage1Maat(
        cli_args.binary,
        cli_args.input,
        cli_args.args,
        (cli_args.libdirs or []) + [Path("/lib/x86_64-linux-gnu"), Path("/lib64")],
    )
    repro.run()
```
We end up getting:
Maps changed: {'ld-linux-x86-64.so.2': [(4096, 8191), (8192, 139263), (139264, 172031), (176128, 188415)], 'Interp.': [(188416, 4382719)], 'Heap': [(4382720, 8417279)], 'map_anon': [(67108864, 67117055)], '/usr/lib/libcgc.so': [(67117056, 67125247), (67125248, 67133439), (67133440, 67137535), (67137536, 67145727)], 'Stack': [(8796090925056, 8796093022207)]}
AIS-Lite: /usr/lib/libcgc.so: no version information available (required by AIS-Lite)
AIS-Lite: /usr/lib/libcgc.so: no version information available (required by /usr/lib/libcgc.so)
AIS-Lite: symbol lookup error: AIS-Lite: undefined symbol: __libc_start_main, version GLIBC_2.2.5
[Error] Exception in CALLOTHER handler: SYSCALL: EnvEmulator: syscall '231' not supported for emulation
[Error] Unexpected error when processing IR instruction, aborting...
Syscall 231 is exit_group, and we're exiting partway through loading, but the thing that stands out to me is that `libcgc.so` is tossed into `/usr/lib` instead of being emulated at the same directory it lives in on the host machine. Of course we can solve this with patchelf, but is there another way, such as using the same `virtual_path` argument passed to `MaatEngine.load` to also specify virtual paths for libraries?
I've repackaged the CGC binaries, so the `.tar.gz` attached to #49 can be compiled on a tester's machine with `cmake -DCMAKE_EXE_LINKER_FLAGS='-no-pie -fno-pie' . && make` to duplicate the configuration I have on my host machine.
Related to #5
More thoughts on having a `Value` class equivalent to `std::variant<Expr, Number>`:
- Replace `Expr` and `Number`/`cst_t` usages. That includes:
  - `IRContext` and `TmpContext`, instead of maintaining two lists
  - `ProcessedInst::Param`
  - `CPU`'s `preprocess_inst` and `postprocess_inst` methods
  - the `MemEngine` read/write API
  - `Info`, `RegAccess`, `MemAccess`, ...
- `is_abstract()` and `is_concrete()` methods
- `is_concrete`, `is_concolic`, ... wrapper methods for convenience use
- the same operators as the `Number` class, except that they automatically use the internal `Number` or `Expr` depending on whether the `Value` is abstract or concrete. Using in-place operators is a must for performance
- `Expr` operators

Basically we should start off from the current `ProcessedInst::Param` implementation and build a fully functional `Value` class on top of it, then progressively start to use `Value` everywhere.
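The operator-dispatch idea above can be sketched in Python. This models `Number` as a plain int and `Expr` as a string purely for illustration; the real types live in Maat's C++ core:

```python
class Value:
    """Minimal sketch of a variant over a concrete Number and an abstract Expr."""

    def __init__(self, number=None, expr=None):
        # Exactly one alternative is active at a time
        assert (number is None) != (expr is None)
        self._number = number
        self._expr = expr

    def is_concrete(self) -> bool:
        return self._expr is None

    def is_abstract(self) -> bool:
        return self._expr is not None

    def __iadd__(self, other: "Value") -> "Value":
        # In-place operators dispatch on the active alternative:
        # cheap integer arithmetic when both sides are concrete,
        # Expr-node construction as soon as either side is abstract.
        if self.is_concrete() and other.is_concrete():
            self._number += other._number
        else:
            self._expr = f"({self!s} + {other!s})"
            self._number = None
        return self

    def __str__(self) -> str:
        return str(self._number) if self.is_concrete() else self._expr
```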
Implementation notes:
- use `nullptr` for the expression field
- `Value` vs `Expr` in the API:
  - `Expr` when we are sure we are dealing with abstract expressions (like `new_symbolic_buffer`, ...)
  - `Value` when we are unsure if data is concrete or abstract
  - both `Expr` and `Value` when the user inputs symbolic data to the engine (assigning registers, writing memory, creating symbolic buffers, ...), but use mainly `Value` when returning information back to the user (reading registers and memory, `info` field in event callbacks, ...)

At the moment, taking a snapshot saves the instruction pointer. After restoring a snapshot, execution restarts from the beginning of the instruction pointed to by the saved instruction pointer. If we take a snapshot in a callback in the middle of an instruction, this might result in restoring a corrupted IR state.
We should save the `InstLocation` information instead of just the PC.
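A small Python model of the difference (all names here are hypothetical, mirroring the proposal rather than Maat's actual snapshot code):

```python
from typing import NamedTuple, Optional

class InstLocation(NamedTuple):
    """Hypothetical location inside lifted IR: the address of the lifted
    block plus the index of the IR instruction within it."""
    block_addr: int
    inst_index: int

class Snapshot(NamedTuple):
    pc: int
    # None models today's behavior ("restart the whole instruction at pc");
    # a real InstLocation lets a mid-instruction snapshot resume consistently.
    loc: Optional[InstLocation]

def resume_point(snap: Snapshot) -> InstLocation:
    """Where execution should resume after restoring `snap`."""
    if snap.loc is not None:
        return snap.loc               # resume mid-instruction, IR state intact
    return InstLocation(snap.pc, 0)   # fall back to the start of the instruction
```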
Decide which license we will use before releasing publicly.
The `MemEngine` currently supports only Little Endian. We should add an endianness setting to memory, and make the code work for both Big and Little Endian. Places where changes are needed include (but aren't limited to):
- `MemEngine`
- `SymbolicMemEngine`
- `VarContext` (when creating new concolic/symbolic buffers from raw bytes)

Once #37 is merged we should remove the old Makefile-based build.
Currently the loader will add direct dependencies for the loaded binary in the emulated filesystem, so that they can later be loaded when executing the loader. However, it doesn't look for dependencies of the dependencies, resulting in missing library files when the loader recursively loads dependencies. An example of this problem can be found in #49.
We should get the dependency library list in a recursive fashion when loading a binary and add them all to the emulated file system.
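The recursive walk could be sketched like this; `direct_deps` stands in for however the loader extracts one file's needed-library list (LIEF, for example, exposes the DT_NEEDED entries of a parsed binary as `lief.parse(path).libraries`):

```python
from typing import Callable, List, Set

def collect_deps(binary: str, direct_deps: Callable[[str], List[str]]) -> List[str]:
    """Depth-first walk over needed-library entries so that dependencies
    of dependencies also end up in the emulated filesystem."""
    seen: Set[str] = set()
    order: List[str] = []

    def visit(lib: str) -> None:
        for dep in direct_deps(lib):
            if dep not in seen:
                seen.add(dep)
                visit(dep)
                order.append(dep)  # post-order: deepest dependencies first

    visit(binary)
    return order
```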
Maat currently uses the following defines:
- `HAS_SOLVER_BACKEND`
- `Z3_BACKEND`
- `HAS_LOADER_BACKEND`
- `LIEF_BACKEND`
- `PYTHON_BINDINGS`

They should be prefixed with `MAAT_` to allow other projects to compile and link against Maat without name collisions.
We could publish Maat to PyPI so that users can build and install it more conveniently.