
beaengine's Introduction

beaengine

BeaEngine 5

BeaEngine is a C library designed to decode instructions for 16-bit, 32-bit and 64-bit Intel architectures. It covers the standard instruction set as well as the FPU, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, VMX, CLMUL, AES, MPX, AVX, AVX2, AVX512 (VEX & EVEX prefixes), CET, BMI1, BMI2, SGX, UINTR, KL, TDX and AMX extensions. If you want to analyze malicious code, or obfuscated code more generally, BeaEngine returns a detailed structure that precisely describes each analyzed instruction.

You can use it in C/C++ (usable and compilable with Visual Studio, GCC, MinGW, DigitalMars, BorlandC, WatcomC, SunForte, Pelles C, LCC), in assembler (usable with masm32 and masm64, nasm, fasm, GoAsm), in C#, in Python3, in Delphi, in PureBasic and in WinDev. You can use it in both user mode and kernel mode.

First, you can retrieve the mnemonic and operands according to the specified syntax: Intel syntax for nasm, masm32 and masm64, GoAsm32 and GoAsm64, and fasm (AT&T syntax is not currently supported). Next, you can perform accurate data-flow and control-flow analysis to generate slices or obfuscation patterns.

Its source code is released under the LGPL3 license, with a "Makefile builder" and headers for the following languages: C/C++, C#, Python3, Delphi, PureBasic, masm32, masm64, nasm (x86 and x64), fasm (x86 and x64), GoAsm (x86 and x64).

BeaEngine is implemented using opcode tables from the Intel documentation, tables from Christian Ludloff's website www.sandpile.org, and the x86doc project from Felix Cloutier: https://www.felixcloutier.com

LICENSE

This software is distributed under the LGPL license. See the COPYING and COPYING.LESSER files for more details.

quick start

1. How to use it with Python :

1.1. Very simple example to read one instruction:

#!/usr/bin/python3

from BeaEnginePython import *

instr = Disasm(bytes.fromhex('6202054000443322'))
instr.read()
print(instr.repr())

Output is :

vpshufb zmm24, zmm31, zmmword ptr [r11+r14+0880h]
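
The last input byte is `22`, yet the displacement is printed as `0880h`. This matches the EVEX compressed displacement rule (disp8*N), where the 8-bit displacement is multiplied by a factor N. The sketch below is not BeaEngine code: `evex_disp8` is a hypothetical helper, and it assumes a full-vector memory operand without broadcast, so that N is the vector width in bytes.

```python
def evex_disp8(disp8: int, vector_bits: int) -> int:
    """Scale a signed EVEX disp8 by N, assuming N = vector width in bytes."""
    signed = disp8 - 0x100 if disp8 & 0x80 else disp8  # sign-extend the byte
    return signed * (vector_bits // 8)


# 512-bit operand: N = 64, so 0x22 becomes 0x880 as in the output above.
print(hex(evex_disp8(0x22, 512)))
```

Other EVEX tuple types use a different N, so this helper should not be read as a general decoder.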

1.2. Loop on instructions flow and extract instructions modifying rax register:

#!/usr/bin/python3

from BeaEnginePython import *

buffer = bytes.fromhex('4831c04889fbffc04989c49031ed66586a005f80c40c')
instr = Disasm(buffer)
while instr.read() > 0:
  if instr.modifies("rax"):
    print(f"{instr.repr():20}rax register is modified")
  else:
    print(instr.repr())

Output is:

xor rax, rax        rax register is modified
mov rbx, rdi
inc eax             rax register is modified
mov r12, rax
nop
xor ebp, ebp
pop ax              rax register is modified
push 00000000h
pop rdi
add ah, 0Ch         rax register is modified

1.3. Loop on instructions flow and follow jump instructions:

#!/usr/bin/python3

from BeaEnginePython import *

instr = Disasm(bytes.fromhex('e90000000090e901000000cc90'))
while instr.read() > 0:
  print(instr.repr())
  if instr.is_jump():
    instr.follow()

1.4. Extract complete structure

Let's extract complete instruction structure:

#!/usr/bin/python3

from BeaEnginePython import *

instr = Disasm(bytes.fromhex('62017d8115443322'))
instr.read()
print(instr.json())

Output is:

{
  "repr": "vunpckhpd xmm24, xmm16, xmmword ptr [r11+r14+0220h]",
  "category": "AVX512_INSTRUCTION",
  "mnemonic": "vunpckhpd ",
  "bytes": "62 01 7d 81 15 44 33 22",
  "error": 0,
  "arch": 64,
  "operands": {
    "1": {
      "repr": "xmm24",
      "type": "register",
      "size": 128,
      "mode": "write",
      "register": {
        "type": "xmm",
        "value": "REG24"
      }
    },
    "2": {
      "repr": "xmm16",
      "type": "register",
      "size": 128,
      "mode": "read",
      "register": {
        "type": "xmm",
        "value": "REG16"
      }
    },
    "3": {
      "repr": "r11+r14+0220h",
      "type": "memory",
      "size": 128,
      "mode": "read",
      "memory": {
        "base": "REG11",
        "index": "REG14",
        "scale": 1,
        "displacement": "0x22"
      }
    }
  },
  "registers": {
    "modified": {
      "type": 4,
      "gpr": "",
      "mmx": "",
      "xmm": "REG24",
      "ymm": "",
      "zmm": "",
      "special": "",
      "cr": "",
      "dr": "",
      "mem_management": "",
      "mpx": "",
      "opmask": "",
      "segment": "",
      "fpu": "",
      "tmm": ""
    },
    "read": {
      "type": 5,
      "gpr": "REG11+REG14",
      "mmx": "",
      "xmm": "REG16",
      "ymm": "",
      "zmm": "",
      "special": "",
      "cr": "",
      "dr": "",
      "mem_management": "",
      "mpx": "",
      "opmask": "",
      "segment": "",
      "fpu": "",
      "tmm": ""
    }
  },
  "rflags": {
    "of": null,
    "sf": null,
    "zf": null,
    "af": null,
    "pf": null,
    "cf": null,
    "tf": null,
    "if": null,
    "df": null,
    "nt": null,
    "rf": null
  }
}
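
Since the output is plain JSON, it can be consumed with the standard `json` module. The sketch below works on a trimmed copy of the structure shown above instead of calling the library; `operands_by_mode` is a hypothetical helper name, not part of BeaEnginePython.

```python
import json

# Trimmed copy of the output above (only the fields used here).
sample = """
{
  "repr": "vunpckhpd xmm24, xmm16, xmmword ptr [r11+r14+0220h]",
  "operands": {
    "1": {"repr": "xmm24", "type": "register", "size": 128, "mode": "write"},
    "2": {"repr": "xmm16", "type": "register", "size": 128, "mode": "read"},
    "3": {"repr": "r11+r14+0220h", "type": "memory", "size": 128, "mode": "read"}
  }
}
"""


def operands_by_mode(doc: dict, mode: str) -> list:
    """Return operand strings whose access mode equals `mode`, in operand order."""
    operands = doc["operands"]
    return [operands[key]["repr"]
            for key in sorted(operands, key=int)
            if operands[key]["mode"] == mode]


doc = json.loads(sample)
print("written:", operands_by_mode(doc, "write"))
print("read:   ", operands_by_mode(doc, "read"))
```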

2. Releases

https://github.com/BeaEngine/beaengine/releases

3. How to Compile :

apt install cmake
git clone https://github.com/BeaEngine/beaengine.git
cmake beaengine
make

4. Compile shared library :

cmake -DoptBUILD_DLL=ON beaengine
make

5. Documentation

The current documentation explains how the BeaEngine structures work.

The old documentation can be read here: http://beatrix2004.free.fr/BeaEngine/index1.php

Each BeaEngine release is provided with PDF documentation, which can be generated with pandoc:

sudo apt install pandoc texlive-latex-extra
cd doc
pandoc --highlight-style tango -V mainfont="Arial" -V geometry:margin=1cm --output=beaengine.pdf beaengine.md
pandoc --highlight-style tango -V mainfont="Arial" -V geometry:margin=1cm --output=examples.pdf examples.md

6. Examples

Some basic examples are available to show how BeaEngine works.

7. Dev corner

If you want to improve BeaEngine or just add some private features, here are some links :

beaengine's Issues

ARGTYPE is inconsistent between headers and include

in headers/BeaEngine.h

typedef struct {
char ArgMnemonic[64];
Int32 ArgType;
Int32 ArgSize;
Int32 ArgPosition;
UInt32 AccessMode;
MEMORYTYPE Memory;
UInt32 SegmentReg;
} ARGTYPE;

#pragma pack()

while in include/beaengine/BeaEngine.h

#pragma pack(1)

typedef struct {
char ArgMnemonic[64];
UInt64 ArgType;
Int32 ArgSize;
Int32 ArgPosition;
UInt32 AccessMode;
MEMORYTYPE Memory;
UInt32 SegmentReg;
} ARGTYPE;

#pragma pack()

which one should I use...
(I think the one in "headers" should be generated by the one in "include")
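
The practical consequence of the two declarations can be checked with Python's `struct` module. This is only a sketch: it covers the fields that precede `Memory` (the definition of `MEMORYTYPE` is not shown in this report), and it assumes packed layout, which `<` emulates in the same way as `#pragma pack(1)`.

```python
import struct

# (name, struct code) for the fields before MEMORYTYPE; "<" means no padding.
HEADERS_FIELDS = [("ArgMnemonic", "64s"), ("ArgType", "i"),   # Int32 ArgType
                  ("ArgSize", "i"), ("ArgPosition", "i"), ("AccessMode", "I")]
INCLUDE_FIELDS = [("ArgMnemonic", "64s"), ("ArgType", "Q"),   # UInt64 ArgType
                  ("ArgSize", "i"), ("ArgPosition", "i"), ("AccessMode", "I")]


def packed_offsets(fields):
    """Return ({field: offset}, offset of the next field) for a packed struct."""
    offsets, position = {}, 0
    for name, code in fields:
        offsets[name] = position
        position += struct.calcsize("<" + code)
    return offsets, position


headers_offsets, headers_end = packed_offsets(HEADERS_FIELDS)
include_offsets, include_end = packed_offsets(INCLUDE_FIELDS)
print("headers:", headers_offsets, "Memory starts at", headers_end)
print("include:", include_offsets, "Memory starts at", include_end)
```

Every field after `ArgType` sits 4 bytes later in the `include` variant, so code built against one header cannot safely read structures filled in by a library built against the other.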

BeaEngine always got wrong address

Hi ! Sir,

I use BeaEngine V4 and Delphi 2010.

var
MyDisasm: TDISASM;

MyDisasm.VirtualAddr:=xxxxxxxx;
MyDisasm.EIP:=xxxxxxxx;

BeaEngine always gets the wrong address for every CALL/JMP/JNE/JE/... instruction.

Why?

Off-by-one access in fillRegister

Hello,

The following bytes '\x44\x0f\xf8\x41\x8b' (disassembled as psubb mm0,mmword ptr [rcx-75h] by windbg) trigger an off-by-one access in the RegistersMMX global variable:

void __bea_callspec__ fillRegister(int index, OPTYPE* pMyOperand, PDISASM pMyDisasm)
{
    size_t i = 0;
    switch(GV.Register_) {
      // ...
      case MMX_REG:
        #ifndef BEA_LIGHT_DISASSEMBLY
           (void) strcpy ((char*) pMyOperand->OpMnemonic+i, RegistersMMX[index]);
        #endif

The index variable is off-by-one:

08 0000002f`75dfce10 00007ff7`b1172f2a     rp_win_x64!fillRegister(int index = 0n8, struct OPTYPE * pMyOperand = 0x0000002f`75dfe06c, struct _Disasm * pMyDisasm = 0x0000002f`75dfdecc)+0x598 [C:\work\codes\rp\src\third_party\beaengine\src\Includes\Routines_ModRM.c @ 105] 

Cheers
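
Where the index 8 comes from can be reproduced from the bytes in this report: the REX prefix `44` has its R bit set, which extends the ModRM.reg field of `41` from 0 to 8. The sketch below is not BeaEngine code; masking the index to three bits is shown as one possible guard (there are only eight MMX registers, and the windbg listing above shows `mm0`), not as the fix the maintainers chose.

```python
MMX_NAMES = ["mm0", "mm1", "mm2", "mm3", "mm4", "mm5", "mm6", "mm7"]


def modrm_reg_index(rex: int, modrm: int) -> int:
    """ModRM.reg extended by REX.R, as done for general-purpose registers."""
    rex_r = (rex >> 2) & 1
    return (rex_r << 3) | ((modrm >> 3) & 7)


def mmx_register_name(rex: int, modrm: int) -> str:
    """Pick an MMX register name, keeping the index inside the 8-entry table."""
    return MMX_NAMES[modrm_reg_index(rex, modrm) & 7]


print(modrm_reg_index(0x44, 0x41))    # index used without the mask
print(mmx_register_name(0x44, 0x41))  # name selected with the mask
```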

xmm absolute address decoding issue

With BeaEngine 5.3 64 bit DLL, when disassembling 64bit, F20F5E142534120000 is being decoded as divsd xmm2, qword ptr [00007FF4FD806425h] while it should be something like divsd xmm2, qword ptr [0000000000001234h]
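
The difference between the two decodings lies in the ModRM and SIB bytes. In 64-bit mode, `mod=00, rm=101` means RIP-relative addressing, while `mod=00, rm=100` followed by a SIB byte with `base=101` and `index=100` means an absolute 32-bit displacement with no base and no index. The sketch below uses a hypothetical helper, `classify_disp32_operand`, and ignores REX.X and the other addressing forms; it takes the bytes that follow the opcode `F2 0F 5E`.

```python
import struct


def classify_disp32_operand(data: bytes):
    """Classify a 64-bit-mode memory operand from its ModRM (and SIB) bytes.

    Returns (kind, disp32), where kind is "rip-relative", "absolute" or "other".
    """
    modrm = data[0]
    mod, rm = modrm >> 6, modrm & 7
    if mod == 0 and rm == 5:
        (disp,) = struct.unpack_from("<i", data, 1)
        return "rip-relative", disp
    if mod == 0 and rm == 4:
        sib = data[1]
        base, index = sib & 7, (sib >> 3) & 7
        if base == 5 and index == 4:  # no base register, no index register
            (disp,) = struct.unpack_from("<i", data, 2)
            return "absolute", disp
    return "other", 0


kind, disp = classify_disp32_operand(bytes.fromhex("142534120000"))
print(kind, hex(disp))
```

For the bytes in this report the operand is classified as absolute with displacement `0x1234`, which agrees with the expected `[0000000000001234h]`.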

Compile under Visual Studio on Windows

I'm trying to compile a static library under Visual Studio on Windows so that it's linked against the MSVC runtimes. CMake-GUI throws errors with the included makefile, and manually creating the project gives me a slew of errors.

There's got to be an easier, more straightforward way to do this. The readme advertises that it supports VS compilation, but I can't find any docs covering it.

Out-of-bounds access in FixOpSizeForMemoryOperand

Hello,

I'm hitting an OOB read access in the below code:

void __bea_callspec__ FixOpSizeForMemoryOperand (PDISASM pMyDisasm)
{
  int i = GV.MemDecoration / 100;
  if (ArgsSize[GV.MemDecoration - (i*100+1)] != 0) {

Basically, the following arithmetic GV.MemDecoration - (i*100+1) is equal to -1 which makes it access 4 bytes before the array:

0:004> ?? pMyDisasm->Reserved_.MemDecoration
int 0n0
0:004> ?? i
int 0n0

This has been caught using address-sanitizer on Windows:

=================================================================
==16444==ERROR: AddressSanitizer: global-buffer-overflow on address 0x7ff67bf4dbfc at pc 0x7ff67bbd3a36 bp 0x002bd7cfd680 sp 0x002bd7cfd688
READ of size 4 at 0x7ff67bf4dbfc thread T16777215
    #0 0x7ff67bbd3a35 in FixOpSizeForMemoryOperand C:\work\codes\rp\src\third_party\beaengine\src\Includes\Routines_Disasm.c:192
    #1 0x7ff67bce538d in Disasm C:\work\codes\rp\src\third_party\beaengine\src\Includes\Routines_Disasm.c:35
    #2 0x7ff67ba75cfd in IntelBeaEngine::disass(unsigned char const *, unsigned __int64, unsigned __int64, enum DisassEngineReturn &) C:\work\codes\rp\src\rp\intelbeaengine.hpp:28

Cheers
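
The arithmetic from this report is easy to reproduce. The sketch below mirrors the index expression and adds a lower-bound check; `args_size_index` is a hypothetical name, and the upper bound is left out because the length of `ArgsSize` is not given here.

```python
def args_size_index(mem_decoration: int):
    """Mirror GV.MemDecoration - (i*100+1); return None when it would be negative."""
    group = mem_decoration // 100
    index = mem_decoration - (group * 100 + 1)
    return index if index >= 0 else None


print(args_size_index(0))    # the case from the report: index would be -1
print(args_size_index(104))
```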

Incorrect decoding of vp instructions

The vp instructions are not decoded correctly.
For example:
"c5 f1 ef c9" should be decoded to "vpxor xmm1,xmm1,xmm1" (instead of "lds...").
"c5 f5 74 01" should be decoded to "vpcmpeqb ymm0,ymm1,ymmword ptr [ecx]" (instead of "lds...").

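Both byte strings start with `c5`, the two-byte VEX prefix. In 32-bit mode `c5` is also the LDS opcode; the two are told apart by the top two bits of the following byte, which must both be 1 for a VEX prefix. The sketch below decodes the prefix fields only; the function names are hypothetical and this is not BeaEngine's decoder.

```python
def looks_like_vex2_in_32bit(next_byte: int) -> bool:
    """In 32-bit mode, C5 starts a VEX prefix only if the next byte has bits 7:6 = 11."""
    return (next_byte & 0xC0) == 0xC0


def decode_vex2(prefix: bytes) -> dict:
    """Split a 2-byte VEX prefix (C5 xx) into its fields."""
    if len(prefix) != 2 or prefix[0] != 0xC5:
        raise ValueError("not a 2-byte VEX prefix")
    payload = prefix[1]
    return {
        "R": ((payload >> 7) & 1) ^ 1,       # stored inverted
        "vvvv": (~(payload >> 3)) & 0xF,     # stored inverted: extra source register
        "L": (payload >> 2) & 1,             # 0 = 128-bit, 1 = 256-bit
        "pp": payload & 3,                   # 1 = implied 66 prefix
    }


print(decode_vex2(bytes.fromhex("c5f1")))  # prefix of "c5 f1 ef c9"
print(decode_vex2(bytes.fromhex("c5f5")))  # prefix of "c5 f5 74 01"
```

For the two examples, `vvvv` decodes to register 1 and `L` selects xmm for the first and ymm for the second, which agrees with the expected `vpxor xmm1,xmm1,xmm1` and `vpcmpeqb ymm0,ymm1,...`.
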
Two stack corruptions

In Routines_ModRM.c, there are 2 lines declaring "char str[2]"; sprintf then writes 2 chars plus the NUL terminator into it, which corrupts the stack.

pop dword ptr [esp]

In #2 you applied a change that broke decoding of pop dword ptr [esp+displ].

  • 8F0424 must be disassembled to pop dword ptr [esp], not pop dword ptr [esp+4]
  • 8F442404 must be disassembled to pop dword ptr [esp+4], not pop dword ptr [esp+8]
  • 8F8424F0000000 must be disassembled to pop dword ptr [esp+F0], not pop dword ptr [esp+F4]
  • and so on

While the current output might seem more "logical", it's a discrepancy from all other disassemblers and assemblers. Copying output from BeaEngine into an x86 assembler will result in a broken program.
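
The background is that, according to the Intel manual, POP computes the effective address of an ESP-based memory destination after it has incremented ESP. The sketch below models only that rule for a 32-bit pop; `pop_dword_to_esp_plus` is a hypothetical helper. It shows why a displacement of 4 in the encoding ends up 8 bytes above the old stack pointer, while assemblers still expect the encoded displacement in the text.

```python
def pop_dword_to_esp_plus(esp: int, disp: int):
    """Return (store_address, new_esp) for `pop dword ptr [esp+disp]`."""
    new_esp = esp + 4            # ESP is incremented first
    return new_esp + disp, new_esp


store_address, new_esp = pop_dword_to_esp_plus(0x1000, 4)   # 8F 44 24 04
print(hex(store_address), hex(new_esp))
```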

A strange behavior in the ShowSegmentRegs option.

My code:

/*
 * mov rax,qword ptr ds:[2D40D36016C]
 * mov qword ptr ds:[2D4127C48A2],rax
 * lea rdx,qword ptr ds:[2D40DB7449C]
 */
#include <stdio.h>
#include "beaengine/BeaEngine.h"

unsigned char Ins[]="\x48\x8B\x05\x65\x01\x00\x00\x48\x89\x05\x94\x48\x46\x05\x48\x8D\x15\x87\x44\x81\x00";

int main(void)
{
    DISASM disAsm = {0};
    int len;
    unsigned char* cur = Ins;
    /* sizeof includes the string terminator, so exclude it */
    unsigned char* pEnd = Ins + sizeof(Ins) - 1;

    /* VirtualAddr is set once and not advanced, as in the original report */
    disAsm.VirtualAddr = 0x18100000000;
    disAsm.Archi = 0x40;		// 0x40 = x64, 0x20 = x86
    disAsm.Options = MasmSyntax | ShowSegmentRegs | PrefixedNumeral;

    while (!disAsm.Error && cur < pEnd)
    {
        disAsm.EIP = (UInt64)cur;
        /* number of bytes left in the buffer */
        disAsm.SecurityBlock = (UInt32)(pEnd - cur);

        len = Disasm(&disAsm);

        switch (disAsm.Error)
        {
        case OUT_OF_BLOCK:
            break;
        case UNKNOWN_OPCODE:
            printf("%s \n", disAsm.CompleteInstr);
            cur += 1;
            disAsm.Error = 0;
            break;
        default:
            printf("%s \n", disAsm.CompleteInstr);
            cur += len;
            break;
        }
    }
    return 0;
}

Output:

mov rax, qword ptr ??:[0x000001810000016C]
mov qword ptr ??:[0x000001810546489B], rax
lea rdx, qword ptr ??:[0x000001810081448E]
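
Separately from the `??` segment names, the addresses in this output can be checked by hand: a RIP-relative target is the address of the next instruction plus the signed 32-bit displacement. The sketch below is plain arithmetic on the bytes from the report, not BeaEngine code, and `rip_targets` is a hypothetical helper. It computes the targets twice: once with the same base address for every instruction, once with the base advanced by the instruction offset.

```python
import struct

CODE = bytes.fromhex("488B0565010000" "48890594484605" "488D1587448100")
BASE = 0x18100000000


def rip_targets(code: bytes, base: int, advance: bool) -> list:
    """Targets of the 7-byte instructions (REX.W, opcode, ModRM, disp32)."""
    targets = []
    for offset in range(0, len(code), 7):
        (disp,) = struct.unpack_from("<i", code, offset + 3)
        address = base + offset if advance else base
        targets.append(address + 7 + disp)  # next instruction + displacement
    return targets


fixed = rip_targets(CODE, BASE, advance=False)
advanced = rip_targets(CODE, BASE, advance=True)
print([f"{t:#018x}" for t in fixed])
print([f"{t:#018x}" for t in advanced])
```

The first list reproduces the three addresses printed above, which is consistent with a loop that advances EIP but leaves VirtualAddr unchanged. The second list ends in `...48A2` and `...449C`, the same low digits as in the comment at the top of the code.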

ARGUMENTS_TYPE ISSUE

MEMORY_TYPE = NO_ARGUMENT|REGISTER_TYPE
This can lead to mistakes when using a bit test (e.g. Asm.Argument2.ArgType&NO_ARGUMENT).
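
The pitfall can be shown in a few lines. The constant values and `TYPE_MASK` below are assumptions chosen only to satisfy the relation stated in this report; they are not copied from a BeaEngine header.

```python
# Illustrative values only (assumption): they satisfy the relation in the report.
NO_ARGUMENT = 0x10000000
REGISTER_TYPE = 0x20000000
MEMORY_TYPE = NO_ARGUMENT | REGISTER_TYPE
TYPE_MASK = 0xF0000000  # hypothetical mask covering the type bits


def naive_is_no_argument(arg_type: int) -> bool:
    """Bit test as in the report: also true for MEMORY_TYPE."""
    return bool(arg_type & NO_ARGUMENT)


def is_no_argument(arg_type: int) -> bool:
    """Compare the masked type field instead of testing a single bit."""
    return (arg_type & TYPE_MASK) == NO_ARGUMENT


memory_operand = MEMORY_TYPE | 0x1
print(naive_is_no_argument(memory_operand), is_no_argument(memory_operand))
```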

ArgSize for JCC short set to 32 rather than 8

Example:

int main()
{
    BYTE instr[] = { 0x75, 0x01 };

    DISASM disasm;
    ZeroMemory(&disasm, sizeof(disasm));
    disasm.EIP = (UIntPtr)instr;
    disasm.VirtualAddr = 0x401000;

    Disasm(&disasm);
    printf("%s\n", disasm.CompleteInstr);
    printf("%d\n", disasm.Argument1.ArgSize);
    printf("%d\n", disasm.Argument2.ArgSize);
    return 0;
}

Output

jne 00401003h
32
0

Invalid LockPrefix

The disassembly of \xf0\x48\x89\xce\xc3 is mov rsi, rcx ; ret when it should actually be prefixed by the lock prefix. I've done a bit of debugging: BeaEngine sees the lock prefix but tags it as InvalidPrefix.

Also note that BeaEngine seems to behave the same way on the same bytes disassembled as x86.

A workaround would be appreciated!

Cheers

BeaEngineDelphi64 header issues

Hi, there are a couple of issues with the BeaEngineDelphi64 header:

  • TREGISTERTYPE declares a "type" field, but that is a reserved keyword; it should be renamed or prefixed with '&' as an escape

  • ESReg, DSReg, etc. use the C '0x' prefix, while it should be '$'

    ESReg = $1;
    DSReg = $2;
    FSReg = $4;
    GSReg = $8;
    CSReg = $10;
    SSReg = $20;
    

Several bugs to report

Expected: mov r8b, 0x1; actual: mov al, 0x01
Expected: mov r9b, 0x1; actual: mov cl, 0x01
There are many more cases like these.
Expected: fstsw ax; actual: wait fstsw
And so on, a whole series of similar cases.
Expected: add rax, 0xFFFFFFFFAF1DD900; actual: add rax, 0x00000000AF1DD900
Expected: movsxd rax, edx; actual: movsxd rax, rdx
There are many more cases like these.
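
The `add rax` case comes down to sign extension: in 64-bit mode a 32-bit immediate is sign-extended to 64 bits before it is used. A minimal sketch of that rule, with a hypothetical helper name:

```python
def sign_extend_imm32(imm32: int) -> int:
    """Sign-extend a 32-bit immediate to an unsigned 64-bit value."""
    imm32 &= 0xFFFFFFFF
    if imm32 & 0x80000000:
        imm32 |= 0xFFFFFFFF00000000
    return imm32


print(f"{sign_extend_imm32(0xAF1DD900):#018x}")
```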

Incorrect Operand AccessMode parsing

Incorrect operand parsing:

x64 assembly: xor r11b, 0
in binary: 41 80 f3 00

Here is a part of test code:

DISASM infos;
int len;
BYTE buf[4] = { 0x41, 0x80, 0xf3, 0x00 };
// init
memset(&infos, 0, sizeof(DISASM));
infos.EIP = (UINT64)buf;
infos.VirtualAddr = 0;
// disasm
len = Disasm(&infos);
cout << infos.Operand1.AccessMode << endl;

I expect infos.Operand1.AccessMode to be 3 (3 = READ | WRITE),
but it prints 2 (2 = WRITE).

I checked xor rcx, Imm: the result is 3, which means rcx is read and then written.
But when disassembling xor r11b, Imm, the result is 2, which means r11 is only written.
I think the result should be 3, the same as in the former case.

Could you please fix the library directory location

Could you please use ${CMAKE_BINARY_DIR} instead of ${CMAKE_SOURCE_DIR} for lib and bin directories in CMakeLists.txt?

set (myLIB_OUTPUT "${CMAKE_SOURCE_DIR}/lib/${myQualification}" )
set (myBIN_OUTPUT "${CMAKE_SOURCE_DIR}/bin/${myQualification}")

Incorrect decode of vextract128

The vextract128 family destination register is disassembled in reversed order with a spurious ymm0 register, for instance

c4 c3 7d 19 c0 01 vextractf128 xmm8,ymm0,0x1

is disassembled as

vextractf128 ymm0, ymm0, xmm8, 01h
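
The expected operands can be derived from the prefix and ModRM bytes. The sketch below splits the three-byte VEX prefix `c4 c3 7d` into its fields and applies the R and B extension bits to the ModRM byte `c0`; the function names are hypothetical and this is not BeaEngine's decoder. For vextractf128 the ModRM.rm field names the destination and ModRM.reg the source.

```python
def decode_vex3(prefix: bytes) -> dict:
    """Split a 3-byte VEX prefix (C4 xx xx) into its fields."""
    if len(prefix) != 3 or prefix[0] != 0xC4:
        raise ValueError("not a 3-byte VEX prefix")
    b1, b2 = prefix[1], prefix[2]
    return {
        "R": ((b1 >> 7) & 1) ^ 1,      # R, X, B are stored inverted
        "X": ((b1 >> 6) & 1) ^ 1,
        "B": ((b1 >> 5) & 1) ^ 1,
        "map": b1 & 0x1F,              # 1 = 0F, 2 = 0F 38, 3 = 0F 3A
        "W": (b2 >> 7) & 1,
        "vvvv": (~(b2 >> 3)) & 0xF,    # stored inverted
        "L": (b2 >> 2) & 1,            # 1 = 256-bit
        "pp": b2 & 3,                  # 1 = implied 66 prefix
    }


def modrm_registers(vex: dict, modrm: int):
    """Return (reg, rm) register numbers with the VEX.R / VEX.B extensions applied."""
    reg = (vex["R"] << 3) | ((modrm >> 3) & 7)
    rm = (vex["B"] << 3) | (modrm & 7)
    return reg, rm


vex = decode_vex3(bytes.fromhex("c4c37d"))
reg, rm = modrm_registers(vex, 0xC0)
print(vex)
print("source ymm%d, destination xmm%d" % (reg, rm))
```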

Incorrect decode for vaddss with 3 registers

vaddss with 3 xmm registers is decoded with the last register as a memory access, for instance

c5 fa 58 c4 vaddss xmm0,xmm0,xmm4

is disassembled as

vaddss xmm0, xmm0, dword ptr [xmm4]

different versions of BeaEngine.h and basic_types.h.

headers\include\basic_types.h is not up to date with include\beaengine\basic_types.h.
There are 3 different versions of BeaEngine.h in the repo.

Also, why is the same file duplicated in the first place? (The same applies to export.h.)

__impl__ Disasm

I think I need the lib for it, but I don't have it. Does anyone have the lib?

BeaEnginePython reading invalid address

Hello,

I installed the BeaEnginePython via pip and was going to test out the python examples, specifically this one:

#!/usr/bin/python3

from BeaEnginePython import *

instr = Disasm(bytes.fromhex('90'))
instr.read()
print(instr.repr())

However, I immediately got the following error:

Traceback (most recent call last):
  File "c:/Users/Michael/Desktop/python/Test.py", line 5, in <module>
    instr = Disasm(bytes.fromhex('90'))
OSError: exception: access violation reading 0x00000090

It appears to be interpreting the bytes as an address. Perhaps there is a structure that needs to be pre-set similar to the C wrapper?
Thanks.
