
mbugid's People

Contributors

skylined, un-fmunozs


mbugid's Issues

Consider reimplementing debugger

cdb should be replaced by a BugId specific alternative, because an open source alternative would allow me to:

  • integrate better (do away with sending stdin and parsing stdout, but have a real API).
  • fix any bugs I find myself and immediately.
  • add improvements and new features.

The downside of this is that such an alternative would require a lot of work to develop from scratch, test against real applications and maintain. As such, I do not expect development to start soon.

However, I have previously written a simple debugger (https://github.com/SkyLined/jsondebugger) that implements the basics. Based on my experience there, this may not be entirely unfeasible.

Potentially detect bad casts using structure size

A bad cast involves a pointer in a local variable, object member or function argument that points to a different class of object than the code expects. If the symbols for the module in which the bug is detected provide type information on local variables and arguments, we could determine what class the code expects any pointers among them to point to, and recursively do the same for any pointers in the members of the objects they point to. Page heap will be able to tell us the size of the memory allocated to store these objects. If any pointer + size of the class the code thinks it points to falls outside the allocated memory, that is an obvious bad cast.
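The size check described above could be sketched as follows. This is a minimal illustration, not BugId code: the function name and its parameters are hypothetical, and it assumes the expected class size has already been read from the symbols and the allocation's start address and size from page heap.

```python
def is_obvious_bad_cast(pointer, expected_class_size, block_start, block_size):
    """Return True if an object of the expected class cannot fit in the heap block.

    All arguments are plain integers (addresses and sizes in bytes).
    """
    block_end = block_start + block_size          # first address past the allocation
    object_end = pointer + expected_class_size    # first address past the expected object
    return pointer < block_start or object_end > block_end


# A 0x20 byte allocation cannot hold an object the code believes is 0x40 bytes.
print(is_obvious_bad_cast(0x1000, 0x40, 0x1000, 0x20))
```

A pointer to a smaller class inside a larger allocation passes this check, so it only catches the obvious case and says nothing about casts between classes of compatible size.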

Otherwise, if the object has a vftable, we could get an indication of the likely actual class and report if it differs from the expected class. However, this should be presented as information and not as a guaranteed bug, as it may indicate a valid cast rather than a bad cast. Also, if there are multiple classes with the exact same vftable, they may have been combined to optimize memory use and the symbols may be wrong.

Another option is to scan the memory allocation stack provided by page heap for "new xxx" functions to determine the class, but that is probably even more likely to yield misleading information, so it should come with big caveats if provided in the report.

BugId does not handle multi-line values in Get-AppxPackage output correctly.

I got the below bug report through email. The underlying problem is that Get-AppxPackage can output a Name : Value pair over multiple lines. I attempted to address this in my fix for issue #81; however, I was unable to test the fix, and looking at the code I do not believe that it worked. That fix also does not address the issue below, so a more comprehensive fix is needed.
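One way to handle wrapped values is to treat any indented line as a continuation of the previous property. The sketch below is not the code in cUWPApplication.py; the function name is hypothetical, and it assumes that continuation lines always start with whitespace and that a wrapped value can be rejoined with a single space.

```python
import re

# A property line starts in column 0: "Name              : Value"
PROPERTY_LINE = re.compile(r"^(\w+)\s*:\s?(.*)$")


def parse_appx_package_output(output):
    """Parse 'Name : Value' lines, folding indented continuation lines into the previous value."""
    properties = {}
    last_name = None
    for line in output.splitlines():
        if not line.strip():
            continue  # blank lines pad the output
        match = PROPERTY_LINE.match(line)
        if match:
            last_name = match.group(1)
            properties[last_name] = match.group(2).strip()
        elif line[0].isspace() and last_name is not None:
            # Indented line: the previous value was wrapped onto this line.
            joined = properties[last_name] + " " + line.strip()
            properties[last_name] = joined.strip()
        else:
            raise ValueError("Unrecognized Get-AppxPackage output: %r" % line)
    return properties
```

If PowerShell wraps a value in the middle of a word rather than after a space, joining with a space would corrupt it, so widening the output in the PowerShell command itself may be the more robust fix.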

Original report

Dear Skylined,

It appears the current version of BugId has a problem when fuzzing the Edge browser. Please see the exception error:

* UWP application id: MicrosoftEdge, package name: Microsoft.MicrosoftEdge, Arguments: file://C:\Fuzzing\Tests\index.html

┌─ An internal exception has occured ───────────────────────────────────────────────────────
│ AssertionError("Unrecognized Get-AppxPackage output: '                    S=Washington, C=US' in\r\n\r\n\r\nName              : Microsoft.MicrosoftEdge\r\nPublisher         : CN=Microsoft Corporation, O=Microsoft Corporation, L=Redmond, \r\n                    S=Washington, C=US\r\nArchitecture      : Neutral\r\nResourceId        : \r\nVersion           : 44.17763.1.0\r\nPackageFullName   : Microsoft.MicrosoftEdge_44.17763.1.0_neutral__8wekyb3d8bbwe\r\nInstallLocation   : C:\\Windows\\SystemApps\\Microsoft.MicrosoftEdge_8wekyb3d8bbwe\r\nIsFramework       : False\r\nPackageFamilyName : Microsoft.MicrosoftEdge_8wekyb3d8bbwe\r\nPublisherId       : 8wekyb3d8bbwe\r\nIsResourcePackage : False\r\nIsBundle          : False\r\nIsDevelopmentMode : False\r\nNonRemovable      : True\r\nIsPartiallyStaged : False\r\nSignatureKind     : System\r\nStatus            : Ok\r\n\r\n\r\n",)
│
│  Stack:
│   0 __init__ @ C:\Fuzzing\BugId\modules\cBugId\cUWPApplication.py/45
│      > "Unrecognized Get-AppxPackage output: %s in\r\n%s" % (repr(sLine), "\r\n".join(asQueryOutput));
│   1 __init__ @ C:\Fuzzing\BugId\modules\cBugId\cCdbWrapper.py/88
│      > oCdbWrapper.oUWPApplication = sUWPApplicationPackageName and cUWPApplication(sUWPApplicationPackageName, sUWPApplicationId) or None;
│   2 __init__ @ C:\Fuzzing\BugId\modules\cBugId\cBugId.py/112
│      > uMaximumNumberOfBugs = uMaximumNumberOfBugs,
│   3 fMain @ C:\Fuzzing\BugId\BugId.py/883
│      > uMaximumNumberOfBugs = guMaximumNumberOfBugs,
│   4 C:\Fuzzing\BugId\BugId.py/976
│      > fMain(sys.argv[1:]);

Thanks in advance,
Joan

Memleak in cBugId cleanup?

Hey,

Using cBugId to hit an AV target, my code is something like this:

def finishedCbk(oBugId, oBugReport):
    write2file(oBugReport.sReportHTML)

while True:
    # special process setup / verifying it's running correctly
    oBugId = cBugId(
        sCdbISA = "x64",
        auApplicationProcessIds = [pid],
        fFinishedCallback = finishedCbk)

    oBugId.fStart();

    # start a fuzzer thread asynchronously here, doesn't interact with oBugId

    oBugId.fWait();

    # ensure teardown of target process

    # tried calling fStop here, del oBugId, etc. no difference

    # dump largest str objects in memory, compare to strs from last time.

On that last line I'm seeing the largest string as always being around 5 MB, and it appears to be a portion of the HTML report related to the CDB command listing. It's the same kind of snippet each time this loops. It never seems to get garbage collected, even when calling gc.collect(). A dump of the first few thousand chars is below:

<hr/><span class="CDBPrompt">0:018&gt; </span><span class="CDBCommand">.prompt_allow -dis -ea -reg -src -sym;</span> <span class="CDBComment">$ Display only the prompt</span><br/><br/><hr/><span class="CDBPrompt">0:018&gt; </span><span class="CDBCommand">.pcmd -s &quot;.echo;&quot;;</span> <span class="CDBComment">$ Output a CRLF after running the application</span><br/><br/><hr/><span class="CDBPrompt">0:018&gt; </span><span class="CDBCommand">.lastevent;</span> <span class="CDBComment">$ Get information about last event</span><br/><br/><hr/><span class="CDBPrompt">0:018&gt; </span><span class="CDBCommand">!peb;</span> <span class="CDBComment">$ Get current proces environment block</span><br/><br/><hr/><span class="CDBPrompt">0:018&gt; </span><span class="CDBCommand">lmov a 0x7FF62C250000;</span> <span class="CDBComment">$ Get module information</span><br/><br/><hr/><hr/><span class="CDBPrompt">0:018&gt; </span><span class="CDBCommand">.childdbg 1;</span> <span class="CDBComment">$ Debug child processes</span><br/><br/><hr/><span class="CDBPrompt">0:018&gt; </span><span class="CDBCommand">sxd *;sxi ld;sxi ud;sxd 0xC0000094;sxd 0xC0000095;sxd 0xC0000008;sxd 0xC0000235;sxd 0x80000004;sxd 0x4000001E;sxd 0x40080201;sxd 0xE06D7363;sxe cpr;sxe ibp;sxe epr;sxe aph;sxe 0xC0000005;sxe 0xC0000420;sxe 0x80000003;sxe 0xC000008C;sxe 0x80000002;sxe 0xC0000602;sxe 0x80000001;sxe 0xC000001D;sxe 0xC0000006;sxe 0xC0000096;sxe 0xC0000409;sxe 0xC00000FD;sxe 0x4000001F;sxe 0x80000007;</span> <span class="CDBComment">$ Setup exception handling</span><br/><br/><hr/><span class="CDBPrompt">0:018&gt; </span><span class="CDBCommand">~*m;</span> <span class="CDBComment">$ Resume all threads</span><br/><br/><hr/><hr/><span class="CDBPrompt">0:018&gt; </span><span class="CDBCommand">!heap -p;</span> <span class="CDBComment">$ Get page heap status</span><br/><br/><hr/><span class="CDBPrompt">0:018&gt; </span><span class="CDBCommand">.time;</span> <span class="CDBComment">$ Get debugger time</sp

I'm sure you can spot where the GC lifetime of whatever that is should be, if not I can do more work on figuring out python to track down its refs.
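For tracking down what keeps such a string alive, a sketch along these lines may help. It is an assumption about how the check could be done, not the reporter's actual script; the function name is hypothetical. In CPython, strings are not tracked by the garbage collector themselves, so the sketch walks every tracked container and looks at what it refers to directly.

```python
import gc


def find_large_strings(min_length):
    """Map id(string) -> (string, names of the container types that reference it)."""
    found = {}
    for holder in gc.get_objects():               # every container the collector tracks
        for referent in gc.get_referents(holder):  # objects the container points at
            if isinstance(referent, str) and len(referent) >= min_length:
                entry = found.setdefault(id(referent), (referent, []))
                entry[1].append(type(holder).__name__)
    return found
```

Calling this after each loop iteration and comparing the results shows which strings survive, and the holder type names give a first hint about where the reference lives; gc.get_referrers can then be used on a holder to walk further up the chain.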

Use internal disassembler

BugId currently relies on cdb.exe to disassemble code. Because I'd like to stop using the closed-source and sometimes buggy cdb.exe in favor of open-source alternatives (so I can fix any bugs I encounter), I will need to find an alternative disassembler.

distorm (https://github.com/gdabah/distorm) is probably the best option for this.

Consider reimplementing page-heap

Page heap/Application Verifier should be replaced by a BugId specific alternative, because an open source alternative would allow me to:

  • fix any bugs I find myself and immediately.
  • add improvements and new features.
  • integrate it with other heap managers, so applications that do not use the Windows heap manager can have the same benefits.

The downside of this is that such an alternative would require a lot of work to develop from scratch, test against real applications and maintain. As such, I do not expect development to start soon.

Allow cBugId to work as a JIT debugger by setting this up in the registry.

Allowing cBugId to work as a JIT debugger would have a number of benefits:

  1. No overhead from debugging the application when there's no crash.
  2. Application will not detect debugger because there is none until it crashes.
  3. Avoid dbgsrv.exe problems (SkyLined/BugId#85).

It also has a number of problems:

  1. Binaries that do first-chance exception handling may prevent JIT debugger from being activated and leave crashes undetected.
  2. JIT debugger setup through registry applies to all applications; coincidental crashes in unrelated applications will get reported.

I think it should be possible to modify the code such that cBugId can work as both a regular debugger and a JIT debugger at the same time so users can decide which option to use for various use-cases.
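Windows reads the JIT (postmortem) debugger command from the AeDebug registry key, replacing the first %ld with the id of the crashing process and the second with an event handle. The sketch below shows how such a registration could look. It is written for Python 3 (the module is named _winreg in Python 2), the helper names are hypothetical, and the jit_handler.py script in the usage example is a made-up placeholder, not an existing BugId entry point.

```python
AEDEBUG_KEY = r"SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug"


def build_jit_command(executable, arguments):
    """Build the AeDebug 'Debugger' command line, quoting parts that contain spaces."""
    parts = ['"%s"' % executable]
    for argument in arguments:
        parts.append('"%s"' % argument if " " in argument else argument)
    return " ".join(parts)


def install_jit_debugger(command):
    """Register the command as the system-wide JIT debugger (needs admin rights)."""
    import winreg  # Windows only, so imported here rather than at module level
    with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, AEDEBUG_KEY, 0,
                            winreg.KEY_SET_VALUE) as key:
        winreg.SetValueEx(key, "Debugger", 0, winreg.REG_SZ, command)
        # "1" starts the debugger without asking the user first.
        winreg.SetValueEx(key, "Auto", 0, winreg.REG_SZ, "1")


# Hypothetical handler script that would receive the process id and event handle.
command = build_jit_command(r"C:\Python27\python.exe",
                            [r"C:\BugId\jit_handler.py", "%ld", "%ld"])
print(command)
```

Because the key is machine-wide, this registration is exactly what causes the second problem listed above: any crashing application will start the handler. On 64-bit Windows, 32-bit applications use the corresponding key under Wow6432Node.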

Debugger Never Attaches

I get this error when I attempt to attach to Edge (Win 10) with cdb. I have the following paths set:

  • Set cdb=C:\Program Files (x86)\Windows Kits\10\Debuggers\x64\cdb.exe
  • Set BugId=C:\Users\test\Desktop\BugId-master\BugId-master\BugId.py

The relevant lines in EdgeBugId.cmd:

Exception in thread Thread-1:
Traceback (most recent call last):
  File "C:\Python27\Lib\threading.py", line 801, in __bootstrap_inner
    self.run()
  File "C:\Python27\Lib\threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "C:\Users\test\Desktop\BugId-master\BugId-master\modules\cBugId\cCdbWrapper.py", line 213, in _fThreadWrapper
    oCdbWrapper.fInternalExceptionCallback(oException);
  File "C:\Users\test\Desktop\BugId-master\BugId-master\modules\cBugId\cCdbWrapper.py", line 207, in _fThreadWrapper
    fActivity(oCdbWrapper);
  File "C:\Users\test\Desktop\BugId-master\BugId-master\modules\cBugId\cCdbWrapper_fCdbStdInOutThread.py", line 117, in cCdbWrapper_fCdbStdInOutThread
    oCdbWrapper.nApplicationResumeDebuggerTime = fnGetDebuggerTime(oTimeMatch.group(1));
  File "C:\Users\test\Desktop\BugId-master\BugId-master\modules\cBugId\cCdbWrapper_fCdbStdInOutThread.py", line 19, in fnGetDebuggerTime
    assert oTimeMatch, "Cannot parse debugger time: %s" % repr(sDebuggerTime);
AssertionError: Cannot parse debugger time: 'Tue Sep  6 22:33:10.098 2016 (UTC - 7:00)'

I've tried with different URLs, running it as admin, manually checking that cdb.exe was working at that path (it is). Not sure what else to try.
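The report does not show which part of the time string the pattern in fnGetDebuggerTime rejects. As an illustration only, the sketch below parses the exact string from the assertion message, including the space-padded day and the negative UTC offset. It is written for Python 3, the function name is hypothetical, and the optional-offset handling for a plain "(UTC)" suffix is an assumption about cdb's output.

```python
import re
from datetime import datetime, timedelta, timezone

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

DEBUGGER_TIME = re.compile(
    r"^\w{3} (\w{3}) +(\d{1,2}) (\d{2}):(\d{2}):(\d{2})\.(\d{3}) (\d{4})"
    r" \(UTC(?: ([+-]) (\d{1,2}):(\d{2}))?\)$"
)


def parse_debugger_time(text):
    """Parse e.g. 'Tue Sep  6 22:33:10.098 2016 (UTC - 7:00)' into an aware datetime."""
    match = DEBUGGER_TIME.match(text)
    if match is None:
        raise ValueError("Cannot parse debugger time: %r" % text)
    (month, day, hour, minute, second, millis, year,
     sign, offset_hours, offset_minutes) = match.groups()
    offset = timedelta(0)
    if sign:
        offset = timedelta(hours=int(offset_hours), minutes=int(offset_minutes))
        if sign == "-":
            offset = -offset
    return datetime(int(year), MONTHS.index(month) + 1, int(day),
                    int(hour), int(minute), int(second),
                    int(millis) * 1000, tzinfo=timezone(offset))
```

Parsing the month name by lookup rather than with strptime keeps the result independent of the system locale.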

Use internal stack unwinder

BugId currently relies on cdb.exe to unwind stacks. Because I'd like to stop using the closed-source and sometimes buggy cdb.exe in favor of open-source alternatives (so I can fix any bugs I encounter), I will need to find an alternative stack unwinder.

I believe it should not be too hard to implement this using direct Windows API calls and the dbghelp.dll module.

Allow cBugId to handle dump files.

Since cdb.exe can load dump files and I have plans to make cBugId work as a JIT debugger, it should be possible to have cBugId work with dump files. This would allow offline creation of bug reports from dump files collected during fuzzing.

Consider porting to WinAppDbg

WinAppDbg implements a debugger using ctypes in Python. Reimplementing BugId using WinAppDbg should have the following benefits:

  1. direct control over debugging behavior
    1a) faster debugging because of more fine-grained control over what the debugger does.
    1b) better debugging because issues can be reported and fixed upstream.
  2. more control over application
    2a) easier to meddle with the debugged process when additional analysis requires this.
    2b) possibility to reimplement page-heap and add features and/or support for other heap implementations.

http://winappdbg.sourceforge.net/

usage of os.path.filename in cBugId.py?

Is this the correct function name? I didn't see it listed in https://docs.python.org/2/library/os.path.html, and I have always had to use the second element of os.path.split to get the file name.

This results in the following exception, but the intent is pretty obvious and thus it's not anything too critical.

********************************************************************************

Traceback (most recent call last):
  File "BugId.py", line 45, in <module>
    __import__(sModuleName, globals(), locals(), [], -1);
  File "Z:\work\cBugId\__init__.py", line 1, in <module>
    from cBugId import cBugId;
  File "Z:\work\cBugId\cBugId.py", line 40, in <module>
    print "%s depends on %s which you can download at:" % (os.path.filename(__file__), sModuleName);
AttributeError: 'module' object has no attribute 'filename'
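The standard library function for this is os.path.basename, which returns the same value as the second element of os.path.split. A short check, using ntpath (the Windows flavour of os.path) so that it behaves the same on any platform:

```python
import ntpath  # os.path is this module when running on Windows

path = r"Z:\work\cBugId\cBugId.py"
file_name = ntpath.basename(path)

# basename() and split()[1] agree on the final path component.
print(file_name, ntpath.split(path)[1] == file_name)
```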
