skylined / mbugid
Python module to detect, analyze and id application bugs
License: Other
https://twitter.com/epakskape/status/943628615337439233
I think this may be more reliable, in that it won't spawn a flurry of breakpoints that I have to guess are related to process creation, as is currently the case.
The Windows Debugger team have added time-travel debugging (TTD) to WinDbg. I assume it also works in cdb.exe. It might be useful in tracking down the root cause of a bug, e.g. uninitialized memory or maybe even help detect bad casts.
cdb should be replaced by a BugId-specific alternative, because an open-source alternative would allow me to:
The downside of this is that such an alternative would require a lot of work to develop from scratch, test against real applications and maintain. As such, I do not expect development to start soon.
However, I have previously written a simple debugger (https://github.com/SkyLined/jsondebugger) that implements the basics. Based on my experience there, this may not be entirely unfeasible.
Windows Debuggers now have JavaScript extensibility, which could be used to reimplement some or all of the code in a more robust way.
https://blogs.msdn.microsoft.com/windbg/2016/10/27/new-insider-sdk-and-javascript-extensibility/
A bad cast involves a pointer in a local variable, object member or function argument pointing to a different class of object than the code expects. If the symbols for the module in which the bug is detected provide type information on local variables and arguments, we could determine what class the code expects any pointers among them to point to, and recursively do the same for any pointers in the members of the objects they point to. Page heap will be able to tell us the size of the memory allocated to store these objects. If any pointer + size of the class the code thinks it points to falls outside the allocated memory, that is an obvious bad cast.
Otherwise, if the object has a vftable, we could get an indication of the likely actual class and report if it differs from the expected class. However, this should be presented as information and not as a guaranteed bug, as it may indicate a valid cast rather than a bad cast. Also, if there are multiple classes with the exact same vftable, they may have been combined to optimize memory use and the symbols may be wrong.
Another option is to scan the memory allocation stack provided by page heap for "new xxx" functions to determine the class, but that is probably even more likely to yield misleading information, so it should come with big caveats if provided in the report.
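The "pointer + size falls outside the allocated memory" check described above could be sketched roughly as follows. This is only an illustration: the function name and parameters are hypothetical, and in practice the expected size would have to come from symbol type information and the allocation bounds from page heap.

```python
def is_obvious_bad_cast(pointer, expected_size, allocation_start, allocation_size):
    """Return True if an object of expected_size at pointer cannot fit in the heap block.

    All four values are plain integers (addresses and sizes in bytes). The
    names are hypothetical and not part of cBugId.
    """
    allocation_end = allocation_start + allocation_size
    # The object the code expects must lie entirely inside the allocation.
    return pointer < allocation_start or pointer + expected_size > allocation_end
```

For example, a 0x20 byte allocation that the code treats as a 0x40 byte class would be flagged, while an exact fit would not.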
I got the bug report below through email. The underlying problem is that Get-AppxPackage can output a Name : Value pair over multiple lines. I attempted to address this in my fix for issue #81; however, I was unable to test the fix and, looking at the code, I do not believe that it worked. That fix also does not address the issue below, so a more comprehensive fix is needed.
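A more tolerant parser might treat any line that starts with whitespace as a continuation of the previous value. The sketch below is not the code in cUWPApplication.py; the function name is hypothetical, and it assumes that wrapped lines are always indented while new Name : Value pairs start in the first column.

```python
import re

# Assumption: a new "Name : Value" pair starts in column 0; a wrapped value
# continues on a following line that begins with whitespace.
_PAIR_RE = re.compile(r"^(\w+)\s*:\s?(.*)$")

def parse_appx_package_output(lines):
    """Parse Get-AppxPackage list output into a dict, joining wrapped values."""
    values = {}
    last_name = None
    for line in lines:
        if not line.strip():
            continue  # blank lines separate blocks and carry no data
        if line[0] in " \t":
            # Indented line: append it to the value of the previous pair.
            if last_name is None:
                raise ValueError("Continuation line before any pair: %r" % line)
            values[last_name] = (values[last_name] + " " + line.strip()).strip()
            continue
        match = _PAIR_RE.match(line)
        if match is None:
            raise ValueError("Unrecognized Get-AppxPackage output: %r" % line)
        last_name = match.group(1)
        values[last_name] = match.group(2).strip()
    return values
```

With the output from the report below, the Publisher value would be joined into a single string that ends in "S=Washington, C=US".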
Dear Skylined,
It appears the current version of BugId has a problem when fuzzing the Edge browser. Please see the exception error:
* UWP application id: MicrosoftEdge, package name: Microsoft.MicrosoftEdge, Arguments: file://C:\Fuzzing\Tests\index.html
┌─ An internal exception has occured ───────────────────────────────────────────────────────
│ AssertionError("Unrecognized Get-AppxPackage output: ' S=Washington, C=US' in\r\n\r\n\r\nName : Microsoft.MicrosoftEdge\r\nPublisher : CN=Microsoft Corporation, O=Microsoft Corporation, L=Redmond, \r\n S=Washington, C=US\r\nArchitecture : Neutral\r\nResourceId : \r\nVersion : 44.17763.1.0\r\nPackageFullName : Microsoft.MicrosoftEdge_44.17763.1.0_neutral__8wekyb3d8bbwe\r\nInstallLocation : C:\\Windows\\SystemApps\\Microsoft.MicrosoftEdge_8wekyb3d8bbwe\r\nIsFramework : False\r\nPackageFamilyName : Microsoft.MicrosoftEdge_8wekyb3d8bbwe\r\nPublisherId : 8wekyb3d8bbwe\r\nIsResourcePackage : False\r\nIsBundle : False\r\nIsDevelopmentMode : False\r\nNonRemovable : True\r\nIsPartiallyStaged : False\r\nSignatureKind : System\r\nStatus : Ok\r\n\r\n\r\n",)
│
│ Stack:
│ 0 __init__ @ C:\Fuzzing\BugId\modules\cBugId\cUWPApplication.py/45
│ > "Unrecognized Get-AppxPackage output: %s in\r\n%s" % (repr(sLine), "\r\n".join(asQueryOutput));
│ 1 __init__ @ C:\Fuzzing\BugId\modules\cBugId\cCdbWrapper.py/88
│ > oCdbWrapper.oUWPApplication = sUWPApplicationPackageName and cUWPApplication(sUWPApplicationPackageName, sUWPApplicationId) or None;
│ 2 __init__ @ C:\Fuzzing\BugId\modules\cBugId\cBugId.py/112
│ > uMaximumNumberOfBugs = uMaximumNumberOfBugs,
│ 3 fMain @ C:\Fuzzing\BugId\BugId.py/883
│ > uMaximumNumberOfBugs = guMaximumNumberOfBugs,
│ 4 C:\Fuzzing\BugId\BugId.py/976
│ > fMain(sys.argv[1:]);
Thanks in advance,
Joan
Hey,
Using cBugId to hit an AV target, my code is something like this:
def finishedCbk(oBugId, oBugReport):
    write2file(oBugReport.sReportHTML)

while True:
    # special process setup / verifying it's running correctly
    oBugId = cBugId(
        sCdbISA = "x64",
        auApplicationProcessIds = [pid],
        fFinishedCallback = finishedCbk)
    oBugId.fStart();
    # start a fuzzer thread asynchronously here, doesn't interact with oBugId
    oBugId.fWait();
    # ensure teardown of target process
    # tried calling fStop here, del oBugId, etc. no difference
    # dump largest str objects in memory, compare to strs from last time.
On that last line I'm seeing that the largest string is always around 5 MB, and it appears to be a portion of the HTML report related to the CDB command listing. It's the same kind of snippet each time this loops. It never seems to get garbage collected, even when calling gc.collect(). A dump of the first few thousand chars is below:
<hr/><span class="CDBPrompt">0:018> </span><span class="CDBCommand">.prompt_allow -dis -ea -reg -src -sym;</span> <span class="CDBComment">$ Display only the prompt</span><br/><br/><hr/><span class="CDBPrompt">0:018> </span><span class="CDBCommand">.pcmd -s ".echo;";</span> <span class="CDBComment">$ Output a CRLF after running the application</span><br/><br/><hr/><span class="CDBPrompt">0:018> </span><span class="CDBCommand">.lastevent;</span> <span class="CDBComment">$ Get information about last event</span><br/><br/><hr/><span class="CDBPrompt">0:018> </span><span class="CDBCommand">!peb;</span> <span class="CDBComment">$ Get current proces environment block</span><br/><br/><hr/><span class="CDBPrompt">0:018> </span><span class="CDBCommand">lmov a 0x7FF62C250000;</span> <span class="CDBComment">$ Get module information</span><br/><br/><hr/><hr/><span class="CDBPrompt">0:018> </span><span class="CDBCommand">.childdbg 1;</span> <span class="CDBComment">$ Debug child processes</span><br/><br/><hr/><span class="CDBPrompt">0:018> </span><span class="CDBCommand">sxd *;sxi ld;sxi ud;sxd 0xC0000094;sxd 0xC0000095;sxd 0xC0000008;sxd 0xC0000235;sxd 0x80000004;sxd 0x4000001E;sxd 0x40080201;sxd 0xE06D7363;sxe cpr;sxe ibp;sxe epr;sxe aph;sxe 0xC0000005;sxe 0xC0000420;sxe 0x80000003;sxe 0xC000008C;sxe 0x80000002;sxe 0xC0000602;sxe 0x80000001;sxe 0xC000001D;sxe 0xC0000006;sxe 0xC0000096;sxe 0xC0000409;sxe 0xC00000FD;sxe 0x4000001F;sxe 0x80000007;</span> <span class="CDBComment">$ Setup exception handling</span><br/><br/><hr/><span class="CDBPrompt">0:018> </span><span class="CDBCommand">~*m;</span> <span class="CDBComment">$ Resume all threads</span><br/><br/><hr/><hr/><span class="CDBPrompt">0:018> </span><span class="CDBCommand">!heap -p;</span> <span class="CDBComment">$ Get page heap status</span><br/><br/><hr/><span class="CDBPrompt">0:018> </span><span class="CDBCommand">.time;</span> <span class="CDBComment">$ Get debugger time</sp
I'm sure you can spot where the GC lifetime of whatever that is should end; if not, I can do more work in Python to track down its refs.
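One way to track down such references is with the standard gc module. The sketch below is written for Python 3 and uses hypothetical helper names; it only sees strings that are referenced from GC-tracked containers, so it can miss strings held only by untracked objects.

```python
import gc

def find_largest_strings(count=3):
    """Return the largest str objects referenced from GC-tracked containers."""
    seen = {}
    for container in gc.get_objects():
        # Strings are not tracked by the GC themselves, so look at what
        # each tracked container refers to.
        for referent in gc.get_referents(container):
            if isinstance(referent, str):
                seen[id(referent)] = referent
    return sorted(seen.values(), key=len, reverse=True)[:count]

def describe_referrers(obj):
    """Return the type names of tracked objects that hold a reference to obj."""
    return [type(referrer).__name__ for referrer in gc.get_referrers(obj)]
```

Calling describe_referrers on the 5 MB string after each loop iteration should show which kind of object (for example a list, dict or frame) is keeping it alive.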
BugId currently relies on cdb.exe to disassemble code. Because I'd like to stop using the closed-source and sometimes buggy cdb.exe in favor of open-source alternatives (so I can fix any bugs I encounter), I will need to find an alternative disassembler.
distorm (https://github.com/gdabah/distorm) is probably the best option for this.
I should explain how to integrate BugId into any framework by instantiating cBugId and adding callbacks.
Page heap/Application Verifier should be replaced by a BugId-specific alternative, because an open-source alternative would allow me to:
The downside of this is that such an alternative would require a lot of work to develop from scratch, test against real applications and maintain. As such, I do not expect development to start soon.
Allowing cBugId to work as a JIT debugger would have a number of benefits:
I think it should be possible to modify the code such that cBugId can work as both a regular debugger and a JIT debugger at the same time so users can decide which option to use for various use-cases.
Looks like one of the ftuLimitedAndAlignedMemoryDumpStartAddressAndSize calls in cBugReport_foAnalyzeException_STATUS_ACCESS_VIOLATION.py is missing a uPointerSize arg.
I get this error when I attempt to attach to Edge (Win 10) with cdb. I have the following paths set:
The relevant lines in EdgeBugId.cmd:
Exception in thread Thread-1:
Traceback (most recent call last):
  File "C:\Python27\Lib\threading.py", line 801, in __bootstrap_inner
    self.run()
  File "C:\Python27\Lib\threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "C:\Users\test\Desktop\BugId-master\BugId-master\modules\cBugId\cCdbWrapper.py", line 213, in _fThreadWrapper
    oCdbWrapper.fInternalExceptionCallback(oException);
  File "C:\Users\test\Desktop\BugId-master\BugId-master\modules\cBugId\cCdbWrapper.py", line 207, in _fThreadWrapper
    fActivity(oCdbWrapper);
  File "C:\Users\test\Desktop\BugId-master\BugId-master\modules\cBugId\cCdbWrapper_fCdbStdInOutThread.py", line 117, in cCdbWrapper_fCdbStdInOutThread
    oCdbWrapper.nApplicationResumeDebuggerTime = fnGetDebuggerTime(oTimeMatch.group(1));
  File "C:\Users\test\Desktop\BugId-master\BugId-master\modules\cBugId\cCdbWrapper_fCdbStdInOutThread.py", line 19, in fnGetDebuggerTime
    assert oTimeMatch, "Cannot parse debugger time: %s" % repr(sDebuggerTime);
AssertionError: Cannot parse debugger time: 'Tue Sep 6 22:33:10.098 2016 (UTC - 7:00)'
I've tried with different URLs, running it as admin, manually checking that cdb.exe was working at that path (it is). Not sure what else to try.
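The pattern that fnGetDebuggerTime uses is not shown here, so the cause of the failure is unknown. As an illustration only, a parser that accepts the string from the assertion, including a negative UTC offset and a single-digit day, might look like the sketch below. The function name and return value are assumptions, not cBugId's API.

```python
import re
from datetime import datetime

_MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

_TIME_RE = re.compile(
    r"^\w+\s+(\w+)\s+(\d+)\s+"                  # weekday (ignored), month, day
    r"(\d+):(\d+):(\d+)\.(\d+)\s+(\d+)"         # hh:mm:ss.mmm and year
    r"(?:\s+\(UTC\s*([+-])\s*(\d+):(\d+)\))?$"  # optional "(UTC - 7:00)"
)

def parse_debugger_time(text):
    """Return (local datetime, UTC offset in minutes or None)."""
    match = _TIME_RE.match(text.strip())
    if match is None:
        raise ValueError("Cannot parse debugger time: %r" % text)
    (month, day, hour, minute, second, millis,
     year, sign, tz_hour, tz_minute) = match.groups()
    # Assumption: the fractional part is in milliseconds (three digits).
    local_time = datetime(int(year), _MONTHS.index(month) + 1, int(day),
                          int(hour), int(minute), int(second), int(millis) * 1000)
    offset_minutes = None
    if sign is not None:
        offset_minutes = int(tz_hour) * 60 + int(tz_minute)
        if sign == "-":
            offset_minutes = -offset_minutes
    return local_time, offset_minutes
```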
BugId currently relies on cdb.exe to unwind stacks. Because I'd like to stop using the closed-source and sometimes buggy cdb.exe in favor of open-source alternatives (so I can fix any bugs I encounter), I will need to find an alternative stack unwinder.
I believe it should not be too hard to implement this using direct Windows API calls and the dbghelp.dll modules.
Since cdb.exe can load dump files and I have plans to make cBugId work as a JIT debugger, it should be possible to have cBugId work with dump files. This would allow offline creation of bug reports from dump files collected during fuzzing.
Some people have asked if I could add a feature that would allow BugId to hide the fact that the application is being debugged, similar to what procHideDebug does in mona (https://github.com/corelan/mona/blob/master/mona.py#L17514).
This should be possible, but it would take more time than I have at this moment, so I am opening this feature request.
WinAppDbg implements a debugger using ctypes in Python. Reimplementing BugId using WinAppDbg should have the following benefits:
Is this the correct function name? I didn't see it listed in https://docs.python.org/2/library/os.path.html, and have always had to use the second element of os.path.split to get the file name.
This results in the following exception, but the intent is pretty obvious and thus it's not anything too critical.
********************************************************************************
Traceback (most recent call last):
  File "BugId.py", line 45, in <module>
    __import__(sModuleName, globals(), locals(), [], -1);
  File "Z:\work\cBugId\__init__.py", line 1, in <module>
    from cBugId import cBugId;
  File "Z:\work\cBugId\cBugId.py", line 40, in <module>
    print "%s depends on %s which you can download at:" % (os.path.filename(__file__), sModuleName);
AttributeError: 'module' object has no attribute 'filename'
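The standard library function that returns the last path component is os.path.basename, which gives the same result as the second element of os.path.split. A small sketch follows; it uses ntpath (the module that os.path refers to on Windows) so that the backslash-separated path from the traceback is handled the same way on any platform. The helper name is hypothetical.

```python
import ntpath  # on Windows, os.path is this module

def file_name_of(path):
    """Return the last component of a Windows-style path."""
    return ntpath.basename(path)

# basename and the second element of split agree with each other.
example_path = r"Z:\work\cBugId\cBugId.py"
name_from_basename = file_name_of(example_path)
name_from_split = ntpath.split(example_path)[1]
```

On Windows itself, os.path.basename(__file__) would be the likely replacement for the os.path.filename(__file__) call in the traceback.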