
Comments (28)

akirilov commented on July 4, 2024

Ooooh.... -nargs specifies the number of arguments to the function, not the command line! That certainly explains a lot.

However, I would expect -fuzz_iterations 1 to still work correctly even with the wrong -nargs parameter. And even with the correct -nargs parameter now, I'm still seeing the exact same behavior. How many iterations does the dry run do? Am I just not waiting long enough for it to finish? (Each iteration is taking me a little under half a minute right now.)

from winafl.

ivanfratric commented on July 4, 2024

Besides timeout, hangs like this typically occur when the target fails to report back to afl-fuzz, which can happen if the target_offset is incorrect or the target function does not return cleanly. These issues should be caught by the debug mode though. What does your debug log look like for this particular sample?

akirilov commented on July 4, 2024

Ah, no, my function wasn't returning. I take it the app can't exit without the target function returning?

Also where do I find the debug log?

Also I'm seeing a possibly related issue where module names are incorrectly case-sensitive.

"FOO.DLL" (the way it is in the filesystem) causes the timeout error
"foo.dll" (all lower) gives me a "No instrumentation detected" error
"fookdsjfklsjdklfs.dll" (not a real module) also gives me the timeout

This leads me to think that module names are converted to all lower and made case-sensitive?

UPDATE:

After fixing the module names, picking a function that returns, and adding some coverage modules back in, I get:

WinAFL 1.02 by <[email protected]>
Based on AFL 1.96b by <[email protected]>
[*] Setting up output directories...
[+] Output directory exists but deemed OK to reuse.
[*] Deleting old session data...
[+] Output dir cleanup successful.
[*] Scanning 'inputs'...
[+] No auto-generated dictionary tokens to reuse.
[*] Creating hard links for all input files...
[*] Attempting dry run with 'id_000000'...
0 processes nudged
SUCCESS: The process with PID 11504 has been terminated.
0 processes nudged
SUCCESS: The process with PID 9140 has been terminated.
0 processes nudged
SUCCESS: The process with PID 4432 has been terminated.

This goes on for about as long as I have patience to watch it, and never seems to reach the status window. I'm guessing the dry run is stuck in some sort of loop?

ivanfratric commented on July 4, 2024
  • Your target_function needs to return normally (except in the case of crash or a hang) in order for WinAFL to be able to redirect the execution back to its beginning.
  • Please see the README for info on the debug mode. Once you run in the debug mode, the debug log should be created in the current directory.
  • Module names are indeed case sensitive and should be exactly the same as they appear in the debug log (this may or may not correspond to the filename). This is also mentioned in the README.
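
To make the last point concrete, here is a minimal sketch of pulling the exact module names out of a debug log and checking a candidate name against them. It assumes the log lines look like `Module loaded, <name>`, as in the excerpts in this thread; the sample names (`foo.dll`, `Bar.DLL`, `input.bin`) and the helper functions are hypothetical and not part of WinAFL.

```python
def loaded_modules(log_lines):
    """Return module names in first-seen order, exactly as they are logged."""
    prefix = "Module loaded,"
    names = []
    for line in log_lines:
        line = line.strip()
        if line.startswith(prefix):
            name = line[len(prefix):].strip()
            # Skip redacted/empty entries and duplicates.
            if name and name not in names:
                names.append(name)
    return names


def check_module(name, modules):
    """Classify a candidate module name against the names seen in the log."""
    if name in modules:
        return "exact match"
    folded = [m for m in modules if m.lower() == name.lower()]
    if folded:
        return "case mismatch, use %r" % folded[0]
    return "not loaded"


# Hypothetical log excerpt; a real log would be read from the debug log file.
sample_log = [
    "Module loaded, foo.dll",
    "Module loaded, Bar.DLL",
    "In OpenFileW, reading input.bin",
    "Module loaded, foo.dll",
]

modules = loaded_modules(sample_log)
print(modules)
print(check_module("FOO.DLL", modules))
print(check_module("fookdsjfklsjdklfs.dll", modules))
```

A name that only matches after case folding is reported separately, which is the situation described earlier with "FOO.DLL" versus "foo.dll".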

akirilov commented on July 4, 2024
  • Changed to a function that returns normally
  • Module names changed according to the debug mode list
  • Still have the issue listed above where it never finishes the dry run
  • With the -debug flag, it fails saying it cannot find and kill the child process (I assume this is normal)

Log file contents (filenames removed and long sequences of module loads truncated):

Module loaded,
...
Module loaded,
In OpenFileW,
Module loaded,
...
Module loaded,
In OpenFileW, reading C:\Windows\Microsoft.NET\Framework64\\v2.0.50727\clr.dll
In OpenFileW, reading C:\Windows\Microsoft.NET\Framework64\\v2.0.50727\mscorwks.dll
In OpenFileW, reading C:\Windows\Microsoft.NET\Framework64\\v4.0.30319\clr.dll
In OpenFileW, reading C:\(redacted)\(redacted)\(THE APP).exe.config
Module loaded,
...
Module loaded,
In OpenFileW, reading
Module loaded,
...
Module loaded,
In OpenFileW, reading
Module loaded,
...
Module loaded,
In OpenFileW,
Module loaded,
In OpenFileW,
In OpenFileW,
In OpenFileW,
In OpenFileW,
Module loaded,
In OpenFileW,
Module loaded,
Module loaded,
Module loaded,
In pre_fuzz_handler
In OpenFileW, reading (THE FUZZED FILE)
In OpenFileW, reading (THE FUZZED FILE)
In OpenFileW,
In OpenFileW,
In OpenFileW,
In OpenFileW,
Module loaded,
...
Module loaded,
In OpenFileW,
In post_fuzz_handler
In pre_fuzz_handler
In post_fuzz_handler
In pre_fuzz_handler
In post_fuzz_handler
In pre_fuzz_handler
In post_fuzz_handler
In pre_fuzz_handler
In post_fuzz_handler
In pre_fuzz_handler
In post_fuzz_handler
In pre_fuzz_handler
In post_fuzz_handler
In pre_fuzz_handler
In post_fuzz_handler
In pre_fuzz_handler
In post_fuzz_handler
In pre_fuzz_handler
In post_fuzz_handler

ivanfratric commented on July 4, 2024

Looking at your debug output, it seems that your (THE FUZZED FILE) is being opened outside the target function. This has the effect that WinAFL will not be able to replace the input file for every iteration and will need to kill the target process for every iteration in order to do it. That's why the dry run is taking a long time and you're getting messages about nudging/terminating the child process.
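
One way to check where the file is opened is to group the debug log by iteration and count how often the fuzzed file is read between each `In pre_fuzz_handler` and `In post_fuzz_handler` line. The following is only a sketch under the assumption that the log uses exactly the line formats shown in this thread; the function name and the sample file name `sample.bin` are made up for illustration.

```python
def opens_per_iteration(log_lines, fuzzed_name):
    """Count reads of fuzzed_name inside each pre/post_fuzz_handler pair.

    Returns (counts, outside): counts has one entry per iteration, and
    outside is the number of reads seen outside any iteration.
    """
    counts = []
    outside = 0
    inside = False
    for line in log_lines:
        line = line.strip()
        if line == "In pre_fuzz_handler":
            inside = True
            counts.append(0)
        elif line == "In post_fuzz_handler":
            inside = False
        elif line.startswith("In OpenFileW, reading") and fuzzed_name in line:
            if inside:
                counts[-1] += 1
            else:
                outside += 1
    return counts, outside


# Hypothetical excerpt shaped like the log posted above.
sample_log = [
    "Module loaded, foo.dll",
    "In pre_fuzz_handler",
    "In OpenFileW, reading C:\\work\\sample.bin",
    "In OpenFileW, reading C:\\work\\sample.bin",
    "In post_fuzz_handler",
    "In pre_fuzz_handler",
    "In post_fuzz_handler",
    "In pre_fuzz_handler",
    "In post_fuzz_handler",
]

counts, outside = opens_per_iteration(sample_log, "sample.bin")
print(counts, outside)
```

An iteration with a count of zero means the target function ran without reading the input, and a non-zero `outside` value means the file was opened outside the target function.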

akirilov commented on July 4, 2024

Not sure I understand. It seems to be opening inside the target function, at least the first time:

In pre_fuzz_handler
In OpenFileW, reading (THE FUZZED FILE)
In OpenFileW, reading (THE FUZZED FILE)
In OpenFileW,
In OpenFileW,
In OpenFileW,
In OpenFileW,
Module loaded,
...
Module loaded,
In OpenFileW,
In post_fuzz_handler

mrpeppels commented on July 4, 2024

The behaviour above is consistent with selecting a function that does contain the file access but sits too high up in the execution. By that I mean the function does return, but only after running the appropriate function further down the line. The sequence of pre_fuzz_handler and post_fuzz_handler messages with no file open in between suggests that the target function only opens the file under certain circumstances. Hence I suspect your target function is too "broad".
I'm curious what info Ivan can provide.

akirilov commented on July 4, 2024

I can definitely believe "too broad". I chose a function very high up to ensure that the file open is contained (and I just moved it even higher up with the same result).

I'm not sure if it's possible to scope it down any further. The file open happens very early into the parsing. Any clue why "too broad" is an issue with AFL?

mrpeppels commented on July 4, 2024

This is dependent on the execution flow of the program. I suggest using IDA's graphs to get a visual impression of how it runs. My feeling is that the function you selected only eventually opens the file if given certain parameters or in a certain state of the program; its behaviour under DynamoRIO seems to be very variable in your case.

To find a suitable function, I tend to open the file while running ProcMon, then filter for file access. Then I right-click the desired file access and view the associated stack trace. Again, I'd love to get some feedback from the developer on an optimal strategy for selecting an appropriate function.

akirilov commented on July 4, 2024

Ah, interesting. I was just following up with procmon as well to confirm. I understand the scenario you're talking about but I'd be somewhat surprised if that's the way the execution is working. The broadest function I used is before argument parsing, so as far as I can tell it would have to open the file each time (clearly that's not happening).

I wonder if it's some interaction between the program, DynamoRIO, and/or AFL. How does DynamoRIO interact with Control Flow Guard? It shouldn't matter since control flow is redirected to a valid function, but maybe I'm missing something?

Also, am I correct in understanding that AFL basically jumps back to the target function after it's done running every time for (# OF ITERATIONS) times? In which case, I wonder if some global state might be causing issues.

mrpeppels commented on July 4, 2024

This is where my expertise ends, as far as I can tell :). But one thing I would like to mention is the bit about argument parsing: by parameters I don't mean the options passed to STDIN but rather those passed to the internal functions of the executable. The names of the functions these get passed to are only visible with symbols.
This is what you have to get right when specifying the -nargs option:
https://msdn.microsoft.com/en-us/library/windows/hardware/ff552052(v=vs.85).aspx

akirilov commented on July 4, 2024

Yeah, I got what you meant :)

I specifically picked the function that parses the command line arguments since that seems to be the least likely to maintain some kind of weird global state (and should definitely enclose the file open, since the filename is specified in the command line).

And I do have full symbols and source for my target, though that hasn't helped much with debugging AFL/DynamoRIO.

ivanfratric commented on July 4, 2024

Ah sorry, I didn't see that the file was open inside the target function the first time.

Another thing that could cause this behavior (besides the global state as mentioned previously) is incorrect restoring of the target function parameters (in which case it's possible the target function errors out before it can open the file). Take a look if nargs param is correct.

In general, I can't provide much guidance for selecting the target function apart from what was already said. Unless a target is a simple command line utility that exits after processing the input (like a typical target for linux AFL would be) where just using main or a similar function will do, it will take some reversing to find a good offset.

mrpeppels commented on July 4, 2024

Yes :). I already suspected there was a misunderstanding there. STDIN (standard input) in cmd is the keyboard input so maybe that was a bit unclear.

You indicate that an iteration takes ~0.5 minutes; this means there is still something going wrong.
1 second per iteration is already considered very slow.
Such a long time shows the function you selected -is- returning, but only after doing a ton of work.
Your input files should be as small as possible, optimally <1 kB. The purpose of selecting a target function is to speed up the process, which is very important in fuzzing. So the function you target should be as narrow and self-contained as possible. If done right you will get hundreds of executions per second.
I can imagine a lot is happening in the time your test case takes; this makes the control flow of the program unmanageable because the behaviour is so unpredictable.

mrpeppels commented on July 4, 2024

As for the question of the iterations on the dry run, just once.
afl-fuzz.c:2440 has this comment:

/* Perform dry run of all test cases to confirm that the app is working as
expected. This is done only for the initial inputs, and only once. */

akirilov commented on July 4, 2024

Unfortunately, there isn't much to be done to speed up the run. 30 seconds is longer than average for the app (I assume DynamoRIO instrumentation is slowing it down a lot), but this is a very large program with extremely complicated parsing code. Is there any way to get WinAFL working with long parse times like that?

mrpeppels commented on July 4, 2024

No. The function you selected is so high-up that it seems to present two fundamental problems:

  1. The code is so complex that not every fuzz iteration behaves the same, leading to the function not restoring or false positives in instrumentation.
  2. The chance of finding bugs is greatly decreased at slow speeds, AFL needs a lot of iterations to 'learn'.
    The more complex the code, the more iterations it needs.

You still need days, if not weeks, to get good coverage on a medium-sized media DLL at 100 execs/s. So you need to find a balance between how much code you want to cover and the speed; this will increase your chance of success literally a thousandfold. Trying to get low speeds to work kind of misses the point in my view.

I think we might be able to help you better if you provide some info of your target and the size of your input files. I highly doubt a large part of those 30 seconds of work is useful to the fuzzing process.

Also, in my experience, DynamoRIO only causes about a twofold increase in execution time, which is acceptable for black-box testing. But it does require your target function to be narrow.
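
To put the speed argument in numbers, here is a small back-of-the-envelope sketch. It compares the 100 execs/s figure mentioned above with one execution every 30 seconds (roughly the iteration time reported at the start of this thread). The function name is made up for illustration, and the calculation assumes a constant rate with no restarts or idle time.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def execs_per_day(execs_per_second):
    """Executions completed in one day at a constant rate."""
    return execs_per_second * SECONDS_PER_DAY


fast = execs_per_day(100)      # 100 execs/s, as in the comment above
slow = execs_per_day(1 / 30)   # one execution every 30 seconds

print(fast, round(slow), round(fast / slow))
```

That is about 8.6 million executions per day versus about 2,900, a factor of roughly 3000, which is in line with the "thousandfold" estimate above.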

ivanfratric commented on July 4, 2024

@akirilov Normally, setting fuzz_iterations to something larger than 1 is how you'd get the largest speedup. WinAFL is at its slowest the first time it passes through the code, because at that time the code needs to be translated. Afterwards, it is cached, so subsequent runs (especially those that do not encounter new coverage) should take no more than about 2x the native execution time. I'm not sure whether, with the correct nargs, you can now increase fuzz_iterations or whether there is still a problem.

However if it takes seconds to process the input file natively, I agree with @mrpeppels that it might take a very long time for AFL to get good coverage. In fact it will take a very long time to finish even a single pass over a single file. One thing you can do is to use the -S id option to have WinAFL run in a non-deterministic mode and tamper with the code to reduce the iteration count per sample. This won't make a single iteration run faster but might discover new coverage more quickly (at the expense of thoroughness) for slower targets.
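
As a rough illustration of why a larger fuzz_iterations helps, the sketch below averages one slow first pass (code translation) with cheaper cached passes. This is a simplified model, not how WinAFL measures anything: the function name is made up, and the 30-second and 2-second figures are illustrative assumptions (about the reported iteration time, and 2x an assumed 1-second native run).

```python
def mean_seconds_per_iteration(first, steady, fuzz_iterations):
    """Average cost per iteration when only the first pass pays for translation."""
    if fuzz_iterations < 1:
        raise ValueError("fuzz_iterations must be at least 1")
    total = first + steady * (fuzz_iterations - 1)
    return total / fuzz_iterations


# Illustrative numbers only: 30 s for the first pass, 2 s once the code is cached.
for n in (1, 10, 100):
    print(n, mean_seconds_per_iteration(30.0, 2.0, n))
```

With fuzz_iterations set to 1, every iteration pays the full first-pass cost because the process is restarted each time. In this model the average approaches the cached cost as the iteration count grows.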

akirilov commented on July 4, 2024

I think I was unclear when I was talking about opening times. It takes <1s normally and about 15 seconds under WinAFL. I suspect something else may be going on that's slowing it down.

I did try increasing fuzz_iterations using several different functions up and down the callstack, but even with the correct number of args, the process had to be killed every time. I suspect the program has so much stored global state that restoring the function is incredibly difficult.

I couldn't find this anywhere, but what are the rules for coverage modules? I noticed that if I have too few it says there's no instrumentation. I'm wondering if I have the right amount now or if I have too many (or if some of them are too big). Is there a way to know which ones or how many AFL expects? Or a good rule of thumb?

ivanfratric commented on July 4, 2024

The "no instrumentation" message means that no code in the modules you specified was ever run. You can probably discard those modules as they are not relevant for parsing input files. A good rule is to use only the modules that do input file parsing and no others (every additional module is going to introduce a slowdown and possibly flaky coverage you don't care about).

akirilov commented on July 4, 2024

Ah, I see. I removed a bunch of coverage modules and it seems to be a little faster now. Letting it run to completion goes into the main fuzzing logic so I guess I was just impatient with killing it after 5 minutes. I'll mark this as closed and maybe we can follow up offline for ideas on how to speed it up if you're ok with that.

ivanfratric commented on July 4, 2024

If I were you, I'd concentrate on figuring out why multiple iterations don't work, as this is the best bet to speed it up. You wrote that you have the source code for the target, so perhaps there is a reasonable way to patch it.

akirilov commented on July 4, 2024

Yeah, that's my big goal right now. However I'm concerned that won't even help that much. Basically, the issue I'm running into is this:

  • The code I'm targeting is in one massive binary
  • The file open effectively happens in the main function, so I can't target a function lower down because then we would always be looking at the same file

I'm guessing there isn't an easy way to address this

mrpeppels commented on July 4, 2024

The executable always loads the same filename; the contents of the file are what change between iterations.
The parameters of the function don't change during runs; AFL just needs -nargs to restore the state, if I'm not mistaken.

So choosing a function deeper down would not be a problem in my view.

Also, if you have the source code, it's not a massive blob of binary. You could look in the code at what libraries are used for media parsing and maybe even construct your own harness.

ivanfratric commented on July 4, 2024

If the file open happens in main, then you should use main as the target function, otherwise in general WinAFL might not be able to replace a file that is at the same time open by the target. But the main problem I see is that the file isn't being opened on subsequent iterations of main as seen in the debug log you posted earlier - I'd concentrate on fixing that.

akirilov commented on July 4, 2024

What calling conventions are you using for saving the arguments? I can't find where it's specified in the source. I'm guessing x64 uses DRWRAP_CALLCONV_MICROSOFT_X64?

ivanfratric commented on July 4, 2024

I'm not explicitly specifying a calling convention and am assuming drwrap selects the correct default for the platform (see DRWRAP_CALLCONV_DEFAULT in https://github.com/DynamoRIO/dynamorio/blob/a71bf92fb995dada7bfb93046655b6f7c94a22e0/ext/drwrap/drwrap.h#L381)
