aflplusplus / libafl-legacy

AFL++ as a library: gives you all the tools necessary to craft the best fuzzer for your targets with ease!

License: Apache License 2.0

Python 0.87% Makefile 1.11% C 96.71% C++ 1.31%

libafl-legacy's Issues

Order of implementation of features

To minimize the work needed to push changes from afl++ into libafl, I think we should define the order in which the library features are to be implemented.

So using mostly the names from: https://github.com/AFLplusplus/Website/blob/master/content/aflpp_fuzzing_framework_proposal.md

This starts with queue and then mutator, as these are the least likely to change. The further down the list, the more likely there will be changes in either afl++ or in how we envision which features are useful along the way.

queue : Structure & Handling (Struct; Get, Add, Remove)
queue : Generic seed scheduler (-p stuff)
queue : Generic seed scheduler : custom
queue : Trim (for new entries)
queue : Trim : custom
Mutator : structure
Mutator : deterministic
Mutator : havoc
Mutator : mopt
Mutator : custom
Executor (Forkserver, Fauxserver, Network Connector): Structure
Executor : Input Channel (a way to send a new testcase to the target; multiple can be stacked) > andrea: why multiple? marc: maybe so this can be threaded? One controller creates the inputs and puts them in a queue, and the executor passes them on one by one. But IMHO this would not be good for performance: too much complexity, too much code.
Executor: Observation Channel (shared mem, data from a tcp connection)
Feedback : Structure
Feedback : Converter (Reduce input channel information to something useful)
Feedback : Analysis crash or new path, etc.

Not sure what this is:
Virtual Input (hold input buffer and associated metadata (e.g. structure))
Feedback
Feedback specific queue
Feedback specific seed scheduler
Feedback specific seed energy

feel free to change or discuss

How to multicore by dom

My concept:

The main thread parses commandline flags, loads all test cases from disk, spawns all threads ("engines"). All engines have unique seeds (like main seed + engine id).
The main thread makes sure there are more queue elements than threads, worst case it clones them.
Afterwards, the main thread just sleeps, shows some stats, sleeps, writes to sync dir/ disk, tears down the workers on ctrl+c
The queue(s) (linked lists for now) live in a shared map; each queue element includes an in_use flag.
When an engine looks for the next queue element to fuzz, it walks the queue from its internal ptr (or starts from the beginning) and does a CAS on in_use. If the CAS fails, it advances the ptr to the next element and repeats.
After fuzzing, in_use is reset.
If a new queue entry needs to be added, a global lock for this queue is acquired (spinlock), then the new element is added.
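
The claim/release step of this concept could look roughly like the following C11 sketch. The names queue_entry_t, claim_next and release are made up for illustration and are not LibAFL API. It assumes all engines are threads in one process, so plain pointers are valid in every engine, and it leaves out the spinlock-protected insert.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical queue entry; names are illustrative, not LibAFL's. */
typedef struct queue_entry {
  atomic_bool         in_use; /* true while an engine fuzzes this entry */
  struct queue_entry *next;   /* NULL at the end of the list */
} queue_entry_t;

/* Walk the list from *cursor (or from head if *cursor is NULL) and claim the
   first free entry with a CAS on in_use. Returns the claimed entry, or NULL
   if every entry was in use during one full pass. */
static queue_entry_t *claim_next(queue_entry_t *head, queue_entry_t **cursor) {
  assert(head != NULL && cursor != NULL);
  queue_entry_t *start = *cursor ? *cursor : head;
  queue_entry_t *cur = start;
  do {
    bool expected = false;
    if (atomic_compare_exchange_strong(&cur->in_use, &expected, true)) {
      /* Remember where to continue next time, wrapping at the end. */
      *cursor = cur->next ? cur->next : head;
      return cur;
    }
    cur = cur->next ? cur->next : head;
  } while (cur != start);
  return NULL;
}

/* After fuzzing, in_use is reset so other engines can pick the entry. */
static void release(queue_entry_t *entry) {
  atomic_store(&entry->in_use, false);
}
```

Each engine keeps its own cursor, so the only shared writes on the hot path are the CAS and the final store on in_use.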

Downsides:

All threads work on the same shared map
inserts can be slow (however that will only be a problem in the first minutes)

Cleanup Makefiles

The Makefiles right now are far from ideal...
It would be great to have them cleaned up a little.

Libfuzzer Compatible Wrapper for In-Mem Fuzzer

We should add an in-mem fuzzer example that is compatible with LibFuzzer testcases out of the box.
That makes LibFuzzer more versatile and will be needed for #17

Add Dictionary/Extras Support

The main feature lacking in LibAFL right now is Extras support.
This includes the dictionary extras you would pass to AFL using -x, but also autoextras added during fuzzing, as well as the compile-time autodict feature in AFL LTO builds, and eventually even cmplog.
The important pieces of code are in https://github.com/AFLplusplus/AFLplusplus/blob/stable/src/afl-fuzz-extras.c

Wrong usage of maybe_grow

This uses maybe_grow in a wrong way:
https://github.com/AFLplusplus/LibAFL/blob/7e7dcb21d8438d237b2a711b0a521b225c18c868/src/libcommon.c#L131
The size of the buf changes exponentially, but not the length.
Instead of providing &length to maybe_grow, an additional pointer with the actual size (ignoring the actual content) needs to be used.
Not sure how to describe it any better, maybe read the impl of maybe_grow :)
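
To illustrate the point, here is a minimal sketch that keeps the content length and the allocated size in two separate variables. grow_buf is a hypothetical stand-in with a similar shape, not the real maybe_grow, and byte_buf_t and buf_append are invented for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for maybe_grow: make sure *buf can hold `needed`
   bytes. *capacity is the allocated size and the only value updated here.
   Returns the (possibly moved) buffer, or NULL on failure, in which case the
   old buffer stays valid. */
static void *grow_buf(void **buf, size_t *capacity, size_t needed) {
  assert(buf != NULL && capacity != NULL);
  if (needed <= *capacity) return *buf;
  size_t new_cap = *capacity ? *capacity : 64;
  while (new_cap < needed) {
    if (new_cap > SIZE_MAX / 2) { new_cap = needed; break; } /* no overflow */
    new_cap *= 2; /* exponential growth of the allocation */
  }
  void *p = realloc(*buf, new_cap);
  if (!p) return NULL;
  *buf = p;
  *capacity = new_cap;
  return p;
}

/* A byte buffer that keeps content length and capacity apart. */
typedef struct {
  void  *data;
  size_t len; /* bytes of actual content */
  size_t cap; /* bytes allocated, only touched by grow_buf */
} byte_buf_t;

static int buf_append(byte_buf_t *b, const void *src, size_t n) {
  if (n == 0) return 0;
  if (!grow_buf(&b->data, &b->cap, b->len + n)) return -1;
  memcpy((char *)b->data + b->len, src, n);
  b->len += n; /* the length grows linearly, independent of cap */
  return 0;
}
```

Passing &b->len where &b->cap belongs would make the grow function believe the allocation is as small as the content, which is the kind of mix-up described above.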

AFL-Style Testcase Support

Right now, in the example, each thread outputs lines to its own little file.
Instead, the broker or a background thread could write new queue entries to disk whenever a new queue message flies by.
On top, we should add the option to resume a longer-running fuzzing campaign, and maybe even sync with AFL workers. This could also be done in the background thread.

Autogenerate Docs & more Documentation

Right now all doc is inside the code.
We should set up a sphinx instance or some other nice way to make the API browseable.
To get started with sphinx, there is this project (not super actively maintained) which may just be enough or this blog post that seems to be a bit more work, going through doxygen first.
I didn't find a way to get the sphinx thing going directly, without additional tools, but maybe there's an easy way.

Once we have decided on a documentation format, we can start writing more docs.

In-Mem Crash Recovery

Right now, a crashing input in the In-Memory Fuzzer will be lost.
This is rather unfortunate. As we use shared maps/llmp for fuzzing, the input is still around after a crash, and a new process, or even the broker, should be able to recover it.

Crash analysis - more info about crash in/from LibAFL

Wrote a harness to fuzz a lib using LibAFL.

I get a crash within LibAFL, but cannot reproduce it outside LibAFL (almost identical harness ... outside LibAFL I added just main() function)

Would be great if LibAFL could save the stacktrace, registers etc. alongside the crash file during the observed crash, to help debug it.

Could be that it is an issue in my harness or LibAFL.

Having this info would help to debug issues with harnesses, also for other projects and other users.

Thanks,

P.S Cool project

Bind to CPU

The current LibAFL does not bind to CPU cores, however, this should be part of the lib:
https://github.com/AFLplusplus/AFLplusplus/blob/7f621509eee57f0b6fd9ad542adc4f2acafeb059/src/afl-fuzz-init.c#L109
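
As a rough idea of the Linux side, the sketch below pins the calling thread to the n-th CPU it is allowed to run on, using sched_getaffinity and sched_setaffinity from glibc. The function name bind_to_nth_allowed_cpu is invented here, and unlike the linked AFL++ code it makes no attempt to find a core that is currently idle.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for sched_setaffinity() and the CPU_* macros on glibc */
#endif
#include <assert.h>
#include <sched.h>

/* Pin the calling thread to the n-th CPU (0-based) of its allowed set.
   Returns the chosen CPU index, or -1 on error or if fewer than n + 1 CPUs
   are available. Linux-only; illustrative, not part of LibAFL. */
static int bind_to_nth_allowed_cpu(int n) {
  assert(n >= 0);
  cpu_set_t allowed;
  if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0) return -1;
  for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
    if (!CPU_ISSET(cpu, &allowed)) continue;
    if (n-- > 0) continue; /* skip the first n allowed CPUs */
    cpu_set_t target;
    CPU_ZERO(&target);
    CPU_SET(cpu, &target);
    if (sched_setaffinity(0, sizeof(target), &target) != 0) return -1;
    return cpu;
  }
  return -1;
}
```

An engine could call this with its own engine id before it starts fuzzing, so that each engine ends up on a different core.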

How to multicore by marc

Goals:

no locks, because they are a bottleneck.
no or very little heap, because a target running havoc could trash it
even if the threads all die, the data is still there

There is one primary process which is the started process.
It forks off a single secondary, which is responsible for spawning the fuzzing children.
It also creates a central shmem where all fuzzers register (a unique id, shmem IDs of queue, dictionary, ...)
It also occasionally (like every 30 minutes) persists the data to disk. Besides that and recovering if all threads die, it does (so far) nothing.

The secondary spawns off as many fuzzing threads as required and is itself also a fuzzer.

Every fuzzing thread first registers itself in the register. This needs a lock, once.
If a thread crashes, it will no longer update a timing field in its main map. The primary can do an alive check occasionally and tell the secondary to respawn it, or fork off another secondary which does that.

Every fuzzer writes into its own maps (e.g. info map, queue map, dict map, service request map, ...) and occasionally visits all other fuzzers (shmem IDs from the register) for stuff it is interested in. There is only read access to other fuzzers' maps, no writing.
e.g. selecting a queue entry could be rand_below(number_of_fuzzers) -> rand_below(selected_fuzzer->queue->no_of_entries).
(needs more glue, as the selection should be based on a calibration and weighting the entries)
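
The two-step selection could be sketched like this. The struct layout, pick_entry and the small xorshift PRNG are assumptions made for the example; the calibration-based weighting mentioned above is left out, and the modulo introduces a slight bias.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical read-only views of what each fuzzer publishes. */
typedef struct { size_t no_of_entries; } fuzzer_queue_t;
typedef struct { const fuzzer_queue_t *queue; } fuzzer_info_t;

typedef struct { size_t fuzzer_idx; size_t entry_idx; } pick_t;

/* Tiny xorshift64 PRNG; *state must be non-zero. */
static uint64_t rand_next(uint64_t *state) {
  uint64_t x = *state;
  x ^= x << 13;
  x ^= x >> 7;
  x ^= x << 17;
  return *state = x;
}

static size_t rand_below(uint64_t *state, size_t limit) {
  assert(limit > 0);
  return (size_t)(rand_next(state) % limit);
}

/* rand_below(number_of_fuzzers) -> rand_below(queue->no_of_entries).
   Returns 0 and fills *out, or -1 if the chosen fuzzer has an empty queue. */
static int pick_entry(const fuzzer_info_t *fuzzers, size_t number_of_fuzzers,
                      uint64_t *rng, pick_t *out) {
  assert(fuzzers != NULL && number_of_fuzzers > 0 && out != NULL);
  size_t f = rand_below(rng, number_of_fuzzers);
  size_t entries = fuzzers[f].queue->no_of_entries;
  if (entries == 0) return -1;
  out->fuzzer_idx = f;
  out->entry_idx = rand_below(rng, entries);
  return 0;
}
```

Only reads touch the other fuzzers' data, so no lock is needed for the selection itself.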

and that is already it. except for the central register no locks and no writing into foreign memory.
no performance problems.

and with the knowledge of the shmemID of the register you can even start different fuzzers that join in.

Another idea I had: a fuzzer can perform services for other fuzzers.
E.g. if fuzzer A needs a path to be solved where it has no finds, it writes into its service request map: please handle queue entry X for cmplog and laf.
A fuzzer B that has cmplog occasionally visits the service request maps of others. If it sees something it can perform and has nothing better to do, it performs that and writes into its own service request map that it was completed.

Get rid of calloc during fuzzing

Some mutators at the moment may resort to allocations, for example:
https://github.com/AFLplusplus/LibAFL/blob/a82d437ea4e2cb458fd2b625aba509b36ac0f209/src/common.c#L50
Instead, we should use afl_realloc or similar, as afl++ does.

Oracle class

A fuzzer can look for something different than crashes. Think about timeouts for instance.

The oracle class looks at observation channels, similarly to feedback, but it decides if the testcase is crashing, not if it is interesting.
For regular fuzzing, you should define the exit code of a process as an observation channel. To make the AFL oracle, you need two observation channels that are the exit code and the coverage map.

The dedup (you can also choose to make dedup part of the oracle, not a separate entity) takes a crashing input, looks at the observation channels, and says whether a crash is worth saving (not a duplicate). In AFL this just uses the feedback to see if the crash triggers new coverage.

For timeout, the oracle will use the coverage and the timing observation channels. I hope it is clear.
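
A minimal sketch of the oracle/dedup split, with the exit code, the timing and the coverage map as observation channels. All type and function names are hypothetical, and treating any non-zero exit status as a crash is a simplification made only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { VERDICT_OK, VERDICT_CRASH, VERDICT_TIMEOUT } verdict_t;

/* Hypothetical bundle of observation channels for one execution. */
typedef struct {
  int            exit_status;  /* simplified: non-zero means abnormal exit */
  uint64_t       exec_us;      /* timing observation channel */
  const uint8_t *coverage;     /* coverage map observation channel */
  size_t         coverage_len;
} observation_t;

/* Oracle: decides whether the testcase is a crash or timeout, not whether
   it is interesting. */
static verdict_t oracle_judge(const observation_t *obs, uint64_t timeout_us) {
  assert(obs != NULL);
  if (obs->exec_us > timeout_us) return VERDICT_TIMEOUT;
  if (obs->exit_status != 0) return VERDICT_CRASH;
  return VERDICT_OK;
}

/* Dedup: a crash is worth saving if it sets coverage bits that no earlier
   crash has set. `seen` must hold coverage_len bytes and is updated. */
static bool dedup_is_new(const observation_t *obs, uint8_t *seen) {
  bool is_new = false;
  for (size_t i = 0; i < obs->coverage_len; i++) {
    uint8_t fresh = (uint8_t)(obs->coverage[i] & ~seen[i]);
    if (fresh) {
      is_new = true;
      seen[i] |= fresh;
    }
  }
  return is_new;
}
```

A fuzzer loop would call oracle_judge first and only ask dedup_is_new for inputs that were judged to be crashes or timeouts.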

C++ Bindings

We need a nice way to interface with LibAFL from C++.

When we should go public

Rishi did good work, but ofc libafl is a greater thing than GSoC and it will not be ready by the end of August.

Should we make it public to highlight the contribution of Rishi? IIRC this is not needed for GSoC, and if he will do a blogpost this should be fine.

We can publish it with a WIP disclaimer while slowly migrating pieces of AFL++ here. My main concern is that other people can "steal" good ideas from here before we finish, because maintaining compatibility with AFL will take more time than rewriting a fuzzer from scratch.

What should we do?

Compile problem under Debian Buster 32-Bit

When compiling examples/libpng-1.6.37/contrib/libtests/pngvalid.c, I get
clang -DHAVE_CONFIG_H -I. -g -fPIC -Iinclude -Wall -Wextra -Werror -Wshadow -fstack-protector-strong -D_FORTIFY_SOURCE=2 -O3 -MT contrib/libtests/pngvalid.o -MD -MP -MF $depbase.Tpo -c -o contrib/libtests/pngvalid.o contrib/libtests/pngvalid.c &&
mv -f $depbase.Tpo $depbase.Po
In file included from contrib/libtests/pngvalid.c:26:
In file included from /usr/include/signal.h:25:
/usr/include/features.h:185:3: error: "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE" [-Werror,-W#warnings]
# warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE"
^
1 error generated.
make[3]: *** [Makefile:1196: contrib/libtests/pngvalid.o] Error 1

pngvalid.c line 24 has
#define _BSD_SOURCE 1 /* For the floating point exception extension */

The problem is in the Makefile which exports CFLAGS instead of setting CFLAGS as make argument.

Rust Bindings

We need a nice way to use LibAFL from Rust.

Get some CI going

Right now, we don't have any CI for LibAFL.
However, we do have unittests and examples that can pose as test cases. These should be added to Travis, similar to AFL++.

AFL++ Custom Mutator Support

Right now LibAFL only supports its own, internal, mutators at build time. It may be beneficial to add a wrapper around the current mutators that can interface with existing Custom Mutators (at least those in C).

File naming convention

I just noticed it, but why use libX as the naming convention? It seems ugly.

Status update

@rish9101 as I am not one of your GSoC mentors, but your work is closely related to my specification for a fuzzing framework, can you update me on the status of your project?

I guess, at this point, you reimplemented my FFF PoC in C and added things from AFL++. Can I build a fuzzer with the current API?

Possible PoC Usecases

We can collect some use cases here that would be nice to make possible using libAFL.

I'll start with a "proper" untracer support
https://twitter.com/buherator/status/1269396963998543872?s=19
Taking away breakpoints as they are hit also means that subsequent runs are not reproducible.
Subsequent runs with the same input should be considered successful on "no new coverage". Instead of "same hash".

Add to Fuzzbench

This is a longer-reaching issue:
We should eventually add LibAFL to Fuzzbench, potentially even the in-mem fuzzer (crushing all other fuzzers :) ).

Directory format

We have entities (e.g. executor), and for each entity we will provide some implementations in libafl (e.g. inmemoryexecutor and forkserverexecutor). These implementations are part of the library; they should not be in the example folder.

I propose to create a subfolder in include and src for each entity and place into it a main file with the abstract class, plus the other implementations.

For instance.

include/executor/executor.h contains only the abstract class
include/executor/in_memory_executor.h contains the implementation

same for src/

Btw just look at FFF

Build system

If libafl has to be multi platform, we cannot really use just GNU makefiles.

I propose meson, as it was recently adopted by QEMU and seems a sane build system, more than cmake for sure.

We can maintain both meson files and raw makefiles for linux/bsd if needed.

How to multicore by andrea

We have to converge on how to implement multicore fuzzing in libAFL.
