aflplusplus / libafl-legacy
AFL++ as a library: gives you all the tools necessary to craft the best fuzzer for your targets with ease!
License: Apache License 2.0
So that as little work as possible is needed to push changes from afl++ into libafl, I think we should define the order in which the library features are to be implemented.
Using mostly the names from: https://github.com/AFLplusplus/Website/blob/master/content/aflpp_fuzzing_framework_proposal.md
This starts with queue and then mutator, as these are the least likely to change. The further down the list, the more likely there will be changes, either in afl++ or in how we envision which features are useful along the way.
queue : Structure & Handling (Struct; Get, Add, Remove)
queue : Generic seed scheduler (-p stuff)
queue : Generic seed scheduler : custom
queue : Trim (for new entries)
queue : Trim : custom
Mutator : structure
Mutator : deterministic
Mutator : havoc
Mutator : mopt
Mutator : custom
Executor (Forkserver, Fauxserver, Network Connector): Structure
Executor : Input Channel (a way to send a new testcase to the target; multiple can be stacked) > andrea: why multiple? marc: maybe so this can be threaded? One controller creates the inputs and puts them in a queue, and the executor passes them on one by one. But IMHO this would not be good for performance: too much complexity, too much code.
Executor: Observation Channel (shared mem, data from a tcp connection)
Feedback : Structure
Feedback : Converter (Reduce input channel information to something useful)
Feedback : Analysis crash or new path, etc.
Not sure what this is:
Virtual Input (hold input buffer and associated metadata (e.g. structure))
Feedback
Feedback specific queue
Feedback specific seed scheduler
Feedback specific seed energy
feel free to change or discuss
My concept:
main seed + engine id
).ctrl+c
in_use
flagin_use
, if it doesn't work, it increases it and walks to the next element, repeat.in_use
is reset.Downsides:
The Makefiles right now are far from ideal...
It would be great to have them cleaned up a little.
We should add an in-mem fuzzer example that is compatible with LibFuzzer testcases out of the box.
That makes LibFuzzer more versatile and will be needed for #17
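A minimal sketch of what "compatible out of the box" could mean: the target only exports the standard libFuzzer entry point `LLVMFuzzerTestOneInput`, and the in-memory driver calls it in-process. The driver `run_inputs` and the toy target body are assumptions for illustration, not LibAFL code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Counts how often the toy target saw the "hi" prefix (illustration only). */
static size_t g_hits;

/* Standard libFuzzer entry point. A real target would parse `data` here. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  if (size >= 2 && data[0] == 'h' && data[1] == 'i') g_hits++;
  return 0;
}

/* Hypothetical in-memory driver: runs every input in the same process.
   Returns the number of executions performed. */
static size_t run_inputs(const uint8_t *const *inputs, const size_t *sizes,
                         size_t count) {
  assert(inputs && sizes);
  size_t executed = 0;
  for (size_t i = 0; i < count; i++) {
    LLVMFuzzerTestOneInput(inputs[i], sizes[i]);
    executed++;
  }
  return executed;
}
```

A real driver would take its inputs from the queue and mutators instead of a fixed array, and would install crash handlers around the call.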
The main feature lacking in LibAFL right now is Extras support.
This includes the dictionary extras you would pass to AFL using -x, but also autoextras added during fuzzing, as well as the compile-time autodict feature in AFL LTO builds, and eventually even cmplog.
The important pieces of code are in https://github.com/AFLplusplus/AFLplusplus/blob/stable/src/afl-fuzz-extras.c
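As a rough illustration of the simplest piece of this, the sketch below overwrites part of an input with a dictionary token, which is the kind of mutation that `-x` tokens feed. `extra_t` and `overwrite_with_extra` are hypothetical names, not taken from afl-fuzz-extras.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical dictionary entry: one token, e.g. loaded from a -x file. */
typedef struct {
  const uint8_t *data;
  size_t len;
} extra_t;

/* Overwrite buf[pos .. pos + extra->len) with the token.
   Returns 0 on success, -1 if the token does not fit at that position. */
static int overwrite_with_extra(uint8_t *buf, size_t buf_len,
                                const extra_t *extra, size_t pos) {
  assert(buf && extra && extra->data);
  if (extra->len > buf_len || pos > buf_len - extra->len) return -1;
  memcpy(buf + pos, extra->data, extra->len);
  return 0;
}
```

Autoextras and the LTO autodict would feed the same kind of token list; only where the tokens come from differs.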
This uses maybe_grow in a wrong way:
https://github.com/AFLplusplus/LibAFL/blob/7e7dcb21d8438d237b2a711b0a521b225c18c868/src/libcommon.c#L131
The size of the buf changes exponentially, but not the length.
Instead of providing &length to maybe_grow, an additional pointer with the actual size (ignoring the actual content) needs to be used.
Not sure how to describe it any better, maybe read the impl of maybe_grow :)
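Maybe a sketch helps. The idea is that the allocated size and the content length are tracked in two separate variables, and only the allocated size is handed to the grow helper. `grow_to`, `byte_buf_t` and `buf_append` are made-up names that only mimic the pattern; this is not the actual maybe_grow signature.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer: `cap` is the allocated size, `len` the bytes in use. */
typedef struct {
  void *data;
  size_t cap;
  size_t len;
} byte_buf_t;

/* Stand-in for a maybe_grow-style helper: makes sure *cap >= needed,
   growing the allocation exponentially. Returns NULL on failure. */
static void *grow_to(void **buf, size_t *cap, size_t needed) {
  assert(buf && cap);
  if (*buf && *cap >= needed) return *buf;
  size_t new_cap = *cap ? *cap : 64;
  while (new_cap < needed) new_cap *= 2;
  void *p = realloc(*buf, new_cap);
  if (!p) return NULL;
  *buf = p;
  *cap = new_cap;
  return p;
}

/* Append n bytes. Note that &b->cap is passed, never &b->len:
   len must keep describing the content only. */
static int buf_append(byte_buf_t *b, const uint8_t *src, size_t n) {
  if (!grow_to(&b->data, &b->cap, b->len + n)) return -1;
  memcpy((uint8_t *)b->data + b->len, src, n);
  b->len += n;
  return 0;
}
```

If `&b->len` were passed instead, the helper would overwrite the length with the exponentially grown size, which is the bug described above.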
Right now, in the example, each thread outputs lines to its own little file.
Instead, the broker or a background thread could write new queue entries to disk whenever a new queue message flies by.
On top, we should add the option to resume a longer-running fuzzing campaign, and maybe even sync with AFL workers. This could also be done in the background thread.
Right now all doc is inside the code.
We should set up a sphinx instance or some other nice way to make the API browseable.
To get started with sphinx, there is this project (not super actively maintained) which may just be enough or this blog post that seems to be a bit more work, going through doxygen first.
I didn't find a way to get the sphinx thing going directly, without additional tools, but maybe there's an easy way.
Once we have decided on a documentation format, we can start writing more docs.
Right now, a crashing input in the In-Memory Fuzzer will be lost.
This is rather unfortunate. As we use shared maps/llmp for fuzzing, the input is still around after a crash, and a new process, or even the broker, should be able to recover it.
Wrote a harness to fuzz a lib using LibAFL.
I get a crash within LibAFL, but cannot reproduce it outside LibAFL (almost identical harness ... outside LibAFL I added just main() function)
Would be great if LibAFL could save the stack trace, registers, etc. alongside the crash file during the observed crash to debug it.
Could be that it is an issue in my harness or LibAFL.
Having this info would help to debug issues with harnesses, also for other projects and other users.
Thanks,
P.S Cool project
The current LibAFL does not bind to CPU cores, however, this should be part of the lib:
https://github.com/AFLplusplus/AFLplusplus/blob/7f621509eee57f0b6fd9ad542adc4f2acafeb059/src/afl-fuzz-init.c#L109
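For reference, a minimal Linux-only sketch of the binding step itself, using `sched_getaffinity` and `sched_setaffinity`. Unlike the linked AFL++ code, it does not look for a *free* core; `first_allowed_core` and `bind_to_core` are hypothetical helper names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Return the lowest-numbered core this process may run on, or -1 on error. */
static int first_allowed_core(void) {
  cpu_set_t set;
  CPU_ZERO(&set);
  if (sched_getaffinity(0, sizeof(set), &set) != 0) return -1;
  for (int i = 0; i < CPU_SETSIZE; i++) {
    if (CPU_ISSET(i, &set)) return i;
  }
  return -1;
}

/* Pin the calling process to a single core.
   Returns 0 on success, -1 on error (errno is set by the kernel call). */
static int bind_to_core(int core_id) {
  assert(core_id >= 0 && core_id < CPU_SETSIZE);
  cpu_set_t set;
  CPU_ZERO(&set);
  CPU_SET(core_id, &set);
  return sched_setaffinity(0, sizeof(set), &set);
}
```

Other platforms need their own calls, so in the library this would have to sit behind a small per-OS abstraction.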
Goals:
There is one primary process which is the started process.
It forks off a single secondary, which is responsible for spawning the fuzzing children.
It also creates a central shmem where all fuzzers register (a unique id, shmem IDs of queue, dictionary, ...)
It also occasionally (like every 30 minutes) persists the data to disk and, besides recovering if all threads die, does (so far) nothing else.
The secondary spawns off as many fuzzing threads as required and is itself also a fuzzer.
Every fuzzing thread first registers itself in the register. This needs a lock, once.
If a thread crashes, it will no longer update a timing field in its main map. The primary can do an alive check occasionally and tell the secondary to respawn it, or fork off another secondary which does that.
Every fuzzer writes into its own maps (e.g. info map, queue map, dict map, service request map, ...) and visits all other fuzzers occasionally (shmem IDs from the register) for stuff it is interested in. Only read access to other fuzzers' maps, no writing.
e.g. selecting a queue entry could be rand_below(number_of_fuzzers) -> rand_below(selected_fuzzer->queue->no_of_entries).
(needs more glue, as the selection should be based on a calibration and weighting the entries)
and that is already it. except for the central register no locks and no writing into foreign memory.
no performance problems.
and with the knowledge of the shmemID of the register you can even start different fuzzers that join in.
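A rough sketch of the two-level selection mentioned above, with plain structs standing in for the shared maps. All type and field names except `rand_below`, `number_of_fuzzers` and `no_of_entries` are invented here, and the calibration and weighting glue is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_FUZZERS 64

/* Stand-in for a fuzzer's queue map (would live in its own shmem). */
typedef struct {
  uint32_t no_of_entries;
} queue_map_t;

/* One slot in the central register: who the fuzzer is, where its maps are. */
typedef struct {
  uint32_t id;
  queue_map_t *queue;
} fuzzer_slot_t;

typedef struct {
  uint32_t number_of_fuzzers;
  fuzzer_slot_t slots[MAX_FUZZERS];
} fuzzer_register_t;

typedef struct {
  uint32_t fuzzer_idx;
  uint32_t entry_idx;
} selection_t;

/* Uniform-ish pick in [0, limit); rand() is only used to keep this short. */
static uint32_t rand_below(uint32_t limit) {
  assert(limit > 0);
  return (uint32_t)rand() % limit;
}

/* Pick a fuzzer, then an entry in its queue. Only reads foreign maps.
   Returns 0 on success, -1 if there is nothing to select. */
static int select_entry(const fuzzer_register_t *reg, selection_t *out) {
  assert(reg && out);
  if (reg->number_of_fuzzers == 0) return -1;
  uint32_t f = rand_below(reg->number_of_fuzzers);
  const queue_map_t *q = reg->slots[f].queue;
  if (!q || q->no_of_entries == 0) return -1;
  out->fuzzer_idx = f;
  out->entry_idx = rand_below(q->no_of_entries);
  return 0;
}
```

Because the selecting fuzzer never writes into another fuzzer's map, no lock is needed on this path.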
other idea I had: a fuzzer can perform services for other fuzzers.
e.g. if fuzzer A needs a path to be solved where it has no finds, it writes into its service request map to please handle queue entry X for cmplog and laf.
A fuzzer B that has cmplog occasionally visits the service request maps of others, and if it sees something it can perform and has nothing better to do, it performs that and writes into its own service request map that it was completed.
Some mutators at the moment may resort to allocations, for example:
https://github.com/AFLplusplus/LibAFL/blob/a82d437ea4e2cb458fd2b625aba509b36ac0f209/src/common.c#L50
Instead, we should use afl_realloc or similar, as afl++ does.
A fuzzer can look for something different than crashes. Think about timeouts for instance.
The oracle class looks at observation channels, similarly to feedback, but it decides if the testcase is crashing, not if it is interesting.
For regular fuzzing, you should define the exit code of a process as an observation channel. To make the AFL oracle, you need two observation channels that are the exit code and the coverage map.
The dedup (you can also choose to make dedup part of the oracle, not a separate entity) takes a crashing input, looks at the observation channels and says if a crash is worth saving (not a duplicate). In AFL this just uses the feedback to see if the crash triggers new coverage.
For timeout, the oracle will use the coverage and the timing observation channels. I hope it is clear.
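To make it a bit more concrete, here is a small sketch of an oracle and an AFL-style dedup reading from observation channels. The struct layout and all names (`observations_t`, `oracle_check`, `dedup_is_new`) are assumptions for illustration, not the proposed API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bundle of observation channels for one execution. */
typedef struct {
  int killed_by_signal; /* exit-status channel: nonzero if target crashed */
  uint64_t exec_ms;     /* timing channel */
  const uint8_t *cov;   /* coverage channel: one byte per map slot */
  size_t cov_len;
} observations_t;

typedef enum { VERDICT_OK, VERDICT_CRASH, VERDICT_TIMEOUT } verdict_t;

/* Oracle: decides whether the testcase hit the objective (crash/timeout),
   not whether it is interesting. */
static verdict_t oracle_check(const observations_t *obs, uint64_t timeout_ms) {
  assert(obs);
  if (obs->exec_ms > timeout_ms) return VERDICT_TIMEOUT;
  if (obs->killed_by_signal) return VERDICT_CRASH;
  return VERDICT_OK;
}

/* Dedup: the finding is worth saving only if it covers a map slot that no
   earlier finding covered. `seen` has cov_len bytes and is updated. */
static int dedup_is_new(const observations_t *obs, uint8_t *seen) {
  assert(obs && obs->cov && seen);
  int is_new = 0;
  for (size_t i = 0; i < obs->cov_len; i++) {
    if (obs->cov[i] && !seen[i]) {
      seen[i] = 1;
      is_new = 1;
    }
  }
  return is_new;
}
```

Crashes and timeouts would each keep their own `seen` map, so a timeout does not hide a crash on the same path.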
We need a nice way to interface with LibAFL from C++.
Rishi did good work, but ofc libafl is a greater thing than GSoC and it will not be ready for the end of August.
Should we make it public to highlight the contribution of Rishi? IIRC this is not needed for GSoC, and if he will do a blogpost this should be fine.
We can publish it with a WIP disclaimer while slowly migrating pieces of AFL++ here. My main concern is that other people can "steal" good ideas from here before we finish, because maintaining compatibility with AFL will take more time than rewriting a fuzzer from scratch.
What should we do?
When compiling examples/libpng-1.6.37/contrib/libtests/pngvalid.c, I get
clang -DHAVE_CONFIG_H -I. -g -fPIC -Iinclude -Wall -Wextra -Werror -Wshadow -fstack-protector-strong -D_FORTIFY_SOURCE=2 -O3 -MT contrib/libtests/pngvalid.o -MD -MP -MF $depbase.Tpo -c -o contrib/libtests/pngvalid.o contrib/libtests/pngvalid.c &&
mv -f $depbase.Tpo $depbase.Po
In file included from contrib/libtests/pngvalid.c:26:
In file included from /usr/include/signal.h:25:
/usr/include/features.h:185:3: error: "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE" [-Werror,-W#warnings]
# warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE"
^
1 error generated.
make[3]: *** [Makefile:1196: contrib/libtests/pngvalid.o] Error 1
pngvalid.c line 24 has
#define _BSD_SOURCE 1 /* For the floating point exception extension */
The problem is in the Makefile which exports CFLAGS instead of setting CFLAGS as make argument.
We need a nice way to use LibAFL from Rust.
Right now, we don't have any CI for LibAFL.
However, we do have unittests and examples that can pose as test cases. These should be added to Travis, similar to AFL++.
Right now LibAFL only supports its own, internal, mutators at build time. It may be beneficial to add a wrapper around the current mutators that can interface with existing Custom Mutators (at least those in C).
I just noticed it, but why using libX as naming convention? seems ugly
@rish9101 as I am not one of your GSoC mentors, but your work is closely related to my specification for a fuzzing framework, can you update me on the status of your project?
I guess, at this point, you reimplemented my FFF poc in C and added things from AFL++. Can I build a fuzzer with the current API?
We can collect some use cases here that would be nice to make possible using libAFL.
I'll start with a "proper" untracer support
https://twitter.com/buherator/status/1269396963998543872?s=19
Taking away breakpoints as they are hit also means that subsequent runs are not reproducible.
Subsequent runs with the same input should be considered successful on "no new coverage". Instead of "same hash".
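A sketch of such a check, assuming coverage is recorded as one byte per breakpoint (`replay_hits` for the rerun, `known` for everything seen so far). Both the representation and the function name are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the replay hit nothing outside the already known coverage,
   0 as soon as one new breakpoint shows up. Hitting *fewer* breakpoints is
   fine, since removed breakpoints can no longer fire. */
static int replay_has_no_new_coverage(const uint8_t *replay_hits,
                                      const uint8_t *known, size_t n) {
  assert(replay_hits && known);
  for (size_t i = 0; i < n; i++) {
    if (replay_hits[i] && !known[i]) return 0;
  }
  return 1;
}
```

A hash comparison would fail here, because the second run of the same input reports an empty (or smaller) hit set.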
This is a longer-reaching issue:
We should eventually add LibAFL to Fuzzbench, potentially even the in-mem fuzzer (crushing all other fuzzers :) ).
We have entities (eg executor) and for each entity we will provide some implementations into libafl (eg inmemoryexecutor and forkserverexecutor). These implementations are part of the library, they should not be in the example folder.
I propose to create a subfolder in include and src for each entity and place in each a main file with the abstract class, next to the implementations.
For instance.
include/executor/executor.h contains only the abstract class
include/executor/in_memory_executor.h contains the implementation
same for src/
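Roughly what the two headers could contain, squeezed into one listing. The "abstract class" is a struct of function pointers and the implementation embeds it as its first member. Names such as `executor_t`, `exec_result_t` and `in_memory_executor_init` are placeholders, not the existing API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* --- include/executor/executor.h: only the abstract class --- */
typedef enum { EXEC_OK, EXEC_CRASH } exec_result_t;

typedef struct executor executor_t;
struct executor {
  exec_result_t (*run_target)(executor_t *self, const uint8_t *input,
                              size_t len);
};

/* --- include/executor/in_memory_executor.h: one implementation --- */
typedef int (*harness_fn)(const uint8_t *input, size_t len);

typedef struct {
  executor_t base; /* must stay the first member so the cast below is valid */
  harness_fn harness;
} in_memory_executor_t;

static exec_result_t in_memory_run(executor_t *self, const uint8_t *input,
                                   size_t len) {
  in_memory_executor_t *e = (in_memory_executor_t *)self;
  return e->harness(input, len) == 0 ? EXEC_OK : EXEC_CRASH;
}

static void in_memory_executor_init(in_memory_executor_t *e, harness_fn h) {
  assert(e && h);
  e->base.run_target = in_memory_run;
  e->harness = h;
}

/* Toy harness for the example: "fails" when the input starts with '!'. */
static int toy_harness(const uint8_t *input, size_t len) {
  return (len > 0 && input[0] == '!') ? 1 : 0;
}
```

Code using the library would only include executor.h and call `run_target` through the base pointer, so a forkserver executor can be swapped in without touching the caller.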
Btw just look at FFF
If libafl has to be multi platform, we cannot really use just GNU makefiles.
I propose meson, as it was recently adopted by QEMU and seems a sane build system, more than cmake for sure.
We can maintain both meson files and raw makefiles for linux/bsd if needed.
We have to converge on how to implement multicore fuzzing in libAFL.