Introducing halfempty 🥛


Fast, Parallel Testcase Minimization

Halfempty is a new testcase minimization tool, designed with parallelization in mind. Halfempty was built to use strategies and techniques that dramatically speed up the minimization process.

Background

Fuzzers find inputs that trigger bugs, but understanding those bugs is easier when you remove as much extraneous data as possible. This is called testcase minimization or delta debugging.

Minimization tools use various techniques to simplify the testcase, but the core algorithm is simply bisection. Bisection is an inherently serial process: you can't advance through the algorithm without knowing the result of each step. This data dependency problem can make minimization very slow, sometimes taking hours to complete while CPU cores sit idle.
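To make the data dependency concrete, here is a minimal sketch of serial bisection in shell. It is not halfempty's implementation: the function names remove_chunk and serial_minimize are made up for illustration, and it assumes head -c is available (GNU and BSD both provide it). The harness is any command that reads a candidate on stdin and exits 0 if the candidate is still interesting.

```shell
#!/bin/sh
# Hypothetical helper: print FILE without the bytes [OFFSET, OFFSET+CHUNK).
remove_chunk() {
    head -c "$2" "$1"
    tail -c +"$(($2 + $3 + 1))" "$1"
}

# serial_minimize HARNESS INPUT OUTPUT
# Tries to remove progressively smaller chunks, one test at a time.
serial_minimize() {
    harness=$1
    cp "$2" "$3"
    size=$(( $(wc -c < "$3") ))
    chunk=$size
    while [ "$chunk" -ge 1 ]; do
        offset=0
        while [ "$offset" -lt "$size" ]; do
            remove_chunk "$3" "$offset" "$chunk" > "$3.try"
            # The next candidate depends on this result, so we must wait.
            if "$harness" < "$3.try" > /dev/null 2>&1; then
                mv "$3.try" "$3"              # keep the smaller file, retry here
                size=$(( $(wc -c < "$3") ))
            else
                offset=$((offset + chunk))    # chunk was needed, move along
            fi
        done
        chunk=$((chunk / 2))
    done
    rm -f "$3.try"
}
```

Every iteration blocks on the harness, so with a harness that takes seconds to run, the loop spends almost all of its time waiting on a single core.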

Bisection

In this diagram you can see we progressively remove parts of the file to determine which section is interesting.

halfempty solves this problem using pessimistic speculative execution. We build a binary tree of all the possible bisection steps and then idle cores can speculatively test future steps ahead of our position in the algorithm. In many cases, the test results are already known by the time we need them.

We call it pessimistic, because real workloads are characterized by long series of consecutive failures. We simply assume that tests are going to fail, and speculatively follow the failure path until proven wrong.
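The pessimistic guess can be sketched in shell as follows. This is an illustration under simplifying assumptions, not how halfempty is implemented: it speculates along the failure path only, for a single chunk size, and the name speculate and its arguments are hypothetical. Each background job assumes every earlier candidate failed, so all candidates are built from the same unmodified file.

```shell
#!/bin/sh
# speculate HARNESS FILE CHUNK JOBS WORKDIR
# Tests JOBS candidates in parallel, each with one chunk removed.
# Prints the index of the first candidate (in path order) that passed,
# or returns 1 if every "it will fail" guess was correct.
speculate() {
    harness=$1 file=$2 chunk=$3 jobs=$4 work=$5
    i=0
    while [ "$i" -lt "$jobs" ]; do
        offset=$((i * chunk))
        (
            # Candidate i is only valid if candidates 0..i-1 all fail.
            { head -c "$offset" "$file"
              tail -c +"$((offset + chunk + 1))" "$file"; } > "$work/cand.$i"
            "$harness" < "$work/cand.$i" > /dev/null 2>&1
            echo "$?" > "$work/status.$i"
        ) &
        i=$((i + 1))
    done
    wait
    # Walk the results in order; work past the first success was mispredicted.
    i=0
    while [ "$i" -lt "$jobs" ]; do
        if [ "$(cat "$work/status.$i")" -eq 0 ]; then
            echo "$i"
            return 0
        fi
        i=$((i + 1))
    done
    return 1
}
```

When the guesses are right, all the results arrive in roughly the time of one test. When a candidate unexpectedly succeeds, the results after it were computed against the wrong file and have to be thrown away.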

Tree

In this diagram, you can see we generated a binary tree of all possible outcomes, and now idle cores can speculatively work ahead of the main thread.

If you're fuzzing a target that takes more than a few seconds to run, then parallelizing the minimization can dramatically speed up your workflow. Real fuzzing inputs that take several seconds to reproduce can take many hours to minimize using serial bisection, but halfempty can produce the same output in minutes.

In real tests, the author often finds that the time saved exceeds hours.

Path Path Full

This is a real minimization path from a fuzzer generated crash.

Halfempty generates a binary tree, and this graph shows the path through the tree from the root to the final leaf (discarded paths are hidden on the left to simplify the diagram).

The green nodes were successful and the red nodes were failures. The grey nodes on the right were explored but discarded. Because all consecutive red nodes are executed in parallel, the actual wall clock time required to minimize the input was minimal.

Each crash took ~11 seconds to reproduce, requiring about 34 minutes of compute time - but halfempty completed in just 6 minutes!
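As a rough check on those figures, the arithmetic can be worked through in shell. The function names are hypothetical, and the calculation assumes every test on the final path took the full 11 seconds.

```shell
#!/bin/sh
# path_tests COMPUTE_MINUTES SECONDS_PER_TEST
# Approximate number of tests on the final path (integer division).
path_tests() {
    echo $(($1 * 60 / $2))
}

# speedup_ratio COMPUTE_MINUTES WALL_MINUTES
# Ratio of compute time to wall clock time, to one decimal place.
speedup_ratio() {
    awk -v c="$1" -v w="$2" 'BEGIN { printf "%.1f\n", c / w }'
}

path_tests 34 11      # prints 185
speedup_ratio 34 6    # prints 5.7
```

In other words, about 185 sequential reproductions were compressed into 6 minutes, a speedup of a little under 6x for this input.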

The original input was 240K, the final output was just 75 bytes.

Building

The only dependency is libglib2.0-dev, used for some useful data structures, like N-ary trees.

On RedHat systems, try glib2-devel.

Just type make to build the main binary.

The --monitor mode feature requires the graphviz package and a web browser.

The author has tested the following distributions:

  • CentOS 6 amd64
  • Ubuntu 14 amd64

Mac OS X

Halfempty has preliminary macOS support using homebrew.

Please use brew install pkg-config glib to install necessary dependencies, then make to build the main binary.

Usage

First, create a shell script that exits with status zero when given your input on stdin.

A simple example might look like this if you wanted to test a gzip crash:

#!/bin/sh

gzip -dc

# Check if we were killed with SIGSEGV
if test $? -eq 139; then
    exit 0 # We want this input
else
    exit 1 # We don't want this input
fi

Make the file executable and verify it works:

$ chmod +x testgzip.sh
$ ./testgzip.sh < crashinput.gz && echo success || echo failure
success
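The 139 in the script above is not specific to gzip. Common shells such as bash and dash report a child killed by a signal as 128 plus the signal number, and SIGSEGV is signal 11 on Linux. A small sketch, where signal_status is a hypothetical helper name:

```shell
#!/bin/sh
# signal_status SIGNAL_NUMBER
# Exit status a typical sh reports for a child killed by that signal.
signal_status() {
    echo $((128 + $1))
}

signal_status 11    # prints 139, the value the gzip script tests for
kill -l 139         # asks the shell to name the signal behind that status
```

If you want to match a different signal, such as SIGABRT (6) from a failed assertion, adjust the number in your test script accordingly.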

Now simply run halfempty with your input and it will find the smallest version that still returns zero.

Note: If you need to create temporary files, see some advanced examples in the documentation.

$ halfempty testgzip.sh crashinput.gz

If everything worked, there should be a minimal output file in halfempty.out.

Screenshot

If you want to monitor what halfempty is doing, you can use --monitor mode, which will generate graphs you can watch in real time. halfempty will print a URL that you can open to view the data in your web browser.

Note: --monitor mode requires the graphviz package to be installed.

Screenshot

Options

Halfempty includes many options to fine tune the execution environment for the child processes, and tweak performance options. The full documentation can be shown with --help-all, but here are the most commonly useful parameters.

| Parameter | Description |
| --- | --- |
| --num-threads=threads | Halfempty will default to using all available cores, but you can tweak this if you prefer. |
| --stable | Sometimes different strategies can shake out new potential for minimizing. If you enable this, halfempty will repeat all strategies until the output doesn't change. (Slower, but recommended). |
| --timeout=seconds | If tested programs can run too long, we can send them a SIGALRM. You can catch this in your test script (see help trap) and clean up if you like, or accept the default action and terminate. |
| --limit RLIMIT_???=N | You can fine tune the resource limits available to child processes. Perhaps you want to limit how much memory they can allocate, or enable core dumps. An example might be --limit RLIMIT_CPU=600 |
| --inherit-stdout, --inherit-stderr | By default, we discard all output from children. Enable these if you want to see child output and error messages instead. |
| --zero-char=byte | Halfempty tries to simplify files by overwriting data with nul bytes. This makes sense for binary file formats. If you're minimizing text formats (html, xml, c, etc) then you might want whitespace instead. Set this to 0x20 for space, or 0x0a for a newline. |
| --monitor | If you have the graphviz package installed, halfempty can generate graphs so you can watch the progress. |
| --no-terminate | If halfempty guesses wrong, it might already be running your test on an input we know we don't need. By default, we will try to kill it so we can get back to using that thread sooner. You can disable this if you prefer. |
| --output=filename | By default your output is saved to halfempty.out, but you can save it anywhere you like. |
| --noverify | If tests are very slow, you can skip the initial verification and go straight to parallelization. (Faster, but not recommended). |
| --generate-dot | Halfempty can generate a dot file of the final tree state that you can inspect with xdot. |
| --gen-intermediate | Save the best result as it's found, so you don't lose your progress if halfempty is interrupted. |
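To picture what --zero-char changes, here is a minimal sketch of overwriting a chunk with a chosen byte. It only illustrates the idea and is not halfempty's code: overwrite_chunk is a hypothetical name, the byte is passed as a three-digit octal code because that is what tr understands, and the chunk is assumed to lie entirely inside the file.

```shell
#!/bin/sh
# overwrite_chunk FILE OFFSET LENGTH OCTAL
# Prints FILE with LENGTH bytes starting at OFFSET replaced by one byte value.
# OCTAL is 000 for nul, 040 for space (0x20), 012 for newline (0x0a).
overwrite_chunk() {
    head -c "$2" "$1"                            # bytes before the chunk
    head -c "$3" /dev/zero | tr '\000' "\\$4"    # the replacement bytes
    tail -c +"$(($2 + $3 + 1))" "$1"             # bytes after the chunk
}
```

For example, overwriting five bytes at offset 3 of a file containing `<b>hello</b>` with 040 leaves the tags intact and turns the text into spaces, which a text parser is more likely to accept than nul bytes. The file size does not change, unlike removal.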

Examples

There are more examples available in the wiki.

Creating temporary files

Note: Are you sure you need temporary files? Many programs will accept /dev/stdin.

If you need to create temporary files to give to your target program, you can simply do something like this.

#!/bin/sh
# Copy the candidate from stdin into a temporary file.
tempfile=$(mktemp) && cat > "${tempfile}"

yourprogram "${tempfile}"

Remember to clean it up when you're done. You can do this if you like:

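#!/bin/sh
tempfile=$(mktemp) && cat > "${tempfile}"
result=1

# Remove the temporary file on exit, or if we are signalled (e.g. --timeout).
trap 'rm -f "${tempfile}"; exit ${result}' EXIT TERM ALRM

yourprogram "${tempfile}"

# 139 means the program was killed by SIGSEGV.
if test $? -eq 139; then
    result=0
fi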

Verifying crashes

Sometimes minimization accidentally finds a different crash from the one you started with. One solution might be to use gdb to verify the crash site.

#!/bin/sh
exec gdb -q                                                                 \
         -ex 'r'                                                            \
         -ex 'q !($_siginfo.si_signo == 11 && $pc == 0x00007ffff763f2e7)'   \
         -ex 'q 1'                                                          \
         --args yourprogram --yourparams

This will exit 0 if the signal number and crash address match, or 1 otherwise.

You can test various things such as registers ($rip, $eax, etc), fault address ($_siginfo._sifields._sigfault.si_addr), and many more. If you want to see more things you can test, try the command show conv in gdb.

FAQ

Q. What does finalized mean in halfempty output?

A. Halfempty works by guessing what the results of tests will be before the real result is known. If the path through the bisection tree from the root node to the final leaf was entirely through nodes where we knew the result, then the path is finalized (as opposed to pending).

Q. Where does the name come from?

A. We use pessimistic speculative execution, so the glass is always half empty? ....? Sorry. 🥛

Q. How can I kill processes that take too long?

A. Use --timeout 10 to send a signal that can be caught after 10 seconds, or --limit RLIMIT_CPU=10 to enforce a hard limit.

Q. Halfempty wastes a lot of CPU time exploring paths, so is it really faster?

A. It's significantly faster in real time (i.e. wall clock time), and that's what counts!

Q. I have a very large input, what do I need to know?

A. Halfempty is less thorough by default on very large inputs that don't seem to minimize well. Removing each byte from multi-gigabyte inputs just takes too long, even when run in parallel.

If you really want halfempty to be thorough, you can do this:

$ halfempty --bisect-skip-multiplier=0 --zero-skip-multiplier=0 --stable --gen-intermediate harness.sh input.bin

  • --bisect-skip-multiplier=0 and --zero-skip-multiplier=0 mean to try removing every single byte.
  • --stable means to keep retrying minimization until no further removals work.
  • --gen-intermediate means to save the best result as it's found, so you won't lose your work if you change your mind.

On the other hand, if you just want halfempty to be faster and don't care if it's not very thorough, you can do the opposite. Something like this:

$ halfempty --bisect-skip-multiplier=0.01 --zero-skip-multiplier=0.01 harness.sh input.bin

The reasonable range for the multiplier is 0 to 0.1.

BUGS

  • If your program intercepts signals or creates process groups, it might be difficult to clean up.
  • For very long trees, we keep an fd open for each successful node. It's possible we might exhaust fds.

Please report more bugs or unexpected results to [email protected]. The author intends to maintain this tool and make it a stable and reliable component of your fuzzing workflow.

Better quality bug reports require simpler reproducers, and that requires good quality tools.

FUTURE

  • The next version will allow the level of pessimism to be controlled at runtime.

AUTHORS

Tavis Ormandy [email protected]

LICENSE

Apache 2.0, See LICENSE file for details.

NOTICE

This is not an officially supported Google product.

halfempty's Issues

Check for stale file descriptors

From #17, it was pointed out that there are a bunch of attempts to close(-1).

I think this is a cleanup thread trying to garbage collect destroyed tasks. It should be checking if the file descriptor is still valid. This could mask a real bug later on, so it makes sense to avoid it if possible.

Suggestion: additional topics

Nice project!

I discovered this some months ago and then forgot about it.

Despite knowing it existed, I struggled a bit to find it, searching for "testcase reduction" and "parallel". I used Google, Duckduckgo and GitHub search. I had "parallel creduce" in mind, which wasn't quite right. I didn't think of the word "minimization" until later.

Could I suggest putting more topics on the GitHub repository? That might make it easier to find, e.g.

  • testcase-minimization
  • testcase-reduction

print_status_message is wrong if no success nodes

I've noticed that print_status_message() shows user=0.0s if no tasks have been successful, which makes it look like there was no speedup:

treesize=911, height=456, unproc=2, real=80.9s, user=0.0s, speedup=~-80.9s

Even though it was really, really fast, it just couldn't make any progress. This is because finalnode = find_finalized_node(tree, true); only finds a success node; if there haven't been any yet, the elapsed time is zero.

This is just a cosmetic bug. It would be nice if it showed how much faster we were; the value just isn't being displayed.
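The status lines quoted in these issues appear consistent with the speedup field being the user time minus the real time. That relationship is inferred from the logged values only, not from the source code, and status_speedup is a hypothetical name. A quick sketch of the arithmetic:

```shell
#!/bin/sh
# status_speedup REAL_SECONDS USER_SECONDS
# Assumed relationship: speedup = user - real, to one decimal place.
status_speedup() {
    awk -v real="$1" -v user="$2" 'BEGIN { printf "%.1f\n", user - real }'
}

status_speedup 80.9 0.0    # prints -80.9, matching the line quoted above
```

Under that assumption, a user time stuck at zero turns the whole elapsed time into an apparent slowdown, which is why the message looks so misleading.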

Hangs when using musl libc

When using halfempty 0.40 on Linux 5.9.8/musl 1.1.24/glib 2.66.2 from Void Linux, the test suite gets stuck, e.g. on the grep example.

Valgrind shows that close(-1) is executed, which points me to cleanup_orphaned_tasks.

Sample run with debugging output (limited to one thread for clarity):

$ G_MESSAGES_DEBUG=all ../halfempty -P 1 --cleanup-threads=1 grep.sh grep.in
** (process:18916): DEBUG: 22:49:07.296: configuring default rlimits for child process
** (process:18916): DEBUG: 22:49:07.296: Configured rlimit RLIMIT_CPU => { 18446744073709551615, 18446744073709551615 }
** (process:18916): DEBUG: 22:49:07.296: Configured rlimit RLIMIT_FSIZE => { 18446744073709551615, 18446744073709551615 }
** (process:18916): DEBUG: 22:49:07.296: Configured rlimit RLIMIT_DATA => { 18446744073709551615, 18446744073709551615 }
** (process:18916): DEBUG: 22:49:07.296: Configured rlimit RLIMIT_STACK => { 8388608, 18446744073709551615 }
** (process:18916): DEBUG: 22:49:07.296: Configured rlimit RLIMIT_CORE => { 0, 18446744073709551615 }
** (process:18916): DEBUG: 22:49:07.296: Configured rlimit RLIMIT_RSS => { 18446744073709551615, 18446744073709551615 }
** (process:18916): DEBUG: 22:49:07.296: Configured rlimit RLIMIT_NPROC => { 62081, 62081 }
** (process:18916): DEBUG: 22:49:07.296: Configured rlimit RLIMIT_NOFILE => { 4096, 4096 }
** (process:18916): DEBUG: 22:49:07.296: Configured rlimit RLIMIT_MEMLOCK => { 65536, 65536 }
** (process:18916): DEBUG: 22:49:07.296: Configured rlimit RLIMIT_AS => { 18446744073709551615, 18446744073709551615 }
** (process:18916): DEBUG: 22:49:07.296: Configured rlimit RLIMIT_LOCKS => { 18446744073709551615, 18446744073709551615 }
** (process:18916): DEBUG: 22:49:07.297: Configured rlimit RLIMIT_SIGPENDING => { 62081, 62081 }
** (process:18916): DEBUG: 22:49:07.297: Configured rlimit RLIMIT_MSGQUEUE => { 819200, 819200 }
** (process:18916): DEBUG: 22:49:07.297: Configured rlimit RLIMIT_NICE => { 0, 0 }
** (process:18916): DEBUG: 22:49:07.297: Configured rlimit RLIMIT_RTPRIO => { 0, 0 }
** INFO: 22:49:07.297: Initializing 2 strategies...
** INFO: 22:49:07.297: Strategy 1: "bisect", Remove consecutively larger chunks of data from the file
** INFO: 22:49:07.297: Strategy 2: "zero", Zero consecutively larger chunks of data from the file
╭│   │ ── halfempty ───────────────────────────────────────────────── v0.40 ──
╰│  8│ A fast, parallel testcase minimization tool
 ╰───╯ ───────────────────────────────────────────────────────── by @taviso ──

Input file "grep.in" is now 1191359 bytes, starting strategy "bisect"...
Verifying the original input executes successfully... (skip with --noverify)
** (halfempty:18916): DEBUG: 22:49:07.298: thread 0x55b2a74dce60 processing task 0x55b2a74dcce0, size 1191359, fd 3, status TASK_STATUS_PENDING
** (halfempty:18916): DEBUG: 22:49:07.299: writing data to child 18918 pipefd=5
** (halfempty:18916): DEBUG: 22:49:07.302: Broken pipe received from 18916
** (halfempty:18916): DEBUG: 22:49:07.302: failed to splice all data into pipe, 1191359 remaining
** (halfempty:18916): DEBUG: 22:49:07.302: finished writing data to child, about to waitid(18918)
** (halfempty:18916): DEBUG: 22:49:07.302: child 18918 exited with code 0
** (halfempty:18916): DEBUG: 22:49:07.302: thread 0x55b2a74dce60, child returned 0 after 0.004 seconds, size 1191359
** (halfempty:18916): DEBUG: 22:49:07.302: task 0x55b2a74dcce0 success, aborting mispredicted jobs
** (halfempty:18916): DEBUG: 22:49:07.302: abort_pending_tasks() called, but no child nodes to traverse
** INFO: 22:49:07.302: thread 0x55b2a74dce60 found task 0x55b2a74dcce0 succeeded after 0.004 seconds, size 1191359, depth 1
** (halfempty:18916): DEBUG: 22:49:07.302: thread 0x55b2a74dce60 completed workunit 0x55b2a74dcce0
The original input file succeeded after 0.0 seconds.
(halfempty:18916): bisect-DEBUG: 22:49:07.303: strategy_bisect_data(0x55b2a74dce00)
(halfempty:18916): bisect-DEBUG: 22:49:07.303: initializing a new root node size 1191359
** (halfempty:18916): DEBUG: 22:49:07.303: generator thread obtained treelock, finding next leaf
New finalized size: 1191359 (depth=2) real=0.0s, user=0.0s, speedup=~-0.0s
** (halfempty:18916): DEBUG: 22:49:07.303: found a TASK_STATUS_SUCCESS task, size 1191359 
(halfempty:18916): bisect-DEBUG: 22:49:07.303: strategy_bisect_data(0x55b2a74dce00)
(halfempty:18916): bisect-DEBUG: 22:49:07.303: parent succeeded, not incrementing offset from 0
(halfempty:18916): bisect-DEBUG: 22:49:07.303: creating task for 0x55b2a74dce00 with parent 0x55b2a74dcce0 and source 0x55b2a74dcce0
** (halfempty:18916): DEBUG: 22:49:07.303: node is a leaf node, generating children
** (halfempty:18916): DEBUG: 22:49:07.303: generator thread releasing tree lock
** (halfempty:18916): DEBUG: 22:49:07.303: generator thread obtained treelock, finding next leaf
** (halfempty:18916): DEBUG: 22:49:07.303: thread 0x7f7c167af460 processing task 0x55b2a74dd940, size 0, fd 4, status TASK_STATUS_PENDING
** (halfempty:18916): DEBUG: 22:49:07.304: found a TASK_STATUS_SUCCESS task, size 1191359 
** (halfempty:18916): DEBUG: 22:49:07.304: node is not a leaf, traversing
** (halfempty:18916): DEBUG: 22:49:07.304:  found a TASK_STATUS_PENDING task, size 0 
(halfempty:18916): bisect-DEBUG: 22:49:07.304: strategy_bisect_data(0x55b2a74dcec0)
bisect-INFO: 22:49:07.304: reached end of cycle (offset 0 + chunksize 1191359 > size 0)
(halfempty:18916): bisect-DEBUG: 22:49:07.304: creating task for 0x55b2a74dcec0 with parent 0x55b2a74dd940 and source 0x55b2a74dcce0
** (halfempty:18916): DEBUG: 22:49:07.305: writing data to child 18920 pipefd=6
** (halfempty:18916): DEBUG: 22:49:07.305: finished writing data to child, about to waitid(18920)
** (halfempty:18916): DEBUG: 22:49:07.307: child 18920 exited with code 1
** (halfempty:18916): DEBUG: 22:49:07.307: thread 0x7f7c167af460, child returned 1 after 0.004 seconds, size 0
** (halfempty:18916): DEBUG: 22:49:07.307: task 0x55b2a74dd940 failed, fd 4, pid 18920
** (halfempty:18916): DEBUG: 22:49:07.308: thread 0x7f7c167af460 completed workunit 0x55b2a74dd940
** (halfempty:18916): DEBUG: 22:49:07.308: thread 0x7f7c166826a0 cleaning up task 0x55b2a74dd940 (pid=18920), now attempting to lock
** (halfempty:18916): DEBUG: 22:49:07.308: thread 0x7f7c166826a0 acquired lock on task 0x55b2a74dd940, state TASK_STATUS_FAILURE
** (halfempty:18916): DEBUG: 22:49:07.308: task 0x55b2a74dd940 unlocked by 0x7f7c166826a0, now discarded
** (halfempty:18916): DEBUG: 22:49:07.311:  node is a leaf node, generating children
** (halfempty:18916): DEBUG: 22:49:07.311: generator thread releasing tree lock
** (halfempty:18916): DEBUG: 22:49:07.311: generator thread obtained treelock, finding next leaf
** (halfempty:18916): DEBUG: 22:49:07.311: thread 0x7f7c167af460 processing task 0x55b2a74dd9a0, size 595680, fd 8, status TASK_STATUS_PENDING
** (halfempty:18916): DEBUG: 22:49:07.312: found a TASK_STATUS_SUCCESS task, size 1191359 
** (halfempty:18916): DEBUG: 22:49:07.312: node is not a leaf, traversing
** (halfempty:18916): DEBUG: 22:49:07.312:  found a TASK_STATUS_FAILURE task, size 0 
** (halfempty:18916): DEBUG: 22:49:07.312:  node is not a leaf, traversing
** (halfempty:18916): DEBUG: 22:49:07.312:   found a TASK_STATUS_PENDING task, size 595680 
(halfempty:18916): bisect-DEBUG: 22:49:07.313: strategy_bisect_data(0x55b2a74dcef0)
(halfempty:18916): bisect-DEBUG: 22:49:07.313: parent failed or pending, trying next offset 0 => 595679
(halfempty:18916): bisect-DEBUG: 22:49:07.313: creating task for 0x55b2a74dcef0 with parent 0x55b2a74dd9a0 and source 0x55b2a74dcce0
** (halfempty:18916): DEBUG: 22:49:07.322:   node is a leaf node, generating children
** (halfempty:18916): DEBUG: 22:49:07.322: generator thread releasing tree lock
** (halfempty:18916): DEBUG: 22:49:07.322: generator thread obtained treelock, finding next leaf
** (halfempty:18916): DEBUG: 22:49:07.322: found a TASK_STATUS_SUCCESS task, size 1191359 
** (halfempty:18916): DEBUG: 22:49:07.322: node is not a leaf, traversing
** (halfempty:18916): DEBUG: 22:49:07.322:  found a TASK_STATUS_FAILURE task, size 0 
** (halfempty:18916): DEBUG: 22:49:07.322:  node is not a leaf, traversing
** (halfempty:18916): DEBUG: 22:49:07.322:   found a TASK_STATUS_PENDING task, size 595680 
** (halfempty:18916): DEBUG: 22:49:07.322:   node is not a leaf, traversing
** (halfempty:18916): DEBUG: 22:49:07.322:    found a TASK_STATUS_PENDING task, size 595680 
(halfempty:18916): bisect-DEBUG: 22:49:07.323: strategy_bisect_data(0x55b2a74dcf50)
bisect-INFO: 22:49:07.323: reached end of cycle (offset 595679 + chunksize 595679 > size 595680)
(halfempty:18916): bisect-DEBUG: 22:49:07.323: creating task for 0x55b2a74dcf50 with parent 0x55b2a74e1300 and source 0x55b2a74dcce0
** (halfempty:18916): DEBUG: 22:49:07.333:    node is a leaf node, generating children
** (halfempty:18916): DEBUG: 22:49:07.333: generator thread releasing tree lock
** (halfempty:18916): DEBUG: 22:49:07.333: generator thread obtained treelock, finding next leaf
** (halfempty:18916): DEBUG: 22:49:07.333: found a TASK_STATUS_SUCCESS task, size 1191359 
** (halfempty:18916): DEBUG: 22:49:07.333: node is not a leaf, traversing
** (halfempty:18916): DEBUG: 22:49:07.333:  found a TASK_STATUS_FAILURE task, size 0 
** (halfempty:18916): DEBUG: 22:49:07.334:  node is not a leaf, traversing
** (halfempty:18916): DEBUG: 22:49:07.334:   found a TASK_STATUS_PENDING task, size 595680 
** (halfempty:18916): DEBUG: 22:49:07.334:   node is not a leaf, traversing
** (halfempty:18916): DEBUG: 22:49:07.334:    found a TASK_STATUS_PENDING task, size 595680 
** (halfempty:18916): DEBUG: 22:49:07.334:    node is not a leaf, traversing
** (halfempty:18916): DEBUG: 22:49:07.334:     found a TASK_STATUS_PENDING task, size 893520 
(halfempty:18916): bisect-DEBUG: 22:49:07.334: strategy_bisect_data(0x55b2a74dd210)
(halfempty:18916): bisect-DEBUG: 22:49:07.334: parent failed or pending, trying next offset 0 => 297839
(halfempty:18916): bisect-DEBUG: 22:49:07.334: creating task for 0x55b2a74dd210 with parent 0x55b2a74e1360 and source 0x55b2a74dcce0
** (halfempty:18916): DEBUG: 22:49:07.339:     node is a leaf node, generating children
** (halfempty:18916): DEBUG: 22:49:07.339: generator thread releasing tree lock

Interestingly, under valgrind it manages to finish, but does these invalid close calls (which also seem to occur on glibc!):

valgrind ../halfempty -P 1 --cleanup-threads=1 grep.sh grep.in
==19081== Memcheck, a memory error detector
==19081== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==19081== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==19081== Command: ../halfempty -P 1 --cleanup-threads=1 grep.sh grep.in
==19081== 
╭│   │ ── halfempty ───────────────────────────────────────────────── v0.40 ──
╰│  8│ A fast, parallel testcase minimization tool
 ╰───╯ ───────────────────────────────────────────────────────── by @taviso ──

Input file "grep.in" is now 1191359 bytes, starting strategy "bisect"...
--19081-- WARNING: unhandled amd64-linux syscall: 315
--19081-- You may be able to write your own handler.
--19081-- Read the file README_MISSING_SYSCALL_OR_IOCTL.
--19081-- Nevertheless we consider this a bug.  Please report
--19081-- it at http://valgrind.org/support/bug_reports.html.
Verifying the original input executes successfully... (skip with --noverify)
The original input file succeeded after 0.0 seconds.
New finalized size: 1191359 (depth=2) real=0.0s, user=0.0s, speedup=~-0.0s
New finalized size: 595680 (depth=5) real=0.1s, user=0.0s, speedup=~-0.0s
New finalized size: 297841 (depth=7) real=0.1s, user=0.1s, speedup=~-0.0s
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
New finalized size: 148922 (depth=9) real=0.2s, user=0.1s, speedup=~-0.1s
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
New finalized size: 74463 (depth=11) real=0.3s, user=0.1s, speedup=~-0.1s
New finalized size: 37234 (depth=13) real=0.3s, user=0.2s, speedup=~-0.1s
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
New finalized size: 18620 (depth=14) real=0.4s, user=0.2s, speedup=~-0.2s
New finalized size: 9313 (depth=17) real=0.5s, user=0.2s, speedup=~-0.2s
New finalized size: 4660 (depth=19) real=0.5s, user=0.3s, speedup=~-0.2s
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
New finalized size: 2334 (depth=20) real=0.6s, user=0.3s, speedup=~-0.2s
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
New finalized size: 1171 (depth=23) real=0.7s, user=0.4s, speedup=~-0.3s
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
New finalized size: 590 (depth=25) real=0.8s, user=0.4s, speedup=~-0.3s
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
New finalized size: 300 (depth=26) real=0.9s, user=0.4s, speedup=~-0.4s
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
New finalized size: 155 (depth=29) real=1.0s, user=0.5s, speedup=~-0.5s
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
New finalized size: 83 (depth=30) real=1.0s, user=0.5s, speedup=~-0.5s
==19081== Warning: invalid file descriptor -1 in syscall close()
New finalized size: 47 (depth=33) real=1.1s, user=0.5s, speedup=~-0.5s
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
New finalized size: 29 (depth=34) real=1.2s, user=0.6s, speedup=~-0.6s
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
New finalized size: 20 (depth=37) real=1.3s, user=0.6s, speedup=~-0.6s
New finalized size: 11 (depth=38) real=1.3s, user=0.6s, speedup=~-0.6s
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
New finalized size: 9 (depth=45) real=1.5s, user=0.7s, speedup=~-0.7s
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
New finalized size: 8 (depth=46) real=1.5s, user=0.8s, speedup=~-0.7s
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
New finalized size: 7 (depth=53) real=1.7s, user=0.9s, speedup=~-0.7s
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
Reached the end of our path through tree, all nodes were finalized
45 nodes failed, 44 worked, 21 discarded, 1 collapsed
1.636 seconds of compute was required for final path
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()

Strategy "bisect" complete, output 7 bytes
Input file "grep.in" is now 7 bytes, starting strategy "zero"...
Verifying the original input executes successfully... (skip with --noverify)
The original input file succeeded after 0.0 seconds.
New finalized size: 7 (depth=2) real=0.0s, user=0.0s, speedup=~-0.0s
New finalized size: 7 (depth=5) real=0.0s, user=0.0s, speedup=~-0.0s
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
Reached the end of our path through tree, all nodes were finalized
11 nodes failed, 2 worked, 0 discarded, 1 collapsed
0.230 seconds of compute was required for final path
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()
==19081== Warning: invalid file descriptor -1 in syscall close()

Strategy "zero" complete, output 7 bytes
All work complete, generating output halfempty.out (size: 7)
==19081== 
==19081== HEAP SUMMARY:
==19081==     in use at exit: 24,758 bytes in 37 blocks
==19081==   total heap usage: 9,966 allocs, 9,929 frees, 538,679 bytes allocated
==19081== 
==19081== LEAK SUMMARY:
==19081==    definitely lost: 0 bytes in 0 blocks
==19081==    indirectly lost: 0 bytes in 0 blocks
==19081==      possibly lost: 0 bytes in 0 blocks
==19081==    still reachable: 24,758 bytes in 37 blocks
==19081==         suppressed: 0 bytes in 0 blocks
==19081== Rerun with --leak-check=full to see details of leaked memory
==19081== 
==19081== For lists of detected and suppressed errors, rerun with: -s
==19081== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

I hope this is enough detail for someone familiar with the code to debug it.

Handle low disk space better

We can close old file descriptors if they're not needed anymore and free up some disk space.

halfempty can use a lot of temporary disk space right now, which is noticeable in VMs.

Error when opening the HTML page for monitor mode from a Windows host

The HTML page generated for monitor mode references the PNG image using an absolute UNIX-style path. As a result, opening the HTML page from a Windows host results in an error.

If both the HTML and PNG files are guaranteed to reside in the same directory, I think a relative filename would work on all platforms.
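
A minimal sketch of the relative-filename idea, assuming the HTML and PNG files are written to the same directory. The helper name `relative_image_name` is hypothetical and not part of halfempty; it only strips the directory component so the generated page can reference the bare filename.

```c
#include <assert.h>
#include <string.h>

// Hypothetical helper (not halfempty's API): return the filename component
// of a UNIX-style path, so the HTML page can reference the image relatively.
static const char *relative_image_name(const char *path)
{
    const char *slash;

    assert(path != NULL);

    slash = strrchr(path, '/');

    // No slash means the path is already a bare filename.
    return slash ? slash + 1 : path;
}
```

The returned pointer aliases the input string, so no allocation is needed; the caller would pass it wherever the absolute path is currently written into the page.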

Windows support

Hi!

Do you have any plans to support Windows applications?

non-Linux portability request

halfempty does not compile on NetBSD. I fixed some issues in #11 but two more remain:

  • the use of splice(), which is Linux-specific
  • the use of sendfile(), which I think FreeBSD also supports, but NetBSD doesn't.

Could you please advise on workarounds for these?
Thanks.
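
One possible workaround, sketched under the assumption that zero-copy transfer is an optimization rather than a requirement: fall back to a plain POSIX read()/write() loop on platforms without splice() or sendfile(). The function name `copy_fd` and its argument order are hypothetical and not taken from halfempty.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical portable fallback (not halfempty's code): copy up to `count`
// bytes from `in_fd` to `out_fd` using only POSIX read() and write().
// Returns the number of bytes copied, or -1 on error with errno set.
static ssize_t copy_fd(int out_fd, int in_fd, size_t count)
{
    char   buf[65536];
    size_t total = 0;

    assert(out_fd >= 0 && in_fd >= 0);

    while (total < count) {
        size_t  want = count - total;
        ssize_t got;

        if (want > sizeof buf)
            want = sizeof buf;

        got = read(in_fd, buf, want);

        if (got == 0)
            break;                  // End of input.
        if (got < 0) {
            if (errno == EINTR)
                continue;           // Interrupted, retry the read.
            return -1;
        }

        // write() may be partial, so loop until the chunk is flushed.
        for (ssize_t off = 0; off < got; ) {
            ssize_t put = write(out_fd, buf + off, got - off);

            if (put < 0) {
                if (errno == EINTR)
                    continue;       // Interrupted, retry the write.
                return -1;
            }
            off += put;
        }

        total += got;
    }

    return (ssize_t) total;
}
```

This copies through a userspace buffer, so it is slower than splice() or sendfile(), but it relies only on calls available on NetBSD and other POSIX systems.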

Write a real autoconf script to verify build dependencies

I tried "make" on my VPS (which is 32-bit), and got all kinds of errors.

sander@haring:~/git/halfempty$ make
Checking for glib-2.0...ok
gcc -Wall -std=gnu99 -O0 -ggdb3 -fPIC -Wno-format-zero-length -Wno-unused-parameter -UNDEBUG -UG_DISABLE_ASSERT `getconf LFS_CFLAGS` `pkg-config --cflags glib-2.0` -D_GNU_SOURCE  -c -o halfempty.o halfempty.c
halfempty.c:52:27: error: ‘G_OPTION_FLAG_NONE’ undeclared here (not in a function)
     { "num-threads", 'P', G_OPTION_FLAG_NONE, G_OPTION_ARG_INT, &kProcessThreads, "How many threads to use (default=ncores+1).", "threads" },
                           ^
halfempty.c:60:5: error: initializer element is not constant
     { "generate-dot", 0, G_OPTION_FLAG_NONE, G_OPTION_ARG_NONE, &kGenerateDotFile, "Generate a DOT file to display the tree status (default=off).", NULL },
     ^

"make" went OK on 64-bit Intel and 64-bit ARM.

So I guess halfempty is 64-bit only?
If so: I wrote 3 lines of code in the Makefile to check for that. Do you want a PR for that?

~/git/halfempty$ make
Checking for 64-bit...ok
Checking for glib-2.0...ok
gcc -Wall -std=gnu99 -O0 -ggdb3 -fPIC -Wno-format-zero-length -Wno-unused-parameter -UNDEBUG -UG_DISABLE_ASSERT `getconf LFS_CFLAGS` `pkg-config --cflags glib-2.0` -D_GNU_SOURCE  -c -o halfempty.o halfempty.c

and

sander@haring:~/git/halfempty$ make
Checking for 64-bit...Not 64-bit
make: *** [check] Error 1

Error when compiling in Ubuntu 16.04 - /usr/include/x86_64-linux-gnu/bits/fcntl2.h:50:4: error: call to '__open_missing_mode' declared with attribute error: open with O_CREAT or O_TMPFILE in second argument needs 3 arguments

Hi Tavis,

I tried compiling in Ubuntu 16.04, but can't really figure out how to fix it at the moment.
This is the error I get:
Checking for glib-2.0...ok
gcc -Wall -std=gnu99 -O2 -ggdb3 -march=native -fPIC -Wno-format-zero-length -Wno-unused-parameter -UNDEBUG -UG_DISABLE_ASSERT `getconf LFS_CFLAGS` `pkg-config --cflags glib-2.0` -D_GNU_SOURCE -c -o halfempty.o halfempty.c
gcc -Wall -std=gnu99 -O2 -ggdb3 -march=native -fPIC -Wno-format-zero-length -Wno-unused-parameter -UNDEBUG -UG_DISABLE_ASSERT `getconf LFS_CFLAGS` `pkg-config --cflags glib-2.0` -D_GNU_SOURCE -c -o proc.o proc.c
gcc -Wall -std=gnu99 -O2 -ggdb3 -march=native -fPIC -Wno-format-zero-length -Wno-unused-parameter -UNDEBUG -UG_DISABLE_ASSERT `getconf LFS_CFLAGS` `pkg-config --cflags glib-2.0` -D_GNU_SOURCE -c -o bisect.o bisect.c
hexdump -ve '"" 1/1 "%#02x" ","' monitor.tpl > monitor.h
gcc -Wall -std=gnu99 -O2 -ggdb3 -march=native -fPIC -Wno-format-zero-length -Wno-unused-parameter -UNDEBUG -UG_DISABLE_ASSERT `getconf LFS_CFLAGS` `pkg-config --cflags glib-2.0` -D_GNU_SOURCE -c -o util.o util.c
In file included from /usr/include/fcntl.h:289,
from util.c:34:
In function 'open',
inlined from 'g_unlinked_tmp' at util.c:154:17:
/usr/include/x86_64-linux-gnu/bits/fcntl2.h:50:4: error: call to '__open_missing_mode' declared with attribute error: open with O_CREAT or O_TMPFILE in second argument needs 3 arguments
__open_missing_mode ();
^~~~~~~~~~~~~~~~~~~~~~
: recipe for target 'util.o' failed
make: *** [util.o] Error 1

Thank you.
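
The compiler message says that the open() call inlined from g_unlinked_tmp at util.c:154 passes O_CREAT or O_TMPFILE without the third (mode) argument. A minimal sketch of the three-argument form follows, assuming the goal is an unlinked temporary file; the helper name `open_unlinked_file` and the chosen flags are hypothetical and not the actual util.c code.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper (not the actual util.c code): create a new file,
// always supplying the mode argument, then unlink it so the storage is
// released as soon as the descriptor is closed.
static int open_unlinked_file(const char *path)
{
    int fd;

    assert(path != NULL);

    // With O_CREAT (or O_TMPFILE) the third argument gives the permissions
    // of the new file; fortified glibc headers reject calls that omit it.
    fd = open(path, O_RDWR | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);

    if (fd >= 0)
        unlink(path);

    return fd;
}
```

The mode `S_IRUSR | S_IWUSR` (0600) is only an illustrative choice that keeps the temporary file private to the current user.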
