efficient / catbench

CATBench, the Intel Cache Allocation Technology benchmarking suite described in our tech report, "Simple Cache Partitioning for Networked Workloads"
Is this a problem? Would be easy to fix now...
The Python library's JSON implementation is probably at fault—unless we're forgetting to sort somewhere—but it's kind of annoying. For instance, when writing scripts that use arrcsv, one has to use a loop to ensure the output comes back in the expected order.
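For what it's worth, Python's json module emits keys in insertion order by default; if we want deterministic output from the dump sites, sort_keys=True would do it (a sketch, not the repo's actual code):

```python
import json

data = {"b": 2, "a": 1, "c": 3}
# Default: key order follows insertion order, so output varies with build order
print(json.dumps(data))
# Deterministic: keys sorted lexicographically at every dump site
print(json.dumps(data, sort_keys=True))  # {"a": 1, "b": 2, "c": 3}
```

That would spare downstream scripts the reordering loop, at the cost of touching every call site.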
So... 200 and up are not so hot. While we're at it, revert the addition of the multiplier argument to avg_data.pl.
This is necessary because some things (e.g. the base 64–encoded Kconfig) are too long to pass in as command-line arguments.
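As a sketch of the constraint (the kernel caps the combined size of argv plus the environment at ARG_MAX), streaming the payload over stdin instead of argv sidesteps the limit entirely; cat here is just a stand-in for the real consumer:

```python
import base64
import os
import subprocess

# A payload comfortably larger than many argv limits (~hundreds of KB of base64)
payload = base64.b64encode(os.urandom(300_000)).decode()

# Passing this as an argument risks E2BIG; piping it over stdin does not
result = subprocess.run(["cat"], input=payload, capture_output=True, text=True)
assert result.stdout == payload
```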
This prevents them from being used by the graphing script, so the axis labels aren't human-readable.
It disagrees with the comment, and only recording the first period appears nonsensical...
This would ensure we don't run into constant-factor troubles in the future.
Sanity check... or insanity check?
They're computed as contention/allocation, but since contention is the baseline in this case, we believe they should actually be 1-(allocation/contention), which is mathematically slightly different.
The existing ones are probably close enough for graphing purposes, but long-term, this needs to be fixed.
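With made-up numbers, the two formulas clearly diverge: say an allocation run finishes in 80 units against a contention baseline of 100:

```python
allocation, contention = 80.0, 100.0  # hypothetical timings

current = contention / allocation          # ratio of baseline to allocation
proposed = 1 - (allocation / contention)   # fraction of the baseline saved

print(current, proposed)  # 1.25 0.2
```

The first is a speedup ratio and the second a normalized improvement, so graphs built from one can't simply be relabeled as the other.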
since the latter action must be done as root. This means that ^C'ing the script can leave the server running.
What a silly sanity-check!
This could be rectified by supporting a command-line switch for exponents instead of percentages.
This is on 362318f atop master.
This appears to be a regression: we went from:
https://drive.google.com/a/andrew.cmu.edu/file/d/0Bx1O0RDCSN1rSDFYV2xZR09Kdmc/view and https://drive.google.com/a/andrew.cmu.edu/file/d/0Bx1O0RDCSN1rY1F3aGt1cXJUVzQ/view
to
https://drive.google.com/a/andrew.cmu.edu/file/d/0Bx1O0RDCSN1rY0NNR0xURWE0Mkk/view
https://drive.google.com/a/andrew.cmu.edu/file/d/0Bx1O0RDCSN1rNUx6RENCMF9LMUU/view
and
https://drive.google.com/a/andrew.cmu.edu/file/d/0Bx1O0RDCSN1rTDhNcVNkOTNPSFE/view
https://drive.google.com/a/andrew.cmu.edu/file/d/0Bx1O0RDCSN1rUHd1elFvWDhiRnM/view
We currently suspect:
commit 498b86b (refs/bisect/bad)
Author: Hyeontaek Lim <[email protected]>
Date: Fri Jul 29 01:48:53 2016 -0400
network_rtt: Control the run time of experiment stages explicitly
Otherwise, you need to tweak it when using contender_duration so that the baseline series aren't incomparable to the control and experimental ones.
Also please properly sort the legend
This makes it really hard to deal with these files because jq and Vim also tend to choke on them. The coreutils are mostly tolerant of them, but it's still unclear how to use them to extract the logs easily or quickly: base64 adds line breaks, which jaguar saves as escapes and normally interprets on the query side, but doing that efficiently with other tools is hard, and base64 -d doesn't handle escapes itself.
Then of course there's the obvious problem that our data extraction/processing scripts barf when jaguar does. Currently I've been working around this by replacing the network_rawnums/timescale_and_cdf superhero dream team with:
$ grep -F "Completed after:" ><temp 1>
$ cut -d" " -f4 <temp 1> ><temp 2>
$ tail -n+"$((1 + n * 30000000))" <temp 2> | head -n30000000 | tail -n20000000 | <...>
$ ./largescale_and_cdf <...>
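For reference, the tail/head pipeline above selects a window out of each 30 M-line block; in 0-based, half-open terms (my reading of the arithmetic, worth double-checking):

```python
BLOCK, KEEP = 30_000_000, 20_000_000

def window(n):
    """Line range kept by: tail -n+(1+n*BLOCK) | head -nBLOCK | tail -nKEEP."""
    start = n * BLOCK + (BLOCK - KEEP)  # the block's first 10M lines are dropped
    end = (n + 1) * BLOCK
    return start, end  # 0-based, half-open [start, end)

print(window(0))  # (10000000, 30000000)
```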
Its proximity to line 98 seems to mean that the lockserver may or may not be in the cgroup (albeit with the same mask) with the contenders. Breaks our control?
Repro:
$ ./square_evictions -n10000 -p1 -c100 -e100 -r -ia 16384 # runs normally
$ ./square_evictions -n10000 -p1 -c100 -e100 -r -ia 16385 # prints "Beginning {un,}saturated passes" ad infinitum
Then we wouldn't fail on e.g. 49 when you try to check it out from a different branch but forget to drag along template.json...
Currently, it just plods along working on the rest of the points for the next two hours, then calmly reports it couldn't shoehorn the data into the Jaguar file at the end.
I'm in ur results files, wastin ur time.
Might just work if you invoke it via the driver program, but try to figure out whether this logs enough info to be replicable.
Also, do we want to commit usleep?
Setting the trash working set size to 102400 (1 GB) on dog results in the first trash process—the one sharing a core with the lockserver, incidentally—never printing ENDLAP002 for some data points! Does it crash? Is it just running extremely slowly, and does waitforexit fail to actually do its job? (Although in the latter case, wouldn't we see it printed later on, during some other trial?)
There doesn't seem to be enough info in the logs to discern...
It took me a while to figure out that the sole problem was a missing linux-4.7.tar.gz. :(
This will break lookups of the command-line arguments' measurement units, among other things. This may need to be done differently depending on the decision made in #2
Where's our race...?
There are a whole lot of function variables that aren't declared local, and some of our conditionals could probably still use unsets at the end.
In recent runs on the cachenice_mica2 branch, the contention case was exactly the same as the allocation one! This bug was introduced by commit 6754023 on line 267 of network_rtt: notice how the line that launches contention is identical to the one for allocation.
To quickly tell whether a Jaguar file $file is afflicted, you can run:
$ git log --oneline "`jaguar/jaguar get "$file" meta.commit | cut -d" " -f1`" | grep 6754023 || jaguar/jaguar get "$file" meta.patch | base64 -d | grep '^+.recordtrial contention [^ ]\+ [^ ]\+ [^ ]'
No output/nonzero status means you're good; otherwise, you probably can't trust the contender numbers from that run.
This is on 362318f atop master. We're probably going to have to examine the data more closely and figure out what could be causing the variance.
We need to autodetect the number of ways on the machine.
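One way to autodetect this on kernels with resctrl support (an assumption; the path below only exists when resctrl is mounted) is to count the set bits in the L3 capacity bitmask:

```python
def cat_ways(cbm_mask_hex: str) -> int:
    """Number of cache ways = population count of the capacity bitmask."""
    return bin(int(cbm_mask_hex, 16)).count("1")

# e.g. read the mask from /sys/fs/resctrl/info/L3/cbm_mask on a
# resctrl-enabled kernel; a 20-way LLC reports "fffff":
print(cat_ways("fffff"))  # 20
```

On machines without resctrl we'd have to fall back to CPUID or the CAT MSRs directly.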
to prevent them from causing unnecessary context switches. However, Perf is probably a much bigger concern.