
googlecloudplatform / perfkitbenchmarker


PerfKit Benchmarker (PKB) contains a set of benchmarks to measure and compare cloud offerings. The benchmarks use default settings to reflect what most users will see. PerfKit Benchmarker is licensed under the Apache 2 license terms. Please make sure to read, understand and agree to the terms of the LICENSE and CONTRIBUTING files before proceeding.

Home Page: https://googlecloudplatform.github.io/PerfKitBenchmarker/

License: Apache License 2.0

Shell 1.04% Python 94.77% Lua 1.11% HTML 1.52% Dockerfile 0.07% R 0.01% Jinja 0.95% Tcl 0.20% PowerShell 0.32% BitBake 0.01%

perfkitbenchmarker's People

Contributors

andyz422, asaksena, bvliu, cmccoy, cwilkes, dlott, dorellang, dphanekham, ehankland, flint-dominic, gareth-ferneyhough, hildrum, jellyfishcake, jerlawson, kivio, klausw, mateusz-blaszkowski, meteorfox, nathanteeuwen, noahl, pmkc, raymond13513, s-deitz, shyamsabhaya, skschneider, stfeng2, tedsta, tohaowu, voellm, yuyantingzero


perfkitbenchmarker's Issues

PerfKit Benchmarker is failing on new AWS C4 instance types

It looks like the issue is related to the lookup of disks in aws_virtual_machine.py

When looking up NUM_LOCAL_VOLUMES, the code should default to 0 rather than throw an exception.

The quick fix is to add the C4 types to NUM_LOCAL_VOLUMES. The complete fix is to also update all the places that use NUM_LOCAL_VOLUMES to handle the case of an instance type that is not listed.

My plan is to fix the first part of adding the new instance types. Eric is planning on doing the default to 0 fix.
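A minimal sketch of the "default to 0" part of the fix (the dictionary name follows the issue, but the surrounding code and sample entries are illustrative, not the real aws_virtual_machine.py):

```python
# Illustrative local-volume table; real entries live in aws_virtual_machine.py.
NUM_LOCAL_VOLUMES = {
    'm3.medium': 1,
    'm3.large': 1,
    # C4 types intentionally absent: they have no instance-store disks.
}


def GetNumLocalVolumes(machine_type):
    """Return the local volume count, defaulting to 0 for unlisted types."""
    return NUM_LOCAL_VOLUMES.get(machine_type, 0)
```

With `dict.get` every unlisted type, C4 included, yields 0 local volumes instead of a KeyError.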

Add a "best practices" check

Today each vendor has a set of best practices like:

  • Using NOOP for the block scheduler
  • Turning on THP
  • Turning off CONNTRACK
  • etc...

As part of running the benchmarks it would be good to have a feature that verifies whether best practices have been applied. Mostly this would be a set of checks, and the results could be published as metadata.
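A hypothetical sketch of such a verifier; the check commands, the expected markers, and the `run_command` callable are assumptions, not the real PKB API:

```python
def CheckBestPractices(run_command):
    """Run each check command and return {check_name: passed} metadata."""
    checks = {
        # (command to run on the VM, marker expected in its output)
        'noop_scheduler': ('cat /sys/block/sda/queue/scheduler', '[noop]'),
        'thp_enabled': ('cat /sys/kernel/mm/transparent_hugepage/enabled',
                        '[always]'),
    }
    return {name: expected in run_command(cmd)
            for name, (cmd, expected) in checks.items()}
```

The resulting dict could be attached to each sample as metadata so results can be filtered by whether best practices were in effect.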

Cassandra Benchmark Missing Quotes

Test Provider: Google Compute Engine
Test VM: n1-standard-1 (by default)
The file: cassandra_benchmark.py
Affected Lines: 145, 148, 151, and 154

In the function ConfigureCassandraEnvScript(), the benchmark tries to invoke the following line of code:

echo "JVM_OPTS=\"\\$JVM_OPTS -XX:+UseLargePages\""

But the input received by the VM is:

JVM_OPTS=$JVM_OPTS -XX:+UseLargePages

instead of what we assume it should be, which is:

JVM_OPTS="$JVM_OPTS -XX:+UseLargePages"

It will still continue to run, but the shell interprets -XX as a command and says

-XX: command not found
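A sketch of one way to keep the inner quotes intact across the remote shell: build the target line first, then quote the whole thing with `shlex.quote` (the `echo`/append context is illustrative; cassandra_benchmark.py's script generation may need different escaping):

```python
import shlex

# The line we want to land verbatim in cassandra-env.sh:
line = 'JVM_OPTS="$JVM_OPTS -XX:+UseLargePages"'

# shlex.quote wraps the whole line in single quotes, so the double quotes
# and $ survive the remote shell untouched.
command = 'echo %s >> cassandra-env.sh' % shlex.quote(line)
```

Hand-built backslash escaping (as in the current `\"\\$JVM_OPTS` approach) has to survive two layers of interpretation, which is exactly where this bug crept in.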

Outdated/broken image on Azure

PerfKitBenchmarker has not updated the images used by its VMs for a while, causing some software installation problems, especially on Azure.

When trying to install openjdk-7-jdk, I get the following errors.

2015-01-27 17:26:36,504 ERROR Got exception running InstallPackages: Got non-zero return code (100) executing sudo DEBIAN_FRONTEND='noninteractive' /usr/bin/apt-get -y install openjdk-7-jdk
Full command: /usr/bin/ssh -A -p 22 [email protected] -2 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o IdentitiesOnly=yes -o PreferredAuthentications=publickey -o PasswordAuthentication=no -o ConnectTimeout=5 -o GSSAPIAuthentication=no -o ServerAliveInterval=30 -o ServerAliveCountMax=10 -i /tmp/perfkitbenchmarker/run_fdf1048d/perfkitbenchmarker_keyfile sudo DEBIAN_FRONTEND='noninteractive' /usr/bin/apt-get -y install openjdk-7-jdk
STDOUT: Reading package lists...
Building dependency tree...
Reading state information...
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
openjdk-7-jdk : Depends: openjdk-7-jre (= 7u71-2.5.3-0ubuntu0.14.04.1) but it is not going to be installed
Recommends: libxt-dev but it is not installable
STDERR: Warning: Permanently added '191.237.18.89' (ECDSA) to the list of known hosts.
E: Unable to correct problems, you have held broken packages.

Create the --benchmarks=<your-name> set processing

At the last community meeting we talked about adding named sets of benchmarks for companies that are going to be named in the announcement post. Time to add the feature to support named sets.

The feature will be in two parts.

  1. Part one is a simple addition of a dictionary mapping the name to a set of benchmarks.
  2. Part two, which won't be implemented until we clean up config files, is the ability to override default flags and settings to further customize how things run.

The current "all" should actually become a superset of all named sets, and a new "standard/official/community" set should be created.

Any preference on "standard" or "official" or "community" as the default set that runs?
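Part one could be sketched as a plain dictionary; the set names and members below are made up to show the shape, with "all" built as the superset of every named set:

```python
# Hypothetical named benchmark sets; real names/members TBD.
BENCHMARK_SETS = {
    'standard': {'iperf', 'fio', 'cluster_boot'},
    'examplecorp': {'iperf', 'cassandra'},  # hypothetical vendor set
}

# "all" becomes the superset of every named set.
BENCHMARK_SETS['all'] = set().union(*BENCHMARK_SETS.values())
```

Looking up `--benchmarks=<name>` then reduces to a single dict lookup, and adding a vendor set is a one-line change.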

Add managed NoSQL benchmarks

Today we have lots of coverage for VMs and Cloud Storage. We should extend to managed NoSQL like Google Cloud Platform (GCP) Datastore, AWS SimpleDB/DynamoDB, Azure [Redis] Cache, etc.

We should add a standard comment block to each Benchmark to detail exactly how it publishes data.

Today we have a schema doc that is not broadly shared, and reading the code to figure out the metrics published is slow.

We should create a comment block in each benchmark file detailing how it publishes. In particular we should detail:
test - the name of the benchmark
owner - who is running the numbers
labels - a '|'-separated list of extra metadata to understand the data point. There are some common labels like 'zone' and 'machine_type'. We should detail all of them for the given benchmark.
metric - what is being reported
units - units of the metric. There is a pretty standard list here that we should not expand unless really needed.
value - the number being reported.

Concerns with assumptions and limitations of UnixBench

I have several concerns with the current usage of UnixBench. I know that most people seem to run UnixBench 'as is' with whatever defaults are set, but I believe this will not show a proper comparison between systems (i.e. OS versions, compiler versions, libc versions, etc.).

Here's why:

  • Doesn't scale beyond 16 CPUs; at least not by default
    • This issue has been known for a while and a patch is available. For some reason, when UnixBench was written, a hard-coded limit (16) was set for the System and Non-Index benchmark suites. Obviously, nowadays we have many more CPUs than that on a single machine.
    • This means that running UnixBench on an instance type with 32 CPUs will give scores approximately the same as a 16-CPU instance type, assuming they use the same processors, of course.
  • GCC compiler options
    • The UnixBench Makefile by default assumes a Solaris OS, even when the flags for Linux are commented out, although it assumes a Pentium ISA.
    • Compiling with different flags is known to have huge impacts on the performance of a program; you can tweak things from unrolling loops to inlining functions, but more concerning, it's been shown recently that even linking order and environment variables can impact performance significantly.
    • This is a little trickier to solve, but perhaps one way to mitigate it is to statically compile binaries for specific architectures; this way the benchmark will always run the same number of instructions and be linked the same way.
  • GCC versions!
    • Brendan Gregg wrote this great post showing how wildly UnixBench can vary by just changing GCC versions. I'm just going to link his blog post here, which explains it better and was the inspiration for this issue.

What is your take on these issues?

Consider moving the SPEC CPU runs to use LongRunningRemoteCommand

For commands that take hours to days, issues can arise when connection failures happen. Today SPEC runs over an active connection, whereas HPCC uses nohup and logs out.

We should consider moving

vm.RemoteCommand('cd %s; . ./shrc; ./bin/relocate; . ./shrc; rm -rf result; '
                 'runspec --config=linux64-x64-gcc47.cfg --tune=base '
                 '--size=ref --noreportable -rate %s %s '
                 % (vm.spec_dir, num_cpus, FLAGS.benchmark_subset))

to use vm.LongRunningRemoteCommand.
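A hedged sketch of what such a long-running wrapper could do (the real LongRunningRemoteCommand may work differently): detach the command with nohup and redirect all I/O so it survives an SSH disconnect:

```python
import shlex


def WrapLongRunning(command, logfile='nohup.log'):
    """Return a shell line that runs `command` detached from the session.

    Redirecting stdin/stdout/stderr and backgrounding with & lets the
    process keep running if the controlling SSH connection drops.
    """
    return 'nohup sh -c %s > %s 2>&1 < /dev/null &' % (
        shlex.quote(command), logfile)
```

The caller would then poll for completion (e.g. via a pid file or the log) instead of holding a connection open for hours.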

Error running object_storage_service on GCP

I am able to run gsutil mb gs://bucketname from my MacBook, but the benchmark fails with the error below.
Any help/hint to solve this would be great.

./pkb.py --project=iron-potion-771 --benchmarks=object_storage_service

2015-02-17 23:18:30,533 vm_util.py:335 INFO Running: /usr/bin/ssh -A -p 22 [email protected] -2 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o IdentitiesOnly=yes -o PreferredAuthentications=publickey -o PasswordAuthentication=no -o ConnectTimeout=5 -o GSSAPIAuthentication=no -o ServerAliveInterval=30 -o ServerAliveCountMax=10 -i /tmp/perfkitbenchmarker/run_076855d6/perfkitbenchmarker_keyfile /home/perfkit/google-cloud-sdk/bin/gsutil mb gs://pkb076855d6
2015-02-17 23:18:32,714 vm_util.py:349 DEBUG Ran /usr/bin/ssh -A -p 22 [email protected] -2 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o IdentitiesOnly=yes -o PreferredAuthentications=publickey -o PasswordAuthentication=no -o ConnectTimeout=5 -o GSSAPIAuthentication=no -o ServerAliveInterval=30 -o ServerAliveCountMax=10 -i /tmp/perfkitbenchmarker/run_076855d6/perfkitbenchmarker_keyfile /home/perfkit/google-cloud-sdk/bin/gsutil mb gs://pkb076855d6. Got return code (1). STDOUT: STDERR: Warning: Permanently added '130.211.177.246' (RSA) to the list of known hosts.
Creating gs://pkb076855d6/...
BadRequestException: 400 Invalid argument.

Azure: Cloud Service stood up for every VM?

I'm running PerfkitBenchmarker against Microsoft Azure, and I'm noticing the system creates a cloud service for every VM it stands up. This is not a realistic use of the cloud service. In Azure, Cloud Service is similar to a GCE "Project." In fact, I would suggest that the behavior against Azure be modified such that if the user specifies a --project parameter at runtime, that you use it just like you would for GCP - to put all the VMs for the benchmarks into a cloud service with that name. Otherwise, I would recommend that a cloud service only be stood up against the run-id, and not for every VM.

illegal escape sequence "\" from sshKeys causes creation of GceVirtualMachine to fail

./pkb.py --project=elite-firefly-786 --benchmarks=iperf --machine_type=f1-micro
ERROR:root:Got exception running _CreateResource: Creation of GceVirtualMachine failed.
Cause: pkb test fails to create a new instance due to illegal escape sequence character:
Ran gcloud compute instances create perfkit-04bdc6d2-1 --disk name=perfkit-04bdc6d2-1 boot=yes mode=rw --machine-type f1-micro --tags=perfkitbenchmarker --maintenance-policy TERMINATE --metadata sshKeys=perfkit:ssh-rsa\ AAA...RVHn\ root@instance-1 owner=root --project elite-firefly-786 --format json --quiet --zone us-central1-a.
Got return code (2). STDOUT: STDERR: usage: gcloud compute instances create NAME [NAME ...] [optional flags]ERROR: (gcloud.compute.instances.create) argument --metadata: illegal escape sequence at index 23: sshKeys=perfkit:ssh-rsa\

Error during benchmark iperf

Receiving the following error while running PerfKitBenchmarker iperf.Please advise on the resolution

ubuntu@ip-172-30-0-134:/GPerfKitB/PerfKitBenchmarker$ ./pkb.py --cloud=AWS --benchmarks=iperf --machine_type=t1.micro
2015-02-16 05:11:09,800 INFO Verbose logging to: /tmp/perfkitbenchmarker/run_569a6928/pkb.log
2015-02-16 05:11:09,910 INFO Preparing benchmark iperf
2015-02-16 05:11:09,911 INFO Running: aws ec2 create-vpc --region=us-east-1 --cidr-block=10.0.0.0/16
2015-02-16 05:11:10,426 ERROR Error during benchmark iperf
Traceback (most recent call last):
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/pkb.py", line 228, in RunBenchmark
benchmark_specification.Prepare()
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/benchmark_spec.py", line 185, in Prepare
vm_util.RunThreaded(self.PrepareNetwork, prepare_args)
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/vm_util.py", line 249, in RunThreaded
'\n'.join([stacktrace for stacktrace in exceptions]))
ThreadException: The following exceptions occurred during threaded execution: Traceback (most recent call last):
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/vm_util.py", line 236, in RunThreaded
thread.join(1000) # Set timeout so that join is interruptable.
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/vm_util.py", line 179, in join
raise errors.VmUtil.ThreadException(self.exception)
ThreadException: Exception occured in thread 139723530974976:
Traceback (most recent call last):
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/vm_util.py", line 166, in run
self.RunWithExceptions()
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/vm_util.py", line 173, in RunWithExceptions
super(ThreadWithExceptions, self).run()
File "/usr/lib/python2.7/threading.py", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/benchmark_spec.py", line 202, in PrepareNetwork
network.Create()
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/aws/aws_network.py", line 339, in Create
self.vpc.Create()
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/resource.py", line 113, in Create
self._CreateResource()
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/vm_util.py", line 306, in WrappedFunction
return f(*args, **kwargs)
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/resource.py", line 87, in _CreateResource
self._Create()
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/aws/aws_network.py", line 104, in _Create
response = json.loads(stdout)
File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python2.7/json/decoder.py", line 384, in raw_decode
raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

2015-02-16 05:11:10,428 INFO Running: aws ec2 --region=us-east-1 delete-key-pair --key-name=perfkit-key-569a6928
Traceback (most recent call last):
File "./pkb.py", line 21, in <module>
sys.exit(Main())
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/pkb.py", line 358, in Main
return RunBenchmarks()
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/pkb.py", line 324, in RunBenchmarks
RunBenchmark(benchmark, collector)
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/pkb.py", line 228, in RunBenchmark
benchmark_specification.Prepare()
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/benchmark_spec.py", line 185, in Prepare
vm_util.RunThreaded(self.PrepareNetwork, prepare_args)
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/vm_util.py", line 249, in RunThreaded
'\n'.join([stacktrace for stacktrace in exceptions]))
perfkitbenchmarker.errors.ThreadException: The following exceptions occurred during threaded execution: Traceback (most recent call last):
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/vm_util.py", line 236, in RunThreaded
thread.join(1000) # Set timeout so that join is interruptable.
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/vm_util.py", line 179, in join
raise errors.VmUtil.ThreadException(self.exception)
ThreadException: Exception occured in thread 139723530974976:
Traceback (most recent call last):
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/vm_util.py", line 166, in run
self.RunWithExceptions()
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/vm_util.py", line 173, in RunWithExceptions
super(ThreadWithExceptions, self).run()
File "/usr/lib/python2.7/threading.py", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/benchmark_spec.py", line 202, in PrepareNetwork
network.Create()
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/aws/aws_network.py", line 339, in Create
self.vpc.Create()
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/resource.py", line 113, in Create
self._CreateResource()
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/vm_util.py", line 306, in WrappedFunction
return f(*args, **kwargs)
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/resource.py", line 87, in _CreateResource
self._Create()
File "/home/ubuntu/GPerfKitB/PerfKitBenchmarker/perfkitbenchmarker/aws/aws_network.py", line 104, in _Create
response = json.loads(stdout)
File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python2.7/json/decoder.py", line 384, in raw_decode
raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

Add Facebook's LinkBench benchmark

Overview taken from the LinkBench repo:

LinkBench is a database benchmark developed to evaluate database performance for workloads similar to those of Facebook's production MySQL deployment. LinkBench is highly configurable and extensible. It can be reconfigured to simulate a variety of workloads and plugins can be written for benchmarking additional database systems.

LinkBench is released under the Apache License, Version 2.0.

GCP: Only 4 of 9 VMs spawn on Hadoop Terasort benchmark

I'm using an f1-micro instance on GCP as my orchestrator for running pkb.

When I execute the Hadoop Terasort benchmark against an n1-standard-4 machine type, I see 9 disks successfully created, but then only 4 VMs successfully start. The other 5 continue retrying until timeout, and pkb tears down the resources and never executes the benchmark. It's a different 4 each time... on one run, -1, -3, -4, -7, and -8 were missing... but in another run -1, -2, -5, -6, and -7 were missing. That said, it always seems to be the same number of instances that end up missing.

I've only tried this against GCP in us-central1-a.

I see "Got exception running _CreateResource: Creation of DebianBasedGceVirtualMachine failed" for the instances that don't start.

Any ideas why this is happening?

Azure: Perfkit VNets overwrite ExpressRoute settings

I have a service account that runs ExpressRoute for connectivity to my corporate network. I decided to run PerfKitExplorer to test out various Azure VM Sizes and Regions. My assumption was that since PerfkitBenchmarker was setting up its own affinity groups and VNets, that I'd have complete isolation. Unfortunately, it appears that when the network configuration file is being parsed to add new virtual networks to the service account, PerfkitBenchmarker is dropping the ExpressRoute configuration information off... so when it stands up new virtual networks, it destroys ExpressRoute connectivity (and takes out corporate connectivity to Azure resources as a consequence) :-(

Broken pipe error when running object_storage_service benchmark

Hi,
Has anyone seen this error, Broken pipe, during benchmark object_storage_service?
./pkb.py --project=elite-firefly-786 --benchmarks=object_storage_service --machine_type=n1-standard-2
pkb.log
...
2015-02-25 21:38:46,569 vm_util.py:335 INFO Running: /usr/bin/ssh -A -p 22 [email protected] -2 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o IdentitiesOnly=yes -o PreferredAuthentications=publickey -o PasswordAuthentication=no -o ConnectTimeout=5 -o GSSAPIAuthentication=no -o ServerAliveInterval=30 -o ServerAliveCountMax=10 -i /tmp/perfkitbenchmarker/run_fcd580f8/perfkitbenchmarker_keyfile /scratch0/run/object_storage_api_tests.py --bucket=pkbfcd580f8 --storage_provider=GCS --iterations=200 --scenario=ListConsistency
2015-02-26 01:34:05,081 vm_util.py:349 DEBUG Ran /usr/bin/ssh -A -p 22 [email protected] -2 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o IdentitiesOnly=yes -o PreferredAuthentications=publickey -o PasswordAuthentication=no -o ConnectTimeout=5 -o GSSAPIAuthentication=no -o ServerAliveInterval=30 -o ServerAliveCountMax=10 -i /tmp/perfkitbenchmarker/run_fcd580f8/perfkitbenchmarker_keyfile /scratch0/run/object_storage_api_tests.py --bucket=pkbfcd580f8 --storage_provider=GCS --iterations=200 --scenario=ListConsistency. Got return code (0). STDOUT: STDERR: Warning: Permanently added '104.154.77.220' (ECDSA) to the list of known hosts.^M
…
Final stats:
INFO:root:list-after-write consistency percentage: 87.000000
INFO:root:list-after-write inconsistency window: {"average": 39.82011866569519, "p1": 1.0820820331573486, "p5": 1.132835865020752, "p50": 5.805269002914429, "p90": 134.60191202163696, "p99": 283.1713089942932, "p99.9": 283.1713089942932, "stddev": 73.03022897677377}
INFO:root:list-after-write latency: {"average": 1.2659037058380829, "p1": 0.9758129119873047, "p5": 1.0399689674377441, "p50": 1.1664412021636963, "p90": 1.4518799781799316, "p99": 5.052584886550903, "p99.9": 5.406584978103638, "stddev": 0.4693329095847918}
INFO:root:list-after-update consistency percentage: 89.000000
INFO:root:list-after-update inconsistency window: {"average": 21.084003946997903, "p1": 1.2440087795257568, "p5": 1.4276509284973145, "p50": 8.543776035308838, "p90": 53.372872829437256, "p99": 147.42951703071594, "p99.9": 147.42951703071594, "stddev": 34.4644712034284}
INFO:root:list-after-update latency: {"average": 1.263827731100361, "p1": 0.9965829849243164, "p5": 1.017981767654419, "p50": 1.1851608753204346, "p90": 1.4783661365509033, "p99": 5.225747108459473, "p99.9": 5.397141933441162, "stddev": 0.4733233401719766}

2015-02-26 01:34:05,095 pkb.py:256 ERROR Error during benchmark object_storage_service
Traceback (most recent call last):
File "/root/PerfKitBenchmarker/perfkitbenchmarker/pkb.py", line 238, in RunBenchmark
samples = benchmark.Run(benchmark_specification)
File "/root/PerfKitBenchmarker/perfkitbenchmarker/benchmarks/object_storage_service_benchmark.py", line 806, in Run
print results
IOError: [Errno 32] Broken pipe
2015-02-26 01:34:05,096 vm_util.py:335 INFO Running: gcloud compute instances delete perfkit-fcd580f8-0 --project elite-firefly-786 --format json --quiet --zone us-central1-a
2015-02-26 01:34:37,261 vm_util.py:349 DEBUG Ran gcloud compute instances delete perfkit-fcd580f8-0 --project elite-firefly-786 --format json --quiet --zone us-central1-a. Got return code (0). STDOUT: []
STDERR: Deleted [https://www.googleapis.com/compute/v1/projects/elite-firefly-786/zones/us-central1-a/instances/perfkit-fcd580f8-0].

Variable BENCHMARK_INFO and name not consistently defined

  1. Some of the *_benchmark.py files have the variable BENCHMARKS_INFO while other files have BENCHMARK_INFO (singular). For example,
    iperf_benchmark.py:BENCHMARKS_INFO = {'name': 'iperf'
    hadoop_benchmark.py:BENCHMARK_INFO = {'name': 'hadoop_benchmark'
  2. Also, the benchmark names themselves are not consistent. Some are appended with _benchmark and some are not.
  3. Is there a command that lists out all the proper benchmark names that can be given to pkb.py via the --benchmarks= flag?

The cluster boot benchmark should run the num_cpus function in parallel

The cluster boot benchmark has the following code:

for vm in vms:
  metadata = {'machine_type': vm.machine_type, 'num_cpus': vm.num_cpus,
              'machine_instance': vm_number}
  value = vm.TimeToBoot()

This looks great until you realize vm.num_cpus is a method on the virtual machine, which in turn calls RemoteCommand, leading to an SSH. When a large number of VMs boot, the result is a long series of serially run SSHs to each VM. This could be done a lot faster by moving the code into a method and then using RunThreaded.
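The idea can be sketched with stdlib threads standing in for PKB's RunThreaded helper (whose real signature may differ): fetch num_cpus for all VMs concurrently instead of one SSH at a time.

```python
from concurrent.futures import ThreadPoolExecutor


def CollectNumCpus(vms):
    """Return [vm.num_cpus for vm in vms], fetched in parallel.

    Each num_cpus access triggers a remote command, so running them in a
    thread pool turns N serial SSH round-trips into roughly one.
    """
    with ThreadPoolExecutor(max_workers=len(vms)) as pool:
        return list(pool.map(lambda vm: vm.num_cpus, vms))
```

For 100 VMs this is roughly one SSH round-trip of wall time instead of 100.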

Cassandra Benchmark VM Provisioning Problem Due to Exceeded Quota

Just noted as an FYI, if you already have other VMs running in your Google Project and you try to run Cassandra with the included config file, you might hit resource limitation issues, as it requires 4 n1-standard-8 virtual machines. If you try to run the script, it will just return the following error message:

ERROR:root:Got exception running _CreateResource: Creation of GceVirtualMachine failed.

and retry the VM build. And again.

So if you are running into VM provisioning issues on some tests like Cassandra, but the VMs were provisioning just fine a while ago for other tests, then it might be a resource limitation you're bumping into.

allow scratch space to be an arbitrary directory

It would be useful to be able to run PerfKitBenchmarker on VMs/containers/physical-servers where there isn't a dedicated mount point for the scratch space, or where creating a mountpoint per service is not supported.

Currently, trying to run fio with "scratch_disk_mountpoints": "/tmp/google-pkb" results in:
Traceback (most recent call last):
File "./pkb.py", line 21, in <module>
sys.exit(Main())
File "/home/csb/prog/github/PerfKitBenchmarker/perfkitbenchmarker/pkb.py", line 358, in Main
return RunBenchmarks()
File "/home/csb/prog/github/PerfKitBenchmarker/perfkitbenchmarker/pkb.py", line 324, in RunBenchmarks
RunBenchmark(benchmark, collector)
File "/home/csb/prog/github/PerfKitBenchmarker/perfkitbenchmarker/pkb.py", line 229, in RunBenchmark
benchmark.Prepare(benchmark_specification)
File "/home/csb/prog/github/PerfKitBenchmarker/perfkitbenchmarker/benchmarks/fio_benchmark.py", line 65, in Prepare
disk_size_kb = vm.GetDeviceSizeFromPath(vm.GetScratchDir())
File "/home/csb/prog/github/PerfKitBenchmarker/perfkitbenchmarker/virtual_machine.py", line 405, in GetDeviceSizeFromPath
return int(stdout)
ValueError: invalid literal for int() with base 10: '210411904 16419404 '


Omitting the setting results in the only marginally more helpful error message:

Traceback (most recent call last):
File "./pkb.py", line 21, in <module>
sys.exit(Main())
File "/home/csb/prog/github/PerfKitBenchmarker/perfkitbenchmarker/pkb.py", line 358, in Main
return RunBenchmarks()
File "/home/csb/prog/github/PerfKitBenchmarker/perfkitbenchmarker/pkb.py", line 324, in RunBenchmarks
RunBenchmark(benchmark, collector)
File "/home/csb/prog/github/PerfKitBenchmarker/perfkitbenchmarker/pkb.py", line 229, in RunBenchmark
benchmark.Prepare(benchmark_specification)
File "/home/csb/prog/github/PerfKitBenchmarker/perfkitbenchmarker/benchmarks/fio_benchmark.py", line 65, in Prepare
disk_size_kb = vm.GetDeviceSizeFromPath(vm.GetScratchDir())
File "/home/csb/prog/github/PerfKitBenchmarker/perfkitbenchmarker/static_virtual_machine.py", line 91, in GetScratchDir
scratch_dir = self.scratch_disk_mountpoints[disk_num]
IndexError: list index out of range
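The first traceback's `int()` failure suggests GetDeviceSizeFromPath is int()-ing a multi-field string. A sketch of a more tolerant parse (the real fix in virtual_machine.py may differ):

```python
def ParseDeviceSizeKb(stdout):
    """Extract the leading size field from df-style output.

    The failing input was '210411904 16419404 '; int() on the whole
    string raises ValueError, so take the first whitespace-separated
    field instead.
    """
    return int(stdout.split()[0])
```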

No way to specify SSD disks on GCE

In AWS and Azure, the type of storage (standard vs. SSD) is baked into the instance type. On GCE, you specify the type of disk you want to back your VM at creation time. There needs to be a parameter for running GCE benchmarks backed by SSD disks.
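A sketch of plumbing a hypothetical disk-type parameter into the disk-creation command; the function and parameter names are assumptions, though `--type pd-ssd` is the real gcloud option for SSD persistent disks:

```python
def BuildDiskCreateCmd(name, size_gb, disk_type='pd-ssd'):
    """Build a `gcloud compute disks create` argv for the given disk type."""
    return ['gcloud', 'compute', 'disks', 'create', name,
            '--size', '%dGB' % size_gb, '--type', disk_type]
```

A PKB flag (e.g. something like a scratch-disk-type option) could default to `pd-standard` and pass `pd-ssd` through here when requested.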

Error resource not found when running cassandra benchmark

root@instance-1:~/PerfKitBenchmarker# ./pkb.py --project=elite-firefly-786 --benchmarks=cassandra --machine_type=n1-standard-2
2015-01-12 22:56:44,931 vm_util.py:344 DEBUG Ran gcloud compute instances describe perfkit-8da3b103-3 --project elite-firefly-786 --format json --quiet --zone us-central1-a. Got return code (1). STDOUT: STDERR: ERROR: (gcloud.compute.instances.describe) Could not fetch resource:

  • The resource 'projects/elite-firefly-786/zones/us-central1-a/instances/perfkit-8da3b103-3' was not found
    ...
    2015-01-12 22:56:21,184 INFO Authorizing loader[0] permission to access all other vms.
    2015-01-12 22:56:21,184 INFO Running: /usr/bin/scp -P 22 -pr -2 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o IdentitiesOnly=yes -o PreferredAuthentications=publickey -o PasswordAuthentication=no -o ConnectTimeout=5 -o GSSAPIAuthentication=no -o ServerAliveInterval=30 -o ServerAliveCountMax=10 -i /tmp/perfkitbenchmarker/run_8da3b103/perfkitbenchmarker_keyfile /tmp/perfkitbenchmarker/run_8da3b103/perfkitbenchmarker_keyfile [email protected]:.ssh/id_rsa
    2015-01-12 22:56:21,502 INFO Preparing data files and Java on all vms.
    2015-01-12 22:56:21,503 WARNING File resource loader root data is not a directory.
    2015-01-12 22:56:21,505 WARNING File resource loader root data is not a directory.
    2015-01-12 22:56:21,505 WARNING File resource loader root data is not a directory.
    2015-01-12 22:56:21,506 WARNING File resource loader root data is not a directory.
    2015-01-12 22:56:21,507 ERROR Error during benchmark cassandra
    Traceback (most recent call last):
    File "/root/PerfKitBenchmarker/perfkitbenchmarker/pkb.py", line 219, in RunBenchmark
    benchmark.Prepare(benchmark_specification)
    File "/root/PerfKitBenchmarker/perfkitbenchmarker/benchmarks/cassandra_benchmark.py", line 116, in Prepare
    vm_util.RunThreaded(PrepareVm, benchmark_spec.vms)
    File "/root/PerfKitBenchmarker/perfkitbenchmarker/vm_util.py", line 244, in RunThreaded
    '\n'.join([stacktrace for stacktrace in exceptions]))
    ThreadException: The following exceptions occurred during threaded execution: Traceback (most recent call last):
    File "/root/PerfKitBenchmarker/perfkitbenchmarker/vm_util.py", line 231, in RunThreaded
    thread.join(1000) # Set timeout so that join is interruptable.
    File "/root/PerfKitBenchmarker/perfkitbenchmarker/vm_util.py", line 174, in join
    raise errors.VmUtil.ThreadException(self.exception)
    ThreadException: Exception occured in thread 140086647011072:
    Traceback (most recent call last):
    File "/root/PerfKitBenchmarker/perfkitbenchmarker/vm_util.py", line 161, in run
    self.RunWithExceptions()
    File "/root/PerfKitBenchmarker/perfkitbenchmarker/vm_util.py", line 168, in RunWithExceptions
    super(ThreadWithExceptions, self).run()
    File "/usr/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
    File "/root/PerfKitBenchmarker/perfkitbenchmarker/benchmarks/cassandra_benchmark.py", line 85, in PrepareVm
    vm.PushDataFile(CASSANDRA_TAR)
    File "/root/PerfKitBenchmarker/perfkitbenchmarker/virtual_machine.py", line 412, in PushDataFile
    file_path = data.ResourcePath(data_file)
    File "/root/PerfKitBenchmarker/perfkitbenchmarker/data/__init__.py", line 166, in ResourcePath
    '{0} (Searched: {1})'.format(resource_name, loaders))
    ResourceNotFound: dsc.tar.gz (Searched: [, , ])

2015-01-12 22:56:21,508 INFO Running: gcloud compute instances delete perfkit-8da3b103-3 --project elite-firefly-786 --format json --quiet --zone us-central1-a
2015-01-12 22:56:21,510 INFO Running: gcloud compute instances delete perfkit-8da3b103-2 --project elite-firefly-786 --format json --quiet --zone us-central1-a
2015-01-12 22:56:21,513 INFO Running: gcloud compute instances delete perfkit-8da3b103-1 --project elite-firefly-786 --format json --quiet --zone us-central1-a
2015-01-12 22:56:21,514 INFO Running: gcloud compute instances delete perfkit-8da3b103-0 --project elite-firefly-786 --format json --quiet --zone us-central1-a
2015-01-12 22:56:43,988 INFO Running: gcloud compute instances describe perfkit-8da3b103-3 --project elite-firefly-786 --format json --quiet --zone us-central1-a
2015-01-12 22:56:44,030 INFO Running: gcloud compute instances describe perfkit-8da3b103-1 --project elite-firefly-786 --format json --quiet --zone us-central1-a
2015-01-12 22:56:44,932 INFO Running: gcloud compute disks delete perfkit-8da3b103-3 --project elite-firefly-786 --format json --quiet --zone us-central1-a
2015-01-12 22:56:45,012 INFO Running: gcloud compute disks delete perfkit-8da3b103-1 --project elite-firefly-786 --format json --quiet --zone us-central1-a
2015-01-12 22:56:49,132 INFO Running: gcloud compute instances describe perfkit-8da3b103-0 --project elite-firefly-786 --format json --quiet --zone us-central1-a
2015-01-12 22:56:49,268 INFO Running: gcloud compute instances describe perfkit-8da3b103-2 --project elite-firefly-786 --format json --quiet --zone us-central1-a
2015-01-12 22:56:50,064 INFO Running: gcloud compute disks delete perfkit-8da3b103-0 --project elite-firefly-786 --format json --quiet --zone us-central1-a
2015-01-12 22:56:50,196 INFO Running: gcloud compute disks delete perfkit-8da3b103-2 --project elite-firefly-786 --format json --quiet --zone us-central1-a
2015-01-12 22:56:52,505 INFO Running: gcloud compute disks describe perfkit-8da3b103-3 --project elite-firefly-786 --format json --quiet --zone us-central1-a
2015-01-12 22:56:52,598 INFO Running: gcloud compute disks describe perfkit-8da3b103-1 --project elite-firefly-786 --format json --quiet --zone us-central1-a
2015-01-12 22:56:53,495 INFO Running: gcloud compute disks delete perfkit-8da3b103-1-scratch-0 --project elite-firefly-786 --format json --quiet --zone us-central1-a
2015-01-12 22:56:57,530 INFO Running: gcloud compute disks describe perfkit-8da3b103-0 --project elite-firefly-786 --format json --quiet --zone us-central1-a
2015-01-12 22:56:57,653 INFO Running: gcloud compute disks describe perfkit-8da3b103-2 --project elite-firefly-786 --format json --quiet --zone us-central1-a
2015-01-12 22:56:58,441 INFO Running: gcloud compute disks delete perfkit-8da3b103-0-scratch-0 --project elite-firefly-786 --format json --quiet --zone us-central1-a
2015-01-12 22:56:58,554 INFO Running: gcloud compute disks delete perfkit-8da3b103-2-scratch-0 --project elite-firefly-786 --format json --quiet --zone us-central1-a
2015-01-12 22:57:00,714 INFO Running: gcloud compute disks describe perfkit-8da3b103-1-scratch-0 --project elite-firefly-786 --format json --quiet --zone us-central1-a
2015-01-12 22:57:06,011 INFO Running: gcloud compute disks describe perfkit-8da3b103-0-scratch-0 --project elite-firefly-786 --format json --quiet --zone us-central1-a
2015-01-12 22:57:06,080 INFO Running: gcloud compute disks describe perfkit-8da3b103-2-scratch-0 --project elite-firefly-786 --format json --quiet --zone us-central1-a
2015-01-12 22:57:07,007 INFO Running: gcloud compute firewall-rules delete perfkit-firewall-8da3b103-22 --project elite-firefly-786 --format json --quiet
Traceback (most recent call last):
File "./pkb.py", line 21, in <module>
sys.exit(Main())
File "/root/PerfKitBenchmarker/perfkitbenchmarker/pkb.py", line 340, in Main
return RunBenchmarks()
File "/root/PerfKitBenchmarker/perfkitbenchmarker/pkb.py", line 314, in RunBenchmarks
RunBenchmark(benchmark, collector)
File "/root/PerfKitBenchmarker/perfkitbenchmarker/pkb.py", line 219, in RunBenchmark
benchmark.Prepare(benchmark_specification)
File "/root/PerfKitBenchmarker/perfkitbenchmarker/benchmarks/cassandra_benchmark.py", line 116, in Prepare
vm_util.RunThreaded(PrepareVm, benchmark_spec.vms)
File "/root/PerfKitBenchmarker/perfkitbenchmarker/vm_util.py", line 244, in RunThreaded
'\n'.join([stacktrace for stacktrace in exceptions]))
perfkitbenchmarker.errors.ThreadException: The following exceptions occurred during threaded execution: Traceback (most recent call last):
File "/root/PerfKitBenchmarker/perfkitbenchmarker/vm_util.py", line 231, in RunThreaded
thread.join(1000) # Set timeout so that join is interruptable.
File "/root/PerfKitBenchmarker/perfkitbenchmarker/vm_util.py", line 174, in join
raise errors.VmUtil.ThreadException(self.exception)
ThreadException: Exception occured in thread 140086647011072:
Traceback (most recent call last):
File "/root/PerfKitBenchmarker/perfkitbenchmarker/vm_util.py", line 161, in run
self.RunWithExceptions()
File "/root/PerfKitBenchmarker/perfkitbenchmarker/vm_util.py", line 168, in RunWithExceptions
super(ThreadWithExceptions, self).run()
File "/usr/lib/python2.7/threading.py", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "/root/PerfKitBenchmarker/perfkitbenchmarker/benchmarks/cassandra_benchmark.py", line 85, in PrepareVm
vm.PushDataFile(CASSANDRA_TAR)
File "/root/PerfKitBenchmarker/perfkitbenchmarker/virtual_machine.py", line 412, in PushDataFile
file_path = data.ResourcePath(data_file)
File "/root/PerfKitBenchmarker/perfkitbenchmarker/data/__init__.py", line 166, in ResourcePath
'{0} (Searched: {1})'.format(resource_name, loaders))
ResourceNotFound: dsc.tar.gz (Searched: [, , ])

The iperf text results don't label the internal and external IP results

When the final iperf results are printed on the command line, they don't differentiate between the internal and external IP results. You can likely tell from the difference in performance, but we should be explicit.

This may also hold for the network benchmarks.

The detailed JSON block does include the right information, so it should be a matter of updating the output text to be more detailed.
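A minimal sketch of the fix: pull the IP type that is already present in the sample's JSON metadata into the human-readable line. The function name, the `ip_type` metadata key, and the sample layout below are illustrative assumptions, not PKB's actual API.

```python
# Hypothetical formatter: label each text result line with the IP type
# already recorded in the sample metadata (names here are illustrative).

def FormatSampleLine(metric, value, unit, metadata):
    """Render one text result line, labeling internal vs. external IP runs."""
    ip_type = metadata.get('ip_type', 'unknown')
    return '{0} ({1} IP): {2:.2f} {3}'.format(metric, ip_type, value, unit)

print(FormatSampleLine('Throughput', 1634.0, 'Mbits/sec',
                       {'ip_type': 'internal'}))
# Throughput (internal IP): 1634.00 Mbits/sec
```

With a label like this in place, two otherwise-similar throughput lines are unambiguous even when internal and external numbers happen to be close.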

unixbench fails with multiple permission/command-not-found errors on CentOS 6

test.json
[
 {"ip_address": "INTERNAL",
  "os_type": "rhel",
  "keyfile_path": "ME",
  "user_name": "ADMIN",
  "scratch_disk_mountpoints": "/tmp/google-pkb",
  "zone": "Siberia"}
]

./pkb.py --benchmarks=unixbench --machine_type=YOLO --static_vm_file=test.json --zone=YOLO --ip_addresses=EXTERNAL
make all
make[1]: Entering directory `/tmp/pkb/UnixBench'
Checking distribution of files
./pgms  exists
./src  exists
./testdir  exists
./results  exists
make[1]: Leaving directory `/tmp/pkb/UnixBench'

   #    #  #    #  #  #    #          #####   ######  #    #   ####   #    #
   #    #  ##   #  #   #  #           #    #  #       ##   #  #    #  #    #
   #    #  # #  #  #    ##            #####   #####   # #  #  #       ######
   #    #  #  # #  #    ##            #    #  #       #  # #  #       #    #
   #    #  #   ##  #   #  #           #    #  #       #   ##  #    #  #    #
    ####   #    #  #  #    #          #####   ######  #    #   ####   #    #

   Version 5.1.3                      Based on the Byte Magazine Unix Benchmark

   Multi-CPU version                  Version 5 revisions by Ian Smith,
                                      Sunnyvale, CA, USA
   January 13, 2011                   johantheghost at yahoo period com


1 x Dhrystone 2 using register variables  1 2 3 4 5 6 7 8 9 10

1 x Double-Precision Whetstone  1 2 3 4 5 6 7 8 9 10

1 x Execl Throughput  1 2 3

1 x File Copy 1024 bufsize 2000 maxblocks  1STDERR: Warning: Permanently added '10.15.14.242' (RSA) to the list of known hosts.
./src/syscall.c: In function 'main':
./src/syscall.c:93: warning: null argument where non-null required (argument 2)
sh: //kill_run: Permission denied
sh: 3dinfo: command not found
sh: runlevel: command not found

**********************************************
Run: "File Copy 1024 bufsize 2000 maxblocks": fstime: creat: Permission denied; aborting
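The `fstime: creat: Permission denied` failures above suggest the benchmark's working directory is not writable by the remote user (ownership or mount options on the scratch disk). A pre-flight check like the following sketch (not PKB code; the path in the example is an assumption) could fail fast with a clearer message instead of dying mid-run.

```python
import tempfile


def DirectoryIsWritable(path):
    """Return True if the current user can create files under `path`."""
    try:
        # Creating and deleting a throwaway file is a direct test of the
        # same operation (creat) that fstime failed on.
        with tempfile.NamedTemporaryFile(dir=path):
            return True
    except OSError:
        return False


# Example: verify the unixbench working directory before launching the run.
if not DirectoryIsWritable(tempfile.gettempdir()):
    raise RuntimeError('benchmark directory is not writable; '
                       'check ownership and mount options')
```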

Add Windows Support

Today PKB handles CentOS, Red Hat, Ubuntu, Debian, ... but does not handle Windows.

There are some unique challenges here, the biggest being how to bootstrap Windows to allow remote command execution. On GCE we can do this with startup scripts.

For the actual command automation, I think it makes the most sense to use PowerShell rather than SSH as we do on Linux. However, that might tie running Windows workloads to running from Windows machines.

Other options include PuTTY, etc.

Suggestions on how to architect this are welcome.
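One possible bootstrap, sketched below, is a startup script that enables WinRM so PowerShell remoting works from any client. The exact WinRM settings, the port, and the function name are assumptions for illustration, not a settled design.

```python
def BuildWinRmStartupScript(port=5986):
    """Assemble a hypothetical Windows startup script that enables WinRM.

    The commands are a sketch of a typical WinRM bootstrap; whether PKB
    would want basic auth, an HTTPS listener, or a different firewall
    rule is exactly the open design question above.
    """
    return '\r\n'.join([
        'winrm quickconfig -q',
        'winrm set winrm/config/service/auth @{Basic="true"}',
        'netsh advfirewall firewall add rule name="WinRM" '
        'dir=in action=allow protocol=TCP localport=%d' % port,
    ])


script = BuildWinRmStartupScript()
```

Generating the script on the controller side keeps the Windows-specific logic in one place, so running Windows workloads would not require running PKB itself from a Windows machine.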

PerfKitExplorer Dashboard for every benchmark

The benchmarker can output the final results as text, JSON, etc. One supported output target is Google BigQuery, which allows for big-data mining.

To make dashboarding on BigQuery easier, we released a tool called PerfKitExplorer.

The work for this issue is to create a PerfKitExplorer dashboard for every benchmark.

Side note: the Explorer is extensible, so we could just as easily export to MySQL and create a MySQL widget for the Explorer to analyze results. We just have not done that work yet.
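To illustrate the kind of query a per-benchmark dashboard would issue, here is a sketch. The dataset, table, and column names (`test`, `metric`, `unit`, `value`) are assumptions about the exported schema, not the actual PKB BigQuery layout.

```python
# Hypothetical SQL a PerfKitExplorer widget might run against exported
# results; dataset/table/column names are illustrative only.
QUERY_TEMPLATE = """
SELECT metric, unit, AVG(value) AS mean_value
FROM {dataset}.{table}
WHERE test = '{benchmark}'
GROUP BY metric, unit
""".strip()


def BuildDashboardQuery(dataset, table, benchmark):
    """Fill the query template for one benchmark's aggregated results."""
    return QUERY_TEMPLATE.format(dataset=dataset, table=table,
                                 benchmark=benchmark)
```

A dashboard for every benchmark would then mostly be a matter of parameterizing `benchmark` per widget rather than hand-writing each query.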

Collect CPUINFO as part of runs

Today we collect some uname info but not CPU info. It would be good to know the feature sets, etc., that the benchmarks are running on.
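A minimal sketch of what the collector could do: read `/proc/cpuinfo` on the target VM and turn the first processor block into key/value metadata (model name, flags, etc.). The function name and the idea of attaching this as sample metadata are illustrative assumptions.

```python
def ParseCpuInfo(text):
    """Parse the first processor block of /proc/cpuinfo into a dict."""
    info = {}
    for line in text.splitlines():
        if not line.strip():
            break  # a blank line ends the first processor's block
        key, sep, value = line.partition(':')
        if sep:
            info[key.strip()] = value.strip()
    return info


# Example usage on a Linux target VM:
# with open('/proc/cpuinfo') as f:
#     cpu_metadata = ParseCpuInfo(f.read())
```

Keys like `model name` and `flags` would cover the feature-set question; only the first block is parsed since all cores usually report identical features.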
