
glusterfs-perf's Introduction

glusterfs.perf

On a given set of hosts, create a testbed to test GlusterFS builds. The glusterfs.perf role:

  • Installs the packages necessary to build GlusterFS from source
  • Clones the GlusterFS repository
  • Builds the filesystem from source and installs it
  • Creates a GlusterFS volume and mounts it

Requirements

  • Ansible >= 2.7
  • gluster-ansible

Role Variables

| Name | Choices | Default value | Comments |
| ---- | ------- | ------------- | -------- |
| glusterfs_perf_volume_state | present / absent / started / stopped | present | GlusterFS volume state |
| glusterfs_perf_volume | | UNDEF | Name of the Gluster volume |
| glusterfs_perf_bricks | | UNDEF | GlusterFS brick directories for volume creation |
| glusterfs_perf_hosts | | UNDEF | List of hosts that will be part of the cluster |
| glusterfs_perf_transport | tcp / tcp,rdma | tcp | Transport to be configured while creating the volume |
| glusterfs_perf_replica_count | | Omitted by default | Replica count for the volume |
| glusterfs_perf_arbiter_count | | Omitted by default | Arbiter count for the volume |
| glusterfs_perf_disperse_count | | Omitted by default | Disperse count for the volume |
| glusterfs_perf_redundancy_count | | Omitted by default | Redundancy count for the volume |
| glusterfs_perf_force | yes / no | no | Whether the GlusterFS volume should be created with force |
| glusterfs_perf_mountpoint | | /mnt/glusterfs | GlusterFS mount point |
| glusterfs_perf_server | | UNDEF | Server to use while mounting the GlusterFS volume |
| glusterfs_perf_clients | | | Clients on which to mount the volume and run the tests |
| glusterfs_perf_client | | First in the list of clients | Client on which to mount the volume; this is the client from which the perf test is launched |
| glusterfs_perf_resdir | | /var/tmp/glusterperf | Directory in which to store perf results |
| glusterfs_perf_mail_sender | | [email protected] | Email address to be used in the From field of the status email |
| glusterfs_perf_to_list | | UNDEF | Email addresses of the people to whom the report is sent. Note that this is a YAML list, not comma-separated addresses; see playbooks/cluster_setup.yml for an example |
| glusterfs_perf_ofile | | /tmp/perf-results- | Output file where results are stored |
| glusterfs_perf_git_repo | | https://github.com/gluster/glusterfs.git | URL of the repository to clone |
| glusterfs_perf_git_refspec | | - | Refspec of the particular patch to fetch; see the 'Download' section in Gerrit for refspec details |
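
For example, the report variables might be set like this (the addresses below are placeholders); note that glusterfs_perf_to_list must be a YAML list, not a comma-separated string:

glusterfs_perf_mail_sender: perf-bot@example.com   # placeholder address
glusterfs_perf_to_list:
  - developer-one@example.com                      # placeholder address
  - developer-two@example.com                      # placeholder address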

Example Playbook

---
- name: Setup a GlusterFS cluster from source tree
  remote_user: root
  gather_facts: true
  hosts: all
  vars:
    glusterfs_perf_volume: perfvol
    glusterfs_perf_bricks: /gluster_bricks/perfbrick
    glusterfs_perf_hosts: "{{ groups['all'] }}"
    glusterfs_perf_replica_count: 2
    glusterfs_perf_server: "{{ groups['all'][0] }}"

  roles:
    - glusterfs.perf

Setting up and running the tests

Bootstrapping: Ensure Ansible >= 2.7 is present. Install the roles gluster-ansible-infra and glusterfs-perf. Copy the playbook playbooks/cluster_setup.yml shipped with the glusterfs-perf role, change the variables appropriately, and run:

# gluster-ansible -i <inventory-file> cluster_setup.yml
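
The inventory format is standard Ansible; as a minimal sketch with placeholder hostnames, a YAML inventory for the example playbook above (which targets hosts: all) could look like:

# inventory.yml — hostnames are placeholders for illustration only.
# The example playbook targets "all", so every host listed here joins the cluster.
all:
  hosts:
    gluster-server1.example.com:
    gluster-server2.example.com:
    gluster-client1.example.com: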

Scripts to run a particular patch

If you need to run the full set of tests against a particular patch from https://review.gluster.org, use the command below.

# cd /etc/ansible/roles/glusterfs.perf
# ./playbooks/run_perfs-with-patch.sh -t new-tag -s refs/changes/76/22576/3 -e [email protected],[email protected],[email protected],[email protected]

To run against a release branch instead of a specific patch (the second example uses release-6):

# git stash
# ./playbooks/run_perfs-with-patch.sh -t another-tag -v release-6 -e [email protected],[email protected],[email protected],[email protected]


License

GPLv3

glusterfs-perf's Issues

Perf tests for both hot cache and cold cache

It would be good to run the benchmarks with both a cold cache and a hot cache; this would help us identify regressions in the cache xlators and also give us an idea of the benefits of the cache layers.
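
A cold-cache run typically means dropping the kernel caches before starting the workload, while a hot-cache run repeats the workload without that step. A minimal Ansible task sketch, assuming the standard Linux drop_caches mechanism:

# Hypothetical task: drop page cache, dentries, and inodes for a cold-cache run.
- name: Drop kernel caches before a cold-cache benchmark
  ansible.builtin.shell: sync && echo 3 > /proc/sys/vm/drop_caches
  become: true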

how did we decide which hosts are clients?

This is just a question:

We may end up with more or less 'inventory' over time, so how do we decide how many of the hosts are clients and how many are servers?

Isn't it better to define the tests and clearly identify the minimum inventory each one requires? If that requirement is not satisfied, the test shouldn't even run.
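
One way to enforce that, sketched below with hypothetical group names and minimum counts, is a pre-flight assertion so a test simply refuses to run on an undersized inventory:

# Hypothetical pre-flight check; the group names and minimums are illustrative.
- name: Verify the inventory meets the test's minimum requirements
  ansible.builtin.assert:
    that:
      - groups['servers'] | length >= 2
      - groups['clients'] | length >= 1
    fail_msg: "This test needs at least 2 servers and 1 client"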

Collect more metrics for perf runs

For each perf run, along with IOPS, latency, etc., it would be good to collect the following if possible (a rough collection sketch follows the list):

  • profile info
  • network bandwidth usage
  • disk usage
  • CPU consumption of Gluster processes
  • memory consumption of Gluster processes
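
A sketch of how some of these could be collected with the standard Gluster CLI and ps (task names are illustrative; the volume name is taken from glusterfs_perf_volume):

# Hypothetical collection tasks run on one of the servers.
- name: Enable profiling on the volume before the workload
  ansible.builtin.command: gluster volume profile {{ glusterfs_perf_volume }} start

- name: Capture profile info after the workload
  ansible.builtin.command: gluster volume profile {{ glusterfs_perf_volume }} info
  register: profile_info

- name: Capture CPU and memory usage of Gluster processes
  ansible.builtin.shell: ps -C glusterd,glusterfsd,glusterfs -o pid,%cpu,%mem,rss,cmd
  register: gluster_ps

Network bandwidth and disk usage could be sampled during the run with tools such as sar or iostat.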

Save additional information with each run

Each run should produce a unique set of information that should be kept together. This is what I would record:

  • Commit used for the test
  • Date and time of the run (start and end)
  • Configuration settings (any gluster options, system options, or anything else changed for the test)
  • Environment (kernel, version, free memory, list of processes, cache status (dropped, not dropped), ...) both at the start and end of the test, from all nodes
  • Output of all information captured during the test on all nodes (iostat, top, ...)
  • stdout and stderr of all executed commands
  • A user provided comment for this particular test

We should create this in some organized way so that it's easily searchable.

This is very useful when we have lots of runs and want to use them for deeper analysis or extract statistics.
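
As a sketch (file name, fields, and layout are all hypothetical), each run could get its own directory holding the raw command output plus a small metadata file, for example:

# results/<run-id>/meta.yml — hypothetical per-run metadata record.
run_id: <unique run identifier>
commit: <gluster commit hash used for the build>
started: <start timestamp>
ended: <end timestamp>
comment: <user-provided comment for this run>
volume_options: {}   # gluster options changed for the test
system_options: {}   # system-level tweaks, if any
nodes: []            # per-node environment snapshots (kernel, memory, processes, ...)

The stdout/stderr of executed commands and the captured iostat/top output would live as separate files in the same directory, so each run stays self-contained and easy to index.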

Need to cleanup /var/lib/glusterd

While trying to run a test with v5.0, we hit this error. It looks like a previous glusterd working directory was left behind:

[2019-02-21 05:01:16.847028] W [rpcsvc.c:1789:rpcsvc_create_listener] 0-rpc-service: cannot create listener, initing the transport failed
[2019-02-21 05:01:16.847044] E [MSGID: 106244] [glusterd.c:1798:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
[2019-02-21 05:01:18.059357] E [MSGID: 106022] [glusterd-store.c:2276:glusterd_restore_op_version] 0-management: wrong op-version (70000) retrieved [Invalid argument]
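
A cleanup step along these lines (task names are illustrative; this wipes all volume configuration, so it is only appropriate on disposable test nodes) could be run before installing a different GlusterFS version:

# Hypothetical cleanup: remove the stale glusterd working directory.
- name: Stop glusterd before cleanup
  ansible.builtin.service:
    name: glusterd
    state: stopped
  failed_when: false   # glusterd may not be installed or running yet

- name: Remove the old glusterd workdir
  ansible.builtin.file:
    path: /var/lib/glusterd
    state: absent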

How to present the perf data we capture

Once we run the performance tests, we have captured files/sec, IOPS, bandwidth usage, etc. We need to keep a history of all the runs and present it in a visual format that developers can easily comprehend. One suggestion would be:

Maintain a GitHub repo to which we push the metrics collected from the nightly runs. The last 30 (or n) runs are then compared and plotted to show the performance trend.

The other suggested option is a database in which we store the metrics from all runs.
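
If the GitHub-repo option is chosen, each nightly run could push one small, flat record that is trivial to parse and plot; a hypothetical shape (metric names are placeholders, values come from the run):

# metrics/<date>.yml — hypothetical per-night record pushed to the results repo.
date: <run date>
commit: <gluster commit hash>
metrics:
  files_per_sec: <value>
  iops: <value>
  bandwidth_mib_s: <value>

A small script could then read the last n records and plot each metric over time.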
