Solr Bench

A comprehensive Solr performance benchmark and stress test framework.

Benchmarking & stress test for standard operations (indexing, querying) for a specified Solr build.

Benchmarking

Prerequisites

If running on GCP, spin up a coordinator VM from which this suite will run. Make sure to use a service account that has permission to create other VMs.

The VM should have the following:

  • Maven and other build tools: apt install wget unzip zip ant ivy lsof git netcat make openjdk-11-jdk maven jq (for Ubuntu/Debian) or sudo yum install wget unzip zip ant ivy lsof git nc make java-11-openjdk-devel maven jq (for Red Hat/CentOS/Fedora)
  • Terraform: wget https://releases.hashicorp.com/terraform/0.12.26/terraform_0.12.26_linux_amd64.zip; sudo unzip terraform_0.12.26_linux_amd64.zip -d /usr/local/bin
  • Vagrant: Ensure that virtualization is enabled in your BIOS. Install the prerequisites: sudo yum install VirtualBox vagrant ansible. Run VBoxManage --version to confirm that VirtualBox is installed properly (it may prompt you to install kernel drivers; follow its suggestions). To adjust the instance memory, edit the vagrant/Vagrantfile.
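Before running the suite, it can help to confirm that the tools listed above are actually on the PATH. The following is a minimal sketch, not part of solr-bench; the binary names (for example `mvn` for the maven package and `java` for the JDK) are assumptions about what the packages install and may differ by distribution.

```python
import shutil

# Binary names are assumptions derived from the package lists above;
# adjust them if your distribution installs different executables.
REQUIRED_TOOLS = [
    "wget", "unzip", "zip", "ant", "lsof", "git",
    "make", "java", "mvn", "jq", "terraform",
]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```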

Running the suite

In the coordinator VM, check out this solr-bench repository.

 1. mvn clean compile assembly:single
 2. ./cleanup.sh && ./start.sh <config-file>

Example config files: config.json (GCP), config-local.json (local mode; builds Solr from source), and config-prebuilt (local mode; uses a provided tgz file). For GCP, modify the "terraform-gcp-config" section to provide a valid "project_id". In local mode, the JDK is downloaded but not used; the system-installed JDK/JRE is used instead.
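If you prefer to set the project ID with a script rather than by hand, something like the following sketch may work. It assumes that "terraform-gcp-config" is a top-level JSON object in the config file and that "project_id" is a direct key inside it; check your config file before relying on this. The function names and the project ID shown are hypothetical.

```python
import json


def set_project_id(config: dict, project_id: str) -> dict:
    """Return a copy of `config` with terraform-gcp-config.project_id set.

    Assumes "terraform-gcp-config" is a top-level object (an assumption,
    not confirmed by the solr-bench documentation).
    """
    updated = dict(config)
    terraform_section = dict(updated.get("terraform-gcp-config", {}))
    terraform_section["project_id"] = project_id
    updated["terraform-gcp-config"] = terraform_section
    return updated


def update_config_file(path: str, project_id: str) -> None:
    """Rewrite the JSON config at `path` with the given project ID."""
    with open(path) as f:
        config = json.load(f)
    with open(path, "w") as f:
        json.dump(set_project_id(config, project_id), f, indent=2)
        f.write("\n")
```

For example, `update_config_file("config.json", "my-gcp-project")` would rewrite config.json in place, where "my-gcp-project" stands in for your own project ID.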

Results

  • Results are available after the benchmark in a results-<timestamp>.json file.
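To pick up the most recent results file programmatically, a sketch along these lines may be useful. It assumes the results files are written to the directory you pass in (the current directory by default). The schema of the file is not described here, so the example only lists the top-level keys.

```python
import glob
import json
import os


def latest_results_file(directory="."):
    """Return the most recently modified results-*.json path, or None."""
    candidates = glob.glob(os.path.join(directory, "results-*.json"))
    return max(candidates, key=os.path.getmtime) if candidates else None


def load_results(path):
    """Parse a results file and return the decoded JSON value."""
    with open(path) as f:
        return json.load(f)


if __name__ == "__main__":
    path = latest_results_file()
    if path is None:
        print("No results-*.json file found.")
    else:
        results = load_results(path)
        # The schema is not documented here, so only inspect the top level.
        if isinstance(results, dict):
            print(path, "top-level keys:", sorted(results))
        else:
            print(path, "contains a", type(results).__name__)
```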

Datasets

  • One can use either TSV files or JSONL files for indexing. Use "tsv" or "json" for the "file-format" section.
  • The configset should be zipped, and the "index-benchmarks" section should give the name of the file (without the .zip extension) as "configset".
  • The query file should have GET parameters that will be queried against /select.
  • Lucene's benchmarks dataset can be found here: http://people.apache.org/~mikemccand/enwiki-20120502-lines-1k.txt.lzma
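As an illustration of preparing a query file, the sketch below turns parameter dictionaries into GET parameter strings. It assumes one query per line and that URL-encoded parameters are acceptable to the benchmark; neither is confirmed here. The field names, queries, and the output file name queries.txt are hypothetical.

```python
from urllib.parse import urlencode


def query_line(params: dict) -> str:
    """Encode one set of /select parameters as a GET query string."""
    return urlencode(params)


def write_query_file(path: str, queries: list) -> None:
    """Write one encoded query per line (the layout is an assumption)."""
    with open(path, "w") as f:
        for params in queries:
            f.write(query_line(params) + "\n")


if __name__ == "__main__":
    # Hypothetical queries; replace the field names with ones from your schema.
    example_queries = [
        {"q": "title:apache", "rows": 10},
        {"q": "body:search", "fq": "year:2012"},
    ]
    write_query_file("queries.txt", example_queries)
```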

Stress tests

In the coordinator VM (GCP) or local machine (local mode), check out this solr-bench repository.

 1. mvn clean compile assembly:single
 2. ./cleanup.sh && ./stress.sh <config-file>

Examples

Visualization (WIP)

Plotting the results

Once you have the results in a directory, say results/experiment1, and you want to plot the heap usage of a Solr node (hardcoded to the 7th node at the moment) beginning at task2, then

./plot.sh results experiment1 task2 jvm/solr.jvm/memory.heap.used

will generate an experiment1.png file containing the plot.

Acknowledgement

This started as a project funded by Google Summer of Code (SOLR-10317), later supported by FullStory.
