
Benchmarking

Benchmarking & Scaling Studies of the Pangeo Platform

Creating an Environment

To run the benchmarks, it's recommended to create a dedicated conda environment by running:

conda env create -f ./binder/environment.yml

This will create a conda environment named pangeo-bench with all of the required packages.

You can activate the environment with:

conda activate pangeo-bench

and then run the post build script:

./binder/postBuild

Benchmark Configuration

The benchmark-configs directory contains the YAML files used to run the benchmarks on different machines. Configurations are currently provided for the following HPC systems:

$ tree ./benchmark-configs/
benchmark-configs/
├── cheyenne.yaml
├── hal.yaml
└── wrangler.yaml

To run the benchmarks on another system, create a new YAML file with the appropriate configuration for that system. See the existing config files for reference.

Running the Benchmarks

from command line

To run the benchmarks, this repository provides a command-line utility, pangeobench. To benchmark Pangeo computation, specify the run subcommand and the location of the benchmark configuration:

./pangeobench run benchmark-configs/cheyenne.computation.yaml

To benchmark Pangeo IO with weak scaling analysis, specify the run subcommand and the location of the benchmark configuration:

./pangeobench run benchmark-configs/cheyenne.readwrite.yaml

To benchmark Pangeo IO with strong scaling analysis, follow these three steps.

First, create data files:

./pangeobench run benchmark-configs/cheyenne.write.yaml

Second, if you need to benchmark an S3 object store, upload the data files to it:

./pangeobench upload --config_file benchmark-configs/cheyenne.write.yaml

Last, read data files:

./pangeobench run benchmark-configs/cheyenne.read.yaml

The available subcommands can be listed with --help:

$ ./pangeobench --help
Usage: pangeobench [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  run     Run benchmarking
  upload  Upload benchmarking files from local directory to S3 object store
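
The three strong-scaling steps described above can be chained in a small driver script. The following is a minimal sketch, not part of the repository: the function names are hypothetical, and it assumes the config files for a machine follow the benchmark-configs/<machine>.write.yaml and benchmark-configs/<machine>.read.yaml naming seen in the Cheyenne examples.

```python
import subprocess
from typing import List


def strong_scaling_commands(machine: str, use_s3: bool = False) -> List[List[str]]:
    """Build the pangeobench invocations for the strong-scaling study.

    Assumes (hypothetically) that configs are named
    benchmark-configs/<machine>.write.yaml and <machine>.read.yaml.
    """
    write_cfg = f"benchmark-configs/{machine}.write.yaml"
    read_cfg = f"benchmark-configs/{machine}.read.yaml"

    # Step 1: create the data files.
    commands = [["./pangeobench", "run", write_cfg]]
    if use_s3:
        # Step 2 (optional): upload the data files to the S3 object store.
        commands.append(["./pangeobench", "upload", "--config_file", write_cfg])
    # Step 3: read the data files back.
    commands.append(["./pangeobench", "run", read_cfg])
    return commands


def run_strong_scaling(machine: str, use_s3: bool = False) -> None:
    """Run the steps in order, stopping at the first failing command."""
    for cmd in strong_scaling_commands(machine, use_s3):
        subprocess.run(cmd, check=True)
```

Calling run_strong_scaling("cheyenne", use_s3=True) from the repository root would run all three steps in sequence. Building the command list separately from running it makes it easy to print and inspect the commands before submitting anything.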

Running the Benchmarks

from Jupyter notebook

To run the benchmarks from a Jupyter notebook, install the 'pangeo-bench' kernel in your Jupyter notebook environment, then open the run.ipynb notebook. You will need to specify the configuration file in the notebook, as described above.

To install the 'pangeo-bench' kernel in your Jupyter notebook environment, open a terminal on your HPC system and run the following commands:

conda activate pangeo-bench
ipython kernel install --user --name pangeo-bench

Before starting Jupyter notebook, you can verify that the kernel is installed with the following command:

jupyter kernelspec list
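
The same check can be scripted. The sketch below relies on the --json flag of jupyter kernelspec list, which prints a JSON object with a "kernelspecs" mapping keyed by kernel name. The helper names are hypothetical and not part of this repository.

```python
import json
import subprocess


def kernel_installed(kernelspec_json: str, name: str = "pangeo-bench") -> bool:
    """Return True if `name` appears in the output of
    `jupyter kernelspec list --json`."""
    specs = json.loads(kernelspec_json).get("kernelspecs", {})
    return name in specs


def check_local_kernel(name: str = "pangeo-bench") -> bool:
    """Query the local Jupyter installation (requires jupyter on PATH)."""
    result = subprocess.run(
        ["jupyter", "kernelspec", "list", "--json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return kernel_installed(result.stdout, name)
```

Running check_local_kernel() in the activated environment should return True once the kernel has been installed.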

Benchmark Results

Benchmark results are persisted in the results directory by default. The exact location depends on the machine name (specified in the config file) and the date on which the benchmarks were run. For instance, if the benchmarks were run on the Cheyenne supercomputer on 2019-09-07, the results would be saved in the results/cheyenne/2019-09-07/ directory. The file name follows this template: compute_study_YYYY-MM-DD_HH-MM-SS.csv
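
As an illustration of this layout, the following sketch builds the expected path of a result file from a machine name and a run time. The function name results_file and the root parameter are hypothetical. The code mirrors the naming scheme described above and is not the repository's own implementation.

```python
from datetime import datetime
from pathlib import Path


def results_file(machine: str, when: datetime, root: str = "results") -> Path:
    """Return <root>/<machine>/<YYYY-MM-DD>/compute_study_<timestamp>.csv."""
    day = when.strftime("%Y-%m-%d")
    stamp = when.strftime("%Y-%m-%d_%H-%M-%S")
    return Path(root) / machine / day / f"compute_study_{stamp}.csv"


# Example: a run on Cheyenne on 2019-09-07 at 14:30:05.
print(results_file("cheyenne", datetime(2019, 9, 7, 14, 30, 5)).as_posix())
# results/cheyenne/2019-09-07/compute_study_2019-09-07_14-30-05.csv
```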

Visualization

Visualization can be done using the Jupyter notebooks in the analysis directories.
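
When gathering result files for analysis, it can help to order them by the run time encoded in the file name. The sketch below assumes the compute_study_YYYY-MM-DD_HH-MM-SS.csv template described under Benchmark Results. The helper names are hypothetical and make no assumption about the columns inside the CSV files.

```python
from datetime import datetime
from pathlib import Path
from typing import Iterable, List

PREFIX = "compute_study_"


def run_timestamp(path: str) -> datetime:
    """Parse the run time from a result file name."""
    stem = Path(path).stem  # e.g. compute_study_2019-09-07_14-30-05
    if not stem.startswith(PREFIX):
        raise ValueError(f"unexpected result file name: {path}")
    return datetime.strptime(stem[len(PREFIX):], "%Y-%m-%d_%H-%M-%S")


def sort_by_run_time(paths: Iterable[str]) -> List[str]:
    """Return the result files ordered from oldest to newest run."""
    return sorted(paths, key=run_timestamp)
```

The sorted list can then be passed to whatever loader the analysis notebook uses, for example one CSV read per file.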

Contributors

halehawk, tinaok, andersy005, jmunroe
