Giter VIP home page Giter VIP logo

perf's Introduction

Perf

A simple tool for repeatedly running Java MapReduce jobs and Java applications for the purpose of coarse-grained performance measurements.

The tool support three kinds of execution modes: plain (Java on the local machine), local (pseudo distributed MR jobs) and could (EMR jobs).

Arguments are passed to it via the command line and a configuration file. None of this is very general, the tool is tailored to a particular set of assignments.

Building

Typing

mvn clean package

will create a self-contained JAR file with App.

Running

Create a file named perf.txt in the local directory. Invoke Java with, for example,

java -cp uber-perf-0.0.1.jar  neu.perf.App -num=5 \
    -jar=job.jar  -kind=plain  -main=neu.otp.Main \
    -arguments="-query=A_3_Median" -input=../data  \
    -output=/tmp  -results=results.csv -name=i-median 

As sample cloud run looks like this

java -cp uber-perf-0.0.1.jar  neu.perf.App -num=3 \
  -jar=s3n://mrclassvitek/job.jar    -kind=cloud -main=neu.otp.Main_A3 \
  -arguments="-median -fsroot=s3n://mrclassvitek/ -awskeyid=$(AWS_ACCESS_KEY) -awskey=$(AWS_SECRET_KEY)" \
  -results=results.csv -input=s3n://mrclassvitek/years14-15 \
  -output=s3n://mrclassvitek/output -name=iv-medianfast 

A sample perf.txt file could be:


num = 1
# how many time do we run this job?

results = results.csv
# where to store the time of each run?

jar = job.jar
# where is the code of the job? 

name = iii-median
# how to call this job in the results.csv?

main = opt.neu.Main
# where is this job's main() method?

input = input
# where is the input data?

output = output
# which directory to use for outputs?

arguments ="-query=AvgPrice"
# additional arguments (other than -input and -output)

############### Hadoop Initialization ###################

hadoop.home = /usr/local/hadoop
# where is hadoop on this machine?

############ AWS S3 Initialization #######################

region = us-east-1
# where are we running? 

check.bucket = s3://mrclassvitek
# what bucket are we using?

check.input = s3://mrclassvitek/years14-15
# where is the input data?

check.logs = s3://mrclassvitek/logs
# where are the logs?

delete.output = s3://mrclassvitek/output
# which directory should we delete before running?

upload.jar = s3://mrclassvitek
# where should we upload the jar?

######### AWS EMR Cluster Configuration #################
cluster.name = MyCluster
# how to name the cluster?

step.name = Step
# how to name the step?

release.label = emr-4.3.0
# what EMR release?

log.uri = s3://mrclassvitek/logs
# where are the logs?

service.role = EMR_DefaultRole
# role?

job.flow.role = EMR_EC2_DefaultRole
# role?

ec2.key.name = jankey
# which ec2 key should we use?

instance.count = 2
# how many nodes to provision?

keep.job.flow.alive = false
# keep cluster alive?

master.instance.type = m3.xlarge
# which master instance type?

slave.instance.type = m3.xlarge
# which node instance type?

perf's People

Contributors

janvitek avatar

Watchers

James Cloos avatar Mayuri Keskar avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.