Spelk - reporting Apache Spark metrics to Elasticsearch

Fork Notes

Note: this is a derivative fork of the original IBM code, which is no longer under development. It contains significant changes:

  • uses the Elasticsearch Java API, which required a bit of shading to get working with Spark
  • makes use of dynamic template mappings
  • adds support for date-based index names for easier aging off with Curator
  • upgraded to support Elasticsearch 5 and beyond
  • drops the SparkListener, which I currently don't see much benefit in

However, to date I have only tested this on Spark 1.6.3, although I plan to support Spark 2.x versions soon.


Spelk (spark-elk) is an add-on to Apache Spark (http://spark.apache.org/) to report metrics to an Elasticsearch server and allow visualizations using Kibana.

Building Spelk

Spelk is built using Apache Maven. Select the Spark version to build against with a Maven profile, for example:

mvn clean package -P spark1.6.3

Configuration

Elastic index mapping

PUT the dynamic mapping in mapping.json against the following URL:

http://{{hostname}}:9200/_template/spark-metrics-template

The template is then applied automatically to any newly created index whose name starts with spark-metrics-. The prefix can be updated to match your preferred naming convention.
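As a minimal sketch, the PUT could be issued with curl. The function names, the default host localhost, and the default location of mapping.json in the current directory are assumptions for illustration only; the Content-Type header is needed on Elasticsearch 6 and later.

```shell
# Build the template URL for a given Elasticsearch host (port 9200, as in the URL above).
template_url() {
  printf 'http://%s:9200/_template/spark-metrics-template\n' "$1"
}

# PUT the mapping file as an index template.
# $1: Elasticsearch host (assumed default: localhost)
# $2: path to the mapping file (assumed default: mapping.json)
put_template() {
  curl --silent --show-error --fail \
    -X PUT "$(template_url "${1:-localhost}")" \
    -H 'Content-Type: application/json' \
    --data-binary "@${2:-mapping.json}"
}
```

Calling put_template with no arguments targets localhost and reads mapping.json from the current directory; pass a host and a path to override either.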

Spark configuration

To use Spelk you need to add the Spelk jar to the Spark driver/executor classpaths and enable the metrics sink. For the classpaths:

  • add the spelk jar file to your spark.files property
  • add the spelk jar file to your spark.driver.extraClassPath property
  • add the spelk jar file to your spark.executor.extraClassPath property
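The three properties above can be passed to spark-submit as --conf options. The sketch below only prints those options; the function name and the jar path are hypothetical, and it assumes the same path is valid on the driver and on every executor host, which may not hold for your cluster manager.

```shell
# Print the spark-submit options that put the Spelk jar on the classpaths.
# $1: path to the Spelk jar (hypothetical, e.g. /opt/spelk/spelk.jar)
spelk_jar_opts() {
  printf '%s\n' \
    --conf "spark.files=$1" \
    --conf "spark.driver.extraClassPath=$1" \
    --conf "spark.executor.extraClassPath=$1"
}
```

With a jar path that contains no spaces, the output can be spliced into a command line with an unquoted substitution, for example spark-submit $(spelk_jar_opts /opt/spelk/spelk.jar) followed by your usual arguments.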

Enable the Elasticsearch sink

You will need to create a metrics.properties file, as shown below, and then add it to the following properties:

  • spark.files
  • spark.metrics.conf

Create a metrics.properties file, based on the template below:

driver.sink.elk.class=org.apache.spark.elk.metrics.sink.ElasticsearchSink
executor.sink.elk.class=org.apache.spark.elk.metrics.sink.ElasticsearchSink
#   Name:           Default:      Description:
#   clusterName     none          Elasticsearch cluster name
#   host            none          Elasticsearch server host
#   port            none          Elasticsearch server port
#   index           spark         Elasticsearch index name
#   indexDateFormat spark         Elasticsearch index date format, if required; it is appended to the index name, e.g. yyyy-MM-dd causes the index name to become spark-metrics-2018-12-19
#   period          10            polling period
#   units           seconds       polling period units
*.sink.elk.clusterName=elasticsearch
*.sink.elk.host=localhost
*.sink.elk.port=9200
*.sink.elk.index=spark
*.sink.elk.period=10
*.sink.elk.unit=seconds
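Since spark.files takes a comma-separated list, the Spelk jar and metrics.properties have to share a single value rather than being set twice. A rough sketch of the resulting spark-submit options follows; the function name and both paths are hypothetical, and it assumes the metrics.properties path resolves wherever Spark reads spark.metrics.conf.

```shell
# Print the spark-submit options that ship metrics.properties and enable it.
# $1: path to the Spelk jar (hypothetical)
# $2: path to metrics.properties (hypothetical)
spelk_metrics_opts() {
  printf '%s\n' \
    --conf "spark.files=$1,$2" \
    --conf "spark.metrics.conf=$2"
}
```

For example, spelk_metrics_opts /opt/spelk/spelk.jar /opt/spelk/metrics.properties prints a spark.files value that lists both files, followed by the spark.metrics.conf setting.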

Contributors

owenrh, a-roberts
