Databricks Spark Observability Demo

Monitoring and profiling Spark applications in Databricks with Prometheus, Grafana and Pyroscope

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contributing
  6. License
  7. Contact
  8. Acknowledgments

About The Project

Dive deep into performance details and uncover what the Spark execution plan doesn't typically show.

Built With

Databricks Prometheus Grafana Pyroscope Spark

Getting Started

This project demonstrates how to monitor and profile Spark applications in Databricks using Prometheus, Grafana and Pyroscope. The approach applies to any Spark application running on Databricks: batch, streaming, and interactive workloads, as well as ephemeral Jobs.

Alongside the Prometheus, Pyroscope and Grafana stack, this project creates a small single-node Spark cluster and a set of init scripts that configure it to push metrics to the Prometheus Pushgateway and profiles to Pyroscope.
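
To make the push model concrete, here is a minimal sketch of pushing a single metric to a Pushgateway over its HTTP API. It is not the init script used by this project: the function names, the job name `spark_demo` and the metric `demo_heartbeat` are hypothetical, and the address assumes a Pushgateway listening on `localhost:9091`.

```shell
# Sketch only: not the project's actual init script.

# Build the Pushgateway URL for a job and instance grouping key.
# Usage: pushgateway_url <host:port> <job> <instance>
pushgateway_url() {
  local host="$1" job="$2" instance="$3"
  printf 'http://%s/metrics/job/%s/instance/%s\n' "$host" "$job" "$instance"
}

# Push one gauge sample in the Prometheus text exposition format.
# Usage: push_gauge <host:port> <job> <instance> <metric_name> <value>
push_gauge() {
  local url
  url="$(pushgateway_url "$1" "$2" "$3")"
  printf '# TYPE %s gauge\n%s %s\n' "$4" "$4" "$5" |
    curl --silent --show-error --fail --data-binary @- "$url"
}

# Example (hypothetical names), assuming a Pushgateway on localhost:9091:
# push_gauge "localhost:9091" "spark_demo" "driver" "demo_heartbeat" 1
```

Prometheus then scrapes the Pushgateway like any other target, which is what makes this model a reasonable fit for short-lived jobs that may finish before a scrape would reach them.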

Prerequisites

This demo uses Terraform to create all necessary resources in your Databricks Workspace. You will need Terraform version 1.4.0 or later installed on your machine.

You'll also need a VM with network connectivity to the Databricks Workspace. Preferably, create this VM in the same virtual network as the Databricks Workspace, or in a peered network.

Databricks

You will need a Databricks account to run the demo. If you don't have one already, you can sign up for a free account at https://databricks.com/try-databricks.

Tooling

To receive metrics and profiles, Prometheus and Pyroscope need to be set up and running. For convenience, the demo provides a complete Docker Compose setup, which you can find in the docker directory. The included Terraform configuration won't create these resources for you, so you will need to start them yourself.

It can be started with the following command:

docker compose up

Setup

You will need a Databricks Personal Access Token to run the demo. Once you have the token, you can create a profile in the Databricks CLI or configure the provider explicitly (using PAT or any other form of authentication).
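
As one possible way to supply credentials, the Databricks Terraform provider and the Databricks CLI both read the `DATABRICKS_HOST` and `DATABRICKS_TOKEN` environment variables, or a named profile selected through `DATABRICKS_CONFIG_PROFILE`. The sketch below only checks that one of those options is set before Terraform runs. The function name `check_databricks_auth` is hypothetical and the check is not part of this project.

```shell
# Sketch: verify that Databricks credentials are available in the environment.
# Returns 0 if a profile or a host/token pair is set, 1 otherwise.
check_databricks_auth() {
  if [ -n "${DATABRICKS_CONFIG_PROFILE:-}" ]; then
    echo "using profile: ${DATABRICKS_CONFIG_PROFILE}"
    return 0
  fi
  if [ -n "${DATABRICKS_HOST:-}" ] && [ -n "${DATABRICKS_TOKEN:-}" ]; then
    echo "using host/token from environment"
    return 0
  fi
  echo "no Databricks credentials found" >&2
  return 1
}

# Example (replace the placeholders with your own values):
# export DATABRICKS_HOST="https://<your-workspace-url>"
# export DATABRICKS_TOKEN="<your-personal-access-token>"
# check_databricks_auth && terraform plan
```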

Usage

The Terraform setup has only two variables that need to be set. You can provide them through environment variables (or through a file), replacing the placeholders with your actual values:

export TF_VAR_prometheus_pushgateway_host={pushgateway_host}:9091
export TF_VAR_pyroscope_host={pyroscope_host}:4040
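
For the file-based option, Terraform automatically loads a `terraform.tfvars` file from the working directory. A minimal sketch that writes one from the shell follows. The variable names are taken from the `TF_VAR_` exports above, while the function name `write_tfvars` and the example addresses are placeholders, not values from this project.

```shell
# Sketch: write the two Terraform variables to a tfvars file.
# Usage: write_tfvars <pushgateway host:port> <pyroscope host:port> [file]
write_tfvars() {
  local pushgateway="$1" pyroscope="$2" file="${3:-terraform.tfvars}"
  cat > "$file" <<EOF
prometheus_pushgateway_host = "${pushgateway}"
pyroscope_host              = "${pyroscope}"
EOF
}

# Example with placeholder addresses:
# write_tfvars "10.0.0.4:9091" "10.0.0.4:4040"
```

A file keeps the values out of your shell history and makes repeated `terraform apply` runs reproducible. Environment variables are handier for one-off runs.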

Prometheus Demo

Once everything is configured, you'll be able to see all relevant metrics in Grafana. If you're using tagging, you can also filter by cluster, job, and other tags.

The example below shows the CPU usage of each executor in the Spark cluster.

[Screenshot: Prometheus Demo]

Pyroscope Demo

If everything is set up correctly, this is what you should see at the end. The following example profiles a Spark application that is bottlenecked by reading LZW-compressed files and by using regular expressions to process the data.

[Screenshot: Pyroscope Demo]

License

Distributed under the MIT License. See LICENSE.txt for more information.

Contact

Project Link: https://github.com/rayalex/spark-databricks-observability-demo
