streaming-at-scale's Introduction

page_type: sample
languages: azurecli, csharp, json, sql, scala
products: azure, azure-container-instances, azure-cosmos-db, azure-databricks, azure-event-hubs, azure-functions, azure-sql-database, azure-stream-analytics, azure-storage
description: How to set up an end-to-end solution to implement a streaming at scale scenario using a choice of different Azure technologies.

Streaming at Scale

The samples show how to set up an end-to-end solution to implement a streaming at scale scenario using a choice of different Azure technologies. There are many possible ways to implement such a solution in Azure, following Kappa or Lambda architectures, a variation of them, or even custom ones. Each architectural solution can also be implemented with different technologies, each one with its own pros and cons.

More info on Streaming architectures can also be found here:

Here's also a list of scenarios where a Streaming solution fits nicely

A good document that describes the stream processing technologies available on Azure is the following one:

Choosing a stream processing technology in Azure

The goal of this repository is to showcase all the possible common architectural solutions and implementations, describe their pros and cons, and provide you with sample scripts to deploy each whole solution with 100% automation.

Running the samples

Please note that the scripts have been tested on Ubuntu 18 LTS, so make sure to use that environment to run them. You can run them using Docker, WSL or a VM.

Just do a git clone of the repo and you'll be good to go.

Each sample may have additional requirements: they will be listed in the sample's README.

Streamed Data

Streamed data simulates an IoT device sending the following JSON data:

{
    "eventId": "b81d241f-5187-40b0-ab2a-940faf9757c0",
    "complexData": {
        "moreData0": 57.739726013343247,
        "moreData1": 52.230732688620829,
        "moreData2": 57.497518587807189,
        "moreData3": 81.32211656749469,
        "moreData4": 54.412361539409427,
        "moreData5": 75.36416309399911,
        "moreData6": 71.53407865773488,
        "moreData7": 45.34076957651598,
        "moreData8": 51.3068118685458,
        "moreData9": 44.44672606436184,
        [...]
    },
    "value": 49.02278128887753,
    "deviceId": "contoso://device-id-154",
    "type": "CO2",
    "createdAt": "2019-05-16T17:16:40.000003Z"
}
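As a rough illustration of the payload above, the following Python sketch builds one event with the same field names. It is not the generator used by the samples: the function name `make_event`, the `complex_data_count` parameter, and the 0 to 100 value range are assumptions made for this example, and the real number of `moreData` fields is elided in the sample payload.

```python
import json
import random
import uuid
from datetime import datetime, timezone


def make_event(device_index, complex_data_count=10, rng=random):
    """Build one simulated event shaped like the sample payload.

    complex_data_count and the 0-100 value range are assumptions for
    illustration; the sample payload elides the real field count.
    """
    complex_data = {
        f"moreData{i}": rng.uniform(0.0, 100.0)
        for i in range(complex_data_count)
    }
    # UTC timestamp with microseconds, e.g. 2019-05-16T17:16:40.000003Z
    created_at = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ")
    return {
        "eventId": str(uuid.uuid4()),
        "complexData": complex_data,
        "value": rng.uniform(0.0, 100.0),
        "deviceId": f"contoso://device-id-{device_index}",
        "type": "CO2",
        "createdAt": created_at,
    }


if __name__ == "__main__":
    # Print one event as JSON, ready to be sent by whatever client you use.
    print(json.dumps(make_event(154), indent=4))
```

Serializing the dictionary with `json.dumps` yields a message body with the same structure as the sample, which can be handy for local testing before deploying a full solution.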

Available solutions

At present, the available solutions are:

Event Hubs Capture Sample

Implement a stream processing architecture using:

  • Event Hubs (Ingest)
  • Event Hubs Capture (Store)
  • Azure Blob Storage (Data Lake)
  • Apache Drill (Query/Serve)

Event Hubs + Azure Databricks + Azure SQL

Implement a stream processing architecture using:

  • Event Hubs (Ingest / Immutable Log)
  • Azure Databricks (Stream Process)
  • Azure SQL (Serve)
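The three roles in this list (Ingest / Immutable Log, Stream Process, Serve) recur in most of the solutions below. The following Python sketch shows how the stages relate, using in-memory stand-ins only. It makes no Azure SDK calls and is not how the samples are implemented; the names `InMemoryLog`, `InMemoryStore`, `process`, and `run` are hypothetical and exist only for this illustration.

```python
import json


class InMemoryLog:
    """Stand-in for the ingest stage: an append-only, replayable log."""

    def __init__(self):
        self._records = []

    def append(self, payload):
        self._records.append(payload)
        return len(self._records) - 1  # offset of the record just added

    def read_from(self, offset=0):
        return list(self._records[offset:])


def process(raw):
    """Stand-in for the stream process stage: parse and flatten one event."""
    event = json.loads(raw)
    return {
        "eventId": event["eventId"],
        "deviceId": event["deviceId"],
        "type": event["type"],
        "value": event["value"],
        "createdAt": event["createdAt"],
    }


class InMemoryStore:
    """Stand-in for the serve stage: rows keyed by eventId."""

    def __init__(self):
        self.rows = {}

    def upsert(self, row):
        # Keying on eventId means replaying the log does not duplicate rows.
        self.rows[row["eventId"]] = row


def run(log, store, offset=0):
    """Process every record from offset onward; return the next offset."""
    batch = log.read_from(offset)
    for raw in batch:
        store.upsert(process(raw))
    return offset + len(batch)
```

Because the log is never modified, calling `run(log, store, 0)` a second time replays every event and leaves the store unchanged, which is the property the "Immutable Log" label refers to.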

Event Hubs + Azure Databricks + Cosmos DB

Implement a stream processing architecture using:

  • Event Hubs (Ingest / Immutable Log)
  • Azure Databricks (Stream Process)
  • Cosmos DB (Serve)

Event Hubs Kafka + Azure Databricks + Cosmos DB

Implement a stream processing architecture using:

  • Event Hubs (Ingest / Immutable Log) with Kafka endpoint
  • Azure Databricks (Stream Process)
  • Cosmos DB (Serve)

Event Hubs + Azure Databricks + Delta

Implement a stream processing architecture using:

  • Event Hubs (Ingest / Immutable Log)
  • Azure Databricks (Stream Process)
  • Delta Tables (Serve)

Event Hubs + Azure Functions + Azure SQL

Implement a stream processing architecture using:

  • Event Hubs (Ingest / Immutable Log)
  • Azure Functions (Stream Process)
  • Azure SQL (Serve)

Event Hubs + Azure Functions + Cosmos DB

Implement a stream processing architecture using:

  • Event Hubs (Ingest / Immutable Log)
  • Azure Functions (Stream Process)
  • Cosmos DB (Serve)

Event Hubs + Stream Analytics + Cosmos DB

Implement a stream processing architecture using:

  • Event Hubs (Ingest / Immutable Log)
  • Stream Analytics (Stream Process)
  • Cosmos DB (Serve)

Event Hubs + Stream Analytics + Azure SQL

Implement a stream processing architecture using:

  • Event Hubs (Ingest / Immutable Log)
  • Stream Analytics (Stream Process)
  • Azure SQL (Serve)

Event Hubs + Stream Analytics + Event Hubs

Implement a stream processing architecture using:

  • Event Hubs (Ingest / Immutable Log)
  • Stream Analytics (Stream Process)
  • Event Hubs (Serve)

HDInsight Kafka + Azure Databricks + Azure SQL

Implement a stream processing architecture using:

  • HDInsight Kafka (Ingest / Immutable Log)
  • Azure Databricks (Stream Process)
  • Azure SQL Data Warehouse (Serve)

Event Hubs + Azure Data Explorer

Implement a stream processing architecture using:

  • Event Hubs (Ingest / Immutable Log)
  • Azure Data Explorer (Stream Process / Serve)

Event Hubs + Data Accelerator + Cosmos DB

Implement a stream processing architecture using:

  • Event Hubs (Ingest / Immutable Log)
  • Microsoft Data Accelerator on HDInsight and Service Fabric (Stream Process)
  • Cosmos DB (Serve)

Note

Performance and services change quickly in the cloud, so please keep in mind that all values used in the samples were tested at the moment of writing. If you find any discrepancies with what you observe when running the scripts, please create an issue to report it and/or create a PR to update the documentation and the sample. Thanks!

Roadmap

The following technologies could also be used in the end-to-end sample solution. If you want to contribute, feel free to do so; we'll be more than happy to get some help!

Ingestion

  • IoT Hub

Stream Processing

  • Azure Data Explorer

Batch Processing

  • Azure Data Explorer

Serving Layer

  • Azure Data Explorer
  • Azure DW

streaming-at-scale's People

Contributors

algattik, carlbrochu, chetanmsft, dubansal, jcocchi, rasavant-ms, supernova-eng, yorek
