OpenAQ streamer

Build

  • Build the Docker image
make image
  • Run the Docker container
docker run --rm -p 8888:8888 -p 4040:4040 -v mountPoint:/home/jovyan/work balis/openaq-streamer

Usage

  1. Create an AWS SQS queue and subscribe it to the SNS topic arn:aws:sns:us-east-1:470049585876:OPENAQ_NEW_MEASUREMENT.

  2. To reduce the volume of data, go to the created SNS subscription and set up a filter policy, e.g.:

{
  "country": [
    "PL"
  ]
}
  3. Run a Spark Structured Streaming query. Python example:
# read AWS credentials from the 'credentials' file (format as generated by 'aws configure')
# Note: the file needs to be copied to the directory mounted in the Docker container
# Alternatively, environment variables can be passed to the container via 'docker run --env AWS_ACCESS_KEY=...'

import configparser
config = configparser.ConfigParser()

config.read('credentials')
access_key = config.get('default', 'aws_access_key_id')
secret_key = config.get('default', 'aws_secret_access_key')
session_token = config.get('default', 'aws_session_token')

from pyspark.sql import SparkSession
spark = SparkSession\
    .builder\
    .config("spark.sql.streaming.schemaInference", True)\
    .getOrCreate()

# insert your SQS queue URL
queue_url = "https://sqs.us-east-1.amazonaws.com/..."

stream = spark\
    .readStream\
    .format("sqs")\
    .option("queueUrl", queue_url)\
    .option("accessKey", access_key)\
    .option("secretKey", secret_key)\
    .option("sessionToken", session_token)\
    .option("region", "us-east-1")\
    .load()

# print selected measurement fields to the console as they arrive
stream.select("city", "parameter", "value", "date.local").writeStream\
    .format("console")\
    .outputMode("append")\
    .start()
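
The example above reads aws_session_token unconditionally, so it fails when the credentials file holds long-lived keys without a token. The following is a minimal sketch of a more forgiving loader, not part of this project: the helper name load_aws_credentials is hypothetical, and the environment variable names AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_SESSION_TOKEN are an assumption based on the standard AWS naming (the comment in the example mentions AWS_ACCESS_KEY instead). It reads the file when the profile exists and otherwise falls back to the environment.

```python
import configparser
import os


def load_aws_credentials(path="credentials", profile="default", env=None):
    """Return (access_key, secret_key, session_token); the token may be None.

    Hypothetical helper: reads the profile from the credentials file if present,
    otherwise falls back to environment variables (names are an assumption).
    """
    env = os.environ if env is None else env
    config = configparser.ConfigParser()
    config.read(path)  # a missing file is skipped silently

    if config.has_section(profile):
        access_key = config.get(profile, "aws_access_key_id")
        secret_key = config.get(profile, "aws_secret_access_key")
        # only temporary credentials carry a session token
        session_token = config.get(profile, "aws_session_token", fallback=None)
        return access_key, secret_key, session_token

    # fallback: variables passed to the container, e.g. with 'docker run --env'
    return (
        env["AWS_ACCESS_KEY_ID"],
        env["AWS_SECRET_ACCESS_KEY"],
        env.get("AWS_SESSION_TOKEN"),
    )
```

With such a helper, the three values can be unpacked with access_key, secret_key, session_token = load_aws_credentials(), and the sessionToken option would be set on the reader only when session_token is not None.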
