Mk

Description

Given the problem, I opted for an Airflow-based solution. It may seem excessive for a single job, but Airflow could handle all of a company's workflow-based processes.

This solution is only a POC intended to be run locally.

For a production environment, I would have used Terraform to deploy Airflow and the service (the service image would have been hosted in a private repository). The DAGs would have been synchronized with the Git repository.

I decided to store all stats in a PostgreSQL database and schedule an Airflow task to aggregate the data in a second table (stats).
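As a rough illustration of the aggregation step, the sketch below recomputes per-name aggregates from a raw table into the stats table. It uses an in-memory SQLite database as a stand-in for PostgreSQL so that it runs anywhere. The raw table name (`raw_stats`), all column names, and the choice of aggregates (count and average) are assumptions made for this example, not the project's actual schema.

```python
import sqlite3


def aggregate_stats(conn):
    """Recompute per-name aggregates from raw_stats into stats."""
    # Hypothetical schema: raw_stats(name, value) -> stats(name, sample_count, avg_value).
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM stats")
        conn.execute(
            "INSERT INTO stats (name, sample_count, avg_value) "
            "SELECT name, COUNT(*), AVG(value) FROM raw_stats GROUP BY name"
        )


# In-memory SQLite stands in for the PostgreSQL database of the POC.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_stats (name TEXT, value REAL)")
conn.execute(
    "CREATE TABLE stats (name TEXT PRIMARY KEY, sample_count INTEGER, avg_value REAL)"
)
conn.executemany(
    "INSERT INTO raw_stats VALUES (?, ?)",
    [("cpu", 1.0), ("cpu", 3.0), ("mem", 10.0)],
)

aggregate_stats(conn)
rows = conn.execute(
    "SELECT name, sample_count, avg_value FROM stats ORDER BY name"
).fetchall()
print(rows)
```

In an Airflow task, a function like `aggregate_stats` would be called on each scheduled run with a real PostgreSQL connection.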

A separate cleanup job could delete rows older than, for example, 10 minutes from the non-aggregated table.
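A cleanup of this kind could look roughly like the sketch below. It again uses in-memory SQLite in place of PostgreSQL, and the table name (`raw_stats`), the `created_at` column, and its storage as a Unix timestamp are assumptions for illustration only; a PostgreSQL table would more likely use a timestamp column.

```python
import sqlite3


def delete_old_rows(conn, now_ts, max_age_seconds=600):
    """Delete raw rows older than max_age_seconds and return how many were removed."""
    cutoff = now_ts - max_age_seconds
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "DELETE FROM raw_stats WHERE created_at < ?", (cutoff,)
        )
    return cursor.rowcount


# Hypothetical raw table; created_at holds a Unix timestamp in seconds.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_stats (name TEXT, value REAL, created_at REAL)")

now_ts = 1_000_000.0  # fixed "current time" so the example is deterministic
conn.executemany(
    "INSERT INTO raw_stats VALUES (?, ?, ?)",
    [
        ("cpu", 1.0, now_ts - 900),  # 15 minutes old, should be deleted
        ("cpu", 3.0, now_ts - 60),   # 1 minute old, should be kept
    ],
)

deleted = delete_old_rows(conn, now_ts)
remaining = conn.execute("SELECT COUNT(*) FROM raw_stats").fetchone()[0]
print(deleted, remaining)
```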

The API workers could scale behind a load balancer, and so could the Celery workers (this article looked promising: https://medium.com/back-market-engineering/how-to-improve-scalability-for-celery-on-kubernetes-16280ce547fb).

The docker-compose stack will show a connection error with PostgreSQL at startup, but the pod will retry. A better way to handle this would have been a health check that prevents the container from starting until the connection succeeds.
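The retry behaviour can also be made explicit on the application side. The sketch below is not the compose-level health check mentioned above but a complementary idea: a small helper that keeps calling a connection function until it succeeds or the attempts run out. The helper name (`wait_for`), its parameters, and the fake connection function are all hypothetical; with a real driver, `retry_on` would be set to that driver's connection error type.

```python
import time


def wait_for(connect, attempts=10, delay=1.0, retry_on=(ConnectionError,), sleep=time.sleep):
    """Call connect() until it succeeds, waiting `delay` seconds between failures."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except retry_on as exc:
            last_error = exc
            if attempt < attempts:
                sleep(delay)
    raise TimeoutError(f"service not reachable after {attempts} attempts") from last_error


# Fake connection function that fails twice before succeeding,
# standing in for a database that is still starting up.
calls = {"n": 0}


def flaky_connect():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("connection refused")
    return "connected"


result = wait_for(flaky_connect, delay=0.0)
print(result, calls["n"])
```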

This POC also lacks tests and logging.

Requirements

- Minikube - Tested with:
    minikube version: v1.21.0
    commit: 76d74191d82c47883dc7e1319ef7cebd3e00ee11
- Helm - Tested with:
    Version:"v3.6.1", GitCommit:"61d8e8c4a6f95540c15c6a65f36a6dd0a45e7a2f"
- Docker - Tested with:
    Docker version 20.10.7, build f0df350

How To

Start the stack

    make start

make start will start the minikube server with the DAGs mount point and install Airflow via Helm. The docker-compose stack will then be started.

Once the stack is up, make forward_port will make the Airflow UI available at localhost:8000.

The stats image must be built before the KubernetesPodOperator can use it: make build_image.

Stop the stack

    make stop

make stop will stop the docker-compose stack, delete Airflow from minikube then delete the minikube cluster.

Build the ingestor Docker image

    make build_image

This command will synchronize the local and minikube Docker registries. The built image will thus be available to the KubernetesPodOperator.

Note: In practice, use two terminals: one to start the stack and one to build the image run by the KubernetesPodOperator.

Endpoints

- localhost:8000: Airflow WebUI (the port needs to be forwarded first)
- localhost:80/stats POST
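As a hedged example of calling the stats endpoint, the sketch below builds a JSON POST request with the standard library. The payload fields (`name`, `value`) and the JSON content type are assumptions, since the expected request body is not documented here; only the URL and the POST method come from the list above.

```python
import json
import urllib.request


def build_stats_request(payload, base_url="http://localhost:80"):
    """Build (but do not send) a JSON POST request for the /stats endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/stats",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical payload; adapt the fields to what the service actually expects.
request = build_stats_request({"name": "cpu", "value": 1.0})
print(request.get_method(), request.full_url)

# To send it once the stack is running:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```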
