
data-engineering-stack's Introduction

Airflow Infrastructure

This repository aims to quickly set up an Airflow infrastructure so that everyone can test their DAG before deploying it to production.

The current infrastructure is based on the LocalExecutor (the scheduler, the webserver and the worker all run in the same container).

Installation

git clone git@github.com:etalab/data-engineering-stack.git
cd data-engineering-stack

# Create directories necessary for Airflow to work
./1_prepareDirs.sh

# Prepare .env file 
./2_prepare_env.sh
nano .env 
# Edit POSTGRES_USER ; POSTGRES_PASSWORD ; POSTGRES_DB ; AIRFLOW_ADMIN_MAIL ; AIRFLOW_ADMIN_FIRSTNAME ; AIRFLOW_ADMIN_NAME ; AIRFLOW_ADMIN_PASSWORD

# Launch services
docker-compose up --build -d

# After a few seconds, you can log in at http://localhost:8080, using the value of AIRFLOW_ADMIN_MAIL as login and the value of AIRFLOW_ADMIN_PASSWORD as password
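
The steps above rely on a complete .env file and on waiting "a few seconds" for the webserver. As a minimal sketch, the helpers below check that the seven variables listed above are present and non-empty, and retry a command until it succeeds. The function names `check_env_file` and `wait_until` are hypothetical and not part of this repository, and the `/health` URL in the usage comment assumes the Airflow webserver exposes its usual health endpoint.

```shell
# Variable names taken from the installation notes above.
REQUIRED_VARS="POSTGRES_USER POSTGRES_PASSWORD POSTGRES_DB AIRFLOW_ADMIN_MAIL AIRFLOW_ADMIN_FIRSTNAME AIRFLOW_ADMIN_NAME AIRFLOW_ADMIN_PASSWORD"

# check_env_file FILE (hypothetical helper)
# Prints every required variable that has no "NAME=<some text>" line in FILE.
# Returns 0 if all variables are set, 1 otherwise.
check_env_file() {
  env_file="$1"
  status=0
  for name in $REQUIRED_VARS; do
    if ! grep -Eq "^${name}=.+" "$env_file"; then
      echo "missing or empty: ${name}"
      status=1
    fi
  done
  return "$status"
}

# wait_until ATTEMPTS DELAY CMD... (hypothetical helper)
# Runs CMD up to ATTEMPTS times, sleeping DELAY seconds after each failure.
# Returns 0 as soon as CMD succeeds, 1 if it never does.
wait_until() {
  attempts="$1"
  delay="$2"
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    "$@" && return 0
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Possible usage (not executed here):
#   check_env_file .env && docker-compose up --build -d
#   wait_until 30 2 curl -fsS -o /dev/null http://localhost:8080/health
```

The check only looks for some text after the `=` sign, so a value such as `""` still counts as set; it is meant to catch forgotten variables, not to validate their content.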

Refresh DAGs

# Airflow can take a little while to pick up a newly created DAG. You can force a refresh with:
./refreshBagDags.sh

Connections

Connections can be created manually or with the createConn.py Python scripts (which use the Airflow API) found inside each project. You also need to add your SSH key to the ssh folder of the repository so that the container can see it in its /home/airflow/.ssh/ folder.
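
To illustrate what creating a connection through the Airflow API can look like, here is a minimal sketch using curl. It assumes an Airflow 2.x webserver with the stable REST API and basic authentication enabled; the createConn.py scripts in each project may work differently. The function names and the example connection values are hypothetical.

```shell
# build_connection_payload CONN_ID CONN_TYPE HOST LOGIN (hypothetical helper)
# Prints a JSON body for the connections endpoint of the Airflow REST API.
# No JSON escaping is done: values must not contain quotes or backslashes.
build_connection_payload() {
  printf '{"connection_id": "%s", "conn_type": "%s", "host": "%s", "login": "%s"}\n' \
    "$1" "$2" "$3" "$4"
}

# create_connection BASE_URL USER:PASSWORD PAYLOAD (hypothetical helper)
# Sends the payload to POST /api/v1/connections.
# Assumption: basic auth is enabled for the API on this Airflow instance.
create_connection() {
  curl -fsS -X POST "$1/api/v1/connections" \
    --user "$2" \
    -H "Content-Type: application/json" \
    -d "$3"
}

# Possible usage (not executed here; connection values are placeholders):
#   payload="$(build_connection_payload my_ssh ssh example.org airflow)"
#   create_connection http://localhost:8080 "$AIRFLOW_ADMIN_MAIL:$AIRFLOW_ADMIN_PASSWORD" "$payload"
```

Keeping the payload construction separate from the HTTP call makes it possible to inspect the JSON before anything is sent to the server.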

data-engineering-stack's People

Contributors

geoffreyaldebert, haekadi, pierlou, sixtedemaupeou
