Airflow Testing

This project contains different categories of tests with examples.

Five Categories of Tests

  1. DAG Validation Tests: Test the validity of the DAG, checking for typos and cycles.
  2. DAG/Pipeline Definition Tests: Test the total number of tasks in the DAG, the upstream and downstream dependencies of each task, etc.
  3. Unit Tests: Test the logic of custom Operators, custom Sensors, etc.
  4. Integration Tests: Test the communication between tasks. For example, task 1 passes some information to task 2 using XComs.
  5. End-to-End Pipeline Tests: Test and verify the integration between all tasks. You can also assert on the data after successful completion of the E2E pipeline.

Clone this repo to run these tests on your local machine.
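To make the first two categories concrete, here is a minimal, framework-free sketch of what a cyclicity check and a pipeline definition check assert. The UPSTREAM dictionary, the task names, and both helper functions are hypothetical and are not taken from this repo. In a real Airflow project the same assertions would typically be made against DAG objects loaded from your DAG folder rather than against a plain dictionary.

```python
from graphlib import CycleError, TopologicalSorter  # Python 3.9+

# Hypothetical pipeline: each task maps to the set of its direct upstream tasks.
UPSTREAM = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}


def is_acyclic(upstream):
    """Return True if the dependency graph contains no cycle."""
    try:
        # prepare() raises CycleError when the graph has a cycle.
        TopologicalSorter(upstream).prepare()
    except CycleError:
        return False
    return True


def downstream_of(upstream, task):
    """Return the tasks that list `task` as a direct upstream dependency."""
    return {name for name, deps in upstream.items() if task in deps}


# Category 1: the pipeline must not contain a cycle.
assert is_acyclic(UPSTREAM)
assert not is_acyclic({"a": {"b"}, "b": {"a"}})

# Category 2: task count and per-task dependencies match expectations.
assert len(UPSTREAM) == 3
assert UPSTREAM["load"] == {"transform"}
assert downstream_of(UPSTREAM, "extract") == {"transform"}
```

Keeping the expected task count and dependencies in the test means an accidental change to the pipeline wiring fails fast, before anything is scheduled.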

Unit Tests

Unit tests cover all tests that fall under the first four categories.

How to run?

  1. Build the Airflow image. Go to the project root directory and run:

    docker build . -t airflow-test

  2. Run the unit tests in the Docker container. Use the directory that contains your clone of the repository for {SourceDir}. (E.g., if you cloned the repo at /User/username/airflow-testing/, then SourceDir is /User/username.)

    docker run -ti -v {SourceDir}/airflow-testing:/opt --entrypoint /mnt/entrypoint.sh airflow-test run_unit_tests
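The {SourceDir} substitution is easy to get wrong, so the sketch below derives it from the clone location and assembles the same docker run command as an argument list. The function name docker_test_command and its parameters are hypothetical helpers for illustration. The image name, mount target, and entrypoint are copied from the command above.

```python
import shlex
from pathlib import PurePosixPath


def docker_test_command(repo_path, test_target, *extra_args):
    """Build the docker run argument list for a given clone location.

    repo_path   -- absolute path of the cloned repository
    test_target -- argument passed to the entrypoint, e.g. "run_unit_tests"
    extra_args  -- any further arguments appended after the target
    """
    repo = PurePosixPath(repo_path)
    source_dir = repo.parent  # this is {SourceDir} in the README
    return [
        "docker", "run", "-ti",
        "-v", f"{source_dir}/{repo.name}:/opt",
        "--entrypoint", "/mnt/entrypoint.sh",
        "airflow-test", test_target, *extra_args,
    ]


cmd = docker_test_command("/User/username/airflow-testing/", "run_unit_tests")
print(shlex.join(cmd))
```

Returning a list rather than a string lets you pass the result directly to subprocess.run without worrying about shell quoting.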

End-to-End Tests

End-to-End tests cover all tests of category five. To run these tests, we need to set up an Airflow environment in minikube, along with all the components required by your DAGs.

Minikube set up

Prerequisite:

Clone the repository, and install VirtualBox if you don't already have it:

    git clone https://github.com/chandulal/airflow-testing.git
    brew cask install virtualbox

Install minikube

    brew cask install minikube
    brew install kubernetes-cli
    minikube start --cpus 4 --memory 8192

Mount DAGs, Plugins, etc.

Mount all your DAGs, plugins, etc. in minikube:

    minikube mount {project dir}/src/main/python/:/data

Deploy Airflow in minikube

Open a new terminal. Go to the project root directory and run:

    kubectl apply -f airflow.kube.yaml

Wait 3-4 minutes for all Airflow components to start.

This will set up the following components:

  • Postgres (stores the Airflow metadata)
  • Redis (broker for the Celery executor)
  • Airflow Scheduler
  • Celery Workers
  • Airflow Web Server
  • Flower

Access Airflow

Get the minikube IP by running the minikube ip command, then use it to access:

**Airflow UI:** {minikube-ip}:31317 

**Flower:** {minikube-ip}:32081
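Instead of waiting a fixed 3-4 minutes, you can poll the ports above until they accept connections. The sketch below is one possible approach. The function name wait_for_port and its default timings are assumptions, not part of this repo. An open port only shows that something is listening, so it does not guarantee that Airflow has finished initializing.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=240.0, interval_s=5.0):
    """Poll until a TCP connection to host:port succeeds.

    Returns True once a connection is accepted, or False if timeout_s
    elapses first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect means a process is listening on the port.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

For example, wait_for_port(minikube_ip, 31317) would wait for the Airflow UI port, where minikube_ip is the address printed by minikube ip.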

How Airflow works in minikube?

(Figure: minikube Airflow architecture)

How to run these tests?

  1. Install all the components your DAGs need in minikube. The integration tests in this repo require MySQL and Presto on minikube.

     kubectl apply -f {SourceDir}/k8s/mysql/mysql.kube.yaml
     kubectl apply -f {SourceDir}/k8s/presto/presto.kube.yaml
     
  2. Run the integration tests in the Docker container. Use the absolute path of the directory that contains your clone of this repository for {SourceDir}, and the output of minikube ip for {minikube-ip}.

    docker run -ti -v {SourceDir}/airflow-testing:/opt --entrypoint /mnt/entrypoint.sh airflow-test run_integration_tests {minikube-ip}

Contributors

chandulal, svsarang
