dbt_template's Introduction

Cookiecutter dbt

Powered by Cookiecutter, Cookiecutter dbt is a framework for jumpstarting production-ready dbt projects quickly.

Features

  • For dbt >= 0.18.0
  • Works with Python 3.8
  • SQLFluff linting
  • Pre-built example models and tests

Installation

Using Cookiecutter CLI

pip install "cookiecutter>=1.7.0"
cookiecutter https://github.com/gabrieltoscani/dbt_template.git

Usage

  • The models/staging folder is where most of the data cleaning and business rules are applied.
  • The models/staging_marts folder is the staging area for the data marts. The tables that feed the Data Warehouse, such as fact and dimension tables, are built here.
  • Each model in models/marts is a plain select * from its staging_marts counterpart: if staging_marts contains stg_fact_table, the marts folder needs a matching model built as select * from {{ ref('stg_fact_table') }}
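The one-to-one pairing between staging_marts and marts can be sketched as a small generator script. This is an illustration only, not part of the template: the helper names and the rule of dropping a leading stg_ to name the marts model are assumptions, since the README does not say how the counterparts are named.

```python
from pathlib import Path
from typing import List


def marts_model_sql(staging_model: str) -> str:
    """Return the body of a marts model that mirrors one staging_marts model."""
    # Doubled braces in the f-string emit the literal {{ ... }} that dbt's Jinja expects.
    return f"select * from {{{{ ref('{staging_model}') }}}}\n"


def write_marts_models(project_dir: str) -> List[str]:
    """Create a marts counterpart for every stg_* model in models/staging_marts."""
    staging_dir = Path(project_dir) / "models" / "staging_marts"
    marts_dir = Path(project_dir) / "models" / "marts"
    marts_dir.mkdir(parents=True, exist_ok=True)

    created = []
    for source in sorted(staging_dir.glob("stg_*.sql")):
        staging_model = source.stem
        # Hypothetical naming rule: stg_fact_table -> fact_table.
        marts_name = staging_model[len("stg_"):]
        target = marts_dir / f"{marts_name}.sql"
        if not target.exists():  # never overwrite an existing model
            target.write_text(marts_model_sql(staging_model))
            created.append(target.name)
    return created
```

Calling write_marts_models(".") from the project root would add any missing marts models and return their file names. Running it a second time returns an empty list.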

Reasoning

In an ideal dbt pipeline, we want to run all models against a development schema, test them, and only then run them against the prod schema. In this structure, all models in the staging and staging_marts folders are tagged staging, and models in the marts folder are tagged prod. This way, we can run and test the whole staging layer with dbt run --models tag:staging and dbt test --models tag:staging.
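The run, test, then promote order could be scripted roughly as follows. This is a sketch under assumptions: the function names are hypothetical, the final dbt run --models tag:prod step is inferred from the prod tag described in this section rather than quoted from the README, and how each step is pointed at the development or prod schema is left to the project configuration.

```python
import subprocess
from typing import Callable, List


def promotion_commands() -> List[List[str]]:
    """Return the dbt invocations in the order the pipeline should run them."""
    return [
        ["dbt", "run", "--models", "tag:staging"],
        ["dbt", "test", "--models", "tag:staging"],
        # Inferred step: build the marts models once staging has passed its tests.
        ["dbt", "run", "--models", "tag:prod"],
    ]


def run_pipeline(runner: Callable = subprocess.run) -> None:
    """Run each command in order, stopping at the first failure."""
    for command in promotion_commands():
        # check=True raises CalledProcessError on a non-zero exit code,
        # so a failed staging run or test prevents the prod step from starting.
        runner(command, check=True)
```

The runner parameter exists only so the sequence can be exercised without dbt installed. In a CI job, run_pipeline() with the default runner would invoke the dbt CLI directly.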

Keeping the production tables in a separate environment and building them as simple selects from the staging area avoids running potentially heavy processes twice. It also removes the need to test again, because the tables are exactly the same as the ones already tested. To ensure every table exists in both the staging_marts and marts folders, we developed a test that compares the number of tables in the two schemas, which should be equal. This way, a CI/CD pipeline will not allow staging_marts and marts to diverge.
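The logic of that count comparison can be illustrated in a few lines. This is not the template's actual test, which presumably runs against the warehouse. It is a minimal sketch that assumes the table names of each schema have already been fetched into lists, and the function names are hypothetical.

```python
from typing import Iterable


def table_counts_match(staging_marts_tables: Iterable[str],
                       marts_tables: Iterable[str]) -> bool:
    """Return True when both schemas hold the same number of distinct tables."""
    return len(set(staging_marts_tables)) == len(set(marts_tables))


def check_schemas_in_sync(staging_marts_tables: Iterable[str],
                          marts_tables: Iterable[str]) -> None:
    """Raise an error suitable for failing a CI job when the counts differ."""
    staging_count = len(set(staging_marts_tables))
    marts_count = len(set(marts_tables))
    if staging_count != marts_count:
        raise AssertionError(
            f"staging_marts has {staging_count} tables, marts has {marts_count}"
        )
```

A count check only detects a missing or extra table. It would not notice a staging_marts model paired with the wrong marts model, so comparing names as well may be worth considering.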
