Building a Mature dbt Project from Scratch (for BigQuery)

Some changes were made to dbt-labs/dbt-project-maturity for BigQuery connection.

How to use

  1. Install dbt-bigquery following the instructions
  2. If this is the first time you have installed dbt on your machine, copy ./.dbt/ to your home directory:

cp -r .dbt ~

     Otherwise, add the contents of ./.dbt/profiles.yml to your existing ~/.dbt/profiles.yml
  3. Request BigQuery project permission from me and log in upon first connection, or create your own BigQuery project with data from 0-raw-data/data
  4. Have fun going through the levels.
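
The profile step above can be sketched as a small shell helper. This is a minimal sketch and not part of the repo: `install_profile` is a hypothetical function name, and the example call assumes you are in the repo root, where ./.dbt/profiles.yml lives. It copies the bundled profile only when no profiles.yml exists at the destination, so an existing profile is never overwritten.

```shell
#!/bin/sh
# install_profile SRC_DIR DEST_DIR
# Hypothetical helper (not part of the repo): copies SRC_DIR/profiles.yml
# into DEST_DIR unless DEST_DIR already holds a profiles.yml.
install_profile() {
  src_dir="$1"    # directory holding the bundled profile (./.dbt in this repo)
  dest_dir="$2"   # where dbt looks for profiles, usually "$HOME/.dbt"

  if [ -f "$dest_dir/profiles.yml" ]; then
    # Never overwrite an existing profile; it has to be merged by hand.
    echo "Found $dest_dir/profiles.yml: add the contents of $src_dir/profiles.yml to it manually."
    return 1
  fi

  mkdir -p "$dest_dir"
  cp "$src_dir/profiles.yml" "$dest_dir/profiles.yml"
}

# Example call from the repo root:
#   install_profile ./.dbt "$HOME/.dbt"
```

Once the profile is in place, running `dbt debug` inside one of the stage projects is a quick way to check that dbt can reach BigQuery.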

The following contents are from the original repo



Hello! This is the companion repo to the 2021 Coalesce Talk - Building a Mature dbt Project from Scratch

Introduction

With the explosion in popularity of dbt, and the coinciding explosion in features and capabilities in the tool, it's natural for many of us to find ourselves unsure of where to start. Many people come across dbt through a recommendation of a particularly powerful feature that dbt can support, like complex macros or intricate incremental model logic, but it's both intimidating and unwise to dive directly into the deep end. Like with any tool, it's best to walk before you run, and learn how these features both complement and build on each other so you can be confident you've developed a strong, sustainable, and scalable dbt project.

Purpose of this Repo

The goal of this repository is to show a single dbt project at different lifecycle stages, giving an opinionated view of when to introduce certain dbt features into your project. Each stage has a particular theme/purpose, and the listed feature sets connect to that learning goal. This is intended to be both a resource for new dbt users to use as a jumping-off point for starting a new project from scratch, and a rubric for existing dbt users to peg their own use of dbt features against this model to find opportunities for growth.

In each stage listed below (and in the accompanying talk), you'll see:

  1. A theme/purpose for the life stage
  2. Features relevant to the stage (with links to the relevant dbt docs)
  3. A picture of the DAG of the example project in that stage
  4. Links to channels on the dbt Community Slack that would be of interest!

Some caveats and assumptions:

  • There are real life use cases where some features get introduced into projects out of the order described here, and that is perfectly reasonable. There are often very justifiable reasons to introduce more advanced dbt features earlier in the development cycle.
  • There is no sense of timescale in this presentation! Some teams may mature their project in weeks rather than months, depending on a wide range of factors. It's more important to think about how features build upon themselves (and each other) rather than how quickly they do so.
  • This presentation assumes familiarity and comfort with git and version control, and that all of the projects are already managed in a repository.

Projects

Each project is built on a mock data set of patients, doctors, claims, and other billing data. It was generated via the Mockaroo API. Huge hat-tip to @krevitt for building a sweet G-sheet x Mockaroo integration! In the 0-raw-data project, you can find the sample dataset this was built from, so you can load it into your warehouse and run each project to get a feel for how the functionality works!

Infancy

Congratulations! It's (sorta!) a DAG!!

This project truly represents the bare minimum needed to have dbt do anything of use. It's really only technically a dbt project, and it is going to need a lot of hand-holding to do anything useful and stay alive.

Theme: 🍼 Bare Necessities 🧷

Features

Relevant Commands

  • dbt seed
  • dbt run

DAG

[Image: DAG of the Infancy project]

Relevant Community Slack Channels

  • #advice-dbt-for-beginners

Toddlerhood

This project is just starting to play with its blocks, and see how the world fits together. It can now handle multiple models, and it's able to see the difference between raw and transformed data.

Theme: 🟩 Building Blocks 🟦

Features

  • Models
    • adds {{ ref() }} functionality! Modularize your model!
  • Sources
    • uses {{ source() }} functionality, builds a layer of abstraction between source data and your transformations
  • dbt Macros
    • Start to understand some of the key built-in macros that make dbt work.
  • Docs
    • single model documentation for critical models
  • Tests
    • last-mile testing for final reporting objects
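
To make the features above concrete, here is a minimal sketch that scaffolds one source declaration, one staging model, and one downstream model with a description and tests. Everything named here is hypothetical and chosen only for illustration: the `scaffold_toddler_models` function, the `raw` source, the `claims` table, the model names, and the columns. The files are meant to be dropped into the `models/` directory of an existing dbt project; they are not copied from this repo.

```shell
#!/bin/sh
# scaffold_toddler_models PROJECT_DIR
# Hypothetical helper: writes example files under PROJECT_DIR/models.
# All source, table, model, and column names below are illustrative assumptions.
scaffold_toddler_models() {
  project_dir="$1"
  mkdir -p "$project_dir/models/staging" "$project_dir/models/marts" || return 1

  # Declare the raw table once, so models never hard-code its location.
  cat > "$project_dir/models/staging/sources.yml" <<'EOF'
version: 2
sources:
  - name: raw
    tables:
      - name: claims
EOF

  # Staging model: source() points at the table declared above.
  cat > "$project_dir/models/staging/stg_claims.sql" <<'EOF'
select
    claim_id,
    patient_id,
    claim_amount
from {{ source('raw', 'claims') }}
EOF

  # Downstream model: ref() tells dbt to build stg_claims first.
  cat > "$project_dir/models/marts/claims_summary.sql" <<'EOF'
select
    patient_id,
    sum(claim_amount) as total_claim_amount
from {{ ref('stg_claims') }}
group by patient_id
EOF

  # Documentation and last-mile tests for the final reporting model.
  cat > "$project_dir/models/marts/claims_summary.yml" <<'EOF'
version: 2
models:
  - name: claims_summary
    description: One row per patient with the total claimed amount.
    columns:
      - name: patient_id
        tests:
          - unique
          - not_null
EOF
}

# Example call:
#   scaffold_toddler_models ./my-dbt-project
```

Because `claims_summary` selects from `{{ ref('stg_claims') }}` rather than a hard-coded table name, dbt can infer the build order and draw the edge between the two models in the DAG.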

Relevant Commands

  • dbt seed
  • dbt run
  • dbt test
  • dbt docs generate
  • dbt docs serve
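
The commands above can be chained so that a failure stops the run early. The following is a minimal sketch: `run_stage` is a hypothetical helper, and the directory you pass it is an assumption about how your checkout is laid out. It runs the commands in the order listed above, inside a subshell so your working directory is left unchanged.

```shell
#!/bin/sh
# run_stage STAGE_DIR
# Hypothetical helper: runs the stage's dbt commands in order and stops
# at the first one that fails. The body runs in a subshell, so the cd
# does not affect the calling shell.
run_stage() (
  cd "$1" || exit 1
  dbt seed || exit 1    # load the CSV seed files into the warehouse
  dbt run  || exit 1    # build the models
  dbt test || exit 1    # run the tests against the built models
  dbt docs generate     # generate the documentation artifacts
)

# Example call (the directory name is an assumption):
#   run_stage ./toddlerhood
```

`dbt docs serve` is left out of the helper because it starts a local web server and keeps running until you stop it.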

DAG

[Image: DAG of the Toddlerhood project]

Relevant Community Slack Channels

  • #advice-dbt-for-beginners
  • #advice-data-testing

Childhood

Now we're starting to let our project out into the world. Time to set some ground rules! You wouldn't send your project to school without a list of allergies, so it's time to let people know how they should be interacting with your project.

Theme: 🏗️ Structure and Rules 📏

Features

Relevant Commands

  • dbt compile
  • dbt seed
  • dbt run
  • dbt test
  • dbt build
  • dbt docs generate
  • dbt docs serve

Relevant Community Slack Channels

  • #advice-dbt-for-beginners
  • #advice-data-testing
  • #advice-data-modeling

DAG

[Image: DAG of the Childhood project]

Adolescence

Look at your beautiful project, all grown up, about to go to prom. At this stage, your project is learning things fast, and is looking to figure out ways to work smarter, not harder (so it can spend more time at 7-Eleven with its friends).

Theme: 🏋️ Growth and Optimization 🚀

Features

Relevant Commands

  • dbt deps
  • dbt compile
  • dbt seed
  • dbt run
  • dbt test
  • dbt build
  • dbt docs generate
  • dbt docs serve

DAG

[Image: DAG of the Adolescence project]

Relevant Community Slack Channels

  • #advice-dbt-for-beginners
  • #advice-data-testing
  • #advice-data-modeling
  • #advice-dbt-for-power-users
  • Relevant tool-specific channels (e.g. #tools-looker, #tools-meltano)

Adulthood

By the time your project reaches adulthood, the basics of dbt should be humming along just fine, and that should buy it time to think back on its life, look inward, and figure out how it fits into the world. How has your project grown and changed? How does it relate to the world around it?

Theme: 📓 Self Reflection 🔬

Features

Relevant Commands

  • dbt deps
  • dbt compile
  • dbt source freshness
  • dbt seed
  • dbt run
  • dbt test
  • dbt build
  • dbt run-operation
  • dbt docs generate
  • dbt docs serve

DAG

[Image: DAG of the Adulthood project]

Relevant Community Slack Channels

  • #advice-dbt-for-beginners
  • #advice-data-testing
  • #advice-data-modeling
  • #advice-dbt-for-power-users
  • Relevant tool-specific channels (e.g. #tools-looker, #tools-meltano, #db-snowflake)
  • #towards-analytics-engineering
  • #metadata

These things are advanced level (middle aged?)!

Omitted Features

Some features are not included in this project, not because they are unimportant, but because they generally are only used as-needed when the specifics of your data/project call for it.

Contributors

dave-connors-3, na399, patkearns10
