Giter VIP home page Giter VIP logo

dbt_artifacts's Introduction

dbt Artifacts Package

This package builds a mart of tables from dbt artifacts loaded into a table. It is compatible with Snowflake only. The models are based off of the v1 schema introduced in dbt 0.19.0.

Models included:

  • dim_dbt__models
  • fct_dbt__model_executions
  • fct_dbt__latest_full_model_executions
  • fct_dbt__critical_path
  • fct_dbt_run_results

The critical path model determines the slowest route through your DAG, which provides you with the information needed to make a targeted effort to reducing dbt run times. For example:

Critical Path

Installation

  1. Add this package to your packages.yml following these instructions

  2. Configure the following variables in your dbt_project.yml:

vars:
  dbt_artifacts:
    dbt_artifacts_database: your_db # optional, default is your target database
    dbt_artifacts_schema: your_schema # optional, default is 'dbt_artifacts'
    dbt_artifacts_table: your_table # optional, default is 'artifacts'

models:
  ...
  dbt_artifacts:
    +schema: your_destination_schema
    staging:
      +schema: your_destination_schema

Note that the model materializations are defined in this package's dbt_project.yml, so do not set them in your project.

  1. Run dbt deps.

Generating the source table

This package requires that the source data exists in a table in Snowflake.

Option 1: Loading local files

Snowflake makes it possible to load local files into your warehouse. We've included a number of macros to assist with this. This method can be used by both dbt Cloud users, and users of other orchestration tools.

  1. To initially create these tables, execute dbt run-operation create_artifact_resources (source). This will create a stage and a table named {{ target.database }}.dbt_artifacts.artifacts โ€” you can override this name using the variables listed in the Installation section, above.

  2. Add operations to your production run to load files into your table, via the upload_artifacts macro (source). You'll need to specify which files to upload through use of the --args flag. Here's an example setup.

$ dbt  seed
$ dbt  run-operation upload_dbt_artifacts --args '{filenames: [manifest, run_results]}'

$ dbt  run
$ dbt  run-operation upload_dbt_artifacts --args '{filenames: [manifest, run_results]}'

$ dbt  test
$ dbt  run-operation upload_dbt_artifacts --args '{filenames: [run_results]}'

$ dbt  source snapshot-freshness
$ dbt  run-operation upload_dbt_artifacts --args '{filenames: [sources]}'

$ dbt  docs generate
$ dbt  run-operation upload_dbt_artifacts --args '{filenames: [catalog]}'

Option 2: Loading cloud-storage files

If you are using an orchestrator, you might instead upload these files to cloud storage โ€” the method to do this will depend on your orchestrator. Then, link the cloud storage destination to a Snowflake external stage, and use a snowpipe to copy these files into the source table:

copy into ${snowflake_table.dbt_artifacts.database}.${snowflake_table.dbt_artifacts.schema}.${snowflake_table.dbt_artifacts.name}
from (
    select
    $1 as data,
    $1:metadata:generated_at::timestamp_tz as generated_at,
    metadata$filename as path,
    regexp_substr(metadata$filename, '([a-z_]+.json)$') as artifact_type
    from @${snowflake_stage.dbt_artifacts.database}.${snowflake_stage.dbt_artifacts.schema}.${snowflake_stage.dbt_artifacts.name}
)
file_format = (type = 'JSON')

Usage

The models will be picked up on your next dbt run command. You can also run the package specifically with dbt run -m dbt_artifacts.

Acknowledgements

Thank you to Tails.com for initial development and maintenance of this package. On 2021/12/20, the repository was transferred from the Tails.com GitHub organization to Brooklyn Data Co.

The macros in this package have been adapted from code shared by Kevin Chan and Jonathan Talmi of Snaptravel.

Thank you for sharing your work with the community!

dbt_artifacts's People

Contributors

niallrees avatar alanmcruickshank avatar clrcrl avatar nbandre avatar gaurav-atheon avatar gavh3 avatar jsnb-devoted avatar jtcohen6 avatar kon-mal avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.