Giter VIP home page Giter VIP logo

dbt-codegen's Introduction

dbt-codegen

Macros that generate dbt code, and log it to the command line.

Contents

Installation instructions

New to dbt packages? Read more about them here.

  1. Include this package in your packages.yml file โ€” check here for the latest version number:
packages:
  - package: dbt-labs/codegen
    version: X.X.X ## update to latest version here
  1. Run dbt deps to install the package.

Macros

generate_source (source)

This macro generates lightweight YAML for a Source, which you can then paste into a schema file.

Arguments

  • schema_name (required): The schema name that contains your source data
  • database_name (optional, default=target.database): The database that your source data is in.
  • table_names (optional, default=none): A list of tables that you want to generate the source definitions for.
  • generate_columns (optional, default=False): Whether you want to add the column names to your source definition.
  • include_descriptions (optional, default=False): Whether you want to add description placeholders to your source definition.
  • include_data_types (optional, default=False): Whether you want to add data types to your source columns definitions.
  • table_pattern (optional, default='%'): A table prefix / postfix that you want to subselect from all available tables within a given schema.
  • exclude (optional, default=''): A string you want to exclude from the selection criteria
  • name (optional, default=schema_name): The name of your source

Usage:

  1. Copy the macro into a statement tab in the dbt Cloud IDE, or into an analysis file, and compile your code
{{ codegen.generate_source('raw_jaffle_shop') }}

Alternatively, call the macro as an operation:

$ dbt run-operation generate_source --args 'schema_name: raw_jaffle_shop'

or

# for multiple arguments, use the dict syntax
$ dbt run-operation generate_source --args '{"schema_name": "jaffle_shop", "database_name": "raw", "table_names":["table_1", "table_2"]}'

Including data types:

$ dbt run-operation generate_source --args '{"schema_name": "jaffle_shop", "generate_columns": "true", "include_data_types": "true"}'
  1. The YAML for the source will be logged to the command line
version: 2

sources:
  - name: raw_jaffle_shop
    database: raw
    tables:
      - name: customers
        description: ""
      - name: orders
        description: ""
      - name: payments
        description: ""
  1. Paste the output in to a schema .yml file, and refactor as required.

generate_base_model (source)

This macro generates the SQL for a base model, which you can then paste into a model.

Arguments:

  • source_name (required): The source you wish to generate base model SQL for.
  • table_name (required): The source table you wish to generate base model SQL for.
  • leading_commas (optional, default=False): Whether you want your commas to be leading (vs trailing).
  • case_sensitive_cols (optional, default=False): Whether your source table has case sensitive column names. If true, keeps the case of the column names from the source.
  • materialized (optional, default=None): Set materialization style (e.g. table, view, incremental) inside of the model's config block. If not set, materialization style will be controlled by dbt_project.yml

Usage:

  1. Create a source for the table you wish to create a base model on top of.
  2. Copy the macro into a statement tab in the dbt Cloud IDE, or into an analysis file, and compile your code
{{ codegen.generate_base_model(
    source_name='raw_jaffle_shop',
    table_name='customers',
    materialized='table'
) }}

Alternatively, call the macro as an operation:

$ dbt run-operation generate_base_model --args '{"source_name": "raw_jaffle_shop", "table_name": "customers"}'
  1. The SQL for a base model will be logged to the command line
with source as (

    select * from {{ source('raw_jaffle_shop', 'customers') }}

),

renamed as (

    select
        id,
        first_name,
        last_name,
        email,
        _elt_updated_at

    from source

)

select * from renamed
  1. Paste the output in to a model, and refactor as required.

create_base_models (source)

This macro generates a series of terminal commands (appended with the && to allow for subsequent execution) that execute the base_model_creation bash script. This bash script will write the output of the generate_base_model macro into a new model file in your local dbt project.

Note: This macro is not compatible with the dbt Cloud IDE.

Arguments:

  • source_name (required): The source you wish to generate base model SQL for.
  • tables (required): A list of all tables you want to generate the base models for.

Usage:

  1. Create a source for the table you wish to create a base model on top of.
  2. Copy the macro into a statement tab into your local IDE, and run your code
dbt run-operation codegen.create_base_models --args '{source_name: my-source, tables: ["this-table","that-table"]}'

base_model_creation (source)

This bash script when executed from your local IDE will create model files in your dbt project instance that contain the outputs of the generate_base_model macro.

Note: This macro is not compatible with the dbt Cloud IDE.

Arguments:

  • source_name (required): The source you wish to generate base model SQL for.
  • tables (required): A list of all tables you want to generate the base models for.

Usage:

  1. Create a source for the table you wish to create a base model on top of.
  2. Copy the macro into a statement tab into your local IDE, and run your code
source dbt_packages/codegen/bash_scripts/base_model_creation.sh "source_name" ["this-table","that-table"]

generate_model_yaml (source)

This macro generates the YAML for a list of model(s), which you can then paste into a schema.yml file.

Arguments:

  • model_names (required): The model(s) you wish to generate YAML for.
  • upstream_descriptions (optional, default=False): Whether you want to include descriptions for identical column names from upstream models.

Usage:

  1. Create a model.
  2. Copy the macro into a statement tab in the dbt Cloud IDE, or into an analysis file, and compile your code
{{ codegen.generate_model_yaml(
    model_names=['customers']
) }}

You can use the helper function codegen.get_models and specify a directory and/or prefix to get a list of all matching models, to be passed into model_names list.

{% set models_to_generate = codegen.get_models(directory='marts', prefix='fct_') %}
{{ codegen.generate_model_yaml(
    model_names = models_to_generate
) }}

Alternatively, call the macro as an operation:

$ dbt run-operation generate_model_yaml --args '{"model_names": ["customers"]}'
  1. The YAML for a base model(s) will be logged to the command line
version: 2

models:
  - name: customers
    columns:
      - name: customer_id
        description: ""
      - name: customer_name
        description: ""
  1. Paste the output in to a schema.yml file, and refactor as required.

generate_model_import_ctes (source)

This macro generates the SQL for a given model with all references pulled up into import CTEs, which you can then paste back into the model.

Arguments:

  • model_name (required): The model you wish to generate SQL with import CTEs for.
  • leading_commas (optional, default = false): Whether you want your commas to be leading (vs trailing).

Usage:

  1. Create a model with your original SQL query
  2. Copy the macro into a statement tab in the dbt Cloud IDE, or into an analysis file, and compile your code
{{ codegen.generate_model_import_ctes(
    model_name = 'my_dbt_model'
) }}

Alternatively, call the macro as an operation:

$ dbt run-operation generate_model_import_ctes --args '{"model_name": "my_dbt_model"}'
  1. The new SQL - with all references pulled up into import CTEs - will be logged to the command line
with customers as (

    select * from {{ ref('stg_customers') }}

),

orders as (

    select * from {{ ref('stg_orders') }}

),

payments as (

    select * from {{ ref('stg_payments') }}

),

customer_orders as (

    select
        customer_id,
        min(order_date) as first_order,
        max(order_date) as most_recent_order,
        count(order_id) as number_of_orders
    from orders
    group by customer_id

),

customer_payments as (

    select
        orders.customer_id,
        sum(amount) as total_amount
    from payments
    left join orders on
         payments.order_id = orders.order_id
    group by orders.customer_id

),

final as (

    select
        customers.customer_id,
        customers.first_name,
        customers.last_name,
        customer_orders.first_order,
        customer_orders.most_recent_order,
        customer_orders.number_of_orders,
        customer_payments.total_amount as customer_lifetime_value
    from customers
    left join customer_orders
        on customers.customer_id = customer_orders.customer_id
    left join customer_payments
        on  customers.customer_id = customer_payments.customer_id

)

select * from final
  1. Replace the contents of the model's current SQL file with the compiled or logged code

dbt-codegen's People

Contributors

clrcrl avatar graciegoheen avatar davesgonechina avatar joellabes avatar matt-winkler avatar erkanncelen avatar angelica-lastra avatar jtcohen6 avatar djbelknapaw avatar dbeatty10 avatar jtalmi avatar sjrusso8 avatar rahulj51 avatar maayan-s avatar kbrock91 avatar jaypeedevlin avatar fivetran-joemarkiewicz avatar gsokol avatar dave-connors-3 avatar calvingiles avatar bodschut avatar b-per avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.