dbt_bigquery's Introduction

dbt and Google BigQuery

This repository contains a dbt project that reads data from one Google BigQuery project, transforms it, and publishes the result as a view in a different project. The ETL pipeline has three layers: source, staging, and mart. Generic and singular dbt tests check that the data is clean and processed correctly, and a GitLab CI pipeline checks that the code compiles.
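As a rough illustration of the compile check mentioned above, a GitLab CI job could look like the sketch below. This is not the repository's actual pipeline: the job name, stage name, Python image tag, and the location of profiles.yml are all assumptions, and the service account credentials would have to be supplied through CI/CD variables.

```yaml
# .gitlab-ci.yml (hypothetical sketch, not the repository's actual file)
stages:
  - compile                    # assumed stage name

dbt_compile:                   # assumed job name
  stage: compile
  image: python:3.11           # assumed image; any image with Python and pip should work
  before_script:
    - pip install dbt-bigquery
    - cd assignment1_dbt       # project directory named in this README
    - dbt deps                 # install the packages listed in packages.yml
  script:
    # Assumes a profiles.yml is available in the project directory;
    # the job fails if the project does not compile.
    - dbt compile --profiles-dir .
```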

Prerequisites

Requirements for the software and other tools to build, test and push

Installation and Setup

  1. Clone the repository. Create a virtual environment (preferred) before completing the next steps.

  2. To install dbt-bigquery, follow the instructions here.

  3. To install dbt-utils and dbt-expectations, make sure the packages.yml file is in the project root folder, then run the following command in the terminal:

     dbt deps
    
  4. To connect dbt to your Google Cloud project using a service account file, follow the instructions here. This step will generate a profiles.yml file.

  5. For this project, set the project to analytics-data and the dataset to dbt_your_name (replace your_name with your name). Obtain the service account file from your project admin.
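For orientation, the two configuration files from steps 3 to 5 might look roughly like the sketches below. The version ranges, the profile name, the thread count, and the keyfile path are assumptions for illustration; use the values in the repository's own packages.yml and dbt_project.yml.

```yaml
# packages.yml (sketch; the version ranges are assumptions)
packages:
  - package: dbt-labs/dbt_utils
    version: [">=1.0.0", "<2.0.0"]
  - package: calogica/dbt_expectations
    version: [">=0.10.0", "<0.11.0"]
```

```yaml
# profiles.yml (sketch; dbt looks for it in ~/.dbt/ by default)
dbt_bigquery:                  # hypothetical profile name; must match `profile:` in dbt_project.yml
  target: dev
  outputs:
    dev:
      type: bigquery
      method: service-account
      project: analytics-data  # project named in step 5
      dataset: dbt_your_name   # replace your_name with your name
      keyfile: /path/to/service-account.json   # placeholder path to the file from your project admin
      threads: 4               # assumed value
```

Running dbt debug from the project directory is a quick way to confirm that the profile and credentials are picked up.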

Model structure

The models directory is structured as follows:

models
├── staging
│   └── analytics_case
│       ├── sources.yml
│       ├── stg_commission.sql
│       ├── stg_orders.sql
│       └── stg_tables.yml
└── marts
    └── sales
        ├── yearly_sales_and_gross_profit.sql
        └── yearly_sales_and_gross_profit.yml
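To give a sense of what the .yml files in the staging folder typically hold, here is a minimal sketch of a source declaration and generic tests. The source database, schema, table names, and the order_id column are hypothetical; the repository keeps these in separate files (sources.yml and stg_tables.yml), and they are combined here only for brevity.

```yaml
version: 2

sources:
  - name: analytics_case
    database: source-project-id   # hypothetical: the BigQuery project the raw data is read from
    schema: raw_dataset           # hypothetical dataset name
    tables:
      - name: orders              # assumed raw table names
      - name: commission

models:
  - name: stg_orders
    columns:
      - name: order_id            # hypothetical column
        tests:                    # generic tests; newer dbt versions also accept `data_tests:`
          - unique
          - not_null
```

A staging model would then select from a declared table with `{{ source('analytics_case', 'orders') }}`, which is how dbt builds the lineage between the source and staging layers.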

Running the models and tests

Change to the assignment1_dbt directory.

  1. To run the model, run the following command:

     dbt run
    
  2. To test the model, run the following command:

     dbt test
    
  3. To run and test the model in one step, run the following command:

     dbt build
    

View Docs

To generate the dbt docs, run the following command:

     dbt docs generate

Once the catalog.json file has been created, run

     dbt docs serve

This serves the documentation locally, where you can browse the model descriptions, tests, and lineage.

Contributing

Please read CONTRIBUTING.md for details on the process for submitting pull requests.

Authors

sonuranjitjacob (contributor)