This repository contains a dbt project that reads data from a Google BigQuery project and runs an ETL pipeline that publishes the result as a view in a different project. The data is extracted, transformed and loaded into the required format using dbt. There are three layers: source, staging and mart. The project also implements generic and singular dbt tests to ensure the data is clean and processed correctly, and a GitLab CI pipeline that checks that the code compiles.
Requirements for the software and other tools to build, test and push
- dbt-bigquery
- dbt-utils
- dbt-expectations
- Service account file for a Google Cloud project with the right IAM roles
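Of the requirements above, dbt-utils and dbt-expectations are dbt packages, so they are declared in packages.yml and installed with dbt deps. A minimal sketch of such a file is shown here; the version ranges are assumptions for illustration and should be replaced by whatever the repository's own packages.yml pins.

```yaml
# packages.yml (sketch; version ranges are assumed, not taken from this repository)
packages:
  - package: dbt-labs/dbt_utils
    version: [">=1.0.0", "<2.0.0"]
  - package: calogica/dbt_expectations
    version: [">=0.10.0", "<0.11.0"]
```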
- Clone the repository. Create a virtual environment (preferred) before completing the next steps.
- To install dbt-bigquery, follow the instructions here.
- To install dbt-utils and dbt-expectations, make sure you have the packages.yml file in the root folder and run the following command in the terminal: dbt deps
- To connect dbt to your Google Cloud project using a service account file, follow the instructions here. This step will generate a profiles.yml file.
- For this project, set the project to analytics-data and the dataset to dbt_your_name (replace your_name with your name). Obtain the service account file from your project admin.
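For orientation, a profiles.yml for a service account connection typically looks like the sketch below. The profile name, the keyfile path and the thread count are assumptions; only the project and dataset values come from the steps above. The profile name must match the `profile:` entry in dbt_project.yml.

```yaml
# profiles.yml (sketch; usually generated by the connection setup step)
assignment1_dbt:                 # assumed profile name
  target: dev
  outputs:
    dev:
      type: bigquery
      method: service-account
      project: analytics-data    # project named in this README
      dataset: dbt_your_name     # replace your_name with your name
      keyfile: /path/to/service-account.json   # hypothetical path to the file from your project admin
      threads: 4                 # assumption; adjust as needed
```

Keep the service account file out of version control.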
The models directory is structured as follows:
models
├── staging
│   └── analytics_case
│       ├── sources.yml
│       ├── stg_commission.sql
│       ├── stg_orders.sql
│       └── stg_tables.yml
└── marts
    └── sales
        ├── yearly_sales_and_gross_profit.sql
        └── yearly_sales_and_gross_profit.yml
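The .yml files next to the models are where generic tests are usually declared. The sketch below shows what an entry in stg_tables.yml might look like; the column names and the chosen tests are hypothetical and are not taken from this repository.

```yaml
# stg_tables.yml (sketch; column names are hypothetical)
version: 2

models:
  - name: stg_orders
    description: "Staged orders (illustrative description)."
    columns:
      - name: order_id           # hypothetical column
        tests:                   # newer dbt versions also accept `data_tests:`
          - unique
          - not_null
      - name: order_amount       # hypothetical column
        tests:
          - dbt_expectations.expect_column_values_to_be_between:
              min_value: 0
```

Singular tests, by contrast, are standalone SQL files in the tests directory that return the rows violating an expectation.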
Change into the assignment1_dbt directory before running the commands below.
- To run the models, run the following command:
  dbt run
- To test the models, run:
  dbt test
- To run and test the models in one step, run:
  dbt build
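The GitLab CI pipeline mentioned at the top of this README checks that the project compiles. The repository's actual .gitlab-ci.yml is not reproduced here; the following is a minimal sketch of how such a job could be set up, assuming a Python image, a profiles.yml available in the project directory, and credentials supplied through CI/CD variables.

```yaml
# .gitlab-ci.yml (sketch; job name, image and profiles location are assumptions)
stages:
  - compile

dbt_compile:
  stage: compile
  image: python:3.11
  before_script:
    - pip install dbt-bigquery
    - cd assignment1_dbt
    - dbt deps                          # installs dbt-utils and dbt-expectations
  script:
    - dbt compile --profiles-dir .      # assumes profiles.yml sits in the project directory
```

Because dbt compile may need to connect to BigQuery, the service account credentials would have to be provided to the job, for example through a protected CI/CD variable rather than a committed file.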
To view the dbt docs, first generate them:
dbt docs generate
Once the catalog.json file has been created, run:
dbt docs serve
This serves the documentation locally, where you can view the model descriptions, tests and lineage.
Please read CONTRIBUTING.md for details on the process for submitting pull requests.
- Sonu Ranjit Jacob - sonuranjitjacob