
dq-tools's People

Contributors

il-dat, paulg66

dq-tools's Issues

Migrate dbt metric to new semantic model

Describe the feature

Migrate dbt metric to the new definition here

Describe alternatives you've considered

Legacy dbt metric is still available up to dbt v1.5

Additional context

If we try to use the package within dbt v1.6, it will fail even when the metrics are disabled

Who will this benefit?

People who want to use dbt v1.6

Add Test Coverage

Describe the feature

The package should be able to report the Test Coverage of the project on each Invocation (and possibly on each Job).

Test Coverage will have new metrics:

  • Percentage of data scanned by tests (coverage_pct)
  • Ratio of Tests vs Columns (ratio_test_to_column)

Test Coverage should have an option to exclude models from its calculation.

Suggested formula:

🔢 Percentage of data scanned by tests (coverage_pct)
= (1) * (2) * (3)

  1. Average of (rows scanned by tests / total rows) across all columns scanned by tests
  2. Number of columns scanned by tests / total number of columns
  3. Number of tables scanned by tests / total number of tables

🔢 Ratio of Tests vs Columns (ratio_test_to_column)
= (1) / (2)

  1. Number of tests
  2. Number of columns

The following columns are required as input in dq_issue_log:

  • no_of_records
  • no_of_records_scanned 🆕
  • no_of_records_failed
  • no_of_table_columns 🆕
  • no_of_tables 🆕
  • test_unique_id 🆕
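
The suggested formulas can be sketched in a few lines of Python. This is only an illustration: the `ColumnScan` class and both function names are hypothetical, the sketch assumes one record per tested column, and how these values would actually be read from dq_issue_log is an assumption rather than something the issue specifies.

```python
from dataclasses import dataclass


@dataclass
class ColumnScan:
    """One tested column (hypothetical shape, loosely mirroring dq_issue_log)."""
    table: str
    column: str
    no_of_records: int          # total rows in the table
    no_of_records_scanned: int  # rows scanned by tests on this column


def coverage_pct(scans, no_of_table_columns, no_of_tables):
    """coverage_pct = (1) * (2) * (3) from the suggested formula."""
    if not scans or no_of_table_columns == 0 or no_of_tables == 0:
        return 0.0
    # (1) average share of rows scanned, over the tested columns
    row_ratios = [
        s.no_of_records_scanned / s.no_of_records
        for s in scans
        if s.no_of_records > 0
    ]
    avg_row_ratio = sum(row_ratios) / len(row_ratios) if row_ratios else 0.0
    # (2) number of tested columns / total number of columns
    column_ratio = len({(s.table, s.column) for s in scans}) / no_of_table_columns
    # (3) number of tested tables / total number of tables
    table_ratio = len({s.table for s in scans}) / no_of_tables
    return avg_row_ratio * column_ratio * table_ratio


def ratio_test_to_column(test_unique_ids, no_of_table_columns):
    """ratio_test_to_column = number of distinct tests / number of columns."""
    if no_of_table_columns == 0:
        return 0.0
    return len(set(test_unique_ids)) / no_of_table_columns


scans = [
    ColumnScan("orders", "id", no_of_records=100, no_of_records_scanned=100),
    ColumnScan("orders", "status", no_of_records=100, no_of_records_scanned=50),
]
print(coverage_pct(scans, no_of_table_columns=4, no_of_tables=2))  # 0.1875
print(ratio_test_to_column(["t1", "t2", "t3"], no_of_table_columns=4))  # 0.75
```

In the example, the two tested columns average 75% of rows scanned, cover 2 of 4 columns and 1 of 2 tables, so the product is 0.75 * 0.5 * 0.5 = 0.1875. Excluding models from the calculation would presumably mean filtering both the scans and the totals before calling these functions.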

[BUG] Accepted Values test failing when kpi_category is not one of the 6 main categories

Describe the bug

If a test is added whose kpi_category is not one of the six default KPI categories, the build fails because of this accepted-values test. The test allows neither "Other" nor any custom KPI category defined in a test.


Steps to reproduce

Add a test with a different kpi_category, example:
tests:
  - dq_tools.not_null_where_db:
      severity_level: error
      kpi_category: TEST

Expected results

The test should not fail

Actual results

The test fails

Screenshots and log output

16:09:21 On test.dq_tools.accepted_values_bi_column_analysis_kpi_category__Validity__Timeliness__Accuracy__Uniqueness__Completeness__Consistency.567ad2dc7e: select
      count(*) as failures,
      count(*) != 0 as should_warn,
      count(*) != 0 as should_error
    from (
   with all_values as (
    select
        kpi_category as value_field,
        count(*) as n_records
    from ANALYTICS_DEV.dbt_pgallagher.bi_column_analysis
    group by kpi_category
)
select *
from all_values
where value_field not in (
    'Validity','Timeliness','Accuracy','Uniqueness','Completeness','Consistency'
)
    ) dbt_internal_test
16:09:21 Opening a new connection, currently in state closed
16:09:22 SQL status: SUCCESS 1 in 0.0 seconds
16:09:22 Timing info for test.dq_tools.accepted_values_bi_column_analysis_kpi_category__Validity__Timeliness__Accuracy__Uniqueness__Completeness__Consistency.567ad2dc7e (execute): 16:09:21.747536 => 16:09:22.268494
16:09:22 On test.dq_tools.accepted_values_bi_column_analysis_kpi_category__Validity__Timeliness__Accuracy__Uniqueness__Completeness__Consistency.567ad2dc7e: Close
16:09:22 105 of 187 FAIL 1 accepted_values_bi_column_analysis_kpi_category__Validity__Timeliness__Accuracy__Uniqueness__Completeness__Consistency  [FAIL 1 in 0.64s]

System information

The contents of your packages.yml file:

  - package: infinitelambda/dq_tools
    version: [">=1.4.0", "<1.5.0"]

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • other (specify: ____________)

The output of dbt --version:

1.6

Additional context

All that is needed is to add "Other" to the list of accepted values in this test
(https://github.com/infinitelambda/dq-tools/blob/11b696b9f9c066183619563b9ca5a850bf3f50f0/models/03_mart/data-quality-score/bi_column_analysis.yml#L23C6-L23C6)
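
To make the failure mode concrete, here is a minimal Python sketch of the check that the logged query performs: group by kpi_category and keep the groups whose value is not in the accepted list. The function name `accepted_values_failures` is hypothetical, and the code only mimics the SQL shown in the log; it is not the package's implementation.

```python
from collections import Counter

# The six categories hard-coded in the logged query.
ACCEPTED = (
    "Validity", "Timeliness", "Accuracy",
    "Uniqueness", "Completeness", "Consistency",
)


def accepted_values_failures(values, accepted=ACCEPTED):
    """Return {value: n_records} for every value outside the accepted list."""
    counts = Counter(values)
    return {
        value: n
        for value, n in counts.items()
        # SQL `not in` never matches NULL, so None is skipped here as well.
        if value is not None and value not in accepted
    }


rows = ["Accuracy", "TEST", "Validity", "TEST"]
print(accepted_values_failures(rows))                         # {'TEST': 2}
print(accepted_values_failures(["Other"]))                    # {'Other': 1}
print(accepted_values_failures(["Other"], ACCEPTED + ("Other",)))  # {}
```

One failing group corresponds to the `FAIL 1` in the log. Under this logic, extending the list with "Other" lets "Other" through, but a custom value such as TEST would still be flagged unless it is also listed.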

Are you interested in contributing the fix?

I can update the test, but will need someone to deploy it.

Adding database configuration

  • Allow the database of the test result table to be configured with a new variable, dbt_dq_tool_database
  • Realign readme
  • Update tests
