dbt-datamocktool's Issues

Pattern for multiple test cases, avoiding repetition in the yaml file

Describe the feature

I would like to be able to define the dbt_datamocktool test using placeholder values, so multiple test cases can be captured by different input/output seeds, but without requiring repetition in the yaml file

Describe alternatives you've considered

I can achieve this by copying and pasting the entire dbt_datamocktool test block, applying the specific input mapping/expected output for each seed

Example:

models:
  - name: model_name_here
    tests:
      - dbt_datamocktool.unit_test:
          tags: 
            - topic_unit_test 
          input_mapping:
            ref('existing_model'): ref('existing_model__input__test_case_{{ i }}')
          expected_output: ref('expect__existing_model__test_case_{{ i }}')

So, for each test case I have to copy-and-paste the entire block only to replace {{i}} with the specific test case number

Additional context

N/A

Who will this benefit?

This benefits all users with real-world applications/complex logic.
By splitting the test into multiple test cases, it is easy to come up with the test data/output, as well as to identify which specific test/logic is failing.

Are you interested in contributing this feature?

I would love to contribute to extending this feature, though I'm not sure what the best way to approach it is.
I imagine it would be possible to parameterise the underlying datamocktool macro to receive an additional, optional parameter with a list of values, as sketched below.
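For illustration only, one possible shape for such a block (the test_cases parameter is made up here; nothing like it exists in the package today):

models:
  - name: model_name_here
    tests:
      - dbt_datamocktool.unit_test:
          tags:
            - topic_unit_test
          test_cases: [1, 2, 3]  # hypothetical parameter
          input_mapping:
            ref('existing_model'): ref("existing_model__input__test_case_{{ i }}")
          expected_output: ref("expect__existing_model__test_case_{{ i }}")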

Note: due to a current legacy project using dbt version 0.19.2, I am using datamocktool revision 0.1.2 (from this repo)

Allow `where:` functionality to work on tests

Describe the feature

dbt specifies that you can add where: to a test to filter the results that you are testing on. I wanted to do this with your cool mock tool so I gave it a go.
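This is roughly what I tried (the model and seed names here are placeholders), using dbt's standard test config block:

models:
  - name: my_model
    tests:
      - dbt_datamocktool.unit_test:
          input_mapping:
            ref('upstream_model'): ref('dmt__upstream_mock')
          expected_output: ref('dmt__expected_my_model')
          config:
            where: "created_at >= '2022-01-01'"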

You get an error

 Compilation Error in test dbt_datamocktool_xxxxx (models/gold/gold.yml)
12:12:58    'str object' has no attribute 'name'
12:12:58    
12:12:58    > in macro get_unit_test_sql (macros/dmt_get_test_sql.sql)
12:12:58    > called by macro test_unit_test (macros/dmt_unit_test.sql)
12:12:58    > called by test dbt_datamocktool_unit_test_scorecard_results_orb

I've tracked this down to when you call model.name within get_unit_test_sql. This is because dbt replaces the "model" parameter with a string with the where clause applied, like so:
(select * from <Table> where <where_clause>) dbt_subquery

It no longer has a name attribute.

Describe alternatives you've considered

It could be possible to override this functionality to take a where_clause parameter instead and handle it differently within the code (so where: would still not work)

Would then need to modify the code to call audit_helper's compare_queries instead in this case, with the compare_model being SELECT <cols> from <mocked_table> WHERE <where_clause>.

I can't think of an alternative that wouldn't involve a PR.

Additional context

I doubt it's DB specific, I am using Snowflake

Who will this benefit?

Anyone else who wants to only test certain results are present in the table we are mocking

Are you interested in contributing this feature?

Yes I could give this a go.

Warning on different lifecycles

Describe the bug

When running dbt seed or dbt run, you get a warning about missing sources.
They are missing because they have not yet been created.
The check should only happen on dbt test.

Steps to reproduce

delete db
run dbt seed

Expected results

no warnings

Actual results

[WARNING]: Test 'test.dbt_poc.dbt_datamocktool_unit_test_stg_loyalty_merchants_enrichment_ref_stg_swell_yotpo_platform_accounts___ref_stg_swell_merchants___ref_stg_swell_vip_tiers_settings___Example_of_unit_test__ref_dmt_expected_dim_opportunity___ref_dmt_loyalty_stg_swell_yotpo_platform_accounts___ref_dmt_loyalty_stg_swell_merchants___ref_dmt_loyalty_stg_swell_vip_tiers_settings___stg_loyalty_merchants_enrichment.5891bd3b17' (models/staging/loyalty/stg_loyalty_merchants_enrichment.yml) depends on a source named 'default.stg_swell_vip_tiers_settings' which was not found

Screenshots and log output

System information

The contents of your packages.yml file:

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • [x] other (specify: databricks)

The output of dbt --version:

0.21

Allow unit tests to work if the tested model's schema doesn't exist

Describe the feature

Currently unit tests work if the tested model's schema already exists. It would be helpful to extend this to work even if the model's schema doesn't exist so that unit tests can be run in "empty" databases. This would be beneficial for things like data warehouse testing pipelines that involve spinning up an isolated test database to validate model transformation code.

Describe alternatives you've considered

The test pipeline mentioned above could be made to work by first adding a subset of source data to the isolated test database and then running the models once. But this would add extra time and complexity to the pipeline that could potentially be mitigated.

Additional context

Currently the tested model's schema needs to exist because the mock_model_relation temp table is created in the same schema. I'm not exactly sure if the resulting "schema does not exist" error applies to databases other than MSSQL but I can do further investigation.

To prevent the error from happening, one option is make the temp table's schema the same as one of the mock input seed's schemas, because we know that the seed schemas must already exist at test time. This could be done with the following code:

{% set temp_schema = input_mapping.values() | list | first | attr("schema") %}
{% set mock_model_relation = mock_model_relation.incorporate(path={"schema": temp_schema}) %}

However, as mentioned by @mjirv, this modification would not work in situations where the input replacement is a macro or CTE.

Who will this benefit?

CI/CD pipelines might have an automated testing component that involves spinning up a lightweight database instance, seeding the mock input and expected output tables, and then running unit tests. These types of pipelines would benefit from the requested feature.

Are you interested in contributing this feature?

Yes, I am happy to contribute.

DBT-core 1.3 incompatibility

Describe the bug

dbt_datamocktool.unit_test does not compile after upgrading to dbt-core 1.3; dbt test results in a SQL compilation error.

Steps to reproduce

Upgrade dbt-core -> 1.3 and dbt test

Expected results

Would expect dbt test to run with success.

Actual results

SQL Compilation error.

Screenshots and log output

System information

The contents of your packages.yml file:

Which database are you using dbt with?
Snowflake

The output of dbt --version:

Core:
- installed: 1.3.0
- latest:      1.3.0 - Up to date!

Plugins:
- snowflake: 1.3.0 - Up to date!

Additional context

I'm quite certain that this is due to a new feature addition in dbt-core: python_model, which allows dbt run -m to execute files with .py extension.

The feature and the breaking changes were initially added here; the node attribute raw_sql was renamed to raw_code:
dbt-labs/dbt-core#5421

This leads to issues because the get_unit_test_sql() macro uses the original attribute to fetch the compiled SQL:

{% set ns.test_sql = ns.graph_model.raw_sql %}

A quick fix to verify is to switch to the new attribute:

{% set ns.test_sql = ns.graph_model.raw_code %}

Are you interested in contributing the fix?

Sure, though the quick fix breaks backwards compatibility.
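A rough backward-compatible sketch (assuming the same ns.graph_model namespace used by get_unit_test_sql) could fall back to the old attribute when the new one is missing:

{% set ns.test_sql = ns.graph_model.raw_code if ns.graph_model.raw_code is defined else ns.graph_model.raw_sql %}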

end-to-end tests

Describe the feature

Right now, AFAIK, it is only possible to unit test a single model. Have you ever considered developing an end-to-end test? Instead of having an input and output for the same model, you would provide an input for a model at the start of a data pipeline and an expected output for a model at the end of the pipeline.

Describe alternatives you've considered


Additional context


Who will this benefit?

I think the whole community could benefit from this feature.

Are you interested in contributing this feature?


unable to use model selection syntax in tests

Describe the bug

I'm unable to get these tests picked up using model selection syntax, but my tests do get picked up when I run the entire test suite.

Steps to reproduce

core/intermediate/schema.yml

- name: chat_message_windowed
  +tags: chat_messages
  tests:
    - dbt_datamocktool.unit_test:
        input_mapping:
          ref('dim__date'): ref('dim__date_seed')
          ref('chat_message_extended'): ref('chat_message_extended_seed')
        expected_output: ref('chat_message_windowed_seed')
  columns:
  - name: chat_id
    tests:
      - not_null
  - name: chat_message_id
    tests:
      - unique
      - not_null

running dbt test -m chat_message_windowed picks up the three schema tests, but not the mock test

15:07:34 | Concurrency: 1 threads (target='dev')
15:07:34 |
15:07:34 | 1 of 3 START test not_null_chat_message_windowed_chat_id............. [RUN]
15:07:35 | 1 of 3 PASS not_null_chat_message_windowed_chat_id................... [PASS in 1.06s]
15:07:35 | 2 of 3 START test not_null_chat_message_windowed_chat_message_id..... [RUN]
15:07:37 | 2 of 3 PASS not_null_chat_message_windowed_chat_message_id........... [PASS in 1.37s]
15:07:37 | 3 of 3 START test unique_chat_message_windowed_chat_message_id....... [RUN]
15:07:38 | 3 of 3 PASS unique_chat_message_windowed_chat_message_id............. [PASS in 1.19s]
15:07:38 |
15:07:38 | Finished running 3 tests in 4.86s.

Completed successfully

Done. PASS=3 WARN=0 ERROR=0 SKIP=0 TOTAL=3

However, if I run all the tests, I do see these tests get picked up; furthermore, when I run the test explicitly, it does run.

dbt test -m dbt_datamocktool_unit_test_chat_message_windowed_ref_chat_message_windowed_seed___ref_dim__date_seed___ref_chat_message_extended_seed_

yields the following

Running with dbt=0.20.2
Found 498 models, 418 tests, 0 snapshots, 4 analyses, 378 macros, 3 operations, 8 seed files, 93 sources, 0 exposures

15:09:46 | Concurrency: 1 threads (target='dev')
15:09:46 |
15:09:46 | 1 of 1 START test dbt_datamocktool_unit_test_chat_message_windowed_ref_chat_message_windowed_seed___ref_dim__date_seed___ref_chat_message_extended_seed_ [RUN]
15:09:52 | 1 of 1 FAIL 636 dbt_datamocktool_unit_test_chat_message_windowed_ref_chat_message_windowed_seed___ref_dim__date_seed___ref_chat_message_extended_seed_ [FAIL 636 in 6.11s]
15:09:52 |
15:09:52 | Finished running 1 test in 8.18s.

Completed with 1 error and 0 warnings:

Failure in test dbt_datamocktool_unit_test_chat_message_windowed_ref_chat_message_windowed_seed___ref_dim__date_seed___ref_chat_message_extended_seed_ (models/marts/core/intermediate/intermediate.yml)
  Got 636 results, configured to fail if != 0

  compiled SQL at target/compiled/nursefly_dwh/models/marts/core/intermediate/intermediate.yml/schema_test/dbt_datamocktool_unit_test_cha_4fc1bb97185c98b5c4d5b0a4068bf79c.sql

Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1

Expected results

I would expect that when I run dbt test -m chat_message_windowed, the dbt mock test would run in addition to the three schema tests.

System information

The contents of your packages.yml file:

packages:
  - package: dbt-labs/dbt_utils
    version: 0.7.3
  - package: mjirv/dbt_datamocktool
    version: 0.1.4

Which database are you using dbt with?

  • snowflake

The output of dbt --version:

installed version: 0.20.2
   latest version: 0.20.2

Up to date!

Plugins:
  - bigquery: 0.20.2
  - snowflake: 0.20.2
  - redshift: 0.20.2
  - postgres: 0.20.2

Are you interested in contributing the fix?

Absolutely!

Does not work on SQL Server due to nested CTEs

Describe the bug

In order to do SQL replacement, we create a CTE within a CTE already created by the dbt_utils.equality test. Some databases do not support this, including SQL Server.

Steps to reproduce

Try to use datamocktool on a dbt project that uses the dbt-sqlserver adapter. You will get an error like the following:

('42000', "[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Incorrect syntax near the keyword 'with'. (156) (SQLExecDirectW); [42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Incorrect syntax near the keyword 'with'. If this statement is a common table expression, an xmlnamespaces clause or a change tracking context clause, the previous statement must be terminated with a semicolon. (319); [42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Incorrect syntax near ','. (102); [42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Incorrect syntax near ','. (102); [42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Incorrect syntax near ')'. (102); [42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Incorrect syntax near ','. (102); [42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Incorrect syntax near ','. (102); [42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Incorrect syntax near ','. (102)")

Expected results

The package should work on SQL Server.

Actual results

See "steps to reproduce"

Screenshots and log output

See "steps to reproduce"

System information

N/A

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • other (SQL Server)

The output of dbt --version:

<output goes here>

Additional context

See dbt-msft/dbt-sqlserver#137 for potentially related issue and discussion.

Are you interested in contributing the fix?

Support for testing Bigquery arrays

Describe the feature

Support for testing columns with data type array in unit tests when working with Bigquery.

Describe alternatives you've considered

In our organisation we have used datamocktool to test arrays by specifying "compare_columns: - TO_JSON_STRING(array_col)".
This worked fine with dmt 0.2.1 but no longer works in >=0.2.2.
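For reference, this is roughly what that workaround looks like in the schema file (the model and seed names are placeholders):

models:
  - name: my_array_model
    tests:
      - dbt_datamocktool.unit_test:
          input_mapping:
            ref('raw_events'): ref('dmt__raw_events')
          expected_output: ref('dmt__expected_my_array_model')
          compare_columns:
            - id
            - TO_JSON_STRING(array_col)  # serialize the array so it can be compared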

Additional context

Providing function-wrapped columns to compare columns should be useful in many databases. However I only have experience doing this in BigQuery.

The handling of compare_columns changed in 0.2.2: we went from passing compare_columns straight through to dbt_utils.test_equality to the new approach of "finding excluded columns" and passing them to audit_helper.compare_relations via its exclude_columns parameter.

Who will this benefit?

Anyone using arrays or structs in tables wanting to unit test also these columns.

Are you interested in contributing this feature?

I feel that I need some input to be able to get started on this.

Enable post-hook testing behavior

Describe the feature

I would like the dmt tool to understand the post-hook behavior that is defined in the config block of the model.

I have a source and a target model (say dmt__source and dmt__target) and want to verify that dmt__target and dmt__expected_table are equal as a result of the post-hook. Here is what my source.yml looks like.

models:
  - name: dmt__target
    tests:
      - dbt_datamocktool.unit_test:
          input_mapping:
            ref('dmt__source'): ref('dmt__source')
          expected_output: ref('dmt__expected_table')

This is what dmt__target looks like.

{{ make_final_model('test_schema', 'dmt__target',
[
		'{{ update_id_fields("test_schema", "dmt__target", "id_one") }}',
		'{{ update_id_fields("test_schema", "dmt__target", "id_two") }}']) }}

Lastly, this is what the macro for the post-hook looks like.

{% macro update_id_fields(schema, table, id_field) %}

UPDATE {{ schema }}_testing.dmt__{{ table }} tbl
...
...

The expected outcome is that, through the macro, the contents of the table will be updated. When I run the test, it looks like the post-hook is not triggered, and I would like to know whether this feature is missing in dmt or whether I am missing something. If this feature does not exist yet, I think it would be a great thing to add.

Describe alternatives you've considered

N/A

Additional context

This is not a database-specific feature request. I'm currently using a Bigquery.

Who will this benefit?

It will be beneficial for anyone who leverages hooks and test their functionality to gain more confidence before shipping to production.

Are you interested in contributing this feature?

Upversion to audit_helper 0.8.0

Describe the feature

dbt-audit-helper recently released version 0.8.0. The request is to up the version dependency to audit helper 0.8.0 :)

Describe alternatives you've considered

Not updating audit_helper :'(

Additional context

Is this feature database-specific? Which database(s) is/are relevant? Please include any other relevant context here.

Who will this benefit?

Anyone with a dependency on both projects in their packages.yml

Are you interested in contributing this feature?

Heck yea~

Add integration tests

Describe the feature

We should have integration tests, similar to other packages, that allow us to verify that the ref(), source(), and test_unit_test() macros are working as expected.

Describe alternatives you've considered

N/A

Additional context

See https://docs.getdbt.com/docs/guides/building-packages#4-add-integration-tests

Who will this benefit?

It will be easier to ensure changes we make in the future don't break things.

Are you interested in contributing this feature?

Yes, I am going to work on this next.

Support for incremental models

Describe the feature

Right now, datamocktool has no special handling for incremental models. While you can run a unit test on an incremental model, I am not sure what will happen (i.e. whether it will run on the initial step or the incremental step).

We should support running on incremental models - presumably against the incremental and non-incremental steps separately.

A good UX would be that the user can:

  1. add 2 mock data sets for an incremental model: the initial and the incremental
  2. add 2 expected outputs (the initial and the final)
  3. datamocktool will automatically check that both steps succeed (a rough sketch of what this might look like follows below)
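Purely as illustration, the YAML for such a UX might look something like this (the model, the seed names, and the mock_this parameter are all made up; nothing here exists in the package today):

models:
  - name: my_incremental_model
    tests:
      # initial (full-refresh) step
      - dbt_datamocktool.unit_test:
          input_mapping:
            ref('raw_events'): ref('dmt__raw_events_initial')
          expected_output: ref('dmt__expected_initial')
      # incremental step; mock_this is a hypothetical parameter for mocking {{ this }}
      - dbt_datamocktool.unit_test:
          input_mapping:
            ref('raw_events'): ref('dmt__raw_events_incremental')
          mock_this: ref('dmt__expected_initial')
          expected_output: ref('dmt__expected_final')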

Describe alternatives you've considered

  • You can probably test just the initial step right now, though I am not totally sure

Additional context

Who will this benefit?

  • Multiple people have mentioned wanting this, either in the context of datamocktool or more generally

Are you interested in contributing this feature?

I plan to, but would love if someone else would like to as well!

How to know which records are failing a given test?

Hello!

Given a failing test, how do I know which records make the test fail?

I am trying to run the query compiled by datamocktool, but it seems that during a test run it uses a temporary table that no longer exists in the database:

-- setup

with a as (

    select * from "postgres"."dev_analytics"."test_fxt_dim_ads__expected"

),

b as (

    select * from "dim_ads_dmt_33969964105433987363"

),

a_minus_b as (

    select "ad_id", "list_id", "account_id", "user_type", "subcategory_id", "type", "condition", "status", "region_id", "area_id", "platform_initial", "has_phone_hidden", "inserted_at", "published_at", "last_deactivated_at", "deleted_at", "was_sold_on_site", "price", "remuneration_type", "currency", "has_halva", "name", "ad_params" from a
    

    except


    select "ad_id", "list_id", "account_id", "user_type", "subcategory_id", "type", "condition", "status", "region_id", "area_id", "platform_initial", "has_phone_hidden", "inserted_at", "published_at", "last_deactivated_at", "deleted_at", "was_sold_on_site", "price", "remuneration_type", "currency", "has_halva", "name", "ad_params" from b

),

b_minus_a as (

    select "ad_id", "list_id", "account_id", "user_type", "subcategory_id", "type", "condition", "status", "region_id", "area_id", "platform_initial", "has_phone_hidden", "inserted_at", "published_at", "last_deactivated_at", "deleted_at", "was_sold_on_site", "price", "remuneration_type", "currency", "has_halva", "name", "ad_params" from b
    

    except


    select "ad_id", "list_id", "account_id", "user_type", "subcategory_id", "type", "condition", "status", "region_id", "area_id", "platform_initial", "has_phone_hidden", "inserted_at", "published_at", "last_deactivated_at", "deleted_at", "was_sold_on_site", "price", "remuneration_type", "currency", "has_halva", "name", "ad_params" from a

),

unioned as (

    select 'a_minus_b' as which_diff, a_minus_b.* from a_minus_b
    union all
    select 'b_minus_a' as which_diff, b_minus_a.* from b_minus_a

)

select * from unioned

ERROR:  relation "dim_ads_dmt_33969964105433987363" does not exist

Support for models that use ephemeral models (mocking input to the ephemeral models)

Describe the feature

I assume that what I'm about to describe isn't supported right now, as I can't get it to work.

I have a model (A) that uses an ephemeral model (B), which in turn uses another ephemeral model (C). Model C uses a regular model (D). I'd like to unit test A and mock the input from D, since ephemeral models are effectively just CTEs under the hood.

I get an error:
12:08:46 002003 (42S02): SQL compilation error:
12:08:46 Object '__DBT__CTE__B' does not exist or not authorized.

When I tried this on the off chance that it might work!

Describe alternatives you've considered

Mocking and testing the ephemeral models, this is annoying for me though as they have array inputs and outputs, so it's fiddly.

Additional context

I don't think it will be DB specific, but I am using Snowflake.

Who will this benefit?

Anyone who wants to do what I describe above.

Are you interested in contributing this feature?

I could do, but I'm not sure where to start!

When using alias and having duplicate alias names the test cannot find the correct model

Describe the bug

This is related to the bug "add support for aliases and test model with an alias".
When we have multiple models in different databases with the same alias, the test fails.

Steps to reproduce

Create two models in two databases with different file names, e.g. aaa.sql and bbb.sql.
Then give them both the same alias, which is different from the names of both models,
e.g. {{ config(alias='ccc') }}

Expected results

When running dbt test we expect the correct model to be tested

Actual results

One of the two models that share the alias is selected in an arbitrary way.

System information

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • other (specify: ____________)

The output of dbt --version:

Core:
  - installed: 1.4.1
  
Plugins:
  - snowflake: 1.4.0 - Update available!

Additional context

We have a big project with hundreds of files. The alias is not unique across the project; only the file name is unique.
Therefore we should not select by alias.

Are you interested in contributing the fix?

To overcome this, I have added another attribute that passes the exact name of the model to the test, so the test does not need to look it up from the model name (which is wrong when using an alias).

Ability to write mock_model_relation to custom schema

Describe the feature

(Caveat: not sure if this functionality already exists!)

I would like to be able to write the mock_model_relation to a separate schema to the model.

Currently, the model_dmt_{timestamp} models are showing up in our production dataset (alongside model) which are being surfaced in our BI tool. I'd like to put these into a separate schema. Happy to use the custom_schema_name with the generate_schema_name macro if this is possible.

I've done this with the seed files in dbt_project.yml using the seeds/dmt/schema config in my own project.

Describe alternatives you've considered

Tried a few things:

  • Overriding the default__get_model_to_mock macro to replace model with ref("dmt_schema", model.name)
  • Overriding the schemas for models/dbt_datamocktool in dbt_project.yml (doesn't seem to work as it's picking up the schema from the model itself which is defined in my project)
  • Overriding generate_schema_name to check for any models with dmt in the name (again I think this has the same issue as above as it's picking up the schema before adding dmt + suffix)
    My final option here is to have a Python script run after each run of dmt to delete the tables.

Additional context

Working on BigQuery

Are you interested in contributing this feature?

Happy to make the change given some pointers.

When running dbt test on a specific model it doesn't run the unit tests only the dbt tests.

Describe the bug

When running dbt test on the specific model you want to test, it only picks up the dbt tests and ignores the unit tests from dbt datamocktool. With a large number of tests this makes development quite impractical.

Steps to reproduce

Create a model, add a dbt_datamocktool.unit_test:
Run dbt test -m "model_name"
Notice how the unit test is ignored.

Expected results

Unit test should run.

Actual results

Unit tests don't run.

System information

The contents of your packages.yml file:

Which database are you using dbt with?
BigQuery

The output of dbt --version:

installed version: 0.20.2
   latest version: 1.0.0

Your version of dbt is out of date! You can find instructions for upgrading here:
https://docs.getdbt.com/docs/installation

Plugins:
  - postgres: 0.20.2
  - bigquery: 0.20.2
  - redshift: 0.20.2
  - snowflake: 0.20.2

Failure to replace refs introduced by commit 5a4c311

Describe the bug

__render_sql_and_replace_references will fail to replace refs with rendered input mappings when the ref in the model is not formatted with precisely one leading and one following space

Steps to reproduce

  • Create a model with a ref formatted without that spacing. e.g. {{ref('some_model')}}
  • Create a unit test for that model specifying an input mapping for ref('some_model')
  • Run the test

Expected results

The ref should be replaced successfully and the test should run

Actual results

__render_sql_and_replace_references fails to replace the ref due to the code below (lines 148 to 151) matching on the specific format {{ ref('') }} and not dealing with varying or missing whitespace between the ref call and the brackets

    {#- Replace the keys first, before the sql code is rendered -#}
    {% for k, v in ns.rendered_mappings.items() %}
        {% set ns.test_sql = ns.test_sql|replace("{{ "~render(k)~" }}", v) %}
    {% endfor %}

The standard dependency failure error is thrown from render
dbt was unable to infer all dependencies for the model
This typically happens when ref() is placed within a conditional block
...
To fix this, add the following hint to the top of the model
-- depends_on: {{ ref('some_model') }}
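A minimal sketch of one possible fix, assuming the same macro context (ns, render() and rendered_mappings), is to try the common whitespace variants of the rendered ref rather than only the single-space form (this still would not handle whitespace inside the parentheses):

    {% for k, v in ns.rendered_mappings.items() %}
        {% set ns.test_sql = ns.test_sql
            | replace("{{ " ~ render(k) ~ " }}", v)
            | replace("{{" ~ render(k) ~ "}}", v)
            | replace("{{ " ~ render(k) ~ "}}", v)
            | replace("{{" ~ render(k) ~ " }}", v) %}
    {% endfor %}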

System information

The contents of your packages.yml file:
The relevant information is that the issue was introduced in the above referenced commit, first present in v0.3.5 (originally reported as v0.3.2; see the edit below). The issue does not occur in v0.3.1.

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • [x] other (specify: databricks, but this isn't relevant for this issue)

The output of dbt --version:

We're still on 1.4.6 waiting for some library updates but this also isn't relevant to this issue

Are you interested in contributing the fix?

Sure!

Edits:

  • 2023-07-31 - Introduced in v0.3.5, not v0.3.2

All tests don't run

Describe the bug

I've added a new test in a yml file, but when I run dbt test it only runs my old tests.

Steps to reproduce

I basically just added a new test:

tests:
      - dbt_datamocktool.unit_test:
          input_mapping:
            ref('xx'): ref('yy')
            source('xxx', 'xxxx'): ref('yyyy')
          expected_output: ref('xxx')
          name: "xxx"
          description: "xxxx"

Expected results

Run the new test as well :)

Actual results

The test doesn't run at all.

System information

The contents of your packages.yml file:

packages:
  - package: dbt-labs/dbt_utils
    version: 0.8.2
  - package: mjirv/dbt_datamocktool
    version: [">=0.1.9"]

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • other (specify: ____________)

The output of dbt --version:

installed version: 1.0.3
   latest version: 1.0.3

Up to date!

Plugins:
  - bigquery: 1.0.0 - Up to date!

Maybe I'm missing something here?

[TODO] Allow expectations other than table equality

Describe the feature

e.g. contains, columns sum to certain value, etc

Describe alternatives you've considered

A clear and concise description of any alternative solutions or features you've considered.

Additional context

Is this feature database-specific? Which database(s) is/are relevant? Please include any other relevant context here.

Who will this benefit?

What kind of use case will this feature be useful for? Please be specific and provide examples, this will help us prioritize properly.

Are you interested in contributing this feature?

error on install

Looking in the wrong directory for the installed dbt packages.

after dbt init:

# packages.yml
packages:
- package: mjirv/dbt_datamocktool
  version: [">=0.3.0"]

install packages with: dbt deps

> Executing task: dbt list

An error occurred while trying to refresh the project configuration: Error: ENOENT: no such file or directory, open '/Users/danielbartley/my_dbt_project/dbt_packages/dbt-utils-1.0.0/dbt_project.yml'
An error occurred while trying to refresh the project configuration: Error: ENOENT: no such file or directory, open '/Users/danielbartley/my_dbt_project/dbt_packages/dbt-utils-1.0.0/integration_tests/dbt_project.yml'
> Executing task: dbt list


> Executing task: dbt list

An error occurred while trying to refresh the project configuration: Error: ENOENT: no such file or directory, open '/Users/danielbartley/my_dbt_project/dbt_packages/dbt-audit-helper-0.7.0/dbt_project.yml'
An error occurred while trying to refresh the project configuration: Error: ENOENT: no such file or directory, open '/Users/danielbartley/my_dbt_project/dbt_packages/dbt-audit-helper-0.7.0/integration_tests/dbt_project.yml'

inconsistency between temporary view and select

Describe the bug

when using the unit test, it creates incorrect selects:

view created


create temporary view stg_customers_dmt_07737153_ as
with source as (
select * from dbt_chaim.raw_customers

),

renamed as (

select
    id as customer_id,
    first_name,
    last_name

from source

)

select * from renamed

select


select

count(*) + coalesce(abs(
    sum(case when which_diff = 'a_minus_b' then 1 else 0 end) -
    sum(case when which_diff = 'b_minus_a' then 1 else 0 end)
), 0)

as failures,

count(*) + coalesce(abs(
    sum(case when which_diff = 'a_minus_b' then 1 else 0 end) -
    sum(case when which_diff = 'b_minus_a' then 1 else 0 end)
), 0)

!= 0 as should_warn,

count(*) + coalesce(abs(
    sum(case when which_diff = 'a_minus_b' then 1 else 0 end) -
    sum(case when which_diff = 'b_minus_a' then 1 else 0 end)
), 0)

!= 0 as should_error
from (

-- setup

with a as (

select * from dbt_chaim.dmt__expected_stg_customers_1

),

b as (

select * from 

dbt_chaim.stg_customers_dmt_07737153_
    -- depends_on: dbt_chaim.raw_customers

)

Expected results

expected select without schema name

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • [x] other (specify: databricks)

The output of dbt --version:

0.21

Add support for inline (non-materialized) unit tests

Describe the feature

The current way of running tests is still a little more complex than I would like. We should have a simpler, more dbt-onic solution.

Specifically, one should:

  1. Be able to define the entire unit test inline under the tests block for a model in schema.yml rather than relying on global variables
  2. Be able to run unit tests alongside their other tests without having to materialize new models (other than the seeds)
  3. Not have to use any additional flags unless desired

Practically, we can use a graph node's raw_sql property along with the replace() filter and the render() function to (1) grab the raw SQL for a model, (2) replace selected refs and sources with references to our mocks, and (3) compile the resulting JinjaSQL.
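A simplified sketch of that mechanism (the project name, model name, and mapping below are illustrative, not the package's actual code):

{% if execute %}
    {% set node = graph.nodes['model.my_project.my_model'] %}
    {% set mapping = {"ref('raw_orders')": "ref('dmt__raw_orders')"} %}
    {% set ns = namespace(sql=node.raw_sql) %}
    {% for original, mock in mapping.items() %}
        {# swap each selected ref/source call for its mock before compiling #}
        {% set ns.sql = ns.sql | replace("{{ " ~ original ~ " }}", "{{ " ~ mock ~ " }}") %}
    {% endfor %}
    {# compile the resulting JinjaSQL against the mocks #}
    {% set compiled_sql = render(ns.sql) %}
{% endif %}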

Describe alternatives you've considered

I considered keeping both the previous method and the new method in the package but decided that would be too confusing.

Additional context

See discussion at dbt-labs/dbt-core#2354 and https://getdbt.slack.com/archives/C2JRRQDTL/p1623417748059300. I'm especially indebted to Josh D. and Mikael R. for these ideas and the raw_sql implementation.

Who will this benefit?

This should make it much easier for people to get started and run unit tests since they don't need to set up separate targets, runs, etc.

The main downside is that you now have to define mappings for each individual test, so it is somewhat less DRY (though there may be a way to use global variables still).

Are you interested in contributing this feature?

Yes

Failed to replace if input_mapping keys do not exactly match with the reference in the sql file

Describe the bug

In older versions, the input mapping keys were rendered and then replaced in the rendered SQL. However, in newer versions, the input mapping keys are replaced in the raw SQL. Therefore, the format of the input mapping keys must exactly match the format in the SQL file in order to be found and replaced.

Steps to reproduce

input_mapping:
  ref('model'): ref('mock_reference')

SQL file:

SELECT * FROM {{ ref("model") }}

datamocktool can't replace ref('model') with ref('mock_reference') in this SQL file.
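One workaround sketch in the meantime is to make the mapping key's quoting and spacing match the SQL file exactly:

input_mapping:
  ref("model"): ref('mock_reference')  # key uses double quotes to match the SQL file above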

Expected results

  • datamocktool replaces ref("model") with ref('mock_reference') during the test.
  • Or an error is raised if any of the input mapping keys are not found in the SQL file.

Actual results

  • ref("model") is not replaced during test.
  • No warning/error raises for mismatch between input_mapping keys with the reference in the sql.

Screenshots and log output

System information

The contents of your packages.yml file:

Which database are you using dbt with?

  • postgres
  • redshift
  • [x] bigquery
  • snowflake
  • other (specify: ____________)

The output of dbt --version:

1.5.1

Additional context

Are you interested in contributing the fix?

yes

Multiple tests on same model fails

Describe the bug

When I add multiple unit tests for the same model, they both seem to go towards the same "xxx__dbt_tmp" table. But they should have different answers. This makes the tests fail.

Steps to reproduce

Add 2 unit tests to the same model with different input and output.

Expected results

Both tests should run and have their own result from the base table.

Actual results

One of the tests always fails, randomly depending on which one created the __dbt_tmp table first, I guess?

System information

packages:
  - package: dbt-labs/dbt_utils
    version: 0.7.3
  - package: mjirv/dbt_datamocktool
    version: [">=0.1.4"]

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • other (specify: ____________)

The output of dbt --version:

installed version: 0.20.1

Not sure if I'm doing something wrong, but having rerun a few times, this seems to be the problem: the tmp table contains the expected output of one of the tests and both tests end up using the same one.

Ability to mock {{ this }} even when accessed inside macros

Describe the feature

Sometimes, in incremental testing, we call macros from a model to access the {{ this }} table. In this case, the test fails to replace this table with the mock file.

Describe alternatives you've considered

The alternative is to completely abandon the macro and have {{ this }} only in the model itself.

Additional context

Here is an example of the situation. Inside an {% if is_incremental() %} block there might be a line like:

where id > (select max(column_name) from {{ this }})

but in other cases we may just call a macro that does the same thing. For example, in the is_incremental block we write:

{% if is_incremental() %}
{% set max_value = get_max_value(column_name) %}

and then use:

where id > ('{{ max_value }}')
{% endif %}

The macro get_max_value() then implements select max(column_name) from {{ this }}.

The problem is that this fails at compile time: the macro tries to run the SELECT statement while rendering the model, but {{ this }} has not yet been replaced by the mock table, because that replacement only happens at runtime, after rendering.

Who will this benefit?

Anyone who delegates logic that uses {{ this }} from the model into a macro.

Are you interested in contributing this feature?

I would love to. Even though I do not have a suggestion on how to resolve it.

run unit tests independent from dbt test

Describe the feature

As a software engineer, I would like to unit test my macros/UDFs/SQL models so that I can be confident that they work correctly.
I want to run unit tests before any dbt processing, so that I don't need to clean my database if something goes wrong.
I do not want to run dbt test (which can also run data tests such as source freshness) at the same time, because the data can be outdated at that point.

Describe alternatives you've considered

Run the tests provided by this package with
dbt test --unit-test
in a 'standalone' mode.
When you call
dbt test
they should be included.

With this feature I can run a sequence like
dbt seed
dbt test --unit-test --> I can be confident that macros work well
dbt run --> processing
dbt test --> data tests
(a sketch of a tag-based approximation is shown below)
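Something close to this sequence is possible today by tagging the unit tests and using node selection. A sketch with placeholder model and seed names:

models:
  - name: my_model
    tests:
      - dbt_datamocktool.unit_test:
          tags: ['unit-test']
          input_mapping:
            ref('raw_events'): ref('dmt__raw_events')
          expected_output: ref('dmt__expected_my_model')

Then: dbt seed, dbt test --select tag:unit-test (unit tests only), dbt run, dbt test --exclude tag:unit-test (data tests only).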

Who will this benefit?

Everyone who wants to be confident that their code works correctly before dbt runs any transformations.

Are you interested in contributing this feature?

I need to clarify this.

Test passes if actual result contains duplicate rows but expected result doesn't

Describe the bug

Because of the use of DISTINCT in set operations when comparing the expected and actual outputs, a test will pass even if one of the two contains duplicate rows. I was writing a test specifically to ensure that my model handles duplicates properly and this seemingly isn't possible to do.

Steps to reproduce

  1. Write a passing unit test
  2. Duplicate the last row of the expected output seed file, so that this should now have one row more than the actual model results
  3. Run the test again

Expected results

The test should fail because the outputs don't have the same row count. The expected output has one row more than the actual.

Actual results

The test passes

Screenshots and log output

N/A

System information

The contents of your packages.yml file:

packages:
  - package: dbt-labs/codegen
    version: 0.10.0
  - package: calogica/dbt_expectations
    version: 0.9.0
  - package: mjirv/dbt_datamocktool
    version: 0.3.2

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • other (specify: ____________)

The output of dbt --version:

❯ dbt --version
Core:
  - installed: 1.6.0
  - latest:    1.6.1 - Update available!

Additional context

My first thought was to switch the except distinct to an except all but apparently BigQuery does not support that?
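As a sketch of one possible direction (not the package's current SQL), a row-count comparison alongside the DISTINCT-based diff would catch duplicates; the table names here are placeholders:

select *
from (
    select
        (select count(*) from expected_output_seed) as expected_count,
        (select count(*) from actual_model_results) as actual_count
) counts
where expected_count != actual_count  -- any returned row means the test should fail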

Are you interested in contributing the fix?

Add CI pipeline via GitHub Actions

Describe the feature

When you open a PR, it should automatically run the ./run_integration_tests script from the integration_tests/ folder via GitHub actions.

Additionally, it should block you from merging if they fail.

Unable to contribute

Hey, I would like to contribute a small section in the docs but I am not allowed to create a new branch. Can you please change your security policies?

Error was:

ERROR: Permission to mjirv/dbt-datamocktool.git denied to b0lle.
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

Thanks

SQL Server bug: Views or functions are not allowed on temporary tables

Describe the bug

See https://getdbt.slack.com/archives/C01UM2N7814/p1646744516633119

Steps to reproduce

- name: My_model
  tests:
    - dbt_datamocktool.unit_test:
        input_mapping:
          source('<schema>','<My_model_raw>'): ref('dmt__input')
        expected_output: ref('dmt__expected_output')

Expected results

Actual results

Screenshots and log output

Views or functions are not allowed on temporary tables. Table names that begin with '#' denote temporary tables. (4508)

System information

The contents of your packages.yml file:

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • other (SQL Server)

The output of dbt --version:

<output goes here>

Additional context

Are you interested in contributing the fix?

Option to choose which columns are used for the compare

Describe the feature

I would like to specify which columns are used to compare the result and the expected table.
We will be generating the data and only care about some of the columns.

Describe alternatives you've considered

Changing the code.

Who will this benefit?

Anyone with big tables where the test only covers some of the columns.

Are you interested in contributing this feature?

yes

How to mock {{ this }} macro?

Hello!

Reading the documentation, I did not understand whether there is a way to mock not only sources and refs but also the {{ this }} dbt variable, if a model uses it.

More info on test failure.

I would love to see more information on test failures (like what the tables that were created are called). I can try to help with this if you can give me code pointers.

error on build

Describe the bug

When I build my project, I get the following error:
Runtime Error
Compilation Error in test dbt_datamocktool_unit_test_stg_customers_ref_dmt__expected_stg_customers_1___ref_dmt__raw_customers_1_ (models/schema.yml)
dbt was unable to infer all dependencies for the model "dbt_datamocktool_unit_test_stg_customers_ref_dmt__expected_stg_customers_1___ref_dmt__raw_customers_1_".
This typically happens when ref() is placed within a conditional block.

To fix this, add the following hint to the top of the model "dbt_datamocktool_unit_test_stg_customers_ref_dmt__expected_stg_customers_1___ref_dmt__raw_customers_1_":

-- depends_on: {{ ref('raw_customers') }}

> in macro get_unit_test_sql (macros/dmt_get_test_sql.sql)
> called by macro test_unit_test (macros/dmt_unit_test.sql)
> called by test dbt_datamocktool_unit_test_stg_customers_ref_dmt__expected_stg_customers_1___ref_dmt__raw_customers_1_ (models/schema.yml)
> called by test dbt_datamocktool_unit_test_stg_customers_ref_dmt__expected_stg_customers_1___ref_dmt__raw_customers_1_ (models/schema.yml)

System information

The contents of your packages.yml file:

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • [x] other (specify: databricks)

The output of dbt --version:

0.21

Help on Unit test case for call statement in Macro

Hi DBT team,

I am writing unit test cases for dbt macros. One of the macros has a call statement, fetches the result with load_result(), and performs some business logic on it. Question: is there any way I can mock the result of the call statement?

{% macro get_max_level(src, product, default=1) -%}
{%- if not execute -%}
{{ return(default) }}
{% endif %}
{%- call statement('get_max_level', fetch_result=true) %}
SELECT
max(level) as level
FROM
{{ src.database }}.{{ src.schema }}.employee

{%- endcall -%}
{%- set level = load_result('get_max_level') -%}
{%- if level and level['data'] -%}
{%- set level_data = level['data'] | map(attribute=0) | list %}
{{ return(level_data[0] | as_number) }}
{%- else -%}
{{ return(default | as_number) }}
{%- endif -%}
{%- endmacro %}

Help with mocking via macro

I'm trying to mock data using a macro, following the example:

{% macro expected_invoices() %}

(
    {% set records = [
        [1,"aaaaaaaa","type"]
    ] %}

    {% for record in records %}
        select {{ record[0] }} as sys_id, '{{ record[1] }}' as id , '{{ record[2] }}' as type
        {% if not loop.last %}
            union all
        {% endif %}
    {% endfor %}
) expected_invoices

{% endmacro %}

but I am getting the following error when running:
Macro test_equality expected a Relation but received the value:
14:35:41 (
select 1 as sys_id, 'aaaaaaaa' as id, 'FAC' as type
) as expected_invoices

I have this set up in my .yml:

      - dbt_datamocktool.unit_test:
          input_mapping:
            ref('table'): "{{ input_invoices() }}"
          expected_output: "{{ expected_invoices() }}"
          name: "expecting_type_of"
          description: "testing mocking data fro dbt unit testing"
          compare_columns:
           - sys_id
           - id
           - type

Am I missing something else anywhere? Any help would be much appreciated.
dbt version 1.3, and I'm using Trino.

Package creating a new table instead of using existing seed

Describe the bug

I was working on a personal project 5 months ago which used dbt and this package, and everything was working well.
Now, as I try to run the project again, it is not going as well.

Steps to reproduce

      - dbt_datamocktool.unit_test:
          input_mapping:
            source('historical_data', 'yellow_historical'): ref('test_dbt__bi__yellow_m2m')
          expected_output: ref('dbt__bi__yellow_m2m_expected')
          depends_on:
            - ref('seed_zones')
          # tags: ["unit-test"]

test_dbt__bi__yellow_m2m and dbt__bi__yellow_m2m_expected are seeds that have the same configuration, are in the same directory, and are loaded into the same BigQuery dataset.
The two seeds originate from two CSV files with the same names.

Expected results

A BigQuery job combining the two seeds: applying the model to the input seed and comparing the result to the expected output seed.

Actual results

The tool uses only one of the two tables (the expected_output table) and attempts to create the other table.

/* {"app": "dbt", "dbt_version": "1.5.2", "profile_name": "default", "target_name": "dev", "node_id": "test.nytaxi.dbt_datamocktool_unit_test_dbt__bi__yellow_m2m_ref_seed_zones___ref_dbt__bi__yellow_m2m_expected___ref_test_dbt__bi__yellow_m2m_.ac26a1c36a"} */

    











with a as (

    
select

  

   
    `vendors` 
    
      , 
     
   
    `year` 
    
      , 
     
   
    `month` 
    
      , 
     
   
    `monthly_fare_amount` 
    
      , 
     
   
    `average_passenger_count` 
    
      , 
     
   
    `average_trip_distance` 
    
      , 
     
   
    `monthly_extra_amount` 
    
      , 
     
   
    `monthly_mta_tax` 
    
      , 
     
   
    `monthly_tip_amount` 
    
      , 
     
   
    `monthly_tolls_amount` 
    
      , 
     
   
    `monthly_improvement_surcharge` 
    
      , 
     
   
    `monthly_total_amount` 
    
      , 
     
   
    `monthly_congestion_surcharge` 
    
      , 
     
   
    `pickup_borough` 
    
      , 
     
   
    `pickup_zone` 
    
      , 
     
   
    `dropoff_borough` 
    
      , 
     
   
    `dropoff_zone` 
    
      , 
     
   
    `avg_duration_minutes` 
    
      , 
     
   
    `trip_count` 
    
      , 
     
   
    `total_duration_hours` 
    
      , 
     
   
    `busiest_hour_of_day` 
    
      , 
     
   
    `airport_trip` 
    
      , 
     
   
    `payment_type` 
     
  



from `stellarismusv5`.`models_ut`.`dbt__bi__yellow_m2m_expected`


),

b as (

    
select

  

   
    `vendors` 
    
      , 
     
   
    `year` 
    
      , 
     
   
    `month` 
    
      , 
     
   
    `monthly_fare_amount` 
    
      , 
     
   
    `average_passenger_count` 
    
      , 
     
   
    `average_trip_distance` 
    
      , 
     
   
    `monthly_extra_amount` 
    
      , 
     
   
    `monthly_mta_tax` 
    
      , 
     
   
    `monthly_tip_amount` 
    
      , 
     
   
    `monthly_tolls_amount` 
    
      , 
     
   
    `monthly_improvement_surcharge` 
    
      , 
     
   
    `monthly_total_amount` 
    
      , 
     
   
    `monthly_congestion_surcharge` 
    
      , 
     
   
    `pickup_borough` 
    
      , 
     
   
    `pickup_zone` 
    
      , 
     
   
    `dropoff_borough` 
    
      , 
     
   
    `dropoff_zone` 
    
      , 
     
   
    `avg_duration_minutes` 
    
      , 
     
   
    `trip_count` 
    
      , 
     
   
    `total_duration_hours` 
    
      , 
     
   
    `busiest_hour_of_day` 
    
      , 
     
   
    `airport_trip` 
    
      , 
     
   
    `payment_type` 
     
  



from -- depends_on: `stellarismusv5`.`models_seeds`.`seed_zones`
    
    
    `stellarismusv5`.`models_historical`.`dbt__bi__yellow_m2m_dmt_49567973`


),

a_intersect_b as (

    select * from a
    

    intersect distinct


    select * from b

),

a_except_b as (

    select * from a
    

    except distinct


    select * from b

),

b_except_a as (

    select * from b
    

    except distinct


    select * from a

),

all_records as (

    select
        *,
        true as in_a,
        true as in_b
    from a_intersect_b

    union all

    select
        *,
        true as in_a,
        false as in_b
    from a_except_b

    union all

    select
        *,
        false as in_a,
        true as in_b
    from b_except_a

),

final as (
    
    select * from all_records
    where not (in_a and in_b)
    order by  in_a desc, in_b desc

)

select * from final


dbt__bi__yellow_m2m_dmt_49567973 is not a table that I have created. It is basically an identical copy of the table that is being tested, and it is even created in the same dataset as the original table.

while stellarismusv5.models_ut.dbt__bi__yellow_m2m_expected is the correct seed for expected output.

Screenshots and log output


System information

packages:
  - package: calogica/dbt_date
    version: 0.7.2
  - package: calogica/dbt_expectations
    version: 0.8.2
  - package: elementary-data/elementary
    version: 0.8.2
  - package: dbt-labs/dbt_utils
    version: 1.0.0
  - package: mjirv/dbt_datamocktool
    version: 0.2.0

Which database are you using dbt with?

  • postgres
  • redshift
  • [x] bigquery
  • snowflake
  • other (specify: ____________)

The output of dbt --version:

1.5.2

Additional context

I checked issues first, and it does not look like this is a common issue, so I'd wager that this is something incorrect on my end.
I have spent the better part of the day fiddling around, trying things out, changing directories, names, nothing works. I cannot seem to
pinpoint the cause, so any hints would be appreciated.

Integration tests fail on compare columns test

Describe the bug

Integration tests of dbt-datamocktool do not currently run successfully.

Steps to reproduce

Run integration tests

cd integration_tests
./run_integration_tests.sh

or more specifically run

dbt test -s stg_customers

in integration_tests folder.

Expected results

Integration tests to run successfully.

Actual results

Failed test.

Screenshots and log output

(venv) petter@Petters-iMac integration_tests % dbt test -s stg_customers                                                                                                                                          
18:10:07  Running with dbt=1.5.2
18:10:07  Registered adapter: postgres=1.5.2
18:10:07  Found 6 models, 25 tests, 0 snapshots, 0 analyses, 458 macros, 0 operations, 10 seed files, 1 source, 0 exposures, 0 metrics, 0 groups
18:10:07  
18:10:07  Concurrency: 1 threads (target='dev')
18:10:07  
18:10:07  1 of 5 START test Raw Customers 2 .............................................. [RUN]
18:10:08  1 of 5 PASS Raw Customers 2 .................................................... [PASS in 0.20s]
18:10:08  2 of 5 START test dbt_datamocktool_unit_test_stg_customers_customer_id__first_name__last_name__ref_dmt__expected_stg_customers_1___ref_dmt__raw_customers_1_  [RUN]
18:10:08  2 of 5 ERROR dbt_datamocktool_unit_test_stg_customers_customer_id__first_name__last_name__ref_dmt__expected_stg_customers_1___ref_dmt__raw_customers_1_  [ERROR in 0.07s]
18:10:08  3 of 5 START test dbt_datamocktool_unit_test_stg_customers_description__ref_dmt__expected_stg_customers_1___ref_dmt__raw_customers_1_  [RUN]
18:10:08  3 of 5 PASS dbt_datamocktool_unit_test_stg_customers_description__ref_dmt__expected_stg_customers_1___ref_dmt__raw_customers_1_  [PASS in 0.10s]
18:10:08  4 of 5 START test not_null_stg_customers_customer_id ........................... [RUN]
18:10:08  4 of 5 PASS not_null_stg_customers_customer_id ................................. [PASS in 0.03s]
18:10:08  5 of 5 START test unique_stg_customers_customer_id ............................. [RUN]
18:10:08  5 of 5 PASS unique_stg_customers_customer_id ................................... [PASS in 0.04s]
18:10:08  
18:10:08  Finished running 5 tests in 0 hours 0 minutes and 0.57 seconds (0.57s).
18:10:08  
18:10:08  Completed with 1 error and 0 warnings:
18:10:08  
18:10:08  Database Error in test dbt_datamocktool_unit_test_stg_customers_customer_id__first_name__last_name__ref_dmt__expected_stg_customers_1___ref_dmt__raw_customers_1_ (models/staging/schema.yml)
18:10:08    column "description" does not exist
18:10:08    LINE 72:     "description" 
18:10:08                 ^
18:10:08  
18:10:08  Done. PASS=4 WARN=0 ERROR=1 SKIP=0 TOTAL=5
(venv) petter@Petters-iMac integration_tests % 

System information

The contents of your packages.yml file:
packages:

  • local: ../

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • other (specify: ____________)

The output of dbt --version:

Core:
  - installed: 1.5.2
  - latest:    1.5.2 - Up to date!

Plugins:
  - postgres: 1.5.2 - Up to date!


Additional context

Something with compare_columns seems to be the problem.

Are you interested in contributing the fix?

Yes

Tests should have readable names and/or descriptions

Describe the feature

From Kareem A on Slack:

If I am going to use this I will likely need to make multiple test cases and consequently need to mock each source multiple times, one for each use case
so having a name/description feature for each unit test (like in dtspec) would be great so that from logs it can be easily identified which case exactly did fail and what is it supposed to test without needing to jump between input and expected output files to understand

Describe alternatives you've considered

N/A

Additional context

N/A

Who will this benefit?

Users with multiple tests for the same models

Are you interested in contributing this feature?

Support for json as mock and expected output

Describe the feature

How do we support JSON as the input for mocks and the expected output?

Describe alternatives you've considered

I do not think there are alternatives

Additional context

Is this feature database-specific? Which database(s) is/are relevant? Please include any other relevant context here.

Who will this benefit?

What kind of use case will this feature be useful for? Please be specific and provide examples, this will help us prioritize properly.

Are you interested in contributing this feature?

With support I can contribute

Allow for overriding of current_timestamp to a static value

Describe the feature

Allow for overriding of {{ dbt_utils.current_timestamp() }} to a static value

Describe alternatives you've considered

I have not been able to think of straightforward alternatives to easily do this.

This could also be useful for overriding of dbt functions in general (e.g. {{ var('my_variable') }})
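One workaround sketch (not a package feature): wrap the timestamp in your own macro so tests can pin it via a var; the var name dmt_freeze_time is made up here.

{% macro effective_timestamp() %}
    {%- if var('dmt_freeze_time', none) -%}
        cast('{{ var("dmt_freeze_time") }}' as timestamp)
    {%- else -%}
        {{ dbt_utils.current_timestamp() }}
    {%- endif -%}
{% endmacro %}

Models would then use {{ effective_timestamp() }} instead of {{ dbt_utils.current_timestamp() }}, and unit tests could run with dbt test --vars '{dmt_freeze_time: "2020-01-01 00:00:00"}'.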

Who will this benefit?

This is useful for testing dbt transformations that are dependent on the current timestamp

Are you interested in contributing this feature?

No

compare_columns not excluding columns since > 0.2.1 on BigQuery

Describe the bug

When using datamocktool >= 0.2.2 with dbt-bigquery==1.3.1 and dbt-core==1.3.2, dbt compile reports an error as a result of attempting to compare columns with unsupported types.

Steps to reproduce

  1. Create test inputs with ARRAY types.
  2. Create test output.
  3. Select compare_columns to not include the ARRAY type columns.
  4. run dbt compile

Expected results

dbt compile should not raise an error.

Actual results

19:02:43  BigQuery adapter: ***
19:02:43  Encountered an error:
Runtime Error
  Database Error in test test_census_aggregations (models/marts/my_model/_models.yml)
    Column 1 in INTERSECT DISTINCT has type that does not support set operation comparisons: ARRAY at [341:5]

Clicking on the link provided by the bigquery adapter shows that all columns are being compared.

Screenshots and log output

System information

The contents of your packages.yml file:

packages:
  - package: dbt-labs/dbt_utils
    version: 1.0.0
  - package: mjirv/dbt_datamocktool
    version: 0.2.2

Which database are you using dbt with?

  • postgres
  • redshift
  • [x] bigquery
  • snowflake
  • other (specify: ____________)

The output of dbt --version:

Core:
  - installed: 1.3.2
  - latest:    1.4.3 - Update available!

  Your version of dbt-core is out of date!
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

Plugins:
  - bigquery: 1.3.1 - Update available!

  At least one plugin is out of date or incompatible with dbt-core.
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

Additional context

Looks like it might be related to the switch to audit_helper.compare_relations. Maybe just a bug with the way the exclude_columns list is being generated. Not sure why it's being mapped to upper.

Are you interested in contributing the fix?

Happy to contribute the fix. Pretty experienced with python/SQL, but not familiar with the dbt python API or bigquery adapters. Would need help getting started.

Using aliases

Describe the bug

Not sure if this is a bug, but we use aliases, and when we do, this library can't find the model, since the model itself is still named after the sql file and the alias only exists inside that model.

Steps to reproduce

Rename a model with {{config(alias='something_else')}}

Expected results

Run the tests

Actual results

'dict object' has no attribute 'model.<dbt_project_name>.<alias>'

System information

The contents of your packages.yml file:

packages:
  - package: dbt-labs/dbt_utils
    version: 0.7.3
  - package: mjirv/dbt_datamocktool
    version: [">=0.1.5"]

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • other (specify: ____________)

The output of dbt --version:

0.21.0

Comparing all columns fails

Describe the bug

As far as I know, when we don't specify the columns to compare in compare_columns, all columns should be compared. When the code for the test is generated, a * should be added, but sometimes this is just not the case, which results in the test failing (screenshot below).

Steps to reproduce

I'm not exactly sure how to reproduce the issue, as this is pretty flaky and seems to fail randomly (I haven't been able to find a pattern as of now).

Screenshots and log output

Screenshot from 2022-12-02 18-05-57

System information

The contents of your packages.yml file:

packages:
  - package: mjirv/dbt_datamocktool
    version: 0.2.0
  - package: re-data/re_data
    version: 0.10.1
  - package: calogica/dbt_expectations
    version: 0.5.6
  - package: dbt-labs/codegen
    version: 0.6.0
  - package: dbt-labs/dbt_utils
    version: 0.8.6

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • other (specify: ____________)

The output of dbt --version:

Core:
  - installed: 1.2.3
  - latest:    1.3.1 - Update available!
  Your version of dbt-core is out of date!
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation
Plugins:
  - bigquery: 1.2.0 - Update available!
  At least one plugin is out of date or incompatible with dbt-core.
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

Are you interested in contributing the fix?

Maybe not specifically me, but someone from my org could probably help
