Comments (7)
Yeah, thanks. I suspected it's a complex issue. It's not hard to imagine a scenario where continuing on in the parse actually creates a bunch of false errors.
Scripts! I'm definitely game. Perhaps someday we can have a toolbox of these in the Cloud IDE. Thanks!!
from dbt-core.
@jtcohen6 @dbeatty10 - Is this syntax actually still "pointless" but valid? We are throwing a pretty hard parser error here as part of a version upgrade path.
@will-sargent-dbtlabs I'm going to move this to the dbt core repo - the Parsing Error you're seeing here is not being returned from the JSON Schema validation (which is informative only) but from core itself.
We had to painfully work through parse and reparse to find each error, because the parser dies on these errors (find one and die, fix, full parse, find the next and die).
@will-sargent-dbtlabs Oooch 😬
I'm not sure yet if we'll choose to change anything in 1.7 or not to directly address this type of situation.
But as a workaround, see below for a simple Python script that will recursively search for this type of thing within a dbt project in your current working directory.
# search_none_tests.py
import glob
import os

import yaml


def find_none_tests(data, path=None):
    """
    Recursively search for keys named 'tests' with None values in the given data.

    :param data: The current part of the data to search through.
    :param path: The current path to this point in the data.
    :return: A generator of paths to 'tests' keys with None values.
    """
    # Avoid a mutable default argument; start from an empty path.
    path = path or []
    if isinstance(data, dict):
        for key, value in data.items():
            if key == "tests" and value is None:
                yield path + [key]
            else:
                yield from find_none_tests(value, path + [key])
    elif isinstance(data, list):
        for index, item in enumerate(data):
            yield from find_none_tests(item, path + [index])


def examine_yaml_file(yaml_file_path):
    # Skip directories or other non-file matches returned by glob
    if os.path.isfile(yaml_file_path):
        with open(yaml_file_path, "r") as file:
            data = yaml.safe_load(file)
        none_tests_paths = list(find_none_tests(data))
        if none_tests_paths:
            print(f"Found `tests` key with None values in {yaml_file_path}:")
            for path in none_tests_paths:
                print(" " + " -> ".join(map(str, path)))


def search_and_examine_yaml_files():
    # Search for all YAML files in the current directory and all subdirectories
    yaml_files = glob.glob("**/*.yaml", recursive=True) + glob.glob(
        "**/*.yml", recursive=True
    )
    if not yaml_files:
        print("No YAML files found.")
        return
    for yaml_file in yaml_files:
        examine_yaml_file(yaml_file)


if __name__ == "__main__":
    search_and_examine_yaml_files()
Then run it like this:
python search_none_tests.py
And get output like this:
Found `tests` key with None values in models/schema.yaml:
models -> 0 -> columns -> 0 -> tests
Found `tests` key with None values in models/_properties.yml:
models -> 0 -> tests
models -> 0 -> columns -> 0 -> tests
I took a look at this scenario in version 1.4 vs. 1.5, and it started giving the following error in 1.5 (whereas it was allowed in 1.4):
00:19:44 Encountered an error:
Parsing Error
Invalid models config given in models/_models.yml @ models: {'name': 'my_model', 'tests': None, 'columns': [{'name': 'id', 'tests': ['not_null']}], 'original_file_path': 'models/_models.yml', 'yaml_key': 'models', 'package_name': 'my_project'} - at path ['tests']: None is not of type 'array'
Since this scenario is explicitly called out in the migration guide for 1.5 (see screenshot below), I'm going to close this as "not planned".
Thanks for the docs link @dbeatty10.
Makes sense to me that allowing it is not planned.
Also, thanks for providing the script!
However, is there a way that the parser won't die completely each time it hits an error like this?
We hear you on how painful this was 😢
Due to the complexities involved, I don't see us moving off the "die upon first parsing failure" approach.
We also had no idea how many of these we would hit because of that, so we are kind of like, how long do we keep this up, (can we fix in a few minutes) or is this like a sprint-impacting spike we need to do...
If you were to do this migration from 1.4 over again from scratch, I'd suggest running the Python script in #9845 (comment) to see how many of these you were facing. Then that would help you assess and estimate how much effort it would take to resolve this particular upgrade edge case.
Alternatively, I could imagine some of the YAML-editing strategies used in dbt-meshify or dbt-osmosis being adopted in a custom program that tries to perform an update, i.e., it would attempt to edit all the relevant YAML files in-place to achieve this part of the migration from 1.4 to 1.5.
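As a rough illustration of what the core of such a custom program might look like, here is a minimal sketch. It is not how dbt-meshify or dbt-osmosis work; it only mirrors the traversal in the search script above. The function name `drop_none_tests` and the sample data are hypothetical, and it assumes that simply removing the empty `tests` key is an acceptable fix for your project.

```python
def drop_none_tests(data):
    """Remove every 'tests' key whose value is None, in place.

    Hypothetical helper: returns the number of keys removed.
    """
    removed = 0
    if isinstance(data, dict):
        # Iterate over a snapshot so keys can be deleted while looping.
        for key, value in list(data.items()):
            if key == "tests" and value is None:
                del data[key]
                removed += 1
            else:
                removed += drop_none_tests(value)
    elif isinstance(data, list):
        for item in data:
            removed += drop_none_tests(item)
    return removed


# Made-up stand-in for the parsed content of a properties file.
example = {
    "models": [
        {
            "name": "my_model",
            "tests": None,
            "columns": [
                {"name": "id", "tests": ["not_null"]},
                {"name": "other", "tests": None},
            ],
        }
    ]
}

print(drop_none_tests(example))  # number of empty `tests` keys removed
print(example)
```

To apply this to real files you would load each one with `yaml.safe_load`, call the helper, and write the result back with `yaml.safe_dump`. Be aware that a PyYAML round trip like that discards comments and original formatting, so a comment-preserving YAML library is probably a better fit for an actual in-place migration.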