Comments (5)
🤖
Hey @jlaneve! Good to see you here again, your insights are always thought-provoking. Thanks for this detailed issue! I'm diving into your proposal now and will get back to you with a comprehensive response shortly.
from astronomer-cosmos.
from @tatiana: another workaround in the meantime could be creating manual dependencies between table A and C, using something like #577
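To make that workaround concrete, here is a minimal sketch of what "manual dependencies between table A and C" could look like if a view B sits between them and gets skipped. This is plain Python over an edge list, not Cosmos or Airflow API; the function name `bypass_views` and the `materialized` mapping are hypothetical, and the A -> B -> C layout is an assumption based on the comment above.

```python
from collections import defaultdict


def bypass_views(edges, materialized):
    """Drop view nodes and wire each kept node to its nearest non-view descendants.

    edges: iterable of (upstream, downstream) model names.
    materialized: dict mapping model name -> materialization, e.g. "table" or "view".
    Both argument shapes are hypothetical, chosen only for this illustration.
    """
    children = defaultdict(set)
    for upstream, downstream in edges:
        children[upstream].add(downstream)

    def kept_descendants(node):
        # Walk downstream, passing through views until a non-view node is reached.
        found, seen = set(), set()
        stack = list(children[node])
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            if materialized.get(current) == "view":
                stack.extend(children[current])
            else:
                found.add(current)
        return found

    nodes = {name for edge in edges for name in edge}
    return sorted(
        (node, descendant)
        for node in nodes
        if materialized.get(node) != "view"
        for descendant in kept_descendants(node)
    )


edges = [("A", "B"), ("B", "C")]
materialized = {"A": "table", "B": "view", "C": "table"}
print(bypass_views(edges, materialized))
```

With the view B skipped, the only remaining dependency is the direct one from A to C, which is the edge you would otherwise have to add by hand.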
I really like your proposal, @jlaneve! Once we work on this, we should follow up in #airflow-astronomer:
https://apache-airflow.slack.com/archives/C03T0AVNA6A/p1702309820139329
Love it!
Nothing wrong with this as a feature that users can enable, and as long as I'm aware of this feature I'll just do whatever I need in my own setup, so decisions about the default behavior won't impact me much. That said, if I may, my $0.02 would be that it is probably better not to have Cosmos skip views as a default behavior.
Main reasons:
- Some amount of astonishment as it is non-standard `dbt` behavior: The default behavior for dbt is that it will attempt to recreate any view. I think by default Cosmos should attempt to more or less emulate what would happen if you just ran `dbt build`, and this would include recreating any views.
- Unclear what problem this is solving: If the issue is compile time, I think there are much better ways to solve this. For example, #785. (<- Although this issue doesn't address performance issues for small, "just getting started" style projects, it does help for larger, more complex projects that have `dbt compile` as part of their CI/CD, which are the ones more likely to hit performance issues anyway.)
- Not skipping is likely better behavior for new users: The trade-off in making this the default seems to be potentially fewer complaints about performance / DagRun landing times, at the risk of more complaints that Cosmos is unexpectedly skipping things. For users just getting started with Cosmos, performance shouldn't be an issue as their projects are likely to be small. So not skipping should optimize the new user experience.
- Backwards compatibility: This will be a huge gotcha for a lot of people who are upgrading Cosmos without looking at the release notes.
A few other reasons:
- Some users update their views a lot: I think the assumption is that views don't change, but a lot of workflows may involve making many changes to views. Obviously not on each run, but many users may be constantly making code changes 5 days a week and those code changes may be woven through the DAG.
- Lack of `Param`s in existing `DbtDag`, combined with being a non-standard `dbt` feature: The proposal here suggests adding a `Param`, presumably to the `DbtDag`. I think adding `Param`s to the `DbtDag` would be fine, but it feels like there should be more of an all-or-nothing approach; most notably, `--full-refresh` is a `Param` that makes a lot of sense to include as well.
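To illustrate the all-or-nothing point, here is a rough sketch of how a pair of run-time toggles could translate into a `dbt build` invocation. It is not how Cosmos builds its commands; the function name and the `full_refresh` / `skip_views` parameter names are hypothetical stand-ins for whatever `Param`s a `DbtDag` might expose. The `--full-refresh` flag and the `config.materialized:view` selector are standard dbt CLI features.

```python
def dbt_build_args(full_refresh=False, skip_views=False):
    """Build a dbt CLI argument list from two hypothetical run-time toggles."""
    args = ["dbt", "build"]
    if full_refresh:
        # Standard dbt flag: rebuild incremental models from scratch.
        args.append("--full-refresh")
    if skip_views:
        # Standard dbt node selection: exclude models materialized as views.
        args += ["--exclude", "config.materialized:view"]
    return args


# Defaults mirror a plain `dbt build`, so views are recreated unless opted out.
print(dbt_build_args())
print(dbt_build_args(full_refresh=True, skip_views=True))
```

Keeping both toggles off by default matches the argument above that out-of-the-box behavior should emulate a plain `dbt build`.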
Again, this doesn't really impact me, I'm just chiming in on what I think would be the better way to proceed for Cosmos's sake.