Comments (2)
Hi Jan!
Thanks for sending this in. This is actually a known issue with the current implementation of condenser, though not quite as described. Allow me to explain.
When you select a direct target, such as Meals, the current algorithm does one pass "upstream": it follows the foreign (parent) keys that point to Meals (in this case, from MealFood and MealI18n), then any grandparents, and so on, until it reaches the "root tables", i.e. tables with no keys pointing to them. In this case, MealFood and MealI18n are the tables at the top.
From there, we make a downstream pass all the way down to the "leaf tables", i.e. tables that don't point to anything else. In this case, from MealFood and MealI18n, we go back down to Meal and Food, and then we call it a day. This is a referentially intact subset. Notably though, we never go back upstream! This means when we hit Food (which is first hit on the downstream pass), we don't go back upstream to grab FoodI18n. The intent was to minimize the amount of data collected while still producing a good subset, though there's a wide variety of opinions on what makes a good subset.
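The two passes described above can be sketched at the table level. This is only an illustration, not condenser's actual code: the `REFERENCES` mapping is a hypothetical schema inferred from this thread (MealFood referencing Meal and Food, MealI18n referencing Meal, FoodI18n referencing Food), the function names are made up, and the real tool selects rows rather than just deciding which tables are visited.

```python
from collections import deque

# Hypothetical schema inferred from the discussion: each table maps to the
# tables that its foreign keys reference.
REFERENCES = {
    "MealFood": ["Meal", "Food"],
    "MealI18n": ["Meal"],
    "FoodI18n": ["Food"],
    "Meal": [],
    "Food": [],
}


def referenced_by(references):
    """Invert the graph: table -> tables that hold a foreign key to it."""
    inverse = {table: [] for table in references}
    for table, targets in references.items():
        for target in targets:
            inverse[target].append(table)
    return inverse


def reachable(start, edges):
    """Breadth-first closure of `start` over `edges`, including `start`."""
    seen = set(start)
    queue = deque(start)
    while queue:
        table = queue.popleft()
        for neighbour in edges[table]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def tables_in_subset(target, references):
    """One upstream pass from the target, then one downstream pass."""
    # Upstream: walk towards the tables that point at the target.
    upstream = reachable([target], referenced_by(references))
    # Downstream: from everything found so far, follow foreign keys down
    # to the leaf tables. There is no second upstream pass.
    return reachable(upstream, references)


subset = tables_in_subset("Meal", REFERENCES)
print(sorted(subset))
```

Under these assumptions, Food enters the subset only during the downstream pass, so FoodI18n, which sits upstream of Food, is never visited.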
That being said, we've actually written a new algorithm for subsetting traversal, and we're currently working on adding it to the core Tonic product. It will in fact hit MealI18n, and it ensures every related table will be subset.
We're not sure if we'll be adding this to condenser any time soon, but hopefully this context is helpful! I heard you spoke with Jake recently; we're happy to help where we can.
from condenser.
Hi Johnny
Thanks for the detailed explanation. I did take a brief look at the description of the algorithm, but your explanation makes it a lot clearer.