Comments (2)
It makes most sense to my mind to structure this as two separate stored procedures, one for events and one for contexts.
Structs and arrays can be handled by creating new objects of the same type, or flattening - not sure which makes most sense at the moment.
I think it's acceptable to pare it down to a minimal implementation first (e.g. only top-level structs and arrays), since the vast majority of users don't heavily use structs or arrays. Fields that aren't handled can either be omitted, or included but not coalesced. The latter probably makes most sense, since it lets people handle those cases themselves.
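To make the "top-level only" idea concrete, here is a rough sketch of the flattening option. The function name, the `_` separator and the input shape are all assumptions made for illustration; nothing here is taken from the existing models. Scalar keys of a struct become flat columns, while nested structs and arrays are passed through unchanged (included but not coalesced).

```javascript
// Hypothetical helper: the name, the "_" separator and the input shape are assumptions.
// Flattens only the top level of a struct. Nested structs and arrays are
// returned untouched so that users can handle them themselves.
function flattenTopLevel(fieldName, struct) {
  const columns = {};      // scalar values, flattened to <fieldName>_<key>
  const passthrough = {};  // nested structs/arrays, included but not flattened
  for (const [key, val] of Object.entries(struct)) {
    // typeof null is "object", so null is explicitly treated as a scalar.
    const isNested = val !== null && typeof val === "object";
    const target = isNested ? passthrough : columns;
    target[`${fieldName}_${key}`] = val;
  }
  return { columns, passthrough };
}
```

The other option mentioned above (creating new objects of the same type) would keep the struct as a single column instead of splitting it; this sketch only covers flattening.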
from data-models.
A possible approach:
- A JavaScript UDF takes an array of struct or array field paths as input and outputs, as a struct, the superset of key-value pairs it finds at those paths (prioritising the latest version).
- Call that UDF within the stored procedure on any array/struct fields found.
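As a rough sketch of what the body of such a UDF might look like: the function name and input shape below are assumptions, not an existing API. It assumes the values read from each field path have already been collected into an array ordered from oldest to latest version, and it only handles struct values; array fields would need a separate decision.

```javascript
// Hypothetical UDF body: the name and input shape are assumptions.
// `versions` holds the value found at each field path, ordered oldest to latest.
// Each entry is either a struct (plain object) or null when that version has no data.
function coalesceFieldVersions(versions) {
  const merged = {};
  for (const value of versions) {
    // Skip missing versions and anything that is not a plain struct.
    if (value === null || typeof value !== "object" || Array.isArray(value)) continue;
    for (const [key, val] of Object.entries(value)) {
      if (val !== null && val !== undefined) {
        // Later versions overwrite earlier ones, but only with a real value,
        // so a null in the latest version does not hide older data.
        merged[key] = val;
      } else if (!(key in merged)) {
        // Keep the key in the output even when no version has a value for it.
        merged[key] = null;
      }
    }
  }
  return merged;
}
```

For example, `coalesceFieldVersions([{ a: 1, b: 2 }, { b: 3, c: null }])` gives a struct with `a` from the older version, `b` from the latest, and `c` present but null.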
Need to check whether the key is always present even if one row has no value for a given key. If not, we may need something more complex in the UDF.
Also the output can change across runs, so it could only be used to produce scratch tables (but I don't think there's any way to avoid this).