Comments (5)
I have the infrastructure for this working and am now writing the actual inference logic. It would be helpful to know which date format(s) are supported by SQL. For now I am working under the assumption that it's just YYYY-MM-DD.
from mimir.
*supported by Mimir
from mimir.
A basic version of this is working. Types are inferred by pattern matching each value against ints, floats, dates and booleans and collecting votes for the matches. The type with the most votes wins if its votes exceed 95% of the total votes; otherwise the column defaults to String. We still need a coercion policy for values that do not conform to the inferred type (the other 5%).
from mimir.
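For anyone following along, the voting scheme described above can be sketched roughly as below. This is a minimal Python illustration, not Mimir's actual code: the regular expressions, the `infer_column_type` name, the choice to let unmatched values vote for string, and the choice to skip empty cells are all assumptions made for the example. The date pattern follows the YYYY-MM-DD assumption from earlier in the thread.

```python
import re
from collections import Counter

# Hypothetical patterns: the thread names the candidate types but not the
# exact expressions used to recognise them.
PATTERNS = {
    "int": re.compile(r"^[+-]?\d+$"),
    "float": re.compile(r"^[+-]?(\d+\.\d*|\.\d+)([eE][+-]?\d+)?$"),
    "date": re.compile(r"^\d{4}-\d{2}-\d{2}$"),  # YYYY-MM-DD only
    "bool": re.compile(r"^(true|false)$", re.IGNORECASE),
}


def infer_column_type(values, threshold=0.95):
    """Return the winning type name, or "string" if no type clears the threshold."""
    votes = Counter()
    for raw in values:
        text = raw.strip()
        if not text:
            continue  # assumption: empty cells do not vote
        for name, pattern in PATTERNS.items():
            if pattern.match(text):
                votes[name] += 1
                break
        else:
            votes["string"] += 1  # assumption: unmatched values vote for string

    total = sum(votes.values())
    if total == 0:
        return "string"
    winner, count = votes.most_common(1)[0]
    # The winner must hold strictly more than `threshold` of all votes.
    return winner if count / total > threshold else "string"
```

With this sketch, a column of 39 integers and one stray word is inferred as `int` (97.5% of the votes), while 19 integers and one stray word falls back to `string` because 95% is not strictly greater than the threshold. Note that the float pattern here requires a decimal point, so a column mixing ints and floats would also fall back to `string`; a real implementation would probably want to treat ints as valid floats.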
Kickass. Just replace non-conforming values with NULL for now.
from mimir.
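The NULL-replacement policy suggested above could look something like the following sketch. It is written in Python purely for illustration and is not taken from Mimir: `coerce_column`, `parse_bool` and the `PARSERS` table are hypothetical names, Python's `None` stands in for SQL NULL, and the built-in parsers are somewhat more lenient than a strict pattern match would be.

```python
from datetime import date


def parse_bool(text):
    """Parse 'true'/'false' (any case); raise ValueError for anything else."""
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    raise ValueError(f"not a boolean: {text!r}")


# Hypothetical mapping from an inferred type name to a parser for it.
PARSERS = {
    "int": int,
    "float": float,
    "date": date.fromisoformat,  # expects ISO dates such as YYYY-MM-DD
    "bool": parse_bool,
    "string": str,
}


def coerce_column(values, type_name):
    """Parse each value as `type_name`; values that fail to parse become None."""
    parser = PARSERS[type_name]
    coerced = []
    for raw in values:
        try:
            coerced.append(parser(raw.strip()))
        except ValueError:
            coerced.append(None)  # None plays the role of SQL NULL here
    return coerced
```

For example, `coerce_column(["1", "x", "3"], "int")` gives `[1, None, 3]`: the one value that does not fit the inferred type is dropped to NULL rather than failing the whole import.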
Done.
CSV imports now work with headers (no types, just attribute names), and type inference works, I think (tested only with the Employee table in awesomedb; needs further testing).
from mimir.
Related Issues (20)
- Mimir still creating spark-warehouse and metastore_db
- Support for Vizual within Mimir
- Support table mutation operations in Mimir
- OFFSET queries are painfully slow / do not complete
- Catch UserInterruptedException (and others) in Mimir Command Line
- crash on use of 'like'
- Error compiling float/int addition
- row_number is incorrectly pulled into lazy_row
- Interpolation Model impossibly slow
- Add NLP lenses, e.g., for Date Extraction
- Sanitize sheet name and headers for google sheet datasource
- Replace UDFs/UDAs with Spark's Catalog
- Stratified Sampling Operator
- The shape detector lens is not producing sensible caveats
- CAST behavior inconsistent between Mimir and Spark
- Detect Headers not properly removing header row
- Order-by resolves (group-by) attributes against pre-aggregate schema, not post-aggregate schema.
- Replace typesystem with Spark-/Hive- types
- Switch to HyperLogLog for domain tests
- SystemCatalog is sloooooow.