uralian / ignition
Creating reusable workflows for Apache Spark
License: Apache License 2.0
Need to implement a step for importing data from a jdbc-compliant database.
Implement the following listeners:
Need to implement a step for writing data into a JDBC-compliant database.
Need to provide both options for further deployment:
The UpdateState function can be exposed as a Merger step for 2 inputs:
Currently there's only one artifact, ignition.jar
To make it flexible, need to refactor that into multiple jar files:
Plus one fat ignition-all.jar
Currently, each output value is recomputed every time the output is accessed. Need to implement the internal step cache to avoid that, and reset operation to reset the value and force the recomputation.
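The caching behaviour described above can be sketched as follows. This is a minimal illustration in Python, not Ignition's actual API: the `CachedStep` class and its `output` and `reset` methods are hypothetical names chosen for the example.

```python
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class CachedStep(Generic[T]):
    """Hypothetical step wrapper: computes its output once, then reuses it."""

    def __init__(self, compute: Callable[[], T]) -> None:
        self._compute = compute
        self._has_value = False          # separate flag so a cached None is still a hit
        self._value: Optional[T] = None

    def output(self) -> T:
        # Compute only on the first access (or the first access after reset).
        if not self._has_value:
            self._value = self._compute()
            self._has_value = True
        return self._value

    def reset(self) -> None:
        # Drop the cached value so the next output() call recomputes it.
        self._value = None
        self._has_value = False


calls = []


def expensive() -> int:
    calls.append(1)                      # record every real computation
    return 42


step = CachedStep(expensive)
step.output()
step.output()                            # served from the cache
step.reset()
step.output()                            # recomputed after reset
```

A separate `_has_value` flag is used instead of checking for `None`, so that a step whose legitimate result is `None` is not recomputed on every access.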
Date functions are not working because of the serialization to string.
Need to create IT configuration and move the appropriate unit tests there or create new ones:
Combine the two steps into one and extend its functionality to provide the following:
Implement a step to provide caching of intermediate results
Currently spark libraries and their dependencies are added to the distribution; change their scope to provided, but also allow them to be added at runtime when running the examples
How to use it directly?
Implement a step for generating a set of numeric values in a given range
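A rough sketch of what such a generator could look like, written in plain Python rather than against Spark. The function name `generate_range` and the half-open interval (the end value is excluded) are assumptions made for the example; the issue does not specify either.

```python
from typing import Iterator, Union

Number = Union[int, float]


def generate_range(start: Number, stop: Number, step: Number = 1) -> Iterator[Number]:
    """Yield start, start + step, ... while the value stays before stop (exclusive)."""
    if step == 0:
        raise ValueError("step must be non-zero")
    count = 0
    value = start
    while (step > 0 and value < stop) or (step < 0 and value > stop):
        yield value
        count += 1
        # Multiply instead of accumulating to limit floating-point drift.
        value = start + count * step


ascending = list(generate_range(0, 5))
descending = list(generate_range(10, 0, -5))
fractional = list(generate_range(0, 1, 0.25))
```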
Implement a step which would allow increasing or decreasing the number of data partitions.
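In Spark itself this is what `repartition` and `coalesce` do. The idea can be illustrated without Spark by redistributing rows across a new number of partitions. The `repartition` helper below is a hypothetical stand-in that models partitions as plain lists and uses round-robin placement, which is only one of several possible strategies.

```python
from typing import List, TypeVar

T = TypeVar("T")


def repartition(partitions: List[List[T]], num_partitions: int) -> List[List[T]]:
    """Redistribute all rows round-robin over num_partitions new partitions."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    result: List[List[T]] = [[] for _ in range(num_partitions)]
    rows = (row for partition in partitions for row in partition)
    for index, row in enumerate(rows):
        result[index % num_partitions].append(row)
    return result


skewed = [[1, 2, 3], [4]]
balanced = repartition(skewed, 2)       # same partition count, evened out
widened = repartition(skewed, 3)        # more partitions
merged = repartition(skewed, 1)         # fewer partitions
```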
Need to add AddFields step to the mix
frame.Main and stream.Main each initialize their own context; they need to be refactored to share a single SparkContext (SC).
Operator IN does not work for stream filter, because it cannot parse it from the String. IN needs to be added as a UDF function.
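One way to picture the proposed fix: instead of teaching the parser a new operator, membership is exposed as an ordinary named function that a filter expression can call. The sketch below is plain Python with a hypothetical function registry (`FUNCTIONS`, `register`, `in_set`, `filter_rows`); it does not use Spark's UDF registration API.

```python
from typing import Any, Callable, Dict, List

# Hypothetical registry of functions that filter expressions may call by name.
FUNCTIONS: Dict[str, Callable[..., Any]] = {}


def register(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        FUNCTIONS[name] = fn
        return fn
    return decorator


@register("in_set")
def in_set(value: Any, *candidates: Any) -> bool:
    """Equivalent of `value IN (candidates...)` expressed as a function call."""
    return value in candidates


def filter_rows(rows: List[dict], field: str, func_name: str, *args: Any) -> List[dict]:
    # Look the function up by name, as a string-based filter would have to.
    fn = FUNCTIONS[func_name]
    return [row for row in rows if fn(row[field], *args)]


rows = [{"city": "Boston"}, {"city": "Austin"}, {"city": "Denver"}]
matched = filter_rows(rows, "city", "in_set", "Boston", "Denver")
```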
Currently, Formula will fail if the input step does not have any rows, because of the way the schema gets computed. This needs to be fixed.
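A possible direction, assuming the failure comes from deriving the output schema from the data itself: derive it from the declared input schema instead, so an empty input still yields a valid schema. The sketch below is plain Python; `formula_step` and its parameters are hypothetical and do not reflect Ignition's actual Formula implementation.

```python
from typing import Any, Callable, Dict, List, Tuple

Schema = List[Tuple[str, type]]
Row = Tuple[Any, ...]


def formula_step(
    input_schema: Schema,
    rows: List[Row],
    field_name: str,
    field_type: type,
    formula: Callable[[Dict[str, Any]], Any],
) -> Tuple[Schema, List[Row]]:
    """Append a computed field; the schema never depends on the rows."""
    names = [name for name, _ in input_schema]
    output_schema = input_schema + [(field_name, field_type)]
    output_rows = [row + (formula(dict(zip(names, row))),) for row in rows]
    return output_schema, output_rows


schema: Schema = [("price", float), ("qty", int)]


def total(record: Dict[str, Any]) -> float:
    return record["price"] * record["qty"]


empty_schema, empty_rows = formula_step(schema, [], "total", float, total)
full_schema, full_rows = formula_step(schema, [(2.0, 3)], "total", float, total)
```

Because the output schema is built only from `input_schema` and the declared type of the new field, the empty and non-empty cases return the same schema.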
There's inconsistency in naming various elements of step representations: "group-by" vs "groupBy", "columns" vs "fields" etc. Need to make it consistent throughout the app. Also, think of making the tags shorter (like "csv-input" vs "csv-file-input", "debug" vs "debug-output" etc.)
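As an illustration of how names could be normalized during a transition, the sketch below converts camelCase to kebab-case and maps long tags to the shorter forms mentioned above. The choice of kebab-case as the canonical style is an assumption made for the example, and `to_kebab`, `normalize_tag` and `SHORT_TAGS` are hypothetical names.

```python
import re
from typing import Dict

# Long tag -> short tag, using the examples given in the issue text.
SHORT_TAGS: Dict[str, str] = {
    "csv-file-input": "csv-input",
    "debug-output": "debug",
}


def to_kebab(name: str) -> str:
    """Convert camelCase to kebab-case; names already in kebab-case pass through."""
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"-\1", name).lower()


def normalize_tag(tag: str) -> str:
    """Apply one naming style, then shorten the tag if a short form is defined."""
    kebab = to_kebab(tag)
    return SHORT_TAGS.get(kebab, kebab)


normalized = [normalize_tag(t) for t in ["groupBy", "group-by", "csv-file-input", "debugOutput"]]
```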
Implement an input step for streaming data from DSA
Upgrade Spark to the latest version (1.4.1 as of 9/15), update the Cassandra spark connector accordingly.