ignition's People

Contributors

uralian

ignition's Issues

Implement step and flow listeners

Implement the following listeners:

  • Step listener to be notified on step computation
  • Flow listener to be notified when flow starts/stops
  • Stream data listener to be notified on each batch
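The three listeners above could look roughly like the following sketch. Every name in it (`StepListener`, `FlowListener`, `StreamDataListener`, `ListenerRegistry` and their methods) is hypothetical and not taken from the project; it only illustrates one possible shape for the callbacks.

```scala
object ListenerSketch {
  // Hypothetical callback interfaces, one per listener kind named in the issue.
  trait StepListener { def onStepComputed(stepId: String): Unit }
  trait FlowListener {
    def onFlowStarted(flowId: String): Unit
    def onFlowStopped(flowId: String): Unit
  }
  trait StreamDataListener { def onBatch(flowId: String, rowCount: Long): Unit }

  // Minimal registry that fans each event out to the registered listeners.
  final class ListenerRegistry {
    private var stepListeners = Vector.empty[StepListener]
    private var flowListeners = Vector.empty[FlowListener]
    private var batchListeners = Vector.empty[StreamDataListener]

    def addStepListener(l: StepListener): Unit = synchronized { stepListeners = stepListeners :+ l }
    def addFlowListener(l: FlowListener): Unit = synchronized { flowListeners = flowListeners :+ l }
    def addStreamDataListener(l: StreamDataListener): Unit = synchronized { batchListeners = batchListeners :+ l }

    def stepComputed(stepId: String): Unit = stepListeners.foreach(_.onStepComputed(stepId))
    def flowStarted(flowId: String): Unit = flowListeners.foreach(_.onFlowStarted(flowId))
    def flowStopped(flowId: String): Unit = flowListeners.foreach(_.onFlowStopped(flowId))
    def batchProcessed(flowId: String, rowCount: Long): Unit =
      batchListeners.foreach(_.onBatch(flowId, rowCount))
  }
}
```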

Implement stream UpdateState function as a Merger construct

The UpdateState function can be exposed as a Merger step with two inputs:

  • the first argument is a DataFrame wrapper around Seq[Row] - the input data
  • the second argument is the optional state as a DataFrame (0 rows corresponds to None)
  • the output is the result state as a DataFrame (0 rows corresponds to None)
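The "0 rows corresponds to None" convention above can be sketched without Spark. In this sketch `Frame` is a plain-Scala stand-in for a DataFrame, and `asMerger`, `toOption`, `toFrame` and `runningCount` are hypothetical names, not project API.

```scala
object UpdateStateSketch {
  type Row = Seq[Any]
  type Frame = Seq[Row] // stand-in for a DataFrame; an assumption made for this sketch

  // An empty frame encodes None; a non-empty frame encodes Some(state).
  def toOption(frame: Frame): Option[Frame] = if (frame.isEmpty) None else Some(frame)
  def toFrame(state: Option[Frame]): Frame = state.getOrElse(Seq.empty)

  // Wraps an update function (input, optional state) => optional state
  // as a two-input merger (input frame, state frame) => result state frame.
  def asMerger(update: (Frame, Option[Frame]) => Option[Frame]): (Frame, Frame) => Frame =
    (input, state) => toFrame(update(input, toOption(state)))

  // Example update function: keeps a running count of input rows in a one-cell state.
  val runningCount: (Frame, Option[Frame]) => Option[Frame] = (input, state) => {
    val previous = state.map(_.head.head.asInstanceOf[Long]).getOrElse(0L)
    val total = previous + input.size
    if (total == 0L) None else Some(Seq(Seq(total)))
  }
}
```

Feeding the output of one call back in as the second input of the next call carries the state from batch to batch.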

Rework artifact into multiple artifacts + all

Currently there is only one artifact, ignition.jar.
To make the build more flexible, it needs to be split into multiple jar files:

  • ignition-core
  • ignition-db
  • ignition-dsa
  • etc.

Plus one fat ignition-all.jar
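
One way to lay this out, assuming the build uses sbt with the sbt-assembly plugin (the issue does not say which build tool is used), is a multi-project `build.sbt`. The directory names and the dependencies between modules below are guesses made for illustration only.

```scala
// build.sbt (sketch; assumes sbt plus the sbt-assembly plugin)
lazy val core = (project in file("ignition-core"))
  .settings(name := "ignition-core")

// Assumed here: the db and dsa modules depend on core.
lazy val db = (project in file("ignition-db"))
  .dependsOn(core)
  .settings(name := "ignition-db")

lazy val dsa = (project in file("ignition-dsa"))
  .dependsOn(core)
  .settings(name := "ignition-dsa")

// Aggregating module whose assembly task produces the fat jar.
lazy val all = (project in file("ignition-all"))
  .dependsOn(core, db, dsa)
  .settings(
    name := "ignition-all",
    assembly / assemblyJarName := "ignition-all.jar"
  )
```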

Implement step cache, to avoid recomputing outputs

Currently, each output value is recomputed every time the output is accessed. An internal step cache is needed to avoid that, together with a reset operation that clears the cached value and forces recomputation.
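
A minimal sketch of such a cache, assuming each output can be modeled as a zero-argument function; `CachedOutput` is a hypothetical name, not a class from the project.

```scala
object StepCacheSketch {
  // Computes the value on first access, then serves the cached copy until reset.
  final class CachedOutput[A](compute: () => A) {
    private var cached: Option[A] = None

    def value: A = synchronized {
      cached.getOrElse {
        val result = compute()
        cached = Some(result)
        result
      }
    }

    // Drops the cached value so the next access recomputes it.
    def reset(): Unit = synchronized { cached = None }
  }
}
```

For example, wrapping an expensive computation as `new CachedOutput(() => expensive())` runs it once across repeated `value` calls, and once more after each `reset()`.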

Create integration tests

An integration test (IT) configuration is needed. The appropriate unit tests should be moved there, or new tests created, for:

  • RestClient
  • Cassandra
  • Mongo
  • Kafka

Combine CsvFileInput and TextFileInput

Combine the two steps into one and extend its functionality to provide the following:

  • Row separation strategy
    • newline - use textFile
    • regex - use Source.fromFile, then split
    • none - use Source.fromFile
  • Column separation strategy
    • regex - use split on each row
    • fixed - use take(), etc.
    • none - whole row
  • Column names/types
    • none - use COL0, COL1, etc. with String type
    • schema - validate and apply conversion

Change spark library scope to "provided"

Currently the Spark libraries and their dependencies are added to the distribution. Change their scope to "provided", but still allow them to be added at runtime when running the examples.
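
If the build uses sbt (an assumption; the issue does not name the build tool), the change could look like the fragment below. The `sparkVersion` value is a placeholder, and the `run` override follows the pattern commonly used to put provided dependencies back on the classpath for `sbt run`.

```scala
// build.sbt fragment (sketch; assumes sbt)
val sparkVersion = "x.y.z" // placeholder, not the project's actual version

// Provided: needed at compile time, left out of the packaged distribution.
libraryDependencies += "org.apache.spark" %% "spark-core" % sparkVersion % Provided

// Use the full compile classpath, including provided dependencies, for `sbt run`.
Compile / run := Defaults.runTask(
  Compile / fullClasspath,
  Compile / run / mainClass,
  Compile / run / runner
).evaluated
```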

Make XML and JSON tag names consistent

The elements of step representations are named inconsistently: "group-by" vs "groupBy", "columns" vs "fields", etc. The naming needs to be consistent throughout the app. Also consider making the tags shorter (like "csv-input" vs "csv-file-input", "debug" vs "debug-output", etc.).
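
If hyphenated names were chosen as the single convention (the issue does not say which style wins), camelCase names could be mapped onto it with a small helper. `toKebabCase` is a hypothetical name used only in this sketch.

```scala
object TagNamesSketch {
  // Inserts a hyphen at each lower-to-upper boundary, then lowercases the result.
  def toKebabCase(name: String): String =
    name.replaceAll("([a-z0-9])([A-Z])", "$1-$2").toLowerCase
}
```

Names that are already hyphenated, such as "group-by", pass through unchanged.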
