
floe2's People

Contributors

kumbhare


Forkers

lazycrazyowl

floe2's Issues

Channel grouping and message dispersion feature -- how messages are distributed when multiple pellets subscribe to the same output stream.

Currently, messages are duplicated to each subscribing pellet (an AND-split), e.g. for task parallelism.

This feature adds the ability to selectively send messages to desired downstream pellets (a multi-select split).

We will use named streams to implement this feature.

That is, a pellet can generate more than one named stream. The emitter interface will take an additional parameter that refers to the stream name. The stream names must be static, predefined by the pellet, and cannot be changed at runtime.

A downstream neighbour can then subscribe not only to the pellet, but also to a particular named stream of that pellet.

By default, all pellets generate a stream named "DEFAULT_STREAM", and a subscription made without any parameters subscribes to the stream of the same name.
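The named-stream idea above can be sketched as follows. This is a minimal conceptual model, not the actual Floe API: the class and method names (`Emitter`, `emit`, `subscribe`) are illustrative assumptions.

```python
# Conceptual sketch of named output streams (NOT the actual Floe API):
# a pellet declares its stream names up front, emits to one of them,
# and downstream pellets subscribe to (pellet, stream) pairs.

DEFAULT_STREAM = "DEFAULT_STREAM"

class Emitter:
    def __init__(self, stream_names=(DEFAULT_STREAM,)):
        # Stream names are static: declared once, never changed at runtime.
        self.streams = {name: [] for name in stream_names}

    def emit(self, message, stream=DEFAULT_STREAM):
        if stream not in self.streams:
            raise ValueError(f"undeclared stream: {stream}")
        for subscriber in self.streams[stream]:
            subscriber(message)

    def subscribe(self, callback, stream=DEFAULT_STREAM):
        # Subscribing without a stream name uses DEFAULT_STREAM.
        self.streams[stream].append(callback)
```

A subscriber that names a stream receives only messages emitted on that stream; emitting on an undeclared stream fails fast, matching the "static, predefined" requirement.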

Enable ZooKeeper (ZK) notifications for application start/stop.

Currently, once the resource mapping is obtained from the resource manager, it is stored in ZK where individual containers monitor the ZK location and deploy any flakes assigned to them as required.

However, there is no way for the coordinator (and in turn, the user) to know whether all flakes have been deployed successfully.
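One way to close that feedback loop can be sketched as below, with ZK replaced by an in-memory store and purely hypothetical names (`DeploymentBarrier`, `mark_deployed`): each container marks its flakes as deployed, and the coordinator is notified once all expected flakes have reported. In the real system the markers would be ZK nodes that the coordinator watches.

```python
# Hedged sketch of the missing feedback path (ZooKeeper simulated with an
# in-memory set): containers report per-flake deployment, and the
# coordinator callback fires once every expected flake has reported.

class DeploymentBarrier:
    def __init__(self, expected_flakes, on_complete):
        self.expected = set(expected_flakes)
        self.deployed = set()
        self.on_complete = on_complete  # coordinator notification

    def mark_deployed(self, flake_id):
        # In the real system this would be a ZK node creation that the
        # coordinator watches; here it is a direct call.
        self.deployed.add(flake_id)
        if self.deployed >= self.expected:
            self.on_complete()
```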

scale down stopped working.

Scale down does not work.
The resource mapping is correctly updated and the signal is received by the container, but the scale-down action does not decrement the pellet instance count.

Move the messaging and subscription logic from PelletExecutor to Flake.

Currently:

The predecessor pellet instance creates and sends a message meant directly for the succeeding pellet's instance. This is nice because the notion of a "flake" is completely hidden from the preceding flake's dispersion strategy; it works directly with the instances.

But on the flip side, even though the flake is the unit of communication, multiple subscriptions and backchannel messages are sent per flake, since each flake creates multiple pellet instances.

This also has implications on fault tolerance (e.g. in case of reducer pellets). (Will add more details later).

Runtime scale-down feature and command-line API.

Add the feature to perform scale down at runtime (i.e. gracefully terminate PE instances).
- Add a command-line API to do so.
- Use the updated data structure from scale up to hold the resource mapping 'delta'.
- Update Container/AppsAssignmentMonitor to look for changes in the app assignment and respond to the scale-down request.

Note: It is important to cleanly close data connections to ensure no messages are lost during scale down, and any messages received but pending in the queue should be processed before terminating the instance.

scalable streaming map reduce.

Multiple features need to be added as part of this.

NOTE: This issue is only for scalable MR (not elastic MR), i.e. the number of reducers is decided at deployment time.

  1. channel grouping
  2. channel dispersion interface and strategies
  3. special pellet types for mapper and reducers
  4. appbuilder apis
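Item 1 above, channel grouping with a reducer count fixed at deployment, can be sketched as key-hash routing (a common approach, not necessarily Floe's exact strategy):

```python
import zlib

# Sketch of channel (key) grouping for scalable streaming MR: the number
# of reducers is fixed at deployment, and every message with the same key
# is routed to the same reducer channel.

def reducer_for(key, num_reducers):
    # Stable hash (zlib.crc32) so routing is deterministic across runs,
    # unlike Python's randomized built-in hash() for strings.
    return zlib.crc32(key.encode()) % num_reducers
```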

Running multiple pellet instances per flake does not work correctly.

This regression was caused by an effort to optimize away the middle layer of communication between the message emitter and the multiple backends (one per edge in the application).

Fix: bring the middle layer back! (Needs more refactoring to keep the back channel and other pieces intact.)

signallable pellets.

Ability to send signals to running pellets.

  1. Note: signals should be delivered on the same thread as data messages. This lets the user respond to a signal without worrying about threading issues.
  2. A side effect is that while a signal is being processed, data messages will queue up in the ZMQ queues.
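The single-thread delivery described above can be sketched with signals and data sharing one inbox, so the user's signal handler needs no locking (the loop and names are illustrative assumptions):

```python
import queue

# Sketch of signallable pellets: signals and data share one queue, so the
# signal handler runs on the same thread as data processing and needs no
# locking; data naturally queues while a signal is being handled.

def pellet_loop(inbox, on_data, on_signal, max_msgs):
    for _ in range(max_msgs):
        kind, payload = inbox.get()
        if kind == "signal":
            on_signal(payload)   # data keeps queuing meanwhile
        else:
            on_data(payload)
```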

Scale up after all flakes have been terminated results in a blocked flake.

The flake first connects to the preceding flakes and then creates the pellet instance. But given ZMQ's PUSH policy that the send method blocks until at least one client connects, this leads to an issue, since both the backend send and the control-signal receive are on the same thread. Because the backend send blocks, the control signal to increment the pellet count is never received, and a deadlock occurs.
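One way out of this deadlock can be sketched as below (an assumed structure, not Floe's code, with ZMQ sockets simulated by bounded queues): instead of a blocking send that starves the control channel, use a bounded, timed send and service control messages between attempts.

```python
import queue

# Sketch of deadlock avoidance for the flake's shared thread: control
# messages are serviced between bounded send attempts, so an "increment
# pellet" signal is never starved by a send that cannot complete.

def flake_loop(out_channel, control, pending, steps):
    sent = 0
    for _ in range(steps):
        # Service control first, before attempting another send.
        try:
            signal = control.get_nowait()
            signal()
        except queue.Empty:
            pass
        if pending:
            try:
                out_channel.put(pending[0], timeout=0.01)
                pending.pop(0)
                sent += 1
            except queue.Full:
                continue  # no subscriber yet; retry after checking control
    return sent
```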

Dynamic Tasks (with a static list of alternate implementations).

Add support for dynamic pellets, with the AppBuilderAPI to easily create such dynamic tasks.

Add runtime support for dynamically switching the active alternate for a pellet.

NOTE: This issue covers only dynamic alternate switching, without synchronization/consistency guarantees. A separate issue will be opened for those.
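A dynamic task with a static list of alternates can be sketched as follows; the class and method names are illustrative, not the AppBuilderAPI:

```python
# Sketch of a dynamic pellet with a static list of alternates: the set of
# implementations is fixed at build time, but the active one can be
# switched at runtime (without any consistency guarantees, per the note).

class DynamicPellet:
    def __init__(self, alternates, active):
        self._alternates = dict(alternates)  # static: fixed at build time
        self.switch_to(active)

    def switch_to(self, name):
        if name not in self._alternates:
            raise KeyError(f"unknown alternate: {name}")
        self._active = self._alternates[name]

    def process(self, msg):
        return self._active(msg)
```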

Runtime scale-up feature and command-line API.

Add the feature to perform scale up/out at runtime depending on available resources.
- Add a command-line API to do so.
- Update the data structure to hold the resource mapping 'delta'.
- Update Container/AppsAssignmentMonitor to look for changes in the app assignment and respond to the scale-up request.

Note: Active load balancing is not part of this and will be handled separately.

This is decoupled from the scale-down feature, since scaling down needs additional work to cleanly close data connections and ensure no messages are lost.
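The resource-mapping 'delta' mentioned above can be sketched as a map of per-pellet instance-count changes that a container applies to its current assignment (the structure is an assumption for illustration):

```python
# Sketch of applying a resource-mapping "delta": only changed pellet
# instance counts are recorded, and the container merges the delta into
# its current assignment (names and structure are illustrative).

def apply_delta(assignment, delta):
    updated = dict(assignment)
    for pellet, change in delta.items():
        updated[pellet] = updated.get(pellet, 0) + change
        if updated[pellet] <= 0:
            del updated[pellet]  # pellet fully scaled down on this container
    return updated
```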

AndSplit (i.e. multiple-subscriber) feature does not work.

When multiple subscribers subscribe to the same stream, the stream should be duplicated across them.
Currently, the stream is instead split (round-robin) across them.

Expected:

  1. Duplicate the stream across multiple subscriber pellets (task parallelism).
  2. Split the stream (round-robin, or load-balanced) across multiple instances of the same subscriber pellet (data parallelism).
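The expected behavior above can be sketched in one dispersion step (names are illustrative, not Floe's dispersion-strategy interface): duplicate across distinct pellets, round-robin within each pellet's instances.

```python
import itertools

# Sketch of the expected AndSplit dispersion: every distinct subscriber
# pellet gets a copy of the message (task parallelism), but within each
# pellet the copies round-robin across its instances (data parallelism).

def disperse(message, subscribers, counters):
    """subscribers: {pellet_name: [instance, ...]};
    counters: {pellet_name: itertools.count()} holding round-robin state."""
    targets = []
    for pellet, instances in subscribers.items():
        idx = next(counters[pellet]) % len(instances)  # split within pellet
        targets.append(instances[idx])                 # duplicate across pellets
    return targets
```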

Performing a scale-up command in the container after a scale down (which terminates the flake) does not work.

The issue is with heartbeats.
(done) Fix for previously terminated flakes: the last heartbeat sent by the flake marks it as terminated, which makes the container clean up the flake, so a subsequent scale up initializes a new flake.

Fix for failed flakes (where the heartbeat may stop without the terminate flag): reduce the flake-to-container heartbeat interval, and periodically clean up flakes whose heartbeat is not received in time. (Will be filed and fixed as a separate issue, as part of the fault-tolerance feature.)
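The proposed periodic cleanup can be sketched as a reaper over per-flake heartbeat timestamps (function and field names are illustrative assumptions):

```python
# Sketch of the proposed fix for failed flakes: the container records the
# last heartbeat time per flake and periodically reaps any flake whose
# heartbeat is older than a timeout, so a later scale up starts fresh.

def reap_stale_flakes(last_heartbeat, now, timeout):
    stale = [f for f, t in last_heartbeat.items() if now - t > timeout]
    for flake in stale:
        del last_heartbeat[flake]  # cleanup enables a fresh flake on scale up
    return stale
```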
