
Suuchi's Introduction


Suuchi - सूचि

Inspired by tools like Uber's Ringpop and a strong desire to understand how distributed systems work, Suuchi was born.

Suuchi is a toolkit for building distributed data systems that uses gRPC under the hood as its communication medium. The overall goal of this project is to build pluggable components that a developer can easily compose to build a data system with the desired characteristics.

This project is of beta quality, and it currently powers a couple of systems in a production setting at @indix. We welcome all kinds of feedback to help improve the library.

Read the Documentation at http://ashwanthkumar.github.io/suuchi.

Suuchi in Sanskrit means an index.

Presentations

The following presentations / videos explain the motivation behind Suuchi.

Notes

If you're getting a ClassNotFound exception, please run mvn clean compile once to generate the Java classes from the protoc files. Also, if you're using IntelliJ, it helps to close the project when running the above command; IntelliJ seems to auto-detect sources in target/ at startup but not afterwards.

Release workflow

Suuchi and its modules follow a git-commit-message based release workflow. Use the script make-release.sh to push an empty commit to the repository, which triggers a release workflow on travis-ci. More information can be found in the docs.

License

https://www.apache.org/licenses/LICENSE-2.0

suuchi's People

Contributors

ashwanthkumar, brewkode, dependabot[bot], gsriram7


suuchi's Issues

Parallel Write Replication

As part of #27 and #23 we got a SequentialReplication strategy; we should look to improve it to support parallel writes on the nodes.
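A minimal sketch of the idea, assuming nothing about Suuchi's actual replicator API: fan the write out to all replica nodes concurrently with Futures instead of writing to them one after another, then wait for every ack. `Node`, `writeTo` and `parallelReplicate` are illustrative names only.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical sketch: replicate a write to all nodes in parallel.
case class Node(id: Int)

def writeTo(node: Node, key: Array[Byte], value: Array[Byte]): Future[Boolean] =
  Future {
    // a real implementation would make a gRPC call to the node here
    true
  }

def parallelReplicate(nodes: List[Node], key: Array[Byte], value: Array[Byte]): Boolean = {
  // all replica writes are in flight at once; succeed only if every ack arrives
  val acks = Future.sequence(nodes.map(n => writeTo(n, key, value)))
  Await.result(acks, 5.seconds).forall(identity)
}
```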

Documentation Setup using mkdocs

Some PRs have really nice commit messages and discussions (#27 being my favourite). We should start documenting the internal decisions and workings of the system in markdown somewhere, so we can generate documentation from it at a later point in time.

We should surface the examples much more prominently, as recipes that folks can copy-paste and take from there. Also, with #12 we should write scaladocs for external-facing APIs.

A quickstart guide for general users of the library would also be great. We've already made an attempt as part of #19, but we could do better.


  • Basic Documentation Infra
  • Add examples as recipes
  • Quick Start
  • Getting Started

Provide abstraction on ConsistentHashRing

Today we have a ConsistentHashRing which supports find, add and remove. While this is great, it would really help if we could build abstractions on top that let us visualise it as a HashRing with token ranges between the nodes in the ring.

This would help us when we try out rebalancing of data among the nodes.
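A sketch of what such an abstraction might look like, with illustrative names (`TokenRange`, `ranges` are not Suuchi APIs): each node's token closes the range that starts at the previous node's token, and the last range wraps around the ring.

```scala
// Hypothetical sketch: view a consistent-hash ring as token ranges.
case class TokenRange(start: Int, end: Int, owner: String)

// tokens: (token -> node) positions on the ring, in any order
def ranges(tokens: List[(Int, String)]): List[TokenRange] = {
  val sorted = tokens.sortBy(_._1)
  // pair each token with its successor; the last token wraps to the first
  sorted.zip(sorted.tail :+ sorted.head).map {
    case ((start, _), (end, owner)) => TokenRange(start, end, owner)
  }
}
```

With the ring expressed as ranges, a rebalancer can reason about which ranges (and hence which keys) move when a node is added or removed.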

Get basic Cluster membership working

  • Membership
    • Ability to register a set of nodes onto a cluster
    • Tests with members going up & down
    • Ability to query any node and check for the available members

Upgrade to Scala 2.11.x

We're still on Scala 2.10.4, while Scala 2.11.x is nearing its end by the end of this year. We should probably upgrade to pick up a lot of compiler improvements.

Integrate Cluster

  • Integrate Cluster into Server abstraction built as part of #13
  • Support for Shard related information to be used for replication / rebalancing (#4)
  • Support dynamic rebalancing of data

Compiler errors on setup

Hi! I imported this project into IntelliJ as an sbt project and ran sbt compile. I get a lot of compiler errors across the project. Please help me with the steps to set up this repo. Thanks!

Optimize scans in VersionedStore

Today, as part of scanner and versionScanner, we do a full store scan and then filter for data keys or version keys. This is very inefficient on very large stores and takes a long time. Instead we can push the VERSION_KEY or DATA_KEY prefix down to the underlying store, thereby reducing the search space of the scans.
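The difference can be illustrated with a toy sorted store (a TreeMap standing in for RocksDB; `VERSION_PREFIX`, `scanAndFilter` and `prefixScan` are illustrative names): a prefix pushdown seeks straight to the prefix and stops at its upper bound, instead of touching every key.

```scala
// Toy sorted "store" mixing data keys (D_) and version keys (V_).
val store = scala.collection.immutable.TreeMap(
  "D_k1" -> "data1", "D_k2" -> "data2", "V_k1" -> "v1", "V_k2" -> "v2"
)

val VERSION_PREFIX = "V_"

// Full scan + filter: iterates over every key in the store.
def scanAndFilter(prefix: String): List[String] =
  store.keys.filter(_.startsWith(prefix)).toList

// Prefix pushdown: only the keys within [prefix, prefix + maxChar) are visited.
def prefixScan(prefix: String): List[String] =
  store.range(prefix, prefix + "\uffff").keys.toList
```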

Services must know whether a request is a replication request or the original request

When an RPC service is routed via ReplicationHandler, the service must be able to know whether it's handling the replication invocation or the original invocation. We can use Context to store and retrieve that information.

This is especially useful when we send a metric for that replication request; you don't want to end up double-counting it.
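A sketch of the idea, with a `DynamicVariable` standing in for gRPC's Context (the real implementation would attach a key to io.grpc.Context instead; all names here are illustrative): downstream code checks the flag and skips metrics for replica invocations.

```scala
import scala.util.DynamicVariable

// Stand-in for a gRPC Context key marking "this is a replication request".
val isReplicationRequest = new DynamicVariable[Boolean](false)

var writes = 0

// Emit the write metric only for the original invocation, not replicas.
def countWriteMetric(): Unit =
  if (!isReplicationRequest.value) writes += 1

def handleOriginal(): Unit = countWriteMetric()

def handleReplica(): Unit =
  // the replication handler would set the flag before invoking the service
  isReplicationRequest.withValue(true) { countWriteMetric() }
```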

Improve / Add ScalaDoc for public facing APIs

Today we don't have many ScalaDocs written as part of our code. One of the things we may have to do for wider adoption is to write sensible documentation on what each method does, so it's easy for developers to consume them.

Node Abstraction

Following needs to be done to get a working cluster with membership in place.

  • Build a Node abstraction
  • Compose membership, partitioner & services as part of it.
  • When the node is started, it should expose a listen port & should be able to handle GET / PUT requests.

Data Rebalancing

Add ability for the cluster to

  • scale out - addition of new nodes
  • scale down - nodes going down.
  • Anti Entropy - Refer #50 for more details

Tasks

  • It's relatively easy to do data rebalancing with ConsistentHashRing, but given the generic nature of RoutingStrategy, we need to decide on the interactions of the Rebalancer with the RoutingStrategy
  • Implement / integrate membership with the Server. Once we have Membership, think about how it would integrate with Partitioner / RoutingStrategy for maintaining the list of nodes.

Dependencies

  • While re-balancing we need to know what keys to migrate which needs an Anti-entropy implementation #50

Setup Release workflow on Travis

Now that we've moved away from SnapCI, we need a way to perform automatic releases from Travis. With SnapCI it was as easy as a single click on a stage in the pipeline; we need a different mechanism to handle this in the Travis world.

Look at Error handling

Today we don't do any try .. catch anywhere in the project. gRPC seems to throw a lot of RuntimeExceptions in different places. We need to track each of them and address them. This is more of an epic that will be ongoing for a while.

For any PRs or commits that have discussions related to this, please tag them with this issue so they're automatically tracked.

Completely Consistent Reads

With #27 (and #23) we now have pluggable replication in place. And since we have special interceptors for writes, it should be easy to do the same for reads as well.

For a start, we can do a digest query on all the nodes to give out a fully consistent response.
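A sketch of the digest-read idea under illustrative names (`digest`, `consistentRead` are not Suuchi APIs): ask every replica for a cheap digest of its value, and only return an answer when all digests agree.

```scala
import java.security.MessageDigest

// Cheap fingerprint of a value; Seq[Byte] gives usable equality.
def digest(value: Array[Byte]): Seq[Byte] =
  MessageDigest.getInstance("MD5").digest(value).toSeq

// Return a value only when every replica agrees on its digest.
def consistentRead(replicaValues: List[Array[Byte]]): Option[Array[Byte]] = {
  val digests = replicaValues.map(digest)
  if (digests.distinct.size == 1) replicaValues.headOption else None
}
```

When the digests disagree, a real implementation would fall back to a repair path (e.g. read the full values and reconcile) rather than returning nothing.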

ToDos

  • Membership using Atomix
  • Implement a simple HandleOrForward using consistent hashing - #2 and #11
  • Build replication
  • Build rebalancing (and dynamic scale out)

Fix the VRecord.key in VersionedStore

Today we set the key of VRecord as V_. This is a little useless, because it makes it impossible to get the data back from versionScanner().scan. We need to make sure we return the actual key in the VRecord along with the list of versions for that key.

Chained Write Replication

As part of #27 and #23 we got a SequentialReplication strategy; we should look to improve it to support chained writes across the replica nodes.

Related - #30

Make the replicator pluggable

As part of #34 we now have synchronous ParallelReplication support, but withReplication on Server still has only SequentialReplicator hard-coded. We have to make it pluggable.

Namespacing & Scoping

Ensure that all classes have appropriate access control for us to split them later into modules.

HandleOrForward functionality.

Any operation (get or put) on the cluster should translate to a HandleOrForward operation.
Based on the key, the node should be able to decide whether it can handle the request locally or should forward the request to an appropriate node.

@ashwanthkumar below is my stab at the contract. Thoughts?
def handleOrForward(key: K): Node

PS: Is this a good way to write down issues?
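One possible shape for that contract, as a sketch only: route the key with a trivial hash over the node list, and let the caller check whether the owner is the local node. `Node`, `Router` and the modulo routing are illustrative, not Suuchi's implementation.

```scala
// Hypothetical sketch of handleOrForward with a trivial hash-based router.
case class Node(id: Int, isSelf: Boolean)

class Router(nodes: Vector[Node]) {
  // pick the owning node for a key; a real router would use the hash ring
  def handleOrForward(key: Array[Byte]): Node =
    nodes(math.abs(java.util.Arrays.hashCode(key)) % nodes.size)

  // handle locally when the owner is this node, otherwise forward to it
  def canHandleLocally(key: Array[Byte]): Boolean = handleOrForward(key).isSelf
}
```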

Partitioner implementation

trait Partitioner {
   def shard(r: Request): Array[Byte]
   def find(key: Array[Byte], replicaCount: Int): List[NodeInfo]
   def find(key: Array[Byte]) = find(key, 1)
}
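A minimal concrete sketch of the find side of the trait above (shard and Request are omitted here; `NodeInfo`'s fields and the modulo hash are illustrative assumptions): route a key to replicaCount consecutive nodes, so replicas land on distinct nodes.

```scala
// Hypothetical sketch of a Partitioner implementation.
case class NodeInfo(host: String, port: Int)

trait Partitioner {
  def find(key: Array[Byte], replicaCount: Int): List[NodeInfo]
  def find(key: Array[Byte]): List[NodeInfo] = find(key, 1)
}

class ModuloPartitioner(nodes: Vector[NodeInfo]) extends Partitioner {
  def find(key: Array[Byte], replicaCount: Int): List[NodeInfo] = {
    // primary node by hash, replicas on the following nodes around the "ring"
    val start = math.abs(java.util.Arrays.hashCode(key)) % nodes.size
    (0 until math.min(replicaCount, nodes.size))
      .map(i => nodes((start + i) % nodes.size)).toList
  }
}
```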

Implement in-memory store

  • Implement an in-memory store that support the operations of the Store trait.
trait Store {
  def get(key: Array[Byte]) : Array[Byte]
  def put(key: Array[Byte], data: Array[Byte]) : Unit 
}
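A minimal in-memory implementation of the Store trait above could look like this sketch; byte arrays are wrapped in Seq so they get value equality as map keys (the null-on-miss behaviour is an assumption, since the trait doesn't specify it).

```scala
trait Store {
  def get(key: Array[Byte]): Array[Byte]
  def put(key: Array[Byte], data: Array[Byte]): Unit
}

// Sketch of an in-memory Store backed by a mutable map.
class InMemoryStore extends Store {
  // Array[Byte] uses reference equality, so keys are stored as Seq[Byte]
  private val map = scala.collection.mutable.Map[Seq[Byte], Array[Byte]]()
  def get(key: Array[Byte]): Array[Byte] = map.getOrElse(key.toSeq, null)
  def put(key: Array[Byte], data: Array[Byte]): Unit = map(key.toSeq) = data
}
```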

Utility to convert Store -> ShardedStore

Provide a utility for users that have been using Store, a single store instance (like RocksDB), and want to migrate to ShardedStore (introduced as part of #53), since ShardedStore lets you effectively parallelize writes across multiple stores on the same node.
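A sketch of what such a migration utility might do, under illustrative names (string keys, a `scan` method and `MapStore` are assumptions, not Suuchi's API): scan every record out of the old single store and re-put it through a sharded store that routes each key to one of several underlying stores.

```scala
// Hypothetical Store with a scan, plus a toy map-backed implementation.
trait Store {
  def put(key: String, value: String): Unit
  def scan(): Iterator[(String, String)]
}

class MapStore extends Store {
  val data = scala.collection.mutable.LinkedHashMap[String, String]()
  def put(key: String, value: String): Unit = data(key) = value
  def scan(): Iterator[(String, String)] = data.iterator
}

// Routes each key to one of the underlying shards by hash.
class ShardedStore(shards: Vector[Store]) extends Store {
  private def shardFor(key: String) = shards(math.abs(key.hashCode) % shards.size)
  def put(key: String, value: String): Unit = shardFor(key).put(key, value)
  def scan(): Iterator[(String, String)] = shards.iterator.flatMap(_.scan())
}

// The migration utility: replay every record from the old store.
def migrate(from: Store, to: ShardedStore): Unit =
  from.scan().foreach { case (k, v) => to.put(k, v) }
```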
