dbus

      $$\       $$\                                       
      $$ |      $$ |                                      
 $$$$$$$ |      $$$$$$$\        $$\   $$\        $$$$$$$\ 
$$  __$$ |      $$  __$$\       $$ |  $$ |      $$  _____|
$$ /  $$ |      $$ |  $$ |      $$ |  $$ |      \$$$$$$\  
$$ |  $$ |      $$ |  $$ |      $$ |  $$ |       \____$$\ 
\$$$$$$$ |      $$$$$$$  |      \$$$$$$  |      $$$$$$$  |
 \_______|      \_______/        \______/       \_______/ 

What is dbus?

dbus = distributed data bus

It is yet another lightweight, versatile databus system that transfers and transforms pipeline data between plugins.

dbus works by building a DAG of structured data out of the different plugins: from data input, via an optional filter, to the output.
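The input, filter, output flow described above can be pictured with plain Go channels. This is only a minimal sketch of the idea, not dbus's actual plugin API: the `Packet` type and the `Input`, `Filter`, and `Output` functions are hypothetical names invented for illustration.

```go
package main

import "strings"

// Packet is a hypothetical unit of data flowing between stages.
type Packet struct {
	Payload string
}

// Input emits one packet per line and closes the channel when done.
func Input(lines []string) <-chan Packet {
	out := make(chan Packet)
	go func() {
		defer close(out)
		for _, l := range lines {
			out <- Packet{Payload: l}
		}
	}()
	return out
}

// Filter is the optional middle stage. In this sketch it drops blank
// payloads and upper-cases the rest.
func Filter(in <-chan Packet) <-chan Packet {
	out := make(chan Packet)
	go func() {
		defer close(out)
		for p := range in {
			if strings.TrimSpace(p.Payload) == "" {
				continue
			}
			out <- Packet{Payload: strings.ToUpper(p.Payload)}
		}
	}()
	return out
}

// Output drains the channel and collects the payloads in arrival order.
func Output(in <-chan Packet) []string {
	var got []string
	for p := range in {
		got = append(got, p.Payload)
	}
	return got
}
```

Calling `Output(Filter(Input(lines)))` wires the three stages into a linear graph; skipping the filter is just `Output(Input(lines))`.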

Similar projects

  • logstash
  • flume
  • nifi
  • camel
  • beats
  • kettle
  • zapier
  • google cloud dataflow
  • canal
  • storm
  • yahoo pipes (dead)

Status

dbus is not yet a 1.0. We're writing more tests, fixing bugs, working on TODOs.

Use Case

  • mysql binlog dispatcher
  • multiple DC kafka mirror

Features

dbus supports powerful and scalable directed graphs of data routing, transformation and system mediation logic.

  • Designed for extension
    • plugin architecture
    • build your own plugins and more
    • enables rapid development and effective testing
  • Data Provenance
    • track dataflow from beginning to end
    • visualized dataflow
    • rich metrics feed into tsdb
    • online manual mediation of the dataflow
    • RESTful API
    • monitoring with alert
  • Distributed Deployment
    • shard/balance/auto rebalance
    • linear scale
  • Delivery Guarantee
    • loss tolerant
    • high throughput vs low latency
    • back pressure
  • Robustness
    • race condition detected
    • edge cases fully covered
    • network jitter tested
    • dependent components failure tested
  • Systemic Quality
    • hot reload
    • dryrun throughput 1.9M packets/s
  • Cluster Support
    • modelling borrowed from helix+kafka controller
    • currently only leader/standby with sharding, without replica
    • easy to write a distributed plugin
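As an illustration of the "back pressure" item above, a bounded Go channel is one common way to make a fast producer wait for a slow consumer. The README does not say how dbus implements back pressure, so this is a generic sketch; `Stage`, `NewStage`, `Send`, `TrySend`, and `Receive` are hypothetical names.

```go
package main

// Stage is a hypothetical pipeline stage with a bounded inbox.
type Stage struct {
	inbox chan string
}

// NewStage creates a stage whose inbox holds at most capacity messages.
func NewStage(capacity int) *Stage {
	return &Stage{inbox: make(chan string, capacity)}
}

// Send blocks while the inbox is full, which slows the producer down
// to the pace of the consumer.
func (s *Stage) Send(msg string) {
	s.inbox <- msg
}

// TrySend is the non-blocking variant: it reports false when the
// inbox is full instead of waiting.
func (s *Stage) TrySend(msg string) bool {
	select {
	case s.inbox <- msg:
		return true
	default:
		return false
	}
}

// Receive takes one message from the inbox, freeing a slot.
func (s *Stage) Receive() string {
	return <-s.inbox
}
```

With a capacity of 2, a third `TrySend` fails until the consumer calls `Receive`, while a plain `Send` would simply wait.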

Getting Started

1. Installing

To start using dbus, install Go and run go get:

$ go get -u github.com/funkygao/dbus

2. Create config file

Please find sample config files in etc/ directory.

3. Run the server

$ $GOPATH/dbusd -conf $myfile

Dependencies

dbus uses zookeeper for sharding/balance/election.

Plugins

More plugins are listed under dbus-plugin.

Input

  • MysqlbinlogInput
  • KafkaInput
  • MockInput
  • StreamInput

Filter

  • MysqlbinlogFilter
  • MockFilter

Output

  • KafkaOutput
  • ESOutput
  • MockOutput
  • StreamOutput

Configuration

  • KafkaOutput async mode with batch=1024/500ms, ack=WaitForAll
  • KafkaOutput retry=3, retry.backoff=350ms
  • Mysql binlog positioner commits every 1s, channel buffer 100
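To make the retry settings above concrete (retry=3, retry.backoff=350ms), here is a minimal sketch of a fixed-backoff retry loop. It is not taken from the dbus source: `RetryPolicy`, `DefaultRetryPolicy`, `Do`, and `ErrSendFailed` are hypothetical, and it assumes retry=3 means three retries after the first attempt, which the README does not state.

```go
package main

import (
	"errors"
	"time"
)

// ErrSendFailed is a hypothetical error returned by a failing send.
var ErrSendFailed = errors.New("send failed")

// RetryPolicy holds the retry settings (field names are hypothetical).
type RetryPolicy struct {
	MaxRetries int           // retries after the first attempt (assumption)
	Backoff    time.Duration // fixed pause before each retry
}

// DefaultRetryPolicy returns the values listed in the README.
func DefaultRetryPolicy() RetryPolicy {
	return RetryPolicy{MaxRetries: 3, Backoff: 350 * time.Millisecond}
}

// Do calls send once, then retries up to MaxRetries times, sleeping
// Backoff before each retry. It returns nil on the first success,
// or the last error if every attempt fails.
func (p RetryPolicy) Do(send func() error) error {
	var err error
	for attempt := 0; attempt <= p.MaxRetries; attempt++ {
		if attempt > 0 {
			time.Sleep(p.Backoff)
		}
		if err = send(); err == nil {
			return nil
		}
	}
	return err
}
```

Under these assumptions, a send that keeps failing is attempted four times in total and spends about 1.05s sleeping before the error is returned.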

FAQ

mysql binlog

  • is it totally data loss tolerant?

    No. If the binlog exceeds 1MB, it will be discarded (lost).

why not canal?

  • no Delivery Guarantee
  • no Data Provenance
  • no integration with kafka
  • only hot standby deployment mode, while we need sharding of the load
  • dbus is a dataflow engine, while canal only supports the mysql binlog pipeline

compared with logstash

  • logstash has better ecosystem
  • dbus is cluster aware, provides delivery guarantee, data provenance

can there be more than one leader at the same time?

Yes.

For example, take 3 participants with 1 as the leader. If 1 is network partitioned and its zk session expires, [2, 3] detect this event and re-elect 2 as the new leader. Until 1 regains a new zk session, [1] and [2] are both leaders. If [1] and [2] both detect resource changes, they will both rebalance the cluster.

dbus uses epoch to solve this issue.
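The README does not describe how the epoch is applied, so the following is a generic sketch of epoch-based fencing rather than dbus's implementation. It assumes every election bumps the epoch, so decisions from a deposed leader carry a smaller epoch and can be rejected. `RebalanceDecision` and `Participant` are hypothetical names.

```go
package main

import "sync"

// RebalanceDecision is a hypothetical message a leader sends to a
// participant. Epoch is assumed to increase with every election.
type RebalanceDecision struct {
	Epoch  int
	Leader string
}

// Participant remembers the newest epoch it has seen.
type Participant struct {
	mu     sync.Mutex
	epoch  int
	leader string
}

// Apply accepts a decision only if its epoch is not older than the
// newest one seen so far. It reports whether the decision was accepted.
func (p *Participant) Apply(d RebalanceDecision) bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	if d.Epoch < p.epoch {
		return false // stale leader: ignore its rebalance
	}
	p.epoch = d.Epoch
	p.leader = d.Leader
	return true
}

// Leader returns the leader whose decision was accepted most recently.
func (p *Participant) Leader() string {
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.leader
}
```

In the scenario above, once a participant has accepted a decision from leader 2 at epoch 2, a late rebalance from leader 1 at epoch 1 is dropped.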

what if

  • zookeeper crash

    dbus continues to work, but Ack will not be able to persist

Contributors

  • funkygao
