Comments (6)

antirez commented on June 16, 2024

Hello @vaizki, I find this feature very interesting; it could be one of the first additions I make once the original tests all pass and persistence is implemented. Thanks.

antirez commented on June 16, 2024

@vaizki please also check #5

vaizki commented on June 16, 2024

Your ideas in #5 look good for synchronous multi-AZ replication but my patch was trying to resolve a different set of problems:

  • When doing a synchronous REPLICATE across AZs, the RTT between them (over the Atlantic for example) was too much for the original project's replication timers. I explained some of it in antirez/disque#189 (comment)
  • Even with the timers adjusted to prevent retries etc., a 100ms RTT caps a synchronous client connection at a maximum of 10 ADDJOBs per second if confirmation is needed from both AZs/DCs.
  • Sometimes you only have 2x AZ (my usual use case) and one of them is not available. In this case you cannot use sync replication that needs confirmation from both AZs.

So in essence, I am proposing separating desired replication and write concern in ADDJOB (a rough client-side sketch follows the list below).

  • Desired replication defines the number of replicas / AZs that the cluster will attempt to achieve as quickly as possible
  • Write concern defines the replication that needs to be confirmed before the command returns success to the client
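A minimal client-side sketch of what this two-parameter ADDJOB could look like, assuming a redis-py client talking to the disque-module port; `WRITECONCERN` is a hypothetical option name used only for illustration, while `REPLICATE` is part of the real ADDJOB syntax:

```python
# Hedged sketch: the proposed split between desired replication and write concern,
# seen from a client. REPLICATE exists today; WRITECONCERN is hypothetical.
import redis

r = redis.Redis(host="localhost", port=7711)  # assumed disque-module endpoint

# Ask the cluster to reach 3 copies as soon as it can (desired replication),
# but return OK to the client once 1 local copy is confirmed (write concern),
# keeping the cross-AZ round trip off the critical path.
job_id = r.execute_command(
    "ADDJOB", "myqueue", "payload", 1000,  # queue, body, ms-timeout
    "REPLICATE", 3,                        # best-effort replication target
    "WRITECONCERN", 1,                     # hypothetical: copies confirmed before OK
)
print(job_id)
```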

I also considered a model where the ADDJOB timeout would tie into this: the client waits for full replication to happen, and if the timeout is reached with only partial replication achieved, a "semi-success" is returned and the replication is completed asynchronously as soon as possible. But these ideas did not really work with the original project and I needed to move on.

antirez commented on June 16, 2024

@vaizki thanks, I understood the problem, but what is not completely clear to me is how useful a desired replication level that is sometimes not reached really is. It does not offer any true guarantee; it can only improve the rate of jobs that survive failures. It may still be interesting, and I didn't mean for my issue to replace the functionality of the one that you proposed. #5 is more like a step towards providing more guarantees. #4 instead attempts to reduce the percentage of jobs that get lost when they are replicated with weaker guarantees. Both have their place I guess, but #4 is more like a "best effort safety" feature.

vaizki commented on June 16, 2024

At least for me, doing synchronous writes to two AZs with 100ms+ RTT between them is just not an option. It kills performance and occasionally kills the whole application if the other AZ is not responding. It would be like writing to a Redis key and waiting for a slave (which may or may not be running somewhere far away) to ack the replication. Of course I would prefer 3 AZs with minimal latency between them, but that is not an available option.

An availability + performance comparison can also be drawn with RAID1 disk mirroring: if one side of the mirror is down, the remaining side still accepts writes just as quickly as before, and as soon as the faulty side recovers it is synced and N=2 redundancy is achieved again. So for replication, it is about being able to operate in a degraded state while requesting that the desired replication eventually be achieved.

Because Disque is a message passing system which by nature decouples producers and consumers, there is not as much worry about split-brain or diverging data among cluster members as there is with RAID1, Redis master-slave or an ACID RDBMS cluster. Working at the message level, with messages being immutable, guarantees that IF a message exists on a cluster member, then it has the correct content, so I can process it.

And that is exactly why I want to use Disque and need this tunable "write concern": I need availability and partition tolerance above all else, and to get them I can tolerate, for example, the same job being processed more than once.
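Since at-least-once delivery is the trade-off here, the consumer has to be idempotent. A minimal sketch, assuming a redis-py client and the original Disque GETJOB/ACKJOB reply shapes; the in-memory dedup set and the `process()` handler are hypothetical stand-ins:

```python
# Minimal idempotent consumer sketch: duplicate deliveries are acked and skipped.
import redis

r = redis.Redis(host="localhost", port=7711)  # assumed disque-module endpoint
seen = set()  # in production this would be a persistent deduplication store


def process(body):
    # hypothetical application handler
    print("processing", body)


while True:
    # Each reply element is [queue, job_id, body] as in the original Disque.
    jobs = r.execute_command("GETJOB", "TIMEOUT", 1000, "FROM", "myqueue")
    if not jobs:
        continue
    for queue, job_id, body in jobs:
        if job_id not in seen:  # first delivery: do the real work
            process(body)
            seen.add(job_id)
        r.execute_command("ACKJOB", job_id)  # ack duplicates too
```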

antirez commented on June 16, 2024

It's not a matter of consistency indeed; the replication factor in Disque tunes the ability to survive failures. In that regard, saying "give me the green light when the message has been replicated 2 times, but actually try to replicate it 5 times" can be seen as trading a minimum durability guarantee against optimal durability and latency concerns. So indeed it makes sense, but the messages are still only guaranteed to survive a single node failure. However, with this option, if there is a two-node failure the percentage of messages that get lost will be smaller.
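To put rough numbers on that last point, here is an illustrative calculation under an assumed scenario (not from the thread): a 10-node cluster, a job acknowledged after 2 confirmed copies, a best-effort target of 5 copies, and exactly 2 nodes failing at once.

```python
# Back-of-the-envelope loss probability under the assumed scenario above.
from math import comb

nodes, failed = 10, 2

# Job still has only its 2 confirmed copies: it is lost iff both copies
# happen to sit on the 2 failed nodes.
p_lost_confirmed_only = comb(failed, 2) / comb(nodes, 2)  # 1/45, about 2.2%

# Best-effort replication to 5 nodes already completed asynchronously:
# 2 failures can never remove all 5 copies.
p_lost_fully_replicated = 0.0

print(f"{p_lost_confirmed_only:.1%} vs {p_lost_fully_replicated:.1%}")
```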
