Comments (14)

krasserm commented on August 29, 2024

This is discussed in #52.

from akka-persistence-cassandra.

chbatey commented on August 29, 2024

Hi @krasserm, assuming you're not already working on it, I'd like to update the wip-akka-2.4 branch for the API changes in 2.4-M2. If you're planning on doing it soon, I'll hold off.

krasserm commented on August 29, 2024

@chbatey would be great if you could do the update. Thanks in advance for your contribution.

chbatey commented on August 29, 2024

The interface has changed to take in a Seq of AtomicWrites. After reading through #48, I'll keep it as one c* batch per AtomicWrite. This suffers from the existing issue that we could be writing to different partitions, so there is no guarantee of atomicity at the c* level.

krasserm commented on August 29, 2024

That's ok for the moment. WDYT about including all AtomicWrites that go to the same partition into a single c* batch?

I plan to work on #48 combined with some other changes, such as removing partition headers (which were only needed for individual deletes). A lot of complexity in the current implementation comes from the early plugin API and can be cleaned up now.

chbatey commented on August 29, 2024

In general, batches in C* reduce performance (who knew?) because they give a single coordinator a disproportionate amount of work to do, so I'd avoid doing this. I had hoped to suggest we remove all batching and replace it with multiple async writes, but that completely breaks the contract with akka-persistence.

krasserm commented on August 29, 2024

My proposal was to include all writes to the same partition into a c* batch (assuming that events of an AtomicWrite are never cross-partition). Of course, writes that go to different partitions should be included into different c* batches.
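
The grouping proposed above could look roughly like the following. This is only a sketch: `Event`, `AtomicWrite`, `partitionNr` and the fixed `partitionSize` are simplified, hypothetical stand-ins and not the real Akka or plugin types, and it relies on the stated assumption that the events of one AtomicWrite never cross a partition boundary.

```scala
object PartitionGrouping {
  // Hypothetical, simplified stand-ins; not Akka's real classes.
  final case class Event(persistenceId: String, sequenceNr: Long)
  final case class AtomicWrite(events: List[Event])

  // Assumed scheme: a fixed number of sequence numbers per partition.
  def partitionNr(sequenceNr: Long, partitionSize: Long): Long =
    (sequenceNr - 1L) / partitionSize

  // Group whole AtomicWrites by the (persistenceId, partition) of their
  // first event. Each resulting group would become one c* batch.
  def groupByPartition(
      writes: List[AtomicWrite],
      partitionSize: Long): Map[(String, Long), List[AtomicWrite]] =
    writes
      .filter(_.events.nonEmpty) // empty writes have no partition
      .groupBy { w =>
        val first = w.events.head
        (first.persistenceId, partitionNr(first.sequenceNr, partitionSize))
      }
}
```

Writes that land in different groups would go into different c* batches, so atomicity still holds only within a single partition.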

chbatey commented on August 29, 2024

I agree this is the one use case where c* batches make sense. However, you still generally get better perf from multiple writes, as you make use of multiple coordinators. I have, in previous applications, made use of multiple small batches to get the best of both worlds, but it is hard to come up with a general rule.

So I don't think it is worth the complexity of losing the 1:1 mapping between AtomicWrites:C* statements but if we do want to go ahead we should add a max batch size as big batches lead to unhappy c* nodes.
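
If merging does go ahead, a max batch size could be enforced along these lines. This is a sketch under assumptions: `pack` and `maxEvents` are hypothetical names, each write is modelled as a plain list of events, and the limit counts events rather than bytes. Writes are never split, since splitting one would break its atomicity.

```scala
object BatchLimiter {
  // Greedily pack whole writes into batches of at most maxEvents events.
  // A write is never split, so a single write larger than maxEvents
  // still ends up alone in its own (oversized) batch.
  def pack[A](writes: List[List[A]], maxEvents: Int): List[List[List[A]]] = {
    val empty = (List.empty[List[List[A]]], List.empty[List[A]])
    val (finished, open) = writes.foldLeft(empty) {
      case ((acc, cur), write) =>
        val curSize = cur.map(_.size).sum
        if (cur.nonEmpty && curSize + write.size > maxEvents)
          (cur.reverse :: acc, List(write)) // close the open batch, start a new one
        else
          (acc, write :: cur)               // the write still fits
    }
    // Flush the last open batch and restore the original order.
    (if (open.nonEmpty) open.reverse :: finished else finished).reverse
  }
}
```

For example, with `maxEvents = 3`, the writes `[1, 2]`, `[3]`, `[4, 5, 6]`, `[7]` are packed into three batches: `[1, 2]` with `[3]`, then `[4, 5, 6]`, then `[7]`.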

ktoso commented on August 29, 2024

Some background info on the changes:
An AtomicWrite will only contain multiple events if the user explicitly called persistAll(these, atomically, please). Otherwise, when persist(a); persist(b) is called, those are two separate AtomicWrites. So the atomic write really represents the user's intent that these events need to go together. Also, we don't really encourage this style (we don't encourage persistAll).

So that's the only requirement Akka-wise here, other things can be stored without batches if you think it performs better - for an AtomicWrite that contains only one event it seems it would be good to avoid creating the batch - so you'd inspect the atomic write that comes in and decide; right?
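
The "inspect and decide" step could be sketched like this. The types here (`Event`, `AtomicWrite`, `Plan` and its cases) are hypothetical simplifications for illustration, not the real Akka classes or the plugin's code.

```scala
object WritePlanner {
  // Hypothetical, simplified stand-ins; not Akka's real classes.
  final case class Event(persistenceId: String, sequenceNr: Long)
  final case class AtomicWrite(events: List[Event])

  // What to send to c* for one AtomicWrite.
  sealed trait Plan
  case object Skip extends Plan                               // nothing to write
  final case class SingleStatement(event: Event) extends Plan // plain insert, no batch
  final case class Batch(events: List[Event]) extends Plan    // events must go together

  def plan(write: AtomicWrite): Plan = write.events match {
    case Nil           => Skip
    case single :: Nil => SingleStatement(single)
    case many          => Batch(many)
  }
}
```

With this shape, only writes that came from persistAll pay the cost of a batch; a lone persist(a) maps to a single statement.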

krasserm commented on August 29, 2024

"However you still generally get better perf from multiple writes as you make use of multiple coordinators."

I thought there's only a single coordinator responsible for a given partition. So multiple writes to the same partition should always go to the same coordinator. Am I missing something?

chbatey commented on August 29, 2024

@krasserm It depends on driver version/config. At a minimum there'll be N possible coordinators, where N is the replication factor (which is nearly always 3 per DC); there's no master replica, as they are all equal. If you switch off TokenAwareLoadBalancer or you are using a slightly older driver, then any node in the cluster can be a coordinator for any partition. So basically a c* node has two roles: coordination and data storage.

It nearly always makes sense (and is now the default) that you always use a node for coordination that has the data you are trying to read/write. The only time I've turned this off is for "hot" partitions which can be helped if you stop using the nodes that own the "hot" partitions as coordinators.

@ktoso thanks for the clarification! I think, given we're not expecting large AtomicWrites, merging them at the c* level based on partition is a good idea.

ktoso commented on August 29, 2024

Which field is that @chbatey (that is included in PK here)? You mean persistence_id?

chbatey commented on August 29, 2024

[edited - checked with c* dev]

Renames are fine even if it is the primary key. However...

Tyler Hobbs
17:05
prepared statements will be invalidated (although there was a bug with that that was fixed semi-recently)
and, of course, if you're explicitly mentioning the column name anywhere (probably almost all queries do), it needs to be changed
and it's tough to change all statements in your codebase at just the right time
especially since the schema change takes a while to propagate across the cluster
so generally, you're going to need to stop writes for a bit

I am not sure how people have akka cluster/persistence deployed and whether they expect 0 downtime rolling upgrades so up to you guys :)

krasserm commented on August 29, 2024

Superseded by #64
