'2.4-snapshot' incompatible with 'master' schema (akka-persistence-cassandra, closed)

krasserm commented on August 29, 2024

'2.4-snapshot' incompatible with 'master' schema

from akka-persistence-cassandra.

Comments (14)

krasserm commented on August 29, 2024

This is discussed in #52.

chbatey commented on August 29, 2024

Hi @krasserm, assuming you're not already working on it, I'd like to update the wip-akka-2.4 branch for the API changes in 2.4-M2. If you're planning on doing it soon, I'll hold off.

krasserm commented on August 29, 2024

@chbatey would be great if you could do the update. Thanks in advance for your contribution.

chbatey commented on August 29, 2024

The interface has changed to take in a Seq of AtomicWrites. After reading through #48, I'll keep it as one c* batch per AtomicWrite. This suffers from the existing issue that we could be writing to different partitions, so there is no guarantee of atomicity at the c* level.

krasserm commented on August 29, 2024

That's ok for the moment. WDYT about including all AtomicWrites that go to the same partition into a single c* batch?

I plan to work on #48 combined with some other changes, such as removing partition headers (which were only needed for individual deletes). A lot of complexity in the current implementation comes from the early plugin API and can be cleaned up now.

chbatey commented on August 29, 2024

In general, batches in C* reduce performance (who knew?) because they give a single coordinator a disproportionate amount of work to do, so I'd avoid doing this. I had hoped to suggest we remove all batching and replace it with multiple async writes, but that completely breaks the contract with akka-persistence.

krasserm commented on August 29, 2024

My proposal was to include all writes to the same partition into a c* batch (assuming that events of an AtomicWrite are never cross-partition). Of course, writes that go to different partitions should be included into different c* batches.
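
The grouping proposed here can be sketched roughly as follows. This is a minimal model in plain Python rather than the plugin's Scala code, and everything in it is an assumption made for illustration: the `Event` and `AtomicWrite` classes, the `PARTITION_SIZE` value, and the rule that maps a sequence number to a partition are hypothetical and not taken from the plugin.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical number of events per partition, chosen only for the example.
PARTITION_SIZE = 5


@dataclass(frozen=True)
class Event:
    persistence_id: str
    sequence_nr: int


@dataclass(frozen=True)
class AtomicWrite:
    events: Tuple[Event, ...]


def partition_key(event: Event) -> Tuple[str, int]:
    # Assumed scheme: consecutive sequence numbers share a partition
    # until PARTITION_SIZE events have been written to it.
    return (event.persistence_id, (event.sequence_nr - 1) // PARTITION_SIZE)


def group_by_partition(writes: List[AtomicWrite]) -> Dict[Tuple[str, int], List[Event]]:
    """Merge AtomicWrites that target the same partition into one batch each."""
    batches: Dict[Tuple[str, int], List[Event]] = defaultdict(list)
    for write in writes:
        keys = {partition_key(e) for e in write.events}
        # The proposal assumes an AtomicWrite never crosses partitions,
        # so an empty or cross-partition write is rejected here.
        if len(keys) != 1:
            raise ValueError("AtomicWrite must target exactly one partition")
        batches[keys.pop()].extend(write.events)
    return dict(batches)
```

Each key of the returned dictionary would correspond to one c* batch; writes to different partitions end up under different keys and therefore in different batches.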

chbatey commented on August 29, 2024

I agree this is the one use case where c* batches make sense. However, you still generally get better perf from multiple writes as you make use of multiple coordinators. I have, in previous applications, made use of multiple small batches to get the best of both worlds, but it is hard to come up with a general rule.

So I don't think it is worth the complexity of losing the 1:1 mapping between AtomicWrites and C* statements, but if we do want to go ahead, we should add a max batch size, as big batches lead to unhappy c* nodes.
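
One way a max batch size could work is sketched below, under stated assumptions: the function name `split_into_batches` and the greedy strategy are hypothetical, each AtomicWrite is represented only by its event count, and all writes are assumed to target the same partition. An AtomicWrite is never split, since that would break its atomicity, so a write larger than the limit still gets a batch of its own.

```python
from typing import List


def split_into_batches(write_sizes: List[int], max_batch_size: int) -> List[List[int]]:
    """Greedily pack same-partition AtomicWrites into batches.

    write_sizes holds the number of events in each AtomicWrite, in order.
    A batch is closed as soon as adding the next write would exceed
    max_batch_size; individual writes are kept whole.
    """
    batches: List[List[int]] = []
    current: List[int] = []
    current_size = 0
    for size in write_sizes:
        if current and current_size + size > max_batch_size:
            batches.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

For example, `split_into_batches([2, 1, 3, 1, 5], 4)` yields three batches, with the oversized write of 5 events kept alone in the last one.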

ktoso commented on August 29, 2024

Some background info on the changes:
An AtomicWrite will only contain multiple events if the user explicitly called persistAll(these, atomically, please). Otherwise, when persist(a); persist(b) is called, those are two separate AtomicWrites. So the atomic write really represents the user's intent that these events need to go together. We also don't really encourage this style (we don't encourage persistAll).

So that's the only requirement Akka-wise here; other things can be stored without batches if you think it performs better. For an AtomicWrite that contains only one event, it seems it would be good to avoid creating the batch, so you'd inspect the atomic write that comes in and decide; right?
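
The "inspect and decide" step could look roughly like this. It is a sketch in plain Python, not the plugin's code: `plan_statements` and the `"single"` / `"batch"` labels are hypothetical names, and each AtomicWrite is modelled as a plain list of events.

```python
from typing import List, Tuple


def plan_statements(atomic_writes: List[List[str]]) -> List[Tuple[str, List[str]]]:
    """Decide per AtomicWrite whether a c* batch is needed.

    A write with one event becomes a plain statement; only a write with
    several events (the persistAll case) is wrapped in a batch.
    """
    plan = []
    for events in atomic_writes:
        if not events:
            raise ValueError("an AtomicWrite must contain at least one event")
        kind = "single" if len(events) == 1 else "batch"
        plan.append((kind, events))
    return plan
```

With this rule, `persist(a); persist(b)` would produce two plain statements, while a `persistAll` of several events would produce one batch.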

krasserm commented on August 29, 2024

> However you still generally get better perf from multiple writes as you make use of multiple coordinators.

I thought there's only a single coordinator responsible for a given partition. So multiple writes to the same partition should always go to the same coordinator. Am I missing something?

chbatey commented on August 29, 2024

@krasserm It depends on driver version/config. At a minimum there'll be N possible coordinators, where N is the replication factor (which is nearly always 3 per DC); there's no master replica as they are all equal. If you switch off TokenAwareLoadBalancer or you are using a slightly older driver, then any node in the cluster can be a coordinator for any partition. So basically a c* node has two roles: coordination and data storage.

It nearly always makes sense (and is now the default) that you always use a node for coordination that has the data you are trying to read/write. The only time I've turned this off is for "hot" partitions which can be helped if you stop using the nodes that own the "hot" partitions as coordinators.

@ktoso thanks for the clarification! I think, given we're not expecting large AtomicWrites, merging them at the c* level based on partition is a good idea.

ktoso commented on August 29, 2024

Which field is that @chbatey (that is included in PK here)? You mean persistence_id?

chbatey commented on August 29, 2024

[edited - checked with c* dev]

Re-names are fine even if it is the primary key. However...

Tyler Hobbs
17:05
prepared statements will be invalidated (although there was a bug with that that was fixed semi-recently)
and, of course, if you're explicitly mentioning the column name anywhere (probably almost all queries do), it needs to be changed
and it's tough to change all statements in your codebase at just the right time
especially since the schema change takes a while to propagate across the cluster
so generally, you're going to need to stop writes for a bit

I am not sure how people have akka cluster/persistence deployed and whether they expect 0 downtime rolling upgrades so up to you guys :)

krasserm commented on August 29, 2024

Superseded by #64
