eventsourcing / es4j

Event capture and querying framework for Java

Home Page: http://eventsourcing.com

License: Mozilla Public License 2.0

Java 99.28% JavaScript 0.17% PLpgSQL 0.31% Kotlin 0.24%

database event-sourcing java

es4j's Issues

Problem: PostgreSQLJournal leaking connections

This is most likely caused by npgall/cqengine#74. That said, we do have a workaround for this: if the iterator has actually finished, it should close.

Problem: PostgreSQL database provisioning contention

The PostgreSQL-backed journal and indices create the necessary tables and types without consideration for other nodes, so two nodes might attempt to do the same thing at the same time and conflict.

Proposed solution: ensure these storages are locking these initialization routines. PostgreSQL advisory locks can be used to easily accomplish this.
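A minimal sketch of how such locking could look over plain JDBC, assuming a session-level advisory lock is acceptable. The class name ProvisioningLock, the Provisioning callback, and the CRC32-based key derivation are illustrative assumptions, not part of es4j; only pg_advisory_lock and pg_advisory_unlock are PostgreSQL's own functions.

```java
import java.nio.charset.StandardCharsets;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.zip.CRC32;

// Hypothetical helper: serializes provisioning across nodes with an advisory lock.
final class ProvisioningLock {

    /** Callback holding the CREATE TABLE / CREATE TYPE statements (hypothetical). */
    interface Provisioning {
        void run(Connection connection) throws SQLException;
    }

    /** Derives a stable key from a name, so every node computes the same value. */
    static long lockKey(String name) {
        CRC32 crc = new CRC32();
        crc.update(name.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    /** Blocks until the lock is held, runs the body, then always releases the lock. */
    static void withLock(Connection connection, String name, Provisioning body)
            throws SQLException {
        long key = lockKey(name);
        try (PreparedStatement lock =
                     connection.prepareStatement("SELECT pg_advisory_lock(?)")) {
            lock.setLong(1, key);
            lock.execute();
        }
        try {
            body.run(connection);
        } finally {
            try (PreparedStatement unlock =
                         connection.prepareStatement("SELECT pg_advisory_unlock(?)")) {
                unlock.setLong(1, key);
                unlock.execute();
            }
        }
    }
}
```

A journal or index would wrap its initialization in something like ProvisioningLock.withLock(connection, "es4j-provisioning", c -> createTablesAndTypes(c)), where createTablesAndTypes stands in for the existing routine. The lock and unlock calls must go through the same connection, because session-level advisory locks belong to the session that took them.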

Problem: postgresql navigable index over a timestamp has a composite key

(Since the timestamp is not really comparable as is, it needs to be transparently re-encoded to a comparable value in order to work)

eventsourcing=# \d+ index_v1_9c22ea3e2733ba23f1e23c8577698827575a4457_navigable;
                   Table "public.index_v1_9c22ea3e2733ba23f1e23c8577698827575a4457_navigable"
 Column |                        Type                        | Modifiers | Storage  | Stats target | Description
--------+----------------------------------------------------+-----------+----------+--------------+-------------
 key    | layout_v1_c47d416193cba63554e5a69ca701973bc3e44172 |           | extended |              |
 object | uuid                                               |           | plain    |              |
Indexes:
    "index_v1_9c22ea3e2733ba23f1e23c8577698827575a4457_navigable_key" btree (key)
    "index_v1_9c22ea3e2733ba23f1e23c8577698827575a4457_navigable_obj" btree (object)

eventsourcing=# \d+ layout_v1_c47d416193cba63554e5a69ca701973bc3e44172
Composite type "public.layout_v1_c47d416193cba63554e5a69ca701973bc3e44172"
     Column     |  Type  | Modifiers | Storage | Description
----------------+--------+-----------+---------+-------------
 logicalCounter | bigint |           | plain   |
 logicalTime    | bigint |           | plain   |
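One possible re-encoding, sketched under assumptions: the two bigint components are packed into a single 16-byte big-endian value whose unsigned byte-wise order matches the order of (logicalTime, logicalCounter). The class name, the choice of logicalTime as the major component, and the use of a byte array as the key are all assumptions for illustration, not the es4j implementation.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical encoder: turns a two-part timestamp into one comparable key.
final class ComparableTimestamp {

    /**
     * Packs both components big-endian, most significant component first.
     * Flipping the sign bit makes negative values sort before positive ones
     * when the bytes are compared as unsigned.
     */
    static byte[] encode(long logicalTime, long logicalCounter) {
        return ByteBuffer.allocate(16)
                .putLong(logicalTime ^ Long.MIN_VALUE)
                .putLong(logicalCounter ^ Long.MIN_VALUE)
                .array();
    }

    /** Unsigned lexicographic comparison of two encoded keys. */
    static int compare(byte[] left, byte[] right) {
        return Arrays.compareUnsigned(left, right);
    }
}
```

With a key like this, the navigable index could use a single column with a plain btree index instead of a composite type, provided the column's comparison is byte-wise.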

Problem: *Index.as interferes with Kotlin

In Kotlin, as is a reserved keyword, so the method name has to be escaped with backticks:

@JvmField var X = SimpleIndex.`as` { o: TestEntity -> o.x }

as opposed to the intended

@JvmField var X = SimpleIndex.as { o: TestEntity -> o.x }

which is far from great.

Review and finalize TypeHandler's fingerprints

These fingerprints (1-2 bytes most of the time) are used for calculating the Layout hash, and they don't seem to follow a single predefined numbering scheme right now.

It is probably preferable to keep them as short as possible IF the schema layout needs to be transferred at scale (although, how big of a deal is this?).

It might make sense to rename the term altogether. Magic number?

Problem: sometimes, -args aren't or can't be enabled

It is, however, still very desirable to detect parameter names. @ParameterName is one possible workaround, but it is a bit verbose.

Proposed solution: In some cases, it might be possible to use @java.beans.ConstructorProperties. Nice thing is that (for example) Lombok generates it automatically.
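A rough sketch of how the annotation could be consulted, assuming a simple fallback to reflected parameter names. The ParameterNames helper and the UserLoggedIn class are hypothetical; java.beans.ConstructorProperties and the reflection calls are standard JDK APIs.

```java
import java.beans.ConstructorProperties;
import java.lang.reflect.Constructor;
import java.lang.reflect.Parameter;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: resolves constructor parameter names.
final class ParameterNames {

    static List<String> of(Constructor<?> constructor) {
        ConstructorProperties properties =
                constructor.getAnnotation(ConstructorProperties.class);
        if (properties != null) {
            // Names declared explicitly (or generated by a tool such as Lombok).
            return Arrays.asList(properties.value());
        }
        // Fallback: reflected names, which are only meaningful when the
        // compiler was asked to retain parameter names.
        return Arrays.stream(constructor.getParameters())
                .map(Parameter::getName)
                .collect(Collectors.toList());
    }
}

// Hypothetical event class used to demonstrate the lookup.
final class UserLoggedIn {
    final String login;
    final long at;

    @ConstructorProperties({"login", "at"})
    UserLoggedIn(String login, long at) {
        this.login = login;
        this.at = at;
    }
}
```

Calling ParameterNames.of(UserLoggedIn.class.getDeclaredConstructors()[0]) yields the names login and at regardless of compiler flags.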

Implement MVStore-backed RadixTreeIndex

Problem: unsized queries increase the overhead for non-lazy queries

When using something like a remote SQL database for indexing, we often rely on the fact that we might not be consuming the entire ResultSet to achieve a better performance profile (or not consuming it at all).

However, this only works for truly lazy queries, where nothing extra is done before the ResultSet is consumed and the overhead of consuming the next result is low. In the SQL database scenario, even if the query itself is lazy, the moment the first result is consumed, the entire query will be fired off, and if the size of the index is substantial, there will be a delay.

Proposed solution: implement a query option for sized (limited) queries (Limit(N)). It can be ignored by truly lazy indices, but SQL-based indices can use it to optimize their queries.
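To make the idea concrete, here is a small sketch of how a SQL-based index might translate such an option, assuming the option is carried as an OptionalInt. The LimitedQuery class is hypothetical; the key and object column names are borrowed from the navigable index table shown in another issue on this page.

```java
import java.util.OptionalInt;

// Hypothetical translation of a Limit(N) query option into SQL.
final class LimitedQuery {

    /** Builds the index scan; an absent limit leaves the query unsized. */
    static String select(String table, OptionalInt limit) {
        if (limit.isPresent() && limit.getAsInt() < 0) {
            throw new IllegalArgumentException("limit must not be negative");
        }
        String sql = "SELECT object FROM " + table + " ORDER BY key";
        return limit.isPresent() ? sql + " LIMIT " + limit.getAsInt() : sql;
    }
}
```

A truly lazy in-memory index would simply ignore the option, since stopping iteration early already costs it nothing.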

Persistence-checking index tests

restart survival
auto-indexing on index addition after a restart

Implement MVStore-backed ReversedRadixTreeIndex

Problem: outgrowing cqengine

In certain aspects, we are starting to outgrow cqengine, the indexing engine behind es4j, and one day we'll likely need our own indexing engine. cqengine was a great way to bootstrap and is still a good way to do what we want, but the time to upgrade will likely come. The telling signs so far are, in no particular order:

cqengine doesn't have support for atomic multi-index updates (this breaks index consistency, albeit for a relatively short [depending on requirements] period of time), or for transactional multi-index updates
cqengine leaks details of its API that add a lot of noise to es4j apps. Consider Query<EntityHandle<..>> or ResultSet<EntityHandle<..>> versus Query<...> or ResultSet<...>
cqengine can return only the primary collection's data in a join query: getIndexedCollection(Car.class).retrieve(query) returns ResultSet<Car>
using the cqengine API makes it harder to standardize our own index definitions and query language across languages.
its API design relies on generic type information, and this limits our abilities to an extent. We rely on reflected Fields to extract that information; cqengine itself does the same through typetools (which uses an undocumented sun.reflect API). Not sure if we can improve this significantly within Java's constraints (can we do this in Java 9?), but it is worth a look, at the very least.
limited expressibility of ordering-based queries ("find latest", etc.)

Proposed solution: implement our own indexing engine.

We can certainly learn from the experience with cqengine and take it to the next level. This issue is an open-ended "proposal collector" to figure out a longer term plan for a potential transition, or any ideas for any other way to address the problem.

(It is, of course, a significant undertaking)

Problem: querying for aggregates is expensive

Querying for things like:

latest (as in, biggest TIMESTAMP) event for a certain scope (say, reference=R1)
sum/avg/max/min for a certain scope (say, reference=R1)

is quite expensive, as it involves a lot of scanning (as can be seen in https://github.com/eventsourcing/es4j/blob/master/eventsourcing-queries/src/main/java/com/eventsourcing/queries/IsLatestEntity.java and https://github.com/eventsourcing/es4j/blob/master/eventsourcing-queries/src/main/java/com/eventsourcing/queries/LatestAssociatedEntryQuery.java)

Proposed solution: designate and formalize an aggregate indexing system.

Add support for specifying indices with a quantizer

Problem: Command#events can block command publishing

If Command#events takes a long time to complete (for example, when making external API calls), the entire queue of commands will be blocked, because command processing is single-threaded (originally, to avoid the complexities associated with multithreaded access to the journal, indexing, etc.). This also holds true for massive event streams.

Proposed solutions:

Add a wrapping CompletableFuture<EventStream<S>> version of #events() that will call the EventStream<S> version of #events() either a) in the same thread or b) in a worker thread, and wrap its result. The publishing process, therefore, should only send the command down the pipeline when the result is ready. There is still a concern about using blocking operations during stream generation.
Make command publishing multi-threaded. Not yet sure how difficult that would be.
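The first option could be sketched roughly as follows, with java.util.stream.Stream standing in for EventStream<S>. The AsyncEvents class and its method names are assumptions made for illustration and do not exist in es4j.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.function.Supplier;
import java.util.stream.Stream;

// Hypothetical wrappers around a blocking events() call.
final class AsyncEvents {

    /** Variant a): run events() in the calling thread and wrap the outcome. */
    static <E> CompletableFuture<Stream<E>> inCallerThread(Supplier<Stream<E>> events) {
        CompletableFuture<Stream<E>> future = new CompletableFuture<>();
        try {
            future.complete(events.get());
        } catch (RuntimeException e) {
            future.completeExceptionally(e);
        }
        return future;
    }

    /** Variant b): run events() on a worker so the command queue is not blocked. */
    static <E> CompletableFuture<Stream<E>> inWorker(Supplier<Stream<E>> events,
                                                     Executor workers) {
        return CompletableFuture.supplyAsync(events, workers);
    }
}
```

The publisher would then attach a continuation (for example with thenAccept) that journals the stream once it is ready. Note that the stream itself is still consumed lazily, so blocking work hidden inside stream generation remains a concern, as mentioned above.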

Implement MVStore-backed SuffixTreeIndex

Write tests for comparable serialization

Implement schema mismatch reporter

In any system the schema of commands and events will naturally evolve. While this is typically not a problem when using MemoryJournal (the data will not survive through restarts), it is an issue with persistent journals (MVStoreJournal and others).

Every persistent journal will need to print out the discrepancies found between entities available in the journal and in runtime. Using layout's hash we can easily detect which entities in the journal are no longer covered by the code at hand.

The report should look like this:

Found 1 unrecognized entity:

    d395f099078879b005797edf7ef37f83a673bb3582056490655984d3d5b18cf8 (234 records)

This reporter should not attempt to facilitate schema migrations in any way; it should be just a reporter.

It would be great if some meta information can be stored every time the journal is used. This way, the report can be much more readable:

Found 1 unrecognized entity:

    com.foo.bar.events.UserLoggedIn (234 records) # d395f099078879b005797edf7ef37f83a673bb3582056490655984d3d5b18cf8

Even better if the layout can be saved, too:

Found 1 unrecognized entity:

    com.foo.bar.events.UserLoggedIn (234 records) # d395f099078879b005797edf7ef37f83a673bb3582056490655984d3d5b18cf8
     login: String
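A minimal sketch of the simplest (hash-only) report, assuming the journal can supply a record count per layout hash and the runtime can supply the set of hashes it knows. The SchemaMismatchReport class and both inputs are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical reporter: lists journal entities that the running code no longer covers.
final class SchemaMismatchReport {

    /**
     * @param recordsByHash record count per layout hash, as found in the journal
     * @param knownHashes   layout hashes of entities known at runtime
     * @return the report text, or an empty string when nothing is unrecognized
     */
    static String render(Map<String, Long> recordsByHash, Set<String> knownHashes) {
        // Sorted copy, so the report order is stable between runs.
        Map<String, Long> unrecognized = new TreeMap<>(recordsByHash);
        unrecognized.keySet().removeAll(knownHashes);
        if (unrecognized.isEmpty()) {
            return "";
        }
        int count = unrecognized.size();
        StringBuilder out = new StringBuilder()
                .append("Found ").append(count)
                .append(count == 1 ? " unrecognized entity:" : " unrecognized entities:")
                .append("\n\n");
        unrecognized.forEach((hash, records) ->
                out.append("    ").append(hash)
                   .append(" (").append(records).append(" records)\n"));
        return out.toString();
    }
}
```

The richer variants with class names and saved layouts would need the extra metadata described above to be stored alongside the journal.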

Problem: using postgresql backend without a pool is nearly impossible

It's just too slow because it'll keep re-opening connections and re-scanning types.

Proposed solution 1: force pool usage. Not sure how yet — different pools are available.

Proposed solution 2: do away with dataSource.getConnection() and do some kind of connection pooling/sharing internally.

Proposed solution 3: enforce pgjdbc-ng and make use of HikariCP underneath. This comes with the tradeoff of needing to configure that pool, but that can be done with HikariConfig. Ideally, the pool should be shared with the indices.

Constructor-based non-readonly Layout

Allow Layout to produce a non-readonly instance for classes that have no matching setters but do have a matching constructor.

Problem: Absence of local NTP server silently locks es4j up

The failure to connect to a local NTP server is hard to detect (nothing happens, and the repository does not start). Since NTP is a UDP protocol, we can't reliably tell if there's an NTP server listening. The only way to tell is that we're not getting a response.

Proposed solution: throw an exception after some reasonably high delay (1s, 5s?)
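One way this could be sketched, assuming the pending NTP response is represented as a CompletableFuture. The StartupTimeout class, its parameters, and the choice of IllegalStateException are illustrative assumptions.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical startup guard: fails loudly instead of waiting forever.
final class StartupTimeout {

    /** Waits for the response, or throws once the timeout has elapsed. */
    static <T> T awaitOrFail(CompletableFuture<T> response, Duration timeout, String what) {
        try {
            return response.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            throw new IllegalStateException(
                    "No response from " + what + " within " + timeout.toMillis() + " ms", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(what + " failed", e.getCause());
        } catch (InterruptedException e) {
            // Preserve the interrupt flag for callers further up the stack.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for " + what, e);
        }
    }
}
```

The repository startup would call something like awaitOrFail(firstNtpResponse, Duration.ofSeconds(5), "NTP server"), where firstNtpResponse is a placeholder for however the time provider signals its first successful sync.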

Implement serialized command processing in DisruptorCommandConsumer

Problem: CommandTerminatedExceptionally does not follow RFC 9/RIG layout

http://rfc.eventsourcing.com/spec:9/RIG/#CommandTerminatedExceptionally

Implement MVStore-backed NavigableIndex

Proposal: use SonarQube for code analysis

I would like to propose to use http://docs.sonarqube.org/display/HOME/SonarQube+Platform for code analysis.
For IntelliJ - Sonar Lint http://www.sonarlint.org/intellij/.
It can be done as part of CI pipeline and will improve quality of the code.

HT

Hey, don't know if you've seen these projects already, but there are some good efforts put into eventsourced and eventuate when it comes to event sourcing, persistence and distribution.

Add support for BigDecimal in Layout

Consider using ReflectASM for faster access to properties (eventchain-layout)

ReflectASM is a very small Java library that provides high performance reflection by using code generation. An access class is generated to set/get fields, call methods, or create a new instance. The access class uses bytecode rather than Java's reflection, so it is much faster. It can also access primitive fields via bytecode to avoid boxing.

https://github.com/EsotericSoftware/reflectasm

Problem: eventsourcing is the whole organization

And we're using the same name for es4j for naming maven artifacts and package names, which might make it more difficult to assign proper naming to other solutions.

Proposed solution 1: rename com.eventsourcing:eventsourcing-X maven artifacts to com.eventsourcing:es4j-X and packages com.eventsourcing.* to com.eventsourcing.es4j.*

Proposed solution 2: rename com.eventsourcing:eventsourcing-X maven artifacts to org.es4j:es4j-X and packages com.eventsourcing.* to org.es4j.*

Proposed solution 3: rename com.eventsourcing:eventsourcing-X maven artifacts to com.eventsourcing:es4j-X and packages com.eventsourcing.* to org.es4j.*

Proposed solution 4: find a completely different name for es4j

Command that raises an exception should still be journalled

However, it should have a corresponding CommandThrownException event, and the original exception should still be thrown, as it is now.
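The intended behaviour could be sketched like this, with an in-memory list standing in for the journal and plain strings standing in for commands and events. JournallingExecutor and its entry format are hypothetical; only the CommandThrownException name comes from the issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical executor: journals the command even when it fails.
final class JournallingExecutor {

    /** Stand-in for the journal: one entry per journalled command or event. */
    final List<String> journal = new ArrayList<>();

    <R> R execute(String commandName, Supplier<R> command) {
        // The command is journalled before it runs, so a failure cannot lose it.
        journal.add("command:" + commandName);
        try {
            return command.get();
        } catch (RuntimeException e) {
            // Record the failure as an event, then rethrow the original exception.
            journal.add("CommandThrownException:" + e.getClass().getName());
            throw e;
        }
    }
}
```

The caller still sees the original exception, while the journal retains both the command and the record of its failure.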

Problem: org.h2.mvstore.db missing requirement in OSGi

See h2database/h2database#306

Automatic bintray publishing doesn't work

Despite publish = true in https://github.com/eventsourcing/es4j/blob/master/build.gradle#L74, auto-publishing of Maven packages has stopped, and they have to be published manually for now.

Problem: writing test suites for es4j apps is difficult

Writing tests for es4j-powered apps is always a little bit of a hassle. To list a few issues:

making sure every test gets the proper clean slate setup
reducing the setup time
making it compact and DRY
(for some apps) a need to test against different journal and index engine backends.

Proposed solution: implement an eventsourcing-test module that will cover this.

Problem: impossible to define indices in Scala

According to a report by @bsrk, and subsequent verification, Scala does not support static fields in classes. It does have companion objects; however, Scala does not consider a companion object a class, and it is therefore impossible to specify one in the @Indices(Array(MyEventIndices)) annotation.

Proposed solution: implement a ScalaObjectIndexLoader that will use Scala's own reflection, but other ideas are welcome.

Temporary workaround (for those who need one): write your index definitions in separate Java classes and refer to them in @Indices(...) annotations.

Problem: doesn't support Timestamp type

See RFC1/ELF and RFC2/BES (added in eventsourcing/rfc@3ccf130)

Bintray: eventsourcing-inmem is missing

I can't use the dependency "com.eventsourcing:eventsourcing-inmem:0.3.1" because it is missing on the Bintray Maven repository [1].

Work-around: Deploy the dependency to a local maven repository.

[1] http://dl.bintray.com/eventsourcing/maven/com/eventsourcing/

ASL 2.0 | MIT | BSD

Wondering if you would be willing to also make this available under ASL 2.0?

Implement MVStore-backed InvertedRadixTreeIndex

Implement MVStore-backed CompoundIndex

Make Layout use Kryo for serialization

(Only if possible; first indications are that it is not.)

Problem: Min/Max queries are global to the attribute

At the moment, Min/Max queries are limited to finding the smallest or largest value of a particular attribute globally across an entity. This means you can ask "what's the latest order" but not "what's the latest order for this person". IsLatestEntity/LatestAssociatedEntryQuery are currently used to address this type of need, but they are very slow and cumbersome. Min/Max has already been optimized for PostgreSQL storage.

It would be great if this type of query could fit within Min/Max as an option:

// Find the last order for user with id=<id>
max(Order.TIMESTAMP, equal(Order.USER_ID, id))
// Find all last orders for all users
max(Order.TIMESTAMP, Order.USER_ID)

(Or something similar)
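To pin down the intended semantics of the two proposed forms, here is a plain-Java sketch over an in-memory collection. It says nothing about how an index would evaluate them; ScopedMax and its Order class are hypothetical stand-ins.

```java
import java.util.Collection;
import java.util.Comparator;
import java.util.Map;
import java.util.Optional;
import java.util.function.BinaryOperator;
import java.util.stream.Collectors;

// Hypothetical reference semantics for scoped max queries.
final class ScopedMax {

    /** Minimal stand-in for an order entity. */
    static final class Order {
        final String userId;
        final long timestamp;

        Order(String userId, long timestamp) {
            this.userId = userId;
            this.timestamp = timestamp;
        }
    }

    static final Comparator<Order> BY_TIMESTAMP =
            Comparator.comparingLong((Order o) -> o.timestamp);

    /** max(Order.TIMESTAMP, equal(Order.USER_ID, id)): latest order of one user. */
    static Optional<Order> latestFor(Collection<Order> orders, String userId) {
        return orders.stream()
                .filter(o -> o.userId.equals(userId))
                .max(BY_TIMESTAMP);
    }

    /** max(Order.TIMESTAMP, Order.USER_ID): latest order of every user. */
    static Map<String, Order> latestPerUser(Collection<Order> orders) {
        return orders.stream().collect(Collectors.toMap(
                (Order o) -> o.userId,
                (Order o) -> o,
                BinaryOperator.maxBy(BY_TIMESTAMP)));
    }
}
```

A SQL-backed index could answer the same questions with a filtered MAX or a grouped MAX, which is where the existing PostgreSQL optimization for Min/Max might be extended.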
