es4j's People

Contributors

adymitruk, bsrk, ddvinyaninov, fluffypony, gitter-badger, hastebrot, yrashk

es4j's Issues

Problem: PostgreSQL database provisioning contention

The PostgreSQL-backed journal and indices create the necessary tables and types without coordinating with other nodes, so two nodes might attempt the same provisioning at the same time and conflict.

Proposed solution: ensure these storages lock their initialization routines. PostgreSQL advisory locks can be used to accomplish this easily.
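
A minimal sketch of how such a lock could wrap the provisioning step, assuming plain JDBC and a session-level advisory lock. The class name ProvisioningLock, the Provisioning callback, and the choice of an FNV-1a hash to derive the lock key are illustrative assumptions, not part of es4j; only pg_advisory_lock and pg_advisory_unlock are PostgreSQL's own functions.

```java
import java.nio.charset.StandardCharsets;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper, not part of es4j.
final class ProvisioningLock {

    /** Callback holding the DDL that must not run concurrently. */
    interface Provisioning {
        void run(Connection connection) throws SQLException;
    }

    /** Derives a stable 64-bit advisory lock key from a name (FNV-1a, an arbitrary choice). */
    static long keyFor(String name) {
        long hash = 0xcbf29ce484222325L;
        for (byte b : name.getBytes(StandardCharsets.UTF_8)) {
            hash ^= (b & 0xff);
            hash *= 0x100000001b3L;
        }
        return hash;
    }

    /** Holds a session-level advisory lock while the provisioning callback runs. */
    static void withLock(Connection connection, String name, Provisioning body) throws SQLException {
        long key = keyFor(name);
        try (PreparedStatement lock = connection.prepareStatement("SELECT pg_advisory_lock(?)")) {
            lock.setLong(1, key);
            lock.execute(); // blocks until no other session holds this key
        }
        try {
            body.run(connection);
        } finally {
            try (PreparedStatement unlock = connection.prepareStatement("SELECT pg_advisory_unlock(?)")) {
                unlock.setLong(1, key);
                unlock.execute();
            }
        }
    }
}
```

Every node has to derive the same key for the same storage for this to work, which is why the key comes from a name rather than from anything node-specific.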

Problem: postgresql navigable index over a timestamp has a composite key

(Since the timestamp is not really comparable as is, it needs to be transparently re-encoded to a comparable value in order to work)

eventsourcing=# \d+ index_v1_9c22ea3e2733ba23f1e23c8577698827575a4457_navigable;
                   Table "public.index_v1_9c22ea3e2733ba23f1e23c8577698827575a4457_navigable"
 Column |                        Type                        | Modifiers | Storage  | Stats target | Description
--------+----------------------------------------------------+-----------+----------+--------------+-------------
 key    | layout_v1_c47d416193cba63554e5a69ca701973bc3e44172 |           | extended |              |
 object | uuid                                               |           | plain    |              |
Indexes:
    "index_v1_9c22ea3e2733ba23f1e23c8577698827575a4457_navigable_key" btree (key)
    "index_v1_9c22ea3e2733ba23f1e23c8577698827575a4457_navigable_obj" btree (object)

eventsourcing=# \d+ layout_v1_c47d416193cba63554e5a69ca701973bc3e44172
Composite type "public.layout_v1_c47d416193cba63554e5a69ca701973bc3e44172"
     Column     |  Type  | Modifiers | Storage | Description
----------------+--------+-----------+---------+-------------
 logicalCounter | bigint |           | plain   |
 logicalTime    | bigint |           | plain   |
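
One possible re-encoding, sketched under the assumption that both fields should be ordered as unsigned 64-bit values with logicalTime as the more significant part. The ComparableTimestamp class is hypothetical; how the resulting value would be stored (for example as a numeric column) is left open here.

```java
import java.math.BigInteger;

// Hypothetical helper, not part of es4j.
final class ComparableTimestamp {

    /** Packs (logicalTime, logicalCounter) into a single unsigned 128-bit value. */
    static BigInteger encode(long logicalTime, long logicalCounter) {
        // Treat both longs as unsigned (an assumption about the timestamp format).
        BigInteger time = new BigInteger(Long.toUnsignedString(logicalTime));
        BigInteger counter = new BigInteger(Long.toUnsignedString(logicalCounter));
        // Time occupies the high 64 bits, so it dominates the ordering.
        return time.shiftLeft(64).or(counter);
    }
}
```

With a single scalar key, an ordinary btree index orders the values without relying on composite-type comparison.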

Problem: *Index.as interferes with Kotlin

In Kotlin, as is a keyword, so the method name has to be escaped with backticks:

@JvmField var X = SimpleIndex.`as` { o: TestEntity -> o.x }

as opposed to

@JvmField var X = SimpleIndex.as { o: TestEntity -> o.x }

which is far from great

Review and finalize TypeHandler's fingerprints

These fingerprints (1-2 bytes most of the time) are used for calculating the Layout hash, and they don't seem to follow one predefined numbering approach right now.

It is probably preferable to keep them as short as possible IF the schema layout needs to be transferred at scale (although, how big of a deal is this?).

It might make sense to rename the term altogether. Magic number?

Problem: sometimes, -args aren't or can't be enabled

It is, however, still very desirable to detect parameter names. @ParameterName is one possible workaround, but it is a bit verbose.

Proposed solution: in some cases, it might be possible to use @java.beans.ConstructorProperties. A nice thing is that (for example) Lombok generates it automatically.
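
A small sketch of reading the names back through reflection. ParameterNames and the UserLoggedIn class below are hypothetical; the annotation is written by hand here in the shape the issue expects a generator such as Lombok to produce.

```java
import java.beans.ConstructorProperties;
import java.lang.reflect.Constructor;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of es4j.
final class ParameterNames {

    /** Returns the names declared in @ConstructorProperties, or an empty list if absent. */
    static List<String> of(Constructor<?> constructor) {
        ConstructorProperties properties = constructor.getAnnotation(ConstructorProperties.class);
        return properties == null
                ? Collections.<String>emptyList()
                : Arrays.asList(properties.value());
    }
}

/** Hypothetical event class used only for illustration. */
final class UserLoggedIn {
    final String login;
    final long timestamp;

    @ConstructorProperties({"login", "timestamp"})
    UserLoggedIn(String login, long timestamp) {
        this.login = login;
        this.timestamp = timestamp;
    }
}
```

The annotation is retained at runtime, so this lookup works regardless of the compiler flags used to build the class.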

Problem: unsized queries increase the overhead for non-lazy queries

When using something like a remote SQL database for indexing, we often rely on not consuming the entire ResultSet (or not consuming it at all) to achieve a better performance profile.

However, this only works for truly lazy queries, when nothing extra is done before the ResultSet is consumed and the overhead of consuming the next result is low. In the SQL database scenario, even if the query itself is lazy, the moment the first result is consumed, the entire query will be fired off, and if the size of the index is substantial, there will be a delay.

Proposed solution: implement a query option for sized (limited) queries (Limit(N)). It can be ignored by truly lazy indices, but SQL-based indices can use it to optimize their queries.
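
A rough sketch of how an SQL-based index might honor such an option. SizedQuery, its method, and the generated statement shape are assumptions made for illustration and do not reflect the actual es4j or cqengine API.

```java
import java.util.OptionalInt;

// Hypothetical helper, not part of es4j.
final class SizedQuery {

    /** Builds a lookup statement, appending LIMIT only when the option is present. */
    static String selectObjects(String table, String whereClause, OptionalInt limit) {
        StringBuilder sql = new StringBuilder("SELECT object FROM ")
                .append(table)
                .append(" WHERE ")
                .append(whereClause);
        // A truly lazy index would simply ignore the option.
        limit.ifPresent(n -> sql.append(" LIMIT ").append(n));
        return sql.toString();
    }
}
```

For example, selectObjects("idx", "key = ?", OptionalInt.of(1)) lets the database stop after the first match instead of materializing the whole result.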

Problem: outgrowing cqengine

In certain aspects, we are starting to outgrow cqengine, the indexing engine behind es4j, and one day we'll likely need our own indexing engine. cqengine was a great way to bootstrap and still is a good way to do what we want, but the time to upgrade will likely come. The telling signs so far are, in no particular order:

  • cqengine doesn't have support for atomic multi-index updates (this breaks index consistency, albeit for a relatively short [depending on requirements] period of time); or transactional multi-index updates
  • cqengine leaks details of its API that add a lot of noise to es4j apps. Consider Query<EntityHandle<..>> or ResultSet<EntityHandle<..>> versus Query<...> or ResultSet<...>
  • cqengine can return only the primary collection's data in a join query: getIndexedCollection(Car.class).retrieve(query) returns ResultSet<Car>
  • using cqengine API makes it harder to standardize our own index definition and query language standards across languages.
  • its API design relies on generic type information, and this limits our abilities to an extent. We rely on reflected Fields to extract that information; cqengine itself does the same through typetools (which uses an undocumented sun.reflect API). Not sure if we can improve this significantly within Java's constraints (can we do this in Java 9?), but it is worth a look, at the very least.
  • limited expressibility of ordering-based queries ("find latest", etc.)

Proposed solution: implement our own indexing engine.

We can certainly learn from the experience with cqengine and take it to the next level. This issue is an open-ended "proposal collector" to figure out a longer term plan for a potential transition, or any ideas for any other way to address the problem.

(It is, of course, a significant undertaking)

Problem: querying for aggregates is expensive

Querying for things like:

  • latest (as in, biggest TIMESTAMP) event for a certain scope (say, reference=R1)
  • sum/avg/max/min for a certain scope (say, reference=R1)

is quite expensive as it involves a lot of scanning (as can be seen in https://github.com/eventsourcing/es4j/blob/master/eventsourcing-queries/src/main/java/com/eventsourcing/queries/IsLatestEntity.java and https://github.com/eventsourcing/es4j/blob/master/eventsourcing-queries/src/main/java/com/eventsourcing/queries/LatestAssociatedEntryQuery.java)

Proposed solution: designate and formalize an aggregate indexing system.
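
To illustrate the general idea (not a proposed design), an aggregate index can be maintained at write time so that a "latest for scope" lookup needs no scan. LatestPerScopeIndex is a hypothetical in-memory example that keeps the largest timestamp seen per reference.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical in-memory aggregate index, not part of es4j.
final class LatestPerScopeIndex {
    private final Map<String, Long> latest = new HashMap<>();

    /** Called once per journaled event; keeps only the maximum timestamp per reference. */
    void onEvent(String reference, long timestamp) {
        latest.merge(reference, timestamp, Math::max);
    }

    /** Constant-time lookup instead of scanning all events for the scope. */
    OptionalLong latest(String reference) {
        Long value = latest.get(reference);
        return value == null ? OptionalLong.empty() : OptionalLong.of(value);
    }
}
```

Sum, min, and count can be maintained the same way; avg follows from sum and count.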

Problem: Command#events can block command publishing

If Command#events takes a long time to complete (for example, making external API calls), the entire queue of commands will be blocked, because command processing is a single-threaded process (originally, to avoid the complexities associated with multithreaded access to the journal, indexing, etc.). This also holds true for massive event streams.

Proposed solutions:

  1. Add a wrapping CompletableFuture<EventStream<S>> version of #events() that will call the EventStream<S> version of #events() either a) in the same thread or b) in a worker thread, and wrap its result. The publishing process, therefore, should only send the command down the pipeline when the result is ready. There is still a concern about using blocking operations during stream generation.
  2. Make command publishing multi-threaded. Not yet sure how difficult that would be.
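
The first option could look roughly like the sketch below. SketchCommand is a hypothetical stand-in for the real Command type, with events reduced to strings; the executor passed in decides between variants a) and b).

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.stream.Stream;

/** Hypothetical stand-in for a command; not the es4j Command class. */
interface SketchCommand {

    /** The existing, potentially slow, synchronous variant. */
    Stream<String> events();

    /** Wrapping variant: runs events() on the given executor and wraps its result. */
    default CompletableFuture<Stream<String>> eventsAsync(Executor executor) {
        return CompletableFuture.supplyAsync(this::events, executor);
    }
}
```

Passing Runnable::run as the executor runs events() in the calling thread (variant a), while a thread pool moves it to a worker thread (variant b). The publisher would forward the command only once the future completes.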

Implement schema mismatch reporter

In any system the schema of commands and events will naturally evolve. While this is typically not a problem when using MemoryJournal (the data will not survive through restarts), it is an issue with persistent journals (MVStoreJournal and others).

Every persistent journal will need to print out the discrepancies found between the entities available in the journal and those known at runtime. Using the layout's hash, we can easily detect which entities in the journal are no longer covered by the code at hand.

The report should look like this:

Found 1 unrecognized entity:

    d395f099078879b005797edf7ef37f83a673bb3582056490655984d3d5b18cf8 (234 records)

This reporter should not attempt to facilitate schema migrations in any way; it should be just a reporter.

It would be great if some meta information can be stored every time the journal is used. This way, the report can be much more readable:

Found 1 unrecognized entity:

    com.foo.bar.events.UserLoggedIn (234 records) # d395f099078879b005797edf7ef37f83a673bb3582056490655984d3d5b18cf8

Even better if the layout can be saved, too:

Found 1 unrecognized entity:

    com.foo.bar.events.UserLoggedIn (234 records) # d395f099078879b005797edf7ef37f83a673bb3582056490655984d3d5b18cf8
     login: String
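
A sketch of the simplest form of this report (hashes and record counts only), assuming the journal can supply a count of records per layout hash. SchemaMismatchReport and its inputs are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical reporter, not part of es4j.
final class SchemaMismatchReport {

    /**
     * journalCounts: layout hash (hex) -> number of records found in the journal.
     * knownHashes: layout hashes of the entities known to the running code.
     * Returns an empty string when nothing is unrecognized.
     */
    static String render(Map<String, Long> journalCounts, Set<String> knownHashes) {
        Map<String, Long> unrecognized = new TreeMap<>(); // sorted for stable output
        for (Map.Entry<String, Long> entry : journalCounts.entrySet()) {
            if (!knownHashes.contains(entry.getKey())) {
                unrecognized.put(entry.getKey(), entry.getValue());
            }
        }
        if (unrecognized.isEmpty()) {
            return "";
        }
        StringBuilder report = new StringBuilder();
        report.append("Found ").append(unrecognized.size()).append(" unrecognized ")
              .append(unrecognized.size() == 1 ? "entity" : "entities").append(":\n\n");
        unrecognized.forEach((hash, count) ->
                report.append("    ").append(hash).append(" (").append(count).append(" records)\n"));
        return report.toString();
    }
}
```

It only reads and formats, in keeping with the requirement that the reporter stays out of migrations.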

Problem: using postgresql backend without a pool is nearly impossible

It's just too slow because it'll keep re-opening connections and re-scanning types.

Proposed solution 1: force pool usage. Not sure how yet — different pools are available.

Proposed solution 2: do away with dataSource.getConnection() and do some kind of connection pooling/sharing internally.

Proposed solution 3: enforce pgjdbc-ng and make use of HikariCP underneath. This comes with the tradeoff of needing to configure that pool, but that can be done with HikariConfig. Ideally, the pool should be shared with the indices.

Problem: Absence of local NTP server silently locks es4j up

The failure to connect to a local NTP server is hard to detect (nothing is happening, and the repository is not starting). Since NTP runs over UDP, we can't reliably tell if there's an NTP server listening. The only sign is that we are not getting a response.

Proposed solution: throw an exception after some reasonably high delay (1s, 5s?)
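
A minimal sketch of such a timeout using only the JDK, assuming a single client-mode request and no retries. NtpProbe is a hypothetical helper; it only checks that some reply arrives in time and does not parse or validate it.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketTimeoutException;

// Hypothetical helper, not part of es4j.
final class NtpProbe {

    /** Sends one NTP client request and fails if no reply arrives within the timeout. */
    static void requireResponse(InetAddress host, int port, int timeoutMillis) throws IOException {
        byte[] request = new byte[48];
        request[0] = 0x1B; // LI = 0, version = 3, mode = 3 (client)
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.setSoTimeout(timeoutMillis); // receive() throws after this delay
            socket.send(new DatagramPacket(request, request.length, host, port));
            byte[] buffer = new byte[48];
            socket.receive(new DatagramPacket(buffer, buffer.length));
        } catch (SocketTimeoutException e) {
            throw new IOException("No NTP response from " + host + ":" + port
                    + " within " + timeoutMillis + " ms", e);
        }
    }
}
```

Calling this once during startup turns the silent hang into an explicit error that names the unreachable server.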

HT

Hey, don't know if you've seen these projects already, but some good effort has been put into eventsourced and eventuate when it comes to event sourcing, persistence, and distribution.

Problem: eventsourcing is the whole organization

We're also using that same name for es4j's Maven artifacts and package names, which might make it more difficult to name other solutions properly.

Proposed solution 1: rename com.eventsourcing:eventsourcing-X maven artifacts to com.eventsourcing:es4j-X and packages com.eventsourcing.* to com.eventsourcing.es4j.*

Proposed solution 2: rename com.eventsourcing:eventsourcing-X maven artifacts to org.es4j:es4j-X and packages com.eventsourcing.* to org.es4j.*

Proposed solution 3: rename com.eventsourcing:eventsourcing-X maven artifacts to com.eventsourcing:es4j-X and packages com.eventsourcing.* to org.es4j.*

Proposed solution 4: find a completely different name for es4j

Problem: writing test suites for es4j apps is difficult

Writing tests for es4j-powered apps is always a little bit of a hassle. To list a few issues:

  • making sure every test gets the proper clean slate setup
  • reducing the setup time
  • making it compact and DRY
  • (for some apps) a need to test against different journal and index engine backends.

Proposed solution: implement an eventsourcing-test module that will cover this.

Problem: impossible to define indices in Scala

According to a report by @bsrk, and subsequent verification, Scala does not support static fields in classes. It does have companion objects; however, Scala does not consider a companion object to be a class, and it is therefore impossible to specify one in the @Indices(Array(MyEventIndices)) annotation.

Proposed solution: implement a ScalaObjectIndexLoader that will use Scala's own reflection, but other ideas are welcome.

Temporary workaround (for those who need one): write your index definitions in separate Java classes and refer to them in @Indices(...) annotations.

ASL 2.0 | MIT | BSD

Wondering if you would be willing to also make this available under ASL 2.0?

Problem: Min/Max queries are global to the attribute

At this moment, Min/Max queries are limited to finding the smallest or largest value of a particular attribute globally across an entity. This means you can tell "what's the latest order" but not "what's the latest order for this person". IsLatestEntity/LatestAssociatedEntryQuery are currently being used to address this type of need, but they are very slow and cumbersome. Min/Max has already been optimized for PostgreSQL storage.

It would be great if it were possible to fit this type of query within Min/Max as an option:

// Find the last order for user with id=<id>
max(Order.TIMESTAMP, equal(Order.USER_ID, id))
// Find all last orders for all users
max(Order.TIMESTAMP, Order.USER_ID)

(Or something similar)
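
As a reference for the intended semantics only, the two forms can be written against an in-memory list with Java streams. ScopedMax and its Order class are hypothetical and unrelated to the actual query API; an index-backed implementation would be expected to return the same results without scanning.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical reference semantics, not part of es4j.
final class ScopedMax {

    /** Hypothetical order entity with just the two attributes used above. */
    static final class Order {
        final String userId;
        final long timestamp;

        Order(String userId, long timestamp) {
            this.userId = userId;
            this.timestamp = timestamp;
        }
    }

    /** Semantics of max(Order.TIMESTAMP, equal(Order.USER_ID, id)). */
    static Optional<Order> latestFor(List<Order> orders, String userId) {
        return orders.stream()
                .filter(order -> order.userId.equals(userId))
                .max(Comparator.comparingLong((Order order) -> order.timestamp));
    }

    /** Semantics of max(Order.TIMESTAMP, Order.USER_ID): one latest order per user. */
    static Map<String, Order> latestPerUser(List<Order> orders) {
        return orders.stream().collect(Collectors.toMap(
                order -> order.userId,
                order -> order,
                (a, b) -> a.timestamp >= b.timestamp ? a : b));
    }
}
```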
