Comments (9)

whilo commented on May 18, 2024

Yes, it should be improved, but the improvement mostly depends on how konserve-pg or konserve-jdbc performs at the moment. Another reason for the slowness here was the IO after each operation, which is unavoidable if you call transact separately for each entity.

from datahike.

whilo commented on May 18, 2024

This should be resolved now with the persistent-sorted-set backend. I will close this for now; please reopen if the problem persists.

whilo commented on May 18, 2024

Hi, thanks for reporting this issue. Can you provide a small snippet to reproduce?

boxxxie commented on May 18, 2024

I'll whip something up soon.
I'm using test.check generators to populate the database, but I'll need to see whether performance is similar with duplicate items.

kordano commented on May 18, 2024

I created a small performance example inside datahike to measure transaction time at different transaction sizes. I'm currently running it and comparing the in-memory store, various backends, and Datomic. I'll let you know what I find.

bhurlow commented on May 18, 2024

Nice work on the latest release! I was playing around with the latest datahike on a plane today and also found transact to be suspiciously slow:

(defn t1 [prefix times]
  (dotimes [n times]
    (d/transact pg-conn [{:name (str prefix n)}])))

(time (t1 "baz6" 1000))
"Elapsed time: 76688.964706 msecs"

I'm using the konserve-pg backend in this case. I wonder if y'all have plans for a transaction WAL of some kind, as re-indexing on each write is probably a bottleneck (maybe hitchhiker has this).

whilo commented on May 18, 2024

Yes, we are planning for that. But if I remember correctly, Datomic Free also synchronized each transaction, and I had to batch transactions manually. I did that here for replikativ:

https://github.com/replikativ/twitter-collector/blob/master/src/twitter_collector/core.clj#L38

Could you work around that issue for now?

whilo commented on May 18, 2024

This issue should be solved with 0.2.1. Previously we were printing the database after each transaction. That is the default behaviour of DataScript, but it is not reasonable for a durable database and can hurt performance badly if you evaluate transact in the REPL.

Looking closely at your example again, keep in mind that transact effectively forces a roundtrip to the storage medium. Since this takes from a few milliseconds up to tens of milliseconds in many storage systems, you have to wait for this latency multiplied by the number of transactions. While we plan to provide some internal buffering in the transactor, it is still crucial to batch chunks that are as large as possible in the process invoking it, because the transactor itself can require a network roundtrip in future distributed Datahike setups.
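For scale, the t1 run above took about 76,689 ms for 1000 calls, i.e. roughly 77 ms per transact. A minimal sketch of batching in the calling process might look like the following. The helper name `transact-batched!` and the batch size are hypothetical, not part of Datahike; `transact!` is passed in as an argument (something like `d/transact` from the example above), so the batching logic itself carries no library dependency.

```clojure
;; Sketch only: `transact!` stands in for a function such as the
;; d/transact used in t1; it is a parameter so the batching logic can be
;; read and tried without a database.
(defn transact-batched!
  "Calls (transact! conn tx-data) once per batch of at most batch-size
  entities instead of once per entity. Returns the number of calls made."
  [transact! conn entities batch-size]
  (reduce (fn [calls batch]
            (transact! conn (vec batch)) ; one storage roundtrip per batch
            (inc calls))
          0
          (partition-all batch-size entities)))

;; Hypothetical usage mirroring t1, assuming the same `d` alias and
;; `pg-conn`: 1000 entities in batches of 100 means 10 transact calls.
(comment
  (time
   (transact-batched! d/transact pg-conn
                      (map (fn [n] {:name (str "baz7" n)}) (range 1000))
                      100)))
```

With this shape the number of storage roundtrips is the number of batches rather than the number of entities. The best batch size depends on the backend and is not something this sketch tries to determine.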

@boxxxie, @bhurlow can you recheck whether the issue is still there for you?

TimoKramer commented on May 18, 2024

Can this be closed? The upsert improvement should have resolved that, right?
#201
