Giter VIP home page Giter VIP logo

Comments (15)

hyc avatar hyc commented on June 13, 2024 2

@vtnerd is correct in his first reply #9193 (comment). Because of the first creation_gate.test_and_set() when mdb_txn_safe begins, prevent_new_txns() cannot proceed until the creation_gate.clear(). There is no race there.

from monero.

0xFFFC0000 avatar 0xFFFC0000 commented on June 13, 2024

In case you use blockchain.cpp to interact with db_lmdb.cpp, 9181 does fix this. Since no read transaction can start while there is a write transaction going on, and write transactions cannot start until acquires an exclusive lock on db.

from monero.

vtnerd avatar vtnerd commented on June 13, 2024

@hinto-janai I believe you mis-read the code:

while (creation_gate.test_and_set());
   num_active_txns++;
   creation_gate.clear();

Thread1 is blocked from advancing until creation_gate.clear() is called. This is basically just a spin mutex protecting num_active_txns.

The patch by @0xFFFC0000 is likely overkill, and not needed (that's just my knee-jerk reaction at this point).

from monero.

selsta avatar selsta commented on June 13, 2024

@vtnerd this isn't related to this specific issue but sometimes when there is heavy RPC traffic it's possible that the node can't add new blocks and falls behind due to the current locking mechanism. moneromooo suggested a while ago that a rw lock would help here.

from monero.

vtnerd avatar vtnerd commented on June 13, 2024

And to describe why the example doesn't work as @hinto-janai suggested - lmdb_txn_begin() should have creation_gate as true, so Thread1 is blocked in prevent_new_txs because the prior value is true until after num_active_txns++ is modified.

from monero.

vtnerd avatar vtnerd commented on June 13, 2024

@vtnerd this isn't related to this specific issue but sometimes when there is heavy RPC traffic it's possible that the node can't add new blocks and falls behind due to the current locking mechanism. moneromooo suggested a while ago that a rw lock would help here.

Yes, this may have some resource starvation issues since it's a dirty spin lock instead of some queued lock. But there is no data race afaik.

I'm not certain that #9181 fixes this though, I think the original spin lock code would have to be removed as well.

from monero.

0xFFFC0000 avatar 0xFFFC0000 commented on June 13, 2024

The patch by @0xFFFC0000 is likely overkill, and not needed (that's just my knee-jerk reaction at this point).

I believe if you read the PR carefully, you will realize that PR is addressing another issue. As a side-effect, it does solve the data race problem on lmdb_db too if you are accessing lmdb via blockchain.cpp APIs.

If you want to go into more technical depth, I am happy to do it :)

from monero.

vtnerd avatar vtnerd commented on June 13, 2024

I'm not certain that #9181 fixes this though, I think the original spin lock code would have to be removed as well.

Scratch that, #9181 probably works, but we just wasting cycles on atomic operations.

from monero.

vtnerd avatar vtnerd commented on June 13, 2024

@0xFFFC0000 well there is no data race where @hinto-janai describes, so you are saying there is yet another one? I'm doubting this as well, but sure why not enlighten me.

from monero.

0xFFFC0000 avatar 0xFFFC0000 commented on June 13, 2024

@0xFFFC0000 well there is no data race where @hinto-janai describes, so you are saying there is yet another one? I'm doubting this as well, but sure why not enlighten me.

I did specifically talk about rw access from blockchain.cpp.

About the data race, directly from lmdb_db, as hinto explains in his comment, I have not tried to prove (or disprove) it either.

Keep in mind, if you are interacting with lmdb_db via blockchain.cpp, that data race cannot happen due to rwlock.

Though I am tracking you and @hinto-janai discussion, to understand the situation.

from monero.

vtnerd avatar vtnerd commented on June 13, 2024

And sorry for being a bit rude, I'm just crabby that I have to review this DB change :/

from monero.

0xFFFC0000 avatar 0xFFFC0000 commented on June 13, 2024

And sorry for being a bit rude, I'm just crabby that I have to review this DB change :/

No no, I didn't even realize or notice anything rude from you :)

We were just discussing technical matters as always. Don't mention it at all. Looking forward to continuing our discussion. Because (possibility of) data race is a multi-level issue, we have to address it.

from monero.

vtnerd avatar vtnerd commented on June 13, 2024

@0xFFFC0000 is this about the potential starvation issue that @selsta mentioned or some other data race? I still don't know why #9181 was created (perhaps the discussion should move to that PR). It sounds like #9181 was not created due to the specific issue referenced by @hinto-janai in this issue.

from monero.

0xFFFC0000 avatar 0xFFFC0000 commented on June 13, 2024

It sounds like #9181 was not created due to the specific issue referenced by @hinto-janai in this issue.

Exactly, the 9181 is for a different issue. But a side-effect of that locking is, as I mentioned in my first comment, if there was a data race in lmdb_db, it would've not happened if you were accessing it via Blockchain.cpp APIs, since it does have its own locking mechanism.

I am available to discuss it on the PR page.

from monero.

hinto-janai avatar hinto-janai commented on June 13, 2024

@vtnerd Ah you are correct. I was looking at lmdb_resized() where that constructor doesn't run so the 2 atomics being checked appear like an ABA problem. Although, every transaction (in db_lmdb.cpp at least) seems to be guarded by mdb_txn_safe at some point in the call stack.

There's BlockchainLMDB::do_resize() which doesn't use mdb_txn_safe but seems like that's guarded by a lock.

from monero.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.