Comments (6)

chanks commented on August 19, 2024

Yeah, I considered this when I was designing the system. I picked the int8 variant because I wanted job_id to be a bigserial in order to cleanly support long-lived and active tables (and job tables tend to be both long-lived and active). There's a risk of needlessly delaying other code that uses advisory locks, or even deadlocking in rare cases, but Postgres errors appropriately in that event. And besides, as near as I can tell, advisory locks are rarely used anyway, especially in the Ruby community.

Even with two job tables in the same database, the worst that could happen is if there happen to be jobs in each table that share the same job_id, then one won't be able to start while the other is still in progress, which may be inconvenient on rare occasions, but isn't catastrophic.

All that said, I'd like to avoid even those rare issues, and the idea of namespacing using the table's oid is an elegant one. My main issues are:

  • I've only just now skimmed the PG documentation on this, but I don't see a simple bitwise mechanism to take the lowest-order bits of an int8 job_id to use as an int4 key.
  • If that were possible, we'd lose the ability to scan the currently held advisory locks and unambiguously match them up to their corresponding jobs, as we do in the queue inspection queries. Although, I suppose that if you do need multiple job tables, you're not really able to do that currently anyway.

Brainstorming a bit, I suppose we could achieve that safely if we switched job_id to be an int4, and set the CYCLE option on the job_id sequence. I'm not sure how to write a migration that would convert current int8 job_ids to int4, but even if it could be done, it introduces the risk of having multiple jobs with the same job_id in the same table (a very active table with some jobs that error repeatedly), which raises the ambiguity issue again.

from que.

MSch commented on August 19, 2024

Thanks for the positive initial feedback!

  1. http://www.postgresql.org/docs/current/static/functions-math.html lists bitwise operations. Am I misunderstanding something?
  2. I didn't consider the need to migrate the table since I'm not using que right now, but I'm thinking select min(id), max(id) from que_jobs, then shifting all ids if they fit within int4 (aborting otherwise), then changing the type of the column, then updating the sequence?
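The range check in step 2 can be sketched outside the database. This is a minimal Python model, not part of Que: the function name `int4_shift` is hypothetical, and it assumes the minimum and maximum ids have already been read with the `select min(id), max(id)` query mentioned above and that ids should be renumbered to start at 1.

```python
INT4_MAX = 2**31 - 1  # largest value a Postgres int4 can hold


def int4_shift(min_id: int, max_id: int) -> int:
    """Return the offset to subtract from every id so the smallest becomes 1.

    Raises ValueError when the shifted ids would not fit in a positive int4,
    which corresponds to the "aborting otherwise" branch of the migration.
    """
    if min_id > max_id:
        raise ValueError("min_id must not exceed max_id")
    span = max_id - min_id
    # After shifting, ids run from 1 to span + 1, so span + 1 must fit in int4.
    if span > INT4_MAX - 1:
        raise ValueError("id range does not fit in int4; aborting")
    return min_id - 1
```

With ids between 5,000,000,000 and 5,000,000,099 the offset is 4,999,999,999, so the shifted ids run from 1 to 100 and the column type could then be narrowed.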

Another option would be to support both int8 without namespacing the locks and int4 with namespacing (defaulting to int4 IMO).

chanks commented on August 19, 2024

I don't have much formal CS training, so I'm teaching myself bitwise operations as I go. It took me some googling, but now I see how you could get the lowest-order bits with ((int8_column & ((1::int8 << 32) - 1))::int4). I believe this could be used to migrate current tables with ALTER TABLE ... USING as well, but I'd have to experiment with it.
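One caveat with the mask: the lowest 32 bits, read as an unsigned number, can be as large as 2^32 - 1, while int4 tops out at 2^31 - 1, and Postgres raises "integer out of range" when an int8 outside the int4 range is cast to int4. The Python sketch below models the arithmetic only; the helper names are hypothetical, and the two alternatives (reinterpreting the bits as signed, or keeping only 31 bits) are possible workarounds rather than anything proposed in this thread.

```python
MASK32 = (1 << 32) - 1  # same mask as (1::int8 << 32) - 1
INT4_MAX = (1 << 31) - 1


def low32_unsigned(job_id: int) -> int:
    """Lowest 32 bits as an unsigned value (what the mask alone produces)."""
    return job_id & MASK32


def low32_as_int4(job_id: int) -> int:
    """Lowest 32 bits reinterpreted as a signed 32-bit integer."""
    low = job_id & MASK32
    return low - (1 << 32) if low > INT4_MAX else low


def low31(job_id: int) -> int:
    """Lowest 31 bits; always non-negative and always within int4 range."""
    return job_id & INT4_MAX
```

For example, `low32_unsigned(2**31)` is 2147483648, one past the int4 maximum, whereas `low32_as_int4(2**31)` is -2147483648 and `low31(2**31)` is 0, both of which fit.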

Having thought more about this, though, I'm still skeptical that it's a good idea. Advisory lock collisions are going to be rare and transient and probably not usually noticeable. So I'm asking myself whether it's worth the increased code complexity to eliminate these rare inconveniences but introduce somewhat rarer ones? Right now I'm thinking no, but this might be worth revisiting if we decided to support multiple job tables at some point, or if another Ruby library that relied on advisory locks became popular, or if people started experiencing it as an actual problem, and not a theoretical one. So I'm going to close this for now, but thanks for bringing it up - it's something to keep in mind.

Sinjo commented on August 19, 2024

I've been doing some thinking on this. We use advisory locks in our app, though not in a way which would collide with Que[1]. Because of that, I'm not concerned about fixing this right away, but I had some thoughts on how you could use the pg_try_advisory_lock(uint32, uint32) version of the advisory lock method and figured it was worth sharing them:

  • As you say, using the lowest 4 bytes of the job_id as the lock ID makes sense and can be achieved with bitmasking/modular arithmetic.
  • It would be nice to allow apps to set the first argument when configuring Que. Apps may already have used some values to namespace other tasks, so it would be good to pass control of that to the app (with Que choosing something sensible if nothing is specified).
  • To make the job stats queries work again, only use the objid from the pg_locks table, and add a functional index on the job_id column which keeps only the lowest 4 bytes of the job_id.
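The three points above can be modelled in a few lines of Python. This is a sketch under stated assumptions, not Que's implementation: `DEFAULT_NAMESPACE` and all function names are hypothetical, held locks are represented as `(classid, objid)` pairs in the way pg_locks displays two-key advisory locks (first key in classid, second key in objid), the objid is treated as the unsigned lowest 32 bits, and a dictionary stands in for the functional index.

```python
from collections import defaultdict

MASK32 = (1 << 32) - 1
DEFAULT_NAMESPACE = 0x517E  # hypothetical default; apps could override it


def lock_keys(job_id, namespace=DEFAULT_NAMESPACE):
    """Return the (first key, second key) pair for a job's advisory lock."""
    return (namespace, job_id & MASK32)


def build_low32_index(job_ids):
    """Stand-in for a functional index on the lowest 4 bytes of job_id."""
    index = defaultdict(list)
    for job_id in job_ids:
        index[job_id & MASK32].append(job_id)
    return index


def jobs_holding_locks(held_locks, index, namespace=DEFAULT_NAMESPACE):
    """Match held (classid, objid) pairs back to candidate job_ids."""
    matched = []
    for classid, objid in held_locks:
        if classid == namespace:  # ignore locks taken by other code
            matched.extend(index.get(objid, []))
    return matched
```

Two job_ids that differ only above bit 31, such as 7 and 2**32 + 7, map to the same lock key, so the lookup returns both as candidates. That is the ambiguity raised earlier in the thread; it only appears once a table has issued more than 2^32 ids.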

I'd be interested to hear your thoughts on the approach.

Also, I should have mentioned this on the other issue I opened, but I'm happy to spend time implementing these things if you feel they'd make good additions to Que.

[1] We use really low-numbered lock IDs, so after a few jobs they'd no longer collide with Que.

joevandyk commented on August 19, 2024

"Even with two job tables in the same database, the worst that could happen is if there happen to be jobs in each table that share the same job_id, then one won't be able to start while the other is still in progress, which may be inconvenient on rare occasions, but isn't catastrophic."

The two different job tables could use the same sequence for the job_id column, which would prevent the same job_id from being used in both.

chanks commented on August 19, 2024

To avoid collisions at the lower values you could even set the job_id sequence to start at a higher value (100,000 or something).

@Sinjo I've actually spent part of the past few days mulling over how to support multiple job tables, in order to use Que in an app that uses schemas to support multitenancy. I'm not sure yet that these changes would make good additions, but I'm definitely hoping a robust solution to this comes up sooner rather than later.
