
Comments (7)

tangopium commented on July 29, 2024

I was digging deeper into this issue. The following occurs:

  1. Sidekiq Worker starts
  2. Worker picks up a job and removes it from the queue
  3. Job starts initializing
  4. Reaper gets started
  5. Reaper finds the lock and checks whether it is an orphan or expired
  6. Reaper doesn't find the corresponding job, considers the lock an orphan and removes it
  7. Worker doesn't enter the perform method because the lock is missing, and finishes (I couldn't verify this step in the code, but that's what I can observe)
  8. About 10 seconds later, Sidekiq Worker updates status of job and puts it to the "Processing queue"
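
The window described in these steps can be sketched as a toy model. This is not Sidekiq or sidekiq-unique-jobs code: `looks_orphaned?` and `HEARTBEAT_DELAY` are hypothetical names, and the 10-second value is taken from step 8 above.

```ruby
# Toy model of the race; all names here are hypothetical.
HEARTBEAT_DELAY = 10 # seconds until a picked-up job is reported as "processing"

# Returns true when a reaper running at `reaper_at` would find no trace of a
# job that a worker removed from the queue at `picked_up_at`.
def looks_orphaned?(picked_up_at:, reaper_at:, heartbeat_delay: HEARTBEAT_DELAY)
  in_queue = reaper_at < picked_up_at                    # job still waits in the queue
  reported = reaper_at >= picked_up_at + heartbeat_delay # job visible as "processing"
  !in_queue && !reported
end

looks_orphaned?(picked_up_at: 0, reaper_at: 3)  # => true  (inside the blind window)
looks_orphaned?(picked_up_at: 0, reaper_at: 12) # => false (status already reported)
```

Any reaper run that falls between pickup and the first status report sees a lock without a matching job, which is the situation in steps 5 and 6.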

The main problem here is that the Sidekiq worker pushes the job status with a big delay. It can take up to 10 seconds until a job appears as "processing" (see the Sidekiq source here & here). As a result, the job "disappears" for that period: it is no longer in the queue and not yet reported as processing.

By the way, this can even affect jobs during normal runtime. If a worker picks one up and the reaper runs within that 10-second delay, we are in the same situation.

I wondered whether this is a design error on Sidekiq's side, but I guess they use the heartbeat method to reduce pressure on the DB and allow very high throughput.

A possible solution I see: since the heartbeat timing is hardcoded in Sidekiq, we could keep (supposedly) orphaned locks in memory for at least 10 seconds and re-check them after that time. If they are still orphans, they can be removed. What do you think?
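
A minimal sketch of that re-check idea, assuming the reaper can hand over the list of digests that currently look orphaned on each pass. `OrphanGraceTracker`, `confirmed_orphans`, and the injectable `clock` are hypothetical names for illustration and are not part of sidekiq-unique-jobs.

```ruby
# Hypothetical helper; not part of sidekiq-unique-jobs.
class OrphanGraceTracker
  GRACE_PERIOD = 10 # seconds; assumed to cover the heartbeat delay

  def initialize(grace_period: GRACE_PERIOD,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @grace_period = grace_period
    @clock = clock
    @suspects = {} # digest => time it was first seen as orphaned
  end

  # Call on every reaper pass with the digests that currently look orphaned.
  # Returns only digests that have looked orphaned for the whole grace period.
  def confirmed_orphans(current_orphans)
    now = @clock.call
    # Forget suspects whose job has shown up in the meantime.
    @suspects.select! { |digest, _| current_orphans.include?(digest) }
    # Remember when each new suspect was first seen.
    current_orphans.each { |digest| @suspects[digest] ||= now }
    confirmed = @suspects.select { |_, first_seen| now - first_seen >= @grace_period }.keys
    confirmed.each { |digest| @suspects.delete(digest) }
    confirmed
  end
end

now = 0
tracker = OrphanGraceTracker.new(clock: -> { now })
tracker.confirmed_orphans(["uniquejobs:a"]) # => []
now = 11
tracker.confirmed_orphans(["uniquejobs:a"]) # => ["uniquejobs:a"]
```

The trade-off in this sketch is that a truly orphaned lock survives one extra grace period, and the suspect list is lost if the process restarts.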

from sidekiq-unique-jobs.

mhenrixon commented on July 29, 2024

This is the weirdest issue I have come across so far.

I had this happen once in a blank project (example app in this repository).

I fixed it and haven't been able to replicate it since.

Other people have had this too, but then I didn't hear from them again.


tangopium commented on July 29, 2024

I prepared a simple dockerized repository to reproduce the error: https://github.com/tangopium/sidekiq-delay-issue


DmitryRibalka commented on July 29, 2024

@tangopium I can confirm the issue. I have a task with the same options as your example.
Mine is run by cron every 10 minutes, and its duration can be anywhere from 10 seconds to 1 hour.
Instead of the job running every time, I see this log:

19:00:03 sidekiq.1 | 2024-01-05T19:00:03.213Z pid=39 tid=9oj uniquejobs=client until_and_while_executing=uniquejobs:fa970d55f6eb3db2764e6a05573900fe INFO: Skipping job with id (bece4b439233f82bc547866d) because lock_digest: (uniquejobs:fa970d55f6eb3db2764e6a05573900fe) already exists
 
I also see a lock in the web UI at 'sidekiq/locks' that I think is broken. It started hours ago, and I have no idea why it is not released.

Sidekiq 7.2, Ruby 3.2
 


DmitryRibalka commented on July 29, 2024

I double-checked the config and found these lines missing:

Sidekiq.configure_server do |config|
  config.server_middleware do |chain|
    chain.add SidekiqUniqueJobs::Middleware::Server
  end
end

With these lines added, it looks like the lock expires as it should.


tangopium commented on July 29, 2024

@DmitryRibalka thanks for sharing your experience. The error you faced doesn't look like the one I described and is probably caused by the missing configuration you mentioned.

@mhenrixon Any update regarding my question? Did you have time to look into it?


mhenrixon commented on July 29, 2024

Thank you, @tangopium! This makes it a little more challenging.

So we need a queue for these situations with a timestamp to compare with.

Fantastic research, and it makes sense.

I will have a look at optimizing a few places as well. I am primarily using Lua already for that reason.

I haven't looked yet because I feel beat about rechecking this issue, but you gave me hope.
