Giter VIP home page Giter VIP logo

Comments (18)

shaneharter avatar shaneharter commented on September 23, 2024
  1. Is it working? It says it's retrying, does it do so successfully or do you just see another error message? Just curious, obviously error messages are bad regardless.
  2. Try cleaning any errant IPC resources using: ipcrm -q (queue cleans message queues) from the commandline. Check out the manpag for ipcrm for more info on that.

from php-daemon.

willebil avatar willebil commented on September 23, 2024

Thanks for the fast response!

Seems the code is locked, the Sysv php code does 3 retries if I am correct, and then gives up, but it does not halt the deamon. As said, I am debugging the current situation, and am not 100% sure what happens. I am trying to understand error code 11, I simply cannot find any reference to the meaning of this.

The log posted here is from the syslog, the deamon console only regististerd the Worker error. More information is not logged.

Will try your suggestion, and see what happens and get back to this.

from php-daemon.

shaneharter avatar shaneharter commented on September 23, 2024

A downside of using the SysV message queues and SHM in php is that it's little used and poorly documented. Didn't realize that when I built it. In the last big release (about a year ago) I refactored the Mediator code so the IPC is abstract, so now you can swap out SysV for another message queue (POSIX would be my general purpose choice, but server-based queues would also work). I just haven't done the work yet.

from php-daemon.

willebil avatar willebil commented on September 23, 2024

Do I see a challenge coming up :-D I stick with debugging for now, but you never know :-)

from php-daemon.

shaneharter avatar shaneharter commented on September 23, 2024

Not sure if you've tried running your daemon with --debugworkers option set. It will let you step thru each call to the worker and possibly isolate problems. This error, though, is coming from the library.

This is the function PHP is using here:
http://www.lehman.cuny.edu/cgi-bin/man-cgi?msgsnd+2

If you look at error codes, there are constants. If you can find which constant is "11" it might give clues.

from php-daemon.

shaneharter avatar shaneharter commented on September 23, 2024

Ok, actually the daemon tries to handle this for you:

       case MSG_EAGAIN:    // Temporary Problem, Try Again
            usleep($this->mediator->backoff(20000, $try));
            return true;
            break;

So in this case, it's sleeping for 0.05 seconds, and then 0.10, and then 0.20 between each failure, trying again afterward as the error code indicates it should.

This is an OS (or stdlib) failure code, lmk if cleaning the IPC resources made any difference.

from php-daemon.

willebil avatar willebil commented on September 23, 2024

Check. Will deepdive on all of this, will take some time since the error only occurs after the workers are running for some time. Will get back to this asap.

from php-daemon.

shaneharter avatar shaneharter commented on September 23, 2024

You could do this.. run with --debugworkers and type "help". You'll see the commands. There's a command you can use, skipfor, that will run the daemon normally (skipping breakpoints) for however many seconds you say. So you could say "skipfor 600" and 10 mins later it'll hit a breakpoint and you can step thru. What this will let you do is get one of these failures, then use ipcs to show current IPC resources. Maybe there are messages backing up on the queue and its hitting a resource limit?

from php-daemon.

willebil avatar willebil commented on September 23, 2024

I went back to create a simple example with just one worker. The worker waits a random amount of seconds, and then finishes. Pretty straight forward, worker is initiated with the following:

    protected function setup_workers() {
        $that = $this;

        $via = new \Core_Worker_Via_SysV();
        $via->malloc(30 * 1024 * 1024);

        $this->worker('WorkerTasks', new WorkerTasks(), $via);
        $this->WorkerTasks->timeout(120);
        $this->WorkerTasks->workers(5);
    }

And is called in the execute method

    protected function execute() {
        $this->WorkerTasks->work();
    }

The actual code in the worker executed is the following:

    public function work () {
        $wait = rand(1, 60);
        $this->mediator->log("Tasks:work => waiting for $wait seconds");
        sleep ($wait);
    }

After a couple of iteration the exact error occurs. I must be doing something incredible simply wrong since the provided examples run without any problem. Debugging does not bring me anywhere. Any ideas?

from php-daemon.

willebil avatar willebil commented on September 23, 2024

The error code is not really helpfull, see http://www-numi.fnal.gov/offline_software/srt_public_context/WebDocs/Errors/unix_system_errors.html and refers to Try Again ;-) The Daemon actually trying to do that 3 times, and then execution gets messed up.

from php-daemon.

willebil avatar willebil commented on September 23, 2024

Indeed I have been overlooking something pretty elementary. Spend the whole day debugging and understanding what was going on, so the possitive side is that I have a good understanding of the codebase (an yes, I like the solution).

Problem is solved by validating if the worker was idle/busy, and only when idle I am executing the worker method. The problem was that to many worker processes got instantiated, and with that the solution ran out of resources pretty quickly. Thanks anyway for the proper help!

from php-daemon.

shaneharter avatar shaneharter commented on September 23, 2024

Well... that doesn't make me feel good. You shouldn't have to do that. Mind throwing up the exact code you were running (the simple example you mentioned) into a gist or something so I can run it?

What distro were you using?

Even if it's just the daemon using a more aggressive backoff -- slowing everything down in these instances -- it would be better than random cryptic errors.

from php-daemon.

shaneharter avatar shaneharter commented on September 23, 2024

Nevermind -- missed it upthread -- but would still like to know distro details.

from php-daemon.

willebil avatar willebil commented on September 23, 2024

Distro is Ubuntu 14.04. Let me know if you want the code example.

from php-daemon.

brunnels avatar brunnels commented on September 23, 2024

Here's one I've been getting occasionally that may be related:

Warning: shm_get_var(): variable key 2391 doesn't exist in PHP-Daemon/Core/Worer/Via/SysV.php on line 240

Call Stack:
    0.0004     225192   1. {main}() lib/daemon/run.php:0
    0.8631    6006160   2. Core_Daemon->run() lib/daemon/run.php:15
17949.1137   20455592   3. Core_Daemon->dispatch() PHP-Daemon/Core/Daemon.php:413
17949.1142   20458248   4. call_user_func_array() PHP-Daemon/Core/Daemon.php:510
17949.1142   20458416   5. Core_Worker_Mediator->run() PHP-Daemon/Core/Daemon.php:510
17949.1145   20460120   6. Core_Worker_Mediator->fork() PHP-Daemon/Core/Worker/Mediator.php:742
17949.1181   20460712   7. Core_Daemon->task() PHP-Daemon/Core/Worker/Mediator.php:636
17949.1386   20389312   8. call_user_func_array() PHP-Daemon/Core/Daemon.php:591
17949.1386   20389400   9. Core_Daemon->{closure:PHP-Daemon/Core/Daemon.php:555-559}() PHP-Daemon/Core/Daemon.php:591
17949.2726   11232776  10. call_user_func() PHP-Daemon/Core/Daemon.php:557
17949.2727   11233480  11. Core_Worker_Mediator->start() PHP-Daemon/Core/Daemon.php:557
17998.6530   67289328  12. Core_Worker_Via_SysV->get() PHP-Daemon/Core/Worker/Mediator.php:778
17998.6588   67291992  13. Core_Worker_Via_SysV->{closure:PHP-Daemon/Core/Worker/Via/SysV.php:239-245}() PHP-Daemon/Core/Worker/Via/SysV.php:276
17998.6588   67292040  14. shm_get_var() PHP-Daemon/Core/Worker/Via/SysV.php:240

Notice: Trying to get property of non-object in PHP-Daemon/Core/Worker/Via/SysV.php on line 241

Call Stack:
    0.0004     225192   1. {main}() lib/daemon/run.php:0
    0.8631    6006160   2. Core_Daemon->run() lib/daemon/run.php:15
17949.1137   20455592   3. Core_Daemon->dispatch() PHP-Daemon/Core/Daemon.php:413
17949.1142   20458248   4. call_user_func_array() PHP-Daemon/Core/Daemon.php:510
17949.1142   20458416   5. Core_Worker_Mediator->run() PHP-Daemon/Core/Daemon.php:510
17949.1145   20460120   6. Core_Worker_Mediator->fork() PHP-Daemon/Core/Worker/Mediator.php:742
17949.1181   20460712   7. Core_Daemon->task() PHP-Daemon/Core/Worker/Mediator.php:636
17949.1386   20389312   8. call_user_func_array() PHP-Daemon/Core/Daemon.php:591
17949.1386   20389400   9. Core_Daemon->{closure:PHP-Daemon/Core/Daemon.php:555-559}() PHP-Daemon/Core/Daemon.php:591
17949.2726   11232776  10. call_user_func() PHP-Daemon/Core/Daemon.php:557
17949.2727   11233480  11. Core_Worker_Mediator->start() PHP-Daemon/Core/Daemon.php:557
17998.6530   67289328  12. Core_Worker_Via_SysV->get() PHP-Daemon/Core/Worker/Mediator.php:778
17998.6588   67291992  13. Core_Worker_Via_SysV->{closure:PHP-Daemon/Core/Worker/Via/SysV.php:239-245}() PHP-Daemon/Core/Worker/Via/SysV.php:276


[2014-05-02 13:43:38] 12560               Exception Caught in Task: Core_Worker_Via_SysV::get Failed. Could Not Decode Message: Array
[2014-05-02 13:43:38] 12560               (
[2014-05-02 13:43:38] 12560                   [call_id] => 2391
[2014-05-02 13:43:38] 12560                   [status] => 0
[2014-05-02 13:43:38] 12560                   [microtime] => 1399052510.8313
[2014-05-02 13:43:38] 12560                   [pid] => 21996
[2014-05-02 13:43:38] 12560               )

from php-daemon.

StevenMapes avatar StevenMapes commented on September 23, 2024

so what was the actual fix for this? I'm seeing it occur a lot and am having to kill the daemon, wait, then kill the worker processes and start up the daemon again via a cron just to try to limit the problem. The auto_restart_interval sadly isn't enough to clean it up and it's not a limit of physical resources either.

from php-daemon.

defusiondev avatar defusiondev commented on September 23, 2024

Also getting this quite a lot particularly when the task the daemon is performing has timeout issues. Have needed to kill the daemon and restart it to get it to work.

from php-daemon.

cwhsu1984 avatar cwhsu1984 commented on September 23, 2024

I really hope that someday someone could somehow fix this. This project is great but it really needs to get some bugs fixed, especially like this one. If you use this project, you are likely to see the errors like this:

[2017-09-12 21:09:11] 17597 QueueReceive  SysV::put() Failed for call_id 552: Retrying. Error Code: 11
PHP Warning:  msg_send(): msgsnd failed: Resource temporarily unavailable in /path/to/your/workdir/app/vendor/shaneharter/php-daemon/Core/Worker/Via/SysV.php on line 208

Like @willebil said, you don't really need a complex worker to make this happen. After inspecting the codes and performing some tests, I think it is caused by the message queue overflow. Run "ipcs -q" as soon as you start the daemon, and run it again when seeing "Resource temporarily unavailable" message. You will find that the used-bytes of queue is 16376.

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages    
...
0x5101058b 3080200    ubuntu     666        16376        170 

To be more specific, you only see the errors when your workers idle for some time, and the time for errors to occur depends on how many workers you start. Anyway, if the idling workers keep sending messages to the queue, the queue will be full eventually.

So, in order to fix this, we will need to know why we should send message to the queue even if the worker is idling. I don't have time to trace this bug now, but I will just leave a clue for anyone who is interested in it.

Anyway, @brunnels provides another modified version here #39. Too bad, it still has the same issue. So you are still going to see the errors as long as your workers idle for a few minutes.

I do some experiments on "default max size of queue" by using sysctl like this:

sudo sysctl -w kernel.msgmnb=65536

By increasing your message queue size, it really deferred the issue but still won't solve it. It does prove that this is where the issue comes from. Still, one thing that I don't understand is that when seeing the following message:

[2017-09-13 11:21:49] 23677 QueueReceive  Dropped Call. Requeuing Call 86 To `listen`
[2017-09-13 11:21:49] 23677 QueueReceive  Dropped Call. Requeuing Call 87 To `listen`

The requesting call might be dropped, but the message is not removed from the message queue. Hence, dropping call does not decrease the used size of message queue. If dropping call would removed the useless message in the message queue, we might have a workaround by just increasing the size of message queue.

from php-daemon.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.