
recluster's People

Contributors: f8k8, glasseyes42, issa-tseng, maxvipon, sandinmyjoints, spion, vaseker


recluster's Issues

Respawn issue with two instances

If I add two instances of recluster and run both of them, respawning will work incorrectly.

$ git clone https://github.com/tarmolov/recluster-test-app.git
$ cd recluster-test-app
$ npm install
$ npm start
> [email protected] start /Users/hevil/Yandex/recluster-test-app
> node server.js

app1
app2
/Users/tarmolov/recluster-test-app/app1.js:4
    throw new Error('Error in app1');
    ^

Error: Error in app1
    at Timeout.setTimeout (/Users/tarmolov/recluster-test-app/app1.js:4:11)
    at tryOnTimeout (timers.js:232:11)
    at Timer.listOnTimeout (timers.js:202:5)
[71627] worker (0:1) must be replaced, respawning in 0
app2

The synthetic error is thrown in app1, so I expect app1 to be respawned. However, it respawns app2 instead.

Domain example

Is there a working example of using domains with recluster to gracefully restart on unhandled exceptions?
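
Not an official recluster example, but a minimal sketch of the classic domain-per-request pattern inside a worker (the port and handler are illustrative): an unhandled error stops the server and exits the process, so recluster can respawn the worker.

var domain = require('domain');
var http = require('http');

var server = http.createServer(function(req, res) {
    var d = domain.create();
    d.on('error', function(err) {
        console.error('Unhandled error, shutting down worker:', err.stack);
        // Stop accepting new connections; recluster respawns the worker once it exits.
        server.close(function() { process.exit(1); });
        try {
            res.writeHead(500);
            res.end('Internal error\n');
        } catch (e) { /* the response may no longer be usable */ }
    });
    d.add(req);
    d.add(res);
    d.run(function() {
        // normal request handling goes here
        res.end('hello\n');
    });
});

server.listen(3000);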

Throwing exception since Node v0.12.0

hi,

Throws the following since Node v0.12.0

assert.js:86
throw new assert.AssertionError({
^
AssertionError: false == true
at SharedHandle.add (cluster.js:97:3)
at queryServer (cluster.js:480:12)
at Worker.onmessage (cluster.js:438:7)
at ChildProcess.<anonymous> (cluster.js:692:8)
at ChildProcess.emit (events.js:129:20)
at handleMessage (child_process.js:324:10)
at Pipe.channel.onread (child_process.js:352:11)

Documentation for return type of activeWorkers() function is invalid

Currently, README.md says that the cluster.activeWorkers() function

Returns a hash of all worker slots (0 <= WORKER_ID < N).

But it actually returns an array-like object, which has a length property and an index signature:

interface ActiveWorkers {
    length: number;
    [index: number]: Worker | null | undefined;
}

So, if we use Object.entries(activeWorkers) (or a similar way to iterate over the keys), we get length as a key, which is unexpected behavior because length is not a worker ID.

I think the documentation should be clearer on this point.
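
To illustrate (a sketch of the behavior described above, not of the documented API), treating the return value as array-like avoids picking up length as a key:

var workers = cluster.activeWorkers();

// Object.entries(workers) would also yield ['length', N], which is not a worker slot.
// Iterating the numeric slots directly avoids that:
for (var i = 0; i < workers.length; i++) {
    var worker = workers[i];
    if (worker) {
        console.log('slot', i, 'pid', worker.process.pid);
    }
}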

Question: How to know when all workers are ready/active?

We can listen for ready events on the cluster object, but these fire once for each worker.

How can we trigger some handler once all the workers are ready? Is there any such event or hook, public or private?

My use case is notifying PM2 that the cluster is ready, for graceful reloading. So this is somewhat related to #35 which is concerned about how to gracefully shut down a cluster.

I have this naive custom function that returns a promise which resolves when all workers are ready, but I think there is a possibility that, before all workers are ready, a worker dies and gets replaced, and the promise resolves too early:

function allWorkersReady (cluster) {
  return new Promise((resolve, reject) => {
    let pendingWorkersCount = cluster.workers().length
    cluster.on('ready', onReady)
    function onReady (worker) {
      if (--pendingWorkersCount === 0) {
        cluster.removeListener('ready', onReady)
        resolve()
      }
    }
  })
}
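
A possible refinement (only a sketch, assuming activeWorkers() reflects live workers and that the ready event passes the worker object, as in the code above): resolve only once every currently active worker has reported ready, so a worker that dies and is replaced before readiness does not cause an early resolve.

function allWorkersReady (cluster) {
  const expected = cluster.workers().length
  const readyIds = new Set()
  return new Promise((resolve) => {
    cluster.on('ready', function onReady (worker) {
      readyIds.add(worker.id)
      // Count only workers that are both ready and still alive right now.
      const aliveAndReady = Array.from(cluster.activeWorkers())
        .filter(w => w && readyIds.has(w.id))
      if (aliveAndReady.length >= expected) {
        cluster.removeListener('ready', onReady)
        resolve()
      }
    })
  })
}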

possible EventEmitter memory leak detected

I have a lot of such messages in stderr.log:

(node) warning: possible EventEmitter memory leak detected. 11 listeners added. Use emitter.setMaxListeners() to increase limit.
Trace
    at EventEmitter.addListener (events.js:175:15)
    at module.exports.self.reload.i (/usr/lib/frontend/server/node_modules/recluster/index.js:230:21)
    at Array.forEach (native)
    at EventEmitter.module.exports.self.reload (/usr/lib/frontend/server/node_modules/recluster/index.js:211:22)
    at process.<anonymous> (/usr/lib/frontend/server/index.js:53:13)
    at process.EventEmitter.emit (events.js:93:17)
    at SignalWatcher.startup.processSignalHandlers.process.on.process.addListener.w.callback (node.js:487:45)

/usr/lib/frontend/server/index.js:53:13

51    process.on('SIGUSR2', function() {
52        console.log('%s [master] Got SIGUSR2, reloading cluster...', prefix);
53        cluster.reload();
54    });

Possibly it would be enough to set emitter.setMaxListeners(options.workers).

How to get master to fork workers from new release?

This is more of a conceptual question than an issue with recluster per se, so sorry about that, but any guidance would be appreciated. If I am managing deployment by directories that contain releases, then when I do a new release, how do I tell cluster to fork the new workers from the path to the new release code?

In other words:

Say release A is in /var/deploy/app/A, and symlink /var/deploy/app/current/ points to it. My app lives in app.js so my file argument to recluster is a relative path:
var cluster = recluster("app.js");.

My simple cluster master script lives in cluster.js. I start up a master process via cd /var/deploy/app/A ; node cluster.js which forks a few app.js workers using the code in /var/deploy/app/A. The cwd of the master and the workers is /var/deploy/app/A.

Now I deploy release B to /var/deploy/app/B and update /var/deploy/app/current to point there. I send SIGUSR2 to master. master's cwd is still /var/deploy/app/A, so it is going to fork new workers from there rather than from /var/deploy/app/B.

One solution to this is:

  1. explicitly use the /current symlink when starting the cluster master: cd /var/deploy/app/current ; node /var/deploy/app/current/cluster.js.
  2. specify the file argument to cluster using the symlink from the master process:
var workerPath = process.argv[1].replace("cluster", "app");
var cluster    = recluster(workerPath);

Because the symlink is updated to the new release before SIGUSR2 is sent, the new workers will be forked from the new release.

In practice, I am using Chef and Upstart which complicates things somewhat. But ultimately I want to send a signal to the master process (via service <app name> reload) that will cause it to fork new workers from the new code release that Chef just deposited onto the server. Without the symlink, I haven't thought of a way to keep the same master process around but get it to fork workers from the new code. Do you have any thoughts on this?
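
For what it's worth, combining the two steps above into cluster.js might look roughly like this (a sketch only; the /var/deploy/app/current path and the SIGUSR2 handler come from the setup described above, not from recluster itself):

var recluster = require('recluster');
var path = require('path');

// Resolve the worker file through the "current" symlink, so each reload
// forks workers from whatever release the symlink points to at that moment.
var workerPath = path.join('/var/deploy/app/current', 'app.js');
var cluster = recluster(workerPath);

cluster.run();

process.on('SIGUSR2', function() {
    console.log('Got SIGUSR2, reloading cluster from', workerPath);
    cluster.reload();
});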

Expose eventemitter API

Document and expose eventemitter API. To do this, all usages of self.on will need to be replaced with something else.

Push latest to NPM

With PR #50 merged which resolves issue #48, would it be possible to get a bump to the version on NPM?

Tests are flaky

The current tests are flaky; they rely on timings and sometimes randomly pass or fail. We need to identify which timeouts can be replaced with events on the event emitter, implement those events, and update the tests.

Multiple calls support

In a project I need to call recluster twice because there are two separate servers. After calling recluster twice, I found that the first call's workers can't respawn correctly because the exec file was changed by the second call. It seems easy to allow this to happen (see 7nights@909f71d). Could this feature be supported?

Windows support?

Does it support Windows, or only POSIX operating systems? I would like to use this for a product that should run cross-platform. I used naught and it does not seem to work (as clearly mentioned in their documentation).
Thanks!

respawn and backoff with multiple workers

During testing, backoff seems to work properly with 1 worker, but with 4 workers, the backoff (max respawn time) is exceeding the set time.

For example:
var opt = {
    workers: 4,
    /* seconds */
    timeout: 300,
    respawn: 2,
    backoff: 10
};

If the app crashes repeatedly (for testing), the respawn time ends up being over 30 seconds and more, exceeding the 10 second backoff setting. I'll be happy to review the code to track down the problem, but I wanted to first make sure this was in fact a bug and not a misunderstanding on my part.

Thank you for a great module.

old workers never die.

Hi,

When I signal the master using SIGUSR2, new workers are started but the old ones never die.

            var timeout = setTimeout(worker.kill.bind(worker), opt.timeout * 1000);
            worker.on('disconnect', clearTimeout.bind(this, timeout));
            // possible leftover worker that has no channel estabilished will throw
            try { worker.disconnect(); } catch (e) { }
            cluster.removeListener('listening', stopOld);

What I think is happening is that the worker.disconnect() call generates a disconnect event that clears the timeout before it has a chance to execute. Am I reading this correctly?

I am running node 0.10.0, so it could be that this event wasn't generated in your version of node?
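
If that reading is right, a minimal sketch of the kind of fix I'd expect (written against the snippet above; not necessarily how the maintainers would change it) is to clear the kill timeout on 'exit' rather than on 'disconnect':

            var timeout = setTimeout(worker.kill.bind(worker), opt.timeout * 1000);
            // 'disconnect' fires as soon as worker.disconnect() is called below,
            // so tie the clearTimeout to 'exit' (the worker actually terminating) instead.
            worker.on('exit', clearTimeout.bind(this, timeout));
            try { worker.disconnect(); } catch (e) { }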

recluster and nodemon on Mac OS X

Hi,
I am using recluster and nodemon on Mac OS X High Sierra.
Recluster seems to work fine when sending the kill signal from the CLI.
Now I am combining recluster with nodemon, so nodemon is responsible for restarting as soon as a file changes. This worked fine on my old Windows machine, but on Mac it seems to have difficulties with this setup.

I run the following command: nodemon -w server --exec node server/main.js

nodemon -w server --exec node server/main.js
[nodemon] 1.14.12
[nodemon] to restart at any time, enter rs
[nodemon] watching: /.../server/**/*
[nodemon] starting node server/main.js
spawned cluster, kill -s SIGUSR2 10479 to reload
CONSUMER 10482
server listening on 3000 (NODE_ENV=development)
[nodemon] restarting due to changes...
Got SIGUSR2, reloading cluster...
[10480] worker (0:1) must be replaced, respawning in 0
[10481] worker (0:2) must be replaced, respawning in 1996
[10482] worker (0:3) must be replaced, respawning in 5994
CONSUMER 10490
CONSUMER 10491
CONSUMER 10488
CONSUMER 10489
CONSUMER 10493
CONSUMER 10494

Any ideas on this?

Processes stack up on concurrent restarts

Send SIGUSR2 and wait: my 2 processes come back up as 2 processes every time. Using siege, availability stays at 100%.

Send SIGUSR2 twice rapidly: my 2 processes turn into 5 processes and I get an error in the console: Error [ERR_IPC_DISCONNECTED]: IPC channel is already disconnected

Using siege, availability drops from 100% to around 90%. Eventually the number of running processes decreases back down to 2.

I would have expected it to kill a worker before spawning a replacement; instead it appears to spawn replacements before killing the worker being replaced, which means the total number of processes sometimes exceeds the number of CPUs (undesired). Also, 90% availability instead of 100% when issuing restarts in quick succession is an issue. A quick fix could be to toggle a flag and block restarts if there's already one pending, as sketched below.
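
For example, a caller-side guard could look something like this (a sketch only, assuming the SIGUSR2 handler from the README example and that cluster.reload(cb) calls cb when the reload is done):

var reloading = false;

process.on('SIGUSR2', function() {
    if (reloading) {
        console.log('Reload already in progress, ignoring SIGUSR2');
        return;
    }
    reloading = true;
    cluster.reload(function() {
        reloading = false;
        console.log('done reloading cluster');
    });
});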

Feature description to Shutdown and Kill the Whole Cluster

Hi, I made a comparison of several similar modules such as [1] [2] [3] [4] and I liked yours the most because it comes without bloat!

I think there are two more important use cases that should be described in README.md:

"To gracefully shutdown the whole cluster (workers and master)"

kill -s TERM <cluster_pid>

"To forcefully kill the whole cluster (workers and master)"

kill -s KILL <cluster_pid>

What do you think about adding it?

[1] https://github.com/strongloop/strong-cluster-control
[2] https://github.com/andrewrk/naught
[3] https://github.com/ql-io/cluster2
[4] https://github.com/Unitech/pm2

Error [ERR_IPC_CHANNEL_CLOSED]: Channel closed on node 9.2.1

Hi, I see there is a problem on node 9.2.1 while sending kill -s SIGUSR2 51865

my server.js file is

const express = require('express')
const app = express()

app.get('/', (req, res) => res.send('Hello World!'))

app.listen(3000, () => console.log('Example app listening on port 3000!'))

and cluster.js taken from your example

$ node cluster.js
spawned cluster, kill -s SIGUSR2 51865 to reload
Example app listening on port 3000!
Example app listening on port 3000!
Example app listening on port 3000!
Example app listening on port 3000!
Example app listening on port 3000!
Example app listening on port 3000!
Example app listening on port 3000!
Example app listening on port 3000!
Got SIGUSR2, reloading cluster...
Example app listening on port 3000!
Example app listening on port 3000!
Example app listening on port 3000!
Example app listening on port 3000!
Example app listening on port 3000!
Example app listening on port 3000!
Example app listening on port 3000!
Example app listening on port 3000!
events.js:136
      throw er; // Unhandled 'error' event
      ^

Error [ERR_IPC_CHANNEL_CLOSED]: Channel closed
    at ChildProcess.target.send (internal/child_process.js:606:16)
    at Worker.send (internal/cluster/worker.js:40:28)
    at killTimeout (/Users/jankowalski/3prty/JS/my-recluster/node_modules/recluster/index.js:134:24)
    at EventEmitter.workerReplaceTimeoutTerminate (/Users/jankowalski/3prty/JS/my-recluster/node_modules/recluster/index.js:111:9)
    at EventEmitter.emit (events.js:159:13)
    at emit (/Users/jankowalski/3prty/JS/my-recluster/node_modules/recluster/index.js:53:19)
    at EventEmitter.workerDisconnect (/Users/jankowalski/3prty/JS/my-recluster/node_modules/recluster/index.js:148:36)
    at EventEmitter.emit (events.js:159:13)
    at ChildProcess.worker.process.once (internal/cluster/master.js:207:13)
    at Object.onceWrapper (events.js:254:19)

works fine on node 8.2.1

Does the recluster module maintain sticky sessions?

I just want to know whether recluster maintains sticky sessions, i.e., if user 1's first request went to the first worker, does recluster ensure that subsequent requests also go to that same first worker?

Maybe the emit() function could be made clearer?

The question is about emit() function.

var self = new EE();
var channel = new EE();

function emit() {
    channel.emit.apply(self, arguments);
    self.emit.apply(channel, arguments);
}

According to the source code, we create two instances of the EventEmitter class and use the emit function to relay events to both of them.

But why do we use the channel object's emit method to emit events on self, and self's emit method to emit events on channel?

  1. It seems that we don't bind any events on the self object.

  2. I changed the function as below:

    function emit() {
        channel.emit.apply(channel, arguments);
    }
    

With this change, the module can still invoke the run() and reload() functions.

If there is no other design consideration behind it, maybe this would be a clearer way.

Add support for cwd?

A nice feature would be to allow using this together with other popular deployment tools such as Capistrano. There is a problem with node cluster itself regarding cwd, documented here: http://clarkdave.net/2013/02/node-js-and-cluster-cwd-dirname-shenanigans/

It would be nice to solve that here in recluster. Apparently doing the following when new workers are spawned would fix the issue:

// http://clarkdave.net/2013/02/node-js-and-cluster-cwd-dirname-shenanigans/
if (__dirname !== process.cwd()) {
    process.chdir(__dirname);
}

Also see #8

timeout does not work as documented

I am running NODE_ENV=development node backend-cluster.js

backend-cluster.js

var recluster = require('recluster'),
    path = require('path');

process.title = 'backend'

// With a relative path, it will only follow the symlink once.
// Use an absolute path through the symlink, because we want it to follow the symlink on every restart.
var cluster = recluster(path.join('/var/www/html/app/current/repos/backend.js'), {
    timeout: 30
});
cluster.run();

backend.js

setInterval(() => console.log('foo'),1000)

After editing backend.js and removing the code / changing the log message, I send the kill signal; recluster spawns replacement workers, but the old workers do not die and "foo" keeps getting printed to the console. I expect the old workers to be killed after 1 minute since NODE_ENV is development. I also added the explicit timeout option of 30 seconds, but "foo" keeps getting printed to the console for much longer (maybe an hour?).

Is reload truly zero downtime?

If I'm reading the code correctly, it is possible that some requests could fail if they occur between the shutdown and the fork loop at the end.

Douglas

Workers getting requests *after* `disconnect` message

Hi!

I'm trying to use recluster to get zero-downtime deploys for my hapi web server. It works most of the time, but sometimes I see a worker process that continues receiving and responding to requests even after it has been given a disconnect message. As far as I understand, recluster should not be routing requests to the old workers anymore?

In particular, I'm using hapi to proxy a CouchDB _changes feed, which is a longpoll feed that should last for 25 seconds at a shot. It is these longpolls that I'm primarily having an issue with.

I can see that the workers are still getting requests after the disconnect message, as I'm logging the pid when the hapi server receives requests and when it responds.

I'm directly running the following cluster.js:

var recluster = require('recluster'),
    path = require('path');

var cluster = recluster(path.join(__dirname, 'index.js'), {
    timeout: 120, // seconds
    workers: 2,
    readyWhen: 'ready',
    log: {
        respawns: true
    }
});
cluster.run();

process.on('SIGUSR2', function() {
    console.log('Got SIGUSR2, reloading cluster...' + new Date());
    cluster.reload(function() {
        console.log("done reloading cluster");
    });
});

console.log("spawned cluster, kill -s SIGUSR2", process.pid, "to reload");

In my hapi server, I have:

server.start(function () {
    process.send({cmd: 'ready'});
    server.log('info', 'Server running at: ' + server.info.uri + ' with PID #' + process.pid + ' at ' + new Date());
});

And to handle the disconnect message:

process.on('message', function(m) {
    if (m.cmd == 'disconnect') {
        // disconnected from master, no more clients. clean up.
        console.log("close message for PID#" + process.pid + " at " + new Date());
        server.stop({
            timeout: 60000
        }, function() {
            console.log('hapi stopped with PID#' + process.pid + " at " + new Date());
            process.exit();
        });
    }
});

I'm sorry I don't yet have a reduced test case; I can try to work on one if you don't see anything obviously wrong with my configuration right away. ;-)

Daemon option

Hi, I want to tell you that I really like your module and your work.

I use recluster for my project and I am really impressed :)

I would like to know if it's possible to add an option for launching recluster as a daemon, or whether I should use a separate node module for this.

I thought that maybe start-stop-daemon could be a solution. I would love to know your opinion about it.

Thank you

Backoff configuration may not guarantee the max time between respawns when workers die?

I have been studying the recluster source code recently. It's a great module for controlling Node processes, but I have some questions about the backoff configuration.

There are two files here:

index.js:

var path = require('path');
var recluster = require('recluster');


var opt = {
    respawn: 0.5,
    backoff: 2
};
var cluster = recluster(path.join(__dirname,'server.js'),opt);

cluster.run();

server.js:

var http = require('http');


var s = http.createServer(function(req, res) {
    var params = req.url.split('/').slice(1);
    setTimeout(function() { 
        res.writeHead(200);
        res.end("hello world\n");
    }, params[0] || 1);
});

s.listen(8000);

setTimeout(function() {
    throw new Error("Unclean exit!");
}, 500);

I ran node index.js several times; the result is shown in the terminal:

[terminal output screenshot]

I found that the respawn time increased from 0 to 6293 ms (about 6 seconds).

But the repository README says:

opt.backoff

Maximum respawn time (reached via exponential backoff). Set to 0 or undefined to disable exponential backoff.

I looked at the part about backoff in the source code. Here, time is the fork timeout.

But as mentioned above, with backoff set to 2 seconds, the actual results (2959 ms, 4854 ms) exceed the backoff configuration.

According to the source code, as long as backoff is set, it will invoke delayedDecreaseBackoff and the respawn time will be divided by 2 one backoff period later. Before that, the respawn time is multiplied by 2.

But this exponential backoff does not seem to guarantee the maximum time between respawns when workers die, or else I can't make sense of the logic.

Could you explain how this part of the design is meant to work? :)

recluster.terminate() accepts a callback as argument

Hi!

I've been using your module for a while and I think it's doing its job very well and it has a small, adequate interface. Kudos!

I was refactoring one of my project's boot.js files (which initiates a recluster instance and fires up the application) and I thought it would be very helpful to be able to execute code when the cluster terminates. Something like this. That way you can assign handlers to the standard terminating signals, terminate the cluster in them, and exit the process once it's done. Like so:

var cluster = recluster({....});

function shutdown() {
    cluster.terminate(process.exit);
}

process
    .on('SIGUSR2', cluster.reload)
    .on('SIGINT', shutdown)
    .on('SIGTERM', shutdown);

cluster.run();

I can send a pull request with the changes and tests, if you are interested.

Option to fork new worker on worker "stoppedListening"?

This still stands between me and zero downtime: on process uncaughtException, I do server.close which means this worker is no longer accepting new connections. However, it may take some time before exiting to wind down existing connections, clean up, etc. During this time, recluster will not fork a new worker, which means I'm down one worker. On a two-core machine, should another uncaught exception be thrown (perhaps there is a bad request that triggered the first, and now the client is resending it, so the second worker gets it), I will have no workers accepting new connections.

It seems that in this case, master will queue new connections until a worker accepts them, though I don't know for how long it will hold them, nor how many it will queue. But basically, this is downtime.

On recluster.reload, old workers are disconnected and new workers to replace them are immediately forked, which works great for zero downtime. However, could there be some way for a worker to tell master "I am no longer listening for new connections" which would be another trigger for master to fork a new worker? Maybe master could listen for a message "stoppedListening" from the worker, that I could send in my uncaught exception handler?

Just throwing this out there for discussion. I know you said you would be busy with a project so no worries if you don't have time for this now, but interested to hear your thoughts whenever you are able!
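
For concreteness, the worker side of that idea might look something like this (a sketch only: the "stoppedListening" message name is hypothetical, recluster's master does not currently react to it, and a server variable is assumed to be in scope):

process.on('uncaughtException', function(err) {
    console.error('Uncaught exception, no longer accepting connections:', err);
    // Hypothetical: tell master we stopped listening so it could fork a
    // replacement immediately instead of waiting for this process to exit.
    if (process.send) process.send({ cmd: 'stoppedListening' });
    server.close(function() {
        process.exit(1);
    });
});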

Old workers still don't die!

As with issue #1, I'm having difficulties getting old workers to die. On my dev machine, everything works fine. But in production when I send a SIGUSR2, new workers are started but the old ones remain. The old ones appear to continue doing work too.

What information/debugging can I provide to help solve this?

Question: what's the correct way to gracefully shut down a cluster and its children?

I have a shutdown handler in child.js

let makeCloseHandler = (sig) => {
  return () => {
    console.log(`Received signal ${sig}`);
    server.close(() => {
      console.log('Server closed');
    });
  };
};

process.on('SIGINT', makeCloseHandler('SIGINT'));
process.on('SIGTERM', makeCloseHandler('SIGTERM'));

And cluster.js

const recluster = require('recluster');
const path = require('path');

const cluster = recluster(path.join(__dirname, 'index.js'), {
  timeout: 120
});

const workerEvent = function(ev) {
  cluster.on(ev, function(worker) {
    console.log('Worker ' + worker.id + ' [' + worker.process.pid + '] ' + ' ' + ev + '.');
  });
};

['online', 'listening', 'disconnect', 'exit'].forEach(function(ev) {
  workerEvent(ev);
});

cluster.run();
console.log('Master ' + process.pid + ' started.');

let makeCloseHandler = (sig) => {
  return () => {
    console.log(`Cluster received signal ${sig}`);
    cluster.terminate(() => {
      console.log('Cluster closed');
    });
  };
};

process.on('SIGINT', makeCloseHandler('SIGINT'));
process.on('SIGTERM', makeCloseHandler('SIGTERM'));

But I see these logs when I stop node cluster.js:

^CReceived signal SIGINT
Received signal SIGINT
Cluster received signal SIGINT
Received signal SIGINT
Server closed
Server closed
Server closed
Received signal SIGINT
Server closed
Cluster closed

It seems like the children are receiving SIGINT before the master does, so I'm confused about how graceful shutdown is handled here. What's the best way to ensure we don't drop a connection halfway through a request?
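
One thing worth noting: when you press Ctrl-C, the terminal delivers SIGINT to the whole foreground process group, so the workers and the master receive it at essentially the same time, which explains the interleaved logs. A hedged sketch of one way to make the ordering deterministic (assuming cluster.terminate() ends up killing workers, e.g. via worker.kill and SIGTERM, and that a server variable is in scope) is to let only the master react to SIGINT and have the children shut down on SIGTERM:

// child.js
process.on('SIGINT', () => {
  // Ignore: Ctrl-C hits the whole process group; the master coordinates shutdown.
});

process.on('SIGTERM', () => {
  console.log('Received SIGTERM, closing server');
  server.close(() => {
    console.log('Server closed');
    process.exit(0);
  });
});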

Send "disconnecting" message to workers

Another one from me. How about, when timeout > 0, sending a "disconnecting" message to workers so they have a chance to gracefully clean up and close out existing connections? I understand that once disconnected they will not receive new connections from master, but their servers may continue serving existing long-lived connections for a while before abruptly exiting when the timeout is over.

Then in my app, I can do process.on("message", function(msg) { if (msg == "disconnecting") cleanUp(); }). As-is, the app knows nothing about what is going on until it receives SIGTERM from worker.kill, and by then existing clients may have been getting served by old workers for as long as the timeout -- in production, you default to 1 hour for this.

What do you think? I am opening an issue rather than a PR because I am not sure I am thinking about this the right way and not missing something. But code-wise, I think it could be as simple as:

diff --git a/index.js b/index.js
index 42c3fbc..bc52213 100644
--- a/index.js
+++ b/index.js
@@ -157,6 +157,7 @@ module.exports = function(file, opt) {
                 if (opt.timeout > 0) {
                     var timeout = setTimeout(killfn, opt.timeout * 1000);
                     worker.on('exit', clearTimeout.bind(this, timeout));
+                    worker.send('disconnecting');
                 } else {
                     killfn();
                 }
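
And on the worker side, the handler from above written out in full (only meaningful if the patch is applied, since plain recluster never sends "disconnecting"; server is assumed to be in scope):

process.on('message', function(msg) {
    if (msg === 'disconnecting') {
        // cleanUp() stands for whatever app-specific cleanup is needed
        // (the same placeholder as in the one-liner above).
        cleanUp();
        // No new connections will arrive; finish in-flight work and exit.
        server.close(function() {
            process.exit(0);
        });
    }
});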

reload doesn't wait for existing connections

I just ran a test and it seems to demonstrate that reload trashes already established connections.

  1. I commented out my res.send() statement in my Express middleware
  2. I updated my cluster.js file to issue a reload 10 seconds after startup
  3. I started up my cluster
  4. I ran a curl command against the endpoint that never returns
  5. I waited for the reload
  6. When the reload triggers, my SIGTERM listener fires
  7. When I attempt graceful shutdown, the server is already destroyed.
  8. The connection that was waiting gets an empty response, which in this trivial test is expected, but if this were a real connection waiting for results, that would not be acceptable.

Provide mechanism to not respawn if worker hasn't ever gotten to ready

If I have a worker that is misconfigured (e.g., missing an env setting) and it never gets to the ready state, it would be nice if the entire cluster could shut down, instead of going into an endless loop of trying and failing to start the workers. If the cluster master could be told to listen for a worker to reach the ready state at least once, then we'd know it's probably useful to respawn. If, however, we never get to ready, that's probably a good sign that respawning isn't going to help much.

I can see times where you might have workers that you want to keep kicking, so maybe this could be optional, opts.respawnExpectsReady or the like.
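
As a stopgap, something like this user-land approximation might work (a sketch only; opts.respawnExpectsReady does not exist, and this assumes the 'ready' / 'exit' events and cluster.terminate(cb) shown in other issues here):

var sawReady = false;

cluster.on('ready', function() {
    sawReady = true;
});

cluster.on('exit', function(worker) {
    if (!sawReady) {
        console.error('Worker', worker.id, 'exited before any worker became ready; shutting down');
        cluster.terminate(function() {
            process.exit(1);
        });
    }
});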
