huptime's Introduction

High uptime

Huptime is a tool for achieving zero downtime restarts without the need to modify your program in any way.

Although many applications support reloading configurations while running, a zero downtime restart allows for upgrading the application code without rejecting any clients.

Basic Example

In a terminal, run (on Python 3, use python3 -m http.server in place of python -m SimpleHTTPServer):

huptime --exec python -m SimpleHTTPServer &
PID=$!

Then, in a second terminal:

while true; do curl http://localhost:8000 2>/dev/null || echo "fail"; done

Finally, in a third terminal (or back in the first):

kill -HUP $PID

You should see no "fail" output on the second terminal.

With this reload, the complete code for SimpleHTTPServer is reloaded (potentially with changes), but at no time are connections denied or dropped. Connections that arrive during the restart stay pending, and are processed once the new version is up and running again (i.e. it has bound the socket and called accept).
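Connections survive the restart because the listening socket stays open: the kernel keeps completing handshakes and holds new connections in the listen backlog until something calls accept. Below is a minimal Python sketch of that kernel behaviour, independent of huptime. The port is picked by the OS and the ping/pong payloads are purely illustrative.

```python
import socket

# A listening socket that nobody is accepting on yet, as during a restart.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
server.listen(8)                   # backlog: pending connections wait here
port = server.getsockname()[1]

# The client connects and sends a request while the "server" is still busy.
client = socket.create_connection(("127.0.0.1", port), timeout=5)
client.sendall(b"ping")

# Later, the (new) server gets around to calling accept: nothing was lost.
conn, _ = server.accept()
request = conn.recv(4)
conn.sendall(b"pong")
reply = client.recv(4)

conn.close()
client.close()
server.close()
```

The client never sees a refused connection, because from the kernel's point of view the port was listening the whole time.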

Why?

With continuous deployment, software can be updated dozens, hundreds or even thousands of times per day. It is critical that service is not interrupted during upgrades.

In an ideal world, all applications would support a mechanism for doing zero downtime restarts. The reality is that many standard frameworks make this difficult to do from the top down. It's not practical to plumb this functionality through every layer, particularly for applications over which you have no control.

Compound this with the fact that many applications consist of many different small components (written using different languages and frameworks), and you've got yourself a headache.

Because of this complexity, one of the first things people have to do is implement a custom load balancing tier and a complex upgrade process. Although this is important at a certain scale, it shouldn't be that hard for simple services. It's crazy to add a whole new tier when the problem can be solved in a much simpler way!

Huptime attempts to make it very simple to achieve these restarts for simple, unmodified applications.

How do I install it?

Building this package generally requires:

  • make
  • gcc and g++
  • python
  • rpmbuild (optional) and dpkg (optional)

Clone the repo:

git clone http://github.com/amscanne/huptime

Install it the old-fashioned way:

cd huptime && sudo make install

Want Ubuntu & Debian packages?

cd huptime && make deb && sudo dpkg -i huptime*.deb

How about CentOS or RedHat?

cd huptime && make rpm && sudo rpm -i huptime*.rpm

How do I use it?

You simply need to run services via huptime.

For example:

# Start the service.
huptime /usr/bin/myservice &

# Zero downtime restart.
killall -HUP myservice

# Or, if you prefer...
huptime --restart /usr/bin/myservice

If there is a pidfile, it can be reset on restart:

# Start the service.
huptime --unlink /var/run/myservice.pid /usr/bin/myservice &

# Zero downtime restarts.
killall -HUP myservice

# Again, if you prefer...
huptime --restart /usr/bin/myservice

Or, if you need exec (for example, to run under upstart):

# Start the service and get the PID.
huptime --exec /usr/bin/myservice &
PID=$!

# Zero downtime restart (same PID).
kill -HUP $PID

# Again, as always...
huptime --restart /usr/bin/myservice

What does it support?

Huptime should [+] handle the following normal things:

  • Daemonization & pid files
  • Process pools
  • Multiple server sockets
  • Event-based and thread-based servers
  • Integration with supervisors (just use exec!)

In terms of languages and frameworks, huptime should support nearly all programs that are dynamically linked against a modern libc.

Most modern dynamic languages (Python, Ruby, Node, etc.) fall into this category, as do most C/C++ programs. A notable exception is Go, which invokes system calls directly and uses only static linking. (For the record, I am a big fan of this approach. However, both have their merits.)

[+] Should. YMMV.

What else does it do?

  • Transparent multi-binding (running the same service multiple times)

If you are running Linux 3.9+, then you can also easily enable pools of processes by starting your services with the --multi option. Again, this does not require any modification to your application.

For example:

# Start the service (4 workers).
huptime --multi=4 /usr/bin/myservice &

# Zero downtime restart of all.
killall -HUP myservice

# Or, if you prefer...
huptime --restart /usr/bin/myservice
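The Linux 3.9 requirement suggests the SO_REUSEPORT socket option, which first appeared in that kernel release. That --multi is built on it is an assumption here, not something this README states. The Linux-only Python sketch below only demonstrates the kernel feature itself: two independent sockets listening on the same port. The helper name reuseport_listener is hypothetical.

```python
import socket

def reuseport_listener(port):
    """Return a listening TCP socket with SO_REUSEPORT set (Linux 3.9+)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind(("127.0.0.1", port))
    sock.listen(16)
    return sock

first = reuseport_listener(0)              # port 0: the OS picks a free port
port = first.getsockname()[1]
second = reuseport_listener(port)          # same port, no "address in use"

same_port = second.getsockname()[1] == port
first.close()
second.close()
```

Without the option on both sockets, the second bind would fail with "Address already in use". With it, the kernel distributes incoming connections across the listeners.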

Want to manage the number of running scripts yourself?

# Number of workers to start (4 is only an example default).
N="${N:-4}";
pids="";

reload() {
    for pid in $pids; do
        kill -HUP $pid;
    done
}

stop() {
    for pid in $pids; do
        kill -TERM $pid;
    done
}

trap reload SIGHUP;
trap stop SIGTERM;
trap stop SIGINT;

count="0";
while [ "$count" -lt "$N" ]; do
    huptime --multi=1 /usr/bin/myservice &
    pids="$pids $!";
    count=$(($count + 1));
done

for pid in $pids; do
    wait $pid;
done

  • Transparent restart on exit

If you've got a stubborn program and you want to restart it automatically when it fails, you should really fix your program. Barring that, you should use a supervisor like upstart. Barring that (you don't care about the sensible features that a supervisor provides and want zero downtime restarts), you can use huptime.

To enable this option, simply specify --revive on the huptime command line.

For example:

# Start a zero downtime netcat.
huptime --revive nc -l 9000 < message.txt &

# Clients will always find a server...
nc localhost 9000

How does it work?

Huptime installs a signal handler for SIGHUP.

It tracks open file descriptors by intercepting calls to bind and accept (among other things). When the program receives a SIGHUP, it will intelligently exec a new copy of the program without closing any bound sockets and without requiring any changes to the program.

Note that this is not simply a reload but rather a new version of the application with config changes and code changes (as both now appear on disk).

When the new copy of the program tries to bind the same socket, huptime will silently replace it with the still-open socket from the previous version.
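The underlying mechanism is that an open listening socket can outlive the program that created it, as long as its file descriptor is kept open across exec. The sketch below is a rough, POSIX-only illustration of that idea in Python and is not huptime's implementation: huptime intercepts bind so the new copy needs no changes, whereas here the hypothetical child is handed the descriptor number explicitly on its command line.

```python
import socket
import subprocess
import sys

# Code for the "new copy" of the program: it adopts the inherited descriptor
# rather than creating and binding a fresh socket.
CHILD = """
import socket, sys
listener = socket.socket(fileno=int(sys.argv[1]))
conn, _ = listener.accept()
conn.sendall(b"served by new copy")
conn.close()
"""

# "Old copy": bind and listen once.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(8)
port = listener.getsockname()[1]

# A client connects before the new copy even exists; it waits in the backlog.
client = socket.create_connection(("127.0.0.1", port), timeout=10)

# Start the new copy, keeping the listening descriptor open across exec.
fd = listener.fileno()
child = subprocess.Popen([sys.executable, "-c", CHILD, str(fd)], pass_fds=(fd,))
listener.close()        # the old copy lets go; the socket itself lives on

# Read until the new copy closes the connection.
chunks = []
while True:
    data = client.recv(64)
    if not data:
        break
    chunks.append(data)
reply = b"".join(chunks)

client.close()
child.wait()
```

The client connected while only the old copy existed, yet its reply comes from the new one, and the port was never unbound in between.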

There are two fundamental modes of operation:

  • fork (default)

If you use fork, then when the process receives a SIGHUP, it will fork and exec a new copy of the application. This results in less downtime, as new requests can start being served immediately while old requests are still being finished by the original program.

However, this may not integrate cleanly with supervisor processes like upstart, which depend on the PID of the application staying constant.

This may also present issues for some applications that check pidfiles or contain internal mechanisms for preventing two copies of themselves from running. Huptime goes to some effort to prevent conflict (allowing for unlink prior to executing the child), but it may still arise.

  • exec

If you use exec, then when a process receives a SIGHUP, it will stop accepting new requests, leaving them queued on the bound socket (in the kernel), and wait until all outstanding requests are finished. Only when existing requests are finished will the program restart.

This may not work properly if requests are not bounded in how long they will take. This may also lead to high response times for some clients during the restart. However, this approach will play well with supervisors.

For example, if you are using upstart, you can do the restart as:

initctl reload service
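The exec-mode sequence described above (stop accepting, let in-flight requests finish, then restart) can be modelled in a few lines. This is a toy Python model for illustration only, not huptime's code: the class and method names are hypothetical, and a counter increment stands in for the real exec.

```python
class DrainingRestart:
    """Toy model of exec mode: restart only once in-flight requests finish."""

    def __init__(self):
        self.active = 0                # requests currently being served
        self.pending_restart = False   # set when SIGHUP has been received
        self.restarts = 0              # how many times we "exec'd"

    def on_accept(self):
        if self.pending_restart:
            return False               # not accepted; the kernel queues the client
        self.active += 1
        return True

    def on_close(self):
        self.active -= 1
        self._maybe_restart()

    def on_sighup(self):
        self.pending_restart = True
        self._maybe_restart()

    def _maybe_restart(self):
        if self.pending_restart and self.active == 0:
            self.restarts += 1         # stands in for the real exec()
            self.pending_restart = False

model = DrainingRestart()
model.on_accept()                      # one request in flight
model.on_sighup()                      # restart requested, but deferred
deferred = model.restarts == 0
refused = not model.on_accept()        # a new client waits in the backlog
model.on_close()                       # last request finishes: restart happens
```

The model also makes the stated caveat visible: if on_close is never reached because a request runs forever, the restart never happens.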

Limitations

Although the majority of programs will work, I'm sure that not all will.

The exit is not done through the normal application path on restart. Although all file descriptors are closed, there may be application-level resources (or some system resources) that are not cleaned up as expected and may cause problems.

The command line and environment cannot be changed between restarts. You can easily work around this issue by putting all configuration inside a file that is read on start-up (e.g. myservice --config-file=/etc/myservice.cfg).
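A minimal sketch of that workaround, assuming a service that reads a JSON file once at start-up. The file name, the workers key and the load_config helper are all hypothetical, and a temporary directory stands in for /etc. Because the arguments stay fixed but the file is re-read by each new copy, editing the file before sending SIGHUP is enough to change the configuration.

```python
import json
import os
import tempfile

def load_config(path):
    """Read settings from disk; called once at every start-up."""
    with open(path) as handle:
        return json.load(handle)

# Hypothetical config file standing in for /etc/myservice.cfg.
path = os.path.join(tempfile.mkdtemp(), "myservice.cfg")

with open(path, "w") as handle:
    json.dump({"workers": 2}, handle)
before = load_config(path)          # what the old copy saw at start-up

with open(path, "w") as handle:
    json.dump({"workers": 8}, handle)
after = load_config(path)           # what the restarted copy sees
```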

What's up with the name?

It's clever! Services are often reloaded using SIGHUP. The point of this tool is to maximize uptime by enabling zero downtime restarts via SIGHUP. It's your high availabilibuddy!

huptime's People

Contributors

amscanne, derekchiang, jackdanger, joshmoore

huptime's Issues

Huptime doesn't work with node.js apps?

I'm trying to use huptime with a super-simple node.js application based on express.js, but sending the HUP signal terminates the process instead of "reloading" it.

I start the process, save the PID, run nc against my HTTP server, and then send the HUP signal:

On Server:

huptime --exec node huptime.js &
PID=$!
# start nc on client
kill -HUP $PID

On Client:
nc myserver 6000

When I send the HUP signal, nc on the client side exits and the process on the server side terminates. The same process with the example from github (SimpleHTTPServer) seems to work fine. I'm probably doing something wrong, but can't figure out what exactly.

The application code can be pulled from here: https://github.com/shaharke/node-huptime

impl.c:(.text+0x23ef): warning: the use of `tempnam' is dangerous, better use `mkstemp'

$ sudo make install
impl.c:(.text+0x23ef): warning: the use of `tempnam' is dangerous, better use `mkstemp'

I use supervisor to stop and start my applications, but I've been digging around for something that does graceful restarts of my web applications without having to integrate a separate "thing" into each one (i.e. something for node, something for go, etc.).

huptime seems like the answer, but I can't get it to build.

Possibly work on java programs?

Well... not really an issue. I created a very simple Java web server, tried what the doc says, and it failed. I guess that's because it doesn't fall into the category "all programs that are dynamically linked against a modern libc"?

I tried:

  1. Write a simple Java web server, make sure java WebServer can be started, then kill it.
  2. Run huptime java WebServer &. It starts successfully. Get the PID.
  3. Run kill -HUP <pid>.
  4. The old process is killed like this:
    Exit 129 huptime java WebServer

I don't think this needs to be fixed, but it would be interesting if Java were supported.

centos install error

Hi,
my OS is CentOS release 5.6 (Final).

When I executed "make rpm", I got the following error:

make[1]: Entering directory `/root/huptime'
In file included from src/fdinfo.c:22:
src/fdinfo.h: In function ‘alloc_info’:
src/fdinfo.h:149: warning: value computed is not used
src/fdinfo.h:152: warning: value computed is not used
src/fdinfo.h:155: warning: value computed is not used
src/fdinfo.h:158: warning: value computed is not used
src/fdinfo.h:161: warning: value computed is not used
src/fdinfo.h: In function ‘free_info’:
src/fdinfo.h:178: warning: value computed is not used
src/fdinfo.h:185: warning: value computed is not used
src/fdinfo.h:188: warning: value computed is not used
src/fdinfo.h:191: warning: value computed is not used
src/fdinfo.h:194: warning: value computed is not used
src/fdinfo.h: In function ‘inc_ref’:
src/fdinfo.h:203: warning: value computed is not used
src/fdinfo.h: At top level:
src/fdinfo.h:167: warning: ‘dec_ref’ declared inline after being called
src/fdinfo.h:167: warning: previous declaration of ‘dec_ref’ was here
In file included from src/fdtable.h:25,
                 from src/fdtable.c:22:
src/fdinfo.h: In function ‘alloc_info’:
src/fdinfo.h:149: warning: value computed is not used
src/fdinfo.h:152: warning: value computed is not used
src/fdinfo.h:155: warning: value computed is not used
src/fdinfo.h:158: warning: value computed is not used
src/fdinfo.h:161: warning: value computed is not used
src/fdinfo.h: In function ‘free_info’:
src/fdinfo.h:178: warning: value computed is not used
src/fdinfo.h:185: warning: value computed is not used
src/fdinfo.h:188: warning: value computed is not used
src/fdinfo.h:191: warning: value computed is not used
src/fdinfo.h:194: warning: value computed is not used
src/fdinfo.h: In function ‘inc_ref’:
src/fdinfo.h:203: warning: value computed is not used
src/fdinfo.h: At top level:
src/fdinfo.h:167: warning: ‘dec_ref’ declared inline after being called
src/fdinfo.h:167: warning: previous declaration of ‘dec_ref’ was here
In file included from src/impl.c:24:
src/fdinfo.h: In function ‘alloc_info’:
src/fdinfo.h:149: warning: value computed is not used
src/fdinfo.h:152: warning: value computed is not used
src/fdinfo.h:155: warning: value computed is not used
src/fdinfo.h:158: warning: value computed is not used
src/fdinfo.h:161: warning: value computed is not used
src/fdinfo.h: In function ‘free_info’:
src/fdinfo.h:178: warning: value computed is not used
src/fdinfo.h:185: warning: value computed is not used
src/fdinfo.h:188: warning: value computed is not used
src/fdinfo.h:191: warning: value computed is not used
src/fdinfo.h:194: warning: value computed is not used
src/fdinfo.h: In function ‘inc_ref’:
src/fdinfo.h:203: warning: value computed is not used
src/fdinfo.h: At top level:
src/fdinfo.h:167: warning: ‘dec_ref’ declared inline after being called
src/fdinfo.h:167: warning: previous declaration of ‘dec_ref’ was here
src/impl.c: In function ‘do_accept4’:
src/impl.c:1318: error: ‘SOCK_NONBLOCK’ undeclared (first use in this function)
src/impl.c:1318: error: (Each undeclared identifier is reported only once
src/impl.c:1318: error: for each function it appears in.)
src/impl.c: In function ‘do_accept4_retry’:
src/impl.c:1387: error: ‘SOCK_NONBLOCK’ undeclared (first use in this function)
src/impl.c: In function ‘do_syscall’:
src/impl.c:1449: error: ‘SYS_accept4’ undeclared (first use in this function)
make[1]: *** [src/impl.o] Error 1
make[1]: Leaving directory `/root/huptime'
make: *** [rpm] Error 2

Can anyone help me?

tomcat?

Can it handle Tomcat, which has a startup time of several minutes?

And how does it do it? Does it start up two processes, which both want to listen on the same port simultaneously, and somehow control who actually has the port? (I haven't read the C code, and it's been a long time since I wrote networking/IPC code in C, so it would take some time to grasp :)

Compilation fail

I'm on CentOS 6.4, trying to build the rpm, but the compilation fails:

# make rpm
make[1]: Entering directory `/var/apps/huptime'
In file included from src/fdinfo.c:22:
src/fdinfo.h:107: warning: declaration does not declare anything
src/fdinfo.h: In function ‘free_info’:
src/fdinfo.h:152: error: ‘fdinfo_t’ has no member named ‘tracked’
src/fdinfo.h:154: error: ‘fdinfo_t’ has no member named ‘tracked’
src/fdinfo.c: In function ‘info_decode’:
src/fdinfo.c:84: error: ‘fdinfo_t’ has no member named ‘bound’
src/fdinfo.c:85: error: ‘fdinfo_t’ has no member named ‘bound’
src/fdinfo.c:86: error: ‘fdinfo_t’ has no member named ‘bound’
src/fdinfo.c:89: error: ‘fdinfo_t’ has no member named ‘bound’
src/fdinfo.c:92: error: ‘fdinfo_t’ has no member named ‘bound’
src/fdinfo.c:93: error: ‘fdinfo_t’ has no member named ‘bound’
src/fdinfo.c:101: error: ‘fdinfo_t’ has no member named ‘saved’
src/fdinfo.c:101: error: ‘fdinfo_t’ has no member named ‘saved’
src/fdinfo.c:101: error: ‘fdinfo_t’ has no member named ‘saved’
src/fdinfo.c:105: error: ‘fdinfo_t’ has no member named ‘saved’
src/fdinfo.c:105: error: ‘fdinfo_t’ has no member named ‘saved’
src/fdinfo.c:105: error: ‘fdinfo_t’ has no member named ‘saved’
src/fdinfo.c: In function ‘info_encode’:
src/fdinfo.c:133: error: ‘fdinfo_t’ has no member named ‘bound’
src/fdinfo.c:140: error: ‘fdinfo_t’ has no member named ‘saved’
src/fdinfo.c:140: error: ‘fdinfo_t’ has no member named ‘saved’
src/fdinfo.c:140: error: ‘fdinfo_t’ has no member named ‘saved’
src/fdinfo.c:144: error: ‘fdinfo_t’ has no member named ‘saved’
src/fdinfo.c:144: error: ‘fdinfo_t’ has no member named ‘saved’
src/fdinfo.c:144: error: ‘fdinfo_t’ has no member named ‘saved’
make[1]: *** [src/fdinfo.o] Error 1
make[1]: Leaving directory `/var/apps/huptime'
make: *** [rpm] Error 2
# gcc --version
gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-3)
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

I noticed the problem is the union used in fdinfo: the -std=c99 flag doesn't allow that syntax. It works with -std=gnu99.

Node JS not working #2

I had some problems using huptime with my node.js applications.

I've found that even the simplest node.js apps do not want to restart, e.g.

require('http').createServer(function (req, res){
    res.end('FOO');
}).listen(5000, '127.0.0.1',function(){console.log("S")});

when launched with:
huptime --debug --exec node node.js
and HUPped with killall -HUP node

produces

huptime 26101: Mode is exec.
huptime 26101: Unlink is .
huptime 26101: Multi is False.
huptime 26101: Revive is False.
huptime 26101: Wait is False.
huptime 26101: Initializing...
huptime 26101: Exit strategy is exec.
huptime 26101: -wait- 1431
huptime 26101: -acquired- 1431
huptime 26101: -release- 1433
huptime 26101: Saving all initial file descriptors.
huptime 26101: Saved fd 0 (offset -1).
huptime 26101: Saved fd 1 (offset -1).
huptime 26101: Saved fd 2 (offset -1).
huptime 26101: do_close(6, ...) ...
huptime 26101: -wait- 439
huptime 26101: -acquired- 439
huptime 26101: -release- 443
huptime 26101: do_close(6) => 0 (no info)
huptime 26101: Saved environment.
huptime 26101: do_close(6, ...) ...
huptime 26101: -wait- 439
huptime 26101: -acquired- 439
huptime 26101: -release- 443
huptime 26101: do_close(6) => 0 (no info)
huptime 26101: Saved args.
huptime 26101:  arg0=node
huptime 26101:  arg1=node.js
huptime 26101: Saved cwd.
huptime 26101: Saved exe.
huptime 26101: Signal handler installed.
huptime 26101: Initialization complete.
huptime 26101: do_close(14, ...) ...
huptime 26101: -wait- 439
huptime 26101: -acquired- 439
huptime 26101: -release- 443
huptime 26101: do_close(14) => 0 (no info)
huptime 26101: do_bind(15, ...) ...
huptime 26101: -wait- 1116
huptime 26101: -acquired- 1116
huptime 26101: -release- 1210
huptime 26101: do_bind(15, ...) => 0
huptime 26101: do_listen(15, ...) ...
huptime 26101: -wait- 1222
huptime 26101: -acquired- 1222
huptime 26101: -release- 1269
huptime 26101: do_listen(15, 511) => 0
S
huptime 26101: do_accept4(15, ...) ...
huptime 26101: -wait- 1281
huptime 26101: -acquired- 1281
huptime 26101: -release- 1316
huptime 26101: -wait- 1331
huptime 26101: -acquired- 1331
huptime 26101: -release- 1368
huptime 26101: do_accept4(15, ...) => 16 (tracked 1) 
huptime 26101: do_accept4(15, ...) ...
huptime 26101: -wait- 1281
huptime 26101: -acquired- 1281
huptime 26101: -release- 1316
huptime 26101: -wait- 1331
huptime 26101: -acquired- 1331
huptime 26101: -release- 1368
huptime 26101: do_accept4(15, ...) => -1 (tracked 1) Resource temporarily unavailable
huptime 26101: do_close(16, ...) ...
huptime 26101: -wait- 439
huptime 26101: -acquired- 439
huptime 26101: -release- 451
huptime 26101: do_close(16) => 0 (0 tracked)
huptime 26101: -wait- 1004
huptime 26101: -acquired- 1004
huptime 26101: Exit started -- this is the master.
huptime 26101: do_dup(15, ...) ...
huptime 26101: -wait- 172
huptime 26101: -acquired- 172
huptime 26101: -release- 189
huptime 26101: do_dup(15) => 6 (with info)
huptime 26101: do_close(18, ...) ...
huptime 26101: -wait- 439
huptime 26101: -acquired- 439
huptime 26101: -release- 443
huptime 26101: do_close(18) => 0 (no info)
huptime 26101: do_dup3(16, 15, ...) ...
huptime 26101: -wait- 386
huptime 26101: -acquired- 386
huptime 26101: -release- 421
huptime 26101: do_dup3(16, 15, ...) => 15
huptime 26101: Replaced FD 15 with dummy.
huptime 26101: Exit strategy is exec.
huptime 26101: No active connections, finishing exit.
huptime 26101: See you soon...
huptime 26101: Preparing for exec...
huptime 26101: Encoded fd 3 (type 3).
huptime 26101: Encoded fd 4 (type 3).
huptime 26101: Encoded fd 5 (type 3).
huptime 26101: Encoded fd 6 (type 1).
huptime 26101: Finished encoding.
huptime 26101: Doing exec()... bye!

While this simple example sometimes worked, my main application refused to work and just closed when it received the HUP signal.

Also, while testing, I encountered a segmentation fault when using the --wait flag.

FYI
$ node -v
v0.10.3
$ huptime --version
0.1-25.gd29d9e7
