Comments (18)

Kosta-Github avatar Kosta-Github commented on August 31, 2024

OK, after browsing the internet a bit more and reading up, I came to the conclusion that multiple running HAProxy instances might not be a problem, since the older ones should exit on their own once their last open connection is closed.

But then there might be another cause for the behavior I am observing:

As soon as the set of running services changes and a new HAProxy config file is generated and reloaded by HAProxy, I get HTTP connection errors for a very short period (~1 sec). This happens across services/apps: e.g., when scaling/deploying service_a, I also get connection errors for a running service_b.

Any ideas?

from panteras.

sielaq avatar sielaq commented on August 31, 2024

Re: multiple HAProxy processes: this is normal behavior.
Re: the second question: we have not had this problem at all; so far, an HAProxy reload has never caused any issue for us.

Do you have the problem with every running service when HAProxy reloads?
Do you have the problem only when scaling down, or also when scaling up?

Kosta-Github avatar Kosta-Github commented on August 31, 2024

I just set up a very simple node.js based service (it just echoes the HTTP headers that were passed to it) for testing the fail-over behavior of HAProxy.

This service gets stressed by another node.js based script, which runs 10 requests in parallel in a loop hitting the echo service from above.
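
For anyone who wants to reproduce this kind of test without node.js, here is a rough shell analogue. It is a sketch only, not the script used above: the function names, the `URL`, `WORKERS` and `REQUESTS` variables, and the placeholder URL are all assumptions, and it reports curl exit codes rather than node.js error names.

```shell
#!/usr/bin/env bash
# Sketch: WORKERS parallel loops, each sending REQUESTS requests to URL.
# All names and defaults here are illustrative placeholders.
URL="${URL:-http://echo.service.my_company.com}"
WORKERS="${WORKERS:-10}"
REQUESTS="${REQUESTS:-100}"

# Send one request; print "ok" or the curl exit code on failure.
probe() {
  rc=0
  curl --silent --output /dev/null --max-time 20 "$1" || rc=$?
  if [ "$rc" -eq 0 ]; then
    echo "ok"
  else
    echo "request error: curl exit $rc"
  fi
}

# One worker: send REQUESTS requests sequentially, one log line each.
worker() {
  id="$1"
  i=0
  while [ "$i" -lt "$REQUESTS" ]; do
    echo "stress-test $id: $(probe "$URL")"
    i=$((i + 1))
  done
}

# Start all workers in the background and wait for them to finish.
stress() {
  w=0
  while [ "$w" -lt "$WORKERS" ]; do
    worker "$w" &
    w=$((w + 1))
  done
  wait
}
```

Running `WORKERS=10 REQUESTS=1000 stress` while a reload is triggered should show whether any requests fail. Each curl call opens a new connection, so this tests the no-keep-alive case.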

As soon as the HAProxy config file gets changed (due to the deployment of a new service, scaling an existing service up or down, ...) I get some failing requests reported, e.g.:

...
 stress-test 4: remote address: ::ffff:10.50.80.218, service host: 0e0a26b73ff8, duration: 32
 stress-test 5: remote address: ::ffff:10.50.80.218, service host: 0e0a26b73ff8, duration: 32
 stress-test 3: remote address: ::ffff:10.50.80.218, service host: 0e0a26b73ff8, duration: 32
 stress-test 6: remote address: ::ffff:10.50.80.218, service host: 0e0a26b73ff8, duration: 31
 stress-test 8: request error: Error: read ECONNRESET, duration: 27
 stress-test 7: request error: Error: read ECONNRESET, duration: 27
 stress-test 9: request error: Error: read ECONNRESET, duration: 27
 stress-test 1: request error: Error: read ECONNRESET, duration: 30
 stress-test 0: remote address: ::ffff:10.50.80.218, service host: 0e0a26b73ff8, duration: 29
 stress-test 2: remote address: ::ffff:10.50.80.218, service host: 0e0a26b73ff8, duration: 32
 stress-test 3: remote address: ::ffff:10.50.80.218, service host: 0e0a26b73ff8, duration: 30
...

or sometimes connection timeouts like this:

...
  stress-test 0: remote address: ::ffff:10.50.80.218, service host: 1088cc6d638d, duration: 27
  stress-test 0: remote address: ::ffff:10.50.80.218, service host: 4382a069d20c, duration: 30
  stress-test 0: remote address: ::ffff:10.50.80.218, service host: 6e97e38e6d33, duration: 34
  stress-test 2: request error: Error: connect ETIMEDOUT, duration: 20099
  stress-test 3: request error: Error: connect ETIMEDOUT, duration: 20099
  stress-test 9: request error: Error: connect ETIMEDOUT, duration: 20099
  stress-test 5: request error: Error: connect ETIMEDOUT, duration: 20099
  stress-test 8: request error: Error: connect ETIMEDOUT, duration: 20099
  stress-test 6: request error: Error: connect ETIMEDOUT, duration: 21000
  stress-test 4: request error: Error: connect ETIMEDOUT, duration: 20099
  stress-test 1: request error: Error: connect ETIMEDOUT, duration: 21000
  stress-test 7: request error: Error: connect ETIMEDOUT, duration: 20099
  stress-test 0: remote address: ::ffff:10.50.80.218, service host: 980e24577611, duration: 28
  stress-test 2: remote address: ::ffff:10.50.80.218, service host: 0e0a26b73ff8, duration: 28
  stress-test 3: remote address: ::ffff:10.50.80.218, service host: 1088cc6d638d, duration: 30
...

sielaq avatar sielaq commented on August 31, 2024

I think this is a kind of HAProxy design problem: during a reload the port is unavailable for a few milliseconds. It is especially exposed if you test without keep-alive. You should rather simulate browser behavior: check Apache ab (from the apache2-utils package), which supports keep-alive, and then run:
ab -n 10000 -c 10 -k http://echo.service.my_company.com
This will give you more reliable results.

We are considering migrating to a different load balancer. One of our internal developers wrote a Golang, Consul-based router; we are waiting for it to be open-sourced (it takes some time to open-source any internal tool).

sielaq avatar sielaq commented on August 31, 2024

If this is still bothering you, change this in generate_yml.sh:

HAPROXY_RELOAD_COMMAND="iptables -I INPUT -p tcp --dport 80 --syn -j DROP; sleep 1; /usr/sbin/haproxy -p /tmp/haproxy.pid -f /etc/haproxy/haproxy.cfg -sf $(pidof /usr/sbin/haproxy); iptables -D INPUT -p tcp --dport 80 --syn -j DROP || true"

I just saw that this option cannot be overridden from an external ENV variable, man man man.
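
The reload command above is easier to follow when split into its three steps. This is a minimal sketch using the same paths and port; the `run` helper and the `DRY_RUN` variable are hypothetical additions for illustration and are not part of PanteraS.

```shell
#!/usr/bin/env bash
# Sketch of the reload one-liner, split into steps.
# DRY_RUN and run() are illustrative helpers; by default nothing is executed.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

reload_haproxy() {
  # PIDs of the currently running instance(s); empty if none is running.
  old_pids="$(pidof /usr/sbin/haproxy 2>/dev/null || true)"

  # 1. Drop incoming SYN packets on port 80, so new connection attempts are
  #    retried by the client instead of hitting a port nobody listens on.
  run iptables -I INPUT -p tcp --dport 80 --syn -j DROP
  run sleep 1

  # 2. Start the new process; -sf tells the old PIDs to finish their
  #    connections and exit. $old_pids is unquoted on purpose (one arg per PID).
  run /usr/sbin/haproxy -p /tmp/haproxy.pid -f /etc/haproxy/haproxy.cfg -sf $old_pids

  # 3. Remove the DROP rule so new connections are accepted again.
  run iptables -D INPUT -p tcp --dport 80 --syn -j DROP || true
}

reload_haproxy
```

With the default `DRY_RUN=1` the script only prints the commands; `DRY_RUN=0` executes them and needs root. The trade-off of this approach is that connection attempts arriving during the window are delayed until the client retransmits its SYN.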

sielaq avatar sielaq commented on August 31, 2024

More info:
http://inside.unbounce.com/product-dev/haproxy-reloads/
http://engineeringblog.yelp.com/2015/04/true-zero-downtime-haproxy-reloads.html

Kosta-Github avatar Kosta-Github commented on August 31, 2024

Thx for the pointers; will have a closer look tomorrow in the office.

I hoped/thought that HAProxy could handle those cases out-of-the-box... :-(

Do you have experience with nginx? Can it handle that reload case more gracefully?

sielaq avatar sielaq commented on August 31, 2024

Yes, we have experience with nginx, Apache and Varnish.
Nginx could be a better choice, and Varnish too (but Varnish has only implemented it since version 4).
The iptables solution should also do the trick with HAProxy. The first method (from the first link) is better but much more complicated; the second one, with tc and nl-qdisc-add, could be easy to implement.
Also worth mentioning: there is no faster LB than HAProxy.

IMHO none of those methods fully fits a Consul cluster, because you always need consul-template "glue code"; that's why some of my colleagues decided to write their own router.

sielaq avatar sielaq commented on August 31, 2024

I have just tested the solution with nl-qdisc-add, which buffers all the traffic for a few milliseconds, and it works really well.
I think I can adapt this into PanteraS; this will definitely satisfy you.
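
As a rough illustration of the buffering idea, the sketch below wraps a reload command between a "buffer" and a "release" call on a plug qdisc. It assumes that a plug qdisc (id 40:, parent 1:4, on device lo) and a filter steering the relevant SYN packets into it have already been set up; those ids, the device, the helper names, and `DRY_RUN` are all hypothetical. The `--buffer` and `--release-indefinite` flag spellings follow libnl's plug module as I understand it and should be checked against the installed nl-qdisc-add.

```shell
#!/usr/bin/env bash
# Sketch: hold new connections in a plug qdisc while HAProxy reloads.
# Device, qdisc ids, and helper names are assumptions, not PanteraS code.
DRY_RUN="${DRY_RUN:-1}"
DEV="${DEV:-lo}"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Update the existing plug qdisc with the given action flag.
plug() {
  run nl-qdisc-add --dev="$DEV" --parent=1:4 --id=40: --update plug "$1"
}

# Usage: buffered_reload <reload command and its arguments>
buffered_reload() {
  plug --buffer               # start queuing matching packets
  run "$@"                    # the reload itself
  plug --release-indefinite   # flush the queue and stop buffering
}

buffered_reload /usr/sbin/haproxy -p /tmp/haproxy.pid -f /etc/haproxy/haproxy.cfg -sf
```

In a real run the PIDs of the old HAProxy processes would be appended after `-sf`. The release step runs even if the reload command fails, so packets are not left queued.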

Kosta-Github avatar Kosta-Github commented on August 31, 2024

Yes, definitely! :-)

Thank you for all your suggestions and hard work!

sielaq avatar sielaq commented on August 31, 2024

fixed

Kosta-Github avatar Kosta-Github commented on August 31, 2024

thx

Kosta-Github avatar Kosta-Github commented on August 31, 2024

This change didn't fix the issue for me; I am observing the exact same errors as noted above... :-(

sielaq avatar sielaq commented on August 31, 2024

Yeah, I forgot to include one part; I was wondering where to put it and then forgot about it.

sielaq avatar sielaq commented on August 31, 2024

Oh man, this is not working as expected; depending on the buffer size it can even slow things down a lot.
I am starting to think about implementing the best option: two HAProxy daemons and an iptables switch.

sielaq avatar sielaq commented on August 31, 2024

I have a PoC that uses iptables to switch between two HAProxy instances,
and this one is really working fine. I have tested it well;
it works much better than queuing/buffering.
I will push the changes tomorrow; I still have to make some cosmetic changes.
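
For readers who want a picture of the idea, here is a minimal sketch of switching between two HAProxy instances with an iptables REDIRECT rule. It is not the PanteraS implementation: the ports 8080/8081, the per-port config and PID file paths, the helper names, and `DRY_RUN` are all hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: flip port 80 between two HAProxy instances via iptables REDIRECT.
# Ports, paths, and helper names are assumptions for illustration only.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Given the active port, print the port of the idle slot.
other_port() {
  if [ "$1" = "8080" ]; then echo 8081; else echo 8080; fi
}

# Usage: switch_haproxy <currently active port>
switch_haproxy() {
  active_port="$1"
  new_port="$(other_port "$active_port")"

  # 1. Start a fresh instance on the idle port; its config must bind there.
  run /usr/sbin/haproxy -D -p "/tmp/haproxy_${new_port}.pid" \
    -f "/etc/haproxy/haproxy_${new_port}.cfg"

  # 2. Send new connections to the fresh instance first ...
  run iptables -t nat -I PREROUTING -p tcp --dport 80 -j REDIRECT --to-ports "$new_port"

  # 3. ... then remove the rule that pointed at the old instance.
  #    Established connections keep their existing NAT mapping.
  run iptables -t nat -D PREROUTING -p tcp --dport 80 -j REDIRECT --to-ports "$active_port"

  # 4. Ask the old instance to finish its connections and exit (soft stop).
  run pkill -USR1 -F "/tmp/haproxy_${active_port}.pid"
}

switch_haproxy 8080
```

Because the listening port is never closed from the client's point of view, no connection attempt has to be dropped or buffered. Note that PREROUTING only covers traffic arriving from other hosts; locally generated traffic would need a matching rule in the OUTPUT chain.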

sielaq avatar sielaq commented on August 31, 2024

Please verify whether this is working fine.

Kosta-Github avatar Kosta-Github commented on August 31, 2024

This seems to work now and also with my keepalived PR (#68)...

Thanks for that.
