Comments (6)

solarkennedy avatar solarkennedy commented on May 26, 2024

I think I agree. That init script was probably stolen from
https://gist.github.com/stojg/d0d96a123761830c0ff1

Can you PR?

from puppet-consul.

runswithd6s avatar runswithd6s commented on May 26, 2024

EvanKrall avatar EvanKrall commented on May 26, 2024

Would consul leave on a server cause the other servers to decrease the expected number of servers w.r.t. quorum calculations? That would be somewhat scary.

runswithd6s avatar runswithd6s commented on May 26, 2024

It appears that bootstrap-expect is a "maybe" condition that only applies if no raft data has been initialized. If you've already bootstrapped, it is ignored. The expect/quorum is managed dynamically based on the number of hosts in the cluster. We've been playing with a Vagrant build of a 3-server cluster and ran into issues that led us to recognize this behavior. Both consul leave and kill -INT PROCESSID will cause the node to leave the cluster. A leader can be elected with the remaining two hosts, due to the time-based elections, but they will not expect a 3rd until you restart and rejoin.
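
To make the leave-versus-die difference concrete, here is a rough sketch of the majority arithmetic as I understand it. It is a simplified model, not Consul's code: the function names are made up for illustration, and it assumes a graceful leave removes the server from the peer set while a killed server stays in it.

```python
from typing import Tuple


def quorum(peers: int) -> int:
    """Raft-style majority: more than half of the known peer set."""
    return peers // 2 + 1


def can_elect_leader(peers: int, alive: int) -> bool:
    """A leader needs a majority of the known peers to be reachable."""
    return peers > 0 and alive >= quorum(peers)


def after_leave(peers: int, alive: int, n: int = 1) -> Tuple[int, int]:
    """Assumption: a graceful leave shrinks the peer set itself."""
    return peers - n, alive - n


def after_kill(peers: int, alive: int, n: int = 1) -> Tuple[int, int]:
    """Assumption: a killed server stays in the peer set; only the live count drops."""
    return peers, alive - n


if __name__ == "__main__":
    for name, step in (("leave", after_leave), ("kill", after_kill)):
        peers, alive = 3, 3
        for _ in range(3):
            peers, alive = step(peers, alive)
            print(name, "peers:", peers, "alive:", alive,
                  "leader possible:", can_elect_leader(peers, alive))
```

In this model, servers that leave one by one keep shrinking the peer set until nothing is left to come back to, while killed servers leave the peer set at 3, so the cluster regains a majority once two of them are running again.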

We've run into issues with using bootstrap-expect to elect leaders at all and have needed to manually bootstrap the cluster. This closed bug plagues us now: hashicorp/consul#370. Here's the comment that is telling for us:

All of the server nodes ever leaving is not considered a
standard operating case. The servers are long running, and if
you expect to run more than one for HA, it is an outage
scenario if only one is running. In the case of all the
machines losing power / failing, when they start up it will
automatically heal. In the case of all nodes leaving the
cluster and shutting down, that is not considered a normal
mode of operation.

Recovery for this: https://www.consul.io/docs/guides/outage.html

If all three servers have 'left' the cluster, recovering from this logically forced outage isn't as simple as clearing out the peers.json file. Leaders can't be elected, and the KV data is effectively lost. Perhaps it is because the nodes are no longer in bootstrap mode. Who knows. We have found that sending SIGKILL to the consul service preserves its state, at least to the extent that the nodes can rejoin the cluster without losing information.

From a pragmatic position, it is better for a 'server' process to 'die' rather than 'leave'. For agent services, it's a different story.
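
As a sketch of what that could look like in a stop routine: the helper names below (stop_plan, stop_consul) are hypothetical and not part of puppet-consul or Consul. It assumes a POSIX host, that the caller already knows the process ID, and that consul is on the PATH. The choice of SIGKILL for servers only reflects what we observed above, not an upstream recommendation.

```python
import os
import signal
import subprocess


def stop_plan(is_server: bool) -> str:
    """Pick the stop strategy: servers 'die', plain agents 'leave'."""
    return "kill" if is_server else "leave"


def stop_consul(pid: int, is_server: bool) -> None:
    """Stop a consul process according to its role (hypothetical helper)."""
    if stop_plan(is_server) == "kill":
        # Server: terminate without a graceful leave so it stays in the peer set.
        os.kill(pid, signal.SIGKILL)
    else:
        # Agent: leave the cluster gracefully via the local agent.
        subprocess.run(["consul", "leave"], check=True)
```

An init script would call something like stop_consul(pid, is_server=True) on the server nodes and stop_consul(pid, is_server=False) everywhere else.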

solarkennedy avatar solarkennedy commented on May 26, 2024

I'm down with this. The upstream upstart scripts don't leave:
https://github.com/hashicorp/consul/tree/0c7ca91c74587d0a378831f63e189ac6bf7bab3f/terraform/aws/scripts

@runswithd6s if you make a PR I will accept it, or I can do it myself.

runswithd6s avatar runswithd6s commented on May 26, 2024

Ok. We're almost done with the investigation spike. I'll talk to our team to see if I can carve out some time to do it.
