
consul-on-kubernetes's People

Contributors

blachniet, kelseyhightower


consul-on-kubernetes's Issues

No cluster leader

When I call the HTTP service from my app, it returns an error with this message: No cluster leader

Request for documentation/guidance on setting up consul agents

We've been setting up Consul in k8s, using a setup similar to what you have here, but how do you handle individual Consul agents? Using the pod abstraction, it would seem we'd have an agent per pod, which is how we've done it, but that inflates the deployment descriptor for each pod, couples the pod description to the Consul server, and imposes unnecessary overhead on the Consul servers and on resources in the k8s cluster. It seems it would be more efficient to run a single Consul agent per node. I looked into DaemonSet to implement that, but there didn't seem to be a convenient way to associate the pods on a node with that node's agent. Have you run into this, or is 1:1 agent/pod the way to go?

The documentation in DaemonSet suggests the following without guidance:

Clients know the list of nodes ips somehow, and know the port by convention.

I did a PoC of this, attempting to provide the hostIP to a pod via the downward API, but hostIP isn't supported.

It seems in order to use DaemonSet, one would have to write an additional tool to watch the apiserver for pod creation events and then register each pod with its node's consul agent. It doesn't seem too hard, but I thought there might be a simpler way.
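For what it's worth, later Kubernetes releases do expose the node IP through the downward API, which makes the agent-per-node (DaemonSet) pattern easier. A minimal, untested sketch of how a pod could find its node's agent that way; it assumes the DaemonSet exposes the agent on host port 8500, and the variable names are illustrative:

# Pod spec fragment (hypothetical): inject the node IP so the app talks to
# the Consul agent that a DaemonSet runs on the same node (hostPort 8500 assumed).
env:
  - name: HOST_IP
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP
  - name: CONSUL_HTTP_ADDR
    value: "$(HOST_IP):8500"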

Data dir permissions issue in Minikube

Hello,

I'm following this tutorial on a local minikube cluster and Consul fails with this error:

$ kubectl logs -f consul-0
==> CRITICAL: Permission denied for data folder at "/var/lib/consul/mdb"!
==> Consul will refuse to boot without access to this directory.
==> Please correct permissions and try starting again.

I was able to get around it by using /consul/data as the mountPath. The Consul image entrypoint sets the right permissions for this path when launching Consul.

I'm not sure if this error is specific to Minikube, but I imagine volume mounts in general would be owned by root and would have the same issue. I guess the issue here is that Consul runs as a non-root user, which is unconventional for Docker images.
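For reference, a sketch of the workaround described above; the volume name data is illustrative and should match whatever the StatefulSet's volumeClaimTemplates use:

volumeMounts:
  - name: data
    mountPath: /consul/data   # the image entrypoint fixes ownership of this path at startup
# ...and point the agent's -data-dir at the same path, e.g. -data-dir=/consul/data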

[ERR] agent: Coordinate update error: No cluster leader

I have deployed the latest Consul version on Kubernetes v1.10.0, but the Consul pod's log shows these error messages:
2018/07/20 11:26:11 [WARN] agent: Check "service:ribbon-consumer" HTTP request failed: Get http://DESKTOP-MCQSJ49:8504/health: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2018/07/20 11:26:15 [ERR] agent: failed to sync remote state: No cluster leader
2018/07/20 11:26:16 [ERR] agent: Coordinate update error: No cluster leader

The cluster doesn't work correctly.

How do you register service health check

I got k8s services to register with Consul, but I'm having issues exposing or doing health checks on the service itself, like /ping or /health. Any pointers would be really great.
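Not part of this repo, but for context: a common way to have Consul health-check a service is to include a check in the service definition registered with the agent (via a file in its -config-dir or the /v1/agent/service/register API). A sketch with illustrative names and endpoints:

{
  "service": {
    "name": "my-app",
    "port": 8080,
    "check": {
      "http": "http://localhost:8080/health",
      "interval": "10s",
      "timeout": "2s"
    }
  }
}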

statefulset Replicas status is desired

Hi,
when I run this command: kubectl describe statefulset consul
it shows:
Replicas: 3 desired | 0 total
Pods Status: 0 Running / 0 Waiting / 0 Succeeded / 0 Failed
It didn't create any pods. How can I solve this problem?
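When a StatefulSet reports desired replicas but creates no pods, the controller or scheduler usually records the reason as an event. Two standard kubectl checks, not specific to this repo:

kubectl describe statefulset consul          # look at the Events section at the bottom
kubectl get events --sort-by=.lastTimestamp  # recent cluster events, e.g. PVC or quota failures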

Deploy it on bare metal

Hey! Nice project!

But how can I use it on a bare-metal k8s cluster?

I got pods stuck in the "Pending" state with this error:

pod has unbound immediate PersistentVolumeClaims (repeated 7 times)

Please help.
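On bare metal there is usually no default StorageClass or dynamic provisioner, so the PVCs created by the StatefulSet stay Pending until PersistentVolumes exist; the pv.yaml example in the "Creating Persistent Volumes" issue below shows one way to create them by hand. To see why a claim is unbound (the claim name below is illustrative and follows the <volume>-<pod> pattern):

kubectl get pvc
kubectl describe pvc data-consul-0   # Events typically show "no persistent volumes available" or a missing storage class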

PetSets

Wouldn't PetSets help reduce some of the boilerplate?

  • auto provision persistent disks
  • single petset configuration instead of 3 deployments

Creating Persistent Volumes

I had to manually create persistent volumes in order to get this example to work. Is that expected, or is there likely something wrong with my configuration of minikube?

If it's expected that the user should manually create the persistent volumes, I'd be happy to update the instructions and share my example. Here's what I did:

kubectl create -f pv.yaml
# pv.yaml
kind: PersistentVolume
apiVersion: v1
metadata:
  name: pv0
  labels:
    type: local
spec:
  storageClassName: default
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/tmp/data/pv0"
---
kind: PersistentVolume
apiVersion: v1
metadata:
  name: pv1
  labels:
    type: local
spec:
  storageClassName: default
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/tmp/data/pv1"
---
kind: PersistentVolume
apiVersion: v1
metadata:
  name: pv2
  labels:
    type: local
spec:
  storageClassName: default
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/tmp/data/pv2"

Consul On Kubernetes

@kelseyhightower How does running Consul on Kubernetes help? How can anyone register Kubernetes services in Consul and use it as a service discovery tool over DNS? I can't find any reference to using Consul as a service discovery tool for Kubernetes.

Kindly share your thoughts/notes on whether this is a viable solution.
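For reference, once services are registered with a Consul agent, discovery over DNS is done against the agent's DNS interface on port 8600 (standard Consul behavior; the service name here is illustrative):

dig @127.0.0.1 -p 8600 my-app.service.consul SRV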

Consul 1.0.1 breaks reading the configmap config

When running Consul 1.0.1, I get the following error when attempting to spin up Consul using the artifacts in this tutorial:

Percival-Derpington:manifests mattcase$ kubectl logs consul-0
==> config: ReadFile failed on /consul/config/..data: read /consul/config/..data: is a directory

Using the exact same configmap but utilizing 1.0.0 results in proper startup:

Percival-Derpington:manifests mattcase$ kubectl logs consul-0
==> Starting Consul agent...
bootstrap_expect > 0: expecting 3 servers
==> Consul agent running!
           Version: 'v1.0.0'
           Node ID: '4a369f12-a478-529f-c426-07db9baa7ac0'
         Node name: 'consul-0'
        Datacenter: 'dc1' (Segment: '<all>')
            Server: true (Bootstrap: false)
       Client Addr: [0.0.0.0] (HTTP: 8500, HTTPS: 8443, DNS: 8600)
      Cluster Addr: 100.111.100.48 (LAN: 8301, WAN: 8302)
           Encrypt: Gossip: true, TLS-Outgoing: true, TLS-Incoming: true
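One workaround that has been reported for this, offered as an assumption rather than something this repo ships: point the agent at the config file itself instead of the directory, so the ..data symlink that a ConfigMap volume mount creates is never scanned. The file name below is illustrative:

consul agent \
  -config-file=/consul/config/server.json   # instead of -config-dir=/consul/config
  # ...plus the rest of the flags already passed by the StatefulSet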

No cluster leader

Hello. I did everything according to your instructions, but unfortunately nothing works.

If I try to list the peers:

consul operator raft -list-peers -token=e986a3da-f69d-4d82-b497-cbc0cb382596
Operator "raft" subcommand failed: Unexpected response code: 500 (No cluster leader)

List of pods seems ok:

kubectl get pods
NAME                        READY   STATUS    RESTARTS   AGE
consul-1-587bf84975-xv8pb   1/1     Running   0          3d1h
consul-2-9c7bcb99b-75dpw    1/1     Running   0          3d1h
consul-3-5699d494b8-twd2s   1/1     Running   0          3d1h

If I run kubectl port-forward (e.g. for consul-2), I see only the one member I pointed it at:

# consul members
Node      Address            Status  Type    Build  Protocol  DC         Segment
worker-2  10.233.33.42:8301  alive   server  0.7.2  2         devsecops  <all>

Logs say the same thing:

2019/12/09 13:45:52 [ERR] agent: failed to sync remote state: No cluster leader
2019/12/09 13:45:56 [ERR] agent: coordinate update error: No cluster leader
2019/12/09 13:46:23 [ERR] agent: failed to sync remote state: No cluster leader
2019/12/09 13:46:27 [ERR] agent: coordinate update error: No cluster leader
2019/12/09 13:46:54 [ERR] agent: coordinate update error: No cluster leader
2019/12/09 13:46:59 [ERR] agent: failed to sync remote state: No cluster leader
2019/12/09 13:47:20 [ERR] agent: coordinate update error: No cluster leader
2019/12/09 13:47:22 [ERR] agent: failed to sync remote state: No cluster leader
2019/12/09 13:47:46 [ERR] agent: failed to sync remote state: No cluster leader
2019/12/09 13:47:49 [ERR] agent: coordinate update error: No cluster leader
2019/12/09 13:48:13 [ERR] agent: coordinate update error: No cluster leader
2019/12/09 13:48:16 [ERR] agent: failed to sync remote state: No cluster leader

The UI doesn't work either.

I hope you can help me! Thanks.

Doubt about the TLS certificates

Hi,

I don't know much about certificates, but I could not understand why you used "server.dc1.cluster.local" as the CN and hosts. Where did you get this name from? Wasn't it supposed to be the server URL, something more like consul.$(NAMESPACE).svc.cluster.local, where I replace $(NAMESPACE) with the namespace?

Thanks in advance,

Paulo Leal
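For context (general Consul TLS behavior, not specific to this repo): when verify_server_hostname is enabled, agents expect server certificates to be issued for server.<datacenter>.<domain>. If the agents run with datacenter dc1 and domain cluster.local, which the certificate name suggests, the expected name is server.dc1.cluster.local, independent of the Kubernetes Service DNS name. The related config keys look like this:

{
  "datacenter": "dc1",
  "domain": "cluster.local",
  "verify_incoming": true,
  "verify_outgoing": true,
  "verify_server_hostname": true
}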

Testing failing nodes does not restore the cluster....

$ kubectl delete pods consul-2 consul-1;

HTTP error code from Consul: 500 Internal Server Error

This is an error page for the Consul web UI. You may have visited a URL that is loading an unknown resource, so you can try going back to the root.

Otherwise, please report any unexpected issues on the GitHub page.
$ kubectl exec --tty -i consul-0 -- consul members
Node      Address           Status  Type    Build  Protocol  DC
consul-0  100.96.4.13:8301  alive   server  0.7.2  2         dc1
consul-1  100.96.7.6:8301   alive   server  0.7.2  2         dc1
consul-2  100.96.6.12:8301  alive   server  0.7.2  2         dc1

$ kubectl get pods -o wide
NAME           READY     STATUS    RESTARTS   AGE       IP            NODE
consul-0       1/1       Running   0          7h        100.96.4.13   ip-10-117-89-126.eu-west-1.compute.internal
consul-1       1/1       Running   0          8m        100.96.7.6    ip-10-117-97-131.eu-west-1.compute.internal
consul-2       1/1       Running   0          8h        100.96.6.12   ip-10-117-37-128.eu-west-1.compute.internal
docker-debug   1/1       Running   0          10h       100.96.6.2    ip-10-117-37-128.eu-west-1.compute.internal

$ kubectl  exec --tty -i consul-0 -- consul operator raft -list-peers
Operator "raft" subcommand failed: Unexpected response code: 500 (No cluster leader)

$ kubectl  exec --tty -i consul-0 -- consul members
Node      Address           Status  Type    Build  Protocol  DC
consul-0  100.96.4.13:8301  alive   server  0.7.2  2         dc1
consul-1  100.96.7.6:8301   alive   server  0.7.2  2         dc1
consul-2  100.96.6.12:8301  alive   server  0.7.2  2         dc1

$ kubectl  exec --tty -i consul-0 -- consul monitor
...
2017/01/20 10:50:59 [WARN] raft: Election timeout reached, restarting election
2017/01/20 10:50:59 [INFO] raft: Node at 100.96.4.13:8300 [Candidate] entering Candidate state in term 4324
2017/01/20 10:50:59 [ERR] raft: Failed to make RequestVote RPC to {Voter 100.96.6.10:8300 100.96.6.10:8300}: dial tcp 100.96.6.10:8300: getsockopt: no route to host
2017/01/20 10:50:59 [ERR] raft: Failed to make RequestVote RPC to {Voter 100.96.6.11:8300 100.96.6.11:8300}: dial tcp 100.96.6.11:8300: getsockopt: no route to host
2017/01/20 10:50:59 [ERR] raft: Failed to make RequestVote RPC to {Voter 100.96.4.7:8300 100.96.4.7:8300}: dial tcp 100.96.4.7:8300: getsockopt: no route to host
2017/01/20 10:51:01 [ERR] raft: Failed to make RequestVote RPC to {Voter 100.96.7.5:8300 100.96.7.5:8300}: dial tcp 100.96.7.5:8300: getsockopt: no route to host
2017/01/20 10:51:05 [WARN] raft: Election timeout reached, restarting election
2017/01/20 10:51:05 [INFO] raft: Node at 100.96.4.13:8300 [Candidate] entering Candidate state in term 4325
2017/01/20 10:51:08 [ERR] raft: Failed to make RequestVote RPC to {Voter 100.96.4.7:8300 100.96.4.7:8300}: dial tcp 100.96.4.7:8300: getsockopt: no route to host
2017/01/20 10:51:08 [ERR] raft: Failed to make RequestVote RPC to {Voter 100.96.6.10:8300 100.96.6.10:8300}: dial tcp 100.96.6.10:8300: getsockopt: no route to host
2017/01/20 10:51:08 [ERR] raft: Failed to make RequestVote RPC to {Voter 100.96.6.11:8300 100.96.6.11:8300}: dial tcp 100.96.6.11:8300: getsockopt: no route to host
2017/01/20 10:51:08 [ERR] raft: Failed to make RequestVote RPC to {Voter 100.96.7.5:8300 100.96.7.5:8300}: dial tcp 100.96.7.5:8300: getsockopt: no route to host
2017/01/20 10:51:12 [INFO] agent.rpc: Accepted client: 127.0.0.1:42080
...
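The symptoms above (raft repeatedly voting for old pod IPs that no longer exist) match the outage-recovery scenario in the Consul documentation. A hedged sketch of that procedure for the 0.7.x servers shown here, not a confirmed fix for this report: stop the agents, write a raft/peers.json file in each server's data dir listing the current server addresses, then restart them. The format below is the pre-raft-protocol-3 one; double-check it against your version.

# /var/lib/consul/raft/peers.json (data dir per this repo's StatefulSet)
["100.96.4.13:8300", "100.96.7.6:8300", "100.96.6.12:8300"]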

Generate TLS Certificates error as must specify -cert or -domain

Hi @kelseyhightower

I am just following the document to deploy Consul in a k8s cluster. When I try to execute the Generate TLS Certificates steps, I get the error below.

cfssl gencert -initca ca/ca-config.json | cfssljson -bare ca
Must specify bundle target through -cert or -domain
Failed to parse input: unexpected end of JSON input

Can you please help me figure out what I am missing here?

Thanks,
Sakthi
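For comparison, cfssl's CA bootstrap expects the CSR definition rather than the signing config. If the repo follows the usual cfssl layout (an assumption; check the README), the command would look like:

cfssl gencert -initca ca/ca-csr.json | cfssljson -bare ca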

Using Consul to source a ConfigMap

Hi,

Does anybody know how I can populate a ConfigMap from Consul to configure my application? The idea is that I push the configuration into Consul, Kubernetes reads the ConfigMap generated from it, and uses it to configure the app.

Does anybody understand what I mean?
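Not something this repo provides, but one hedged sketch of that flow: keep the config in the Consul KV store and have a small sync step regenerate the ConfigMap from it. The tools are standard; the key and names are illustrative, and older kubectl uses --dry-run instead of --dry-run=client:

# Export a key from Consul and (re)create the ConfigMap from it.
consul kv get myapp/config > config.yaml
kubectl create configmap myapp-config --from-file=config.yaml \
  --dry-run=client -o yaml | kubectl apply -f -

A watcher such as consul-template (or a small sidecar) can rerun the same step whenever the key changes.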

Connection refused

Hello there,

what I ran into was a "connection refused" problem.

I've successfully deployed a cluster; I got this problem when deploying another one.

Any assistance would be appreciated.

Pod logs were attached as a screenshot (not reproduced here).

ACLs Disabled

Hi, I've set up a 3-node Kubernetes Consul cluster and am able to access the Consul UI (localhost:8500/ui). Further, I'm trying to test the services by logging into the individual Consul nodes and issuing curl -l http://localhost:8500/ui/ as a health check, but I get the error message "ACLs are disabled in this Consul cluster. This is the default behavior, as you have to explicitly enable them".

So I'm trying to enable ACLs by making the following configuration under /consul/config/master.json and restarting all three nodes one by one, but the ACLs still do not appear to be enabled, and I'm not sure whether there are issues in the configuration. I want to understand whether any configuration or steps are missing here. It would be great if you could share your thoughts. Thanks.

{
  "acl_datacenter": "mydc",
  "acl_default_policy": "allow",
  "acl_down_policy": "allow",
  "acl_master_token":
}

https://www.consul.io/docs/guides/acl.html#bootstrapping-acls
http://jovandeginste.github.io/2016/05/04/turning-on-acl-s-in-our-consul-cluster.html
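Two things worth checking, offered as assumptions rather than a confirmed diagnosis: the legacy acl_datacenter value is expected to name the cluster's actual authoritative datacenter, which in this tutorial is dc1 rather than mydc, and the file has to be in the directory the agent actually loads (its -config-dir) when it restarts. Also, Consul 1.4 and later configure ACLs through an acl stanza instead of the legacy acl_* keys; a minimal sketch:

{
  "acl": {
    "enabled": true,
    "default_policy": "allow",
    "down_policy": "extend-cache"
  }
}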

chown: /consul/config: Read-only file system

I'm getting this when I try to start the StatefulSet:

NAME       READY     STATUS             RESTARTS   AGE       IP              NODE
consul-0   0/1       CrashLoopBackOff   1          12s       10.200.100.13   k8s-node-2
kubectl logs consul-0
chown: /consul/config: Read-only file system

I didn't see any other issue here, so I'm assuming I did something really wrong,
but /consul/config is the mount point for the ConfigMap, so I guess it makes sense that it's read-only.

consul stuck on init after successful PVC claim is made

Hello,

So I started Consul with the command kubectl create -f statefulsets/consul.yaml and I see the PVC claim is successfully made. But after that, the first instance of the StatefulSet is still stuck in the init state and I see the following message in the Rancher UI:

Containers with unready status: [consul]

Any idea why this is happening?

AntiAffinity prevents containers from starting on single node

I am using minikube with only a single node. Because the statefulset has an anti-affinity rule, only the first container will start. Simply changing this from anti-affinity to affinity fixes the problem and allows all three containers to start.

      affinity:
#        podAntiAffinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - consul
              topologyKey: kubernetes.io/hostname
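An alternative that keeps the spread-across-nodes intent on multi-node clusters while still scheduling on a single-node minikube is to make the anti-affinity preferred instead of required. This is standard Kubernetes API, shown here as a sketch rather than the repo's actual manifest:

      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - consul
                topologyKey: kubernetes.io/hostname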

Consul CLI using different port

Hi,

After running the command:
consul members
I got:
Error querying agent: Get http://127.0.0.1:8500/v1/agent/self: dial tcp 127.0.0.1:8500: connect: connection refused
So I changed the port forward to:
kubectl port-forward consul-0 8500:8500
and the CLI is now working.

rolling replace of k8s nodes causes "No Cluster Leader"

Sometimes I have to use kops to do a rolling replace of the current nodes, and this requires moving the Consul cluster master nodes to new k8s nodes. Every time this is performed, I have to manually reconnect the Consul master nodes. Does anyone know of a way to have the Consul nodes reconnect after they move k8s nodes? This also happens with k8s node taints.
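One approach worth trying, stated as an assumption rather than a verified fix: give each server retry-join targets that use the stable StatefulSet DNS names, so an agent that comes up on a new k8s node keeps retrying until its peers are reachable again; newer Consul versions can additionally prune the dead servers' old raft entries via autopilot. A sketch of the flags, following the $(NAMESPACE) env-var convention used elsewhere on this page:

# Sketch: stable DNS names as retry-join targets, plus -rejoin so a restarted
# server attempts to rejoin the cluster it previously left.
consul agent \
  -retry-join=consul-0.consul.$(NAMESPACE).svc.cluster.local \
  -retry-join=consul-1.consul.$(NAMESPACE).svc.cluster.local \
  -retry-join=consul-2.consul.$(NAMESPACE).svc.cluster.local \
  -rejoin \
  ...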

Web UI doesn't work

Can't open the web UI. I have 3 pods on 3 nodes, but none on the master node; all are on the workers.

kubectl -n default get po -l app=consul,component=server
NAME       READY   STATUS    RESTARTS   AGE
consul-0   1/1     Running   0          3h56m
consul-1   1/1     Running   0          3h56m
consul-2   1/1     Running   0          4m19s

If I run:
kubectl port-forward consul-0 8500:8500
Forwarding from 127.0.0.1:8500 -> 8500
Forwarding from [::1]:8500 -> 8500

I got:

consul members
Node      Address             Status  Type    Build     Protocol  DC   Segment
consul-0  10.233.103.96:8301  alive   server  1.4.0rc1  2         dc1  <all>

So consul-0 works on another node.

If I try to open it:
kubectl port-forward --address 10.2.67.205 consul-0 8500:8500
Unable to listen on port 8500: Listeners failed to create with the following errors: [Unable to create listener: Error listen tcp4 10.2.67.205:8500: bind: cannot assign requested address]
error: Unable to listen on any of the requested ports: [{8500 8500}]

Service:
kubectl get svc
NAME     TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                                                                             AGE
consul   ClusterIP   None         <none>        8500/TCP,8443/TCP,8400/TCP,8301/TCP,8301/UDP,8302/TCP,8302/UDP,8300/TCP,8600/TCP   3h38m

What can I do with that?
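For what it's worth (general Kubernetes behavior; the manifest is illustrative, not part of this repo): kubectl port-forward can only bind addresses of the machine where kubectl itself runs, which is why --address 10.2.67.205 fails unless that IP belongs to your workstation. To reach the UI from other hosts, one option is an extra NodePort Service in front of the server pods:

apiVersion: v1
kind: Service
metadata:
  name: consul-ui
spec:
  type: NodePort
  selector:
    app: consul
    component: server
  ports:
    - name: http
      port: 8500
      targetPort: 8500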

can't resolve consul-1.consul.$(NAMESPACE).svc.cluster.local when joining

I tried to follow the guide to install Consul on Kubernetes 1.7.5, but it fails because the hostname consul-0.consul.$(NAMESPACE).svc.cluster.local can't be resolved. I use the default namespace.

kubectl logs consul-0
...

2017/11/01 12:48:58 [INFO] consul: Adding LAN server consul-0 (Addr: tcp/10.233.78.176:8300) (DC: dc1)
2017/11/01 12:48:58 [INFO] agent: Started DNS server 0.0.0.0:8600 (udp)
2017/11/01 12:48:58 [INFO] agent: Started DNS server 0.0.0.0:8600 (tcp)
2017/11/01 12:48:58 [INFO] agent: (LAN) joining: [consul-0.consul.default.svc.cluster.local consul-1.consul.default.svc.cluster.local consul-2.consul.default.svc.cluster.local]
2017/11/01 12:48:58 [INFO] agent: Started HTTP server on [::]:8500

2017/11/01 12:48:58 [INFO] agent: Retry join is supported for: aws azure gce softlayer
2017/11/01 12:48:58 [INFO] agent: Joining cluster...
2017/11/01 12:48:58 [WARN] memberlist: Failed to resolve consul-1.consul.default.svc.cluster.local: lookup consul-1.consul.default.svc.cluster.local on 10.233.0.3:53: no such host
2017/11/01 12:48:58 [WARN] memberlist: Failed to resolve consul-2.consul.default.svc.cluster.local: lookup consul-2.consul.default.svc.cluster.local on 10.233.0.3:53: no such host
2017/11/01 12:48:58 [INFO] agent: (LAN) joined: 1 Err: <nil>
2017/11/01 12:48:58 [INFO] agent: Join completed. Synced with 1 initial agents
2017/11/01 12:49:04 [WARN] raft: no known peers, aborting election

Is there anything I missed configuring in Kubernetes, or do I need an extra component?
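A common cause with StatefulSets, offered as context rather than a confirmed diagnosis for this report: per-pod DNS records behind a headless Service are only published once the pods are ready, so during the initial bootstrap the peers cannot resolve each other. Publishing not-ready addresses on the headless Service is the usual fix; the field below is the current API, while clusters around 1.7 used the service.alpha.kubernetes.io/tolerate-unready-endpoints annotation instead. The selector and ports here are abbreviated, illustrative values:

apiVersion: v1
kind: Service
metadata:
  name: consul
spec:
  clusterIP: None
  publishNotReadyAddresses: true
  selector:
    app: consul
  ports:
    - name: http
      port: 8500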

Error attaching EBS volume ... volume is in "creating" state

I have set up and installed this Consul StatefulSet on another kops cluster and it works fine. I added a new kops cluster, ran the same scripts, and the Pods fail to start with:

pod has unbound PersistentVolumeClaims (repeated 4 times)
AttachVolume.Attach failed for volume "pvc-b032bc3a-a660-11e8-af7e-06d95e0f9040" : "Error attaching EBS volume "vol-07968f2e9d0bb523d"" to instance "i-0f511d8cf681f0fc9" since volume is in "creating" state
Back-off restarting failed container

Any help would be much appreciated!

Join job with a scaling StatefulSet

Hi, thanks; this repo has been very helpful.

I have a question about the bootstrapping (consul join) Job.

If the StatefulSet is scaled on the fly, consul join would need to be run for the new pod to join the cluster. Is there any way to do this?

Any ideas on what to look into?
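One way people avoid re-running a join Job when scaling, offered as an assumption rather than something this repo ships: have every agent use -retry-join against a name that always resolves to the current server pods, such as the headless Service, so a newly created pod keeps retrying until it has joined:

# Sketch: agent-level retry-join against the headless Service name.
# $(NAMESPACE) follows the same env-var convention used elsewhere on this page.
consul agent \
  -retry-join=consul.$(NAMESPACE).svc.cluster.local \
  ...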
