hashicorp / consul-helm
Helm chart to install Consul and other associated components.
License: Mozilla Public License 2.0
The README clearly states:
For now, we do not host a chart repository.
Since the Helm provider is now managed by HashiCorp, it would be very pleasant if a Helm chart repository could also be managed by HashiCorp, for example by using GitHub Pages.
Then we wouldn't have to first clone this repo or download and unpack the chart, and we could very easily change the version of the Helm chart to be installed.
And then we could do everything in Terraform configuration, for example:
resource "helm_repository" "main" {
name = "hashicorp-consul"
url = "https://hashicorp.github.io/consul-helm/"
}
resource "helm_release" "main" {
name = "consul-westeurope"
repository = "${helm_repository.main.metadata.0.name}"
chart = "consul"
version = "0.1.0"
set {
name = "global.datacenter"
value = "westeurope"
}
}
Please vote on this issue by adding a 👍 reaction.
It would be nice if an ingress resource could optionally be configured for the ui, ideally with ability to configure labels, annotations, and TLS.
For example: how it's done in the prometheus-operator chart
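To make the request concrete, here is a sketch of what the values interface could look like, loosely modeled on the prometheus-operator chart; every key under ui.ingress is hypothetical and nothing like it exists in consul-helm today:

ui:
  ingress:
    enabled: true            # hypothetical key, not in the chart today
    labels:
      team: platform
    annotations:
      kubernetes.io/ingress.class: nginx
    hosts:
      - consul.example.com
    tls:
      - secretName: consul-ui-tls
        hosts:
          - consul.example.com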
Ran into this today when using helm template to generate YAML to be subsequently applied with kubectl.
Today only connect-inject-serviceaccount.yaml and sync-catalog-service-account.yaml appear to be setting metadata.namespace via {{ .Release.Namespace }}.
Ideally, helm would do this automatically when running template --namespace=<x> to mirror the behavior of helm install --namespace=<x>, but it does not. See helm/helm#3553 for a longer discussion.
Other projects, such as Istio, have worked around this limitation by including metadata.namespace on all resource templates. See istio/istio#4606.
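A sketch of that workaround, following the pattern the two service-account templates already use (the resource kind and helper name below are just illustrative):

apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ template "consul.fullname" . }}-client-config
  # Render the namespace explicitly so `helm template --namespace=<x>`
  # emits manifests that land in namespace <x> when applied with kubectl.
  namespace: {{ .Release.Namespace }}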
If I access the UI, I can see my services (with the -proxy suffix) plus the consul service, and I can also see the services via the CLI (kubectl exec ... consul catalog services). But when I go to the Intentions tab, the dropdown menu only shows * (All Services) and consul, and if I try to manually enter a service name, it lets me create it but I get "Use a future Consul Service called '<servicename>'" when entering the name.
The strange thing is that if I create an intention anyway, it does take effect, but only if I create it without the -proxy suffix (i.e. the name of the services based on their annotation and/or the name of the first container). The behavior is the same when creating the intention via the CLI.
Is this the intended Consul behavior? Or is that something specific to consul-helm/consul-k8s?
I'm using consul-helm 0.3.0 (installed with Helm 2.10), consul-k8s 0.2.1, and Kubernetes 1.9.4, running on EC2 (not EKS). I have the sync catalog enabled but defaulted to false, and both toConsul and toK8S are set to false.
In connect-inject-mutatingwebhook.yaml, the namespace is hardcoded to "default".
In my cluster I have rook running. When provisioning the consul cluster, my helm values file looks like this:
global:
  enabled: true
  domain: consul
  image: "consul:1.2.3"
  datacenter: dc1

server:
  enabled: "-"
  image: null
  replicas: 3
  bootstrapExpect: 3
  storage: 10Gi
  storageClass: rook-ceph-block
  connect: true
  resources: {}
  updatePartition: 0
  disruptionBudget:
    enabled: true
    maxUnavailable: null
  extraConfig: |
    {}
  extraVolumes: []

client:
  enabled: "-"
  image: null
  join: null
  resources: {}
  extraConfig: |
    {}
  extraVolumes: []

dns:
  enabled: "-"

ui:
  enabled: "-"
  service:
    enabled: true
    type: null

connectInject:
  enabled: false # "-" disable this by default for now until the image is public
  image: "TODO"
  default: false # true will inject by default, otherwise requires annotation
  caBundle: "" # empty will auto generate the bundle
  namespaceSelector: null
  certs:
    secretName: null
    caBundle: ""
    certName: tls.crt
    keyName: tls.key
In order to start the chart, I use the following command line:
helm install -f ./helm/values.digitalocean.yaml --name consul --namespace service-discovery ./helm
NAME: consul
LAST DEPLOYED: Wed Sep 26 07:46:32 2018
NAMESPACE: service-discovery
STATUS: DEPLOYED
RESOURCES:
==> v1beta1/PodDisruptionBudget
NAME MIN AVAILABLE MAX UNAVAILABLE ALLOWED DISRUPTIONS AGE
consul-server N/A 0 0 1s
==> v1/Pod(related)
NAME READY STATUS RESTARTS AGE
consul-56lvs 0/1 ContainerCreating 0 1s
consul-jttwp 0/1 ContainerCreating 0 1s
consul-qpgdn 0/1 ContainerCreating 0 1s
consul-server-0 0/1 Pending 0 1s
consul-server-1 0/1 Pending 0 1s
consul-server-2 0/1 Pending 0 1s
==> v1/ConfigMap
NAME DATA AGE
consul-client-config 1 1s
consul-server-config 1 1s
==> v1/Service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
consul-dns ClusterIP 10.3.169.187 <none> 53/TCP,53/UDP 1s
consul-server ClusterIP None <none> 8500/TCP,8301/TCP,8301/UDP,8302/TCP,8302/UDP,8300/TCP,8600/TCP,8600/UDP 1s
consul-ui ClusterIP 10.3.20.200 <none> 80/TCP 1s
==> v1/DaemonSet
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
consul 3 3 0 3 0 <none> 1s
==> v1/StatefulSet
NAME DESIRED CURRENT AGE
consul-server 3 3 1s
A quick verification of the PVCs indicates that they have bound successfully. Please note that I've observed that it takes up to 20s to bind the PVC(s) running under rook:
mmisztal@tsunami ~/Projects/@cloud-technologies/ops-k8s-services-inf/src/consul (master) $ kubectl get pvc --all-namespaces
NAMESPACE NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
service-discovery data-consul-server-0 Bound pvc-85de967e-c14f-11e8-8c17-c6926976ea12 10Gi RWO rook-ceph-block 13s
service-discovery data-consul-server-1 Bound pvc-85e4d3d0-c14f-11e8-8c17-c6926976ea12 10Gi RWO rook-ceph-block 13s
service-discovery data-consul-server-2 Bound pvc-85ea1341-c14f-11e8-8c17-c6926976ea12 10Gi RWO rook-ceph-block 13s
However, the consul-server pods have failed to start:
kubectl -n service-discovery get pods
NAME READY STATUS RESTARTS AGE
consul-56lvs 0/1 Running 0 36s
consul-jttwp 0/1 Running 0 36s
consul-qpgdn 0/1 Running 0 36s
consul-server-0 0/1 ContainerCreating 0 36s
consul-server-1 0/1 ContainerCreating 0 36s
consul-server-2 0/1 Running 0 36s
An examination of the pod indicates that the readiness probe has failed:
kubectl -n service-discovery describe pod consul-server-2
Name: consul-server-2
Namespace: service-discovery
Node: k8s-node-1.cloud-technologies.net/142.93.131.205
Start Time: Wed, 26 Sep 2018 07:47:42 +0200
Labels: app=consul
chart=consul-0.1.0
component=server
controller-revision-hash=consul-server-66479c5df5
hasDNS=true
release=consul
statefulset.kubernetes.io/pod-name=consul-server-2
Annotations: consul.hashicorp.com/connect-inject=false
Status: Running
IP: 10.2.1.13
Controlled By: StatefulSet/consul-server
Containers:
consul:
Container ID: docker://09641e9f57faf0304b9e74818ee99f2a7dd23a4f1bc44fa6e426ad4be2d72578
Image: consul:1.2.3
Image ID: docker-pullable://consul@sha256:ea66d17d8c8c1f1afb2138528d62a917093fcd2e3b3a7b216a52c253189ea980
Ports: 8500/TCP, 8301/TCP, 8302/TCP, 8300/TCP, 8600/TCP, 8600/UDP
Command:
/bin/sh
-ec
CONSUL_FULLNAME="consul"
exec /bin/consul agent \
-advertise="${POD_IP}" \
-bind=0.0.0.0 \
-bootstrap-expect=3 \
-client=0.0.0.0 \
-config-dir=/consul/config \
-datacenter=dc1 \
-data-dir=/consul/data \
-domain=consul \
-hcl="connect { enabled = true }" \
-ui \
-retry-join=${CONSUL_FULLNAME}-server-0.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
-retry-join=${CONSUL_FULLNAME}-server-1.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
-retry-join=${CONSUL_FULLNAME}-server-2.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
-server
State: Running
Started: Wed, 26 Sep 2018 07:48:17 +0200
Ready: True
Restart Count: 0
Readiness: exec [/bin/sh -ec curl http://127.0.0.1:8500/v1/status/leader 2>/dev/null | \
grep -E '".+"'
] delay=5s timeout=5s period=3s #success=1 #failure=2
Environment:
POD_IP: (v1:status.podIP)
NAMESPACE: service-discovery (v1:metadata.namespace)
Mounts:
/consul/config from config (rw)
/consul/data from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-js48j (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: data-consul-server-2
ReadOnly: false
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: consul-server-config
Optional: false
default-token-js48j:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-js48j
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 10m default-scheduler Successfully assigned consul-server-2 to k8s-node-1.cloud-technologies.net
Normal SuccessfulMountVolume 10m kubelet, k8s-node-1.cloud-technologies.net MountVolume.SetUp succeeded for volume "config"
Normal SuccessfulMountVolume 10m kubelet, k8s-node-1.cloud-technologies.net MountVolume.SetUp succeeded for volume "default-token-js48j"
Warning FailedMount 10m kubelet, k8s-node-1.cloud-technologies.net MountVolume.SetUp failed for volume "pvc-85ea1341-c14f-11e8-8c17-c6926976ea12" : invalid character '-' after top-level value
Normal SuccessfulMountVolume 10m kubelet, k8s-node-1.cloud-technologies.net MountVolume.SetUp succeeded for volume "pvc-85ea1341-c14f-11e8-8c17-c6926976ea12"
Normal Pulled 10m kubelet, k8s-node-1.cloud-technologies.net Container image "consul:1.2.3" already present on machine
Normal Created 10m kubelet, k8s-node-1.cloud-technologies.net Created container
Normal Started 9m kubelet, k8s-node-1.cloud-technologies.net Started container
Warning Unhealthy 6m (x12 over 9m) kubelet, k8s-node-1.cloud-technologies.net Readiness probe failed:
An examination of the server logs indicates that it has failed to form the cluster:
kubectl -n service-discovery logs consul-server-2
==> Starting Consul agent...
bootstrap_expect > 0: expecting 3 servers
==> Consul agent running!
Version: 'v1.2.3'
Node ID: '0a5797a9-fee3-d61c-61d8-01b500a9e3c8'
Node name: 'consul-server-2'
Datacenter: 'dc1' (Segment: '<all>')
Server: true (Bootstrap: false)
Client Addr: [0.0.0.0] (HTTP: 8500, HTTPS: -1, DNS: 8600)
Cluster Addr: 10.2.1.13 (LAN: 8301, WAN: 8302)
Encrypt: Gossip: false, TLS-Outgoing: false, TLS-Incoming: false
==> Log data will now stream in as it occurs:
2018/09/26 05:48:17 [INFO] raft: Initial configuration (index=0): []
2018/09/26 05:48:17 [INFO] raft: Node at 10.2.1.13:8300 [Follower] entering Follower state (Leader: "")
2018/09/26 05:48:17 [INFO] serf: EventMemberJoin: consul-server-2.dc1 10.2.1.13
2018/09/26 05:48:17 [INFO] serf: EventMemberJoin: consul-server-2 10.2.1.13
2018/09/26 05:48:17 [INFO] agent: Started DNS server 0.0.0.0:8600 (udp)
2018/09/26 05:48:17 [INFO] consul: Adding LAN server consul-server-2 (Addr: tcp/10.2.1.13:8300) (DC: dc1)
2018/09/26 05:48:17 [INFO] consul: Handled member-join event for server "consul-server-2.dc1" in area "wan"
2018/09/26 05:48:17 [WARN] agent/proxy: running as root, will not start managed proxies
2018/09/26 05:48:17 [INFO] agent: Started DNS server 0.0.0.0:8600 (tcp)
2018/09/26 05:48:17 [INFO] agent: Started HTTP server on [::]:8500 (tcp)
2018/09/26 05:48:17 [INFO] agent: started state syncer
2018/09/26 05:48:17 [INFO] agent: Retry join LAN is supported for: aliyun aws azure digitalocean gce k8s os packet scaleway softlayer triton vsphere
2018/09/26 05:48:17 [INFO] agent: Joining LAN cluster...
2018/09/26 05:48:17 [INFO] agent: (LAN) joining: [consul-server-0.consul-server.service-discovery.svc consul-server-1.consul-server.service-discovery.svc consul-server-2.consul-server.service-discovery.svc]
2018/09/26 05:48:17 [INFO] serf: EventMemberJoin: consul-56lvs 10.2.3.10
2018/09/26 05:48:17 [INFO] serf: EventMemberJoin: consul-jttwp 10.2.1.11
2018/09/26 05:48:17 [INFO] serf: EventMemberJoin: consul-server-1 10.2.3.11
2018/09/26 05:48:17 [INFO] consul: Adding LAN server consul-server-1 (Addr: tcp/10.2.3.11:8300) (DC: dc1)
2018/09/26 05:48:17 [WARN] memberlist: Refuting a suspect message (from: consul-server-2.dc1)
2018/09/26 05:48:17 [INFO] serf: EventMemberJoin: consul-server-1.dc1 10.2.3.11
2018/09/26 05:48:17 [INFO] consul: Handled member-join event for server "consul-server-1.dc1" in area "wan"
2018/09/26 05:48:17 [WARN] memberlist: Failed to resolve consul-server-2.consul-server.service-discovery.svc: lookup consul-server-2.consul-server.service-discovery.svc on 10.3.0.10:53: no such host
2018/09/26 05:48:17 [INFO] agent: (LAN) joined: 1 Err: <nil>
2018/09/26 05:48:17 [INFO] agent: Join LAN completed. Synced with 1 initial agents
2018/09/26 05:48:24 [WARN] raft: no known peers, aborting election
2018/09/26 05:48:25 [ERR] agent: failed to sync remote state: No cluster leader
Any hints as to what may be wrong?
I've noticed that the probe's initialDelaySeconds default value is 5, so I'm guessing it may have failed before the PVCs were bound? Perhaps it would make sense to make this value configurable?
When starting agent instances, the agents use the pod hostnames consul-{random} to register as nodes. IMHO this isn't desired and creates confusion for the operator of the Consul cluster, because it's impossible to tell which node represents which physical k8s node.
I installed from Helm.
kubectl describe pvc
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal FailedBinding 7s (x2 over 7s) persistentvolume-controller no persistent volumes available for this claim and no storage class is set
My cluster has no persistent volumes and no storage class.
Which PV and StorageClass do I need to add to my cluster?
Thanks
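For reference, a minimal sketch that would satisfy the chart's claims on a single-node test cluster; the names and hostPath below are made up, and a production cluster should use a real provisioner instead (the chart requests one 10Gi ReadWriteOnce volume per server replica by default):

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: consul-server-pv-0   # create one PV per server replica
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: local-storage
  hostPath:
    path: /mnt/consul-data-0

The chart would then be pointed at it with server.storageClass: local-storage in the values file.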
May be a duplicate of other issues (#9, #10), but they are closed without solution.
Helm: v2.11.0
Minikube: v0.30.0
Consul-Helm: checked out at v0.3.0
Installation process:
$ git clone https://github.com/hashicorp/consul-helm.git
$ cd consul-helm
$ git checkout v0.3.0
$ helm install --name consul ./
After installing via Helm, the pods either never become Ready or sit in a Pending state:
$ => kubectl get pods
NAME READY STATUS RESTARTS AGE
consul-mw5pz 0/1 Running 0 2m
consul-server-0 0/1 Running 0 2m
consul-server-1 0/1 Pending 0 2m
consul-server-2 0/1 Pending 0 2m
Logs from the consul pod are endless loop of:
2018/11/08 09:04:12 [ERR] agent: failed to sync remote state: rpc error making call: No cluster leader
2018/11/08 09:04:28 [ERR] consul: "Coordinate.Update" RPC failed to server 172.17.0.7:8300: rpc error making call: No cluster leader
2018/11/08 09:04:28 [ERR] agent: Coordinate update error: rpc error making call: No cluster leader
2018/11/08 09:04:34 [ERR] consul: "Catalog.NodeServices" RPC failed to server 172.17.0.7:8300: rpc error making call: No cluster leader
Logs from the first consul-server pod are repeated:
2018/11/08 09:05:16 [ERR] agent: Coordinate update error: No cluster leader
2018/11/08 09:05:27 [ERR] agent: failed to sync remote state: No cluster leader
The failure is straightforward I think - none of the servers are started in bootstrap mode and the first one refuses to self-elect. The second and third servers are never started because the first is never ready.
Is there some missing information from the readme?
There are metrics:
It would make sense to deploy this as a sidecar in the consul-server pods.
How can I do it?
Thanks
I'm trying to integrate consul in our kubernetes cluster and have it connect to an existing consul cluster. That part works fine. What's not working is enabling "syncCatalog". For now, I just want kubernetes services in the consul catalog. So I've disabled sync to kubernetes. But the consul-sync-catalog pod immediately crashes. The error message is not entirely helpful.
$ kubectl -n consul logs -f consul-sync-catalog-d888cf85d-dpvjz
2018-10-05T06:25:18.706Z [INFO ] to-consul/source: starting runner for endpoints
2018-10-05T06:25:18.707Z [INFO ] to-consul/sink: ConsulSyncer quitting
ERROR: logging before flag.Parse: E1005 06:25:18.707409 7 controller.go:115] Error syncing cache
ERROR: logging before flag.Parse: E1005 06:25:18.707434 7 controller.go:115] Error syncing cache
Here's my values.yaml file.
---
global:
  enabled: true
  domain: consul
  image: consul:1.2.3
  imageK8S: hashicorp/consul-k8s:0.1.0
  datacenter: aoc-devtest

server:
  enabled: "-"
  image:
  replicas: 3
  bootstrapExpect: 3
  storage: 10Gi
  storageClass: ibmc-file-bronze
  connect: true
  resources: {}
  updatePartition: 0
  disruptionBudget:
    enabled: true
    maxUnavailable:
  extraConfig: |
    {
      "encrypt": "secretkey",
      "retry_join_wan": [
        "10.115.173.171",
        "10.115.173.188",
        "10.115.173.176"
      ]
    }
  extraVolumes: []

client:
  enabled: "-"
  image:
  join:
  resources: {}
  extraConfig: |
    {
      "encrypt": "secretkey"
    }
  extraVolumes: []

dns:
  enabled: "-"

ui:
  enabled: "-"
  service:
    enabled: true
    type:

syncCatalog:
  enabled: "-"
  image:
  toConsul: true
  toK8S: false
  k8sPrefix:

connectInject:
  enabled: false
  image: TODO
  default: false
  caBundle: ''
  namespaceSelector:
  certs:
    secretName:
    caBundle: ''
    certName: tls.crt
    keyName: tls.key
Hi Guys!
I tried to join a Consul cluster using cloud auto-join in Kubernetes.
I'm using the Consul Helm chart with the configuration below:
First attempt
client:
  enabled: true
  image: null
  join: ["provider=aws", "tag_key=consul-staging", "tag_value=auto-join"]
Second attempt
  join:
    - "provider=aws"
    - "tag_key=consul-staging"
    - "tag_value=auto-join"
Both ways I got the same error:
Failed to resolve tag_key=consul-staging: lookup tag_key=consul-staging: no such host
* Failed to resolve tag_value=auto-join: lookup tag_value=auto-join: no such host
2018/11/09 15:21:35 [WARN] agent: Join LAN failed: <nil>, retrying in 30s
2018/11/09 15:21:37 [WARN] manager: No servers available
2018/11/09 15:21:37 [ERR] http: Request GET /v1/status/leader, error: No known Consul servers from=127.0.0.1:59748
2018/11/09 15:21:47 [WARN] manager: No servers available
2018/11/09 15:21:47 [ERR] http: Request GET /v1/status/leader, error: No known Consul servers from=127.0.0.1:59818
2018/11/09 15:21:53 [WARN] manager: No servers available
2018/11/09 15:21:53 [ERR] agent: failed to sync remote state: No known Consul servers
2018/11/09 15:21:57 [WARN] manager: No servers available
2018/11/09 15:21:57 [ERR] http: Request GET /v1/status/leader, error: No known Consul servers from=127.0.0.1:59874
2018/11/09 15:22:05 [INFO] discover-aws: Address type is not supported. Valid values are {private_v4,public_v4,public_v6}. Falling back to 'private_v4'
2018/11/09 15:22:05 [INFO] discover-aws: Region not provided. Looking up region in metadata...
2018/11/09 15:22:05 [INFO] discover-aws: Region is us-east-1
2018/11/09 15:22:05 [INFO] discover-aws: Filter instances with =
2018/11/09 15:22:05 [INFO] agent: Discovered LAN servers:
2018/11/09 15:22:05 [INFO] agent: (LAN) joining: [tag_key=consul-staging tag_value=auto-join]
2018/11/09 15:22:05 [WARN] memberlist: Failed to resolve tag_key=consul-staging: lookup tag_key=consul-staging: no such host
2018/11/09 15:22:05 [WARN] memberlist: Failed to resolve tag_value=auto-join: lookup tag_value=auto-join: no such host
2018/11/09 15:22:05 [INFO] agent: (LAN) joined: 0 Err: 2 error(s) occurred:
Just for testing purposes, I changed the join attribute from tag_key and tag_value to the Consul cluster IP addresses:
client:
  enabled: true
  image: null
  join: ["10.29.20.137", "10.29.20.148", "10.29.20.60"] # Fake IPs
With this configuration, passing IP addresses, the cluster join succeeded.
For troubleshooting:
I've checked the role permissions attached to the Kubernetes nodes - OK.
I've checked communication between the nodes and the Consul servers - OK for ALL ports.
I have other Consul clients using cloud discovery, and those clients work perfectly.
I've followed the instructions from issue #16, but without success.
Can you help me?
If you need more detailed information, please just let me know.
Thanks,
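For what it's worth: the log shows Consul resolving tag_key=consul-staging and tag_value=auto-join as hostnames, which suggests the provider arguments were split into separate join addresses. A sketch of keeping the whole cloud auto-join expression in one string, an educated guess from the output rather than a confirmed fix:

client:
  enabled: true
  image: null
  join:
    # One provider string per entry; splitting the key=value pairs into
    # separate list items makes Consul treat each fragment as a host to resolve.
    - "provider=aws tag_key=consul-staging tag_value=auto-join"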
Dynamic:
https://github.com/hashicorp/consul-helm/blob/master/templates/connect-inject-mutatingwebhook.yaml#L6
Static:
https://github.com/hashicorp/consul-helm/blob/master/templates/connect-inject-deployment.yaml#L53
I'd PR, but I'm not sure how you'd want to handle that... if it's variable or static in both locations, just let me know and I can submit one.
Helm/Tiller version: 2.11.0, initialized with RBAC and --service-account tiller set
Kubectl client/server version: 1.11.5
Target Provider/platform: AKS
When deploying consul-helm with connectInject.enabled=true, the error below pops up.
If I manually apply the connect-inject service account, role binding, mutating webhook, and deployment, it fails to inject Connect even if the annotation is added to the pod manifest.
Error Message:
Error: release consul failed: clusterroles.rbac.authorization.k8s.io "consul-connect-injector-webhook" is forbidden: attempt to grant extra privileges: [{[get] [admissionregistration.k8s.io] [mutatingwebhookconfigurations] [] []} {[list] [admissionregistration.k8s.io] [mutatingwebhookconfigurations] [] []} {[watch] [admissionregistration.k8s.io] [mutatingwebhookconfigurations] [] []} {[patch] [admissionregistration.k8s.io] [mutatingwebhookconfigurations] [] []}] user=&{system:serviceaccount:kube-system:default 6aa4a5a7-f800-11e8-bc17-0a58ac1f0dfd [system:serviceaccounts system:serviceaccounts:kube-system system:authenticated] map[]} ownerrules=[] ruleResolutionErrors=[clusterroles.rbac.authorization.k8s.io "system:discovery" not found]
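Note the user in the error is system:serviceaccount:kube-system:default, which suggests Tiller is not actually running under the tiller service account despite the --service-account flag. For comparison, the standard Helm v2 RBAC setup looks like this:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tiller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: tiller
    namespace: kube-system

After applying this, helm init --upgrade --service-account tiller restarts Tiller with an identity that is allowed to create the chart's ClusterRole.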
Pick one in the documentation online and READMEs
I installed helm, kubectl, and bats on an Ubuntu system and ran the unit test suite (bats ./test/unit), and all tests failed.
✗ client/ConfigMap: enabled by default
(in test file test/unit/client-configmap.bats, line 11)
`[ "${actual}" = "true" ]' failed
✗ client/ConfigMap: enable with global.enabled false
(in test file test/unit/client-configmap.bats, line 22)
`[ "${actual}" = "true" ]' failed
✗ client/ConfigMap: disable with client.enabled
(in test file test/unit/client-configmap.bats, line 32)
`[ "${actual}" = "false" ]' failed
✗ client/ConfigMap: disable with global.enabled
(in test file test/unit/client-configmap.bats, line 42)
`[ "${actual}" = "false" ]' failed
✗ client/ConfigMap: extraConfig is set
(in test file test/unit/client-configmap.bats, line 52)
`[ ! -z "${actual}" ]' failed
✗ connectInject/Deployment: disabled by default
(in test file test/unit/connect-inject-deployment.bats, line 11)
`[ "${actual}" = "false" ]' failed
✗ connectInject/Deployment: enable with global.enabled false
(in test file test/unit/connect-inject-deployment.bats, line 22)
`[ "${actual}" = "true" ]' failed
✗ connectInject/Deployment: disable with connectInject.enabled
(in test file test/unit/connect-inject-deployment.bats, line 32)
`[ "${actual}" = "false" ]' failed
✗ connectInject/Deployment: disable with global.enabled
(in test file test/unit/connect-inject-deployment.bats, line 42)
`[ "${actual}" = "false" ]' failed
✗ connectInject/Deployment: no secretName: no tls-{cert,key}-file set
(in test file test/unit/connect-inject-deployment.bats, line 52)
`[ "${actual}" = "false" ]' failed
✗ connectInject/Deployment: with secretName: tls-{cert,key}-file set
(in test file test/unit/connect-inject-deployment.bats, line 77)
`[ "${actual}" = "true" ]' failed
✗ dns/Service: enabled by default
(in test file test/unit/dns-service.bats, line 11)
`[ "${actual}" = "true" ]' failed
/tmp/bats.153734.src: line 11: yq: command not found
✗ dns/Service: enable with global.enabled false
(in test file test/unit/dns-service.bats, line 22)
`[ "${actual}" = "true" ]' failed
✗ dns/Service: disable with dns.enabled
(in test file test/unit/dns-service.bats, line 32)
`[ "${actual}" = "false" ]' failed
✗ dns/Service: disable with global.enabled
(in test file test/unit/dns-service.bats, line 42)
`[ "${actual}" = "false" ]' failed
✗ server/ConfigMap: enabled by default
(in test file test/unit/server-configmap.bats, line 11)
`[ "${actual}" = "true" ]' failed
✗ server/ConfigMap: enable with global.enabled false
(in test file test/unit/server-configmap.bats, line 22)
`[ "${actual}" = "true" ]' failed
✗ server/ConfigMap: disable with server.enabled
(in test file test/unit/server-configmap.bats, line 32)
`[ "${actual}" = "false" ]' failed
✗ server/ConfigMap: disable with global.enabled
(in test file test/unit/server-configmap.bats, line 42)
`[ "${actual}" = "false" ]' failed
✗ server/ConfigMap: extraConfig is set
(in test file test/unit/server-configmap.bats, line 52)
`[ ! -z "${actual}" ]' failed
✗ server/DisruptionBudget: enabled by default
(in test file test/unit/server-disruptionbudget.bats, line 11)
`[ "${actual}" = "true" ]' failed
✗ server/DisruptionBudget: enable with global.enabled false
(in test file test/unit/server-disruptionbudget.bats, line 22)
`[ "${actual}" = "true" ]' failed
✗ server/DisruptionBudget: disable with server.enabled
(in test file test/unit/server-disruptionbudget.bats, line 32)
`[ "${actual}" = "false" ]' failed
✗ server/DisruptionBudget: disable with server.disruptionBudget.enabled
(in test file test/unit/server-disruptionbudget.bats, line 42)
`[ "${actual}" = "false" ]' failed
✗ server/DisruptionBudget: disable with global.enabled
(in test file test/unit/server-disruptionbudget.bats, line 52)
`[ "${actual}" = "false" ]' failed
✗ server/DisruptionBudget: correct maxUnavailable with n=3
(in test file test/unit/server-disruptionbudget.bats, line 62)
`[ "${actual}" = "0" ]' failed
✗ server/Service: enabled by default
(in test file test/unit/server-service.bats, line 11)
`[ "${actual}" = "true" ]' failed
✗ server/Service: enable with global.enabled false
(in test file test/unit/server-service.bats, line 22)
`[ "${actual}" = "true" ]' failed
✗ server/Service: disable with server.enabled
(in test file test/unit/server-service.bats, line 32)
`[ "${actual}" = "false" ]' failed
✗ server/Service: disable with global.enabled
(in test file test/unit/server-service.bats, line 42)
`[ "${actual}" = "false" ]' failed
✗ server/Service: tolerates unready endpoints
(in test file test/unit/server-service.bats, line 53)
`[ "${actual}" = "true" ]' failed
✗ server/StatefulSet: enabled by default
(in test file test/unit/server-statefulset.bats, line 11)
`[ "${actual}" = "true" ]' failed
✗ server/StatefulSet: enable with global.enabled false
(in test file test/unit/server-statefulset.bats, line 22)
`[ "${actual}" = "true" ]' failed
✗ server/StatefulSet: disable with server.enabled
(in test file test/unit/server-statefulset.bats, line 32)
`[ "${actual}" = "false" ]' failed
✗ server/StatefulSet: disable with global.enabled
(in test file test/unit/server-statefulset.bats, line 42)
`[ "${actual}" = "false" ]' failed
/tmp/bats.155386.src: line 42: yq: command not found
✗ server/StatefulSet: image defaults to global.image
(in test file test/unit/server-statefulset.bats, line 52)
`[ "${actual}" = "foo" ]' failed
✗ server/StatefulSet: image can be overridden with server.image
(in test file test/unit/server-statefulset.bats, line 63)
`[ "${actual}" = "bar" ]' failed
✗ server/StatefulSet: no updateStrategy when not updating
(in test file test/unit/server-statefulset.bats, line 75)
`[ "${actual}" = "null" ]' failed
✗ server/StatefulSet: updateStrategy during update
(in test file test/unit/server-statefulset.bats, line 85)
`[ "${actual}" = "RollingUpdate" ]' failed
✗ server/StatefulSet: adds extra volume
(in test file test/unit/server-statefulset.bats, line 111)
`[ "${actual}" = "foo" ]' failed
✗ server/StatefulSet: adds extra secret volume
(in test file test/unit/server-statefulset.bats, line 156)
`[ "${actual}" = "null" ]' failed
✗ server/StatefulSet: adds loadable volume
(in test file test/unit/server-statefulset.bats, line 197)
`[ "${actual}" = "1" ]' failed
✗ ui/Service: enabled by default
(in test file test/unit/ui-service.bats, line 11)
`[ "${actual}" = "true" ]' failed
✗ ui/Service: enable with global.enabled false
(in test file test/unit/ui-service.bats, line 23)
`[ "${actual}" = "true" ]' failed
✗ ui/Service: disable with server.enabled
(in test file test/unit/ui-service.bats, line 33)
`[ "${actual}" = "false" ]' failed
✗ ui/Service: disable with ui.enabled
(in test file test/unit/ui-service.bats, line 43)
`[ "${actual}" = "false" ]' failed
✗ ui/Service: disable with ui.service.enabled
(in test file test/unit/ui-service.bats, line 53)
`[ "${actual}" = "false" ]' failed
✗ ui/Service: disable with global.enabled
(in test file test/unit/ui-service.bats, line 63)
`[ "${actual}" = "false" ]' failed
✗ ui/Service: disable with global.enabled and server.enabled on
(in test file test/unit/ui-service.bats, line 74)
`[ "${actual}" = "false" ]' failed
✗ ui/Service: no type by default
(in test file test/unit/ui-service.bats, line 83)
`[ "${actual}" = "null" ]' failed
✗ ui/Service: specified type
(in test file test/unit/ui-service.bats, line 93)
`[ "${actual}" = "LoadBalancer" ]' failed
I see yq being used in the unit test suite but not documented anywhere in the README. After installing yq I reran the tests and they still fail.
The guide mentions that the Consul client will expose port 8500 on the host machine, but after deploying Consul using the Helm chart, the client is not exposing any port on the host machine.
kubectl describe pod consul-df8js
Name: consul-df8js
Namespace: default
Node: minikube/10.0.2.15
Start Time: Fri, 05 Oct 2018 19:56:18 +0530
Labels: app=consul
chart=consul-0.1.0
component=client
controller-revision-hash=53542314
hasDNS=true
pod-template-generation=1
release=consul
Annotations: consul.hashicorp.com/connect-inject=false
Status: Running
IP: 172.17.0.10
Controlled By: DaemonSet/consul
Containers:
consul:
Container ID: docker://6688a53c6d651d3226bfde66a2cdf77ea193721987b2b81ca6c46c8ac0e26bf3
Image: consul:1.2.2
Image ID: docker-pullable://consul@sha256:8603f0d1b2278364ecb7c11068a477b1ea648df735eda8791362063aba99656a
Ports: 8500/TCP, 8301/TCP, 8302/TCP, 8300/TCP, 8600/TCP, 8600/UDP
Command:
/bin/sh
-ec
CONSUL_FULLNAME="consul"
exec /bin/consul agent \
-advertise="${POD_IP}" \
-bind=0.0.0.0 \
-client=0.0.0.0 \
-config-dir=/consul/config \
-datacenter=dc1 \
-data-dir=/consul/data \
-retry-join=${CONSUL_FULLNAME}-server-0.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
-retry-join=${CONSUL_FULLNAME}-server-1.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
-retry-join=${CONSUL_FULLNAME}-server-2.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
-domain=consul
State: Running
Started: Fri, 05 Oct 2018 19:56:54 +0530
Ready: True
Restart Count: 0
Readiness: exec [/bin/sh -ec curl http://127.0.0.1:8500/v1/status/leader 2>/dev/null | \
grep -E '".+"'
] delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
POD_IP: (v1:status.podIP)
NAMESPACE: default (v1:metadata.namespace)
Mounts:
/consul/config from config (rw)
/consul/data from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-px6mr (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
data:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: consul-client-config
Optional: false
default-token-px6mr:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-px6mr
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/disk-pressure:NoSchedule
node.kubernetes.io/memory-pressure:NoSchedule
node.kubernetes.io/not-ready:NoExecute
node.kubernetes.io/unreachable:NoExecute
Events: <none>
I would suggest running this in host mode so that the client ports are available for clients outside Kubernetes to connect to the Consul agents running inside Kubernetes.
I tried to create a NodePort service so that I could use that port to connect my external Consul client through the Consul client running as a DaemonSet in Kubernetes, but no luck.
apiVersion: v1
kind: Service
metadata:
  name: consulclientsvc
  labels:
    run: consulclientsvc
spec:
  type: NodePort
  ports:
    - port: 8500
      targetPort: 8500
      protocol: TCP
      name: consulport
  selector:
    app: consul
    component: client
kubectl get svc consulclientsvc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
consulclientsvc NodePort 10.100.249.49 <none> 8500:31664/TCP 11s
kubectl get ep consulclientsvc
NAME ENDPOINTS AGE
consulclientsvc 172.17.0.10:8500 20s
But when the external Consul client tries to join the Consul cluster running in Kubernetes, it fails with the following error.
docker@consul1:~$ docker run -d --rm --net=host consul agent --retry-join=192.168.99.100:31664 -bind=192.168.99.101
80d24b20ea64d3443c9e89912d2cf2b98787bbb1a1d44b0c8aa93896af7aecc2
docker@consul1:~$ docker logs 80d24b20ea64d3443c9e89912d2cf2b98787bbb1a1d44b0c8aa93896af7aecc2
==> Starting Consul agent...
==> Consul agent running!
Version: 'v1.2.3'
Node ID: '8a7a2c00-1147-dab0-ad5b-306b4273e869'
Node name: 'consul1'
Datacenter: 'dc1' (Segment: '')
Server: false (Bootstrap: false)
Client Addr: [127.0.0.1] (HTTP: 8500, HTTPS: -1, DNS: 8600)
Cluster Addr: 192.168.99.101 (LAN: 8301, WAN: 8302)
Encrypt: Gossip: false, TLS-Outgoing: false, TLS-Incoming: false
==> Log data will now stream in as it occurs:
2018/10/06 08:14:13 [INFO] serf: EventMemberJoin: consul1 192.168.99.101
2018/10/06 08:14:13 [INFO] agent: Started DNS server 127.0.0.1:8600 (udp)
2018/10/06 08:14:13 [INFO] agent: Started DNS server 127.0.0.1:8600 (tcp)
2018/10/06 08:14:13 [INFO] agent: Started HTTP server on 127.0.0.1:8500 (tcp)
2018/10/06 08:14:13 [INFO] agent: started state syncer
2018/10/06 08:14:13 [INFO] agent: Retry join LAN is supported for: aliyun aws azure digitalocean gce k8s os packet scaleway softlayer triton vsphere
2018/10/06 08:14:13 [INFO] agent: Joining LAN cluster...
2018/10/06 08:14:13 [INFO] agent: (LAN) joining: [192.168.99.100:31664]
2018/10/06 08:14:13 [WARN] manager: No servers available
2018/10/06 08:14:13 [ERR] agent: failed to sync remote state: No known Consul servers
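One detail that stands out: -retry-join speaks the Serf gossip protocol on port 8301, not HTTP on 8500, so a NodePort in front of 8500 cannot accept a join. A sketch of the host-mode idea from the report, giving the client DaemonSet container host ports for both (a hypothetical edit to client-daemonset.yaml, not a chart option at the time):

ports:
  - containerPort: 8500
    hostPort: 8500      # HTTP API reachable on every node's address
    name: http
  - containerPort: 8301
    hostPort: 8301      # Serf LAN port, the one -retry-join actually needs
    name: serflan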
Setting up the auto-join values in values.yaml results in an error.
join:
  - "provider=gce project_name=test-0 tag_value=consul-server"
Error -
==> config: Unknown extra arguments: [project_name=test-0 tag_value=consul-server -domain=consul]
Is this a valid error, or is something wrong with my configuration?
2018-12-05T17:28:53.592Z [INFO ] to-consul/sink: ConsulSyncer quitting
ERROR: logging before flag.Parse: E1205 17:28:53.592459 7 controller.go:115] Error syncing cache
2018-12-05T17:28:53.592Z [INFO ] to-consul/source: starting runner for endpoints
ERROR: logging before flag.Parse: E1205 17:28:53.592565 7 controller.go:115] Error syncing cache
2018-12-05T17:28:53.593Z [WARN ] to-consul/sink: error querying services, will retry: err="Get http://192.168.43.4:8500/v1/catalog/services?index=1&stale=&wait=60000ms: dial tcp 192.168.43.4:8500: connect: connection refused"
2018-12-05T17:28:53.593Z [WARN ] to-consul/sink: error querying services, will retry: err="Get http://192.168.43.4:8500/v1/catalog/services?index=1&stale=&wait=60000ms: dial tcp 192.168.43.4:8500: connect: connection refused"
2018-12-05T17:28:53.593Z [WARN ] to-consul/sink: error querying services, will retry: err="Get http://192.168.43.4:8500/v1/catalog/services?index=1&stale=&wait=60000ms: dial tcp 192.168.43.4:8500: connect: connection refused"
I registered our service in Consul using the REST API of a Consul client (DaemonSet). But after the pod restarts, all services are wiped from this node, although the services are still alive.
Is this right?
If I want to connect a kubernetes consul cluster with another consul cluster in another datacenter over WAN, the communication only works one way, because this helm chart does not expose a route to the consul servers. So within the kubernetes cluster, I can resolve services that are running in the external datacenter. But if I'm in the external datacenter, I'm unable to resolve services that are running in the kubernetes cluster. The error given is 500 (rpc error: No path to datacenter)
Currently when running the Helm chart against a Kubernetes cluster that has RBAC fully enabled, no RBAC policies exist for consul-k8s which prevents the functionality from being able to communicate with the Kubernetes API.
The required policy to enable consul-k8s to function also does not exist anywhere in the documentation.
Note: this will also be needed when the connect auto-injection is complete as well
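To illustrate, a rough sketch of the kind of ClusterRole the catalog sync needs; the exact resources and verbs are assumptions and should be checked against the API calls consul-k8s actually makes:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: consul-sync-catalog
rules:
  - apiGroups: [""]
    resources: ["services", "endpoints"]
    # read access for toConsul; the write verbs are only needed when toK8S is on
    verbs: ["get", "list", "watch", "create", "update", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: consul-sync-catalog
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: consul-sync-catalog
subjects:
  - kind: ServiceAccount
    name: consul-sync-catalog   # match the chart's sync-catalog service account
    namespace: default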
Lines 70 to 73 in ec0de41
How exactly should client.extraConfig and client.extraVolumes be set so that we can have TLS enabled from Consul client to a Consul cluster outside of kubernetes?
Could you provide examples?
Trial and error formatting this 'raw JSON payload' via helm --set client.extraConfig isn't working out.
We want to avoid customizing values.yaml by hand and instead just use the chart.
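For what it's worth, a minimal sketch of one way this could fit together, assuming the chart mounts extraVolumes under /consul/userconfig/<name> (worth verifying against your chart version) and a hypothetical secret consul-client-tls holding the certificates; ca_file, cert_file, key_file, and verify_outgoing are standard Consul agent options:

client:
  extraVolumes:
    - type: secret
      name: consul-client-tls   # hypothetical secret with ca.crt/tls.crt/tls.key
      load: false               # mount only; the files are referenced below
  extraConfig: |
    {
      "ca_file": "/consul/userconfig/consul-client-tls/ca.crt",
      "cert_file": "/consul/userconfig/consul-client-tls/tls.crt",
      "key_file": "/consul/userconfig/consul-client-tls/tls.key",
      "verify_outgoing": true
    }

Passing that JSON through --set is painful because of the quoting; keeping it in a values file avoids the escaping entirely.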
What I did:
I set my pod disruption budget's maxUnavailable to 0 in my Helm values override file. I then installed the Helm chart to a minikube cluster.
What I saw:
When running helm install, I saw an error that told me the maxUnavailable had been set to -1, not the 0 that I had used.
$ helm install -f helm-consul-values.yaml ./consul-helm
Error: release pouring-monkey failed: PodDisruptionBudget.policy "pouring-monkey-consul-server" is invalid: spec.maxUnavailable: Invalid value: -1: must be greater than or equal to 0
What I expected:
I expected that setting maxUnavailable to 0 would be interpreted by the chart as a valid value.
Other context:
If I set the value to 1, everything works as expected. This suggests that a piece of logic is interpreting 0 as false and fails to use the value. Values 1 and greater are interpreted correctly.
My Helm values file has (this is the config that causes the above error):
server:
  replicas: 1
  bootstrapExpect: 1
  disruptionBudget:
    enabled: true
    maxUnavailable: 0
The issue may be related to code in _helpers.tpl which checks if .Values.server.disruptionBudget.maxUnavailable but should look for if .Values.server.disruptionBudget.enabled.
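That reading fits Go template semantics: if treats the integer 0 as false, so a guard on maxUnavailable discards an explicit 0 and falls back to the computed default, which is -1 when replicas is 1 (div 1 2 = 0; sub 0 1 = -1). A sketch of the suspected guard versus a fix keyed on enabled (hypothetical, not the chart's actual code):

# Suspected pattern: `0` is falsy in a template `if`, so the user's value is dropped.
{{- if .Values.server.disruptionBudget.maxUnavailable }}
maxUnavailable: {{ .Values.server.disruptionBudget.maxUnavailable }}
{{- end }}

# Guarding on the boolean keeps 0 as a legitimate value.
{{- if .Values.server.disruptionBudget.enabled }}
maxUnavailable: {{ .Values.server.disruptionBudget.maxUnavailable }}
{{- end }}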
I opened pretty much this same ticket on consul-k8s, but I thought it was worth a shot to open one here too, so please forgive the copy/paste.
I'm trying to use Vault as CA. I'm running Vault with a self-signed cert. I can add the public CA cert for it to the trusted store on my containers, including the Consul server stateful set pods (modifying the stateful set template a little bit), so I'm able to configure Vault as the CA. However, when the Envoy sidecar injected by consul-k8s is spinning up the proxy, I get this:
[2018-12-18 23:29:52.516][1][info][upstream] source/common/upstream/cluster_manager_impl.cc:494] add/update cluster local_app during init
[2018-12-18 23:29:52.516][1][warning][upstream] source/common/config/grpc_mux_impl.cc:226] gRPC config for type.googleapis.com/envoy.api.v2.Cluster update rejected: Failed to load trusted CA certificates from <inline>
[2018-12-18 23:29:52.516][1][warning][config] bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_mux_subscription_lib/common/config/grpc_mux_subscription_impl.h:70] gRPC config for type.googleapis.com/envoy.api.v2.Cluster rejected: Failed to load trusted CA certificates from <inline>
[2018-12-18 23:29:52.516][1][info][upstream] source/common/upstream/cluster_manager_impl.cc:135] cm init: all clusters initialized
[2018-12-18 23:29:52.516][1][info][main] source/server/server.cc:421] all clusters initialized. initializing init manager
[2018-12-18 23:29:52.518][1][warning][upstream] source/common/config/grpc_mux_impl.cc:226] gRPC config for type.googleapis.com/envoy.api.v2.Listener update rejected: Error adding/updating listener public_listener:100.126.123.255:20000: Failed to load trusted CA certificates from <inline>
[2018-12-18 23:29:52.518][1][warning][config] bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_mux_subscription_lib/common/config/grpc_mux_subscription_impl.h:70] gRPC config for type.googleapis.com/envoy.api.v2.Listener rejected: Error adding/updating listener public_listener:100.126.123.255:20000: Failed to load trusted CA certificates from <inline>
The line Failed to load trusted CA certificates from <inline> makes me think that it's not able to talk to Vault because of the lack of trust. Indeed, if I exec into the sidecar and curl my Vault endpoint, I get the standard "Can't verify server identity" error that goes away with -k.
Is there a way to either pass/inject the public CA cert for the Envoy sidecar to use, or at least set some environment variable to ignore cert verification?
For what it's worth, I'm not entirely sure if this is an issue with Envoy, Consul, this chart, or consul-k8s, but I thought this was a good starting point. Also, it seems to me like something similar is being addressed on the Consul side, but I was wondering if something can be done in the meantime.
We are running the consul chart in a cluster that taints the controller nodes. We would like to run the consul client on the controller nodes as well. In order to do this we need to be able to specify tolerations to those taints.
https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
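A sketch of the values interface being requested (client.tolerations is hypothetical; the chart does not expose it yet):

client:
  tolerations:
    # Allow the client DaemonSet onto tainted controller nodes.
    - key: node-role.kubernetes.io/master
      operator: Exists
      effect: NoSchedule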
We don't want, for now, to register Consul services as K8s services (we are just reaching those using service_name.service.domain), so we are using this config:
syncCatalog:
  # True if you want to enable the catalog sync. "-" for default.
  enabled: true
  image: null
  # toConsul and toK8S control whether syncing is enabled to Consul or K8S
  # as a destination. If both of these are disabled, the sync will do nothing.
  toConsul: true
  toK8S: false
However, having toK8S: false makes consul-sync-catalog fail to start:
consul-sync-catalog-644884767d-b97bn 0/1 Error 2 19s
consul-sync-catalog-644884767d-b97bn 0/1 CrashLoopBackOff 2 20s
consul-sync-catalog-644884767d-b97bn 0/1 Error 3 48s
consul-sync-catalog-644884767d-b97bn 0/1 CrashLoopBackOff 3 1m
consul-sync-catalog-644884767d-b97bn 0/1 Error 4 1m
consul-sync-catalog-644884767d-b97bn 0/1 CrashLoopBackOff 4 1m
consul-sync-catalog-644884767d-b97bn 1/1 Running 5 2m
consul-sync-catalog-644884767d-b97bn 0/1 Error 5 2m
consul-sync-catalog-644884767d-b97bn 0/1 CrashLoopBackOff 5 3m
The logs for that pod are:
โ ~ kubectl logs -f consul-sync-catalog-644884767d-k6th9
2018-10-11T05:38:31.376Z [INFO ] to-consul/sink: ConsulSyncer quitting
ERROR: logging before flag.Parse: E1011 05:38:31.377307 6 controller.go:115] Error syncing cache
2018-10-11T05:38:31.378Z [INFO ] to-consul/source: starting runner for endpoints
ERROR: logging before flag.Parse: E1011 05:38:31.378587 6 controller.go:115] Error syncing cache
When toK8S: true, K8s services are registered in Consul correctly, but as I explained, we don't want the Consul-to-K8S direction to be synced. We are not using CoreDNS for now, and it seems it's required for toK8S.
Running helm install --name consul --namespace=pkr -f dev-consul.yaml ./consul-helm
with these custom values:
syncCatalog:
  enabled: true

server:
  storage: "8Gi"
It completes successfully, but when running kubectl get pods -n pkr, I see that consul-server-1 & consul-server-2 are Pending.
NAME READY STATUS RESTARTS AGE
consul-4vstb 0/1 Running 0 15m
consul-server-0 0/1 Running 0 15m
consul-server-1 0/1 Pending 0 15m
consul-server-2 0/1 Pending 0 15m
consul-sync-catalog-587b6859f6-dpj5v 1/1 Running 0 15m
Closer inspection on pod consul-server-0:
kubectl describe pods consul-server-0 -n pkr
Name: consul-server-0
Namespace: pkr
Node: minikube/10.0.2.15
Start Time: Sat, 29 Sep 2018 22:39:09 +0300
Labels: app=consul
chart=consul-0.1.0
component=server
controller-revision-hash=consul-server-66479c5df5
hasDNS=true
release=consul
statefulset.kubernetes.io/pod-name=consul-server-0
Annotations: consul.hashicorp.com/connect-inject=false
Status: Running
IP: 172.17.0.9
Controlled By: StatefulSet/consul-server
Containers:
consul:
Container ID: docker://7de44c6027bdb78ba4b7bc73643701aa9e0bbb55abce8ce2c7b8e12e2adf82b0
Image: consul:1.2.3
Image ID: docker-pullable://consul@sha256:ea66d17d8c8c1f1afb2138528d62a917093fcd2e3b3a7b216a52c253189ea980
Ports: 8500/TCP, 8301/TCP, 8302/TCP, 8300/TCP, 8600/TCP, 8600/UDP
Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/UDP
Command:
/bin/sh
-ec
CONSUL_FULLNAME="consul"
exec /bin/consul agent \
-advertise="${POD_IP}" \
-bind=0.0.0.0 \
-bootstrap-expect=3 \
-client=0.0.0.0 \
-config-dir=/consul/config \
-datacenter=dc1 \
-data-dir=/consul/data \
-domain=consul \
-hcl="connect { enabled = true }" \
-ui \
-retry-join=${CONSUL_FULLNAME}-server-0.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
-retry-join=${CONSUL_FULLNAME}-server-1.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
-retry-join=${CONSUL_FULLNAME}-server-2.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
-server
State: Running
Started: Sat, 29 Sep 2018 22:39:10 +0300
Ready: False
Restart Count: 0
Readiness: exec [/bin/sh -ec curl http://127.0.0.1:8500/v1/status/leader 2>/dev/null | \
grep -E '".+"'
] delay=5s timeout=5s period=3s #success=1 #failure=2
Environment:
POD_IP: (v1:status.podIP)
NAMESPACE: pkr (v1:metadata.namespace)
Mounts:
/consul/config from config (rw)
/consul/data from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-fd6r9 (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: data-consul-server-0
ReadOnly: false
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: consul-server-config
Optional: false
default-token-fd6r9:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-fd6r9
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 31m (x2 over 31m) default-scheduler pod has unbound PersistentVolumeClaims
Normal Scheduled 31m default-scheduler Successfully assigned consul-server-0 to minikube
Normal SuccessfulMountVolume 31m kubelet, minikube MountVolume.SetUp succeeded for volume "pvc-55188bbe-c41f-11e8-b65d-080027750557"
Normal SuccessfulMountVolume 31m kubelet, minikube MountVolume.SetUp succeeded for volume "config"
Normal SuccessfulMountVolume 31m kubelet, minikube MountVolume.SetUp succeeded for volume "default-token-fd6r9"
Normal Pulled 31m kubelet, minikube Container image "consul:1.2.3" already present on machine
Normal Created 31m kubelet, minikube Created container
Normal Started 31m kubelet, minikube Started container
Warning Unhealthy 16m (x299 over 30m) kubelet, minikube Readiness probe failed:
This "pod has unbound PersistentVolumeClaims" error is same for all consul servers. Yet, when running kubectl get pvc & kubectl get pv, I see persistent volumes fine:
kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-55188bbe-c41f-11e8-b65d-080027750557 8Gi RWO Delete Bound pkr/data-consul-server-0 standard 37m
pvc-55245df6-c41f-11e8-b65d-080027750557 8Gi RWO Delete Bound pkr/data-consul-server-1 standard 37m
pvc-5533ebc2-c41f-11e8-b65d-080027750557 8Gi RWO Delete Bound pkr/data-consul-server-2 standard 37m
kubectl get pvc -n pkr
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
data-consul-server-0 Bound pvc-55188bbe-c41f-11e8-b65d-080027750557 8Gi RWO standard 38m
data-consul-server-1 Bound pvc-55245df6-c41f-11e8-b65d-080027750557 8Gi RWO standard 38m
data-consul-server-2 Bound pvc-5533ebc2-c41f-11e8-b65d-080027750557 8Gi RWO standard 38m
postgresql Bound pvc-56367c46-c41f-11e8-b65d-080027750557 8Gi RWO standard 38m
So, I don't understand the error, since pv & pvc outputs seem fine to me.
How should I debug this further? I've tried deleting the minikube cluster and starting over from scratch, but I get this result every time.
kubectl version
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.3", GitCommit:"a4529464e4629c21224b3d52edfe0ea91b072862", GitTreeState:"clean", BuildDate:"2018-09-10T11:44:36Z", GoVersion:"go1.11", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.0", GitCommit:"fc32d2f3698e36b93322a3465f63a14e9f0eaead", GitTreeState:"clean", BuildDate:"2018-03-26T16:44:10Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
helm version
Client: &version.Version{SemVer:"v2.11.0", GitCommit:"2e55dbe1fdb5fdb96b75ff144a339489417b146b", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.11.0", GitCommit:"2e55dbe1fdb5fdb96b75ff144a339489417b146b", GitTreeState:"clean"}
I have syncCatalog enabled. I registered an outside service on a node using the Consul agent REST API. Kubernetes auto-creates a service (my-service) from the Consul catalog.
How can I access my-service from Kubernetes using a DNS name?
my-service.default returns the pod IP of the consul-agent (DaemonSet), but I need the IP of the node where the consul-agent runs.
Thanks
I am attempting to upgrade our Consul install from chart 0.1.0 to 0.4.0 and am receiving the following error messages:
Error: UPGRADE FAILED: DaemonSet.apps "consul" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app":"consul", "chart":"consul-0.4.0", "component":"client", "hasDNS":"true", "release":"consul"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable
&& StatefulSet.apps "consul-server" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', and 'updateStrategy' are forbidden.
Looks like the label chart is being updated from consul-0.1.0 to consul-0.4.0, which seems to be the norm. However, this is causing the spec.selector.matchLabels.chart section in the StatefulSet and DaemonSet to change, which is an immutable field. I was able to get around this issue by changing the chart version back to 0.1.0 in the Chart.yaml file.
I think we should remove the chart label from any label selectors, as this value will change with chart upgrades.
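A sketch of the shape the selector could take with the version-bearing label dropped (helper names illustrative):

selector:
  matchLabels:
    app: {{ template "consul.name" . }}
    release: {{ .Release.Name }}
    component: client
    # No `chart:` label here -- selectors are immutable, and the chart
    # label changes on every release, breaking `helm upgrade`.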
Will try to make a PR but currently the docs relating to consul-helm here are related to kube-dns:
https://www.consul.io/docs/platform/k8s/dns.html
I believe as of k8s v1.11 CoreDNS is now the standard name server.
https://kubernetes.io/docs/tasks/administer-cluster/dns-custom-nameservers/#coredns
Unfortunately it's a completely different syntax for coredns vs kube-dns. :(
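For reference, a sketch of the equivalent CoreDNS configuration: an extra server block appended to the Corefile in the coredns ConfigMap, pointing at the ClusterIP of the consul-dns service from kubectl get svc (forward is the plugin in recent CoreDNS; older releases used proxy):

apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        # ... existing default configuration stays as-is ...
    }
    consul:53 {
        errors
        cache 30
        forward . 10.3.169.187   # ClusterIP of the consul-dns service
    }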
With the following command, I always get an error on an AKS cluster
helm delete consul --tiller-namespace helm
Error:
Error: deletion completed with 8 error(s): object not found, skipping delete; object not found, skipping delete; object not found, skipping delete; object not found, skipping delete; object not found, skipping delete; object not found, skipping delete; object not found, skipping delete; object not found, skipping delete
Additionally:
What also sometimes happens is that a PVC does not de-provision correctly. I'm not sure if this is an issue with the combination of the Consul Helm chart and AKS, or just AKS. Then I need to do it manually.
When trying to test Consul using the default Helm chart (and when changing a few settings to match the local environment), it never manages to elect a leader.
Looking at the pod statuses, the POD_IP variable is not set; I tracked the status in the Kubernetes UI, and the IP field does not populate for several seconds after the container is created (at which point it does show in the environment variables). This leads to a loop of log errors where the pods cannot resolve each other, so they cannot elect a leader.
This is on a brand new AKS cluster.
The consul-connect-injector-webhook-deployment pod crash-loops with this error in the logs:
flag provided but not defined: -consul-image
In connect-inject-deployment.yaml on line 47 there is -consul-image="{{ default .Values.global.image .Values.connectInject.imageConsul }}" \
Removing this line fixes the error.
I have submitted PR #63 for this
When using the extraVolumes configuration option in values.yaml, the Helm chart fails to start Consul correctly with the error data_dir is empty.
All secrets exist, and when tested by telling the chart to set them to load: false, Consul starts correctly.
Upon further investigation, the issue appears to be a missing \ in the loop that adds the extra config directories for the statefulset and the daemonset:
https://github.com/hashicorp/consul-helm/blob/master/templates/server-statefulset.yaml#L91
https://github.com/hashicorp/consul-helm/blob/master/templates/client-daemonset.yaml#L77
I would submit a Pull Request but the company I work for currently doesn't have a policy in place for contributing to open source software.
I would like to describe my understanding and the current state of the project on this topic. It would be nice to discuss how best to use Hashicorp Consul with Kubernetes across multiple Datacenters.
Unfortunately at the moment there are very few guides on the topic "Multi Datacenter Consul Setup on Kubernetes", so I will try to describe the problems that I discovered and would like to discuss them and find the right solutions.
The first problem I see is "Federation with the WAN Gossip Pool".
Imagine we have two independent networks, each with a k8s cluster using the same CIDR for pods (10.0.0.0/14). K8s nodes of the first network will be created in the 172.100.0.0/24 range and nodes of the second network in 174.200.0.0/24.
For both clusters we use the current helm chart to bootstrap Consul cluster.
If we are going to join these clusters, it would be impossible, because the Consul servers are not reachable from the outside and each consul server has only an internal pod IP address:
A possible solution to this problem might be #27 feat: enable consul-servers to be accessed externally, but what if we didn't want the Consul servers to be accessible with an external IP address? Then we have to open the Consul server ports on the host machine (k8s node) in a way that does not overlap with the Consul client ports that are already open on the host. I created a PR for this: #84
In addition, we also need to allow connections between networks (aka firewall rules) and to configure the custom iptables for target network on each cluster: https://github.com/bowei/k8s-custom-iptables
I'm not sure that k8s-custom-iptables should be added to the current chart.
If we apply these changes, we can join two clusters.
The consul-k8s tool synchronises k8s services with Consul. The problem here is that it is intended only for a single cluster and writes only internal pod addresses into the Consul service catalog. Also, if we create a NodePort service, only the entrypoints of this service will be written to the catalog. So we cannot reach a service from another datacenter or from any non-k8s VM on the same network.
The consul-k8s tool also inserts a sidecar Consul Connect proxy into a pod and provides a secure connection to the service. The problem here is the same - it is intended only for a single cluster, since the Connect proxy service is registered with a private k8s address (pod CIDR range) in the current datacenter.
A possible solution for this problem would be to register the proxy on a random host port or create a k8s NodePort-service for this proxy and sync it to the catalog with an external IP address:
P.S.
We really would like to combine several datacenters into one service mesh and use all consul features across these datacenters, but at this stage of the project it is simply impossible.
I have a 3-replica setup, and for some reason the PDB maxUnavailable was set to 0.
ceil (sub (div (int .Values.server.replicas) 2) 1)
looks good to me, but is it possible the calculation is off in the Go template?
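The calculation is doing exactly what Sprig's integer arithmetic says it should; the surprise is that div truncates before ceil ever runs:

# Evaluated inside-out for replicas = 3:
#   div 3 2  => 1   (integer division, not 1.5)
#   sub 1 1  => 0
#   ceil 0   => 0
# `ceil` never sees a fraction, so a 3-replica cluster gets maxUnavailable: 0.
maxUnavailable: {{ ceil (sub (div (int .Values.server.replicas) 2) 1) }}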
I checked out tag v0.3.0 and found that Chart.yaml still shows version: 0.1.0.
I wonder: when will the next release occur? I also found #26 and forked from 0.3.0 to work around the issue.
I was struggling with getting even the most basic examples from da-connect-demo to work. I posted an issue here, but eventually concluded that the issue was with the chart and not the examples...
I had been observing that service registration was failing and seemed to be the source of all my problems. As to why it was failing...
I had installed / deleted / re-installed the Consul Helm chart a number of times in the `consul-system` namespace. As noted, on every attempt to make the examples work, service registrations failed... silently.
I eventually stumbled upon the realization that things do work when I install the chart in a new namespace. Any time I re-install into a namespace that previously had the chart installed, things go back to not working. Clearly, some state was being left behind that wasn't removed by `helm delete consul --purge`...
What I had initially failed to realize was that when `helm delete consul --purge` deleted the StatefulSet for the Consul server, it did not cascade that delete down to the PVCs...
So, in any case where I deleted and re-installed in a namespace that already contained those PVCs, I was effectively launching a new cluster of Consul servers backed by old data. I'm not clear on why that doesn't work (one would think it should), but it doesn't.
If I take care to manually delete PVCs (or the entire namespace) after `helm delete consul --purge`, it gets me back on track for the next time I re-install into the same namespace.
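A note for anyone else hitting this: the server PVCs come from the StatefulSet's `volumeClaimTemplates`, and PVCs created that way deliberately survive both pod deletion and `helm delete --purge`. Their names follow the claim template and pod name (likely something like `data-<release>-consul-server-N` here; `kubectl get pvc -n consul-system` lists the exact names), and deleting them with `kubectl delete pvc <name>` is the manual cleanup referred to above.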
So...
I'm reporting one of the following two issues: either `helm delete consul --purge` should also clean up the PVCs left behind by the server StatefulSet, or the need to delete them manually afterwards should be documented to help others avoid the surprises I was encountering.

We're wanting to use Consul in GKE. Our current architecture has applications in multiple clusters, depending on their context. We'd like to be able to use the Helm chart to install the server and clients, but the server would reside in its own cluster. There doesn't currently seem to be a way to change the ServiceType for the server and/or client. I believe we'd need to use something other than ClusterIP for our use case (please correct me if I'm wrong; I'd love to see a POC).
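To make the request concrete, a hypothetical `values.yaml` shape for such an option — these keys do not exist in the chart today, and the names are purely illustrative:

```yaml
# Hypothetical values (not currently supported by the chart)
server:
  service:
    type: LoadBalancer   # instead of the current ClusterIP/headless services
client:
  service:
    type: NodePort
```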
Deploying the default Helm chart onto a blank minikube system results in DNS issues.
minikube version: v0.28.2
Kubernetes 1.10
Helm version 2.10.0 (client and server)
macOS 10.13.6
Log output of the first consul-server after launching it:
$ kubectl logs kissable-duck-consul-server-0
bootstrap_expect > 0: expecting 3 servers
==> Starting Consul agent...
==> Consul agent running!
Version: 'v1.2.3'
Node ID: '7a2aeb77-2b45-9c3b-0409-2b6612e949d1'
Node name: 'kissable-duck-consul-server-0'
Datacenter: 'dc1' (Segment: '')
Server: true (Bootstrap: false)
Client Addr: [0.0.0.0] (HTTP: 8500, HTTPS: -1, DNS: 8600)
Cluster Addr: 172.17.0.8 (LAN: 8301, WAN: 8302)
Encrypt: Gossip: false, TLS-Outgoing: false, TLS-Incoming: false
==> Log data will now stream in as it occurs:
2018/09/25 00:03:39 [INFO] raft: Initial configuration (index=0): []
2018/09/25 00:03:39 [INFO] raft: Node at 172.17.0.8:8300 [Follower] entering Follower state (Leader: "")
2018/09/25 00:03:39 [INFO] serf: EventMemberJoin: kissable-duck-consul-server-0.dc1 172.17.0.8
2018/09/25 00:03:39 [INFO] serf: EventMemberJoin: kissable-duck-consul-server-0 172.17.0.8
2018/09/25 00:03:39 [INFO] agent: Started DNS server 0.0.0.0:8600 (udp)
2018/09/25 00:03:39 [INFO] consul: Adding LAN server kissable-duck-consul-server-0 (Addr: tcp/172.17.0.8:8300) (DC: dc1)
2018/09/25 00:03:39 [INFO] consul: Handled member-join event for server "kissable-duck-consul-server-0.dc1" in area "wan"
2018/09/25 00:03:39 [WARN] agent/proxy: running as root, will not start managed proxies
2018/09/25 00:03:39 [INFO] agent: Started DNS server 0.0.0.0:8600 (tcp)
2018/09/25 00:03:39 [INFO] agent: Started HTTP server on [::]:8500 (tcp)
2018/09/25 00:03:39 [INFO] agent: started state syncer
2018/09/25 00:03:39 [INFO] agent: Retry join LAN is supported for: aliyun aws azure digitalocean gce k8s os packet scaleway softlayer triton vsphere
2018/09/25 00:03:39 [INFO] agent: Joining LAN cluster...
2018/09/25 00:03:39 [INFO] agent: (LAN) joining: [kissable-duck-consul-server-0.kissable-duck-consul-server.default.svc kissable-duck-consul-server-1.kissable-duck-consul-server.default.svc kissable-duck-consul-server-2.kissable-duck-consul-server.default.svc]
2018/09/25 00:03:39 [WARN] memberlist: Failed to resolve kissable-duck-consul-server-0.kissable-duck-consul-server.default.svc: lookup kissable-duck-consul-server-0.kissable-duck-consul-server.default.svc on 10.96.0.10:53: no such host
2018/09/25 00:03:39 [WARN] memberlist: Failed to resolve kissable-duck-consul-server-1.kissable-duck-consul-server.default.svc: lookup kissable-duck-consul-server-1.kissable-duck-consul-server.default.svc on 10.96.0.10:53: no such host
2018/09/25 00:03:39 [WARN] memberlist: Failed to resolve kissable-duck-consul-server-2.kissable-duck-consul-server.default.svc: lookup kissable-duck-consul-server-2.kissable-duck-consul-server.default.svc on 10.96.0.10:53: no such host
2018/09/25 00:03:39 [INFO] agent: (LAN) joined: 0 Err: 3 error(s) occurred:
Any ideas what is going wrong?
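One guess at the root cause: the servers join via the headless service's per-pod DNS names, but kube-dns only publishes records for pods that are ready, and with `bootstrap_expect` the servers don't become ready until they have joined — a chicken-and-egg problem. The usual escape hatch is to publish not-ready addresses on the headless server service, roughly like this (a sketch of the general Kubernetes mechanism, not necessarily what this chart renders):

```yaml
# Headless server service that also publishes addresses of
# not-yet-ready pods, so peers can resolve each other during bootstrap.
apiVersion: v1
kind: Service
metadata:
  name: consul-server
  annotations:
    # legacy annotation form used on older clusters
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
spec:
  clusterIP: None
  publishNotReadyAddresses: true
  selector:
    app: consul
    component: server
```

If the chart already sets this, the failure may instead be minikube's kube-dns itself; the retry-join should recover on its own once 10.96.0.10 starts answering.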
Hello,
Enhancement: support for configuring internal AWS ALBs and NLBs would be nice for Consul on AWS EKS.
Reference: https://kubernetes.io/docs/concepts/services-networking/service/#internal-load-balancer
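Concretely, this would amount to letting users attach the standard cloud-provider annotations to the UI service, for example (the annotations are the in-tree Kubernetes ones from the page linked above; whether and how the chart exposes them is exactly the ask):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: consul-ui
  annotations:
    # provision an internal (non-internet-facing) load balancer
    service.beta.kubernetes.io/aws-load-balancer-internal: 0.0.0.0/0
    # use a Network Load Balancer instead of a classic ELB
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
spec:
  type: LoadBalancer
  selector:
    app: consul
    component: ui
  ports:
    - port: 80
      targetPort: 8500
```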
It would be pretty awesome if the chart supported providing a storage class name via the values.yaml file. Is there any chance of having that implemented?
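A sketch of what the wiring could look like in the server StatefulSet's `volumeClaimTemplates`, assuming a new `server.storageClass` value (the value name and file path are illustrative):

```yaml
# templates/server-statefulset.yaml (sketch)
volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: {{ .Values.server.storage }}
      {{- if .Values.server.storageClass }}
      storageClassName: {{ .Values.server.storageClass }}
      {{- end }}
```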
Attempted to set up Consul via Helm using this repo and the directions provided in the blog post.
The pods fail with the following output:
==> Starting Consul agent...
==> Consul agent running!
Version: 'v1.2.3'
Node ID: 'ff2962b2-4429-b113-1ae5-c40f78fb7fb6'
Node name: 'giggly-mouse-consul-4p9zd'
Datacenter: 'dc1' (Segment: '')
Server: false (Bootstrap: false)
Client Addr: [0.0.0.0] (HTTP: 8500, HTTPS: -1, DNS: 8600)
Cluster Addr: 100.110.0.1 (LAN: 8301, WAN: 8302)
Encrypt: Gossip: false, TLS-Outgoing: false, TLS-Incoming: false
==> Log data will now stream in as it occurs:
2018/10/03 17:50:05 [INFO] serf: EventMemberJoin: giggly-mouse-consul-4p9zd 100.110.0.1
2018/10/03 17:50:05 [WARN] agent/proxy: running as root, will not start managed proxies
2018/10/03 17:50:05 [INFO] agent: Started DNS server 0.0.0.0:8600 (udp)
2018/10/03 17:50:05 [INFO] agent: Started DNS server 0.0.0.0:8600 (tcp)
2018/10/03 17:50:05 [INFO] agent: Started HTTP server on [::]:8500 (tcp)
2018/10/03 17:50:05 [INFO] agent: started state syncer
2018/10/03 17:50:05 [INFO] agent: Retry join LAN is supported for: aliyun aws azure digitalocean gce k8s os packet scaleway softlayer triton vsphere
2018/10/03 17:50:05 [INFO] agent: Joining LAN cluster...
2018/10/03 17:50:05 [INFO] agent: (LAN) joining: [giggly-mouse-consul-server-0.giggly-mouse-consul-server.default.svc giggly-mouse-consul-server-1.giggly-mouse-consul-server.default.svc giggly-mouse-consul-server-2.giggly-mouse-consul-server.default.svc]
2018/10/03 17:50:05 [WARN] manager: No servers available
2018/10/03 17:50:05 [ERR] agent: failed to sync remote state: No known Consul servers
2018/10/03 17:50:05 [WARN] memberlist: Failed to resolve giggly-mouse-consul-server-0.giggly-mouse-consul-server.default.svc: lookup giggly-mouse-consul-server-0.giggly-mouse-consul-server.default.svc on 100.64.0.10:53: no such host
2018/10/03 17:50:05 [WARN] memberlist: Failed to resolve giggly-mouse-consul-server-1.giggly-mouse-consul-server.default.svc: lookup giggly-mouse-consul-server-1.giggly-mouse-consul-server.default.svc on 100.64.0.10:53: no such host
2018/10/03 17:50:05 [WARN] memberlist: Failed to resolve giggly-mouse-consul-server-2.giggly-mouse-consul-server.default.svc: lookup giggly-mouse-consul-server-2.giggly-mouse-consul-server.default.svc on 100.64.0.10:53: no such host
2018/10/03 17:50:05 [INFO] agent: (LAN) joined: 0 Err: 3 error(s) occurred:
I was, and still am, able to use the chart at https://github.com/helm/charts/tree/master/stable/consul to run Traefik with Consul.
It is also a bit confusing that one chart exists in helm/charts and another here at hashicorp.
I don't really understand the change in #62: the `resources` block is an object, but it is now piped through the `tpl` function, which expects a string. What is the expected value format to be able to apply something like the following?
resources:
requests:
memory: "10Gi"
limits:
memory: "10Gi"
Hello,
I have an EKS cluster with 3 workers and have deployed Consul and RabbitMQ via Helm. Both services are running, and I am able to access the Consul UI.
Each EKS worker in AWS has a primary private IP, as well as a few secondary IPs, which are used for pods.
I am attempting to get syncCatalog working, but I am getting connection refused:
2018-10-17T23:52:23.572Z [WARN ] to-consul/sink: error registering service: node-name=ip-10-20-132-50.ec2.internal service-name=halting-penguin-rabbitmq err="Put http://10.20.62.49:8500/v1/catalog/register: dial tcp 10.20.62.49:8500: connect: connection refused"
The IP 10.20.62.49 is the primary IP of one of my EKS workers; however, the pod that runs the Consul server (port 8500) is actually on a secondary IP, 10.20.61.120, of that same worker.
kubectl describe pod consul-server-2
Name: consul-server-2
Namespace: default
Node: ip-10-20-62-49.ec2.internal/10.20.62.49
Start Time: Wed, 17 Oct 2018 16:12:01 -0700
Labels: app=consul
chart=consul-0.1.0
component=server
controller-revision-hash=consul-server-66c8b8459c
hasDNS=true
release=consul
statefulset.kubernetes.io/pod-name=consul-server-2
Annotations: consul.hashicorp.com/connect-inject: false
Status: Running
IP: 10.20.61.120
How can I register services with the Consul cluster via the pod IP address, which is on my EKS worker's secondary IP, instead of connecting to the primary IP?
Please let me know if I need to provide more information
Thank you
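A guess at the mechanics, with an illustrative snippet: the error suggests the sync process addresses the Consul agent via the node's address rather than the pod's. In a pod spec those two come from the downward API, and the difference between the fields is exactly the primary-vs-secondary IP split described above (whether the chart uses `status.hostIP` here is my assumption):

```yaml
# Downward API env vars as they might appear in the sync-catalog pod:
# status.hostIP is the node's primary IP (10.20.62.49 above), while
# status.podIP is the pod's own IP (10.20.61.120 above).
env:
  - name: HOST_IP
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP
  - name: POD_IP
    valueFrom:
      fieldRef:
        fieldPath: status.podIP
```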
When I do `helm install .` I get the following error:
Error: parse error in "consul/templates/_helpers.tpl": template: consul/templates/_helpers.tpl:1: function "ceil" not defined
Any idea what I am missing?
helm version
Client: &version.Version{SemVer:"v2.10.0", GitCommit:"9ad53aac42165a5fadc6c87be0dea6b115f93090", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.6.1+unreleased", GitCommit:"c98726f110e2149ba6780f88bc9a3cff22c37923", GitTreeState:"clean"}
Thanks!
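In case it helps whoever triages this: in Helm 2, templates are rendered server-side by Tiller, and the `helm version` output above shows a v2.10.0 client talking to a v2.6.1 server. `ceil` comes from the sprig function set bundled with newer Helm releases, so an old Tiller would not know the function even though the client is current. Upgrading Tiller to match the client (e.g. `helm init --upgrade`) seems like the first thing to try — a diagnosis from the version mismatch above, not a confirmed fix.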
AKS cluster spun up using Terraform in an existing Azure subnet (hard-coded variables substituted in to clarify the configuration):
resource "azurerm_kubernetes_cluster" "aks" {
name = "${local.aks_name}"
location = "eastus"
resource_group_name = "${var.resource_group_name}"
dns_prefix = "${local.dns_prefix}"
kubernetes_version = "1.10.8" # Results in same errors with the AKS default 1.9.9
agent_pool_profile {
name = "default"
count = "5"
vm_size = "Standard_DS2"
os_type = "Linux"
os_disk_size_gb = 30
vnet_subnet_id = "${var.subnet_id}"
}
linux_profile {
admin_username = "localadmin"
ssh_key {
key_data = "${var.ssh_key}"
}
}
service_principal {
client_id = "${var.spn_client}"
client_secret = "${var.spn_secret}"
}
tags {
environment = "${var.environment}"
}
}
Installed with the Helm chart at commit 8b57bed:
helm install --name az1 --namespace consul .
After a few minutes, logs from consul-consul-server-0:
bootstrap_expect > 0: expecting 3 servers
==> Starting Consul agent...
==> Consul agent running!
Version: 'v1.2.3'
Node ID: '441321d9-cf1c-9ea4-08d1-063d3aacb69c'
Node name: 'az1-consul-server-0'
Datacenter: 'dc1' (Segment: '<all>')
Server: true (Bootstrap: false)
Client Addr: [0.0.0.0] (HTTP: 8500, HTTPS: -1, DNS: 8600)
Cluster Addr: 10.244.9.4 (LAN: 8301, WAN: 8302)
Encrypt: Gossip: false, TLS-Outgoing: false, TLS-Incoming: false
==> Log data will now stream in as it occurs:
2018/10/17 01:42:00 [INFO] raft: Initial configuration (index=0): []
2018/10/17 01:42:00 [INFO] raft: Node at 10.244.9.4:8300 [Follower] entering Follower state (Leader: "")
2018/10/17 01:42:00 [INFO] serf: EventMemberJoin: az1-consul-server-0.dc1 10.244.9.4
2018/10/17 01:42:00 [INFO] serf: EventMemberJoin: az1-consul-server-0 10.244.9.4
2018/10/17 01:42:00 [INFO] consul: Handled member-join event for server "az1-consul-server-0.dc1" in area "wan"
2018/10/17 01:42:00 [INFO] consul: Adding LAN server az1-consul-server-0 (Addr: tcp/10.244.9.4:8300) (DC: dc1)
2018/10/17 01:42:00 [INFO] agent: Started DNS server 0.0.0.0:8600 (udp)
2018/10/17 01:42:00 [WARN] agent/proxy: running as root, will not start managed proxies
2018/10/17 01:42:00 [INFO] agent: Started DNS server 0.0.0.0:8600 (tcp)
2018/10/17 01:42:00 [INFO] agent: Started HTTP server on [::]:8500 (tcp)
2018/10/17 01:42:00 [INFO] agent: started state syncer
2018/10/17 01:42:00 [INFO] agent: Retry join LAN is supported for: aliyun aws azure digitalocean gce k8s os packet scaleway softlayer triton vsphere
2018/10/17 01:42:00 [INFO] agent: Joining LAN cluster...
2018/10/17 01:42:00 [INFO] agent: (LAN) joining: [az1-consul-server-0.az1-consul-server.consul.svc az1-consul-server-1.az1-consul-server.consul.svc az1-consul-server-2.az1-consul-server.consul.svc]
2018/10/17 01:42:05 [WARN] raft: no known peers, aborting election
2018/10/17 01:42:07 [ERR] agent: failed to sync remote state: No cluster leader
==> Failed to check for updates: Get https://checkpoint-api.hashicorp.com/v1/check/consul?arch=amd64&os=linux&signature=a447a0f9-f6b7-3a33-a136-81045c4b26d6&version=1.2.3: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2018/10/17 01:42:28 [ERR] agent: Coordinate update error: No cluster leader
2018/10/17 01:42:38 [ERR] agent: failed to sync remote state: No cluster leader
2018/10/17 01:42:52 [ERR] agent: Coordinate update error: No cluster leader
2018/10/17 01:43:06 [INFO] agent: (LAN) joined: 1 Err: <nil>
2018/10/17 01:43:06 [INFO] agent: Join LAN completed. Synced with 1 initial agents
2018/10/17 01:43:06 [ERR] agent: failed to sync remote state: No cluster leader
2018/10/17 01:43:19 [ERR] agent: Coordinate update error: No cluster leader
2018/10/17 01:43:29 [INFO] serf: EventMemberJoin: az1-consul-6lw7d 10.244.9.3
2018/10/17 01:43:41 [ERR] agent: failed to sync remote state: No cluster leader
2018/10/17 01:43:52 [ERR] agent: Coordinate update error: No cluster leader
2018/10/17 01:44:17 [ERR] agent: failed to sync remote state: No cluster leader
2018/10/17 01:44:22 [ERR] agent: Coordinate update error: No cluster leader
2018/10/17 01:44:42 [ERR] agent: failed to sync remote state: No cluster leader
2018/10/17 01:44:49 [ERR] agent: Coordinate update error: No cluster leader
2018/10/17 01:45:15 [ERR] agent: failed to sync remote state: No cluster leader
2018/10/17 01:45:18 [ERR] agent: Coordinate update error: No cluster leader
2018/10/17 01:45:47 [ERR] agent: failed to sync remote state: No cluster leader
2018/10/17 01:45:51 [ERR] agent: Coordinate update error: No cluster leader
2018/10/17 01:46:18 [ERR] agent: failed to sync remote state: No cluster leader
2018/10/17 01:46:26 [ERR] agent: Coordinate update error: No cluster leader
2018/10/17 01:46:44 [ERR] agent: failed to sync remote state: No cluster leader
2018/10/17 01:46:55 [ERR] agent: Coordinate update error: No cluster leader
2018/10/17 01:47:07 [ERR] agent: failed to sync remote state: No cluster leader
2018/10/17 01:47:28 [ERR] agent: Coordinate update error: No cluster leader
2018/10/17 01:47:38 [ERR] agent: failed to sync remote state: No cluster leader
2018/10/17 01:47:58 [ERR] agent: Coordinate update error: No cluster leader
2018/10/17 01:48:03 [ERR] agent: failed to sync remote state: No cluster leader
2018/10/17 01:48:26 [ERR] agent: failed to sync remote state: No cluster leader
2018/10/17 01:48:35 [ERR] agent: Coordinate update error: No cluster leader
2018/10/17 01:49:00 [ERR] agent: Coordinate update error: No cluster leader
2018/10/17 01:49:01 [ERR] agent: failed to sync remote state: No cluster leader
(Pulling up the logs in the Kubernetes UI repeats the same `[ERR]` reports for the last few pages, now that it has been up for over 12 hours.)
kubectl exec into consul-consul-server-0:
/ # consul members
Node Address Status Type Build Protocol DC Segment
az1-consul-server-0 10.244.9.4:8301 alive server 1.2.3 2 dc1 <all>
az1-consul-6lw7d 10.244.9.3:8301 alive client 1.2.3 2 dc1 <default>
kubectl exec into consul-consul-server-2 (`consul members` had equivalent information on server-1):
> kubectl exec az1-consul-server-2 -n consul -it -- /bin/sh
/ # consul members
Node Address Status Type Build Protocol DC Segment
az1-consul-server-2 10.244.8.4:8301 alive server 1.2.3 2 dc1 <all>
/ # consul join az1-consul-server-0
Error joining address 'az1-consul-server-0': Unexpected response code: 500 (1 error(s) occurred:
* Failed to resolve az1-consul-server-0: lookup az1-consul-server-0 on 10.0.0.10:53: read udp 10.244.8.4:56376->10.0.0.10:53: i/o timeout)
Failed to join any nodes.
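Two hedged observations on the transcript above. First, `consul join az1-consul-server-0` uses the bare pod name, which is not a resolvable DNS name by itself; the full headless-service form that the retry-join uses (e.g. `az1-consul-server-0.az1-consul-server.consul.svc`) would be needed. Second, the `i/o timeout` against 10.0.0.10:53 means the pod could not reach kube-dns at all — an NXDOMAIN would fail fast — which points at cluster networking/DNS in the AKS VNET setup rather than at the chart; running something like `nslookup kubernetes.default.svc` from inside a server pod would confirm whether in-cluster DNS works at all.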