coreos-distro's Issues
Using images/containers instead of curl/wget binary
Instead of curl-ing the Kubernetes binary, we should consider using a docker/rkt image instead.
Something like the below:
# kube-apiserver.service
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
Requires=etcd2.service setup-network-environment.service
After=etcd2.service setup-network-environment.service
[Service]
EnvironmentFile=/etc/network-environment
ExecStartPre=-/usr/bin/docker kill api-server
ExecStartPre=-/usr/bin/docker rm api-server
ExecStartPre=/usr/bin/docker pull mattma/kube-apiserver:1.0.1
ExecStart=/usr/bin/docker run --name api-server mattma/kube-apiserver:1.0.1
ExecStop=/usr/bin/docker stop api-server
Restart=always
RestartSec=10
[X-Fleet]
Global=true
MachineMetadata=role=master
The mattma/kube-apiserver:1.0.1 image is located here. To build this image, I used a configuration that matches exactly what we currently have.
The tricky thing to note: when the docker container is created, it runs in an isolated environment, so how does it talk to the outside world? In the current implementation the binary runs on the host machine, so it never runs into this situation.
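One way to settle this (a sketch, assuming host networking is acceptable here, not a tested configuration): run the container with --net=host so the containerized api-server stays in the host's network namespace and can still reach etcd on 127.0.0.1 exactly like the host binary does:

```ini
# kube-apiserver.service (sketch): --net=host is an assumption, not the
# current unit. It shares the host's network namespace with the container,
# so 127.0.0.1 inside the container is the host's etcd.
ExecStart=/usr/bin/docker run \
  --net=host \
  --name api-server \
  mattma/kube-apiserver:1.0.1
```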
Confirm docker lands on flannel's network
flannel.service
starts first to set up the flannel network. All future docker containers on Node machines should land on the flannel network.
It seems to work currently without any configuration, but I am not sure that is the case. Someone needs to confirm that docker.service
works as expected.
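One way to sanity-check this (a sketch; paths and the per-host /24 layout are assumed from a standard flannel setup): flannel writes the host's assigned subnet to /run/flannel/subnet.env, and docker0's address should fall inside it. A small helper that compares the /24 prefixes:

```shell
#!/bin/sh
# Sketch: check whether a docker0 address sits inside the flannel-assigned
# host subnet. Assumes the default layout where each host gets a /24.
#
# On a node you would feed it live values, e.g.:
#   . /run/flannel/subnet.env        # sets FLANNEL_SUBNET (e.g. 10.244.45.1/24)
#   ip -4 -o addr show docker0       # shows docker0's address
in_flannel_subnet() {
  ip_prefix=${1%.*}      # "10.244.45.1"    -> "10.244.45"
  net_prefix=${2%.*}     # "10.244.45.1/24" -> "10.244.45" (strips ".1/24")
  [ "$ip_prefix" = "$net_prefix" ]
}

in_flannel_subnet 10.244.45.1 10.244.45.1/24 && echo "docker0 is on the flannel subnet"
```

If the prefixes differ, docker.service is probably not picking up flannel's --bip setting and the ordering between flannel.service and docker.service needs a second look.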
Three-master setup
Update the README.md
and etcd-user-data
with three-master setup instructions.
One master node works as expected; three master nodes do not.
Question 1: when all three machines boot with the etcd-user-data
file as their cloud-init setting, they all load the static value listen-client-urls: http://0.0.0.0:2379,http://0.0.0.0:4001
. See here.
When I follow the current README doc, on the first master I get:
coreos-01 core # etcdctl cluster-health
cluster is healthy
member 6ae27f9fa2984b1d is healthy
But it did not detect the 2nd master, even though I have the setting (I have done a daemon-reload
on both machines after setting the new value in initial-cluster.conf
):
# /etc/systemd/system/etcd2.service.d/initial-cluster.conf
[Service]
Environment="ETCD_INITIAL_CLUSTER=784767c97ce5410f931f0cfee19523f8=http://172.17.8.101:2380,c1f07ef72ede4399b9fc35d14e5db0a8=http://172.17.9.101:2380"
On the 2nd machine, I followed the same steps as for the one-master node, with the setting above in initial-cluster.conf
. So in this case, master one and master two have exactly the same value in /etc/systemd/system/etcd2.service.d/initial-cluster.conf
.
etcd
on master two is running fine, but I get the error below when running:
node-01 core # etcdctl cluster-health
Error: cannot sync with the cluster using endpoints http://127.0.0.1:4001, http://127.0.0.1:2379
I believe the static values in the cloud-init file are the issue; they may need to be updated as well.
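One thing worth noting (my reading of etcd's bootstrap behavior, not verified against this repo): daemon-reload only re-reads unit files, so etcd2 must also be restarted to pick up the new environment, and ETCD_INITIAL_CLUSTER is only honored at initial bootstrap, with member names that must match each member's ETCD_NAME. A sketch of what each master would carry:

```ini
# /etc/systemd/system/etcd2.service.d/initial-cluster.conf (sketch)
# The names on the left must equal each member's ETCD_NAME (%m = machine id).
[Service]
Environment="ETCD_INITIAL_CLUSTER=784767c97ce5410f931f0cfee19523f8=http://172.17.8.101:2380,c1f07ef72ede4399b9fc35d14e5db0a8=http://172.17.9.101:2380"
Environment="ETCD_INITIAL_CLUSTER_STATE=new"
```

After editing, run sudo systemctl daemon-reload && sudo systemctl restart etcd2 on both machines. If a member has already bootstrapped itself as a single-node cluster, it keeps that view in its data dir (/var/lib/etcd2), so either clear the data dir before restarting or add the second member at runtime with etcdctl member add.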
Another question: if I set the environment variable ROLE
, e.g. ROLE=master with vagrant up
, it loads etcd-user-data
. And ROLE=node loads user-data
. This is legacy stuff, right? Will it always load etcd-user-data
in this case?
Kube-kubelet does not set up correctly with `dns`
core@kube-node-02 ~ $ sudo journalctl -u kube-kubelet
-- Logs begin at Fri 2015-08-21 18:05:09 UTC, end at Fri 2015-08-21 19:40:15 UTC. --
Aug 21 18:11:10 kube-node-02 systemd[1]: [/run/fleet/units/kube-kubelet.service:15] Unknown lvalue '--cluster_dns' in section 'Service'
Aug 21 18:11:10 kube-node-02 systemd[1]: [/run/fleet/units/kube-kubelet.service:16] Unknown lvalue '--cluster_domain' in section 'Service'
Aug 21 18:11:10 kube-node-02 systemd[1]: Starting Kubernetes Kubelet...
Aug 21 18:11:10 kube-node-02 rm[1452]: /usr/bin/rm: cannot remove '/opt/bin/kubelet': No such file or directory
Aug 21 18:11:10 kube-node-02 curl[1454]: Warning: Illegal date format for -z, --timecond (and not a file name).
Aug 21 18:11:10 kube-node-02 curl[1454]: Warning: Disabling time condition. See curl_getdate(3) for valid date syntax.
Aug 21 18:11:10 kube-node-02 curl[1454]: % Total % Received % Xferd Average Speed Time Time Time Current
Aug 21 18:11:10 kube-node-02 curl[1454]: Dload Upload Total Spent Left Speed
Aug 21 18:11:14 kube-node-02 curl[1454]: [471B blob data]
Aug 21 18:11:14 kube-node-02 systemd[1]: Started Kubernetes Kubelet.
Aug 21 18:11:14 kube-node-02 kubelet[1476]: W0821 18:11:14.227446 1476 server.go:462] Could not load kubeconfig file /var/lib/kubelet/kubeconfig: stat /var/lib/kubelet/kubeconfig: no such file or directory. Trying auth path instead.
Aug 21 18:11:14 kube-node-02 kubelet[1476]: W0821 18:11:14.227710 1476 server.go:424] Could not load kubernetes auth path /var/lib/kubelet/kubernetes_auth: stat /var/lib/kubelet/kubernetes_auth: no such file or directory. Continuing with defaults.
Aug 21 18:11:14 kube-node-02 kubelet[1476]: I0821 18:11:14.227882 1476 manager.go:127] cAdvisor running in container: "/system.slice"
Aug 21 18:11:14 kube-node-02 kubelet[1476]: I0821 18:11:14.228296 1476 fs.go:93] Filesystem partitions: map[/dev/sda9:{mountpoint:/ major:8 minor:9} /dev/sda3:{mountpoint:/usr major:8 minor:3} /dev/sda6:{mountpoint:/usr/share/oem major:8 minor:6}]
Aug 21 18:11:14 kube-node-02 kubelet[1476]: I0821 18:11:14.229182 1476 manager.go:156] Machine: {NumCores:1 CpuFrequency:2798419 MemoryCapacity:1045966848 MachineID:305215dab8894a50a74d3e5a305c8396 SystemUUID:C7BCCB48-62B8-4A24-B7A6-1428FCEEF09D BootID
Aug 21 18:11:14 kube-node-02 kubelet[1476]: I0821 18:11:14.231838 1476 manager.go:163] Version: {KernelVersion:4.1.5-coreos ContainerOsVersion:CoreOS 779.0.0 DockerVersion:1.7.1 CadvisorVersion:0.15.1}
Aug 21 18:11:14 kube-node-02 kubelet[1476]: I0821 18:11:14.232189 1476 plugins.go:69] No cloud provider specified.
Aug 21 18:11:15 kube-node-02 kubelet[1476]: I0821 18:11:15.076494 1476 docker.go:295] Connecting to docker on unix:///var/run/docker.sock
Aug 21 18:11:15 kube-node-02 kubelet[1476]: I0821 18:11:15.076870 1476 server.go:661] Watching apiserver
Aug 21 18:11:15 kube-node-02 kubelet[1476]: I0821 18:11:15.121402 1476 plugins.go:56] Registering credential provider: .dockercfg
Aug 21 18:11:15 kube-node-02 kubelet[1476]: I0821 18:11:15.127058 1476 server.go:623] Started kubelet
Aug 21 18:11:15 kube-node-02 kubelet[1476]: E0821 18:11:15.127933 1476 kubelet.go:682] Image garbage collection failed: unable to find data for container /
Aug 21 18:11:15 kube-node-02 kubelet[1476]: I0821 18:11:15.133172 1476 kubelet.go:702] Running in container "/kubelet"
Aug 21 18:11:15 kube-node-02 kubelet[1476]: I0821 18:11:15.133251 1476 server.go:63] Starting to listen on 0.0.0.0:10250
Aug 21 18:11:15 kube-node-02 kubelet[1476]: I0821 18:11:15.291570 1476 factory.go:226] System is using systemd
Aug 21 18:11:15 kube-node-02 kubelet[1476]: I0821 18:11:15.292315 1476 factory.go:234] Registering Docker factory
Aug 21 18:11:15 kube-node-02 kubelet[1476]: I0821 18:11:15.292856 1476 factory.go:89] Registering Raw factory
Aug 21 18:11:15 kube-node-02 kubelet[1476]: I0821 18:11:15.307898 1476 kubelet.go:821] Successfully registered node 172.17.8.102
Aug 21 18:11:15 kube-node-02 kubelet[1476]: I0821 18:11:15.342549 1476 manager.go:946] Started watching for new ooms in manager
Aug 21 18:11:15 kube-node-02 kubelet[1476]: I0821 18:11:15.342760 1476 oomparser.go:183] oomparser using systemd
Aug 21 18:11:15 kube-node-02 kubelet[1476]: I0821 18:11:15.345129 1476 manager.go:243] Starting recovery of all containers
Aug 21 18:11:15 kube-node-02 kubelet[1476]: W0821 18:11:15.359541 1476 container.go:255] Failed to create summary reader for "/system.slice/sys-kernel-debug.mount": none of the resources are being tracked.
Aug 21 18:11:15 kube-node-02 kubelet[1476]: W0821 18:11:15.360516 1476 container.go:255] Failed to create summary reader for "/system.slice/system-systemd\\x2dfsck.slice": none of the resources are being tracked.
Aug 21 18:11:15 kube-node-02 kubelet[1476]: W0821 18:11:15.362517 1476 container.go:255] Failed to create summary reader for "/system.slice/systemd-vconsole-setup.service": none of the resources are being tracked.
Aug 21 18:11:15 kube-node-02 kubelet[1476]: W0821 18:11:15.364128 1476 container.go:255] Failed to create summary reader for "/system.slice/tmp.mount": none of the resources are being tracked.
Aug 21 18:11:15 kube-node-02 kubelet[1476]: W0821 18:11:15.368040 1476 container.go:255] Failed to create summary reader for "/system.slice/dev-mqueue.mount": none of the resources are being tracked.
Aug 21 18:11:15 kube-node-02 kubelet[1476]: W0821 18:11:15.368972 1476 container.go:255] Failed to create summary reader for "/system.slice/etcd2.service": none of the resources are being tracked.
Aug 21 18:11:15 kube-node-02 kubelet[1476]: W0821 18:11:15.370248 1476 container.go:255] Failed to create summary reader for "/system.slice/fleet.service": none of the resources are being tracked.
Aug 21 18:11:15 kube-node-02 kubelet[1476]: W0821 18:11:15.371185 1476 container.go:255] Failed to create summary reader for "/system.slice/kube-kubelet.service": none of the resources are being tracked.
Aug 21 18:11:15 kube-node-02 kubelet[1476]: W0821 18:11:15.372476 1476 container.go:255] Failed to create summary reader for "/system.slice/ldconfig.service": none of the resources are being tracked.
Aug 21 18:11:15 kube-node-02 kubelet[1476]: W0821 18:11:15.373742 1476 container.go:255] Failed to create summary reader for "/system.slice/rpc-statd.service": none of the resources are being tracked.
Aug 21 18:11:15 kube-node-02 kubelet[1476]: W0821 18:11:15.379591 1476 container.go:255] Failed to create summary reader for "/system.slice/boot.mount": none of the resources are being tracked.
Aug 21 18:11:15 kube-node-02 kubelet[1476]: W0821 18:11:15.383116 1476 container.go:255] Failed to create summary reader for "/user.slice": none of the resources are being tracked.
Aug 21 18:11:15 kube-node-02 kubelet[1476]: W0821 18:11:15.385733 1476 container.go:255] Failed to create summary reader for "/system.slice/rpcbind.service": none of the resources are being tracked.
Aug 21 18:11:15 kube-node-02 kubelet[1476]: W0821 18:11:15.388105 1476 container.go:255] Failed to create summary reader for "/system.slice/systemd-tmpfiles-setup.service": none of the resources are being tracked.
Aug 21 18:11:15 kube-node-02 kubelet[1476]: W0821 18:11:15.389048 1476 container.go:255] Failed to create summary reader for "/system.slice/system-addon\\x2dconfig.slice": none of the resources are being tracked.
Aug 21 18:11:15 kube-node-02 kubelet[1476]: W0821 18:11:15.391693 1476 container.go:255] Failed to create summary reader for "/system.slice/systemd-journal-catalog-update.service": none of the resources are being tracked.
Aug 21 18:11:15 kube-node-02 kubelet[1476]: W0821 18:11:15.395006 1476 container.go:255] Failed to create summary reader for "/system.slice/audit-rules.service": none of the resources are being tracked.
Aug 21 18:11:15 kube-node-02 kubelet[1476]: W0821 18:11:15.396578 1476 container.go:255] Failed to create summary reader for "/system.slice/docker.service": none of the resources are being tracked.
Aug 21 18:11:15 kube-node-02 kubelet[1476]: W0821 18:11:15.398537 1476 container.go:255] Failed to create summary reader for "/system.slice/media.mount": none of the resources are being tracked.
Aug 21 18:11:15 kube-node-02 kubelet[1476]: W0821 18:11:15.398992 1476 container.go:255] Failed to create summary reader for "/system.slice/setup-network-environment.service": none of the resources are being tracked.
core@kube-node-02 ~ $ sudo systemctl status kube-kubelet
● kube-kubelet.service - Kubernetes Kubelet
Loaded: loaded (/run/fleet/units/kube-kubelet.service; linked-runtime; vendor preset: disabled)
Active: active (running) since Fri 2015-08-21 18:11:14 UTC; 1h 31min ago
Docs: https://github.com/GoogleCloudPlatform/kubernetes,http://kubernetes.io/v1.0/docs/admin/kubelet.html
Process: 1475 ExecStartPre=/usr/bin/mkdir -p /opt/kubernetes/manifests/ (code=exited, status=0/SUCCESS)
Process: 1472 ExecStartPre=/usr/bin/chmod +x /opt/bin/kubelet (code=exited, status=0/SUCCESS)
Process: 1454 ExecStartPre=/usr/bin/curl -L -o /opt/bin/kubelet -z /opt/bin/kubelet https://storage.googleapis.com/kubernetes-release/release/v1.0.1/bin/linux/amd64/kubelet (code=exited, status=0/SUCCESS)
Process: 1452 ExecStartPre=/usr/bin/rm /opt/bin/kubelet (code=exited, status=1/FAILURE)
Process: 1449 ExecStartPre=/usr/bin/mkdir -p /opt/bin (code=exited, status=0/SUCCESS)
Main PID: 1476 (kubelet)
CGroup: /system.slice/kube-kubelet.service
├─1476 /opt/bin/kubelet --address=0.0.0.0 --port=10250 --hostname_override=172.17.8.102 --api_servers=http://172.17.8.100:8080 --allow_privileged=true # cluster_dns matches `setup/dns/dns-service.yaml` @ `spec.clusterIP`
└─1492 journalctl -f
Aug 21 19:33:16 kube-node-02 kubelet[1476]: W0821 19:33:16.134458 1476 container.go:255] Failed to create summary reader for "/system.slice/motdgen.service": none of the resources are being tracked.
Aug 21 19:34:16 kube-node-02 kubelet[1476]: W0821 19:34:16.154107 1476 container.go:255] Failed to create summary reader for "/system.slice/motdgen.service": none of the resources are being tracked.
Aug 21 19:35:16 kube-node-02 kubelet[1476]: W0821 19:35:16.194820 1476 container.go:255] Failed to create summary reader for "/system.slice/motdgen.service": none of the resources are being tracked.
Aug 21 19:36:16 kube-node-02 kubelet[1476]: W0821 19:36:16.212533 1476 container.go:255] Failed to create summary reader for "/system.slice/motdgen.service": none of the resources are being tracked.
Aug 21 19:37:16 kube-node-02 kubelet[1476]: W0821 19:37:16.232000 1476 container.go:255] Failed to create summary reader for "/system.slice/motdgen.service": none of the resources are being tracked.
Aug 21 19:38:16 kube-node-02 kubelet[1476]: W0821 19:38:16.247220 1476 container.go:255] Failed to create summary reader for "/system.slice/motdgen.service": none of the resources are being tracked.
Aug 21 19:39:16 kube-node-02 kubelet[1476]: W0821 19:39:16.269459 1476 container.go:255] Failed to create summary reader for "/system.slice/motdgen.service": none of the resources are being tracked.
Aug 21 19:40:16 kube-node-02 kubelet[1476]: W0821 19:40:16.281662 1476 container.go:255] Failed to create summary reader for "/system.slice/motdgen.service": none of the resources are being tracked.
Aug 21 19:41:16 kube-node-02 kubelet[1476]: W0821 19:41:16.299104 1476 container.go:255] Failed to create summary reader for "/system.slice/motdgen.service": none of the resources are being tracked.
Aug 21 19:42:16 kube-node-02 kubelet[1476]: W0821 19:42:16.314775 1476 container.go:255] Failed to create summary reader for "/system.slice/motdgen.service": none of the resources are being tracked.
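The two "Unknown lvalue" warnings at the top of this journal suggest that --cluster_dns and --cluster_domain ended up as standalone lines in the [Service] section (systemd parses each line there as key=value), most likely because a trailing backslash was missing on the ExecStart continuation. A sketch of the fix; the DNS IP and domain below are assumed values that would need to match setup/dns/dns-service.yaml's spec.clusterIP:

```ini
# kube-kubelet.service (sketch): flags must stay on the ExecStart line,
# joined by trailing backslashes. The cluster_dns/cluster_domain values
# here are assumptions, not the repo's confirmed settings.
[Service]
ExecStart=/opt/bin/kubelet \
  --address=0.0.0.0 \
  --port=10250 \
  --hostname_override=172.17.8.102 \
  --api_servers=http://172.17.8.100:8080 \
  --allow_privileged=true \
  --cluster_dns=10.100.0.10 \
  --cluster_domain=cluster.local
```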
Latest new branch fails to start kube-proxy and kubelet on the Node machine
The branch is the latest and greatest. Whatever is in the repo now works, since kube-proxy
and kubelet
still use the static binary instead of the image. To reproduce the issue, simply copy the service file below over what is in kube-proxy
:
#kube-proxy.service
[Unit]
Description=Kubernetes Proxy
Documentation=https://github.com/GoogleCloudPlatform/kubernetes,http://kubernetes.io/v1.0/docs/admin/kube-proxy.html
Requires=setup-network-environment.service
After=setup-network-environment.service
[Service]
EnvironmentFile=/etc/sysconfig/kubernetes-config
ExecStartPre=-/usr/bin/docker kill kube-proxy
ExecStartPre=-/usr/bin/docker rm kube-proxy
ExecStartPre=/usr/bin/docker pull mattma/kube-proxy:${KUBERNETES_VERSION}
ExecStart=/usr/bin/docker run \
--net=host \
--name kube-proxy \
mattma/kube-proxy:${KUBERNETES_VERSION} \
--master=http://${API_SERVER_IP}:${INSECURE_PORT} \
--logtostderr=true
Restart=always
RestartSec=10
[X-Fleet]
Global=true
MachineMetadata=role=node
Error response from daemon: Cannot start container f4e9a61799493a04da1f0fea940ce1d12dd08ac6724684b0d18d51fd01e4c147: [8] System error: no such file or directory
core@kube-node-02 ~ $ docker logs f4e9a6179949
no such file or directory
Which file or directory is missing?
All instructions work on the Master machine, but a similar setting does not work on the Node machine; see the error above. I could not figure out why. @yichengq
One guess (unconfirmed): if the image's entrypoint binary is dynamically linked but the image was built FROM scratch, docker reports "no such file or directory" for the missing dynamic linker rather than for the binary itself. Running docker inspect f4e9a6179949 and checking the image's Entrypoint path would be a first step.
Roadmap to stable v1.0 release
- Currently, if you run the update-demo example in the cluster, it works in all cases, except that it cannot access the pod data through the API server endpoint via proxy/namespaces/default/pods/" + server.podId + "/data.json. Compare with the original update-demo in the Kubernetes repo.
- Use images/containers instead of the curl/wget binary. See Issue 5.
- Configuration instead of hard-coded values. See Issue 7.
- Users need authentication to access the API server, via secret, ssh key, etc.
- Test on production machines. Form a cluster to see how it performs via the kubectl command from a local machine.
- Write documentation for deploying to a production environment, e.g. DigitalOcean, AWS, etc.
- Three master nodes for high availability, based on the design goal.
- Provide an easy update path when a newer version of Kubernetes is released, via an Environment Variable.
- The current implementation of the dns controller is borrowed from here. We need someone knowledgeable on this topic to review it.
- HTTP proxy service. Do we need this proxy service?
- Do we need rpcbind.service and rpc-statd.service?
- Someone with an in-depth Kubernetes background should review all the unit files.
- Support for service-account and tokens. See here.
Kube-proxy gets error status on WatchServices and WatchEndpoints
core@kube-node-02 ~ $ sudo systemctl status -l kube-proxy
● kube-proxy.service - Kubernetes Proxy
Loaded: loaded (/run/fleet/units/kube-proxy.service; linked-runtime; vendor preset: disabled)
Active: active (running) since Thu 2015-08-20 03:53:41 UTC; 1h 22min ago
Docs: https://github.com/GoogleCloudPlatform/kubernetes,http://kubernetes.io/v1.0/docs/admin/kube-proxy.html
Process: 4010 ExecStartPre=/usr/bin/chmod +x /opt/bin/kube-proxy (code=exited, status=0/SUCCESS)
Process: 3994 ExecStartPre=/usr/bin/curl -L -o /opt/bin/kube-proxy -z /opt/bin/kube-proxy https://storage.googleapis.com/kubernetes-release/release/v1.0.1/bin/linux/amd64/kube-proxy (code=exited, status=0/SUCCESS)
Process: 3992 ExecStartPre=/usr/bin/rm /opt/bin/kube-proxy (code=exited, status=0/SUCCESS)
Process: 3989 ExecStartPre=/usr/bin/mkdir -p /opt/bin (code=exited, status=0/SUCCESS)
Main PID: 4013 (kube-proxy)
Memory: 4.3M
CPU: 9.706s
CGroup: /system.slice/kube-proxy.service
└─4013 /opt/bin/kube-proxy --master=http://172.17.8.100:8080 --logtostderr=true
Aug 20 03:53:13 kube-node-02 systemd[1]: Starting Kubernetes Proxy...
Aug 20 03:53:13 kube-node-02 curl[3994]: Warning: Illegal date format for -z, --timecond (and not a file name).
Aug 20 03:53:13 kube-node-02 curl[3994]: Warning: Disabling time condition. See curl_getdate(3) for valid date syntax.
Aug 20 03:53:13 kube-node-02 curl[3994]: % Total % Received % Xferd Average Speed Time Time Time Current
Aug 20 03:53:13 kube-node-02 curl[3994]: Dload Upload Total Spent Left Speed
Aug 20 03:53:38 kube-node-02 curl[3994]: [1.9K blob data]
Aug 20 03:53:41 kube-node-02 curl[3994]: [320B blob data]
Aug 20 03:53:41 kube-node-02 systemd[1]: Started Kubernetes Proxy.
Aug 20 05:02:04 kube-node-02 kube-proxy[4013]: W0820 05:02:04.397432 4013 api.go:153] Got error status on WatchServices channel: &{TypeMeta:{Kind: APIVersion:} ListMeta:{SelfLink: ResourceVersion:} Status:Failure Message:401: The event in requested index is outdated and cleared (the requested history has been cleared [58876/51366]) [59875] Reason: Details:<nil> Code:0}
Aug 20 05:06:22 kube-node-02 kube-proxy[4013]: W0820 05:06:22.938093 4013 api.go:224] Got error status on WatchEndpoints channel: &{TypeMeta:{Kind: APIVersion:} ListMeta:{SelfLink: ResourceVersion:} Status:Failure Message:401: The event in requested index is outdated and cleared (the requested history has been cleared [59748/59285]) [60747] Reason: Details:<nil> Code:0}
core@kube-node-02 ~ $ sudo journalctl -u kube-proxy
-- Logs begin at Tue 2015-08-18 17:51:34 UTC, end at Thu 2015-08-20 05:23:19 UTC. --
Aug 18 17:57:16 kube-node-02 systemd[1]: Starting Kubernetes Proxy...
Aug 18 17:57:16 kube-node-02 rm[1358]: /usr/bin/rm: cannot remove '/opt/bin/kube-proxy': No such file or directory
Aug 18 17:57:16 kube-node-02 curl[1361]: Warning: Illegal date format for -z, --timecond (and not a file name).
Aug 18 17:57:16 kube-node-02 curl[1361]: Warning: Disabling time condition. See curl_getdate(3) for valid date syntax.
Aug 18 17:57:16 kube-node-02 curl[1361]: % Total % Received % Xferd Average Speed Time Time Time Current
Aug 18 17:57:16 kube-node-02 curl[1361]: Dload Upload Total Spent Left Speed
Aug 18 17:57:17 kube-node-02 curl[1361]: [234B blob data]
Aug 18 17:57:17 kube-node-02 systemd[1]: Started Kubernetes Proxy.
Aug 18 18:51:38 kube-node-02 kube-proxy[1364]: W0818 18:51:38.734295 1364 api.go:153] Got error status on WatchServices channel: &{TypeMeta:{Kind: APIVersion:} ListMeta:{SelfLink: ResourceVersion:} Status:Failure Message:401: The event in requested in
Aug 18 19:05:36 kube-node-02 kube-proxy[1364]: W0818 19:05:36.534740 1364 api.go:224] Got error status on WatchEndpoints channel: &{TypeMeta:{Kind: APIVersion:} ListMeta:{SelfLink: ResourceVersion:} Status:Failure Message:401: The event in requested i
Aug 18 20:01:28 kube-node-02 kube-proxy[1364]: W0818 20:01:28.246425 1364 api.go:153] Got error status on WatchServices channel: &{TypeMeta:{Kind: APIVersion:} ListMeta:{SelfLink: ResourceVersion:} Status:Failure Message:401: The event in requested in
Aug 18 20:04:27 kube-node-02 kube-proxy[1364]: W0818 20:04:27.595492 1364 api.go:224] Got error status on WatchEndpoints channel: &{TypeMeta:{Kind: APIVersion:} ListMeta:{SelfLink: ResourceVersion:} Status:Failure Message:401: The event in requested i
Aug 18 22:35:56 kube-node-02 kube-proxy[1364]: W0818 22:35:56.749712 1364 api.go:153] Got error status on WatchServices channel: &{TypeMeta:{Kind: APIVersion:} ListMeta:{SelfLink: ResourceVersion:} Status:Failure Message:401: The event in requested in
Aug 18 23:03:50 kube-node-02 kube-proxy[1364]: W0818 23:03:50.948246 1364 api.go:224] Got error status on WatchEndpoints channel: &{TypeMeta:{Kind: APIVersion:} ListMeta:{SelfLink: ResourceVersion:} Status:Failure Message:401: The event in requested i
Aug 19 05:43:46 kube-node-02 kube-proxy[1364]: W0819 05:43:46.742629 1364 api.go:224] Got error status on WatchEndpoints channel: &{TypeMeta:{Kind: APIVersion:} ListMeta:{SelfLink: ResourceVersion:} Status:Failure Message:401: The event in requested i
Aug 19 05:46:01 kube-node-02 kube-proxy[1364]: W0819 05:46:01.276691 1364 api.go:153] Got error status on WatchServices channel: &{TypeMeta:{Kind: APIVersion:} ListMeta:{SelfLink: ResourceVersion:} Status:Failure Message:401: The event in requested in
Aug 19 06:30:58 kube-node-02 kube-proxy[1364]: W0819 06:30:58.203745 1364 api.go:153] Got error status on WatchServices channel: &{TypeMeta:{Kind: APIVersion:} ListMeta:{SelfLink: ResourceVersion:} Status:Failure Message:401: The event in requested in
Aug 19 06:40:10 kube-node-02 kube-proxy[1364]: W0819 06:40:10.843763 1364 api.go:224] Got error status on WatchEndpoints channel: &{TypeMeta:{Kind: APIVersion:} ListMeta:{SelfLink: ResourceVersion:} Status:Failure Message:401: The event in requested i
Aug 20 03:46:46 kube-node-02 kube-proxy[1364]: E0820 03:46:46.830407 1364 proxysocket.go:99] Dial failed: dial tcp 10.244.45.2:27017: connection refused
Aug 20 03:46:46 kube-node-02 kube-proxy[1364]: E0820 03:46:46.831744 1364 proxysocket.go:99] Dial failed: dial tcp 10.244.45.2:27017: connection refused
Aug 20 03:46:46 kube-node-02 kube-proxy[1364]: E0820 03:46:46.831787 1364 proxysocket.go:99] Dial failed: dial tcp 10.244.45.2:27017: connection refused
Aug 20 03:46:46 kube-node-02 kube-proxy[1364]: E0820 03:46:46.831823 1364 proxysocket.go:99] Dial failed: dial tcp 10.244.45.2:27017: connection refused
Aug 20 03:46:46 kube-node-02 kube-proxy[1364]: E0820 03:46:46.833005 1364 proxysocket.go:133] Failed to connect to balancer: failed to connect to an endpoint.
Aug 20 03:46:46 kube-node-02 kube-proxy[1364]: E0820 03:46:46.834083 1364 proxysocket.go:99] Dial failed: dial tcp 10.244.45.2:27017: connection refused
Aug 20 03:46:46 kube-node-02 kube-proxy[1364]: E0820 03:46:46.834127 1364 proxysocket.go:99] Dial failed: dial tcp 10.244.45.2:27017: connection refused
Aug 20 03:46:46 kube-node-02 kube-proxy[1364]: E0820 03:46:46.834162 1364 proxysocket.go:99] Dial failed: dial tcp 10.244.45.2:27017: connection refused
Aug 20 03:46:46 kube-node-02 kube-proxy[1364]: E0820 03:46:46.834194 1364 proxysocket.go:99] Dial failed: dial tcp 10.244.45.2:27017: connection refused
Aug 20 03:46:46 kube-node-02 kube-proxy[1364]: E0820 03:46:46.834201 1364 proxysocket.go:133] Failed to connect to balancer: failed to connect to an endpoint.
....
Refer to issue 9713 and issue 9310.
kubernetes-ro service is missing
kubectl get svc
Only kubernetes
shows up; kubernetes-ro
is missing at the moment.
Configuration instead of hard-coded values
Case 1:
localhost usage; it is used throughout the api-server
, controller-manager
, and scheduler
:
--etcd_servers=http://127.0.0.1:2379,http://127.0.0.1:4001 \
Should we use an environment variable instead, like $private_ipv4?
Case 2:
The insecure port indicates the open port on the master node, so a user can hit http://172.17.8.100:8080
to check api-server
health or schedule something. Several other services use it to talk to the api-server
. We should make the port
configurable instead of hard-coding the value.
Case 3
The cluster IP range is used by the flannel
network overlay and is saved into etcd
after flannel.service
starts. This should be a variable.
Case 4
The api-server public address. Can this one be something like --api_servers=http://${ETH1_IPV4}:8080 \
?
Case 5
cluster-domain and cluster-dns are shared by yaml
and service
files; how could we introduce a variable that works across the different file formats?
In general, we should use environment variables instead of static values wherever possible, and prefer environment variables over user-configurable values; where we absolutely need a user-configurable value, we should expose it cleanly and intuitively.
In conclusion, we should make it easy to go to production with as few configurable values as possible. I think the absolutely required ones are the master ip address
, used to talk to the api-server
, and the master machine id
, used in the node's
etcd
cluster setting.
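The kube-proxy unit above already reads EnvironmentFile=/etc/sysconfig/kubernetes-config; extending that file is one way to cover the cases above. A sketch follows: the names beyond KUBERNETES_VERSION, API_SERVER_IP, and INSECURE_PORT are hypothetical, and all values are examples, not the repo's defaults.

```
# /etc/sysconfig/kubernetes-config (sketch; example values)
KUBERNETES_VERSION=1.0.1
API_SERVER_IP=172.17.8.100        # Case 4: api-server public address
INSECURE_PORT=8080                # Case 2: api-server insecure port
ETCD_SERVERS=http://127.0.0.1:2379,http://127.0.0.1:4001   # Case 1 (hypothetical name)
FLANNEL_NETWORK=10.244.0.0/16     # Case 3: cluster ip range (hypothetical name)
CLUSTER_DNS=10.100.0.10           # Case 5 (hypothetical name)
CLUSTER_DOMAIN=cluster.local      # Case 5 (hypothetical name)
```

Case 5's yaml files cannot read an EnvironmentFile directly, so they would still need a templating step (e.g. envsubst) at deploy time.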
etcd2.service status=1/FAILURE
To reproduce the issue, follow these steps:
sudo systemctl cat etcd2
# /etc/systemd/system/etcd2.service
[Install]
WantedBy=default.target
[Unit]
Description=etcd2
Conflicts=etcd.service
[Service]
User=etcd
Environment=ETCD_DATA_DIR=/var/lib/etcd2
Environment=ETCD_NAME=%m
ExecStart=/usr/bin/etcd2
Restart=always
RestartSec=10s
LimitNOFILE=40000
# /run/systemd/system/etcd2.service.d/20-cloudinit.conf
[Service]
Environment="ETCD_ADVERTISE_CLIENT_URLS=http://172.17.8.101:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=http://172.17.8.101:2380"
Environment="ETCD_INITIAL_CLUSTER_STATE=new"
Environment="ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379,http://0.0.0.0:4001"
Environment="ETCD_LISTEN_PEER_URLS=http://172.17.8.101:2380,http://172.17.8.101:7001"
sudo systemctl start etcd2
Success: the command returns nothing.
sudo systemctl enable etcd2
Created symlink from /etc/systemd/system/default.target.wants/etcd2.service to /etc/systemd/system/etcd2.service.
sudo systemctl status etcd2
● etcd2.service - etcd2
Loaded: loaded (/etc/systemd/system/etcd2.service; enabled; vendor preset: disabled)
Drop-In: /run/systemd/system/etcd2.service.d
└─20-cloudinit.conf
Active: activating (auto-restart) (Result: exit-code) since Fri 2015-07-31 02:17:52 UTC; 11s ago
Main PID: 1078 (code=exited, status=1/FAILURE)
Memory: 0B
CPU: 0
CGroup: /system.slice/etcd2.service
Jul 31 02:17:52 coreos-01 systemd[1]: etcd2.service: Main process exited, code=exited, status=1/FAILURE
Jul 31 02:17:52 coreos-01 systemd[1]: etcd2.service: Unit entered failed state.
Jul 31 02:17:52 coreos-01 systemd[1]: etcd2.service: Failed with result 'exit-code'.
sudo journalctl -u etcd2
The failure repeats:
Jul 31 02:17:52 coreos-01 systemd[1]: Starting etcd2...
Jul 31 02:17:52 coreos-01 etcd2[1078]: 2015/07/31 02:17:52 etcdmain: setting maximum number of CPUs to 1, total number of available CPUs is 1
Jul 31 02:17:52 coreos-01 etcd2[1078]: 2015/07/31 02:17:52 etcdmain: listening for peers on http://172.17.8.101:2380
Jul 31 02:17:52 coreos-01 etcd2[1078]: 2015/07/31 02:17:52 etcdmain: listening for peers on http://172.17.8.101:7001
Jul 31 02:17:52 coreos-01 etcd2[1078]: 2015/07/31 02:17:52 etcdmain: listening for client requests on http://0.0.0.0:2379
Jul 31 02:17:52 coreos-01 etcd2[1078]: 2015/07/31 02:17:52 etcdmain: listening for client requests on http://0.0.0.0:4001
Jul 31 02:17:52 coreos-01 etcd2[1078]: 2015/07/31 02:17:52 etcdmain: stopping listening for client requests on http://0.0.0.0:4001
Jul 31 02:17:52 coreos-01 etcd2[1078]: 2015/07/31 02:17:52 etcdmain: stopping listening for client requests on http://0.0.0.0:2379
Jul 31 02:17:52 coreos-01 etcd2[1078]: 2015/07/31 02:17:52 etcdmain: stopping listening for peers on http://172.17.8.101:7001
Jul 31 02:17:52 coreos-01 etcd2[1078]: 2015/07/31 02:17:52 etcdmain: stopping listening for peers on http://172.17.8.101:2380
Jul 31 02:17:52 coreos-01 etcd2[1078]: 2015/07/31 02:17:52 etcdmain: advertise URLs of "36cdb64424934ae7a3fcc2ec8b2d44ea" do not match in --initial-advertise-peer-urls [http://172.17.8.101:2380] and --initial-cluster [http://localhost:2380 http://localhos
Jul 31 02:17:52 coreos-01 systemd[1]: etcd2.service: Main process exited, code=exited, status=1/FAILURE
Jul 31 02:17:52 coreos-01 systemd[1]: etcd2.service: Unit entered failed state.
Jul 31 02:17:52 coreos-01 systemd[1]: etcd2.service: Failed with result 'exit-code'.
Jul 31 02:18:05 coreos-01 systemd[1]: etcd2.service: Service hold-off time over, scheduling restart.
Jul 31 02:18:05 coreos-01 systemd[1]: Started etcd2.
Jul 31 02:18:05 coreos-01 systemd[1]: Starting etcd2...
Jul 31 02:18:05 coreos-01 etcd2[1113]: 2015/07/31 02:18:05 etcdmain: setting maximum number of CPUs to 1, total number of available CPUs is 1
Jul 31 02:18:05 coreos-01 etcd2[1113]: 2015/07/31 02:18:05 etcdmain: listening for peers on http://172.17.8.101:2380
Jul 31 02:18:05 coreos-01 etcd2[1113]: 2015/07/31 02:18:05 etcdmain: listening for peers on http://172.17.8.101:7001
Jul 31 02:18:05 coreos-01 etcd2[1113]: 2015/07/31 02:18:05 etcdmain: listening for client requests on http://0.0.0.0:2379
Jul 31 02:18:05 coreos-01 etcd2[1113]: 2015/07/31 02:18:05 etcdmain: listening for client requests on http://0.0.0.0:4001
Jul 31 02:18:05 coreos-01 etcd2[1113]: 2015/07/31 02:18:05 etcdmain: stopping listening for client requests on http://0.0.0.0:4001
Jul 31 02:18:05 coreos-01 etcd2[1113]: 2015/07/31 02:18:05 etcdmain: stopping listening for client requests on http://0.0.0.0:2379
Jul 31 02:18:05 coreos-01 etcd2[1113]: 2015/07/31 02:18:05 etcdmain: stopping listening for peers on http://172.17.8.101:7001
Jul 31 02:18:05 coreos-01 etcd2[1113]: 2015/07/31 02:18:05 etcdmain: stopping listening for peers on http://172.17.8.101:2380
Jul 31 02:18:05 coreos-01 etcd2[1113]: 2015/07/31 02:18:05 etcdmain: advertise URLs of "36cdb64424934ae7a3fcc2ec8b2d44ea" do not match in --initial-advertise-peer-urls [http://172.17.8.101:2380] and --initial-cluster [http://localhost:2380 http://localhos
Jul 31 02:18:05 coreos-01 systemd[1]: etcd2.service: Main process exited, code=exited, status=1/FAILURE
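The decisive line in the journal is the "advertise URLs ... do not match" error: the cloud-init drop-in sets ETCD_INITIAL_ADVERTISE_PEER_URLS=http://172.17.8.101:2380, but ETCD_INITIAL_CLUSTER is never set, so etcd falls back to its localhost default peer URLs and refuses to start. A sketch of a fix (my reading of etcd's bootstrap rules; the member name must equal ETCD_NAME, which is %m, i.e. the machine id 36cdb64424934ae7a3fcc2ec8b2d44ea in this log):

```ini
# /etc/systemd/system/etcd2.service.d/30-initial-cluster.conf (sketch)
[Service]
Environment="ETCD_INITIAL_CLUSTER=36cdb64424934ae7a3fcc2ec8b2d44ea=http://172.17.8.101:2380"
```

Then sudo systemctl daemon-reload && sudo systemctl restart etcd2. The cloud-config equivalent would be setting initial-cluster under the etcd2 section so this drop-in is generated automatically.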