Ansible Role: Kubernetes

An Ansible Role that installs Kubernetes on Linux.

Requirements

Requires a compatible Container Runtime; recommended role for CRI installation: geerlingguy.containerd.

Role Variables

Available variables are listed below, along with default values (see defaults/main.yml):

kubernetes_packages:
  - name: kubelet
    state: present
  - name: kubectl
    state: present
  - name: kubeadm
    state: present
  - name: kubernetes-cni
    state: present

Kubernetes packages to be installed on the server. You can either provide a list of package names, or set name and state to have more control over whether the package is present, absent, latest, etc.
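
For example, to keep most packages pinned while always upgrading kubeadm to the latest available version, you could override the list like this (a sketch of a hypothetical override, not a default):

kubernetes_packages:
  - name: kubelet
    state: present
  - name: kubectl
    state: present
  - name: kubeadm
    state: latest
  - name: kubernetes-cni
    state: present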

kubernetes_version: '1.25'
kubernetes_version_rhel_package: '1.25.1'

The minor version of Kubernetes to install. The plain kubernetes_version is used to pin an apt package version on Debian, and as the Kubernetes version passed into the kubeadm init command (see kubernetes_version_kubeadm). The kubernetes_version_rhel_package variable must be a specific Kubernetes release, and is used to pin the version on Red Hat / CentOS servers.

kubernetes_role: control_plane

Whether the particular server will serve as a Kubernetes control_plane (default) or node. The control plane will have kubeadm init run on it to initialize the entire K8s control plane, while nodes will have kubeadm join run on them to join them to the control plane.

Variables to configure kubeadm and kubelet with kubeadm init through a config file (recommended)

With this role, kubeadm init will be run with --config <FILE>.

kubernetes_kubeadm_kubelet_config_file_path: '/etc/kubernetes/kubeadm-kubelet-config.yaml'

Path for <FILE>. If the directory does not exist, this role will create it.

The following variables are parsed as options into <FILE>. To understand their syntax, see the kubelet-integration and kubeadm-config-file documentation. The skeleton (apiVersion, kind) of the config file will be created by this role, so do not define them within the variables (see templates/kubeadm-kubelet-config.j2).

kubernetes_config_init_configuration:
  localAPIEndpoint:
    advertiseAddress: "{{ kubernetes_apiserver_advertise_address | default(ansible_default_ipv4.address, true) }}"

Defines the options under kind: InitConfiguration. Including kubernetes_apiserver_advertise_address here is for backward-compatibility with older versions of this role, where kubernetes_apiserver_advertise_address was used as a command-line option.

kubernetes_config_cluster_configuration:
  networking:
    podSubnet: "{{ kubernetes_pod_network.cidr }}"
  kubernetesVersion: "{{ kubernetes_version_kubeadm }}"

Options under kind: ClusterConfiguration. Including kubernetes_pod_network.cidr and kubernetes_version_kubeadm here is for backward-compatibility with older versions of this role, where they were used as command-line options.

kubernetes_config_kubelet_configuration:
  cgroupDriver: systemd

Options to configure kubelet on any nodes in your cluster through the kubeadm init process. For syntax options read the kubelet config file and kubelet integration documentation.

NOTE: This is the recommended way to do the kubelet configuration. Most command-line options are deprecated.

NOTE: The recommended cgroupDriver depends on your Container Runtime. When using this role with Docker instead of containerd, this value should be changed to cgroupfs.

kubernetes_config_kube_proxy_configuration: {}

Options to configure kube-proxy, placed in the KubeProxyConfiguration section of the kubeadm configuration file.
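
For illustration only, the four variables above end up in a multi-document kubeadm config file roughly like the following sketch (the advertise address is a placeholder, the exact apiVersion values depend on your Kubernetes release, and the real layout comes from templates/kubeadm-kubelet-config.j2):

apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 10.0.0.10
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
networking:
  podSubnet: 10.244.0.0/16
kubernetesVersion: stable-1.25
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration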

Variables to configure kubeadm and kubelet through command-line options

kubernetes_kubelet_extra_args: ""
kubernetes_kubelet_extra_args_config_file: /etc/default/kubelet

Extra args to pass to kubelet during startup. E.g. to allow kubelet to start up even if swap is enabled on your server, set this to: "--fail-swap-on=false". Or to specify the node IP advertised by kubelet, set this to "--node-ip={{ ansible_host }}". This option is deprecated; please use kubernetes_config_kubelet_configuration instead.
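
For example, the swap-related flag above could instead be expressed through the recommended config-file variable (a sketch; failSwapOn is the KubeletConfiguration field corresponding to --fail-swap-on):

kubernetes_config_kubelet_configuration:
  cgroupDriver: systemd
  failSwapOn: false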

kubernetes_kubeadm_init_extra_opts: ""

Extra args to pass to kubeadm init during K8s control plane initialization. E.g. to specify extra Subject Alternative Names for API server certificate, set this to: "--apiserver-cert-extra-sans my-custom.host"

kubernetes_join_command_extra_opts: ""

Extra args to pass to the generated kubeadm join command during K8s node initialization. E.g. to ignore certain preflight errors like swap being enabled, set this to: --ignore-preflight-errors=Swap

Additional variables

kubernetes_allow_pods_on_control_plane: true

Whether to remove the taint that prevents pods from being scheduled on the Kubernetes control plane. If you have a single-node cluster, this should definitely be true. Otherwise, set it to false if you want a dedicated Kubernetes control plane which doesn't run any other pods.

kubernetes_pod_network:
  # Flannel CNI.
  cni: 'flannel'
  cidr: '10.244.0.0/16'
  #
  # Calico CNI.
  # cni: 'calico'
  # cidr: '192.168.0.0/16'
  #
  # Weave CNI.
  # cni: 'weave'
  # cidr: '192.168.0.0/16'

This role currently supports flannel (default), calico or weave for cluster pod networking. Choose only one for your cluster; converting between them is not done automatically and could result in broken networking; if you need to switch from one to another, it should be done outside of this role.

kubernetes_apiserver_advertise_address: ''
kubernetes_version_kubeadm: 'stable-{{ kubernetes_version }}'
kubernetes_ignore_preflight_errors: 'all'

Options passed to kubeadm init when initializing the Kubernetes control plane. The kubernetes_apiserver_advertise_address defaults to ansible_default_ipv4.address if it's left empty.

kubernetes_apt_release_channel: "stable"
kubernetes_apt_keyring_file: "/etc/apt/keyrings/kubernetes-apt-keyring.asc"
kubernetes_apt_repository: "deb [signed-by={{ kubernetes_apt_keyring_file }}] https://pkgs.k8s.io/core:/{{ kubernetes_apt_release_channel }}:/v{{ kubernetes_version }}/deb/ /"

Apt repository options for Kubernetes installation.

kubernetes_yum_base_url: "https://pkgs.k8s.io/core:/stable:/v{{ kubernetes_version }}/rpm/"
kubernetes_yum_gpg_key: "https://pkgs.k8s.io/core:/stable:/v{{ kubernetes_version }}/rpm/repodata/repomd.xml.key"
kubernetes_yum_gpg_check: true
kubernetes_yum_repo_gpg_check: true

Yum repository options for Kubernetes installation. You can change kubernetes_yum_gpg_key to a different URL if you are behind a firewall or want to provide a trustworthy mirror, usually in combination with changing kubernetes_yum_base_url as well.
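
For example, to pull packages from an internal mirror instead (mirror.example.com is a placeholder hostname), you might override:

kubernetes_yum_base_url: "https://mirror.example.com/kubernetes/core:/stable:/v{{ kubernetes_version }}/rpm/"
kubernetes_yum_gpg_key: "https://mirror.example.com/kubernetes/core:/stable:/v{{ kubernetes_version }}/rpm/repodata/repomd.xml.key"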

kubernetes_flannel_manifest_file: https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

Flannel manifest file to apply to the Kubernetes cluster to enable networking. You can copy your own files to your server and apply them instead, if you need to customize the Flannel networking configuration.

kubernetes_calico_manifest_file: https://projectcalico.docs.tigera.io/manifests/calico.yaml

Calico manifest file to apply to the Kubernetes cluster (if using Calico instead of Flannel).

Dependencies

None.

Example Playbooks

Single node (control-plane-only) cluster

- hosts: all

  vars:
    kubernetes_allow_pods_on_control_plane: true

  roles:
    - geerlingguy.docker
    - geerlingguy.kubernetes

Two or more nodes (single control-plane) cluster

Control plane inventory vars:

kubernetes_role: "control_plane"

Node(s) inventory vars:

kubernetes_role: "node"
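
For example, an INI-style inventory for this setup might look like the following sketch (hostnames and group names are placeholders):

[control_plane]
kube-cp-1

[control_plane:vars]
kubernetes_role=control_plane

[nodes]
kube-node-1
kube-node-2

[nodes:vars]
kubernetes_role=node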

Playbook:

- hosts: all

  vars:
    kubernetes_allow_pods_on_control_plane: true

  roles:
    - geerlingguy.docker
    - geerlingguy.kubernetes

Then log into the Kubernetes control plane server and run kubectl get nodes as root; you should see a list of all the servers.

License

MIT / BSD

Author Information

This role was created in 2018 by Jeff Geerling, author of Ansible for DevOps.


ansible-role-kubernetes's Issues

Fail when deploying flannel

Hello Jeff. Here is the problem:

TASK [geerlingguy.kubernetes : Configure Flannel networking.] **************************************************************************************************************
failed: [174.35.241.200] (item=kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml) => {"changed": fa
lse, "cmd": ["kubectl", "apply", "-f", "https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml"], "delta": "0:00:00.36889
2", "end": "2018-09-22 22:57:48.898430", "item": "kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml
", "msg": "non-zero return code", "rc": 1, "start": "2018-09-22 22:57:48.529538", "stderr": "unable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/D
ocumentation/k8s-manifests/kube-flannel-rbac.yml\": Get https://10.0.2.15:6443/api?timeout=32s: dial tcp 10.0.2.15:6443: connect: connection refused\nunable to recognize \"
https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml\": Get https://10.0.2.15:6443/api?timeout=32s: dial tcp 10.0.2.15:
6443: connect: connection refused", "stderr_lines": ["unable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel
-rbac.yml\": Get https://10.0.2.15:6443/api?timeout=32s: dial tcp 10.0.2.15:6443: connect: connection refused", "unable to recognize \"https://raw.githubusercontent.com/cor
eos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml\": Get https://10.0.2.15:6443/api?timeout=32s: dial tcp 10.0.2.15:6443: connect: connection refused"],
"stdout": "", "stdout_lines": []}
failed: [174.35.241.200] (item=kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml) => {"changed": false, "cmd": ["kubec
tl", "apply", "-f", "https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml"], "delta": "0:00:00.345040", "end": "2018-09-22 22:57:49.614758
", "item": "kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml", "msg": "non-zero return code", "rc": 1, "start": "2018
-09-22 22:57:49.269718", "stderr": "unable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://10.0.2.15:64
43/api?timeout=32s: dial tcp 10.0.2.15:6443: connect: connection refused\nunable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-f
lannel.yml\": Get https://10.0.2.15:6443/api?timeout=32s: dial tcp 10.0.2.15:6443: connect: connection refused\nunable to recognize \"https://raw.githubusercontent.com/core
os/flannel/master/Documentation/kube-flannel.yml\": Get https://10.0.2.15:6443/api?timeout=32s: dial tcp 10.0.2.15:6443: connect: connection refused\nunable to recognize \"
https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://10.0.2.15:6443/api?timeout=32s: dial tcp 10.0.2.15:6443: connect: conn
ection refused\nunable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://10.0.2.15:6443/api?timeout=32s:
dial tcp 10.0.2.15:6443: connect: connection refused\nunable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get ht
tps://10.0.2.15:6443/api?timeout=32s: dial tcp 10.0.2.15:6443: connect: connection refused\nunable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Do
cumentation/kube-flannel.yml\": Get https://10.0.2.15:6443/api?timeout=32s: dial tcp 10.0.2.15:6443: connect: connection refused\nunable to recognize \"https://raw.githubus
ercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://10.0.2.15:6443/api?timeout=32s: dial tcp 10.0.2.15:6443: connect: connection refused\nunab
le to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://10.0.2.15:6443/api?timeout=32s: dial tcp 10.0.2.15:6
443: connect: connection refused", "stderr_lines": ["unable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get htt
ps://10.0.2.15:6443/api?timeout=32s: dial tcp 10.0.2.15:6443: connect: connection refused", "unable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/D
ocumentation/kube-flannel.yml\": Get https://10.0.2.15:6443/api?timeout=32s: dial tcp 10.0.2.15:6443: connect: connection refused", "unable to recognize \"https://raw.githu
busercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://10.0.2.15:6443/api?timeout=32s: dial tcp 10.0.2.15:6443: connect: connection refused",
"unable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://10.0.2.15:6443/api?timeout=32s: dial tcp 10.0.2
.15:6443: connect: connection refused", "unable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://10.0.2.
15:6443/api?timeout=32s: dial tcp 10.0.2.15:6443: connect: connection refused", "unable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation
/kube-flannel.yml\": Get https://10.0.2.15:6443/api?timeout=32s: dial tcp 10.0.2.15:6443: connect: connection refused", "unable to recognize \"https://raw.githubusercontent
.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://10.0.2.15:6443/api?timeout=32s: dial tcp 10.0.2.15:6443: connect: connection refused", "unable to r
ecognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://10.0.2.15:6443/api?timeout=32s: dial tcp 10.0.2.15:6443: co
nnect: connection refused", "unable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://10.0.2.15:6443/api?
timeout=32s: dial tcp 10.0.2.15:6443: connect: connection refused"], "stdout": "", "stdout_lines": []}

Deployed on:

[ansible@ansible-server lab-kubernetes]$ ssh 174.35.241.200
Last login: Sat Sep 22 22:57:49 2018 from 174.35.241.10
[ansible@k8s-node2 ~]$ cat /etc/redhat-release
CentOS Linux release 7.5.1804 (Core)
[ansible@k8s-node2 ~]$ uname -a
Linux k8s-node2 3.10.0-862.11.6.el7.x86_64 #1 SMP Tue Aug 14 21:49:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[ansible@k8s-node2 ~]$ docker version
Client:
 Version:           18.06.1-ce
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        e68fc7a
 Built:             Tue Aug 21 17:23:03 2018
 OS/Arch:           linux/amd64
 Experimental:      false
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.38/version: dial unix /var/run/docker.sock: connect: permission denied
[ansible@k8s-node2 ~]$

Can you help me?

Fail on "Symlink the kubectl admin.conf to ~/.kube/conf"

Everything went smoothly until:

TASK [geerlingguy.kubernetes : Symlink the kubectl admin.conf to ~/.kube/conf.] *********************************
fatal: [104.233.73.44]: FAILED! => {"changed": false, "msg": "src file does not exist, use \"force=yes\" if you really want to create the link: /etc/kubernetes/admin.conf", "path": "/root/.kube/config", "src": "/etc/kubernetes/admin.conf", "state": "absent"}

My server runs Debian 8.8.

Stuck at "Join node to Kubernetes master"

I have a 3-node setup: 1 master node and 2 worker nodes. The Ansible playbook gets stuck at the following task:

TASK [ansible.kubernetes : Join node to Kubernetes master]

meaning it just stays there and does not continue.

If I check on the worker node, apparently the kubelet service does not start up

Dec 27 22:45:01 worker2 kubelet[15002]: I1227 22:45:01.557815   15002 server.go:407] Version: v1.13.1
Dec 27 22:45:01 worker2 kubelet[15002]: I1227 22:45:01.558328   15002 plugins.go:103] No cloud provider specified.
Dec 27 22:45:01 worker2 kubelet[15002]: F1227 22:45:01.558412   15002 server.go:261] failed to run Kubelet: unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf: no such file or directory
Dec 27 22:45:01 worker2 systemd[1]: kubelet.service: Main process exited, code=exited, status=255/n/a
Dec 27 22:45:01 worker2 systemd[1]: kubelet.service: Unit entered failed state.
Dec 27 22:45:01 worker2 systemd[1]: kubelet.service: Failed with result 'exit-code'.

So the file /etc/kubernetes/bootstrap-kubelet.conf is missing, and I do not entirely understand how it is supposed to end up where it is expected. Is it part of the Kubernetes packages, or should it come from the Ansible role as a template?

Fail when joining node to master!

Hi guys,

I followed this role to set things up, and it works when setting up the master, but when I run with kubernetes_role=node I hit a problem:
"The task includes an option with an undefined variable. The error was: 'kubernetes_join_command' is undefined"
I also checked that kubernetes_join_command is defined in tasks/main.yml. How can I fix this problem?
Thanks so much.

Undefined Variable

Using this role I get an undefined variable when trying to join nodes to the master.
My Ansible inventory looks like:

[master]
master1=<ip>

[worker]
worker1=<ip>
worker2=<ip>

[master:vars]
kubernetes_role="master"
kubernetes_enable_web_ui=True

[worker:vars]
kubernetes_role="node"

My Ansible play is equally simple (there is a separate play that installs Docker).

- hosts: all
  remote_user: centos
  vars:
    kubernetes_allow_pods_on_master: False
  tasks:
    - name: Install Kubes
      become: yes
      become_method: sudo
      import_role:
        name: geerlingguy.kubernetes

The output is:

TASK [geerlingguy.kubernetes : Join node to Kubernetes master] **********************************************************************************************************************
task path: /media/sf_dWork/cf_repos/testKubernets/roles/geerlingguy.kubernetes/tasks/node-setup.yml:2
fatal: [worker1]: FAILED! => {
    "msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'stdout'\n\nThe error appears to have been in '/testKubernets/roles/geerlingguy.kubernetes/tasks/node-setup.yml': line 2, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n---\n- name: Join node to Kubernetes master\n  ^ here\n"
}
META: noop
fatal: [worker2]: FAILED! => {
    "msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'stdout'\n\nThe error appears to have been in '/testKubernets/roles/geerlingguy.kubernetes/tasks/node-setup.yml': line 2, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n---\n- name: Join node to Kubernetes master\n  ^ here\n"
}

Install fails on Ubuntu / CentOS, 3 nodes (1 master, 2 nodes)

Hi there, I tried to deploy this onto 3 x Ubuntu 16 instances, but the result is that the Kubernetes API and other services do not start. The error actually shows up at the point where it tries to download and apply the YAML, because it cannot connect to port 6443 (because the service is not starting).

So I destroyed those 3 instances and deployed CentOS 7 instead. Again, the install fails, and I am trying to debug. Are you aware of these issues? Any suggestions? Both Ubuntu and CentOS are vanilla and running on OpenStack.

Feature Request - Disable Swap

Hello Jeff

- name: Remove swap from /etc/fstab
  lineinfile:
    dest: /etc/fstab
    regexp: '^.* swap .*$'
    state: absent
- name: Disable swap
  command: swapoff -a
  when: ansible_swaptotal_mb > 0

It would be nice to execute those tasks to disable swap before installing kubelet, as it fails to start if swap is enabled.

Playbook is not working

Dear all,

I am trying to execute the default YAML playbook (test.yml) you provide, but I get the following error.
Can you please advise?


command:


[root@desktop-2g9ao3g geerlingguy.kubernetes]# ansible-playbook test.yml --extra-vars "ansible_ssh_user=root ansible_ssh_pass=1234567"


error:


ERROR! the role 'geerlingguy.docker' was not found in /root/.ansible/roles/geerlingguy.kubernetes/roles:/root/.ansible/roles:/usr/share/ansible/roles:/etc/ansible/roles:/root/.ansible/roles/geerlingguy.kubernetes

The error appears to be in '/root/.ansible/roles/geerlingguy.kubernetes/test.yml': line 7, column 7, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

roles:
- geerlingguy.docker
^ here


yml file:


[root@desktop-2g9ao3g geerlingguy.kubernetes]# cat test.yml

- hosts: all

  vars:
    kubernetes_allow_pods_on_master: true

  roles:
    - geerlingguy.docker
    - geerlingguy.kubernetes
[root@desktop-2g9ao3g geerlingguy.kubernetes]#

Thanks,
George

Failure on kubeadm join - "The IPVS proxier will not be used"

I use this role pretty regularly in a continuous integration pipeline that I have. For some reason, I get this error intermittently. I don't know of an exact way to reproduce it. It just seems to happen randomly:

(Screenshot attached: "screen shot 2019-02-08 at 8 53 03 pm", showing the kubeadm join failure output.)

It honestly seems like more of a kubeadm issue than an issue with your particular role, however I can't say for certain. Maybe there's a way to catch the error in this role, and respond appropriately? The error does give some suggestions on what to try.

For the particular screenshot I provided, it's a 3-node cluster: 1 master and 2 workers. All nodes are CentOS 7.

I see kubernetes-admin

I see a kubernetes-admin user in the /etc/kubernetes/admin.conf file.

I was wondering if we should expect there to be a new user named kubernetes-admin in the provisioned image?

I don't see one on my provisioned image/machine. Should I create one manually? I might be misunderstanding that section of the kubeconfig file...

Make role work with nodes (e.g. to join master)

See: https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#4-4-joining-your-nodes

Basically, need to get the appropriate values from the master node, store them in a variable/variables, then run the command on each node (non-master):

kubeadm join --token <token> <master-ip>:<master-port> --discovery-token-ca-cert-hash sha256:<hash>

Verify the node joined the master by running kubectl get nodes on the master; it should list all the joined nodes.
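
A rough sketch of that pattern in Ansible (an illustration assuming a 'master' inventory group, not necessarily the role's actual implementation):

- name: Get the kubeadm join command from the master.
  command: kubeadm token create --print-join-command
  changed_when: false
  when: kubernetes_role == 'master'
  register: kubernetes_join_command

- name: Join node to Kubernetes master.
  # Reads the join command registered on the master host via hostvars.
  command: "{{ hostvars[groups['master'][0]].kubernetes_join_command.stdout }}"
  when: kubernetes_role == 'node'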

Probably also add an example to the README for this, as well as an example playbook with two or more VirtualBox VMs in my ansible-vagrant-examples repo.

kubelet can't create aufs mounts, causes kubeadm to hang

Locally testing with kubeadm init --pod-network-cidr=10.0.1.0/16 --apiserver-advertise-address=172.17.0.2 --kubernetes-version stable-1.10 --ignore-preflight-errors=all, once kubelet is running, it seems to go okay, but hangs at the end here:

...
[certificates] Using the existing sa key.
[certificates] Using the existing front-proxy-ca certificate and key.
[certificates] Using the existing front-proxy-client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Using existing up-to-date KubeConfig file: "/etc/kubernetes/admin.conf"
[kubeconfig] Using existing up-to-date KubeConfig file: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Using existing up-to-date KubeConfig file: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Using existing up-to-date KubeConfig file: "/etc/kubernetes/scheduler.conf"
[controlplane] Wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] Wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] Wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[init] Waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests".
[init] This might take a minute or longer if the control plane images have to be pulled.

Looking in /var/log/syslog, I'm seeing a ton of errors from dockerd and kubelet, like the following:

May  9 15:03:09 0c273a23d55b kubelet[6523]: E0509 15:03:09.316253    6523 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:460: Failed to list *v1.Node: Get https://172.17.0.2:6443/api/v1/nodes?fieldSelector=metadata.name%3D0c273a23d55b&limit=500&resourceVersion=0: dial tcp 172.17.0.2:6443: getsockopt: connection refused
May  9 15:03:09 0c273a23d55b kubelet[6523]: E0509 15:03:09.316914    6523 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:451: Failed to list *v1.Service: Get https://172.17.0.2:6443/api/v1/services?limit=500&resourceVersion=0: dial tcp 172.17.0.2:6443: getsockopt: connection refused
May  9 15:03:09 0c273a23d55b kubelet[6523]: E0509 15:03:09.317913    6523 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://172.17.0.2:6443/api/v1/pods?fieldSelector=spec.nodeName%3D0c273a23d55b&limit=500&resourceVersion=0: dial tcp 172.17.0.2:6443: getsockopt: connection refused
May  9 15:03:10 0c273a23d55b kubelet[6523]: I0509 15:03:10.280198    6523 kubelet_node_status.go:271] Setting node annotation to enable volume controller attach/detach
May  9 15:03:10 0c273a23d55b kubelet[6523]: E0509 15:03:10.317148    6523 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:460: Failed to list *v1.Node: Get https://172.17.0.2:6443/api/v1/nodes?fieldSelector=metadata.name%3D0c273a23d55b&limit=500&resourceVersion=0: dial tcp 172.17.0.2:6443: getsockopt: connection refused
May  9 15:03:10 0c273a23d55b kubelet[6523]: E0509 15:03:10.317964    6523 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:451: Failed to list *v1.Service: Get https://172.17.0.2:6443/api/v1/services?limit=500&resourceVersion=0: dial tcp 172.17.0.2:6443: getsockopt: connection refused
May  9 15:03:10 0c273a23d55b kubelet[6523]: E0509 15:03:10.319211    6523 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://172.17.0.2:6443/api/v1/pods?fieldSelector=spec.nodeName%3D0c273a23d55b&limit=500&resourceVersion=0: dial tcp 172.17.0.2:6443: getsockopt: connection refused
May  9 15:03:10 0c273a23d55b dockerd[2329]: time="2018-05-09T15:03:10.590715524Z" level=warning msg="Couldn't run auplink before unmount /var/lib/docker/aufs/mnt/7292d62a7d81e456076c1b00bf07390c0f7016acaf876a6e12e82937e796c18d-init: exit status 22"
May  9 15:03:10 0c273a23d55b dockerd[2329]: time="2018-05-09T15:03:10.591358364Z" level=error msg="Handler for POST /v1.31/containers/create returned error: error creating aufs mount to /var/lib/docker/aufs/mnt/7292d62a7d81e456076c1b00bf07390c0f7016acaf876a6e12e82937e796c18d-init: invalid argument"
May  9 15:03:10 0c273a23d55b kubelet[6523]: E0509 15:03:10.592148    6523 remote_runtime.go:92] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = failed to create a sandbox for pod "kube-apiserver-0c273a23d55b": Error response from daemon: error creating aufs mount to /var/lib/docker/aufs/mnt/7292d62a7d81e456076c1b00bf07390c0f7016acaf876a6e12e82937e796c18d-init: invalid argument
May  9 15:03:10 0c273a23d55b kubelet[6523]: E0509 15:03:10.592253    6523 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "kube-apiserver-0c273a23d55b_kube-system(d111819eec36689723e08c03ec6f632c)" failed: rpc error: code = Unknown desc = failed to create a sandbox for pod "kube-apiserver-0c273a23d55b": Error response from daemon: error creating aufs mount to /var/lib/docker/aufs/mnt/7292d62a7d81e456076c1b00bf07390c0f7016acaf876a6e12e82937e796c18d-init: invalid argument
May  9 15:03:10 0c273a23d55b kubelet[6523]: E0509 15:03:10.592270    6523 kuberuntime_manager.go:646] createPodSandbox for pod "kube-apiserver-0c273a23d55b_kube-system(d111819eec36689723e08c03ec6f632c)" failed: rpc error: code = Unknown desc = failed to create a sandbox for pod "kube-apiserver-0c273a23d55b": Error response from daemon: error creating aufs mount to /var/lib/docker/aufs/mnt/7292d62a7d81e456076c1b00bf07390c0f7016acaf876a6e12e82937e796c18d-init: invalid argument
May  9 15:03:10 0c273a23d55b kubelet[6523]: E0509 15:03:10.592576    6523 pod_workers.go:186] Error syncing pod d111819eec36689723e08c03ec6f632c ("kube-apiserver-0c273a23d55b_kube-system(d111819eec36689723e08c03ec6f632c)"), skipping: failed to "CreatePodSandbox" for "kube-apiserver-0c273a23d55b_kube-system(d111819eec36689723e08c03ec6f632c)" with CreatePodSandboxError: "CreatePodSandbox for pod \"kube-apiserver-0c273a23d55b_kube-system(d111819eec36689723e08c03ec6f632c)\" failed: rpc error: code = Unknown desc = failed to create a sandbox for pod \"kube-apiserver-0c273a23d55b\": Error response from daemon: error creating aufs mount to /var/lib/docker/aufs/mnt/7292d62a7d81e456076c1b00bf07390c0f7016acaf876a6e12e82937e796c18d-init: invalid argument"
May  9 15:03:10 0c273a23d55b kernel: [61641.342219] aufs test_add:266:dockerd[4556]: already stacked, /var/lib/docker/aufs/diff/7292d62a7d81e456076c1b00bf07390c0f7016acaf876a6e12e82937e796c18d-init (overlay)

Deployment fails with Vagrant centos/7

I'm trying to use this role as a provisioning configuration for Vagrant, with a base Centos/7 box.

A few things that the role should do:

  • Make sure swap is disabled
  • Make sure net.bridge.bridge-nf-call-iptables is 1

Here is a role I use between geerlingguy.docker and geerlingguy.kubernetes as a workaround

---
# tasks file for prepare-kubernetes
- name: Disable swap
  replace:
    path: /etc/fstab
    regexp: '(^[^#].* swap .*)'
    replace: '#\1'
  register: swap

- name: Disable swap runtime
  command: "swapoff -a"
  when: swap.changed
  
- name: Set bridge-nf-call-iptables
  sysctl:
    name: net.bridge.bridge-nf-call-iptables
    value: 1
    sysctl_set: yes
    state: present
    reload: yes

Alternative to Flannel CNI

I would like to use Weave instead of Flannel, for example, as the networking plugin. Is this something you would consider?

Don't ignore kubeadm init failures on the master node

Sometimes kubeadm init fails for various reasons: swap is on, preflight errors, image pull timeouts due to a corporate proxy configuration that has not been ported to Docker, etc. Don't ignore these failures with the failed_when: False flag; it's very confusing to see the playbook run successfully for the init command and then crash later on the task "Symlink the kubectl admin.conf to ~/.kube/conf" without an explicit trace of what went wrong and why.
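
One way to surface real failures while still tolerating re-runs could be to fail on a non-zero return code unless the output indicates the cluster is already initialized (a sketch, not the role's current behavior; the 'already exists' match and the simplified init command are assumptions):

- name: Initialize the Kubernetes master with kubeadm init.
  # The real invocation would include the role's init options/config file.
  command: kubeadm init
  register: kubeadm_init_result
  failed_when: >
    kubeadm_init_result.rc != 0 and
    'already exists' not in kubeadm_init_result.stderr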

Allow configuration of different networking layers

Currently this role hardcodes setting up Flannel as the networking layer when setting up the master. It would probably be good to have the networking configurable in case someone doesn't want to use Flannel!

At least support:

  • Calico
  • Flannel
  • Weave

Get role working on CentOS 7 (and Red Hat, by extension)

Currently, I have almost everything working... except that it gets stuck on the kubeadm init task now (it wasn't getting stuck earlier today... so some fix I applied for something else must be blocking init from completing):

TASK [role_under_test : Initialize the Kubernetes master with kubeadm init.] ***
No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received
The build has been terminated

Failed build: https://travis-ci.org/geerlingguy/ansible-role-kubernetes/jobs/377042422

Probably something blocking/breaking kubelet startup (this was happening for a variety of reasons earlier in this role's as-yet short history).

apt packages are version 1.12.0-rc.1-00 now -> fails deploying

Hi

Obviously, yesterday new RC versions for Kubernetes 1.12 got uploaded to the official apt repo. That breaks installing from your module. I tried some approaches, but I am not that fluent in Ansible yet.

Somehow we need to make Ansible call "apt-get install kubectl=v*** kubeadm=v*** ...". All the options I found in Ansible where you can specify a version lead to upgrade & downgrade behavior. Maybe you have an idea ;)

My quick fix is deploying a /etc/apt/preferences.d/kubectl file with:

Package: kubectl
Pin: version 1.11.*
Pin-Priority: 1000

Package: kubeadm
Pin: version 1.11.*
Pin-Priority: 1000

Package: kubelet
Pin: version 1.11.*
Pin-Priority: 1000

It needs to run before the include of your role, of course.
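
One way to drop such a pin file in place before this role runs might be a pre-task like the following (a sketch using the pins from above; file name and task wording are just illustrative):

- name: Pin Kubernetes packages to 1.11.*
  copy:
    dest: /etc/apt/preferences.d/kubernetes
    content: |
      Package: kubectl
      Pin: version 1.11.*
      Pin-Priority: 1000

      Package: kubeadm
      Pin: version 1.11.*
      Pin-Priority: 1000

      Package: kubelet
      Pin: version 1.11.*
      Pin-Priority: 1000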

Let me know if I can help.

add kubectl bash completion

This is a FEATURE REQUEST.
Could we add bash_completion for kubectl in this repository? I think this could be done via

    - name: add kubectl bash completion
      lineinfile:
        path: ~/.bashrc
        line: source <(kubectl completion bash)
        state: present

kubectl fails if you run it twice in Ansible

If you run this playbook twice, you will get a failure on anything that invokes kubectl (refused to connect to host). However, it runs fine the first time the playbook runs.

A quick search shows that other people on the internet have seen this behaviour with kubectl and Ansible. For now I am just using ignore_errors: true on those tasks.

CentOS8: Errors while making cache if GPG keys change

I'm seeing an error using GCP CentOS8 instances for my master and worker nodes:

TASK [geerlingguy.kubernetes : Make cache if Kubernetes GPG key changed.] *******************************************************************************************************************************************************************
Wednesday 05 February 2020  14:44:51 +0000 (0:00:01.290)       0:03:34.103 ****
fatal: [54.161.207.59]: FAILED! => {"changed": true, "cmd": ["yum", "-q", "makecache", "-y", "--disablerepo=*", "--enablerepo=kubernetes"], "delta": "0:00:00.726136", "end": "2020-02-05 14:44:56.576533", "msg": "non-zero return code", "rc": -13, "start": "2020-02-05 14:44:55.850397", "stderr": "Importing GPG key 0xA7317B0F:\n Userid     : \"Google Cloud Packages Automatic Signing Key <[email protected]>\"\n Fingerprint: D0BC 747F D8CA F711 7500 D6FA 3746 C208 A731 7B0F\n From       : https://packages.cloud.google.com/yum/doc/yum-key.gpg", "stderr_lines": ["Importing GPG key 0xA7317B0F:", " Userid     : \"Google Cloud Packages Automatic Signing Key <[email protected]>\"", " Fingerprint: D0BC 747F D8CA F711 7500 D6FA 3746 C208 A731 7B0F", " From       : https://packages.cloud.google.com/yum/doc/yum-key.gpg"], "stdout": "", "stdout_lines": []}
...

When I ssh into one of the instances and run the command directly, it seems to work ok:

[root@ip-172-31-51-134 ~]# yum -q makecache -y --disablerepo=\* --enablerepo=kubernetes
Importing GPG key 0xBA07F4FB:
 Userid     : "Google Cloud Packages Automatic Signing Key <[email protected]>"
 Fingerprint: 54A6 47F9 048D 5688 D7DA 2ABE 6A03 0B21 BA07 F4FB
 From       : https://packages.cloud.google.com/yum/doc/yum-key.gpg
Importing GPG key 0x3E1BA8D5:
 Userid     : "Google Cloud Packages RPM Signing Key <[email protected]>"
 Fingerprint: 3749 E1BA 95A8 6CE0 5454 6ED2 F09C 394C 3E1B A8D5
 From       : https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
[root@ip-172-31-51-134 ~]# echo $?
0

The GCP keys and fingerprints are different from the ones in the failure, but I don't know what the significance is. If I start over from scratch with new instances, it fails at the same point with the same key and fingerprint from the failure.

Typo in documentation

The apiserver_advertise_address defaults to ansible_default_ipv4.address if it's left empty.

should be :

The kubernetes_apiserver_advertise_address defaults to ansible_default_ipv4.address if it's left empty.

Kubernetes 1.11 changes the way KUBELET_EXTRA_ARGS works

Apparently in Kubernetes 1.11, the /etc/systemd/system/kubelet.service.d/10-kubeadm.conf file changed slightly; there is now a section:

# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/default/kubelet

And the contents of that file are, by default:

KUBELET_EXTRA_ARGS=

So the configuration in this role needs to be directed at /etc/default/kubelet instead of /etc/systemd/system/kubelet.service.d/10-kubeadm.conf. 🤦‍♂️

Kubernetes master must be first host on the list

Because of this part of the code:

- name: Get the kubeadm join command from the Kubernetes master.
  shell: kubeadm token create --print-join-command
  changed_when: False
  when: kubernetes_role == 'master'
  run_once: True
  register: kubernetes_join_command

And the note in run_once docs:

Any conditional (i.e when:) will use the variables of the 'first host' to decide if the task runs or not, no other hosts will be tested.

This means the recipe will only work if the Kubernetes master is the first host. If it's not the first one, the when will evaluate to False and the task will never run.

As my Ansible foo is quite low for now, I haven't found any fix for this yet.
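
One possible direction for a fix (an untested sketch, assuming the master host sits in a 'master' inventory group) would be to delegate the token creation explicitly instead of relying on run_once picking the first host:

- name: Get the kubeadm join command from the Kubernetes master.
  command: kubeadm token create --print-join-command
  changed_when: false
  # Always run on the master host, regardless of inventory order.
  delegate_to: "{{ groups['master'][0] }}"
  run_once: true
  register: kubernetes_join_command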

k8s module does not work with jenkins user

jenkins@an-1:/opt/play/hello-tomcat-projects/k8s$ cat k2.yml

- hosts: dev
  connection: local
  pre_tasks:
    - name: Ensure Pip is installed.
      package:
        name: python-pip
        state: present

    - name: Ensure OpenShift client is installed.
      pip:
        name: openshift
        state: present

  tasks:
    - name: Create a k8s namespace
      k8s:
        name: testing
        api_version: v1
        kind: Namespace
        state: present

    - name: create nginx deploy
      k8s:
        state: present
        definition: "{{ lookup('template', 'files/nginx-deploy.yml.j2') }}"

cat files/nginx-deploy.yml.j2
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    app: nginx
  namespace: testing
spec:
  containers:
    - name: 1st
      image: "{{ ni }}"

ansible-playbook -i inv k1.yml --extra-vars "ni=xxxxxxx.dkr.ecr.us-east-1.amazonaws.com/connector-dev:nginx-6"

It is able to pull the Docker image from ECR and start the container when executed as the root user, but it does not work when executed as the jenkins user. The jenkins user has sudo privileges and is added to the docker daemon group.

Node-Selectors:
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message


Normal Scheduled 10m default-scheduler Successfully assigned testing/nginx to k8m
Normal Pulling 8m51s (x4 over 10m) kubelet, k8m Pulling image "xxxxxx.dkr.ecr.us-east-1.amazonaws.com/connector-dev:nginx-6"
Warning Failed 8m50s (x4 over 10m) kubelet, k8m Failed to pull image "xxxxx.dkr.ecr.us-east-1.amazonaws.com/connector-dev:nginx-6": rpc error: code = Unknown desc = Error response from daemon: Get https://xxxx.dkr.ecr.us-east-1.amazonaws.com/v2/connector-dev/manifests/nginx-6: no basic auth credentials
Warning Failed 8m50s (x4 over 10m) kubelet, k8m Error: ErrImagePull
Normal BackOff 8m35s (x6 over 10m) kubelet, k8m Back-off pulling image "xxxxx.dkr.ecr.us-east-1.amazonaws.com/connector-dev:nginx-6"
Warning Failed 20s (x41 over 10m) kubelet, k8m Error: ImagePullBackOff

Configure flannel networking task fails on Ubuntu 18.04 and Debian 9 in Travis CI currently

    TASK [geerlingguy.kubernetes : Configure Flannel networking.] ******************
    failed: [instance] (item=kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml) => {"changed": false, "cmd": ["kubectl", "apply", "-f", "https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml"], "delta": "0:00:00.301566", "end": "2019-03-27 22:04:06.108628", "item": "kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml", "msg": "non-zero return code", "rc": 1, "start": "2019-03-27 22:04:05.807062", "stderr": "unable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml\": Get https://172.17.0.2:6443/api?timeout=32s: dial tcp 172.17.0.2:6443: connect: connection refused\nunable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml\": Get https://172.17.0.2:6443/api?timeout=32s: dial tcp 172.17.0.2:6443: connect: connection refused", "stderr_lines": ["unable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml\": Get https://172.17.0.2:6443/api?timeout=32s: dial tcp 172.17.0.2:6443: connect: connection refused", "unable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml\": Get https://172.17.0.2:6443/api?timeout=32s: dial tcp 172.17.0.2:6443: connect: connection refused"], "stdout": "", "stdout_lines": []}

See failed build: https://travis-ci.org/geerlingguy/ansible-role-kubernetes/jobs/512093468

One thing I did notice was a mention of 2+ GB being required for smooth kubeadm operations (kubernetes-sigs/kind#57 (comment))... but this is for the Flannel manifest application, not for kubeadm. So not sure what's causing this.

I remember someone else, somewhere else running into the same issue... but don't remember where or who that was.

Support and test Debian 10 Buster

It was just released a day or so ago... I've just updated my Docker test image and am also running Raspbian 10 on my local Pi 4 cluster and in a VirtualBox test cluster as well.

Swap is not removed in the role

Hello Jeff. I'm continuing my saga 🗡

Now I've noticed that the role doesn't remove swap, so the playbook fails when deploying Flannel.

Thank you for the support.

[BUG] environment-lines are overwritten in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

With the lines
https://github.com/geerlingguy/ansible-role-kubernetes/blob/master/tasks/kubelet-setup.yml#L16-L18
some important stuff seems to be overwritten in
/etc/systemd/system/kubelet.service.d/10-kubeadm.conf

Don't know why, but I have one Ubuntu node which has no
/etc/default/kubelet

But the real problem is that multiple lines begin with Environment, and the first of them gets overwritten:

correct:

# cat /etc/default/kubelet 
KUBELET_EXTRA_ARGS=
# cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
# Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/default/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS

wrong (after changed by the lines linked above)

# cat /etc/default/kubelet
cat: /etc/default/kubelet: No such file or directory
# cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf 
# Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_EXTRA_ARGS="
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/default/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS

So the config file /var/lib/kubelet/config.yaml is not included here, which means that the cluster DNS configuration is missing:

Warning MissingClusterDNS 2m17s (x260 over 62m) kubelet, zwei kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to "Default" policy.

# kubelet --version
Kubernetes v1.13.5
# cat /etc/lsb-release 
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04
DISTRIB_CODENAME=xenial
DISTRIB_DESCRIPTION="Ubuntu 16.04.6 LTS"

taint node-role.kubernetes.io/master not found

I've reset my kubernetes cluster and re-installed with latest role. I use

kubernetes_allow_pods_on_master: True

Which makes the role fail:

TASK [geerlingguy.kubernetes : Allow pods on master node (if configured).] *******************************************************************************************
fatal: [prds0001]: FAILED! => {"changed": true, "cmd": ["kubectl", "taint", "nodes", "--all", "node-role.kubernetes.io/master-"], "delta": "0:00:00.063716", "end": "2019-08-01 19:31:55.417939", "msg": "non-zero return code", "rc": 1, "start": "2019-08-01 19:31:55.354223", "stderr": "taint \"node-role.kubernetes.io/master:\" not found\ntaint \"node-role.kubernetes.io/master:\" not found", "stderr_lines": ["taint \"node-role.kubernetes.io/master:\" not found", "taint \"node-role.kubernetes.io/master:\" not found"], "stdout": "node/prds0001 untainted", "stdout_lines": ["node/prds0001 untainted"]}

My nodes are running Debian Buster

> cat /etc/os-release 
PRETTY_NAME="Debian GNU/Linux 10 (buster)"
NAME="Debian GNU/Linux"
VERSION_ID="10"
VERSION="10 (buster)"
VERSION_CODENAME=buster
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

Here my playbook:

- hosts: servers

  vars:
    - docker_users: ['ansible']
    - docker_edition: 'ce'

    - kubernetes_version: '1.15'
    - kubernetes_version_rhel_package: '1.15.0'
    - kubernetes_allow_pods_on_master: True
    - kubernetes_kubelet_extra_args: ""
    - kubernetes_kubeadm_init_extra_opts: "--apiserver-cert-extra-sans prds0001.intra"
    - kubernetes_apt_repository: "deb http://apt.kubernetes.io/ kubernetes-xenial {{ kubernetes_apt_release_channel }}"

  pre_tasks:
    - name: disable swap
      command: swapoff -a
  
  roles:
    - geerlingguy.docker
    - geerlingguy.kubernetes

No package matching 'kubelet-1.16.4-0'

failed: [hdi_master1] (item={'name': 'kubelet-1.16.4-0', 'state': 'present'}) => {"ansible_loop_var": "item", "changed": false, "item": {"name": "kubelet-1.16.4-0", "state": "present"}, "msg": "No package matching 'kubelet-1.16.4-0' found available, installed or updated", "rc": 126, "results": ["No package matching 'kubelet-1.16.4-0' found available, installed or updated"]}
failed: [hdi_master1] (item={'name': 'kubectl-1.16.4-0', 'state': 'present'}) => {"ansible_loop_var": "item", "changed": false, "item": {"name": "kubectl-1.16.4-0", "state": "present"}, "msg": "No package matching 'kubectl-1.16.4-0' found available, installed or updated", "rc": 126, "results": ["No package matching 'kubectl-1.16.4-0' found available, installed or updated"]}
failed: [hdi_master1] (item={'name': 'kubeadm-1.16.4-0', 'state': 'present'}) => {"ansible_loop_var": "item", "changed": false, "item": {"name": "kubeadm-1.16.4-0", "state": "present"}, "msg": "No package matching 'kubeadm-1.16.4-0' found available, installed or updated", "rc": 126, "results": ["No package matching 'kubeadm-1.16.4-0' found available, installed or updated"]}
failed: [hdi_master1] (item={'name': 'kubernetes-cni', 'state': 'present'}) => {"ansible_loop_var": "item", "changed": false, "item": {"name": "kubernetes-cni", "state": "present"}, "msg": "No package matching 'kubernetes-cni' found available, installed or updated", "rc": 126, "results": ["No package matching 'kubernetes-cni' found available, installed or updated"]

Docker version is greater than the most recently validated version

Hello, Jeff.

Now, I receive this error:

TASK [geerlingguy.kubernetes : Join node to Kubernetes master] ***********************
fatal: [174.35.241.201]: FAILED! => {"changed": true, "cmd": "kubeadm join 10.0.2.15:6443 --token l7ltia.jve4p1520jmt8yw5 --discovery-token-ca-cert-hash sha256:3c55f99c2710fee05c7b6ef701aeec13902b35d8c245dcc109e015cc0fa03a48", "delta": "0:00:00.169237", "end": "2018-09-28 21:20:16.499901", "msg": "non-zero return code", "rc": 2, "start": "2018-09-28 21:20:16.330664", "stderr": "I0928 21:20:16.374776   24804 kernel_validator.go:81] Validating kernel version\nI0928 21:20:16.374840   24804 kernel_validator.go:96]Validating kernel config\n\t[WARNING SystemVerification]: docker version is greater than the most recently validated version. Docker version: 18.06.1-ce. Max validated version: 17.03\n[preflight] Some fatal errors occurred:\n\t[ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables contents are not set to 1\n[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`", "stderr_lines": ["I0928 21:20:16.374776   24804 kernel_validator.go:81] Validating kernel version", "I0928 21:20:16.374840   24804 kernel_validator.go:96] Validating kernel config", "\t[WARNING SystemVerification]: docker version is greater than the most recently validated version. Docker version: 18.06.1-ce. Max validated version: 17.03", "[preflight] Some fatal errors occurred:", "\t[ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables contents are not set to 1", "[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`"], "stdout": "[preflight] running pre-flight checks", "stdout_lines": ["[preflight] running pre-flight checks"]}
        to retry, use: --limit @/home/ansible/projects/lab-kubernetes/playbook.retry

Thank you

Fail when deploying flannel with role 2.0.0

As requested in #20, here is the issue. The role fails when deploying Flannel with this error:

failed: [kubemaster
] (item=kubectl apply -f https: //raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml) => {"changed": false,"cmd": ["kubectl", "apply", "-f", "https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml"], "delta": "0:00:00.280182","end": "2018-12-01 12:25:54.029806", "item": "kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml","msg": "non-zero return code", "rc": 1, "start": "2018-12-01 12:25:53.749624", "stderr": "unable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml\": Get https://x.x.x.x:6443/api?timeout=32s: dial tcp x.x.x.x:6443: connect: connection refused\nunable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml\": Get https://x.x.x.x:6443/api?timeout=32s: dial tcp x.x.x.x:6443: connect: connection refused", "stderr_lines": ["unable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml\": Get https://x.x.x.x:6443/api?timeout=32s: dial tcp x.x.x.x:6443: connect: connection refused", "unable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml\": Get https://x.x.x.x:6443/api?timeout=32s: dial tcp x.x.x.x:6443: connect: connection refused"], "stdout": "", "stdout_lines": []}
failed: [kubemaster
] (item=kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml) => {"changed": false, "cmd": ["kubectl", "apply", "-f", "https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml"], "delta": "0:00:00.317886", "end": "2018-12-01 12:25:55.137868", "item": "kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml", "msg": "non-zero return code", "rc": 1, "start": "2018-12-01 12:25:54.819982", "stderr": "unable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://x.x.x.x:6443/api?timeout=32s: dial tcp x.x.x.x:6443: connect: connection refused\nunable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://x.x.x.x:6443/api?timeout=32s: dial tcp x.x.x.x:6443: connect: connection refused\nunable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://x.x.x.x:6443/api?timeout=32s: dial tcp x.x.x.x:6443: connect: connection refused\nunable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://x.x.x.x:6443/api?timeout=32s: dial tcp x.x.x.x:6443: connect: connection refused\nunable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://x.x.x.x:6443/api?timeout=32s: dial tcp x.x.x.x:6443: connect: connection refused\nunable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://x.x.x.x:6443/api?timeout=32s: dial tcp x.x.x.x:6443: connect: connection refused\nunable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://x.x.x.x:6443/api?timeout=32s: dial tcp x.x.x.x:6443: connect: connection refused\nunable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://x.x.x.x:6443/api?timeout=32s: dial tcp x.x.x.x:6443: connect: connection refused\nunable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://x.x.x.x:6443/api?timeout=32s: dial tcp x.x.x.x:6443: connect: connection refused", "stderr_lines": ["unable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://x.x.x.x:6443/api?timeout=32s: dial tcp x.x.x.x:6443: connect: connection refused", "unable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://x.x.x.x:6443/api?timeout=32s: dial tcp x.x.x.x:6443: connect: connection refused", "unable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://x.x.x.x:6443/api?timeout=32s: dial tcp x.x.x.x:6443: connect: connection refused", "unable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://x.x.x.x:6443/api?timeout=32s: dial tcp x.x.x.x:6443: connect: connection refused", "unable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://x.x.x.x:6443/api?timeout=32s: dial tcp x.x.x.x:6443: connect: connection refused", "unable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://x.x.x.x:6443/api?timeout=32s: dial tcp x.x.x.x:6443: connect: connection refused", "unable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://x.x.x.x:6443/api?timeout=32s: dial tcp x.x.x.x:6443: connect: connection refused", "unable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://x.x.x.x:6443/api?timeout=32s: dial tcp x.x.x.x:6443: connect: connection refused", "unable to recognize \"https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml\": Get https://x.x.x.x:6443/api?timeout=32s: dial tcp x.x.x.x:6443: connect: connection refused"], "stdout": "", "stdout_lines": []}

I am running into this issue on my Debian stretch machine:

Linux Debian-95-stretch-64-minimal 4.9.0-8-amd64 #1 SMP Debian 4.9.110-3+deb9u6 (2018-10-08) x86_64 GNU/Linux

I am using

geerlingguy.kubernetes, 2.0.0

The problem also exists if I set

kubernetes_pod_network_cidr: 10.244.0.0/16
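
For what it's worth, the "connection refused" errors above mean kubectl could not reach the API server on x.x.x.x:6443 at all, so the Flannel manifest never had a chance to apply. One way to rule out a timing problem is to wait for that port before applying the manifest. A minimal sketch using Ansible's wait_for module (the host expression mirrors the role's advertise-address default and is only illustrative):

- name: Wait for the Kubernetes API server to become reachable.
  wait_for:
    host: "{{ kubernetes_apiserver_advertise_address | default(ansible_default_ipv4.address, true) }}"
    port: 6443
    timeout: 300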

not updating kubernetes apt preferences

Hi, this is a feature request.

When this repository updates the variables for the Kubernetes version defined here, the apt pinning will be overwritten. However, this does not follow the upgrade process defined by Kubernetes, e.g. https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade-1-12/

Of course it is possible to use your own variables, but I think it would be better to use the template task here with
force: false
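
For illustration, a minimal sketch of such a task; the template and destination file names are placeholders, not necessarily what this role uses:

- name: Write Kubernetes apt pinning, but never overwrite an existing pin.
  template:
    src: apt-preferences-kubernetes.j2
    dest: /etc/apt/preferences.d/kubernetes
    mode: '0644'
    force: false  # only written if the file does not exist yet

With force: false the pin file is only transferred when it is absent, so a pin that was adjusted manually during a kubeadm upgrade survives later role runs.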

What do you think? If you agree, I will create a PR

CentOS 7 CI test fails with '/etc/default/kubelet does not exist'

See failed build: https://travis-ci.org/geerlingguy/ansible-role-kubernetes/jobs/425572547

TASK [role_under_test : Configure KUBELET_EXTRA_ARGS.] *************************
fatal: [localhost]: FAILED! => {"changed": false, "msg": "Destination /etc/default/kubelet does not exist !", "rc": 257}

This was introduced in #15, and maybe I'll have to add some sort of backwards-compatible shim to preserve the old behavior if the Kubernetes version is < 1.11.
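
A backwards-compatible shim could look roughly like the sketch below; the lineinfile approach and the variable names are assumptions, not the role's actual implementation:

- name: Configure KUBELET_EXTRA_ARGS.
  lineinfile:
    path: "{{ kubelet_environment_file_path | default('/etc/default/kubelet') }}"  # /etc/sysconfig/kubelet on RHEL/CentOS
    regexp: '^KUBELET_EXTRA_ARGS='
    line: "KUBELET_EXTRA_ARGS={{ kubernetes_kubelet_extra_args }}"
    create: true  # packages older than 1.11 may not ship the file at all
  notify: restart kubelet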

Compare to kubespray

I'm wondering why you chose to develop your own Ansible role instead of using kubespray? Could you please document this / compare the two in your README.md?

Kubelet startup config flags deprecated - use config file instead

From Kubelet's startup logs via journalctl -f, when running Kubernetes 1.10+:

Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --resolv-conf has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --fail-swap-on has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.

See docs: https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/
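
For reference, the same settings expressed through a kubelet config file look roughly like this (the values are only examples; they correspond to the deprecated --cgroup-driver, --resolv-conf, and --fail-swap-on flags):

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
resolvConf: /run/systemd/resolve/resolv.conf
failSwapOn: false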

Configure K8 with Cloud Provider

Thanks for this wonderful playbook. It works great for getting a cluster up and operational. I'm wondering if it's possible in the current playbook implementation to configure the cluster to work with a specific cloud provider like GCP or AWS?

Currently I run into conditions where K8 is unable to provision EBS volumes and other AWS resources even though the proper IAM permissions seemed to be applied to the EC2 instance.

Similar situation: https://stackoverflow.com/questions/56064860/failed-to-get-aws-cloud-provider-getcloudprovider-returned-nil-instead

Here is a specific example of the error encountered when attempting to create a Persistent Volume via AWS EBS.

Deployed K8 Storage Class:

kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
  name: ssd
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  encrypted: "true"
  fsType: "xfs"
reclaimPolicy: Delete
allowVolumeExpansion: true

K8 Error displayed when describing the PVC

Warning ProvisioningFailed 2m9s (x11 over 10m) persistentvolume-controller Failed to provision volume with StorageClass "ssd": Failed to get AWS Cloud Provider. GetCloudProvider returned <nil> instead

Any insight on how I might accomplish this using the playbook?
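
Not an authoritative answer, but with the kubeadm config-file variables one possible direction for the legacy in-tree AWS provider looks roughly like the sketch below. The field names follow kubeadm's ClusterConfiguration API; whether this alone is enough for EBS provisioning also depends on kubelet flags, node naming, and instance tags, which are not shown here:

kubernetes_config_cluster_configuration:
  networking:
    podSubnet: "{{ kubernetes_pod_network.cidr }}"
  kubernetesVersion: "{{ kubernetes_version_kubeadm }}"
  apiServer:
    extraArgs:
      cloud-provider: aws
  controllerManager:
    extraArgs:
      cloud-provider: aws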

Enable passing service-cidr to kubeadm init

It would be great to be able to pass service-cidr to kubeadm init. Some default var like kubernetes_kubeadm_init_extra_args or kubernetes_service_network_cidr: "10.96.0.0/12" would be very helpful.
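
With the config-file support described in the README, the service CIDR maps onto the serviceSubnet field of ClusterConfiguration; a hedged sketch of how that could be expressed:

kubernetes_config_cluster_configuration:
  networking:
    podSubnet: "{{ kubernetes_pod_network.cidr }}"
    serviceSubnet: "10.96.0.0/12"
  kubernetesVersion: "{{ kubernetes_version_kubeadm }}"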

Error during node setup

I get the following error on my "worker" nodes when trying to use this role.

TASK [geerlingguy.kubernetes : Join node to Kubernetes master] ********************************************************************
...
...
fatal: [publicip]: FAILED! => {
    "changed": true,
    "cmd": "kubeadm join privateip:6443 --token xfr3l9.0b4q4ywkl4r2wxol --discovery-token-ca-cert-hash sha256:f08127b739d4993778864a4415e1c2f56a4e7bfcd8b878c1fb8f06de07f8c3cd",
    "delta": "0:00:00.659428",
    "end": "2018-08-24 21:53:13.746266",
    "invocation": {
        "module_args": {
            "_raw_params": "kubeadm join privateip:6443 --token xfr3l9.0b4q4ywkl4r2wxol --discovery-token-ca-cert-hash sha256:f08127b739d4993778864a4415e1c2f56a4e7bfcd8b878c1fb8f06de07f8c3cd",
            "_uses_shell": true,
            "argv": null,
            "chdir": null,
            "creates": "/etc/kubernetes/kubelet.conf",
            "executable": null,
            "removes": null,
            "stdin": null,
            "warn": true
        }
    },
    "msg": "non-zero return code",
    "rc": 1,
    "start": "2018-08-24 21:53:13.086838",
    "stderr": "I0824 21:53:13.197909   11924 kernel_validator.go:81] Validating kernel version\nI0824 21:53:13.197988   11924 kernel_validator.go:96] Validating kernel config\nconfigmaps \"kubelet-config-1.11\" is forbidden: User \"system:bootstrap:xfr3l9\" cannot get configmaps in the namespace \"kube-system\"",
    "stderr_lines": [
        "I0824 21:53:13.197909   11924 kernel_validator.go:81] Validating kernel version",
        "I0824 21:53:13.197988   11924 kernel_validator.go:96] Validating kernel config",
        "configmaps \"kubelet-config-1.11\" is forbidden: User \"system:bootstrap:xfr3l9\" cannot get configmaps in the namespace \"kube-system\""
    ],
    "stdout": "[preflight] running pre-flight checks\n[discovery] Trying to connect to API Server \"privateip:6443\"\n[discovery] Created cluster-info discovery client, requesting info from \"https://privateip:6443\"\n[discovery] Requesting info from \"https://privateip:6443\" again to validate TLS against the pinned public key\n[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server \"privateip:6443\"\n[discovery] Successfully established connection with API Server \"privateip:6443\"\n[kubelet] Downloading configuration for the kubelet from the \"kubelet-config-1.11\" ConfigMap in the kube-system namespace",
    "stdout_lines": [
        "[preflight] running pre-flight checks",
        "[discovery] Trying to connect to API Server \"privateip:6443\"",
        "[discovery] Created cluster-info discovery client, requesting info from \"https://privateip:6443\"",
        "[discovery] Requesting info from \"https://privateip:6443\" again to validate TLS against the pinned public key",
        "[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server \"privateip:6443\"",
        "[discovery] Successfully established connection with API Server \"privateip:6443\"",
        "[kubelet] Downloading configuration for the kubelet from the \"kubelet-config-1.11\" ConfigMap in the kube-system namespace"
    ]
}

Failure when joining nodes to master

I've been working with a Vagrantfile to try to bootstrap a Kubernetes cluster using your role. I've set up a three-machine configuration with one master and two nodes. At first, I had a lot of trouble with the script hanging at the "join node to master" step, but then I realized I was advertising the Kubernetes API on the NAT interface that Vagrant applies to all machines. After setting up a private network and changing the advertise address for Kubernetes, I was able to get an actual error instead of a hang. Here's the link to my repo, and here's the output of the role:

TASK [kubernetes : include_tasks] **********************************************
skipping: [kube-master]
skipping: [kube-worker-1]
skipping: [kube-worker-2]

TASK [kubernetes : include_tasks] **********************************************
included: /home/jonathan/Documents/Projects/home-server/scripts/roles/kubernetes/tasks/setup-Debian.yml for kube-master, kube-worker-1, kube-worker-2

TASK [kubernetes : Ensure dependencies are installed.] *************************
ok: [kube-worker-1] => (item=[u'apt-transport-https', u'ca-certificates'])
ok: [kube-master] => (item=[u'apt-transport-https', u'ca-certificates'])
ok: [kube-worker-2] => (item=[u'apt-transport-https', u'ca-certificates'])

TASK [kubernetes : Add Kubernetes apt key.] ************************************
changed: [kube-worker-1]
changed: [kube-master]
changed: [kube-worker-2]

TASK [kubernetes : Add Kubernetes repository.] *********************************
changed: [kube-master]
changed: [kube-worker-2]
changed: [kube-worker-1]

TASK [kubernetes : Ensure dependencies are installed.] *************************
ok: [kube-worker-2]
ok: [kube-worker-1]
ok: [kube-master]

TASK [kubernetes : Install Kubernetes packages.] *******************************
changed: [kube-master] => (item={u'state': u'present', u'name': u'kubelet'})
changed: [kube-worker-1] => (item={u'state': u'present', u'name': u'kubelet'})
changed: [kube-master] => (item={u'state': u'present', u'name': u'kubeadm'})
ok: [kube-master] => (item={u'state': u'present', u'name': u'kubectl'})
changed: [kube-worker-2] => (item={u'state': u'present', u'name': u'kubelet'})
ok: [kube-master] => (item={u'state': u'present', u'name': u'kubernetes-cni'})
changed: [kube-worker-1] => (item={u'state': u'present', u'name': u'kubeadm'})
ok: [kube-worker-1] => (item={u'state': u'present', u'name': u'kubectl'})
ok: [kube-worker-1] => (item={u'state': u'present', u'name': u'kubernetes-cni'})
changed: [kube-worker-2] => (item={u'state': u'present', u'name': u'kubeadm'})
ok: [kube-worker-2] => (item={u'state': u'present', u'name': u'kubectl'})
ok: [kube-worker-2] => (item={u'state': u'present', u'name': u'kubernetes-cni'})

TASK [kubernetes : Configure KUBELET_EXTRA_ARGS.] ******************************
changed: [kube-worker-1]
changed: [kube-worker-2]
changed: [kube-master]

TASK [kubernetes : Reload systemd unit if args were changed.] ******************
changed: [kube-worker-2]
changed: [kube-worker-1]
changed: [kube-master]

TASK [kubernetes : Ensure kubelet is started and enabled at boot.] *************
ok: [kube-master]
ok: [kube-worker-1]
ok: [kube-worker-2]

TASK [kubernetes : Check if Kubernetes has already been initialized.] **********
ok: [kube-worker-1]
ok: [kube-worker-2]
ok: [kube-master]

TASK [kubernetes : include_tasks] **********************************************
skipping: [kube-worker-1]
skipping: [kube-worker-2]
included: /home/jonathan/Documents/Projects/home-server/scripts/roles/kubernetes/tasks/master-setup.yml for kube-master

TASK [kubernetes : Initialize Kubernetes master with kubeadm init.] ************
changed: [kube-master]

TASK [kubernetes : Print the init output to screen.] ***************************
skipping: [kube-master]

TASK [kubernetes : Ensure .kube directory exists.] *****************************
changed: [kube-master]

TASK [kubernetes : Symlink the kubectl admin.conf to ~/.kube/conf.] ************
changed: [kube-master]

TASK [kubernetes : Configure Flannel networking.] ******************************
changed: [kube-master] => (item=kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml)
changed: [kube-master] => (item=kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml)

TASK [kubernetes : Allow pods on master node (if configured).] *****************
changed: [kube-master]

TASK [kubernetes : Check if Kubernetes Dashboard UI service already exists.] ***
ok: [kube-master]

TASK [kubernetes : Enable the Kubernetes Web Dashboard UI (if configured).] ****
skipping: [kube-master]

TASK [kubernetes : Get the kubeadm join command from the Kubernetes master.] ***
ok: [kube-master]

TASK [kubernetes : include_tasks] **********************************************
skipping: [kube-master]
included: /home/jonathan/Documents/Projects/home-server/scripts/roles/kubernetes/tasks/node-setup.yml for kube-worker-1, kube-worker-2

TASK [kubernetes : Join node to Kubernetes master] *****************************
fatal: [kube-worker-1]: FAILED! => {"changed": true, "cmd": "kubeadm join 10.0.0.10:6443 --token 3w17ow.5vh1zdgcg6lgzgtj --discovery-token-ca-cert-hash sha256:f8276059fb2b765a55b4723cfb5e7ba413c021c58c5eb81bcdccac2933a730c0", "delta": "0:00:10.863798", "end": "2018-07-09 11:22:15.461549", "msg": "non-zero return code", "rc": 1, "start": "2018-07-09 11:22:04.597751", "stderr": "\t[WARNING RequiredIPVSKernelModulesAvailable]: the IPVS proxier will not be used, because the following required kernel modules are not loaded: [ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh] or no builtin kernel ipvs support: map[ip_vs:{} ip_vs_rr:{} ip_vs_wrr:{} ip_vs_sh:{} nf_conntrack_ipv4:{}]\nyou can solve this problem with following methods:\n 1. Run 'modprobe -- ' to load missing kernel modules;\n2. Provide the missing builtin kernel ipvs support\n\nI0709 11:22:04.695777    7794 kernel_validator.go:81] Validating kernel version\nI0709 11:22:04.695970    7794 kernel_validator.go:96] Validating kernel config\n\t[WARNING SystemVerification]: docker version is greater than the most recently validated version. Docker version: 18.03.1-ce. Max validated version: 17.03\nconfigmaps \"kubelet-config-1.11\" is forbidden: User \"system:bootstrap:3w17ow\" cannot get configmaps in the namespace \"kube-system\"", "stderr_lines": ["\t[WARNING RequiredIPVSKernelModulesAvailable]: the IPVS proxier will not be used, because the following required kernel modules are not loaded: [ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh] or no builtin kernel ipvs support: map[ip_vs:{} ip_vs_rr:{} ip_vs_wrr:{} ip_vs_sh:{} nf_conntrack_ipv4:{}]", "you can solve this problem with following methods:", " 1. Run 'modprobe -- ' to load missing kernel modules;", "2. Provide the missing builtin kernel ipvs support", "", "I0709 11:22:04.695777    7794 kernel_validator.go:81] Validating kernel version", "I0709 11:22:04.695970    7794 kernel_validator.go:96] Validating kernel config", "\t[WARNING SystemVerification]: docker version is greater than the most recently validated version. Docker version: 18.03.1-ce. Max validated version: 17.03", "configmaps \"kubelet-config-1.11\" is forbidden: User \"system:bootstrap:3w17ow\" cannot get configmaps in the namespace \"kube-system\""], "stdout": "[preflight] running pre-flight checks\n[discovery] Trying to connect to API Server \"10.0.0.10:6443\"\n[discovery] Created cluster-info discovery client, requesting info from \"https://10.0.0.10:6443\"\n[discovery] Failed to connect to API Server \"10.0.0.10:6443\": token id \"3w17ow\" is invalid for this cluster or it has expired. Use \"kubeadm token create\" on the master node to creating a new valid token\n[discovery] Trying to connect to API Server \"10.0.0.10:6443\"\n[discovery] Created cluster-info discovery client, requesting info from \"https://10.0.0.10:6443\"\n[discovery] Failed to connect to API Server \"10.0.0.10:6443\": token id \"3w17ow\" is invalid for this cluster or it has expired. Use \"kubeadm token create\" on the master node to creating a new valid token\n[discovery] Trying to connect to API Server \"10.0.0.10:6443\"\n[discovery] Created cluster-info discovery client, requesting info from \"https://10.0.0.10:6443\"\n[discovery] Requesting info from \"https://10.0.0.10:6443\" again to validate TLS against the pinned public key\n[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server \"10.0.0.10:6443\"\n[discovery] Successfully established connection with API Server \"10.0.0.10:6443\"\n[kubelet] Downloading configuration for the kubelet from the \"kubelet-config-1.11\" ConfigMap in the kube-system namespace", "stdout_lines": ["[preflight] running pre-flight checks", "[discovery] Trying to connect to API Server \"10.0.0.10:6443\"", "[discovery] Created cluster-info discovery client, requesting info from \"https://10.0.0.10:6443\"", "[discovery] Failed to connect to API Server \"10.0.0.10:6443\": token id \"3w17ow\" is invalid for this cluster or it has expired. Use \"kubeadm token create\" on the master node to creating a new valid token", "[discovery] Trying to connect to API Server \"10.0.0.10:6443\"", "[discovery] Created cluster-info discovery client, requesting info from \"https://10.0.0.10:6443\"", "[discovery] Failed to connect to API Server \"10.0.0.10:6443\": token id \"3w17ow\" is invalid for this cluster or it has expired. Use \"kubeadm token create\" on the master node to creating a new valid token", "[discovery] Trying to connect to API Server \"10.0.0.10:6443\"", "[discovery] Created cluster-info discovery client, requesting info from \"https://10.0.0.10:6443\"", "[discovery] Requesting info from \"https://10.0.0.10:6443\" again to validate TLS against the pinned public key", "[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server \"10.0.0.10:6443\"", "[discovery] Successfully established connection with API Server \"10.0.0.10:6443\"", "[kubelet] Downloading configuration for the kubelet from the \"kubelet-config-1.11\" ConfigMap in the kube-system namespace"]}
fatal: [kube-worker-2]: FAILED! => {"changed": true, "cmd": "kubeadm join 10.0.0.10:6443 --token 3w17ow.5vh1zdgcg6lgzgtj --discovery-token-ca-cert-hash sha256:f8276059fb2b765a55b4723cfb5e7ba413c021c58c5eb81bcdccac2933a730c0", "delta": "0:00:10.880070", "end": "2018-07-09 11:22:14.974144", "msg": "non-zero return code", "rc": 1, "start": "2018-07-09 11:22:04.094074", "stderr": "\t[WARNING RequiredIPVSKernelModulesAvailable]: the IPVS proxier will not be used, because the following required kernel modules are not loaded: [ip_vs_sh ip_vs ip_vs_rr ip_vs_wrr] or no builtin kernel ipvs support: map[ip_vs_rr:{} ip_vs_wrr:{} ip_vs_sh:{} nf_conntrack_ipv4:{} ip_vs:{}]\nyou can solve this problem with following methods:\n 1. Run 'modprobe -- ' to load missing kernel modules;\n2. Provide the missing builtin kernel ipvs support\n\nI0709 11:22:04.206006    7786 kernel_validator.go:81] Validating kernel version\nI0709 11:22:04.206166    7786 kernel_validator.go:96] Validating kernel config\n\t[WARNING SystemVerification]: docker version is greater than the most recently validated version. Docker version: 18.03.1-ce. Max validated version: 17.03\nconfigmaps \"kubelet-config-1.11\" is forbidden: User \"system:bootstrap:3w17ow\" cannot get configmaps in the namespace \"kube-system\"", "stderr_lines": ["\t[WARNING RequiredIPVSKernelModulesAvailable]: the IPVS proxier will not be used, because the following required kernel modules are not loaded: [ip_vs_sh ip_vs ip_vs_rr ip_vs_wrr] or no builtin kernel ipvs support: map[ip_vs_rr:{} ip_vs_wrr:{} ip_vs_sh:{} nf_conntrack_ipv4:{} ip_vs:{}]", "you can solve this problem with following methods:", " 1. Run 'modprobe -- ' to load missing kernel modules;", "2. Provide the missing builtin kernel ipvs support", "", "I0709 11:22:04.206006    7786 kernel_validator.go:81] Validating kernel version", "I0709 11:22:04.206166    7786 kernel_validator.go:96] Validating kernel config", "\t[WARNING SystemVerification]: docker version is greater than the most recently validated version. Docker version: 18.03.1-ce. Max validated version: 17.03", "configmaps \"kubelet-config-1.11\" is forbidden: User \"system:bootstrap:3w17ow\" cannot get configmaps in the namespace \"kube-system\""], "stdout": "[preflight] running pre-flight checks\n[discovery] Trying to connect to API Server \"10.0.0.10:6443\"\n[discovery] Created cluster-info discovery client, requesting info from \"https://10.0.0.10:6443\"\n[discovery] Failed to connect to API Server \"10.0.0.10:6443\": token id \"3w17ow\" is invalid for this cluster or it has expired. Use \"kubeadm token create\" on the master node to creating a new valid token\n[discovery] Trying to connect to API Server \"10.0.0.10:6443\"\n[discovery] Created cluster-info discovery client, requesting info from \"https://10.0.0.10:6443\"\n[discovery] Failed to connect to API Server \"10.0.0.10:6443\": token id \"3w17ow\" is invalid for this cluster or it has expired. Use \"kubeadm token create\" on the master node to creating a new valid token\n[discovery] Trying to connect to API Server \"10.0.0.10:6443\"\n[discovery] Created cluster-info discovery client, requesting info from \"https://10.0.0.10:6443\"\n[discovery] Requesting info from \"https://10.0.0.10:6443\" again to validate TLS against the pinned public key\n[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server \"10.0.0.10:6443\"\n[discovery] Successfully established connection with API Server \"10.0.0.10:6443\"\n[kubelet] Downloading configuration for the kubelet from the \"kubelet-config-1.11\" ConfigMap in the kube-system namespace", "stdout_lines": ["[preflight] running pre-flight checks", "[discovery] Trying to connect to API Server \"10.0.0.10:6443\"", "[discovery] Created cluster-info discovery client, requesting info from \"https://10.0.0.10:6443\"", "[discovery] Failed to connect to API Server \"10.0.0.10:6443\": token id \"3w17ow\" is invalid for this cluster or it has expired. Use \"kubeadm token create\" on the master node to creating a new valid token", "[discovery] Trying to connect to API Server \"10.0.0.10:6443\"", "[discovery] Created cluster-info discovery client, requesting info from \"https://10.0.0.10:6443\"", "[discovery] Failed to connect to API Server \"10.0.0.10:6443\": token id \"3w17ow\" is invalid for this cluster or it has expired. Use \"kubeadm token create\" on the master node to creating a new valid token", "[discovery] Trying to connect to API Server \"10.0.0.10:6443\"", "[discovery] Created cluster-info discovery client, requesting info from \"https://10.0.0.10:6443\"", "[discovery] Requesting info from \"https://10.0.0.10:6443\" again to validate TLS against the pinned public key", "[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server \"10.0.0.10:6443\"", "[discovery] Successfully established connection with API Server \"10.0.0.10:6443\"", "[kubelet] Downloading configuration for the kubelet from the \"kubelet-config-1.11\" ConfigMap in the kube-system namespace"]}

RUNNING HANDLER [kubernetes : restart kubelet] *********************************
changed: [kube-master]
        to retry, use: --limit @/home/jonathan/Documents/Projects/home-server/scripts/bootstrap.retry

PLAY RECAP *********************************************************************
kube-master                : ok=31   changed=16   unreachable=0    failed=0
kube-worker-1              : ok=23   changed=10   unreachable=0    failed=1
kube-worker-2              : ok=23   changed=10   unreachable=0    failed=1

Ansible failed to complete successfully. Any error output should be
visible above. Please fix these errors and try again.

I think I might need to change OS or add a package, but I'm kind of lost at this point.
EDIT: Just read the rest of the error output and noticed that the token isn't working. Is this a formatting issue? Should I print the join command fully and see what's happening with it?
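
Printing the registered join command is a reasonable next step; a minimal sketch, assuming the role stores it in a fact such as kubernetes_join_command (the fact name is an assumption):

- name: Show the kubeadm join command gathered from the master (troubleshooting only).
  debug:
    var: kubernetes_join_command  # assumed name of the registered fact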

Configure Flannel networking

(item=kubectl apply -f /tmp/kac.yml) => {"ansible_loop_var": "item", "changed": false, "cmd": ["kubectl", "apply", "-f", "/tmp/kac.yml"], "delta": "0:00:00.330669", "end": "2019-06-13 16:09:26.484968", "item": "kubectl apply -f /tmp/kac.yml", "msg": "non-zero return code", "rc": 1, "start": "2019-06-13 16:09:26.154299", "stderr": "error: the path \"/tmp/kac.yml\" does not exist", "stderr_lines": ["error: the path \"/tmp/kac.yml\" does not exist"], "stdout": "", "stdout_lines": []}

Not installing dashboard if the cluster is running

I forgot to enable the dashboard and tried to run the role again. I got this:

TASK [geerlingguy.kubernetes : Check if Kubernetes Dashboard UI service already exists.] *******************************
ok: [192.168.88.247]

TASK [geerlingguy.kubernetes : Enable the Kubernetes Web Dashboard UI (if configured).] ********************************
skipping: [192.168.88.247]

When I run the command manually, I get nothing, obviously because there is no dashboard running.

kubectl get services --namespace kube-system | grep -q kubernetes-dashboard
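
A hedged sketch of how the check and the enable step could be wired so that a missing dashboard actually triggers the install; the variable names kubernetes_enable_web_ui and kubernetes_web_ui_manifest are illustrative, not necessarily the role's:

- name: Check if Kubernetes Dashboard UI service already exists.
  shell: kubectl get services --namespace kube-system | grep -q kubernetes-dashboard
  register: kubernetes_dashboard_service
  failed_when: false
  changed_when: false

- name: Enable the Kubernetes Web Dashboard UI (if configured).
  command: kubectl apply -f {{ kubernetes_web_ui_manifest }}
  when:
    - kubernetes_enable_web_ui | bool
    - kubernetes_dashboard_service.rc != 0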
