
ansible-kubernetes-openshift-pi3's Introduction

Ansible 2 Playbooks for installing Kubernetes on Raspberry Pis 3

Here are the Ansible playbooks for a Raspberry Pi Cluster running Docker and Kubernetes as described in this Blog Post. These playbooks require Ansible 2.0 and won't work with Ansible 1.x.

The goals of this project are:

  • Using Ansible not only for a one-shot installation but also for maintenance and upgrades.
  • Using WiFi for connecting the cluster. See below for the reason.
  • Getting OpenShift Origin running and being able to switch between Kubernetes and OpenShift via Ansible.
  • Creating a demonstration platform for my favourite development and integration platform, fabric8.

Shopping List

Here's a shopping list for a Raspberry Pi 3 cluster, along with (non-affiliate) links to (German) shops (as of April 2016), but I'm sure you can find the parts elsewhere, too.

Amount  Part                     Price
4       Raspberry Pi 3           4 * 38 EUR
4       Micro SD Card 32 GB      4 * 11 EUR
1       WLAN Router              22 EUR
4       USB wires                9 EUR
1       Power Supply             30 EUR
1       Case                     10 EUR
3       Intermediate Case Plate  3 * 7 EUR

All in all, a 4 node Pi cluster for 288 EUR (as of April 2016).

Some remarks:

  • Using WiFi for the connection has the big advantage that the Raspberry Pi 3's integrated BCM43438 WiFi chip doesn't go over USB, which saves valuable bandwidth for IO in general. That way you are able to get ~25 MB/s for disk IO and network traffic, respectively. Fewer cables, too, of course. You can always plug in only the power wire for demos ;-)
  • Use a Class 10 Micro SD card, but it doesn't have to be the fastest in the world, as the USB bus only allows around 25 MB/s anyway.
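
If you want to verify these throughput numbers on your own hardware, here is a quick sketch using the benchmarking tools that the base playbook installs later (the SD card device name /dev/mmcblk0 is an assumption, check yours first):

# on a node: rough sequential read speed of the SD card
sudo hdparm -t /dev/mmcblk0

# network throughput over WiFi: start a server on n0 ...
iperf -s
# ... and run the client from another node
iperf -c n0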

Initial Pi Setup

Most of the installation is automated by using Ansible. Thanks to Hypriot images a complete headless setup is possible.

  1. Download the latest Hypriot image and store it as hypriot.zip:

     curl -L https://github.com/hypriot/image-builder-rpi/releases/download/v1.7.1/hypriotos-rpi-v1.7.1.img.zip -o hypriot.zip
    
  2. Install Hypriot's flash installer script. Follow the directions on the installation page. Important: For the latest Hypriot images (>= 1.7.0) please use the shell script from the master branch; the latest release 0.2.0 does not yet support the new configuration used by Hypriot 1.7.0. The script must be reachable from within your $PATH and it must be executable (a sketch follows this list).

  3. Insert your Micro SD card into your desktop computer (possibly via an adapter) and run the wrapper script

tools/flash-hypriot.sh --hostname n0 --ssid "mysid" --password "secret" --image hypriot.zip

"mysid" is your WLAN SSID and "secret" the corresponding password. You will be asked which device to write to. Check this carefully, otherwise you could destroy your desktop OS by selecting the wrong device. Typically it's something like /dev/disk2 on OS X, but that depends on the number of hard drives you have.

  4. Repeat steps 2 to 3 for each Micro SD card. Please adapt the hostname before each round to n1, n2, n3.
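
For steps 2 and 3, here is a rough sketch on OS X, assuming the flash script still lives on the master branch of the hypriot/flash repository (check that repository if the path has moved):

# fetch the flash script from the master branch, make it executable, put it on $PATH
curl -LO https://raw.githubusercontent.com/hypriot/flash/master/flash
chmod +x flash
sudo mv flash /usr/local/bin/

# identify the SD card device before flashing (e.g. /dev/disk2 on OS X);
# on Linux, lsblk serves the same purpose
diskutil list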

Network Setup

It is now time to configure your WLAN router. This of course depends on which router you use. The following instructions are based on a TP-Link TL-WR802N, which is quite inexpensive but still absolutely ok for our purposes, since it sits very close to the cluster and my notebook anyway.

First of all you need to setup the SSID and password. Use the same credentials with which you have configured your images.

My setup is that I span a private network 192.168.23.0/24 for the Pi cluster, which my MacBook also joins via its integrated WiFi.

The addresses I have chosen are :

IP                                 Device
192.168.23.1                       WLAN Router
192.168.23.100                     MacBook's WLAN
192.168.23.200 ... 192.168.23.203  Raspberry Pis

The MacBook is set up for NAT and forwards traffic from this private network to the internet. This script helps in setting up the forwarding and NAT rules on OS X.

In order to configure your WLAN router you need to connect to it according to its setup instructions. The router is set up in Access Point mode with DHCP enabled. As soon as the MACs of the Pis are known (which you can see as soon as they connect for the first time via WiFi), I configured them to always receive the same DHCP lease. For the TL-WR802N this can be done in the configuration section DHCP -> Address Reservation. In DHCP -> DHCP Settings the default gateway is set to 192.168.23.100, which is my notebook's WLAN IP.

Start up all nodes; you should be able to ping every node in your cluster. I added n0 ... n3 to my notebook's /etc/hosts pointing to 192.168.23.200 ... 192.168.23.203 for convenience.
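
For example, on the notebook (IPs as in the table above):

# append the cluster nodes to the notebook's /etc/hosts
cat <<'EOF' | sudo tee -a /etc/hosts
192.168.23.200 n0
192.168.23.201 n1
192.168.23.202 n2
192.168.23.203 n3
EOF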

You should be able to ssh into every Pi with user pirate and password hypriot. Also, if you set up the forwarding on your desktop properly, you should be able to ping the outside world from within each Pi. Internet access from the nodes is mandatory for setting up the nodes with Ansible.
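
A quick way to check both conditions from the notebook (the password is hypriot on a fresh image):

# verify SSH access and outbound internet connectivity on every node
for h in n0 n1 n2 n3; do
  ssh pirate@$h 'hostname && ping -c1 -W2 8.8.8.8 >/dev/null && echo internet: ok'
done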

Ansible Playbooks

After this initial setup is done, the next step is to initialize the base system with Ansible. You will need Ansible 2 installed on your desktop (e.g. brew install ansible when running on OS X).

Ansible Configuration

  1. Checkout the Ansible playbooks:

     git clone https://github.com/Project31/ansible-kubernetes-openshift-pi3.git k8s-pi
     cd k8s-pi
    
  2. Copy over hosts.example and adapt it to your needs

     cp hosts.example hosts
     vi hosts
    

    There are three groups:

    • pis contains all members of your cluster, where one is marked as "master" in the field host_extra. This group will be added to every node's /etc/hosts. It is important that one host is marked as "master", since the playbooks rely on this host alias for accessing the API server.
    • master contains the IP address of the master
    • nodes contains all nodes which are not the master

    An illustrative inventory is sketched below, after this list.
  3. If required, copy over the configuration and adapt it:

     cp config.yml.example config.yml
     vi config.yml
    
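
For orientation, a minimal inventory could look roughly like the sketch below; the exact variable names (e.g. ansible_host, host_extra) and group layout are defined by hosts.example in the repository, which is authoritative:

# illustrative inventory only -- adapt hosts.example instead of copying this
[pis]
n0 ansible_host=192.168.23.200 host_extra=master
n1 ansible_host=192.168.23.201
n2 ansible_host=192.168.23.202
n3 ansible_host=192.168.23.203

[master]
n0

[nodes]
n1
n2
n3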

Init machine-id

Because of a peculiarity of HypriotOS 1.5 which causes every machine id to be the same, /etc/machine-id needs to be initialized once for each node. This is required later e.g. by the Weave overlay network, as it calculates its virtual MAC address from this datum.

To do so, call the following Ansible ad-hoc command:

ansible pis -u pirate -k -i hosts --become -m shell --args "dbus-uuidgen > /etc/machine-id"

Use "hypriot" as the password here. You can also use the script tools/init_machine_id.sh. If you get errors during this command, please check that you don't have stale entries in your ~/.ssh/known_hosts.
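
To confirm that every node now reports a distinct id, you can run, for example:

# each node should print a different UUID
ansible pis -u pirate -k -i hosts -m command --args "cat /etc/machine-id"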

Basic Node Setup

If you have already created a cluster with these playbooks and want to start afresh, please make sure that you clean up the old host keys from your ~/.ssh/known_hosts. The script tools/cleanup_known_hosts.sh can be used for this. You should be able to ssh into each of the nodes without warnings. Also, you must be able to reach the internet from the nodes.
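
If you prefer not to use the script, the same cleanup can be done directly with ssh-keygen, for example:

# remove stale host keys for the cluster nodes from ~/.ssh/known_hosts
for h in n0 n1 n2 n3; do ssh-keygen -R $h; done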

In the next step the basic setup (without Kubernetes) is performed. This is done by

ansible-playbook -k -i hosts setup.yml

When you are prompted for the password, use hypriot. You will probably also need to confirm the SSH authenticity for each host with yes.

The following steps will be applied by this command (which may take a bit):

  • Docker will be installed from the Hypriot repositories
  • Your public SSH key ~/.ssh/id_rsa.pub is copied over to the Pi user's authorized_keys, and the user's password will be taken from config.yml (see the note after this list if you don't have a key pair yet)
  • Some extra tools are installed for your convenience and some benchmarking:
    • hdparm
    • iperf
    • mtr
    • vim
    • dnsutils
    • jq
  • The hostname is set to the configured node name. Also, /etc/hosts is set up to contain all nodes with their short names.
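
Note: the SSH key step above assumes that ~/.ssh/id_rsa.pub already exists on your desktop. If it doesn't, generate a key pair first, for example:

# create an RSA key pair; accept the default path ~/.ssh/id_rsa
ssh-keygen -t rsa -b 4096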

With this basic setup you already have a working Docker environment.

Ingress

As ingress controller we use Traefik. It is deployed as part of the management playbook and runs as a DaemonSet.

To test the ingress, add <nodeIPAddress> traefik-ui.pi.local dashboard.pi.local to your /etc/hosts file.

For any other resource you want to expose, create an Ingress resource:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: traefik-web-ui
  namespace: kube-system
  labels:
    k8s-app: traefik-ingress-lb
spec:
  rules:
  - host: traefik-ui.pi.local
    http:
      paths:
      - path: /
        backend:
          serviceName: traefik-service
          servicePort: admin
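
Saved as, for example, traefik-web-ui.yml (the filename is arbitrary), the resource can be applied and tested without touching /etc/hosts by setting the Host header explicitly:

kubectl apply -f traefik-web-ui.yml

# <nodeIPAddress> is any node running the Traefik DaemonSet
curl -H "Host: traefik-ui.pi.local" http://<nodeIPAddress>/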

Kubernetes Setup

The final step for a working Kubernetes cluster is to run

ansible-playbook -i hosts kubernetes.yml

This will install one master on n0 and three additional nodes n1, n2, n3 with the help of kubeadm.

In addition this playbook does the following:

  • Creates a token in run/kubeadm-token.txt (if not already done) and uses it for installing master and nodes
  • Installs kubectl and an alias k
  • Creates a run/pi-cluster.cfg which can be used with kubectl on the local host to access the Pi cluster's master. Either use kubectl --kubeconfig run/pi-cluster.cfg or set the environment variable export KUBECONFIG=$(pwd)/run/pi-cluster.cfg

The initial installation may take a while until all infrastructure Docker images have been pulled from the registry. Eventually you should be able to use kubectl get nodes from e.g. n0 or from the localhost (if you set the config as described above).
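
For example, from the repository root on your notebook:

# point kubectl at the Pi cluster and check that all nodes register
export KUBECONFIG=$(pwd)/run/pi-cluster.cfg
kubectl get nodes
kubectl get pods -n kube-system -o wide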

Full Kubernetes reset

In case you need a full cleanup of the Kubernetes setup, use:

ansible-playbook -i hosts kubernetes-full-reset.yml

This is also needed in case you want to change one of the Pod or Services subnets.

Tools

In the tools/ directory you find some useful scripts:

  • cleanup_known_hosts.sh removes the entries for n0, n1, n2 and n3 from ~/.ssh/known_hosts in case you want to completely reinstall the cluster
  • setup_nat_on_osx.sh switches on NAT so that the cluster can reach the internet for loading the required images. Call it without arguments for usage information.
  • setup_nat_off_osx.sh switches off NAT again.
  • halt_pis.sh stops the cluster (still needs a bit of tuning)
  • reboot_pis.sh reboots the cluster (still needs a bit of tuning)
  • init_machine_id.sh initializes /etc/machine-id to a random value on every host

FAQ

  • I have random DNS issues when resolving external IP addresses

One reason for this could be that your external DNS provider does some nasty things when a resolution fails (which might then even be cached by Kubernetes). For example, Deutsche Telekom is known to enable by default a so-called "Navigationshilfe" (navigation aid), which redirects failed DNS lookups to their own pages. You can turn this off in the "Kundencenter" preferences. More on the symptoms can be found in this issue.
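
A quick way to check whether your resolver rewrites failed lookups (the test name below is arbitrary; dig comes with the dnsutils package installed on the nodes by the base playbook):

# a non-existent name should yield no answer (NXDOMAIN);
# if an IP address comes back, the provider's redirection is active
dig +short name-that-should-not-exist.example.org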

Next steps ...

For the future we plan to add the following features:

  • Volume support
  • Registry
  • OpenShift support

Acknowledgements

  • Many thanks go out to Lucas Käldström, whose kubeadm workshop gave a lot of inspiration to these playbooks.
  • Thanks to Sergio Sisternes for the inspiration to switch to kubeadm which makes things much easier and the manual setup of etcd and flanneld superfluous.
  • Many kudos to Robert Peteuil for a thorough review of the Ansible tasks and update information for Hypriot 1.5. This has simplified the role definitions considerably.
  • Thanks to Mangirdas Judeikis for updating the playbooks to Kubernetes 1.9, Ansible 2.4 and introducing Traefik as load balancer.

ansible-kubernetes-openshift-pi3's People

Contributors

calphool, kenden, maxromanovsky, mjudeikis, rhuss, sandor-nemeth, sitoch


ansible-kubernetes-openshift-pi3's Issues

kubeadm 1.7.1 cannot start due to bug

Kubeadm 1.7.1 is broken; when init runs, you'll get an error:

$ kubeadm init --config /etc/kubernetes/kubeadm.yml                                                                                                                 
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.            
[init] Using Kubernetes version: v1.7.1         
[init] Using Authorization modes: [Node RBAC]   
[preflight] Running pre-flight checks           
can not mix '--config' with other arguments  

The solution is to downgrade to 1.7.0 with the following change in roles/kubernetes/tasks/apt.yml:

- name: Install Packages
  apt:
    name: "{{ item }}"
    force: yes
    state: present
  with_items:
    - kubelet=1.7.0-00
    - kubeadm=1.7.0-00
    - kubectl=1.7.0-00
    - kubernetes-cni

After this it works.

The corresponding kubernetes bug is: kubernetes/kubeadm#345

This affects Kubernetes 1.7.1 and should be resolved when a fixed version hits the repos.

Bug with kube-proxy

Hello.

All the tasks succeed, but warnings show up in dmesg, as seen in hypriot/image-builder-rpi#166.

I know it's not specific to the Ansible scripts, however. Is it worth trying with a 3.16 kernel?

[ 1334.159754] ------------[ cut here ]------------
[ 1334.164707] WARNING: CPU: 1 PID: 3942 at kernel/sched/core.c:2966 preempt_count_add+0xfc/0x118()
[ 1334.174007] DEBUG_LOCKS_WARN_ON((preempt_count() & PREEMPT_MASK) >= PREEMPT_MASK - 10)
[ 1334.182183] Modules linked in:
[ 1334.185466]  xt_comment xt_mark ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter ip_tables xt_conntrack x_tables nf_nat nf_conntrack br_netfilter bridge stp llc bnep hci_uart btbcm bluetooth dm_thin_pool dm_bio_prison dm_persistent_data dm_bufio dm_mod brcmfmac brcmutil cfg80211 rfkill snd_bcm2835 snd_pcm snd_timer snd bcm2835_gpiomem bcm2835_wdt uio_pdrv_genirq uio overlay ipv6
[ 1334.226879] CPU: 1 PID: 3942 Comm: kube-proxy Not tainted 4.4.50-hypriotos-v7+ #1
[ 1334.234710] Hardware name: BCM2709
[ 1334.238323] [<80019468>] (unwind_backtrace) from [<80014a14>] (show_stack+0x20/0x24)
[ 1334.246453] [<80014a14>] (show_stack) from [<803362dc>] (dump_stack+0xbc/0x108)
[ 1334.254147] [<803362dc>] (dump_stack) from [<8002672c>] (warn_slowpath_common+0x8c/0xc8)
[ 1334.262638] [<8002672c>] (warn_slowpath_common) from [<800267a8>] (warn_slowpath_fmt+0x40/0x48)
[ 1334.271759] [<800267a8>] (warn_slowpath_fmt) from [<8005005c>] (preempt_count_add+0xfc/0x118)
[ 1334.280698] [<8005005c>] (preempt_count_add) from [<805bf608>] (_raw_spin_lock+0x20/0x60)
[ 1334.289314] [<805bf608>] (_raw_spin_lock) from [<7f3cd45c>] (nf_conntrack_set_hashsize+0xa4/0x200 [nf_conntrack])
[ 1334.391752] [<7f3cd45c>] (nf_conntrack_set_hashsize [nf_conntrack]) from [<80043530>] (param_attr_store+0x6c/0xc4)
[ 1334.491015] [<80043530>] (param_attr_store) from [<80042864>] (module_attr_store+0x30/0x3c)
[ 1334.588651] [<80042864>] (module_attr_store) from [<801dfd1c>] (sysfs_kf_write+0x54/0x58)
[ 1334.685822] [<801dfd1c>] (sysfs_kf_write) from [<801df4f4>] (kernfs_fop_write+0xc8/0x1c8)
[ 1334.783030] [<801df4f4>] (kernfs_fop_write) from [<8016ad94>] (__vfs_write+0x34/0xe8)
[ 1334.879753] [<8016ad94>] (__vfs_write) from [<8016b658>] (vfs_write+0xa0/0x1a8)
[ 1334.976101] [<8016b658>] (vfs_write) from [<8016bf78>] (SyS_write+0x4c/0xa0)
[ 1335.027879] [<8016bf78>] (SyS_write) from [<8000fc40>] (ret_fast_syscall+0x0/0x1c)
[ 1335.124128] ---[ end trace 41b89ceb5c5e0202 ]---
[ 1335.176234] BUG: scheduling while atomic: kube-proxy/3942/0x00000401
[ 1335.226418] Modules linked in: xt_nat xt_recent ipt_REJECT nf_reject_ipv4 xt_tcpudp xt_comment xt_mark ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter ip_tables xt_conntrack x_tables nf_nat nf_conntrack br_netfilter bridge stp llc bnep hci_uart btbcm bluetooth dm_thin_pool dm_bio_prison dm_persistent_data dm_bufio dm_mod brcmfmac brcmutil cfg80211 rfkill snd_bcm2835 snd_pcm snd_timer snd bcm2835_gpiomem bcm2835_wdt uio_pdrv_genirq uio overlay ipv6
[ 1335.587102] Preemption disabled at:[<  (null)>]   (null)

[ 1335.679506] CPU: 1 PID: 3942 Comm: kube-proxy Tainted: G        W       4.4.50-hypriotos-v7+ #1
[ 1335.772295] Hardware name: BCM2709
[ 1335.816601] [<80019468>] (unwind_backtrace) from [<80014a14>] (show_stack+0x20/0x24)
[ 1335.906485] [<80014a14>] (show_stack) from [<803362dc>] (dump_stack+0xbc/0x108)
[ 1335.995536] [<803362dc>] (dump_stack) from [<8010c4d4>] (__schedule_bug+0xac/0xd0)
[ 1336.085542] [<8010c4d4>] (__schedule_bug) from [<805bbc60>] (__schedule+0x6a0/0x750)
[ 1336.176499] [<805bbc60>] (__schedule) from [<805bbf1c>] (schedule+0x58/0xb8)
[ 1336.225211] [<805bbf1c>] (schedule) from [<80014210>] (do_work_pending+0x3c/0xd4)
[ 1336.315047] [<80014210>] (do_work_pending) from [<8000fc68>] (slow_work_pending+0xc/0x20)
HypriotOS/armv7: root@master01 in ~

persistentvolume-binder disabled by default

Hi,
is there a reason why the persistentvolume-binder is disabled by default?

roles/kubernetes/templates/kubeadm.yml:

controllers: "*,-persistentvolume-binder,bootstrapsigner,tokencleaner"

In my fork I'm using GlusterFS and I had to enable the persistentvolume-binder in order to bind PersistentVolumeClaims to PersistentVolumes. So far I didn't have any issues and my pod with MySQL is bound correctly.

Thank you

Sito

v0.2.0 One or more undefined variables: 'dict object' has no attribute 'etcd'

TASK: [etcd | Download Binaries] **********************************************
fatal: [163.172.142.208 -> 127.0.0.1] => One or more undefined variables: 'dict object' has no attribute 'etcd'

also (before I copied the file to the right position):
[h32@UbuntuMATE ansible-kubernetes-openshift-pi3]$ ansible-playbook -vvvv -i hosts kubernetes.yml
ERROR: file could not read: /data/h32/ansible-kubernetes-openshift-pi3/roles/includes/install_binaries.yml

It seems to me that the state is a little broken for the kubernetes (not openshift) setup...

BTW are you doing tests on Scaleway's ARM servers? I'm currently trying to get your scripts running there...

ansible instructions fails with "No hosts matched"

To work around the problem I commented out become: yes and become_method: sudo in setup.yml.

After that I ran ansible-playbook -i setup-host setup.yml -sk instead of ansible-playbook -i setup-host setup.yml -k

typo in header

Header of repo reads

Experimental Ansible playbook for setting up Kubernetes on Rasperry Pi 3

which should be

Experimental Ansible playbook for setting up Kubernetes on Raspberry Pi 3

also, it would be good to tag the repo with some keywords for ease of locating and retrieving it - I had partially lost this one from memory until another issue was opened to bring it back to recall.

Task: "Check for an already generated token" requires local sudo without password

Hello.

I'm in the process of installing Kubernetes on my Raspberry Pi from my Linux box. On this box, my user uses sudo, but sudo is not configured to run without a password.

So, ansible's local actions fail:

TASK [kubernetes : Check for an already generated token]

fatal: [192.168.0.20 -> localhost]: FAILED! => {"changed": false, "failed": true, "module_stderr": "sudo: a  password is required\n", "module_stdout": "", "msg": "MODULE FAILURE", "rc": 1}

Enabling sudo without a password on the Ansible box makes it work.

This should be indicated in the readme (or I missed it).

Management Role includes "includes/install_k8s_resource.yml"?

Is it meant to point to includes/install_binaries.yml?

FYI - I love your repo. It's very clean & it helped proactively fix a lot of problems before they occurred (like the same machine_id on all nodes, etc...)

There are a few tasks that are no longer necessary with HypriotOS 1.5. Should I just list them for you, or would you like me to submit a PR?

DNS issues

Hey there,

I've been inspired by your work of using your ansible playbooks to provision a K8S cluster with 4 RPis. I tried to get a cluster up and running as well using your scripts. (with the example config and without wifi)

The problem is that I cannot reach other pods or external servers from within a pod. (wanted to put the gitlab runner on there)
Using nslookup kubernetes.default on hypriot/rpi-alpine:3.6 gives the following:

nslookup: can't resolve '(null)': Name does not resolve

Name:      kubernetes.default
Address 1: 10.96.0.1 kubernetes.default.svc.cluster.local

/etc/resolv.conf looks like this:

nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local routerbfb8b0.com
options ndots:5

I found out that there's a known issue with alpine up to version 3.3 but I don't use any of these old versions. I tried it with hypriot/rpi-alpine:3.6 and resin/rpi-raspbian:jessie and busybox.
https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#known-issues

I also tried an upgraded Weave (2.0.5) but that did not help either. I couldn't try with flannel since your scripts are not 100% finished there. The kube-dns logs do not show any errors.
Do you have any suggestions? I don't know where else to look.

Thank you very much!

EDIT:
I found out that internal names can be resolved. So I assume kube-dns is basically working but I cannot get external names to be resolved.

EDIT 2:
Seems like I cannot access the internet at all with the following images:

  • hypriot/rpi-alpine:3.6
  • resin/rpi-raspbian:jessie

busybox seems to be the only image which works.
I can work around this "limitation" by specifying hostNetwork: true, but this is not a solution I prefer. I see that the pod then gets the node IP and is able to go through my router. :/ Also, by using that I cannot resolve K8S-related services anymore.
Any ideas how to get around this setting?

After clean Install: Port occupied

During ansible-playbook -i hosts kubernetes.yml:

ASK [kubernetes : Run kubeadm init on master] ************************************************************************************************************************************
fatal: [192.168.0.230]: FAILED! => {"changed": true, "cmd": ["kubeadm", "init", "--config", "/etc/kubernetes/kubeadm.yml"], "delta": "0:00:06.811351", "end": "2017-10-22 15:50:01.583502", "failed": true, "rc": 2, "start": "2017-10-22 15:49:54.772151", "stderr": "[preflight] Some fatal errors occurred:\n\tPort 10250 is in use\n\tPort 10251 is in use\n\tPort 10252 is in use\n\t/etc/kubernetes/manifests is not empty\n\tPort 2379 is in use\n\t/var/lib/etcd is not empty\n[preflight] If you know what you are doing, you can skip pre-flight checks with --skip-preflight-checks", "stdout": "[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.\n[init] Using Kubernetes version: v1.8.2-beta.0\n[init] Using Authorization modes: [Node RBAC]\n[preflight] Running pre-flight checks", "stdout_lines": ["[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.", "[init] Using Kubernetes version: v1.8.2-beta.0", "[init] Using Authorization modes: [Node RBAC]", "[preflight] Running pre-flight checks"], "warnings": []}
to retry, use: --limit @/root/k8s-pi/kubernetes.retry

Fail when run Kubernetes playbook.

Hi,

Thank you for your code. I was able to run the code following the doc until I ran:

ansible-playbook -i hosts kubernetes.yml

I have got the following errors:

TASK [kubernetes : iptables | Add FORWARD ACCEPT for cni0 (in)] ****************************************************************************************************************
fatal: [192.168.7.247]: FAILED! => {"changed": true, "cmd": ["/sbin/iptables", "-A", "FORWARD", "-i", "cni0", "-j", "ACCEPT", "-m", "comment", "--comment", "CNI-Forward-Fix"], "delta": "0:00:00.009288", "end": "2019-02-21 20:58:39.514717", "msg": "non-zero return code", "rc": 1, "start": "2019-02-21 20:58:39.505429", "stderr": "iptables: No chain/target/match by that name.", "stderr_lines": ["iptables: No chain/target/match by that name."], "stdout": "", "stdout_lines": []}
to retry, use: --limit @/Volumes/Fifth/Backup/Softwares/Raspberry Pi/k8s-pi/kubernetes.retry

PLAY RECAP *********************************************************************************************************************************************************************
192.168.7.247 : ok=12 changed=1 unreachable=0 failed=1

Thank you for your help.

Not sure what's going on...

I followed the instructions, and I've definitely got docker up and running on the nodes. However, the Kubernetes install seems incomplete or something (no kubernetes master binaries?)

When I run kubectl cluster-info I get this:

The connection to the server master:8080 was refused - did you specify the right host or port?

However I can ping master just fine.

When I run:

netstat -a | grep 8080 on the master node, I get nothing back, so it seems that something was supposed to be installed, but it didn't get installed. When I look at the ansible scripts however I don't see where anything like the API listener was included.

What am I missing here?

Docker downgrade fails

Hi,

I'm trying to run these great Ansible scripts, but the Docker downgrade always fails.

RUNNING HANDLER [kubernetes : restart docker] ***************************************************************************************************************************************
fatal: [192.168.1.200]: FAILED! => {"changed": false, "failed": true, "msg": "Unable to start service docker: Job for docker.service failed. See 'systemctl status docker.service' and 'journalctl -xn' for details.\n"}

Anyone else also facing this issue ?

Getting an error running the kubernetes playbook

I get a fair way through the playbook when this error occurs:

TASK [kubernetes : Install kubelet service definition] *************************
fatal: [192.168.1.59]: FAILED! => {"changed": false, "failed": true, "msg": "AnsibleUndefinedVariable: 'dns' is undefined"}

I see dns.service_ip in the kubelet.service, but don't see where that should already have been defined.
Any assistance would be appreciated.

TASK [base : Copy SSH Key] Problem.

I am using Hypriot 11.2 image.

The ssh id_rsa.pub file is missing under the Pi account. How can I fix it?

task path: /home/g2david/k8s-pi/roles/base/tasks/user.yml:20
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: If you are using a module and expect the file to exist on the remote, see the remote_src option
fatal: [192.168.1.26]: FAILED! => {"changed": false, "msg": "Could not find or access '~/.ssh/id_rsa.pub' on the Ansible Controller.\nIf you are using a module and expect the file to exist on the remote, see the remote_src option"}

badly linked kubelet binary

This holds true for 1.2.0 and 1.2.3:

root@k8s-master:~# file /usr/bin/apt
/usr/bin/apt: ELF 32-bit LSB shared object, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-armhf.so.3, for GNU/Linux 2.6.32, BuildID[sha1]=ae8ca4dda2b5e13978f904378bd43e02759b41b6, stripped
root@k8s-master:~# file /usr/bin/kubelet
/usr/bin/kubelet: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.3, for GNU/Linux 2.6.32, BuildID[sha1]=8451bc098c7ecbda570a38b5e9884a54eb00103a, not stripped
root@k8s-master:~# file /usr/bin/kubectl
/usr/bin/kubectl: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), statically linked, not stripped

As you can see, kubelet is linked against the non-ARM loader. Not sure why this works for you. I fixed it on my system by creating a hard link from the ARM version to the searched path, but I guess that's not what you did :)
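
A minimal sketch of that workaround, assuming the usual armhf loader path on HypriotOS/Raspbian (verify both paths on your node before linking):

# make the loader path kubelet was linked against point at the armhf loader
sudo ln /lib/ld-linux-armhf.so.3 /lib/ld-linux.so.3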

Storage driver

Hello,
first of all, thank you for this repository: it works great and is straightforward.
I just have a question about the default storage driver: why is the default devicemapper? I had some bad experiences with it on other OSes, and I've seen in your code that we can override it with overlay.
Did you try overlay and have problems? To override it, I just have to put it in the config.yml, right?

Thank you

Sito

docker daemon doesn't launch after running 'base node setup' playbook

Hi,

I downloaded a fresh hypriot image, flashed my rpi 2 SD cards, downloaded the latest version of your ansible files, edited the ansible configs and executed the init machine-id command you mention in your howto. All working fine.

After executing the setup playbook and rebooting the rpis, their docker daemon does not start anymore.

Executing systemctl status docker.service generates the following error

● docker.service - Docker Application Container Engine
   Loaded: loaded (/lib/systemd/system/docker.service; enabled)
   Active: failed (Result: start-limit) since Sun 2017-06-04 10:25:13 CEST; 7min ago
     Docs: https://docs.docker.com
  Process: 969 ExecStart=/usr/bin/dockerd -H fd:// (code=exited, status=1/FAILURE)
 Main PID: 969 (code=exited, status=1/FAILURE)

Jun 04 10:25:11 n0 dockerd[969]: time="2017-06-04T10:25:11.907392501+02:00" level=info msg="libcontainerd: new containerd process, pid: 977"
Jun 04 10:25:12 n0 dockerd[969]: time="2017-06-04T10:25:12.983599830+02:00" level=warning msg="devmapper: Usage of loopback devices is strongly discouraged for production use. Please use `--storage-opt dm.thinpooldev` or use `man docker` to refer to dm.thinpooldev section."
Jun 04 10:25:13 n0 dockerd[969]: time="2017-06-04T10:25:13.106356942+02:00" level=warning msg="devmapper: Base device already exists and has filesystem ext4 on it. User specified filesystem  will be ignored."
Jun 04 10:25:13 n0 dockerd[969]: time="2017-06-04T10:25:13.221333633+02:00" level=fatal msg="Error starting daemon: error initializing graphdriver: \"/var/lib/docker\" contains several valid graphdrivers: devicemapper, overlay2; Please cleanup or explicitly choose storage driver (-s <DRIVER>)"
Jun 04 10:25:13 n0 systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE
Jun 04 10:25:13 n0 systemd[1]: Failed to start Docker Application Container Engine.
Jun 04 10:25:13 n0 systemd[1]: Unit docker.service entered failed state.
Jun 04 10:25:13 n0 systemd[1]: Starting Docker Application Container Engine...
Jun 04 10:25:13 n0 systemd[1]: docker.service start request repeated too quickly, refusing to start.
Jun 04 10:25:13 n0 systemd[1]: Failed to start Docker Application Container Engine.

Do you have any advise?

Thanks!

Two times a Destination not writeable error in kubernetes.yml

During the running of the playbook I get two simple errors:

TASK [etcd : Install etcd] *****************************************************
included: /home/harry/k8s-pi/includes/install_binaries.yml for 192.168.23.200

TASK [etcd : Download Binaries] ************************************************
fatal: [192.168.23.200 -> 127.0.0.1]: FAILED! => {"changed": false, "failed": true, "msg": "Destination /home/harry/k8s-pi/roles/etcd/files not writable"}

.... and .....

TASK [flannel : Install flanneld] **********************************************
included: /home/harry/k8s-pi/includes/install_binaries.yml for 192.168.23.200

TASK [flannel : Download Binaries] *********************************************
fatal: [192.168.23.200 -> 127.0.0.1]: FAILED! => {"changed": false, "failed": true, "msg": "Destination /home/harry/k8s-pi/roles/flannel/files not writable"}

I worked around this with:
mkdir /home/harry/k8s-pi/roles/etcd/files
mkdir /home/harry/k8s-pi/roles/flannel/files

Unbelievable, I just made a working Kubernetes cluster in two hours, including unpacking the hardware :-)

I'm going to have so much fun with this :-)

BTW, I just used the wireless router as a NAT gateway, because my W10 laptop cannot do nat :-)
And I used a Virtualbox VM with an Ubuntu VM to run Ansible.
And it worked straight away.

Error to exec playbook "kubernetes.yml"

I'm trying to build the cluster but I get the following error when I launch "ansible-playbook -i hosts kubernetes.yml":

"item": [
"kubelet=1.8.1*",
"kubeadm=1.8.1*",
"kubectl=1.8.1*",
"kubernetes-cni"
],
"msg": [
"'/usr/bin/apt-get -y -o "Dpkg::Options::=--force-confdef" -o "Dpkg::Options::=--force-confold" --force-yes install 'kubelet=1.8.1*' 'kubeadm=1.8.1*' 'kubectl=1.8.1*' 'kubernetes-cni'' failed: E: Unable to correct problems, you have held broken packages.",
""
],
"rc": "100",
"stderr": [
"E: Unable to correct problems, you have held broken packages.",
""
],
"stderr_lines": [
"E: Unable to correct problems, you have held broken packages."
],
"stdout": [
"Reading package lists...",
"Building dependency tree...",
"Reading state information...",
"Some packages could not be installed. This may mean that you have",
"requested an impossible situation or if you are using the unstable",
"distribution that some required packages have not yet been created",
"or been moved out of Incoming.",
"The following information may help to resolve the situation:",
"",
"The following packages have unmet dependencies:",
" kubelet : Depends: kubernetes-cni (= 0.5.1) but 0.6.0-00 is to be installed",
""
],

Could you kindly help me please?

Horizontal Pod Scaler cannot get resources

Hi,

I have installed Kubernetes on 4 Pis with the playbooks (works perfectly!).
Also the management playbook has been installed.
In the Kubernetes UI I can see CPU/Memory graphs, so I assume Heapster is working.

But when I want to use the horizontal pod autoscaler, I get errors that the CPU resources cannot be found:
"unable to get metrics for resource cpu: unable to fetch metrics from API: the server could not find the requested resource (get pods.metrics.k8s.io)"

I found some issues reporting the setting "--horizontal-pod-autoscaler-use-rest-clients=false", but using that value is also not working.

Do I need to install something extra?
