
caasp-salt's Introduction

Description

A collection of Salt states used to provision a kubernetes cluster in the context of CaaSP/MicroOS.

Running Tests

First of all, you have to install tox:

% zypper in python-tox

After that, from the root of the project you can run style checks:

$ tox -e flake8

And unit tests:

$ tox -e tests-salt-2018.3.0-py27

If you want to run everything simply perform:

$ tox

Salt states and CaaSP architecture

The Salt states create an architecture that depends on some pre-existing elements. This is the expected setup before running the orchestration:

The Salt master is installed in the Admin Dashboard and is responsible for orchestrating the deployment and controlling all the minions in the cluster, including a local minion that is used for generating the CA certificate.

An etcd instance is also used for bootstrapping the other etcd instances in the cluster. It will be configured from Salt with the number of etcd masters set in the pillar.

After running the orchestration, the architecture will look like this:

Some notes:

  • the etcd discovery server is not a member of the etcd cluster: it is only used as a rendezvous for bootstrapping the etcd cluster.
  • systems that need to talk to etcd connect to their local etcd instance. This local server is either a full member of the etcd cluster or an instance running in rw-proxy mode.
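
For illustration, a node running in rw-proxy mode rather than as a full member could be configured through the etcd environment file roughly like this (a sketch only; the actual file contents and discovery URL are generated by the Salt states):

```sh
# /etc/sysconfig/etcd (illustrative values, not the generated file)
ETCD_PROXY=on                   # act as a rw proxy, not a cluster member
ETCD_LISTEN_CLIENT_URLS=http://127.0.0.1:2379
ETCD_DISCOVERY=http://dashboard:2379/v2/keys/_etcd/registry/k8s
```

Local clients then keep talking to 127.0.0.1:2379 regardless of whether the node is a member or a proxy.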

License

This project is licensed under the Apache License, Version 2.0. See LICENSE for the full license text.

caasp-salt's People

Contributors

ata18, bergmannf, chentex, danielorf, dannysauer, davidcassany, drpaneas, emilianolangella, ereslibre, federicoceratto, flavio, grahamhayes, inercia, jgleissner, jimmykarily, jordimassaguerpla, jvanz, kiall, mallozup, maximilianmeister, mjura, mssola, pi-victor, rbwsam, robdaemon, slicknik, spiarh, thkukuk, vitoravelino


caasp-salt's Issues

API health check is running too fast

The health check is running too fast:

Jun 15 23:06:06 alvaro-caasp-node-1 salt-minion[1663]: [DEBUG   ] Could not LazyLoad caasp_retriable.mod_watch: 'caasp_retriable.mod_watch' is not available.
Jun 15 23:06:06 alvaro-caasp-node-1 salt-minion[1663]: [INFO    ] Running state [https://api.infra.caasp.local:6444/healthz] at time 23:06:06.471015
Jun 15 23:06:06 alvaro-caasp-node-1 salt-minion[1663]: [INFO    ] Executing state caasp_retriable.retry for https://api.infra.caasp.local:6444/healthz
Jun 15 23:06:06 alvaro-caasp-node-1 salt-minion[1663]: [DEBUG   ] Requesting URL https://api.infra.caasp.local:6444/healthz using GET method
[... the same GET request repeated dozens of times over the next three seconds ...]
Jun 15 23:06:09 alvaro-caasp-node-1 salt-minion[1663]: [INFO    ] User saltapi Executing command grains.items with jid 20180615210609237691
Jun 15 23:06:09 alvaro-caasp-node-1 salt-minion[1663]: [DEBUG   ] Command details {'tgt_type': 'glob', 'jid': '20180615210609237691', 'tgt': '78ffa47c530b4f2cb59c6c6aa1d3c816', 'ret': '', 'user': 'saltapi', 'arg': [], 'fun': 'grains.items'}
Jun 15 23:06:09 alvaro-caasp-node-1 salt-minion[1663]: [DEBUG   ] Requesting URL https://api.infra.caasp.local:6444/healthz using GET method
Jun 15 23:06:09 alvaro-caasp-node-1 salt-minion[1663]: [INFO    ] Starting a new job with PID 3530
Jun 15 23:06:09 alvaro-caasp-node-1 salt-minion[1663]: [DEBUG   ] Requesting URL https://api.infra.caasp.local:6444/healthz using GET method
Jun 15 23:06:09 alvaro-caasp-node-1 salt-minion[1663]: [DEBUG   ] LazyLoaded grains.items
Jun 15 23:06:09 alvaro-caasp-node-1 salt-minion[1663]: [DEBUG   ] LazyLoaded direct_call.get
Jun 15 23:06:09 alvaro-caasp-node-1 salt-minion[1663]: [DEBUG   ] Minion return retry timer set to 10 seconds (randomized)
Jun 15 23:06:09 alvaro-caasp-node-1 salt-minion[1663]: [INFO    ] Returning information for job: 20180615210609237691
Jun 15 23:06:09 alvaro-caasp-node-1 salt-minion[1663]: [DEBUG   ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', '78ffa47c530b4f2cb59c6c6aa1d3c816', 'tcp://192.168.113.248:4506', 'aes')
Jun 15 23:06:09 alvaro-caasp-node-1 salt-minion[1663]: [DEBUG   ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', '78ffa47c530b4f2cb59c6c6aa1d3c816', 'tcp://192.168.113.248:4506')
Jun 15 23:06:09 alvaro-caasp-node-1 salt-minion[1663]: [DEBUG   ] Requesting URL https://api.infra.caasp.local:6444/healthz using GET method
[... identical requests continue at the same rate ...]

This happens in a 3.0 master.

When using CaaSP as an image, a non-existing repository gets added

It's a mix of the openSUSE and the CaaSP versions:

minion0:~ # zypper lr -u
Repository priorities are without effect. All enabled repositories share the same priority.

# | Alias      | Name       | Enabled | GPG Check | Refresh | URI                                                                                    
--+------------+------------+---------+-----------+---------+----------------------------------------------------------------------------------------
1 | containers | containers | Yes     | (  ) No   | Yes     | http://download.opensuse.org/repositories/Virtualization:/containers/openSUSE_Leap_1.0/

Therefore, the comment at https://github.com/kubic-project/salt/blob/master/salt/repositories/containers.repo#L8 is not really true.

[Nice-To-Have] integrate salt efforts upstream

It would be nice to integrate some of our work upstream, if possible, for deployment on openSUSE. That way people become aware that openSUSE is also supported for Salt deployments of Kubernetes, not just Red Hat distros and Debian.

duplicate etcd config.

I think this is why the node names get messed up on bootstrap. The default install probably makes --name default to the hostname. Then we generate the env var file and restart the service, which makes it pick up the right settings (partially false: the name should reflect the changes). This is also why we get that member directory that causes issues.
I'd suggest using a custom systemd unit file to avoid this in the future, or making sure the service isn't started until the env file has been generated.
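
The custom-unit suggestion could be done as a systemd drop-in that clears the packaged ExecStart (and its flags, which shadow the environment variables) and refuses to start before the env file exists. This is a sketch, not the shipped unit; the drop-in path and the env file path are assumptions:

```ini
# /etc/systemd/system/etcd.service.d/caasp.conf (hypothetical drop-in)
[Unit]
# do not start before Salt has written the environment file
ConditionPathExists=/etc/sysconfig/etcd

[Service]
# empty ExecStart= clears the packaged command line, then restate it without flags
ExecStart=
ExecStart=/usr/sbin/etcd
```

With no command-line flags, etcd reads all of its settings from the ETCD_* environment variables, as the logs below show.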

With:

ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /usr/sbin/etcd --name=\"${ETCD_NAME}\" --data-dir=\"${ETCD_DATA_DIR}\" --listen-client-urls=\"${ETCD_LISTEN_CLIENT_URLS}\""

From the etcd start log

...
Mar 14 12:57:21 minion0 etcd[9369]: recognized environment variable ETCD_NAME, but unused: shadowed by corresponding flag 
Mar 14 12:57:21 minion0 etcd[9369]: recognized environment variable ETCD_DATA_DIR, but unused: shadowed by corresponding flag 
Mar 14 12:57:21 minion0 etcd[9369]: recognized environment variable ETCD_LISTEN_CLIENT_URLS, but unused: shadowed by corresponding flag 

After removing the flags from the systemd service:

ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /usr/sbin/etcd"
Mar 14 12:58:20 minion0 etcd[10075]: recognized and used environment variable ETCD_ADVERTISE_CLIENT_URLS=http://minion0.k8s.local:2379,http://10.17.3.103:2379
Mar 14 12:58:20 minion0 etcd[10075]: recognized and used environment variable ETCD_CA_FILE=/etc/pki/trust/anchors/SUSE_CaaSP_CA.crt
Mar 14 12:58:20 minion0 etcd[10075]: recognized and used environment variable ETCD_CERT_FILE=/etc/pki/minion.crt
Mar 14 12:58:20 minion0 etcd[10075]: recognized and used environment variable ETCD_DATA_DIR=/var/lib/etcd/
Mar 14 12:58:20 minion0 etcd[10075]: recognized and used environment variable ETCD_DISCOVERY=http://dashboard:2379/v2/keys/_etcd/registry/k8s
Mar 14 12:58:20 minion0 etcd[10075]: recognized and used environment variable ETCD_DISCOVERY_FALLBACK=proxy
Mar 14 12:58:20 minion0 etcd[10075]: recognized and used environment variable ETCD_INITIAL_ADVERTISE_PEER_URLS=https://minion0.k8s.local:2380,https://10.17.3.103:2380
Mar 14 12:58:20 minion0 etcd[10075]: recognized and used environment variable ETCD_INITIAL_CLUSTER_STATE=new
Mar 14 12:58:20 minion0 etcd[10075]: recognized and used environment variable ETCD_INITIAL_CLUSTER_TOKEN=k8s
Mar 14 12:58:20 minion0 etcd[10075]: recognized and used environment variable ETCD_KEY_FILE=/etc/pki/minion.key
Mar 14 12:58:20 minion0 etcd[10075]: recognized and used environment variable ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379
Mar 14 12:58:20 minion0 etcd[10075]: recognized and used environment variable ETCD_LISTEN_PEER_URLS=https://0.0.0.0:2380
Mar 14 12:58:20 minion0 etcd[10075]: recognized and used environment variable ETCD_NAME=minion0.k8s.local
Mar 14 12:58:20 minion0 etcd[10075]: recognized and used environment variable ETCD_PEER_CA_FILE=/etc/pki/trust/anchors/SUSE_CaaSP_CA.crt
Mar 14 12:58:20 minion0 etcd[10075]: recognized and used environment variable ETCD_PEER_CERT_FILE=/etc/pki/minion.crt
Mar 14 12:58:20 minion0 etcd[10075]: recognized and used environment variable ETCD_PEER_KEY_FILE=/etc/pki/minion.key
Mar 14 12:58:20 minion0 etcd[10075]: recognized and used environment variable ETCD_PEER_TRUSTED_CA_FILE=/etc/pki/trust/anchors/SUSE_CaaSP_CA.crt
Mar 14 12:58:20 minion0 etcd[10075]: recognized and used environment variable ETCD_TRUSTED_CA_FILE=/etc/pki/trust/anchors/SUSE_CaaSP_CA.crt

cc: @inercia @ereslibre

flannel: use the right backend

Right now flannel always uses the UDP backend. That means an overlay plus a custom encapsulation protocol done in userspace. We should change this approach to:

  • libvirtd: use host-gw (plain routing): this makes debugging easier and gives better performance
  • openstack: use vxlan, because it's the fastest overlay driver (encapsulation/decapsulation is done in kernel space, not in userspace as with UDP)

It would be better to make this configurable through a pillar.
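
Making the backend configurable through a pillar could look roughly like this (the key names here are hypothetical, not an existing pillar schema):

```yaml
# pillar sketch -- hypothetical key names
flannel:
  backend:
    type: host-gw    # 'vxlan' on openstack, 'udp' only as a fallback
```

The flannel configuration template would then read the backend type from this pillar instead of hardcoding UDP.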

salt orchestration fails

When I run the salt orchestration to bootstrap a cluster, I get the following error:

      ID: etcd-setup
Function: salt.state
  Result: False
 Comment: Run failed on minions: caasp-node-afc4cf66, caasp-node-75b6cb18, caasp-node-f31e8f4a
          Failures:
              caasp-node-afc4cf66:
                  Data failed to compile:
              ----------
                  Rendering SLS 'base:ca-cert' failed: Jinja variable list object has no element 0
              ----------
                  Rendering SLS 'base:cert' failed: Jinja variable list object has no element 0
              /var/cache/salt/minion/files/base/_macros/certs.jinja(68):
              ---
              [...]
                    - file: /etc/pki
              
              {{ crt }}:
                caasp_retriable.retry:
                  - target: x509.certificate_managed
                  - ca_server: {{ salt['mine.get']('roles:ca', 'ca.crt', expr_form='grain').keys()[0] }} <======================
                  - signing_policy: minion
                  - public_key: {{ key }}
                {%- if cn %}
                  - CN: {{ cn|yaml_dquote }}
                {%- else %}
              [...]
              ---

It appears the mine.get call does not return the certificate as intended. And in fact, if I call salt-call mine.get 'roles:ca' ca.crt grain I get nothing back. Using 'ca' instead of 'roles:ca' works. Running salt ca mine.get 'roles:ca' ca.crt grain on the salt master also returns the certificate.

Any idea?
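
Independently of why the mine is empty, the template could guard the lookup so the failure mode is explicit rather than "list object has no element 0". A sketch against the certs.jinja excerpt above (using the same mine.get call; the guard itself is the only addition):

```jinja
{#- sketch: guard the mine lookup so an empty mine gives a clear error #}
{%- set ca_servers = salt['mine.get']('roles:ca', 'ca.crt', expr_form='grain').keys()|list %}
{%- if ca_servers %}
  - ca_server: {{ ca_servers|first }}
{%- else %}
  {#- mine is empty: handle the case explicitly here instead of indexing [0] #}
{%- endif %}
```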

bogus parameter values in scheduler start command

On an openSUSE deployment, the Jinja template that is supposed to pass parameters to the scheduler isn't being parsed. I'm assuming it's because the value is empty.

kube      3470     1  0 13:27 ?        00:00:02 /usr/bin/hyperkube scheduler --logtostderr=true --v=2 --master=http://127.0.0.1:8080 --v=2 {{ pillar[components][scheduler][args]}}

/etc/kubernetes/scheduler

###
# kubernetes scheduler config

# default config should be adequate

# Add your own!
KUBE_SCHEDULER_ARGS="--v=2 {{ pillar['components']['scheduler']['args']}}"
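
A more defensive version of that line would use pillar.get with a default, so an empty or missing key renders as an empty string instead of leaving the raw template behind (a sketch, assuming the same pillar layout):

```jinja
KUBE_SCHEDULER_ARGS="--v=2 {{ salt['pillar.get']('components:scheduler:args', '') }}"
```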

Pillar override does not allow for unitary lists

We are using the external pillar override backed by MySQL, where lists are expressed as key collisions. This has an important drawback in our case: with the current implementation we lose the ability to represent unitary (single-element) lists.

https://github.com/kubic-project/salt/blob/bace710019cc5bd7f7fcc175597c8057e15142d5/config/master.d/returner.conf#L54-L76

Salt will read a single row as a scalar value, because a collision needs at least two values for the same key. In this situation we cannot represent unitary lists.

We should craft the queries in a way that makes representing them possible.
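
The ambiguity can be shown in a few lines. This is an illustration of the collision behaviour described above, not Salt's actual returner code:

```python
# Illustration only: merge rows so that a repeated key becomes a list,
# mirroring the key-collision behaviour described above.
def merge_rows(rows):
    result = {}
    for key, value in rows:
        if key in result:
            prev = result[key]
            # second occurrence of a key: promote the scalar to a list
            result[key] = (prev if isinstance(prev, list) else [prev]) + [value]
        else:
            result[key] = value
    return result

# two rows with the same key collide into a list
print(merge_rows([("args", "--v=2"), ("args", "--foo")]))  # {'args': ['--v=2', '--foo']}
# a single row stays a scalar: there is no way to express the list ['--v=2']
print(merge_rows([("args", "--v=2")]))                     # {'args': '--v=2'}
```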

etcd traffic broken because of invalid certificates

etcd is currently broken in our setup because of this:

...
May 08 16:11:21 minion0 etcd[2489]: recognized and used environment variable ETCD_ADVERTISE_CLIENT_URLS=http://minion0.k8s.local:2379,http://10.17.15.251:2379
May 08 16:11:21 minion0 etcd[2489]: recognized and used environment variable ETCD_CA_FILE=/etc/pki/trust/anchors/SUSE_CaaSP_CA.crt
May 08 16:11:21 minion0 etcd[2489]: recognized and used environment variable ETCD_CERT_FILE=/etc/pki/minion.crt
May 08 16:11:21 minion0 etcd[2489]: recognized and used environment variable ETCD_DISCOVERY=http://dashboard:2379/v2/keys/_etcd/registry/k8s
May 08 16:11:21 minion0 etcd[2489]: recognized and used environment variable ETCD_DISCOVERY_FALLBACK=proxy
May 08 16:11:21 minion0 etcd[2489]: recognized and used environment variable ETCD_INITIAL_ADVERTISE_PEER_URLS=https://minion0.k8s.local:2380,https://10.17.15.251:2380
May 08 16:11:21 minion0 etcd[2489]: recognized and used environment variable ETCD_INITIAL_CLUSTER_STATE=new
May 08 16:11:21 minion0 etcd[2489]: recognized and used environment variable ETCD_INITIAL_CLUSTER_TOKEN=k8s
May 08 16:11:21 minion0 etcd[2489]: recognized and used environment variable ETCD_KEY_FILE=/etc/pki/minion.key
May 08 16:11:21 minion0 etcd[2489]: recognized and used environment variable ETCD_LISTEN_PEER_URLS=https://0.0.0.0:2380
May 08 16:11:21 minion0 etcd[2489]: recognized and used environment variable ETCD_PEER_CA_FILE=/etc/pki/trust/anchors/SUSE_CaaSP_CA.crt
May 08 16:11:21 minion0 etcd[2489]: recognized and used environment variable ETCD_PEER_CERT_FILE=/etc/pki/minion.crt
May 08 16:11:21 minion0 etcd[2489]: recognized and used environment variable ETCD_PEER_KEY_FILE=/etc/pki/minion.key
May 08 16:11:21 minion0 etcd[2489]: recognized and used environment variable ETCD_PEER_TRUSTED_CA_FILE=/etc/pki/trust/anchors/SUSE_CaaSP_CA.crt
May 08 16:11:21 minion0 etcd[2489]: recognized and used environment variable ETCD_TRUSTED_CA_FILE=/etc/pki/trust/anchors/SUSE_CaaSP_CA.crt
May 08 16:11:21 minion0 etcd[2489]: recognized environment variable ETCD_NAME, but unused: shadowed by corresponding flag 
May 08 16:11:21 minion0 etcd[2489]: recognized environment variable ETCD_DATA_DIR, but unused: shadowed by corresponding flag 
May 08 16:11:21 minion0 etcd[2489]: recognized environment variable ETCD_LISTEN_CLIENT_URLS, but unused: shadowed by corresponding flag 
May 08 16:11:21 minion0 etcd[2489]: etcd Version: 3.1.0
May 08 16:11:21 minion0 etcd[2489]: Git SHA: Not provided (use ./build instead of go build)
May 08 16:11:21 minion0 etcd[2489]: Go Version: go1.7.5
May 08 16:11:21 minion0 etcd[2489]: Go OS/Arch: linux/amd64
May 08 16:11:21 minion0 etcd[2489]: setting maximum number of CPUs to 1, total number of available CPUs is 1
May 08 16:11:21 minion0 etcd[2489]: peerTLS: cert = /etc/pki/minion.crt, key = /etc/pki/minion.key, ca = /etc/pki/trust/anchors/SUSE_CaaSP_CA.crt, trusted-ca = /etc/pki/trust/anchors/SUSE_CaaSP_CA.crt, client-ce
May 08 16:11:21 minion0 etcd[2489]: listening for peers on https://0.0.0.0:2380
May 08 16:11:21 minion0 etcd[2489]: The scheme of client url http://0.0.0.0:2379 is HTTP while peer key/cert files are presented. Ignored key/cert files.
May 08 16:11:21 minion0 etcd[2489]: listening for client requests on 0.0.0.0:2379
May 08 16:11:21 minion0 etcd[2489]: warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
May 08 16:11:21 minion0 etcd[2489]: resolving minion0.k8s.local:2380 to 10.17.15.251:2380
May 08 16:11:21 minion0 etcd[2489]: resolving minion0.k8s.local:2380 to 10.17.15.251:2380
May 08 16:11:22 minion0 etcd[2489]: found self 448c9f97433d8eb2 in the cluster
May 08 16:11:22 minion0 etcd[2489]: found 1 peer(s), waiting for 2 more
May 08 16:11:22 minion0 etcd[2489]: found peer a86932d3a18a66f1 in the cluster
May 08 16:11:22 minion0 etcd[2489]: found 2 peer(s), waiting for 1 more
May 08 16:11:22 minion0 etcd[2489]: found peer f84a2d545e336936 in the cluster
May 08 16:11:22 minion0 etcd[2489]: found 3 needed peer(s)
May 08 16:11:22 minion0 etcd[2489]: name = 50c22c86e1bf4789818233573f703b4b
May 08 16:11:22 minion0 etcd[2489]: data dir = /var/lib/etcd/
May 08 16:11:22 minion0 etcd[2489]: member dir = /var/lib/etcd/member
May 08 16:11:22 minion0 etcd[2489]: heartbeat = 100ms
May 08 16:11:22 minion0 etcd[2489]: election = 1000ms
May 08 16:11:22 minion0 etcd[2489]: snapshot count = 10000
May 08 16:11:22 minion0 etcd[2489]: discovery URL= http://dashboard:2379/v2/keys/_etcd/registry/k8s
May 08 16:11:22 minion0 etcd[2489]: advertise client URLs = http://10.17.15.251:2379,http://minion0.k8s.local:2379
May 08 16:11:22 minion0 etcd[2489]: initial advertise peer URLs = https://10.17.15.251:2380,https://minion0.k8s.local:2380
May 08 16:11:22 minion0 etcd[2489]: initial cluster = 50c22c86e1bf4789818233573f703b4b=https://10.17.15.251:2380,50c22c86e1bf4789818233573f703b4b=https://minion0.k8s.local:2380
May 08 16:11:22 minion0 etcd[2489]: starting member 448c9f97433d8eb2 in cluster ad73cfff5d6c66f4
May 08 16:11:22 minion0 etcd[2489]: 448c9f97433d8eb2 became follower at term 0
May 08 16:11:22 minion0 etcd[2489]: newRaft 448c9f97433d8eb2 [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
May 08 16:11:22 minion0 etcd[2489]: 448c9f97433d8eb2 became follower at term 1
May 08 16:11:22 minion0 etcd[2489]: warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
May 08 16:11:22 minion0 etcd[2489]: warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
May 08 16:11:22 minion0 etcd[2489]: starting peer a86932d3a18a66f1...
May 08 16:11:22 minion0 etcd[2489]: started HTTP pipelining with peer a86932d3a18a66f1
May 08 16:11:22 minion0 etcd[2489]: started streaming with peer a86932d3a18a66f1 (writer)
May 08 16:11:22 minion0 etcd[2489]: started streaming with peer a86932d3a18a66f1 (writer)
May 08 16:11:22 minion0 etcd[2489]: started peer a86932d3a18a66f1
May 08 16:11:22 minion0 etcd[2489]: added peer a86932d3a18a66f1
May 08 16:11:22 minion0 etcd[2489]: starting peer f84a2d545e336936...
May 08 16:11:22 minion0 etcd[2489]: started HTTP pipelining with peer f84a2d545e336936
May 08 16:11:22 minion0 etcd[2489]: started streaming with peer a86932d3a18a66f1 (stream MsgApp v2 reader)
May 08 16:11:22 minion0 etcd[2489]: started streaming with peer a86932d3a18a66f1 (stream Message reader)
May 08 16:11:22 minion0 etcd[2489]: started streaming with peer f84a2d545e336936 (writer)
May 08 16:11:22 minion0 etcd[2489]: started peer f84a2d545e336936
May 08 16:11:22 minion0 etcd[2489]: added peer f84a2d545e336936
May 08 16:11:22 minion0 etcd[2489]: starting server... [version: 3.1.0, cluster version: to_be_decided]
May 08 16:11:22 minion0 etcd[2489]: ClientTLS: cert = /etc/pki/minion.crt, key = /etc/pki/minion.key, ca = /etc/pki/trust/anchors/SUSE_CaaSP_CA.crt, trusted-ca = /etc/pki/trust/anchors/SUSE_CaaSP_CA.crt, client-
May 08 16:11:22 minion0 etcd[2489]: started streaming with peer f84a2d545e336936 (writer)
May 08 16:11:22 minion0 etcd[2489]: started streaming with peer f84a2d545e336936 (stream MsgApp v2 reader)
May 08 16:11:22 minion0 etcd[2489]: started streaming with peer f84a2d545e336936 (stream Message reader)
May 08 16:11:22 minion0 etcd[2489]: added member 448c9f97433d8eb2 [https://10.17.15.251:2380 https://minion0.k8s.local:2380] to cluster ad73cfff5d6c66f4
May 08 16:11:22 minion0 etcd[2489]: added member a86932d3a18a66f1 [https://10.17.15.120:2380 https://minion1.k8s.local:2380] to cluster ad73cfff5d6c66f4
May 08 16:11:22 minion0 etcd[2489]: added member f84a2d545e336936 [https://10.17.15.154:2380 https://master.k8s.local:2380] to cluster ad73cfff5d6c66f4
May 08 16:11:22 minion0 etcd[2489]: peer a86932d3a18a66f1 became active
May 08 16:11:22 minion0 etcd[2489]: established a TCP streaming connection with peer a86932d3a18a66f1 (stream MsgApp v2 writer)
May 08 16:11:22 minion0 etcd[2489]: established a TCP streaming connection with peer a86932d3a18a66f1 (stream Message writer)
May 08 16:11:23 minion0 etcd[2489]: established a TCP streaming connection with peer a86932d3a18a66f1 (stream Message reader)
May 08 16:11:23 minion0 etcd[2489]: established a TCP streaming connection with peer a86932d3a18a66f1 (stream MsgApp v2 reader)
May 08 16:11:23 minion0 etcd[2489]: 448c9f97433d8eb2 is starting a new election at term 1
May 08 16:11:23 minion0 etcd[2489]: 448c9f97433d8eb2 became candidate at term 2
May 08 16:11:23 minion0 etcd[2489]: 448c9f97433d8eb2 received MsgVoteResp from 448c9f97433d8eb2 at term 2
May 08 16:11:23 minion0 etcd[2489]: 448c9f97433d8eb2 [logterm: 1, index: 3] sent MsgVote request to a86932d3a18a66f1 at term 2
May 08 16:11:23 minion0 etcd[2489]: 448c9f97433d8eb2 [logterm: 1, index: 3] sent MsgVote request to f84a2d545e336936 at term 2
May 08 16:11:23 minion0 etcd[2489]: 448c9f97433d8eb2 received MsgVoteResp rejection from a86932d3a18a66f1 at term 2
May 08 16:11:23 minion0 etcd[2489]: 448c9f97433d8eb2 [quorum:2] has received 1 MsgVoteResp votes and 1 vote rejections
May 08 16:11:23 minion0 etcd[2489]: peer f84a2d545e336936 became active
May 08 16:11:23 minion0 etcd[2489]: established a TCP streaming connection with peer f84a2d545e336936 (stream Message reader)
May 08 16:11:23 minion0 etcd[2489]: established a TCP streaming connection with peer f84a2d545e336936 (stream MsgApp v2 writer)
May 08 16:11:23 minion0 etcd[2489]: established a TCP streaming connection with peer f84a2d545e336936 (stream Message writer)
May 08 16:11:23 minion0 etcd[2489]: 448c9f97433d8eb2 [logterm: 1, index: 3, vote: 448c9f97433d8eb2] rejected MsgVote from f84a2d545e336936 [logterm: 1, index: 3] at term 2
May 08 16:11:23 minion0 etcd[2489]: 448c9f97433d8eb2 received MsgVoteResp rejection from f84a2d545e336936 at term 2
May 08 16:11:23 minion0 etcd[2489]: 448c9f97433d8eb2 [quorum:2] has received 1 MsgVoteResp votes and 2 vote rejections
May 08 16:11:23 minion0 etcd[2489]: 448c9f97433d8eb2 became follower at term 2
May 08 16:11:23 minion0 etcd[2489]: established a TCP streaming connection with peer f84a2d545e336936 (stream MsgApp v2 reader)
May 08 16:11:23 minion0 etcd[2489]: 448c9f97433d8eb2 [term: 2] received a MsgVote message with higher term from a86932d3a18a66f1 [term: 3]
May 08 16:11:23 minion0 etcd[2489]: 448c9f97433d8eb2 became follower at term 3
May 08 16:11:23 minion0 etcd[2489]: 448c9f97433d8eb2 [logterm: 1, index: 3, vote: 0] cast MsgVote for a86932d3a18a66f1 [logterm: 1, index: 3] at term 3
May 08 16:11:23 minion0 etcd[2489]: raft.node: 448c9f97433d8eb2 elected leader a86932d3a18a66f1 at term 3
May 08 16:11:23 minion0 etcd[2489]: published {Name:50c22c86e1bf4789818233573f703b4b ClientURLs:[http://10.17.15.251:2379 http://minion0.k8s.local:2379]} to cluster ad73cfff5d6c66f4
May 08 16:11:23 minion0 systemd[1]: Started Etcd Server.
May 08 16:11:23 minion0 etcd[2489]: ready to serve client requests
May 08 16:11:23 minion0 etcd[2489]: serving insecure client requests on [::]:2379, this is strongly discouraged!
May 08 16:11:23 minion0 systemd[1]: Started /usr/bin/systemctl enable etcd.service.
May 08 16:11:23 minion0 systemd[1]: Reloading.
May 08 16:11:24 minion0 etcd[2489]: set the initial cluster version to 3.1
May 08 16:11:24 minion0 etcd[2489]: enabled capabilities for version 3.1
May 08 16:11:27 minion0 etcd[2489]: health check for peer a86932d3a18a66f1 could not connect: x509: cannot validate certificate for 10.17.15.120 because it doesn't contain any IP SANs
May 08 16:11:27 minion0 etcd[2489]: health check for peer f84a2d545e336936 could not connect: x509: cannot validate certificate for 10.17.15.154 because it doesn't contain any IP SANs
May 08 16:11:27 minion0 etcd[2489]: lost the TCP streaming connection with peer f84a2d545e336936 (stream Message reader)
May 08 16:11:27 minion0 etcd[2489]: lost the TCP streaming connection with peer f84a2d545e336936 (stream MsgApp v2 reader)
May 08 16:11:27 minion0 etcd[2489]: failed to dial f84a2d545e336936 on stream Message (read tcp 10.17.15.251:40590->10.17.15.154:2380: read: connection reset by peer)
May 08 16:11:27 minion0 etcd[2489]: peer f84a2d545e336936 became inactive
May 08 16:11:32 minion0 etcd[2489]: health check for peer f84a2d545e336936 could not connect: dial tcp 10.17.15.154:2380: getsockopt: connection refused
May 08 16:11:34 minion0 etcd[2489]: lost the TCP streaming connection with peer f84a2d545e336936 (stream MsgApp v2 writer)
May 08 16:11:34 minion0 etcd[2489]: lost the TCP streaming connection with peer f84a2d545e336936 (stream Message writer)
May 08 16:11:37 minion0 etcd[2489]: health check for peer f84a2d545e336936 could not connect: dial tcp 10.17.15.154:2380: getsockopt: connection refused
May 08 16:11:42 minion0 etcd[2489]: health check for peer f84a2d545e336936 could not connect: dial tcp 10.17.15.154:2380: getsockopt: connection refused
May 08 16:11:47 minion0 etcd[2489]: health check for peer f84a2d545e336936 could not connect: dial tcp 10.17.15.154:2380: getsockopt: connection refused
May 08 16:11:52 minion0 etcd[2489]: health check for peer f84a2d545e336936 could not connect: dial tcp 10.17.15.154:2380: getsockopt: connection refused
May 08 16:11:57 minion0 etcd[2489]: health check for peer f84a2d545e336936 could not connect: dial tcp 10.17.15.154:2380: getsockopt: connection refused
May 08 16:12:02 minion0 etcd[2489]: health check for peer f84a2d545e336936 could not connect: dial tcp 10.17.15.154:2380: getsockopt: connection refused
May 08 16:12:05 minion0 dbus[1071]: [system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service'
May 08 16:12:05 minion0 systemd[1]: Starting Hostname Service...
May 08 16:12:06 minion0 dbus[1071]: [system] Successfully activated service 'org.freedesktop.hostname1'
May 08 16:12:06 50c22c86e1bf4789818233573f703b4b.infra.caasp.local systemd-hostnamed[2600]: Changed host name to '50c22c86e1bf4789818233573f703b4b.infra.caasp.local'
May 08 16:12:06 50c22c86e1bf4789818233573f703b4b.infra.caasp.local systemd[1]: Started Hostname Service.
May 08 16:12:07 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2489]: health check for peer f84a2d545e336936 could not connect: dial tcp 10.17.15.154:2380: getsockopt: connection refused
May 08 16:12:09 50c22c86e1bf4789818233573f703b4b.infra.caasp.local systemd[1]: Reloading.
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local systemd[1]: Started /usr/bin/systemctl restart etcd.service.
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2489]: received terminated signal, shutting down...
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2489]: skipped leadership transfer for stopping non-leader member
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2489]: stopping peer a86932d3a18a66f1...
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2489]: closed the TCP streaming connection with peer a86932d3a18a66f1 (stream MsgApp v2 writer)
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2489]: stopped streaming with peer a86932d3a18a66f1 (writer)
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2489]: closed the TCP streaming connection with peer a86932d3a18a66f1 (stream Message writer)
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2489]: stopped streaming with peer a86932d3a18a66f1 (writer)
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2489]: stopped HTTP pipelining with peer a86932d3a18a66f1
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2489]: lost the TCP streaming connection with peer a86932d3a18a66f1 (stream MsgApp v2 reader)
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2489]: stopped streaming with peer a86932d3a18a66f1 (stream MsgApp v2 reader)
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2489]: lost the TCP streaming connection with peer a86932d3a18a66f1 (stream Message reader)
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2489]: failed to read a86932d3a18a66f1 on stream Message (net/http: request canceled)
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2489]: peer a86932d3a18a66f1 became inactive
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2489]: stopped streaming with peer a86932d3a18a66f1 (stream Message reader)
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2489]: stopped peer a86932d3a18a66f1
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2489]: stopping peer f84a2d545e336936...
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2489]: stopped streaming with peer f84a2d545e336936 (writer)
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2489]: stopped streaming with peer f84a2d545e336936 (writer)
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2489]: stopped HTTP pipelining with peer f84a2d545e336936
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2489]: stopped streaming with peer f84a2d545e336936 (stream MsgApp v2 reader)
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2489]: stopped streaming with peer f84a2d545e336936 (stream Message reader)
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2489]: stopped peer f84a2d545e336936
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local systemd[1]: Stopping Etcd Server...
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local systemd[1]: Stopped Etcd Server.
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local systemd[1]: Starting Etcd Server...
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: recognized and used environment variable ETCD_ADVERTISE_CLIENT_URLS=http://minion0.k8s.local:2379,http://10.17.15.251:2379
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: recognized and used environment variable ETCD_CA_FILE=/etc/pki/trust/anchors/SUSE_CaaSP_CA.crt
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: recognized and used environment variable ETCD_CERT_FILE=/etc/pki/minion.crt
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: recognized and used environment variable ETCD_DISCOVERY=http://dashboard.k8s.local:2379/v2/keys/_etcd/registry/k8s
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: recognized and used environment variable ETCD_DISCOVERY_FALLBACK=proxy
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: recognized and used environment variable ETCD_INITIAL_ADVERTISE_PEER_URLS=https://minion0.k8s.local:2380,https://10.17.15.251:2380
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: recognized and used environment variable ETCD_INITIAL_CLUSTER_STATE=new
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: recognized and used environment variable ETCD_INITIAL_CLUSTER_TOKEN=k8s
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: recognized and used environment variable ETCD_KEY_FILE=/etc/pki/minion.key
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: recognized and used environment variable ETCD_LISTEN_PEER_URLS=https://0.0.0.0:2380
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: recognized and used environment variable ETCD_PEER_CA_FILE=/etc/pki/trust/anchors/SUSE_CaaSP_CA.crt
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: recognized and used environment variable ETCD_PEER_CERT_FILE=/etc/pki/minion.crt
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: recognized and used environment variable ETCD_PEER_KEY_FILE=/etc/pki/minion.key
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: recognized and used environment variable ETCD_PEER_TRUSTED_CA_FILE=/etc/pki/trust/anchors/SUSE_CaaSP_CA.crt
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: recognized and used environment variable ETCD_TRUSTED_CA_FILE=/etc/pki/trust/anchors/SUSE_CaaSP_CA.crt
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: recognized environment variable ETCD_NAME, but unused: shadowed by corresponding flag 
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: recognized environment variable ETCD_DATA_DIR, but unused: shadowed by corresponding flag 
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: recognized environment variable ETCD_LISTEN_CLIENT_URLS, but unused: shadowed by corresponding flag 
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: etcd Version: 3.1.0
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: Git SHA: Not provided (use ./build instead of go build)
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: Go Version: go1.7.5
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: Go OS/Arch: linux/amd64
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: setting maximum number of CPUs to 1, total number of available CPUs is 1
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: peerTLS: cert = /etc/pki/minion.crt, key = /etc/pki/minion.key, ca = /etc/pki/trust/anchors/SUSE_CaaSP_CA.crt, trusted-ca = /etc/pki
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: listening for peers on https://0.0.0.0:2380
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: The scheme of client url http://0.0.0.0:2379 is HTTP while peer key/cert files are presented. Ignored key/cert files.
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: listening for client requests on 0.0.0.0:2379
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: resolving minion0.k8s.local:2380 to 10.17.15.251:2380
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: resolving minion0.k8s.local:2380 to 10.17.15.251:2380
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: discovery cluster full, falling back to proxy
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: proxy: this proxy supports v2 API only!
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: proxy: using peer urls [https://10.17.15.120:2380 https://10.17.15.154:2380 https://10.17.15.251:2380 https://master.k8s.local:2380 
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: could not get cluster response from https://10.17.15.120:2380: Get https://10.17.15.120:2380/members: x509: cannot validate certific
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: could not get cluster response from https://10.17.15.154:2380: Get https://10.17.15.154:2380/members: dial tcp 10.17.15.154:2380: ge
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: could not get cluster response from https://10.17.15.251:2380: Get https://10.17.15.251:2380/members: dial tcp 10.17.15.251:2380: ge
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: could not get cluster response from https://master.k8s.local:2380: Get https://master.k8s.local:2380/members: dial tcp 10.17.15.154:
May 08 16:12:10 50c22c86e1bf4789818233573f703b4b.infra.caasp.local etcd[2649]: could not get cluster response from https://minion0.k8s.local:2380: Get https://minion0.k8s.local:2380/members: dial tcp 10.17.15.25
...

This is probably related to the fact that we change the hostname too late. Maybe we should add a step at the very beginning of the orchestration that sets the hostname first (before certificate creation).
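
A minimal sketch of such an early orchestration stage, assuming a hypothetical `hostname` state and illustrative targeting (the state and file names below are made up, not the actual ones in this repo):

```yaml
# Hypothetical first stage of the orchestration: fix the hostname on all
# minions before the CA and the node certificates are generated.
set-hostname-everywhere:
  salt.state:
    - tgt: '*'
    - sls:
      - hostname          # hypothetical state that sets the final hostname

generate-ca:
  salt.state:
    - tgt: 'roles:ca'
    - tgt_type: grain
    - sls:
      - ca
    - require:
      - salt: set-hostname-everywhere   # certificates only after hostnames are final
```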

Break the kubernetes-minion state into smaller components

At some point we will install the kubelet on all machines (including the masters), and the proxy may end up running as a DaemonSet, so it would be good to split this state into several states (i.e., kubelet and proxy) and add the kubelet state to the role:kube-master nodes as well.
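
The split could look roughly like this in top.sls (a sketch; the state names and grain matching are illustrative, not the repo's actual layout):

```yaml
base:
  'roles:kube-minion':
    - match: grain
    - kubelet            # split out of the old kubernetes-minion state
    - kube-proxy         # split out of the old kubernetes-minion state
  'roles:kube-master':
    - match: grain
    - kube-apiserver
    - kubelet            # kubelet now also runs on the masters
```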

etcd traffic should be using https only

Currently we are configuring etcd to use both HTTP and HTTPS, which is a security concern. You can also see warnings about it in the etcd logs:

May 01 18:37:30 minion1 etcd[2613]: The scheme of client url http://0.0.0.0:2379 is HTTP while peer key/cert files are presented. Ignored key/cert files.

See /etc/sysconfig/etcd:

ETCD_LISTEN_CLIENT_URLS="http://0.0.0.0:2379"
ETCD_LISTEN_PEER_URLS="https://0.0.0.0:2380"

ETCD_ADVERTISE_CLIENT_URLS="http://minion1.k8s.local:2379,http://10.17.3.187:2379"

All of these must use HTTPS.
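
Reusing the certificate paths that are already configured for the peer URLs (see the log above), the client side could be switched to HTTPS roughly like this. This is a sketch of the target configuration, not the shipped one:

```
# /etc/sysconfig/etcd (sketch)
ETCD_LISTEN_CLIENT_URLS="https://0.0.0.0:2379"
ETCD_LISTEN_PEER_URLS="https://0.0.0.0:2380"

ETCD_ADVERTISE_CLIENT_URLS="https://minion1.k8s.local:2379,https://10.17.3.187:2379"

# client TLS material, same files already used for the peer endpoints
ETCD_CERT_FILE="/etc/pki/minion.crt"
ETCD_KEY_FILE="/etc/pki/minion.key"
ETCD_TRUSTED_CA_FILE="/etc/pki/trust/anchors/SUSE_CaaSP_CA.crt"
```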

The update procedure fails to mark worker nodes as schedulable

During the update procedure we go machine by machine, and on the worker nodes we first drain them. This is fine, but when they come back, marking them as schedulable again sometimes fails.

The problem resides here:

https://github.com/kubic-project/salt/blob/3c64b885effe7dfba43103de60d97bd9d7723197/salt/kubelet/init.sls#L89-L104

And more specifically, in the onlyif section:

                        ID: kubelet
                  Function: caasp_cmd.run
                      Name: kubectl uncordon d14594f000a94cf68ba6858439e38b35.infra.caasp.local
              
                    Result: True
                   Comment: onlyif execution failed
                   Started: 17:37:19.501176
                  Duration: 497.879 ms
                   Changes: 

The output of the onlyif command when this orchestration was run was:

The connection to the server api.infra.caasp.local:6443 was refused - did you specify the right host or port?

This makes the whole logic fail, and nodes will be kept unschedulable:

~ KUBECONFIG=~/Downloads/kubeconfig kubectl get nodes
NAME                                                 STATUS                     AGE       VERSION
8250118f04664f1392eff757d280d698.infra.caasp.local   Ready,SchedulingDisabled   10m       v1.8.5
d14594f000a94cf68ba6858439e38b35.infra.caasp.local   Ready,SchedulingDisabled   10m       v1.8.5
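
One possible mitigation, assuming Salt's global `retry` state argument (available since 2017.7) also applies to the custom caasp_cmd.run state, is to retry the uncordon until the apiserver is reachable again. This is a sketch, not the actual state file, and the onlyif shown is illustrative:

```yaml
kubelet:
  caasp_cmd.run:
    - name: kubectl uncordon {{ grains['nodename'] }}
    - retry:
        attempts: 10     # keep trying while the apiserver comes back up
        interval: 15
    - onlyif: kubectl get node {{ grains['nodename'] }}
```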

Use the salt.modules.etcd_mod module

We perform many operations on etcd in different parts of our code (e.g., setting up the etcd discovery or configuring flannel). Most of them are done with the help of etcdctl or curl, but there is a native mechanism: the salt.modules.etcd_mod module. We should try to use it, although it would require packaging python-etcd and including it in the CaaSP image...
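
Setting a flannel key could then look like this (a sketch: salt.modules.etcd_mod needs an etcd profile in the minion configuration, and the profile name and key below are made up):

```yaml
# Assumed minion config profile, e.g. /etc/salt/minion.d/etcd.conf:
#   etcd_local:
#     etcd.host: 127.0.0.1
#     etcd.port: 2379

flannel-network-config:
  module.run:
    - name: etcd.set
    - key: /flannel/network/config
    - value: '{"Network": "172.16.0.0/13", "Backend": {"Type": "vxlan"}}'
    - profile: etcd_local
```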

Use pillar in salt

I think it would be preferable to have a pillar entry in params.sls with something like:

haproxy:
  image: haproxy:1.6.11

and then use it here:

      image: {{ salt.caasp_registry.base_image_url() }}/{{ pillar['haproxy']['image'] }}

instead of this hardcoded version of the image.

Originally posted by @inercia in #747

The new pillar returns None for unset values

The new pillar does not conform to the Salt pillar standard: salt['pillar.get']('something:that:does:not:exist', 'default-value') will never return 'default-value', but None.
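
Until this is fixed, callers can guard against the unexpected None with a Jinja-level fallback (a workaround sketch; the state and path are illustrative):

```yaml
# 'or' kicks in when pillar.get returns None instead of honoring the default
example-config:
  file.managed:
    - name: /etc/example.conf
    - contents: {{ salt['pillar.get']('something:that:does:not:exist', 'default-value') or 'default-value' }}
```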

velum external pillar does not load

When trying to access the pillar items provided via the external velum pillar, I get no data. The salt-master container log contains the following:

{"log":"16:00:50,353 [PROFILE ] Time (in seconds) to render '/srv/pillar/cloud.sls' using 'yaml' renderer: 0.00510215759277\n","stream":"stderr","time":"2018-02-27T16:00:50.354028037Z"}
{"log":"16:00:50,370 [INFO    ] Loaded velum pillar module\n","stream":"stderr","time":"2018-02-27T16:00:50.37046686Z"}
{"log":"16:00:50,513 [DEBUG   ] Failed to import module kubernetes:\n","stream":"stderr","time":"2018-02-27T16:00:50.513395674Z"}
{"log":"Traceback (most recent call last):\n","stream":"stderr","time":"2018-02-27T16:00:50.51342746Z"}
{"log":"  File \"/usr/lib/python2.7/site-packages/salt/loader.py\", line 1333, in _load_module\n","stream":"stderr","time":"2018-02-27T16:00:50.513432637Z"}
{"log":"    mod = imp.load_module(mod_namespace, fn_, fpath, desc)\n","stream":"stderr","time":"2018-02-27T16:00:50.513437312Z"}
{"log":"  File \"/usr/lib/python2.7/site-packages/salt/modules/kubernetes.py\", line 67, in \u003cmodule\u003e\n","stream":"stderr","time":"2018-02-27T16:00:50.513441432Z"}
{"log":"    from kubernetes.client import AppsV1beta1Deployment\n","stream":"stderr","time":"2018-02-27T16:00:50.513446319Z"}
{"log":"  File \"/usr/lib/python2.7/site-packages/salt/modules/kubernetes.py\", line 67, in \u003cmodule\u003e\n","stream":"stderr","time":"2018-02-27T16:00:50.51345018Z"}
{"log":"    from kubernetes.client import AppsV1beta1Deployment\n","stream":"stderr","time":"2018-02-27T16:00:50.513454406Z"}
{"log":"ImportError: No module named client\n","stream":"stderr","time":"2018-02-27T16:00:50.513458254Z"}
{"log":"16:00:50,631 [DEBUG   ] Module DSC: Only available on Windows systems\n","stream":"stderr","time":"2018-02-27T16:00:50.632140745Z"}
{"log":"16:00:50,636 [DEBUG   ] Module PSGet: Only available on Windows systems\n","stream":"stderr","time":"2018-02-27T16:00:50.637058699Z"}
{"log":"16:00:50,970 [DEBUG   ] Could not LazyLoad redis.get_key: 'redis.get_key' is not available.\n","stream":"stderr","time":"2018-02-27T16:00:50.970732683Z"}
{"log":"16:00:50,973 [DEBUG   ] Could not LazyLoad velum.ext_pillar: 'velum.ext_pillar' is not available.\n","stream":"stderr","time":"2018-02-27T16:00:50.974088542Z"}
{"log":"16:00:50,973 [CRITICAL] Specified ext_pillar interface velum is unavailable\n","stream":"stderr","time":"2018-02-27T16:00:50.9741117Z"}

The __virtual__ function of velum.py is obviously executed (hence the "Loaded velum pillar module" message), but the pillar still does not work. That's with Salt 2016.11.4 as provided by SLE12.
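
The "'velum.ext_pillar' is not available" message suggests the module is not on the master's pillar loader path. For reference, ext_pillar modules are looked up under a `pillar/` subdirectory of extension_modules; a master config sketch (paths and arguments are illustrative) would be:

```yaml
# /etc/salt/master.d/pillar.conf (sketch)
extension_modules: /usr/share/salt/custom   # must contain pillar/velum.py
ext_pillar:
  - velum: {}                               # actual arguments omitted
```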

Do not link the certificate's CN to anything technical (like the domain name)

There's no need to do anything like this when setting the CN on the CA certificate:

{% if salt['grains.get']('domain', '')|length > 0 %}
    - CN: {{ grains['domain'] }}
{% elif salt['pillar.get']('dns:domain', '')|length > 0 %}
    - CN: {{ pillar['dns']['domain'] }}
{% else %}
    - CN: kubernetes
{% endif %}

From: https://github.com/kubic-project/salt/blob/8e123e4cc8c7308cc55f69ade7ba97009fe265c6/salt/ca/init.sls#L57-L63

Maybe we can use pillar['certificate_information']['subject_properties']['O'], or add a different setting. A static CN like "Kubernetes cluster CA certificate" or something similar would work too. Some examples of CN values on root certificates:
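
With a static CN, the conditional above collapses to a single line. A sketch, using Salt's x509 state purely for illustration (the real state in salt/ca/init.sls differs):

```yaml
ca-certificate:
  x509.certificate_managed:
    - name: /etc/pki/ca.crt
    - CN: SUSE CaaSP cluster CA   # static; no dependency on the machine's domain
```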

hostname is being reset by DHCP after a while

Even though we set the hostname in our Salt scripts, it can be overwritten by wickedd-dhcp4 when renewing the DHCP lease:

...
May 16 12:16:39 141abdcfc0b04ce3b79e2da3598966f7.infra.caasp.local hyperkube[2892]: I0516 12:16:39.931630    2892 qos_container_manager_linux.go:285] [ContainerManager]: Updated QoS cgroup configuration
May 16 12:16:47 141abdcfc0b04ce3b79e2da3598966f7.infra.caasp.local hyperkube[2892]: I0516 12:16:47.017172    2892 kubelet_node_status.go:391] Using node IP: "10.17.15.120"
May 16 12:16:47 141abdcfc0b04ce3b79e2da3598966f7.infra.caasp.local hyperkube[2892]: E0516 12:16:47.018774    2892 kubelet.go:1669] Failed to check if disk space is available on the root partition: failed to get 
May 16 12:16:51 141abdcfc0b04ce3b79e2da3598966f7.infra.caasp.local wickedd-dhcp4[1102]: eth0: Committed DHCPv4 lease with address 10.17.15.120 (lease time 3600 sec, renew in 1660 sec, rebind in 3010 sec)
May 16 12:16:51 141abdcfc0b04ce3b79e2da3598966f7.infra.caasp.local wickedd[1142]: route ipv4 0.0.0.0/0 via 10.17.15.1 dev eth0#2 type unicast table main scope universe protocol dhcp covered by a ipv4:dhcp lease
May 16 12:16:52 minion1 systemd[1]: Reloading System Logging Service.
May 16 12:16:52 minion1 systemd[1]: Reloaded System Logging Service.
May 16 12:16:57 minion1 hyperkube[2892]: I0516 12:16:57.039253    2892 kubelet_node_status.go:391] Using node IP: "10.17.15.120"
May 16 12:16:57 minion1 hyperkube[2892]: E0516 12:16:57.041470    2892 kubelet.go:1669] Failed to check if disk space is available on the root partition: failed to get fs info for "root": error trying to get fil
May 16 12:17:07 minion1 hyperkube[2892]: I0516 12:17:07.068052    2892 kubelet_node_status.go:391] Using node IP: "10.17.15.120"
May 16 12:17:07 minion1 hyperkube[2892]: E0516 12:17:07.069585    2892 kubelet.go:1669] Failed to check if disk space is available on
...

I think the consequence of this would be that Salt's fqdn grain for this node is wrong (so any new orchestration would mess things up). We should verify this is not going to break things.
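
One way to prevent this, assuming the standard SUSE/wicked network configuration applies to these images, is to tell the DHCP client not to set the hostname from the lease:

```
# /etc/sysconfig/network/dhcp
DHCLIENT_SET_HOSTNAME="no"
```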

README update

Update the README with which packages are installed, what they configure and how, and a simple guide on running the provisioning.
