kubernetes-sigs / cluster-api-provider-openstack
Cluster API implementation for OpenStack
Home Page: https://cluster-api-openstack.sigs.k8s.io/
License: Apache License 2.0
See kubernetes-sigs/cluster-api#599.
Once that is in, defaults for the MachineDeployment controller will be set.
Examples:
cd $GOPATH/src/sigs.k8s.io/cluster-api-provider-openstack/clusterctl
---> this path no longer exists
clusterctl create cluster --provider openstack -c examples/openstack/out/cluster.yaml -m examples/openstack/out/machines.yaml -p examples/openstack/out/provider-components.yaml
---> the out/*.yaml paths do not match how the directory is laid out by default
Saw the following when using the latest master code. Used the steps in https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/master/README.md
Testing on macOS High Sierra 10.13.6 with Docker version 17.03.0-ce and minikube 1.10.0. Ran clusterctl with:
./clusterctl create cluster --provider openstack -c examples/openstack/out/cluster.yaml -m examples/openstack/out/machines.yaml -p examples/openstack/out/provider-components.yaml --alsologtostderr --v 9 --vm-driver=hyperkit
After a few mins, my cluster state looks like:
kubectl get all
NAME READY STATUS RESTARTS AGE
po/clusterapi-apiserver-79bd7bdff-bg6wc 1/1 Running 0 8m
po/clusterapi-controllers-657dd5468b-fwwpr 1/2 CrashLoopBackOff 6 7m
po/etcd-clusterapi-0 1/1 Running 0 8m
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
svc/clusterapi 10.96.13.229 <none> 443/TCP 8m
svc/etcd-clusterapi-svc 10.97.196.161 <none> 2379/TCP 8m
svc/kubernetes 10.96.0.1 <none> 443/TCP 8m
NAME KIND
statefulsets/etcd-clusterapi StatefulSet.v1.apps
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
deploy/clusterapi-apiserver 1 1 1 1 8m
deploy/clusterapi-controllers 1 1 1 0 7m
NAME DESIRED CURRENT READY AGE
rs/clusterapi-apiserver-79bd7bdff 1 1 1 8m
rs/clusterapi-controllers-657dd5468b 1 1 0 7m
Looking at the logs of the containers in the failing pod (clusterapi-controllers):
kubectl logs clusterapi-controllers-657dd5468b-fwwpr -c controller-manager
ERROR: logging before flag.Parse: I1001 14:43:46.032006 1 controller.go:83] Waiting for caches to sync for machine deployment controller
ERROR: logging before flag.Parse: E1001 14:43:46.034175 1 reflector.go:205] sigs.k8s.io/cluster-api/pkg/controller/sharedinformers/zz_generated.api.register.go:55: Failed to list *v1alpha1.Machine: Get https://localhost:8443/apis/cluster.k8s.io/v1alpha1/machines?limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
.....
kubectl logs clusterapi-controllers-657dd5468b-fwwpr -c openstack-machine-controller
ERROR: logging before flag.Parse: F1001 14:59:56.629761 1 main.go:55] Could not create Openstack machine actuator: Create providerClient err: Missing input for argument [Username]
Would appreciate any help on the above...thanks!
This call produces gibberish when working offline and no kubernetes-version is specified:
TOKEN=I1025 09:05:57.001785 15 version.go:89] could not fetch a Kubernetes version from the internet: unable to get URL "https://dl.k8s.io/release/stable-1.txt": Get https://dl.k8s.io/release/stable-1.txt: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
I1025 09:05:57.001886 15 version.go:94] falling back to the local client version: v1.12.1
pjpft6.5o7555anlw2y2b3a
FWIW, we should not be relying on kubeadm to generate tokens, but this is what we have now, so we ought to fix it. We could pass kubernetes-version to the call if it were a valid option in the TokenParams struct. Sadly, it isn't.
Currently the generate-yaml.sh script uses bash and sed to build the yaml files that are passed to clusterctl. Environmental differences (macOS vs. Linux, tooling versions) can make that script flaky.
I'd like to propose we move to Kustomize for the generation and reduce this reliance on bash. I'm happy to take this on and send a PR.
s/glog/klog/g
xref: https://groups.google.com/forum/#!topic/kubernetes-dev/7vnijOMhLS0
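For reference, a minimal sketch of what the swap looks like; klog deliberately mirrors the glog API, so existing call sites can stay unchanged:

package main

import (
	"flag"

	"k8s.io/klog" // was: "github.com/golang/glog"
)

func main() {
	// Unlike glog, klog does not register its flags in init(), so this
	// call is needed before flag.Parse().
	klog.InitFlags(nil)
	flag.Parse()

	// Existing call sites (Infof, Errorf, Fatal, V(2).Info, ...) keep working.
	klog.Info("controller starting")
}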
As soon as #150 is in, we also need a way to delete a cluster. This includes removing the resources created by the cluster reconciler.
This could be a bit more complicated than it sounds. Very likely, just deleting everything in reverse order won't work: when Kubernetes is configured to create OpenStack LoadBalancers, there might be LBs left in the subnet when a cluster is deleted. These must go away before the subnet can be deleted.
The same is true for volumes.
Currently, when the machine actuator creates VMs, it goes to OpenStack Compute and asks for one VM, then the next, again and again. This can result in all VMs landing on a single hypervisor.
Well, OpenStack provides a way to configure anti-affinity; see https://docs.openstack.org/python-openstackclient/pike/cli/command-objects/server-group.html.
For sure, this does not come without problems. As long as there are hypervisors without "our" VMs running on them, everything is fine. But once "our" VMs have been scheduled to all of the hypervisors, we will get a "no host found" error.
What we need is a kind of soft anti-affinity: as long as it is possible, OpenStack should schedule VMs on different hosts, but as soon as it no longer is, OpenStack should accept that it needs to deploy VMs to an already-used hypervisor.
We could probably create a ServerGroup in the cluster actuator for the master nodes. For worker nodes we might use a group of their own, but implement the fuzzy scheduling in the machine actuator, e.g. count the failures due to "no host found", and once that count exceeds $MAX, try to spin up the VM without a server group.
But I want to hear other opinions on this.
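For illustration, a minimal gophercloud sketch of creating such a server group (the group name is made up; the soft-anti-affinity policy requires Nova microversion 2.15 or newer):

package main

import (
	"fmt"

	"github.com/gophercloud/gophercloud"
	"github.com/gophercloud/gophercloud/openstack"
	"github.com/gophercloud/gophercloud/openstack/compute/v2/extensions/servergroups"
)

func main() {
	opts, err := openstack.AuthOptionsFromEnv()
	if err != nil {
		panic(err)
	}
	provider, err := openstack.AuthenticatedClient(opts)
	if err != nil {
		panic(err)
	}
	compute, err := openstack.NewComputeV2(provider, gophercloud.EndpointOpts{})
	if err != nil {
		panic(err)
	}
	// "soft-anti-affinity" spreads VMs across hypervisors but still schedules
	// when no empty hypervisor is left; the plain "anti-affinity" policy
	// would fail with "no valid host" in that case.
	compute.Microversion = "2.15"
	group, err := servergroups.Create(compute, servergroups.CreateOpts{
		Name:     "k8s-workers", // hypothetical name
		Policies: []string{"soft-anti-affinity"},
	}).Extract()
	if err != nil {
		panic(err)
	}
	fmt.Println("created server group", group.ID)
}

New servers would then join the group at boot time via the group scheduler hint.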
Looks like we use cc as the user ID in the image. Can we please default to the standard Ubuntu images documented in the OpenStack docs? https://docs.openstack.org/image-guide/obtain-images.html Please note that the login account there is ubuntu.
Allow for setting the SSH username in the providerConfig, if needed.
The kubeadm join on worker nodes is passed the wrong port (443) for the kube-apiserver on the master node, i.e.
kubeadm join --token <redacted> 10.10.10.1:443 --ignore-preflight-errors=all --discovery-token-unsafe-skip-ca-verification
[preflight] running pre-flight checks
[discovery] Trying to connect to API Server "10.10.10.1:443"
[discovery] Created cluster-info discovery client, requesting info from "https://10.10.10.1:443"
The configured port for the kube-apiserver is 6443, resulting in worker nodes being unable to join the cluster.
This would allow people to pass the path to an already filled-out clouds.conf into generate-yaml.sh via a command-line option. Something like this would make sense:
generate-yaml.sh --conf your/path
/assign @iamemilio
Give clients a way to enter a value for OS_CLOUD; see ../../pkg/cloud/openstack/clients/machineservice.go.
We're currently using a custom config file for cluster-api-provider-openstack, which is different from both the cloud-provider-openstack config file and the standard clouds.yaml file. The current config uses the following format:
user-name:
password:
domain-name:
tenant-id:
region:
auth-url:
In order to provide a better user experience, it'd be better to either adopt the config file format used by cloud-provider-openstack or, even better, the clouds.yaml format (gophercloud already has support for the latter). Here's an example of what the yaml would look like:
clouds:
  my_cloud:
    insecure: true
    verify: false
    identity_api_version: 3
    auth:
      auth_url: http://10.0.0.14:5000/v3
      project_name: admin
      username: admin
      password: $SUPER_PASSWORD
      project_domain_name: Default
      user_domain_name: Default
    region: RegionOne
Thoughts?
/cc @Lion-Wei
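For reference, a rough sketch of consuming such a clouds.yaml through gophercloud's clientconfig helper (from github.com/gophercloud/utils); the cloud name matches the example above:

package main

import (
	"fmt"

	"github.com/gophercloud/gophercloud/openstack"
	"github.com/gophercloud/utils/openstack/clientconfig"
)

func main() {
	// clientconfig resolves clouds.yaml from the standard search paths
	// (current dir, ~/.config/openstack/, /etc/openstack/).
	ao, err := clientconfig.AuthOptions(&clientconfig.ClientOpts{Cloud: "my_cloud"})
	if err != nil {
		panic(err)
	}
	provider, err := openstack.AuthenticatedClient(*ao)
	if err != nil {
		panic(err)
	}
	fmt.Println("authenticated against", provider.IdentityEndpoint)
}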
In this controller, we only shipped the files in config/rbac/rbac_*.yaml. In theory, these should be generated by controller-gen for us.
To get this running, we need to include it in our Makefile; see https://github.com/kubernetes-sigs/cluster-api/blob/2edce9534dce0be00f9a519d5ac64fca009639bc/Makefile#L59
Also, we must migrate everything in config/rbac/rbac_role.yaml and config/rbac_role_binding.yaml to annotations on our Reconciler, in the format of:
// +kubebuilder:rbac:groups=cluster.k8s.io,resources=machines,verbs=get;list;watch;create;update;patch;delete
Note: We are not able to do the same with the data in config/rbac/secrets_*.yaml, because there we use a Role (and not a ClusterRole), which is not yet supported by kubebuilder. See kubernetes-sigs/kubebuilder#401.
We're currently depending on kubeadm to generate tokens. This is far from ideal, as it requires shelling out, and it doesn't work in environments where kubeadm is not present.
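A minimal sketch of generating a kubeadm-compatible bootstrap token ([a-z0-9]{6}.[a-z0-9]{16}) without shelling out; the function names are illustrative:

package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
)

const tokenChars = "abcdefghijklmnopqrstuvwxyz0123456789"

func randToken(n int) (string, error) {
	b := make([]byte, n)
	for i := range b {
		idx, err := rand.Int(rand.Reader, big.NewInt(int64(len(tokenChars))))
		if err != nil {
			return "", err
		}
		b[i] = tokenChars[idx.Int64()]
	}
	return string(b), nil
}

// generateBootstrapToken returns a token in kubeadm's "abcdef.0123456789abcdef"
// format: a 6-character public ID and a 16-character secret.
func generateBootstrapToken() (string, error) {
	id, err := randToken(6)
	if err != nil {
		return "", err
	}
	secret, err := randToken(16)
	if err != nil {
		return "", err
	}
	return id + "." + secret, nil
}

func main() {
	token, err := generateBootstrapToken()
	if err != nil {
		panic(err)
	}
	fmt.Println(token)
}

The token would still have to be written as a bootstrap-token secret in kube-system for the join to work.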
Networks and SecurityGroups currently only accept UUIDs. It'd be great to support other filters, like names, or, even better, filtering on other parameters.
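A rough gophercloud sketch of resolving a network by name instead of UUID (the network name is made up); security groups could be filtered the same way via their own ListOpts:

package main

import (
	"fmt"

	"github.com/gophercloud/gophercloud"
	"github.com/gophercloud/gophercloud/openstack"
	"github.com/gophercloud/gophercloud/openstack/networking/v2/networks"
)

func main() {
	opts, err := openstack.AuthOptionsFromEnv()
	if err != nil {
		panic(err)
	}
	provider, err := openstack.AuthenticatedClient(opts)
	if err != nil {
		panic(err)
	}
	neutron, err := openstack.NewNetworkV2(provider, gophercloud.EndpointOpts{})
	if err != nil {
		panic(err)
	}
	// Filter server-side by name; other ListOpts fields allow filtering
	// on status, tenant, and more.
	pages, err := networks.List(neutron, networks.ListOpts{Name: "k8s-cluster-net"}).AllPages()
	if err != nil {
		panic(err)
	}
	nets, err := networks.ExtractNetworks(pages)
	if err != nil {
		panic(err)
	}
	for _, n := range nets {
		fmt.Println(n.ID, n.Name)
	}
}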
After the update of cluster-api in #112, the openstack controller doesn't include the controllers for MachineSet and MachineDeployment any more. Before, this was maybe done by some magic?
Anyway, to register the controllers, we need to extend cmd/manager/main.go by:
import (
	clusterv1controller "sigs.k8s.io/cluster-api/pkg/controller"
)

func main() {
	...
	// Register the upstream cluster-api controllers (MachineSet,
	// MachineDeployment, ...) with the shared controller manager.
	if err := clusterv1controller.AddToManager(mgr); err != nil {
		glog.Fatal(err)
	}
}
generate-yaml.sh currently uses the yq found at https://github.com/kislyuk/yq, but the yq found at https://github.com/mikefarah/yq is more widely available. The script should use the syntax of the latter and detect if the other is installed instead.
We should not assume the startup script (see snippet below) is a bash script. In a cloud-init scenario this works great, but it won't work for images using Ignition.
Instead, we could assume the startup script contains gotemplate variables and use it as the template. If it does, it'll be rendered; otherwise nothing will happen and it'll be pushed to the node as-is.
const masterEnvironmentVars = `#!/bin/bash
KUBELET_VERSION={{ .Machine.Spec.Versions.Kubelet }}
VERSION=v${KUBELET_VERSION}
NAMESPACE={{ .Machine.ObjectMeta.Namespace }}
MACHINE=$NAMESPACE
MACHINE+="/"
MACHINE+={{ .Machine.ObjectMeta.Name }}
CONTROL_PLANE_VERSION={{ .Machine.Spec.Versions.ControlPlane }}
CLUSTER_DNS_DOMAIN={{ .Cluster.Spec.ClusterNetwork.ServiceDomain }}
POD_CIDR={{ .PodCIDR }}
SERVICE_CIDR={{ .ServiceCIDR }}
`
const nodeEnvironmentVars = `#!/bin/bash
KUBELET_VERSION={{ .Machine.Spec.Versions.Kubelet }}
TOKEN={{ .Token }}
MASTER={{ .MasterEndpoint }}
NAMESPACE={{ .Machine.ObjectMeta.Namespace }}
MACHINE=$NAMESPACE
MACHINE+="/"
MACHINE+={{ .Machine.ObjectMeta.Name }}
CLUSTER_DNS_DOMAIN={{ .Cluster.Spec.ClusterNetwork.ServiceDomain }}
POD_CIDR={{ .PodCIDR }}
SERVICE_CIDR={{ .ServiceCIDR }}
`
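A minimal sketch of the proposed behavior, assuming the actuator keeps passing the same params that feed these templates today; a script with no template actions comes back byte-for-byte unchanged:

package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// renderStartupScript treats the user-supplied startup script as a Go
// template. Plain cloud-init or Ignition payloads contain no {{ ... }}
// actions, so they pass through untouched.
func renderStartupScript(script string, params interface{}) (string, error) {
	tmpl, err := template.New("startup-script").Parse(script)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, params); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// Illustrative params; the real code would pass its setup-params struct.
	out, err := renderStartupScript("TOKEN={{ .Token }}\n", struct{ Token string }{"abcdef.0123456789abcdef"})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}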
The current implementation doesn't have support for MachineSets, which limits the way the actuator can be used.
The Name parameter and other parts of the implementation might need to be revisited as part of this implementation.
Relates to #78
When the cluster-api openstack controller is deployed to the actual cluster, it uses different secrets than when running in the minikube phase. Because of this, the controller is not able to create bootstrap tokens and is thus not able to create new worker nodes.
E1107 10:46:52.740308 1 machineactuator.go:383] Machine error: error creating Openstack instance: secrets is forbidden: User "system:serviceaccount:openstack-provider-system:default" cannot create resource "secrets" in API group "" in the namespace "kube-system"
Todo:
Extend the ClusterRole openstack-provider-manager-role to be able to create secrets in kube-system.
Looks like this is the culprit:
Our implementation currently focuses on using the Kubernetes OpenStack built-ins, which are deprecated. We should make the switch to cloud-provider-openstack once "Use standard clouds.yaml file as config file" closes.
Please add periodic jobs to https://k8s-testgrid.appspot.com/sig-cluster-lifecycle-cluster-api-provider-openstack to give feedback on CI signal.
As identified by @flaper87, we use a different authentication config than cloud-provider-openstack, which can lead to confusion and prevents reuse of code. In addition to enabling authentication with a standard OpenStack clouds.yaml file, which is being addressed in #16, we should also adopt the cloud-provider-openstack structure for traditional credentials.
Nodes deployed using cluster-api-provider-openstack can safely be assumed to be running in an OpenStack environment, and therefore should be configured with a cloud.conf to enable the Kubernetes-OpenStack integration.
xrefs:
OpenStack Cloud Provider
GCE example implementation
GCE example kubelet configuration
OpenstackProviderConfig can be separated into cluster scope and machine scope.
generate-yaml.sh failed with "./generate-yaml.sh: 100: read: Illegal option -s" on Ubuntu.
That's because Ubuntu's default shell is dash, which does not support read with the -s option.
The shebang of generate-yaml.sh should be /bin/bash instead of /bin/sh.
Now that cluster-api has migrated to using kubebuilder with CRDs instead of apiserver-builder and API aggregation (ref: kubernetes-sigs/cluster-api#494), we need to sync the openstack provider with upstream to use kubebuilder as well.
Decode fails in the method machineInstanceStatus in instancestatus.go with the error:
no kind "Machine" is registered for version "cluster.k8s.io/v1alpha1"
So far, the cluster actuator is not yet implemented, except for basic types.
I would like to see a way to set up Networks, SecurityGroups, and LoadBalancers for the cluster. In a first step, a single subnet for all nodes (master and worker) would IMHO be enough. Created Machines should use the created infrastructure.
Todo:
The LB floating IP and NetworkID must go into the cluster status. SecurityGroups too, somehow, but I am not yet sure how. Edit: the floating IP is not needed because there is an APIEndpoint on ClusterStatus.
The loadbalancer should be used as the entry point for the apiserver, so we are able to create HA control planes.
Once this is done, we can update the machine actuator to use the infrastructure.
We faced interference with ufw running on Ubuntu 18.04.
IMHO we should disable ufw by default, no matter whether the Ubuntu version is 16.04 or 18.04.
On both versions, adding this to the startup script of master and node should do the trick:
ufw disable
In its current state, the Identity and Compute clients time out after 3 hours (I believe), rendering the provider pod useless until a new token is issued. You then get a constant stream of "...Get service list err: Authentication failed". During testing you can work around this by recreating the pod, but for actual use cases we should leverage the ReAuth option in the gophercloud library, which defaults to false. Setting it to true for our use cases seems much more sensible.
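Concretely, the relevant knob is the AllowReauth field on gophercloud.AuthOptions; a minimal sketch:

package main

import (
	"fmt"

	"github.com/gophercloud/gophercloud/openstack"
)

func main() {
	opts, err := openstack.AuthOptionsFromEnv()
	if err != nil {
		panic(err)
	}
	// Re-authenticate transparently when the token expires, instead of
	// failing every call with "Authentication failed" until the pod
	// is recreated.
	opts.AllowReauth = true

	provider, err := openstack.AuthenticatedClient(opts)
	if err != nil {
		panic(err)
	}
	fmt.Println("authenticated against", provider.IdentityEndpoint)
}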
The current implementation is not passing all the required data to the auth step, resulting in an authentication failure.
We currently depend on having clouds.yaml in one of the standard paths. In a more k8s fashion, it'd be great if the actuator could get the secret itself (assuming there's a service account token in the pod) and use the data in that secret to create the openstack client.
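A rough sketch of what that could look like with client-go of that era (pre-context API); the secret name, namespace, and key are all made up:

package main

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// cloudsYAMLFromSecret fetches clouds.yaml out of a Kubernetes secret using
// the pod's service account credentials.
func cloudsYAMLFromSecret(namespace, name string) ([]byte, error) {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		return nil, err
	}
	cs, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		return nil, err
	}
	secret, err := cs.CoreV1().Secrets(namespace).Get(name, metav1.GetOptions{})
	if err != nil {
		return nil, err
	}
	return secret.Data["clouds.yaml"], nil
}

func main() {
	data, err := cloudsYAMLFromSecret("openstack-provider-system", "cloud-credentials")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s", data)
}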
Currently, generate-yaml.sh breaks if the path it's run in contains spaces:
./generate-yaml.sh -c clouds.yaml
./generate-yaml.sh: line 73: [: /mnt/c/Users/Daniel: binary operator expected
./generate-yaml.sh: line 78: [: /mnt/c/Users/Daniel: binary operator expected
./generate-yaml.sh: line 83: [: /mnt/c/Users/Daniel: binary operator expected
mkdir: cannot create directory ‘/mnt/c/Users/Daniel’: Permission denied
This can be solved by some strategic use of quotes, i.e. quoting the variable expansions inside the test expressions (e.g. [ -d "$dir" ] rather than [ -d $dir ]).
Jobs we need to add:
This would make the script much more extensible and simpler.
xref: kubernetes-sigs/cluster-api#572
/assign @chrigl
Suggested feature: if we wrap all the parts that run on local hardware in a Docker container, it would mitigate a lot of hardware-specific issues and ultimately streamline the app.
There have been changes upstream to rename ProviderConfig to ProviderSpec. We need to pull in the upstream changes and update our references and documentation.
The security_group and availability_zone keys are not consistent with the format used by other keys (floatingIP, etc.). Make these keys consistent to improve the user experience.