
cluster-api-provider-openstack's Issues

Implement deletion of cluster

As soon as #150 is in, we also need a way to delete a cluster. This includes removing everything the cluster reconciler created.

This could be a bit more complicated than it sounds. Simply deleting everything in reverse order of creation will very likely not work: when Kubernetes is configured to create OpenStack load balancers, there may be LBs left in the subnet when the cluster is deleted, and these must be removed before the subnet can be deleted.

The same is true for volumes. A possible teardown order is sketched below.
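
A rough Go sketch of such a teardown (all helper functions are hypothetical wrappers around the corresponding gophercloud calls):

// Hypothetical teardown order for cluster deletion; helpers are placeholders.
func (a *Actuator) deleteClusterInfrastructure(cluster *clusterv1.Cluster) error {
	// First, delete what the cluster reconciler created, newest first.
	if err := a.deleteLoadBalancer(cluster); err != nil {
		return err
	}
	if err := a.deleteSecurityGroups(cluster); err != nil {
		return err
	}
	// Then sweep up resources created *by* the running cluster
	// (cloud-provider LBs, volumes) that would block subnet deletion.
	if err := a.deleteOrphanedLoadBalancers(cluster); err != nil {
		return err
	}
	if err := a.deleteOrphanedVolumes(cluster); err != nil {
		return err
	}
	// Only now can router, subnet and network go away.
	if err := a.deleteRouter(cluster); err != nil {
		return err
	}
	return a.deleteNetworkAndSubnet(cluster)
}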

Allow system:serviceaccount:openstack-provider-system:default to create secrets in kube-system

Relates to #78

When the cluster-api openstack controller is deployed to the actual cluster, it uses different secrets than when running in the minikube phase. Because of this, the controller is not able to create bootstrap tokens and thus cannot create new worker nodes.

E1107 10:46:52.740308       1 machineactuator.go:383] Machine error: error creating Openstack instance: secrets is forbidden: User "system:serviceaccount:openstack-provider-system:default" cannot create resource "secrets" in API group "" in the namespace "kube-system"

Todo:
Extend the ClusterRole openstack-provider-manager-role so that it can create secrets in kube-system.
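
A minimal sketch of the rule to add (note that a ClusterRole rule grants this cluster-wide; restricting it to kube-system only would instead require a Role plus RoleBinding in that namespace):

# Hypothetical addition to the rules of ClusterRole openstack-provider-manager-role
- apiGroups:
  - ""
  resources:
  - secrets
  verbs:
  - get
  - list
  - create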

Readme instructions are not up to date with current project state

examples:
cd $GOPATH/src/sigs.k8s.io/cluster-api-provider-openstack/clusterctl
---> this path no longer exists

clusterctl create cluster --provider openstack -c examples/openstack/out/cluster.yaml -m examples/openstack/out/machines.yaml -p examples/openstack/out/provider-components.yaml
---> the out/*.yaml paths do not match how the output directory is laid out by default

Worker node receives incorrect port for kube-apiserver on Master node for kubeadm join

The kubeadm join command on worker nodes is given the wrong port (443) for the kube-apiserver on the master node, e.g.:

kubeadm join --token <redacted> 10.10.10.1:443 --ignore-preflight-errors=all --discovery-token-unsafe-skip-ca-verification
[preflight] running pre-flight checks
[discovery] Trying to connect to API Server "10.10.10.1:443"
[discovery] Created cluster-info discovery client, requesting info from "https://10.10.10.1:443"

The configured port for the kube-apiserver is 6443, resulting in worker nodes being unable to join the cluster.
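
A likely fix is to derive the join endpoint from the port recorded on the cluster object rather than a hardcoded 443; a sketch against the v1alpha1 types:

import (
	"fmt"

	clusterv1 "sigs.k8s.io/cluster-api/pkg/apis/cluster/v1alpha1"
)

// Sketch: build the master endpoint for kubeadm join from ClusterStatus,
// so the configured apiserver port (6443 here) is used instead of 443.
func masterEndpoint(cluster *clusterv1.Cluster) (string, error) {
	if len(cluster.Status.APIEndpoints) == 0 {
		return "", fmt.Errorf("no API endpoint recorded for cluster %q", cluster.Name)
	}
	ep := cluster.Status.APIEndpoints[0]
	return fmt.Sprintf("%s:%d", ep.Host, ep.Port), nil // e.g. "10.10.10.1:6443"
}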

Think about using OpenStack Server Groups

Currently, when the machine actuator creates VMs, it asks OpenStack Compute for one VM at a time, again and again. This can result in all VMs ending up on a single hypervisor.

OpenStack provides a way to configure anti-affinity; see https://docs.openstack.org/python-openstackclient/pike/cli/command-objects/server-group.html.

This does not come without problems, for sure. As long as there are hypervisors without any of "our" VMs running on them, everything is fine. But once our VMs have been scheduled to all of the hypervisors, we will get a "no host found" error.
What we need is a kind of soft anti-affinity: as long as it is possible, OpenStack should schedule VMs on different hypervisors, but once that is no longer possible, it should accept deploying VMs to an already used hypervisor.

We could probably create a ServerGroup in the cluster actuator for the master nodes. For worker nodes we might use a group of their own, but implement the fuzzy scheduling in the machine actuator: count the failures due to "no host found", and once the count exceeds $MAX, try to spin up the VM without a server group.
I'd like to hear other opinions on this.
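
For illustration, a sketch of the server-group approach with gophercloud (group name is hypothetical; soft-anti-affinity needs a sufficiently recent Nova, API microversion 2.15+):

import (
	"github.com/gophercloud/gophercloud"
	"github.com/gophercloud/gophercloud/openstack/compute/v2/extensions/schedulerhints"
	"github.com/gophercloud/gophercloud/openstack/compute/v2/extensions/servergroups"
	"github.com/gophercloud/gophercloud/openstack/compute/v2/servers"
)

// Sketch: create a server group with soft anti-affinity and pass it as a
// scheduler hint when booting an instance.
func bootWithSoftAntiAffinity(compute *gophercloud.ServiceClient, name string) error {
	sg, err := servergroups.Create(compute, servergroups.CreateOpts{
		Name:     "cluster-workers", // hypothetical group name
		Policies: []string{"soft-anti-affinity"},
	}).Extract()
	if err != nil {
		return err
	}
	opts := schedulerhints.CreateOptsExt{
		CreateOptsBuilder: servers.CreateOpts{Name: name /* image, flavor, ... */},
		SchedulerHints:    schedulerhints.SchedulerHints{Group: sg.ID},
	}
	_, err = servers.Create(compute, opts).Extract()
	return err
}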

Disable ufw on ubuntu

We ran into interference from ufw running on Ubuntu 18.04.
IMHO we should disable ufw by default, regardless of whether the Ubuntu version is 16.04 or 18.04.

On both versions, adding this to the startup script of master and node should do the trick:

ufw disable

Allow the actuator to get the openstack configs from a configmap

We currently depend on having clouds.yaml in one of the standard paths. In a more Kubernetes-native fashion, it would be great if the actuator could fetch the secret itself (assuming there's a service account token in the pod) and use the data in that secret to create the OpenStack client.
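
A sketch with client-go (the secret name, namespace, and data key are hypothetical; the Get signature matches the client-go generation this project vendors):

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// Sketch: fetch clouds.yaml from an in-cluster secret instead of a file path.
func cloudsYAMLFromSecret() ([]byte, error) {
	cfg, err := rest.InClusterConfig() // uses the pod's service account token
	if err != nil {
		return nil, err
	}
	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		return nil, err
	}
	secret, err := clientset.CoreV1().Secrets("openstack-provider-system").
		Get("cloud-config", metav1.GetOptions{})
	if err != nil {
		return nil, err
	}
	return secret.Data["clouds.yaml"], nil
}

The returned bytes could then go through the same clouds.yaml parsing we already apply to files.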

getKubeadmToken produces gibberish output when offline

This call produces gibberish when run offline with no kubernetes-version specified:

TOKEN=I1025 09:05:57.001785      15 version.go:89] could not fetch a Kubernetes version from the internet: unable to get URL "https://dl.k8s.io/release/stable-1.txt": Get https://dl.k8s.io/release/stable-1.txt: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
I1025 09:05:57.001886      15 version.go:94] falling back to the local client version: v1.12.1
pjpft6.5o7555anlw2y2b3a

FWIW, we should not be relying on kubeadm to generate tokens, but this is what we have now, so we ought to fix it. We could pass kubernetes-version to the call if it were a valid option in the TokenParams struct; sadly, it isn't.
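
The gibberish is kubeadm's log output being captured along with the token. Assuming the current implementation captures combined output, splitting the streams fixes it; a sketch:

import (
	"bytes"
	"os/exec"
	"strings"
)

// Sketch: take only stdout as the token; kubeadm's version-probe logging
// (glog) goes to stderr and must not leak into the token value.
func getKubeadmToken() (string, error) {
	var stdout, stderr bytes.Buffer
	cmd := exec.Command("kubeadm", "token", "generate") // subcommand per the existing code
	cmd.Stdout = &stdout
	cmd.Stderr = &stderr // log or discard separately
	if err := cmd.Run(); err != nil {
		return "", err
	}
	return strings.TrimSpace(stdout.String()), nil
}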

Migrate rbac to go comment annotations of kubebuilder

In this controller, we have so far only shipped static files in config/rbac/rbac_*.yaml. In theory these should be generated for us by controller-gen.
To get this running, we need to include it into our Makefile, see https://github.com/kubernetes-sigs/cluster-api/blob/2edce9534dce0be00f9a519d5ac64fca009639bc/Makefile#L59

Also, we must migrate everything in config/rbac/rbac_role.yaml and config/rbac/rbac_role_binding.yaml to annotations on our Reconciler, in the format:

// +kubebuilder:rbac:groups=cluster.k8s.io,resources=machines,verbs=get;list;watch;create;update;patch;delete

See https://github.com/kubernetes-sigs/cluster-api/blob/299c318690e526b7d8c704ab40a813c00ab7b0b0/pkg/controller/node/node_controller.go#L80

Note: We are not able to do the same with data in config/rbac/secrets_*.yaml, because there we use Role (and not ClusterRole), which is not yet supported by kubebuilder. See kubernetes-sigs/kubebuilder#401

Dockerize all local code

Suggested feature: if we wrap all the parts that run on local hardware in a Docker container, it would mitigate a lot of hardware-specific issues and ultimately streamline the app.

Missing input for argument [Username] in clusterapi-controllers pod log

Saw the following when using the latest master code, following the steps in https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/master/README.md.

Testing on macOS High Sierra 10.13.6 with Docker 17.03.0-ce and minikube 1.10.0. Ran clusterctl with:

./clusterctl create cluster --provider openstack -c examples/openstack/out/cluster.yaml -m examples/openstack/out/machines.yaml -p examples/openstack/out/provider-components.yaml --alsologtostderr --v 9 --vm-driver=hyperkit

After a few mins, my cluster state looks like:

kubectl get all
NAME                                         READY     STATUS             RESTARTS   AGE
po/clusterapi-apiserver-79bd7bdff-bg6wc      1/1       Running            0          8m
po/clusterapi-controllers-657dd5468b-fwwpr   1/2       CrashLoopBackOff   6          7m
po/etcd-clusterapi-0                         1/1       Running            0          8m

NAME                      CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
svc/clusterapi            10.96.13.229    <none>        443/TCP    8m
svc/etcd-clusterapi-svc   10.97.196.161   <none>        2379/TCP   8m
svc/kubernetes            10.96.0.1       <none>        443/TCP    8m

NAME                           KIND
statefulsets/etcd-clusterapi   StatefulSet.v1.apps

NAME                            DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deploy/clusterapi-apiserver     1         1         1            1           8m
deploy/clusterapi-controllers   1         1         1            0           7m

NAME                                   DESIRED   CURRENT   READY     AGE
rs/clusterapi-apiserver-79bd7bdff      1         1         1         8m
rs/clusterapi-controllers-657dd5468b   1         1         0         7m

Looking at the logs of the failing pod's (clusterapi-controllers) containers:

kubectl logs clusterapi-controllers-657dd5468b-fwwpr -c controller-manager
ERROR: logging before flag.Parse: I1001 14:43:46.032006       1 controller.go:83] Waiting for caches to sync for machine deployment controller
ERROR: logging before flag.Parse: E1001 14:43:46.034175       1 reflector.go:205] sigs.k8s.io/cluster-api/pkg/controller/sharedinformers/zz_generated.api.register.go:55: Failed to list *v1alpha1.Machine: Get https://localhost:8443/apis/cluster.k8s.io/v1alpha1/machines?limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
..... 
kubectl logs clusterapi-controllers-657dd5468b-fwwpr -c openstack-machine-controller
ERROR: logging before flag.Parse: F1001 14:59:56.629761       1 main.go:55] Could not create Openstack machine actuator: Create providerClient err: Missing input for argument [Username]
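
For reference, the actuator reads its auth values from a config in this format (cf. the "Use standard clouds.yaml file as config file" issue below); the error indicates that user-name ended up empty. A hypothetical filled-in example:

user-name: admin
password: $PASSWORD
domain-name: Default
tenant-id: 0123456789abcdef0123456789abcdef
region: RegionOne
auth-url: http://10.0.0.14:5000/v3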

Would appreciate any help on the above...thanks!

Make providerConfig keys consistent

The security_group and availability_zone keys are not consistent with the format used by other keys (floatingIP, etc.). Make these keys consistent to improve the user experience.
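
A hypothetical before/after of the rename (the exact target names would need agreement):

# current
security_group: default
availability_zone: nova
floatingIP: 203.0.113.10

# proposed, consistent with floatingIP
securityGroup: default
availabilityZone: nova
floatingIP: 203.0.113.10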

Implement basic cluster actuator

So far, the cluster actuator is not yet implemented, except for basic types.

I would like to see a way to set up networks, SecurityGroups, and load balancers for the cluster. As a first step, a single subnet for all nodes (master and worker) would IMHO be enough. Created Machines should use the created infrastructure.

Todo:

  • Create Network with subnet with a configured CIDR
  • Create a router connected to a configured external network, as well as the just created subnet
  • Create SecurityGroups for master and nodes
  • Create a loadbalancer with floating IP

The LB floating IP and the network ID must go into the cluster status. SecurityGroups too, somehow, but I am not yet sure how. Edit: the floating IP is not needed, because there is an APIEndpoint on ClusterStatus.

The loadbalancer should be used as the entry point for the apiserver, so we are able to create HA control planes.

Once this is done, we can update the machine actuator to use the new infrastructure; a rough sketch of the reconcile flow follows below.

See also: https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/master/pkg/cloud/aws/actuators/cluster/actuator.go
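
A rough sketch of the reconcile flow (all helpers are hypothetical wrappers around the corresponding Neutron/Octavia calls via gophercloud):

// Sketch of the cluster reconcile flow; helpers are placeholders.
func (a *Actuator) Reconcile(cluster *clusterv1.Cluster) error {
	network, subnet, err := a.createNetworkAndSubnet(cluster) // uses the configured CIDR
	if err != nil {
		return err
	}
	if err := a.createRouter(cluster, subnet); err != nil { // uplinks to the configured external network
		return err
	}
	if err := a.createSecurityGroups(cluster); err != nil { // one for masters, one for nodes
		return err
	}
	lb, err := a.createLoadBalancer(cluster, subnet) // with a floating IP for the apiserver
	if err != nil {
		return err
	}
	// Persist what machines need later: the network ID in the provider
	// status, the LB address as the APIEndpoint on ClusterStatus.
	return a.updateClusterStatus(cluster, network, lb)
}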

generate-yaml.sh does not work if $PWD contains spaces.

Currently, generate-yaml.sh breaks if the path it is run from contains spaces:

./generate-yaml.sh -c clouds.yaml                                    
/generate-yaml.sh: line 73: [: /mnt/c/Users/Daniel: binary operator expected
./generate-yaml.sh: line 78: [: /mnt/c/Users/Daniel: binary operator expected
./generate-yaml.sh: line 83: [: /mnt/c/Users/Daniel: binary operator expected
mkdir: cannot create directory ‘/mnt/c/Users/Daniel’: Permission denied

This can be solved by some strategic use of quotes.
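
A sketch of the fix (variable names are assumptions about the script's internals):

# Unquoted expansions undergo word splitting on spaces; quote them:
if [ ! -f "$PWD/clouds.yaml" ]; then    # instead of: [ ! -f $PWD/clouds.yaml ]
    echo "clouds.yaml not found" >&2
    exit 1
fi
mkdir -p "$OUTPUT_DIR"                  # instead of: mkdir -p $OUTPUT_DIR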

Move generate-yaml.sh to go version of yq

generate-yaml.sh currently uses the yq from https://github.com/kislyuk/yq, but the yq from https://github.com/mikefarah/yq is more widely available. The script should use the latter's syntax and detect when the former is installed instead.
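
For illustration, the same lookup in both dialects (path and file names are hypothetical):

# kislyuk/yq (a jq wrapper):
yq -r '.clouds.openstack.auth.username' clouds.yaml

# mikefarah/yq:
yq r clouds.yaml clouds.openstack.auth.username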

Add support for MachineSets

The current implementation doesn't support MachineSets, which limits the ways the actuator can be used.

The Name parameter and other parts of the implementation might need to be revisited as part of this implementation.

Openstack (gophercloud) Identity and Compute Clients timeout

In its current state, the Identity and Compute clients time out after (I believe) 3 hours, rendering the provider pod useless until a new token is issued: you then get a constant stream of "...Get service list err: Authentication failed". During testing you can work around this by recreating the pod, but for actual use cases we should leverage the ReAuth option in the gophercloud library, which defaults to false. Setting it to true for our use cases seems much more sensible.
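
The relevant knob is AllowReauth on gophercloud's AuthOptions; a minimal sketch:

import (
	"github.com/gophercloud/gophercloud"
	"github.com/gophercloud/gophercloud/openstack"
)

// Sketch: enable gophercloud's transparent re-authentication so an expired
// token is refreshed instead of every call failing with "Authentication failed".
func newProviderClient(opts gophercloud.AuthOptions) (*gophercloud.ProviderClient, error) {
	opts.AllowReauth = true
	return openstack.AuthenticatedClient(opts)
}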

Don't assume startup script is bash

We should not assume the startup script (see snippet below) is a bash script. In a cloud-init scenario this works great, but it won't work for images using Ignition.

Instead, we could assume the startup script contains gotemplate variables and use it as the template: if it does, it will be rendered; otherwise nothing happens and it is pushed into the node as-is. A sketch follows after the snippet.

https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/master/pkg/cloud/openstack/machineScript.go#L91-L114

const masterEnvironmentVars = `#!/bin/bash
KUBELET_VERSION={{ .Machine.Spec.Versions.Kubelet }}
VERSION=v${KUBELET_VERSION}
NAMESPACE={{ .Machine.ObjectMeta.Namespace }}
MACHINE=$NAMESPACE
MACHINE+="/"
MACHINE+={{ .Machine.ObjectMeta.Name }}
CONTROL_PLANE_VERSION={{ .Machine.Spec.Versions.ControlPlane }}
CLUSTER_DNS_DOMAIN={{ .Cluster.Spec.ClusterNetwork.ServiceDomain }}
POD_CIDR={{ .PodCIDR }}
SERVICE_CIDR={{ .ServiceCIDR }}
`

const nodeEnvironmentVars = `#!/bin/bash
KUBELET_VERSION={{ .Machine.Spec.Versions.Kubelet }}
TOKEN={{ .Token }}
MASTER={{ .MasterEndpoint }}
NAMESPACE={{ .Machine.ObjectMeta.Namespace }}
MACHINE=$NAMESPACE
MACHINE+="/"
MACHINE+={{ .Machine.ObjectMeta.Name }}
CLUSTER_DNS_DOMAIN={{ .Cluster.Spec.ClusterNetwork.ServiceDomain }}
POD_CIDR={{ .PodCIDR }}
SERVICE_CIDR={{ .ServiceCIDR }}
`
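
A sketch of the proposed behavior: treat the user-supplied script as a Go template, and let scripts without template actions pass through unchanged (the params argument stands in for the existing struct in machineScript.go):

import (
	"bytes"
	"text/template"
)

// Sketch: render the startup script as a Go template if possible; a script
// without template actions passes through unchanged.
func renderStartupScript(script string, params interface{}) (string, error) {
	tmpl, err := template.New("startup").Parse(script)
	if err != nil {
		// Not parseable as a template: push it into the node as-is.
		return script, nil
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, params); err != nil {
		return "", err
	}
	return buf.String(), nil
}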

Decoding machine status fails

Decode fails in the machineInstanceStatus method in instancestatus.go with the error:

no kind "Machine" is registered for version "cluster.k8s.io/v1alpha1"

Use Kustomize in generate-yaml.sh

Currently the generate-yaml.sh script uses bash and sed to build the yaml files passed to clusterctl. Environmental differences (macOS vs. Linux, tooling versions) make that script flaky.

I'd like to propose we move to Kustomize for the generation and reduce the dependence on bash. I'm happy to take this on and send a PR.
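
A minimal sketch of what the layout could look like (file names are hypothetical; kustomize patches would replace the sed substitutions):

# kustomization.yaml
resources:
- provider-components-base.yaml
patchesStrategicMerge:
- cloud-credentials-patch.yaml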

Register cluster-api/pkg/controller for MachineSet and MachineDeployment

After the cluster-api update in #112, the openstack controller no longer includes the controllers for MachineSet and MachineDeployment. Before, this was perhaps done by some magic?

Anyway, to register the controllers, we need to extend cmd/manager/main.go as follows:

import (
	clusterv1controller "sigs.k8s.io/cluster-api/pkg/controller"
)

func main() {
	...
	if err := clusterv1controller.AddToManager(mgr); err != nil {
		glog.Fatal(err)
	}
}

Use standard clouds.yaml file as config file

We're currently using a custom config file for cluster-api-provider-openstack, which is different from both the cloud-provider-openstack config file and the standard clouds.yaml file. The current config uses the following format:

user-name:
password: 
domain-name: 
tenant-id: 
region:
auth-url:

In order to provide a better user experience, it would be better to adopt either the config file format used by cloud-provider-openstack or, even better, the clouds.yaml format (gophercloud already supports the latter). Here's an example of what the yaml would look like:

clouds:
  my_cloud:
    insecure: true
    verify: false
    identity_api_version: 3
    auth:
      auth_url: http://10.0.0.14:5000/v3
      project_name: admin
      username: admin
      password: $SUPER_PASSWORD
      project_domain_name: Default
      user_domain_name: Default
    region: RegionOne
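
With gophercloud's clientconfig support, consuming that file could be as simple as the following sketch (package path and cloud name are assumptions; clientconfig searches the standard locations for clouds.yaml):

import (
	"github.com/gophercloud/gophercloud"
	"github.com/gophercloud/utils/openstack/clientconfig"
)

// Sketch: build a compute client directly from a clouds.yaml cloud entry.
func newComputeClient() (*gophercloud.ServiceClient, error) {
	return clientconfig.NewServiceClient("compute", &clientconfig.ClientOpts{
		Cloud: "my_cloud", // the key under "clouds:" above
	})
}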

Thoughts?

/cc @Lion-Wei

Find a better way to generate tokens

We're currently depending on kubeadm to generate tokens. This is far from ideal, as it requires shelling out, and it doesn't work in environments where kubeadm is not present.
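
For reference, bootstrap tokens have a documented format ([a-z0-9]{6}.[a-z0-9]{16}) and could be generated in-process; a sketch:

import (
	"crypto/rand"
	"fmt"
	"math/big"
)

const tokenChars = "abcdefghijklmnopqrstuvwxyz0123456789"

// randString returns n characters from the bootstrap-token alphabet.
func randString(n int) (string, error) {
	b := make([]byte, n)
	for i := range b {
		idx, err := rand.Int(rand.Reader, big.NewInt(int64(len(tokenChars))))
		if err != nil {
			return "", err
		}
		b[i] = tokenChars[idx.Int64()]
	}
	return string(b), nil
}

// generateBootstrapToken builds a token in the kubeadm format
// "[a-z0-9]{6}.[a-z0-9]{16}" without shelling out to kubeadm.
func generateBootstrapToken() (string, error) {
	id, err := randString(6)
	if err != nil {
		return "", err
	}
	secret, err := randString(16)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%s.%s", id, secret), nil
}

The generated token would still have to be registered as a bootstrap secret in kube-system for the join to work.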
