kubernetes-sigs / cluster-api-provider-openstack
Cluster API implementation for OpenStack
Home Page: https://cluster-api-openstack.sigs.k8s.io/
License: Apache License 2.0
See kubernetes-sigs/cluster-api#599.
Once that is in, defaults for the MachineDeployment controller will be set.
Examples:
cd $GOPATH/src/sigs.k8s.io/cluster-api-provider-openstack/clusterctl
---> this path no longer exists
clusterctl create cluster --provider openstack -c examples/openstack/out/cluster.yaml -m examples/openstack/out/machines.yaml -p examples/openstack/out/provider-components.yaml
---> the out/*.yaml paths do not match how the directory is laid out by default
Saw the following when using the latest master code. Used the steps in https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/master/README.md
Testing on macOS High Sierra 10.13.6 with Docker version 17.03.0-ce and minikube 1.10.0. Ran clusterctl with:
./clusterctl create cluster --provider openstack -c examples/openstack/out/cluster.yaml -m examples/openstack/out/machines.yaml -p examples/openstack/out/provider-components.yaml --alsologtostderr --v 9 --vm-driver=hyperkit
After a few mins, my cluster state looks like:
kubectl get all
NAME READY STATUS RESTARTS AGE
po/clusterapi-apiserver-79bd7bdff-bg6wc 1/1 Running 0 8m
po/clusterapi-controllers-657dd5468b-fwwpr 1/2 CrashLoopBackOff 6 7m
po/etcd-clusterapi-0 1/1 Running 0 8m
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
svc/clusterapi 10.96.13.229 <none> 443/TCP 8m
svc/etcd-clusterapi-svc 10.97.196.161 <none> 2379/TCP 8m
svc/kubernetes 10.96.0.1 <none> 443/TCP 8m
NAME KIND
statefulsets/etcd-clusterapi StatefulSet.v1.apps
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
deploy/clusterapi-apiserver 1 1 1 1 8m
deploy/clusterapi-controllers 1 1 1 0 7m
NAME DESIRED CURRENT READY AGE
rs/clusterapi-apiserver-79bd7bdff 1 1 1 8m
rs/clusterapi-controllers-657dd5468b 1 1 0 7m
Looking at the logs of the containers in the failing pod (clusterapi-controllers):
kubectl logs clusterapi-controllers-657dd5468b-fwwpr -c controller-manager
ERROR: logging before flag.Parse: I1001 14:43:46.032006 1 controller.go:83] Waiting for caches to sync for machine deployment controller
ERROR: logging before flag.Parse: E1001 14:43:46.034175 1 reflector.go:205] sigs.k8s.io/cluster-api/pkg/controller/sharedinformers/zz_generated.api.register.go:55: Failed to list *v1alpha1.Machine: Get https://localhost:8443/apis/cluster.k8s.io/v1alpha1/machines?limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
.....
kubectl logs clusterapi-controllers-657dd5468b-fwwpr -c openstack-machine-controller
ERROR: logging before flag.Parse: F1001 14:59:56.629761 1 main.go:55] Could not create Openstack machine actuator: Create providerClient err: Missing input for argument [Username]
Would appreciate any help on the above...thanks!
This call produces gibberish when working offline and no kubernetes-version is specified:
TOKEN=I1025 09:05:57.001785 15 version.go:89] could not fetch a Kubernetes version from the internet: unable to get URL "https://dl.k8s.io/release/stable-1.txt": Get https://dl.k8s.io/release/stable-1.txt: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
I1025 09:05:57.001886 15 version.go:94] falling back to the local client version: v1.12.1
pjpft6.5o7555anlw2y2b3a
FWIW, we should not be relying on kubeadm to generate tokens, but this is what we have now, so we ought to fix it. We could pass kubernetes-version to the call if it were a valid option in the TokenParams struct. Sadly, it isn't.
Currently the generate-yaml.sh script uses bash and sed to build the yaml files that are passed to clusterctl. Environmental differences (macOS vs. Linux, tooling versions) can make that script flaky.
I'd like to propose we move to Kustomize for the generation and reduce this reliance on bash. I'm happy to take this on and send a PR.
s/glog/klog/g
xref: https://groups.google.com/forum/#!topic/kubernetes-dev/7vnijOMhLS0
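For reference, a minimal sketch of what the swap looks like; klog deliberately mirrors the glog API, so existing call sites can stay unchanged:

package main

import (
	"flag"

	"k8s.io/klog" // was: "github.com/golang/glog"
)

func main() {
	// Unlike glog, klog does not register its flags in init(), so this
	// call is needed before flag.Parse().
	klog.InitFlags(nil)
	flag.Parse()

	// Existing call sites (Infof, Errorf, Fatal, V(2).Info, ...) keep working.
	klog.Info("controller starting")
}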
As soon as #150 is in, we also need a way to delete a cluster. This includes removing the resources created by the cluster reconciler.
This could be a bit more complicated than it sounds. Very likely, just deleting everything in reverse order won't work: when Kubernetes is configured to create OpenStack LoadBalancers, there might be LBs left in the subnet when a cluster is deleted. These must go away before the subnet can be deleted.
The same is true for volumes.
Currently, when the machine actuator creates VMs, it goes to OpenStack Compute and asks for one VM, then the next, again and again. This can result in all VMs landing on a single hypervisor.
Well, OpenStack provides a way to configure anti-affinity; see https://docs.openstack.org/python-openstackclient/pike/cli/command-objects/server-group.html.
For sure, this does not come without problems. As long as there are hypervisors without "our" VMs running on them, everything is fine. But once "our" VMs have been scheduled to all of the hypervisors, we will get a "no host found" error.
What we need is a kind of soft anti-affinity: as long as it is possible, OpenStack should schedule VMs on different hosts, but as soon as it no longer is, OpenStack should accept that it needs to deploy VMs to an already-used hypervisor.
We could probably create a ServerGroup in the cluster actuator for the master nodes. For worker nodes we might use a group of their own, but implement the fuzzy scheduling in the machine actuator, e.g. count the failures due to "no host found", and once that count exceeds $MAX, try to spin up the VM without a server group.
But I want to hear other opinions on this.
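For illustration, a minimal gophercloud sketch of creating such a server group (the group name is made up; the soft-anti-affinity policy requires Nova microversion 2.15 or newer):

package main

import (
	"fmt"

	"github.com/gophercloud/gophercloud"
	"github.com/gophercloud/gophercloud/openstack"
	"github.com/gophercloud/gophercloud/openstack/compute/v2/extensions/servergroups"
)

func main() {
	opts, err := openstack.AuthOptionsFromEnv()
	if err != nil {
		panic(err)
	}
	provider, err := openstack.AuthenticatedClient(opts)
	if err != nil {
		panic(err)
	}
	compute, err := openstack.NewComputeV2(provider, gophercloud.EndpointOpts{})
	if err != nil {
		panic(err)
	}
	// "soft-anti-affinity" spreads VMs across hypervisors but still schedules
	// when no empty hypervisor is left; the plain "anti-affinity" policy
	// would fail with "no valid host" in that case.
	compute.Microversion = "2.15"
	group, err := servergroups.Create(compute, servergroups.CreateOpts{
		Name:     "k8s-workers", // hypothetical name
		Policies: []string{"soft-anti-affinity"},
	}).Extract()
	if err != nil {
		panic(err)
	}
	fmt.Println("created server group", group.ID)
}

New servers would then join the group at boot time via the group scheduler hint.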
Looks like we use cc as the user ID in the image. Can we please default to the standard Ubuntu images documented in the OpenStack docs? https://docs.openstack.org/image-guide/obtain-images.html Please note that the login account there is ubuntu.
Allow for setting the SSH username in the providerConfig, if needed.
The kubeadm join on worker nodes is passed the wrong port (443) for the kube-apiserver on the master node, i.e.
kubeadm join --token <redacted> 10.10.10.1:443 --ignore-preflight-errors=all --discovery-token-unsafe-skip-ca-verification
[preflight] running pre-flight checks
[discovery] Trying to connect to API Server "10.10.10.1:443"
[discovery] Created cluster-info discovery client, requesting info from "https://10.10.10.1:443"
The configured port for the kube-apiserver is 6443, resulting in worker nodes being unable to join the cluster.
This would allow people to pass the path to an already filled-out clouds.conf into generate-yaml.sh via a command-line option. Something like this would make sense:
generate-yaml.sh --conf your/path
/assign @iamemilio
Give clients a way to enter a value for OS_CLOUD; see ../../pkg/cloud/openstack/clients/machineservice.go.
We're currently using a custom config file for cluster-api-provider-openstack, which is different from both the cloud-provider-openstack config file and the standard clouds.yaml file. The current config uses the following format:
user-name:
password:
domain-name:
tenant-id:
region:
auth-url:
In order to provide a better user experience, it'd be better to either adopt the config file format used by cloud-provider-openstack or, even better, the clouds.yaml format (gophercloud already has support for the latter). Here's an example of what the yaml would look like:
clouds:
  my_cloud:
    insecure: true
    verify: false
    identity_api_version: 3
    auth:
      auth_url: http://10.0.0.14:5000/v3
      project_name: admin
      username: admin
      password: $SUPER_PASSWORD
      project_domain_name: Default
      user_domain_name: Default
    region: RegionOne
Thoughts?
/cc @Lion-Wei
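For reference, a rough sketch of consuming such a clouds.yaml through gophercloud's clientconfig helper (from github.com/gophercloud/utils); the cloud name matches the example above:

package main

import (
	"fmt"

	"github.com/gophercloud/gophercloud/openstack"
	"github.com/gophercloud/utils/openstack/clientconfig"
)

func main() {
	// clientconfig resolves clouds.yaml from the standard search paths
	// (current dir, ~/.config/openstack/, /etc/openstack/).
	ao, err := clientconfig.AuthOptions(&clientconfig.ClientOpts{Cloud: "my_cloud"})
	if err != nil {
		panic(err)
	}
	provider, err := openstack.AuthenticatedClient(*ao)
	if err != nil {
		panic(err)
	}
	fmt.Println("authenticated against", provider.IdentityEndpoint)
}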
In this controller, we only shipped the files in config/rbac/rbac_*.yaml. In theory, these should be generated by controller-gen for us.
To get this running, we need to include it in our Makefile; see https://github.com/kubernetes-sigs/cluster-api/blob/2edce9534dce0be00f9a519d5ac64fca009639bc/Makefile#L59
Also, we must migrate everything in config/rbac/rbac_role.yaml and config/rbac_role_binding.yaml to annotations on our Reconciler, in the format of:
// +kubebuilder:rbac:groups=cluster.k8s.io,resources=machines,verbs=get;list;watch;create;update;patch;delete
Note: We are not able to do the same with the data in config/rbac/secrets_*.yaml, because there we use a Role (and not a ClusterRole), which is not yet supported by kubebuilder. See kubernetes-sigs/kubebuilder#401.
We're currently depending on kubeadm to generate tokens. This is far from ideal, as it requires shelling out, and it doesn't work in environments where kubeadm is not present.
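A minimal sketch of generating a kubeadm-compatible bootstrap token ([a-z0-9]{6}.[a-z0-9]{16}) without shelling out; the function names are illustrative:

package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
)

const tokenChars = "abcdefghijklmnopqrstuvwxyz0123456789"

func randToken(n int) (string, error) {
	b := make([]byte, n)
	for i := range b {
		idx, err := rand.Int(rand.Reader, big.NewInt(int64(len(tokenChars))))
		if err != nil {
			return "", err
		}
		b[i] = tokenChars[idx.Int64()]
	}
	return string(b), nil
}

// generateBootstrapToken returns a token in kubeadm's "abcdef.0123456789abcdef"
// format: a 6-character public ID and a 16-character secret.
func generateBootstrapToken() (string, error) {
	id, err := randToken(6)
	if err != nil {
		return "", err
	}
	secret, err := randToken(16)
	if err != nil {
		return "", err
	}
	return id + "." + secret, nil
}

func main() {
	token, err := generateBootstrapToken()
	if err != nil {
		panic(err)
	}
	fmt.Println(token)
}

The token would still have to be written as a bootstrap-token secret in kube-system for the join to work.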
Networks and SecurityGroups currently only accept UUIDs. It'd be great to support other filters, like names, or, even better, filtering on other parameters.
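A rough gophercloud sketch of resolving a network by name instead of UUID (the network name is made up); security groups could be filtered the same way via their own ListOpts:

package main

import (
	"fmt"

	"github.com/gophercloud/gophercloud"
	"github.com/gophercloud/gophercloud/openstack"
	"github.com/gophercloud/gophercloud/openstack/networking/v2/networks"
)

func main() {
	opts, err := openstack.AuthOptionsFromEnv()
	if err != nil {
		panic(err)
	}
	provider, err := openstack.AuthenticatedClient(opts)
	if err != nil {
		panic(err)
	}
	neutron, err := openstack.NewNetworkV2(provider, gophercloud.EndpointOpts{})
	if err != nil {
		panic(err)
	}
	// Filter server-side by name; other ListOpts fields allow filtering
	// on status, tenant, and more.
	pages, err := networks.List(neutron, networks.ListOpts{Name: "k8s-cluster-net"}).AllPages()
	if err != nil {
		panic(err)
	}
	nets, err := networks.ExtractNetworks(pages)
	if err != nil {
		panic(err)
	}
	for _, n := range nets {
		fmt.Println(n.ID, n.Name)
	}
}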
After the update of cluster-api in #112, the openstack controller doesn't include the controllers for MachineSet and MachineDeployment any more. Before, this was maybe done by some magic?
Anyway, to register the controllers, we need to extend cmd/manager/main.go by:
import (
	clusterv1controller "sigs.k8s.io/cluster-api/pkg/controller"
)

func main() {
	...
	// Register the upstream cluster-api controllers (MachineSet,
	// MachineDeployment, ...) with the shared controller manager.
	if err := clusterv1controller.AddToManager(mgr); err != nil {
		glog.Fatal(err)
	}
}
generate-yaml.sh currently uses the yq found at https://github.com/kislyuk/yq, but the yq found at https://github.com/mikefarah/yq is more widely available. The script should use the syntax of the latter and detect if the other is installed instead.
We should not assume the startup script (see snippet below) is a bash script. In a cloud-init scenario this works great, but it won't work for images using Ignition.
Instead, we could assume the startup script contains gotemplate variables and use it as the template. If it does, it'll be rendered; otherwise nothing will happen and it'll be pushed to the node as-is.
const masterEnvironmentVars = `#!/bin/bash
KUBELET_VERSION={{ .Machine.Spec.Versions.Kubelet }}
VERSION=v${KUBELET_VERSION}
NAMESPACE={{ .Machine.ObjectMeta.Namespace }}
MACHINE=$NAMESPACE
MACHINE+="/"
MACHINE+={{ .Machine.ObjectMeta.Name }}
CONTROL_PLANE_VERSION={{ .Machine.Spec.Versions.ControlPlane }}
CLUSTER_DNS_DOMAIN={{ .Cluster.Spec.ClusterNetwork.ServiceDomain }}
POD_CIDR={{ .PodCIDR }}
SERVICE_CIDR={{ .ServiceCIDR }}
`
const nodeEnvironmentVars = `#!/bin/bash
KUBELET_VERSION={{ .Machine.Spec.Versions.Kubelet }}
TOKEN={{ .Token }}
MASTER={{ .MasterEndpoint }}
NAMESPACE={{ .Machine.ObjectMeta.Namespace }}
MACHINE=$NAMESPACE
MACHINE+="/"
MACHINE+={{ .Machine.ObjectMeta.Name }}
CLUSTER_DNS_DOMAIN={{ .Cluster.Spec.ClusterNetwork.ServiceDomain }}
POD_CIDR={{ .PodCIDR }}
SERVICE_CIDR={{ .ServiceCIDR }}
`
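A minimal sketch of the proposed behavior, assuming the actuator keeps passing the same params that feed these templates today; a script with no template actions comes back byte-for-byte unchanged:

package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// renderStartupScript treats the user-supplied startup script as a Go
// template. Plain cloud-init or Ignition payloads contain no {{ ... }}
// actions, so they pass through untouched.
func renderStartupScript(script string, params interface{}) (string, error) {
	tmpl, err := template.New("startup-script").Parse(script)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, params); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// Illustrative params; the real code would pass its setup-params struct.
	out, err := renderStartupScript("TOKEN={{ .Token }}\n", struct{ Token string }{"abcdef.0123456789abcdef"})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}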
The current implementation doesn't have support for MachineSets, which limits the way the actuator can be used.
The Name parameter and other parts of the implementation might need to be revisited as part of this implementation.
Relates to #78
When the cluster-api openstack controller is deployed to the actual cluster, it uses different secrets than when running in the minikube phase. Because of this, the controller is not able to create bootstrap tokens and is thus not able to create new worker nodes.
E1107 10:46:52.740308 1 machineactuator.go:383] Machine error: error creating Openstack instance: secrets is forbidden: User "system:serviceaccount:openstack-provider-system:default" cannot create resource "secrets" in API group "" in the namespace "kube-system"
Todo:
Extend the ClusterRole openstack-provider-manager-role to be able to create secrets in kube-system.
Looks like this is the culprit:
Our implementation currently focuses on using the Kubernetes OpenStack built-ins, which are deprecated. We should make the switch to cloud-provider-openstack once "Use standard clouds.yaml file as config file" closes.
Please add periodic jobs to https://k8s-testgrid.appspot.com/sig-cluster-lifecycle-cluster-api-provider-openstack to give feedback on CI signal.
As identified by @flaper87, we use a different authentication config than cloud-provider-openstack, which can lead to confusion and prevents reuse of code. In addition to enabling authentication with a standard OpenStack clouds.yaml file, which is being addressed in #16, we should also adopt the cloud-provider-openstack structure for traditional credentials.
Nodes deployed using cluster-api-provider-openstack can safely be assumed to be running in an OpenStack environment, and therefore should be configured with a cloud.conf to enable the Kubernetes-OpenStack integration.
xrefs:
OpenStack Cloud Provider
GCE example implementation
GCE example kubelet configuration
OpenstackProviderConfig can be separated into cluster scope and machine scope.
generate-yaml.sh failed with "./generate-yaml.sh: 100: read: Illegal option -s" on Ubuntu.
That's because Ubuntu's default shell is dash, which does not support read with the -s option.
The shebang of generate-yaml.sh should be /bin/bash instead of /bin/sh.
Now that cluster-api has migrated to using kubebuilder with CRDs instead of apiserver-builder and API aggregation (ref: kubernetes-sigs/cluster-api#494), we need to sync the openstack provider with upstream to use kubebuilder as well.
Decode fails in the method machineInstanceStatus in instancestatus.go with the error:
no kind "Machine" is registered for version "cluster.k8s.io/v1alpha1"
So far, the cluster actuator is not yet implemented, except for basic types.
I would like to see a way to set up Networks, SecurityGroups, and LoadBalancers for the cluster. In a first step, a single subnet for all nodes (master and worker) would IMHO be enough. Created Machines should use the created infrastructure.
Todo:
The LB floating IP and NetworkID must go into the cluster status. SecurityGroups too, somehow, but I am not yet sure how. Edit: the floating IP is not needed because there is an APIEndpoint on ClusterStatus.
The loadbalancer should be used as the entry point for the apiserver, so we are able to create HA control planes.
Once this is done, we can update the machine actuator to use the infrastructure.
We faced interference with ufw running on Ubuntu 18.04.
IMHO we should disable ufw by default, no matter whether the Ubuntu version is 16.04 or 18.04.
On both versions, adding this to the startup script of master and node should do the trick:
ufw disable
In its current state, the Identity and Compute clients time out after 3 hours (I believe), rendering the provider pod useless until a new token is issued. You then get a constant stream of "...Get service list err: Authentication failed". During testing you can work around this by recreating the pod, but for actual use cases we should leverage the ReAuth option in the gophercloud library, which defaults to false. Setting it to true for our use cases seems much more sensible.
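Concretely, the relevant knob is the AllowReauth field on gophercloud.AuthOptions; a minimal sketch:

package main

import (
	"fmt"

	"github.com/gophercloud/gophercloud/openstack"
)

func main() {
	opts, err := openstack.AuthOptionsFromEnv()
	if err != nil {
		panic(err)
	}
	// Re-authenticate transparently when the token expires, instead of
	// failing every call with "Authentication failed" until the pod
	// is recreated.
	opts.AllowReauth = true

	provider, err := openstack.AuthenticatedClient(opts)
	if err != nil {
		panic(err)
	}
	fmt.Println("authenticated against", provider.IdentityEndpoint)
}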
The current implementation is not passing all the required data to the auth step, resulting in an authentication failure.
We currently depend on having clouds.yaml in one of the standard paths. In a more k8s fashion, it'd be great if the actuator could get the secret itself (assuming there's a service account token in the pod) and use the data in that secret to create the openstack client.
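A rough sketch of what that could look like with client-go of that era (pre-context API); the secret name, namespace, and key are all made up:

package main

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// cloudsYAMLFromSecret fetches clouds.yaml out of a Kubernetes secret using
// the pod's service account credentials.
func cloudsYAMLFromSecret(namespace, name string) ([]byte, error) {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		return nil, err
	}
	cs, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		return nil, err
	}
	secret, err := cs.CoreV1().Secrets(namespace).Get(name, metav1.GetOptions{})
	if err != nil {
		return nil, err
	}
	return secret.Data["clouds.yaml"], nil
}

func main() {
	data, err := cloudsYAMLFromSecret("openstack-provider-system", "cloud-credentials")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s", data)
}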
Currently, generate-yaml.sh breaks if the path it's run in contains spaces:
./generate-yaml.sh -c clouds.yaml
./generate-yaml.sh: line 73: [: /mnt/c/Users/Daniel: binary operator expected
./generate-yaml.sh: line 78: [: /mnt/c/Users/Daniel: binary operator expected
./generate-yaml.sh: line 83: [: /mnt/c/Users/Daniel: binary operator expected
mkdir: cannot create directory ‘/mnt/c/Users/Daniel’: Permission denied
This can be solved by some strategic use of quotes, i.e. quoting the variable expansions inside the test expressions (e.g. [ -d "$dir" ] rather than [ -d $dir ]).
Jobs we need to add:
This would make the script much more extensible and simpler.
xref: kubernetes-sigs/cluster-api#572
/assign @chrigl
Suggested feature: if we wrap all the parts that run on local hardware in a Docker container, it would mitigate a lot of hardware-specific issues and ultimately streamline the app.
There have been changes upstream to rename ProviderConfig to ProviderSpec. We need to pull in the upstream changes and update our references and documentation.
The security_group and availability_zone keys are not consistent with the format used by other keys (floatingIP, etc.). Make these keys consistent to improve the user experience.