att-comdev / halcyon-kubernetes
Ansible playbooks for a kubeadm-based kubernetes deployment, supporting any cloud and any kubeadm-enabled OS.
License: Apache License 2.0
The proxy_enable setting added in #28 is not defined in group_vars, which leads to the following error:
fatal: [ravi-kube196]: FAILED! => {"failed": true, "msg": "The conditional check 'docker_shared_mounts or proxy_enable' failed. The error was: error while evaluating conditional (docker_shared_mounts or proxy_enable): 'proxy_enable' is undefined\n\nThe error appears to have been in '/Users/rmehra/esupport/code/halcyon-vagrant-kubernetes/halcyon-kubernetes/kube-deploy/roles/deploy-kube/tasks/ubuntu.yml': line 20, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: setting up docker unit drop-in dir\n ^ here\n"}
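A minimal sketch of a fix, assuming the playbook keeps its defaults in group_vars (the all.yml file name is an assumption):

# group_vars/all.yml -- give the new setting a default so the conditional can evaluate
proxy_enable: false

Alternatively, the conditional itself could tolerate the missing variable:

when: docker_shared_mounts or (proxy_enable | default(false))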
This is more of a usability issue. I was running this playbook multiple times and resetting in-between. I forgot to remove the deployed /etc/systemd/system/kubelet.service.d/15-hostname-override.conf file and was constantly hitting #35. It would be nice if the file were removed (state: absent) when kubelet_hostname_override: false, as in the sketch below.
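A minimal sketch of such a cleanup task, assuming the existing kubelet_hostname_override variable gates it:

- name: remove the kubelet hostname-override drop-in when disabled
  file:
    path: /etc/systemd/system/kubelet.service.d/15-hostname-override.conf
    state: absent
  when: not kubelet_hostname_override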
When the kubelet_hostname_override setting is enabled, kube-proxy gives the following startup errors:
E0112 06:07:10.274956 1 server.go:421] Can't get Node "kube1", assuming iptables proxy, err: nodes "kube1" not found
I0112 06:07:10.276098 1 server.go:215] Using iptables Proxier.
W0112 06:07:10.277165 1 server.go:468] Failed to retrieve node info: nodes "kube1" not found
W0112 06:07:10.277296 1 proxier.go:249] invalid nodeIP, initialize kube-proxy with 127.0.0.1 as nodeIP
W0112 06:07:10.277347 1 proxier.go:254] clusterCIDR not specified, unable to distinguish between internal and external traffic
I0112 06:07:10.277415 1 server.go:227] Tearing down userspace rules.
I0112 06:07:10.287037 1 conntrack.go:81] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
I0112 06:07:10.287505 1 conntrack.go:66] Setting conntrack hashsize to 32768
I0112 06:07:10.287732 1 conntrack.go:81] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
I0112 06:07:10.287818 1 conntrack.go:81] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600
I'm not sure of the symptoms of this error. I was having multiple networking issues, so it was hard to correlate which error messages caused which issues. The errors can be cleared, though, by passing the --hostname-override setting to kube-proxy as well as kubelet, which is apparently required according to kubernetes/kubernetes#18104 (comment).
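That fix isn't confirmed as this project's method, but on kubeadm clusters kube-proxy typically runs as a DaemonSet in kube-system, so a hedged sketch is to add the flag to its container command (fragment only; the existing flags are elided, and the node name is just the example from the logs above):

# fragment of the kube-proxy DaemonSet pod spec (kubectl -n kube-system edit ds kube-proxy)
containers:
- name: kube-proxy
  command:
  - kube-proxy
  # existing flags elided; use the same override value the kubelet is given
  - --hostname-override=kube1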
Self-explanatory: Reviewable causes more confusion than it solves in reviews.
I have two IP addresses on my "public_iface" interface. Commands like ip addr show {{ public_iface }} | grep "inet\b" | awk '{print $2}' | cut -d/ -f1
will grab both addresses, which makes subsequent commands fail.
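A hedged sketch of a more robust lookup that takes only the first IPv4 address; the task name and registered variable are illustrative, not the playbook's actual names:

- name: get the first IPv4 address on the public interface
  shell: ip -4 -o addr show {{ public_iface }} | awk '{print $4}' | cut -d/ -f1 | head -n 1
  register: public_ipv4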
one nice (and extremely simple) change that we can make is to have kubectl auto-completion. this can be done with a simple ansible lineinfile that adds source <(kubectl completion bash) to ~/.bashrc.
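a minimal sketch of that task (the rc file path and task name are assumptions):

- name: enable kubectl bash auto-completion
  lineinfile:
    path: ~/.bashrc          # the connecting user's shell rc; adjust per target user
    line: source <(kubectl completion bash)
    state: present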
i can get back to this later, but if someone in the community wants to add to the playbooks...that's fine also. i just wanted to create the feature request before i forget it (as i'm working on something else).
similar to CoreOS and #54, a lot of people would like to see support for Atomic as well.
this issue is just to track this effort and let the community know that there is an intention to support more base OSes. Ceph support could potentially be easier on Atomic than on CoreOS, and it's something we'd like to have in place in order to support both openstack-helm and kolla-kubernetes.
Kubernetes 1.5 now checks for the --cluster-cidr flag and will give a warning without it. See kubernetes/kubernetes#39440. There is some discussion of how kubeadm could resolve it automatically in the future, but for now it looks like we'll have to set it ourselves.
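Until kubeadm handles this automatically, one hedged way to set it ourselves is to append --cluster-cidr to the kube-proxy command in its DaemonSet, matching the pod network CIDR of whichever SDN is deployed (the 10.244.0.0/16 value below is just flannel's conventional default, an assumption):

# fragment of the kube-proxy DaemonSet command; other flags elided
- --cluster-cidr=10.244.0.0/16   # must match the deployed SDN's pod network CIDR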
As a result of moby/moby#30083, pulls from gcr.io are not working in all cases.
Thanks for the great project. Do you have any plans to support k8s 1.5+ in an offline environment?
add ci testing to this repo/deployment.
It was determined that support for KVM was necessary for some individuals who wish to use the halcyon-kubernetes project for their development needs.
i think we need to add multiple cni-enabled sdn providers to the set of playbooks. i would like to see this done in the following (somewhat opinionated) way:
role:
  kube-sdn:
    tasks:
      main.yml (referring to other sdn playbooks)
      calico.yml
      canal.yml
      romana.yml
      weave.yml
i would like to add an "sdn bootstrapped" artifact at /etc/kubernetes/halcyon/network/.sdn, and include the following output that can then be used later to identify which SDN is deployed:
ubuntu@kube1:~$ cat /etc/kubernetes/halcyon/network/.sdn
# Halcyon Network Deployment:
kube_sdn_deploy: {{ kube_sdn_deploy }}
kube_sdn: {{ kube_sdn }}
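a minimal sketch of tasks that could write that artifact (the module choice and task names are assumptions, not the project's confirmed approach):

- name: ensure the halcyon network state dir exists
  file:
    path: /etc/kubernetes/halcyon/network
    state: directory

- name: record which sdn was bootstrapped
  copy:
    dest: /etc/kubernetes/halcyon/network/.sdn
    content: |
      # Halcyon Network Deployment:
      kube_sdn_deploy: {{ kube_sdn_deploy }}
      kube_sdn: {{ kube_sdn }}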
if any of the sdn folks are interested in this, that's fine...i can pick this up later as well.
When deploying Romana on worker nodes spun up with vagrant, the romana agent pods remain in a CrashLoopBackOff state. Describing the pods shows:
8m 8m 1 {kubelet 172.16.35.13} spec.containers{romana-agent} Normal Started Started container with docker id 1241efd23e9a
8m 8m 1 {kubelet 172.16.35.13} spec.containers{romana-agent} Normal Created Created container with docker id 1241efd23e9a; Security:[seccomp=unconfined]
8m 5m 13 {kubelet 172.16.35.13} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "romana-agent" with CrashLoopBackOff: "Back-off 2m40s restarting failed container=romana-agent pod=romana-agent-dgth5_kube-system(0eada120-aad2-11e6-831b-02d0b043c29f)"
The docker logs for the container show:
ubuntu@kube1:~$ sudo docker logs 99bda40ea34c
Error: Unable to fetch list of hosts using 'http://10.99.99.99:9600'
run-romana-agent: entrypoint for romana services container.
Options:
-h or --help: print usage
--romana-root: URL for Romana Root service, eg: http://127.0.0.1:9600
--interface: Interface for IP Address lookup, instead of eth0
--nat: Add NAT rule for traffic coming from pods
--nat-interface: Interface that NATs traffic, instead of eth0 or --interface setting.
--cluster-ip-cidr: CIDR for cluster IPs. Excluded from NAT rule.
--pod-to-host: Permit pods to connect to the host server
The vagrant-proxyconf plugin doesn't properly set up the docker proxy for CentOS and requires a vagrant reload after docker is installed, which causes problems with the ansible playbook.
@mwgiles (https://github.com/mwgiles) has proposed a fix (https://github.com/portdirect/halcyon-kubernetes/pull/1) that should be brought into this repo to resolve this issue.
for some reason, people see "vagrant" and immediately think that this is only for a local lab, when in fact we're allowing for multiple vagrant provider deployments. so in order to address this concern, i think these playbooks need to be broken away from the vagrant (or any terraform) solution. so the repos would look like this:
halcyon-terraform-kubernetes:
- terraform deployments that use halcyon-kubernetes as a submodule.
halcyon-vagrant-kubernetes:
- vagrant deployments that use halcyon-kubernetes as a submodule.
halcyon-kubernetes:
- just the necessary ansible playbooks that are portable and can be used as a Galaxy role, or as a submodule for other deployments.
i think this will make the most sense for other people long term.
It would be great if the project could help deploy kubernetes to other architectures like arm/arm64. The project as-is already almost gets a working install since recent kubeadm versions have good multi-platform support built in.
Here are my post-playbook steps for getting to a happy state.
sudo kubectl delete -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
curl -sSL https://rawgit.com/coreos/flannel/master/Documentation/kube-flannel.yml | sed "s/amd64/arm64/g" | sudo kubectl create -f -
sudo kubectl delete -f https://rawgit.com/kubernetes/dashboard/master/src/deploy/kubernetes-dashboard.yaml
curl -sSL https://rawgit.com/kubernetes/dashboard/master/src/deploy/kubernetes-dashboard.yaml | sed "s/amd64/arm64/g" | sudo kubectl create -f -
sudo kubectl delete pods --all --namespace=kube-system && sudo kubectl delete pods --all
Helm/tiller doesn't currently have arm64 support, but it does have 32-bit arm support and it can be deployed with:
curl -L http://storage.googleapis.com/kubernetes-helm/helm-canary-linux-arm.tar.gz | tar zxv --strip 1 -C /tmp; chmod +x /tmp/helm; sudo mv /tmp/helm /usr/local/bin/helm
sudo /usr/local/bin/helm init
To make these steps work with this project, I'm thinking either ansible could detect the platform with uname -m, or it could be user-specified in group_vars/all.yml. Then ansible would just modify the dashboard and flannel manifests with the specified architecture before installing them, as in the sketch below.
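A hedged sketch of the detection side, using the ansible_architecture fact instead of shelling out to uname -m; the variable name, arch map, and manifest path are assumptions:

- name: map the machine architecture to a kubernetes image arch
  set_fact:
    kube_arch: "{{ {'x86_64': 'amd64', 'aarch64': 'arm64', 'armv7l': 'arm'}.get(ansible_architecture, ansible_architecture) }}"

- name: rewrite the flannel manifest for this architecture
  replace:
    path: /tmp/kube-flannel.yml   # hypothetical local copy of the manifest
    regexp: amd64
    replace: "{{ kube_arch }}"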
I got a fatal error message during "vagrant up"; below is the error message:
➜ halcyon-vagrant-kubernetes git:(master) ✗ ./setup-halcyon.sh --k8s-config kolla --k8s-version v1.5.2 --guest-os centos
➜ halcyon-vagrant-kubernetes git:(master) ✗ vagrant up
...
TASK [kube-init : initialize the kubernetes master] ****************************
fatal: [kube1]: FAILED! => {
"changed":true,
"cmd":"kubeadm init --token ac7da3.b2cofcda6ab01976 --use-kubernetes-version v1.5.2 --api-advertise-addresses 172.16.35.11",
"delta":"0:00:00.067388",
"end":"2017-04-17 06:39:06.573168",
"failed":true,
"rc":1,
"start":"2017-04-17 06:39:06.505780",
"stderr":"Error: unknown flag: --use-kubernetes-version\nUsage:\n kubeadm init [flags]\n\nFlags:\n --apiserver-advertise-address string The IP address the API Server will advertise it's listening on. 0.0.0.0 means the default network interface's address.\n --apiserver-bind-port int32 Port for the API Server to bind to (default 6443)\n --apiserver-cert-extra-sans stringSlice Optional extra altnames to use for the API Server serving cert. Can be both IP addresses and dns names.\n --cert-dir string The path where to save and store the certificates (default "/etc/kubernetes/pki")\n --config string Path to kubeadm config file (WARNING: Usage of a configuration file is experimental)\n --kubernetes-version string Choose a specific Kubernetes version for the control plane (default "v1.6.0")\n --pod-network-cidr string Specify range of IP addresses for the pod network; if set, the control plane will automatically allocate CIDRs for every node\n --service-cidr string Use alternative range of IP address for service VIPs (default "10.96.0.0/12")\n --service-dns-domain string Use alternative domain for services, e.g. "myorg.internal" (default "cluster.local")\n --skip-preflight-checks Skip preflight checks normally run before modifying the system\n --token string The token to use for establishing bidirectional trust between nodes and masters.\n --token-ttl duration The duration before the bootstrap token is automatically deleted. 0 means 'never expires'.",
"stderr_lines":[
"Error: unknown flag: --use-kubernetes-version",
"Usage:",
" kubeadm init [flags]",
"",
"Flags:",
" --apiserver-advertise-address string The IP address the API Server will advertise it's listening on. 0.0.0.0 means the default network interface's address.",
" --apiserver-bind-port int32 Port for the API Server to bind to (default 6443)",
" --apiserver-cert-extra-sans stringSlice Optional extra altnames to use for the API Server serving cert. Can be both IP addresses and dns names.",
" --cert-dir string The path where to save and store the certificates (default "/etc/kubernetes/pki")",
" --config string Path to kubeadm config file (WARNING: Usage of a configuration file is experimental)",
" --kubernetes-version string Choose a specific Kubernetes version for the control plane (default "v1.6.0")",
" --pod-network-cidr string Specify range of IP addresses for the pod network; if set, the control plane will automatically allocate CIDRs for every node",
" --service-cidr string Use alternative range of IP address for service VIPs (default "10.96.0.0/12")",
" --service-dns-domain string Use alternative domain for services, e.g. "myorg.internal" (default "cluster.local")",
" --skip-preflight-checks Skip preflight checks normally run before modifying the system",
" --token string The token to use for establishing bidirectional trust between nodes and masters.",
" --token-ttl duration The duration before the bootstrap token is automatically deleted. 0 means 'never expires'."
],
"stdout":"",
"stdout_lines":[
]
}
...
Then I accessed my kube1 VM and checked the kubeadm version. Below is the kubeadm version information from my kube1 VM:
[vagrant@kube1 ~]$ kubeadm version
kubeadm version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.1", GitCommit:"b0b7a323cc5a4a2019b2e9520c21c7830b7f708e", GitTreeState:"clean", BuildDate:"2017-04-03T20:33:27Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}
Please check the changelog below; the "--use-kubernetes-version" flag was changed:
https://github.com/kubernetes/kops/blob/master/vendor/k8s.io/kubernetes/CHANGELOG.md
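For reference, a hedged sketch of the equivalent task against kubeadm 1.6, using the renamed flags shown in the usage text above (--kubernetes-version and --apiserver-advertise-address), with the token and address copied from the failing run:

- name: initialize the kubernetes master
  command: >
    kubeadm init
    --token ac7da3.b2cofcda6ab01976
    --kubernetes-version v1.5.2
    --apiserver-advertise-address 172.16.35.11
  # note: kubeadm v1.6 may also refuse to deploy a v1.5.x control plane,
  # so the requested version may need bumping alongside the flag rename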
many people would like to see this feature added, especially for projects like openstack-helm and kolla-kubernetes.
i'd like to see CoreOS supported for the project, and this issue is just to track this and let the community know that this is the project's intention. i know we're going to run into issues with Ceph support, but it's possible that this could get easier with things like a Ceph Helm Chart.
i would like to add centos tasks to these playbooks as well. this should be done in a somewhat opinionated way (i've sort of started the framework):
roles:
  playbook:
    main.yml (referring to centos/ubuntu tasks)
    centos.yml
    ubuntu.yml
    {{ future_os_support.yml }}
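a minimal sketch of what that main.yml dispatch could look like (lowercasing the distribution fact to match the task file names is an assumption):

# roles/<role>/tasks/main.yml -- hypothetical dispatch
- name: include os-specific tasks
  include_tasks: "{{ ansible_distribution | lower }}.yml"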