gluster / gluster-kubernetes
GlusterFS Native Storage Service for Kubernetes
License: Apache License 2.0
Any heketi-cli command failure does not trigger an abort; the script just proceeds with the remaining steps.
When I try without -g, it always outputs an error saying the gluster templates are already present.
I get the following error on the Kubernetes master when using the quickstart guide https://github.com/gluster/gluster-kubernetes#quickstart
[root@master deploy]# ./gk-deploy -g
Starting Kubernetes deployment.
serviceaccount "heketi-service-account" created
deployment "glusterfs-node0" created
deployment "glusterfs-node1" created
deployment "glusterfs-node2" created
Waiting for GlusterFS pods to start ... OK
service "deploy-heketi" created
deployment "deploy-heketi" created
Waiting for deploy-heketi pod to start ... OK
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 17 100 17 0 0 511 0 --:--:-- --:--:-- --:--:-- 531
Creating cluster ... ID: 73f2e1d87c54137b27f52285e4b22a14
Creating node node0 ... ID: 5dcc52cc01a199a59386e7863602e983
Adding device /dev/vdb ... Unable to add device: Unable to execute command on glusterfs-node0-2509304327-edwgn: Device /dev/vdb not found (or ignored by filtering).
Adding device /dev/vdc ... Unable to add device: Unable to execute command on glusterfs-node0-2509304327-edwgn: Device /dev/vdc not found (or ignored by filtering).
Adding device /dev/vdd ... Unable to add device: Unable to execute command on glusterfs-node0-2509304327-edwgn: Device /dev/vdd not found (or ignored by filtering).
Creating node node1 ... ID: 173452ef339f96efd9b1d089469cb572
Adding device /dev/vdb ... Unable to add device: Unable to execute command on glusterfs-node1-3290690057-kbbmh: Device /dev/vdb not found (or ignored by filtering).
Adding device /dev/vdc ... Unable to add device: Unable to execute command on glusterfs-node1-3290690057-kbbmh: Device /dev/vdc not found (or ignored by filtering).
Adding device /dev/vdd ... Unable to add device: Unable to execute command on glusterfs-node1-3290690057-kbbmh: Device /dev/vdd not found (or ignored by filtering).
Creating node node2 ... ID: 27ace16e97f55dffebce8126844a87df
Adding device /dev/vdb ... Unable to add device: Unable to execute command on glusterfs-node2-4072075787-7t1gx: Device /dev/vdb not found (or ignored by filtering).
Adding device /dev/vdc ... Unable to add device: Unable to execute command on glusterfs-node2-4072075787-7t1gx: Device /dev/vdc not found (or ignored by filtering).
Adding device /dev/vdd ... Unable to add device: Unable to execute command on glusterfs-node2-4072075787-7t1gx: Device /dev/vdd not found (or ignored by filtering).
Error: Error calling v.allocBricksInCluster: Id not found
the path "heketi-storage.json" does not exist
Timed out waiting for pods matching 'job-name=heketi-storage-copy-job'.
service "deploy-heketi" deleted
deployment "deploy-heketi" deleted
No resources found
Error from server: services "heketi-storage-endpoints" not found
serviceaccount "heketi-service-account" deleted
pod "glusterfs-node0-2509304327-edwgn" deleted
pod "glusterfs-node1-3290690057-kbbmh" deleted
pod "glusterfs-node2-4072075787-7t1gx" deleted
deployment "glusterfs-node0" deleted
deployment "glusterfs-node1" deleted
deployment "glusterfs-node2" deleted
When heketi/heketi#599 resolves, use an etcd3 cluster (running in kube) to store the heketi DB instead of a direct BoltDB in a GlusterFS volume. This should remove the need for an initial deploy-heketi pod. We could use this as reference.
Thanks for this project - it's been very useful. I've been trying to use it on a 3 node Kubernetes cluster installed via kubeadm on CentOS 7. While experimenting, I found a few enhancements that would be useful:
Running rm -rf /var/lib/heketi on all the nodes on --abort would save me having to do it manually when I need to start from scratch.
On --abort I also need to manually run vgs followed by vgremove -y <volume group starting with vg_> so that the volumes are ready for a fresh install. Not sure if this can happen automatically on --abort.
I also found I needed to ensure lvm2-monitor was active on all nodes or strange errors would crop up. If it wasn't running I had to do the following:
systemctl restart lvm2-lvmetad.service
systemctl restart lvm2-lvmetad.socket
I could then check the status with systemctl status lvm2-monitor
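Putting the above together, here is a hedged sketch of a per-node reset helper that an --abort hook could run (destructive; the vg_ name match and service names follow the commands above):

# Hedged sketch of the manual per-node cleanup described above -- not part of
# gk-deploy. It removes heketi state and any vg_* volume groups on this node.
rm -rf /var/lib/heketi
for vg in $(vgs --noheadings -o vg_name); do
  case "$vg" in
    vg_*) vgremove -y "$vg" ;;
  esac
done
# Make sure the LVM services are healthy before re-deploying.
systemctl restart lvm2-lvmetad.service
systemctl restart lvm2-lvmetad.socket
systemctl status lvm2-monitor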
Hope this is useful!
The current process creates a separate gluster volume to hold the heketi.db.
But it could be managed via a PVC/PV, which simplifies the setup and gives a unified way to manage all the gluster volumes (no exceptions).
For example, one use case is that you can clean up the whole cluster using only kubectl delete ...
and recreate it from scratch without extra actions.
As a PoC, I tried the following with success:
heketi-deployment with the PVC mounted (modified template: heketi.yaml)
$ kubectl get pv
NAME CAPACITY ACCESSMODES RECLAIMPOLICY STATUS CLAIM
heketidbstorage 4Gi RWX Retain Bound glusterfs/heketi
$ kubectl get pvc --namespace glusterfs
NAME STATUS VOLUME CAPACITY ACCESSMODES AGE
heketi Bound heketidbstorage 4Gi RWX
Would you be interested in moving to such a configuration, and in discussing the correct way to achieve this?
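For reference, a minimal sketch of the claim side of that PoC, assuming the namespace, name, size, and access mode shown in the output above (how the claim gets bound to the heketidbstorage PV, e.g. via a volumeName or a storage class, is left out):

kubectl create --namespace glusterfs -f - <<EOF
# Hedged sketch of the PVC used for the heketi.db volume in the PoC above.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: heketi
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 4Gi
EOF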
I managed to get the provided sample vagrant deployment working. I'm now trying to use the deploy logic on a pre-existing kubernetes deployment (built from https://github.com/att-comdev/halcyon-vagrant-kubernetes which uses Ubuntu for the nodes). In this setup, there are three nodes called node1, node2, node3 and I modified their Vagrantfile to add three drives each.
The ./gk-deploy script gets to the deploy-heketi stage and successfully adds the devices for node1. Then it freezes on the second node and I have to kill the script. I'm not sure where to start debugging.
ubuntu@kube1:~/deploy$ export KUBECONFIG="/etc/kubernetes/admin.conf" && sudo -E bash ./gk-deploy -g -w 180
Using Kubernetes CLI.
Error from server: error when creating "./kube-templates/heketi-service-account.yaml": serviceaccounts "heketi-service-account" already exists
'storagenode' already has a value (glusterfs), and --overwrite is false
'storagenode' already has a value (glusterfs), and --overwrite is false
'storagenode' already has a value (glusterfs), and --overwrite is false
Error from server: error when creating "./kube-templates/glusterfs-daemonset.json": daemonsets.extensions "glusterfs" already exists
Waiting for GlusterFS pods to start ... OK
Error from server: error when creating "STDIN": services "deploy-heketi" already exists
Error from server: error when creating "STDIN": deployments.extensions "deploy-heketi" already exists
Waiting for deploy-heketi pod to start ... OK
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 17 100 17 0 0 2303 0 --:--:-- --:--:-- --:--:-- 2428
Found node 172.16.35.11 on cluster bb595497757b43a3895fb6f7ef3ec791
Adding device /dev/sdb ... Unable to add device: Unable to execute command on glusterfs-92gtg: Can't initialize physical volume "/dev/sdb" of volume group "vg_0490a9019cb9648662b6d0e2d47041ac" without -ff
Adding device /dev/sdc ... Unable to add device: Unable to execute command on glusterfs-92gtg: Can't initialize physical volume "/dev/sdc" of volume group "vg_defad24db8970ec3c474909c04dd8052" without -ff
Adding device /dev/sda ... Unable to add device: Unable to execute command on glusterfs-92gtg: Can't initialize physical volume "/dev/sda" of volume group "vg_1adc96baa051963403b95d438fdc1d84" without -ff
Found node 172.16.35.12 on cluster bb595497757b43a3895fb6f7ef3ec791
Adding device /dev/sdb ...
If this bug/question should be reported to heketi, let me know.
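The "without -ff" messages usually indicate the disks still carry LVM signatures from an earlier attempt; a hedged, destructive cleanup sketch for the node that owns those disks (device names taken from the log above):

# Hedged sketch: wipe stale LVM signatures so heketi can re-initialize the
# devices. This destroys any remaining data on the listed disks.
for dev in /dev/sdb /dev/sdc /dev/sda; do
  wipefs -a "$dev"
done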
up.sh --provider=virtualbox results in failure:
fatal: [node2]: FAILED! => {"changed": true, "cmd": ["yum", "-y", "install", "centos-release-gluster", "epel-release"], "delta": "0:00:15.836969", "end": "2016-12-08 14:20:52.772676", "failed": true, "rc": 1, "start": "2016-12-08 14:20:36.935707", "stderr": "http://ftp.osuosl.org/pub/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirror.sfo12.us.leaseweb.net/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttps://mirrors.lug.mtu.edu/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirror.math.princeton.edu/pub/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirror.cs.princeton.edu/pub/mirrors/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirror.us.leaseweb.net/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttps://mirror.redsox.cc/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttps://mirror.steadfast.net/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirror.symnds.com/distributions/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttps://mirror.chpc.utah.edu/pub/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://fedora-epel.mirror.lstn.net/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirror.cs.p
Following up with 'vagrant provision', it also fails with what appear to be some preflight check errors:
TASK [master : kubeadm init] ***************************************************
fatal: [master]: FAILED! => {"changed": true, "cmd": ["kubeadm", "init", "--token=abcdef.1234567890abcdef", "--use-kubernetes-version=v1.4.5", "--api-advertise-addresses=192.168.10.90"], "delta": "0:00:00.245555", "end": "2016-12-08 14:28:28.534215", "failed": true, "rc": 2, "start": "2016-12-08 14:28:28.288660", "stderr": "preflight check errors:\n\tPort 6443 is in use\n\tPort 2379 is in use\n\tPort 8080 is in use\n\tPort 9898 is in use\n\tPort 10250 is in use\n\tPort 10251 is in use\n\tPort 10252 is in use\n\t/etc/kubernetes/manifests is not empty\n\t/etc/kubernetes/pki is not empty\n\t/var/lib/etcd is not empty\n\t/var/lib/kubelet is not empty\n\t/etc/kubernetes/admin.conf already exists\n\t/etc/kubernetes/kubelet.conf already exists", "stdout": "Running pre-flight checks", "stdout_lines": ["Running pre-flight checks"], "warnings": []}
to retry, use: --limit @/home/screeley/git/gluster-kubernetes-dev/test-dir/scale1/gluster-kubernetes/vagrant/site.retry
PLAY RECAP *********************************************************************
master : ok=21 changed=5 unreachable=0 failed=1
node0 : ok=20 changed=5 unreachable=0 failed=0
node1 : ok=20 changed=5 unreachable=0 failed=0
node2 : ok=4 changed=2 unreachable=0 failed=1
node3 : ok=20 changed=5 unreachable=0 failed=0
Ansible failed to complete successfully. Any error output should be
visible above. Please fix these errors and try again.
Wondering if we should convert the JSON to YAML; I believe OpenShift and Kube have settled on YAML as the official object spec format.
What say you @obnoxxx @jarrpa @wattsteve @erinboyd
I can do the conversion if everyone is in agreement
While using the vagrant setup, both @screeley and I are running into the problems encountered with #24. The GlusterFS provisioner has undergone a lot of changes in Kubernetes lately, so the first thing I did was ascertain the Kubernetes version, which is 1.4.5.
According to the history of the provisioning README at the time 1.4.5 shipped (https://github.com/kubernetes/kubernetes/blob/1d527194656bad6a0f191f9fc6160bf7e931cf09/examples/experimental/persistent-volume-provisioning/README.md), the following storage class and claim should work, but they don't. Any ideas @jarrpa @humblec?
[root@master deploy]# cat gluster-sc.yaml
apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  name: glusterfs
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: "http://10.36.0.0:8080"
  restuser: "admin"
  secretNamespace: "default"
  secretName: "heketi-service-account-token-7ne36"
[root@master deploy]# cat gluster-claim.yaml
{
  "kind": "PersistentVolumeClaim",
  "apiVersion": "v1",
  "metadata": {
    "name": "test-claim",
    "annotations": {
      "volume.beta.kubernetes.io/storage-class": "glusterfs"
    }
  },
  "spec": {
    "accessModes": [
      "ReadWriteMany"
    ],
    "resources": {
      "requests": {
        "storage": "5Gi"
      }
    }
  }
}
What is the operational story to replace a disk in a node in Kubernetes?
I am hitting the following error when executing heketi-cli setup-openshift-heketi-storage:
Error: Unable to execute command on glusterfs1-1373000839-qq9jv: /usr/sbin/modprobe failed: 1
Cannot read thin-pool target version.
thin: Required device-mapper target(s) not detected in your kernel.
Run `lvcreate --help' for more information.
On the hosts there is Ubuntu 16.04.1 LTS installed.
What are the prerequisites for Heketi/GlusterFS to create volumes?
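In this case the error points at the device-mapper thin provisioning target missing from the kernel; a hedged per-node check/fix on Ubuntu (the dm_thin_pool module name is my assumption of the relevant target):

# Hedged sketch: ensure the device-mapper thin provisioning target is
# available -- heketi creates bricks on LVM thin pools.
modprobe dm_thin_pool
lsmod | grep dm_thin_pool
# Persist the module across reboots (Ubuntu):
echo dm_thin_pool >> /etc/modules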
Currently gk-deploy starts even if the topology file is not specified. It should verify that the file exists and fail if it is not found.
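Something like the following early check could do that (a sketch only; TOPOLOGY stands in for whatever variable the script actually uses for the topology file path):

# Hedged sketch of an early sanity check in gk-deploy.
if [[ ! -r "${TOPOLOGY}" ]]; then
  echo "Topology file '${TOPOLOGY}' not found or not readable." >&2
  exit 1
fi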
Currently, the default behavior of gk-deploy is to deploy into the namespace 'default'. We should instead deploy to whatever the user's current namespace is (i.e. not require a -n option).
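A hedged sketch of how the current namespace could be derived from the active kubeconfig context, falling back to 'default' when the context sets none:

# Hedged sketch: use the namespace of the current kubeconfig context instead
# of hard-coding 'default'.
NAMESPACE=$(kubectl config view --minify --output 'jsonpath={..namespace}')
NAMESPACE=${NAMESPACE:-default}
kubectl --namespace="${NAMESPACE}" get pods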
It would be good to have a function that checks for the prerequisites needed for gk-deploy to run. This would help the storage admin with the deployment of this solution.
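A hedged sketch of such a prerequisites function (the exact set of required client binaries is an assumption):

# Hedged sketch: verify required client tools exist before doing anything else.
check_prereqs() {
  local missing=0
  for bin in kubectl heketi-cli; do
    if ! command -v "$bin" >/dev/null 2>&1; then
      echo "Missing required command: $bin" >&2
      missing=1
    fi
  done
  return $missing
}
check_prereqs || exit 1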
What is the operational story for getting a disk replaced when we only see a single Heketi brick reporting failures?
The following error, although it moves between nodes on subsequent runs, has shown up consistently on at least one node for each up.sh run. This time it showed on node0. Wondering if there is something we can do to help this? Going to try removing epel.repo prior to the yum installs, or possibly running yum clean metadata after the repo is installed.
fatal: [node0]: FAILED! => {"changed": true, "cmd": ["yum", "-y", "install", "wget", "screen", "git", "vim", "glusterfs-client", "heketi-client", "iptables", "iptables-utils", "iptables-services", "docker", "kubeadm"], "delta": "0:00:16.419865", "end": "2016-12-08 14:39:52.070970", "failed": true, "rc": 1, "start": "2016-12-08 14:39:35.651105", "stderr": "http://mirror.math.princeton.edu/pub/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttps://mirror.chpc.utah.edu/pub/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirror.symnds.com/distributions/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirror.us.leaseweb.net/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirror.sfo12.us.leaseweb.net/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirrors.syringanetworks.net/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirrors.mit.edu/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirror.cs.pitt.edu/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttps://pubmirror1.math.uh.edu/fedora-buffet/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://fedora-epel.mirror.lstn.net/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://ftp.osuosl.org/pub/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirror.nexcess.net/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirror.sjc02.svwh.net/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirror.oss.ou.edu/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for
What is the story for replacing a node in the cluster?
In Kubernetes 1.5, the "parameters" options in the storage class no longer include the "endpoint" parameter. As per kubernetes/kubernetes#34705:
When the persistent volumes are dynamically provisioned, the Gluster plugin automatically creates an endpoint and a headless service in the name `gluster-dynamic-
Leaving it in will generate the error below when trying to create the PVC. This option needs to be removed from the Hello World storage class example. Also, gk-deploy should not create "heketi-storage-endpoints" at all, since k8s handles the endpoint.
ubuntu@kube1:~$ kubectl describe pvc/gluster1
Name: gluster1
Namespace: default
Status: Pending
Volume:
Labels: <none>
Capacity:
Access Modes:
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
10m 14s 41 {persistentvolume-controller } Warning ProvisioningFailed Failed
to provision volume with StorageClass "gluster-heketi": glusterfs: invalid option "endpoint" for volume plugin kubernetes.io/glusterfs
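For reference, a hedged sketch of what the corrected Hello World storage class might look like on 1.5, with the removed endpoint parameter dropped (the resturl value is just a placeholder):

kubectl create -f - <<EOF
# Hedged sketch: no "endpoint" parameter; Kubernetes 1.5 creates the
# endpoints/headless service itself during dynamic provisioning.
apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  name: gluster-heketi
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: "http://<heketi-service-address>:8080"
EOF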
While executing heketi-cli setup-openshift-heketi-storage
the following error shows up:
Error: Unable to execute command on glusterfs0-2272744551-a4ghp: volume create: heketidbstorage: failed: Host 120fa67f-b5fe-4232-8c77-0c78e1c1c8ce.pub.cloud.scaleway.com is not in ' Peer in Cluster' state
Topology info gives this:
Cluster Id: 645be219ee6b0598b4d51458f2c82a12
Volumes:
Nodes:
Node Id: 18be84c12d63e0cba5b45a85145867f4
State: online
Cluster Id: 645be219ee6b0598b4d51458f2c82a12
Zone: 1
Management Hostname: 120fa67f-b5fe-4232-8c77-0c78e1c1c8ce.pub.cloud.scaleway.com
Storage Hostname: 120fa67f-b5fe-4232-8c77-0c78e1c1c8ce.pub.cloud.scaleway.com
Devices:
Id:a19f21522ad62a555ce29fcfa374019c Name:/dev/vdb State:online Size (GiB):46 Used (GiB):0 Free (GiB):46
Bricks:
Node Id: 41a0f607a5669136219f3ccd09cb4583
State: online
Cluster Id: 645be219ee6b0598b4d51458f2c82a12
Zone: 1
Management Hostname: 220d4345-ea09-4ba3-bf8e-bc2c86bc821c.pub.cloud.scaleway.com
Storage Hostname: 220d4345-ea09-4ba3-bf8e-bc2c86bc821c.pub.cloud.scaleway.com
Devices:
Id:71227ba841eb6ca845fb4315fe011b2c Name:/dev/vdb State:online Size (GiB):46 Used (GiB):0 Free (GiB):46
Bricks:
Node Id: 4fbef6294f6eedcff4fe86874cd4b93c
State: online
Cluster Id: 645be219ee6b0598b4d51458f2c82a12
Zone: 1
Management Hostname: f6e5fcaf-35bf-424b-a2f9-900d3d1a9b11.pub.cloud.scaleway.com
Storage Hostname: f6e5fcaf-35bf-424b-a2f9-900d3d1a9b11.pub.cloud.scaleway.com
Devices:
Id:7b8fbfe3ad7de9c825f082f91d0bf6ac Name:/dev/vdb State:online Size (GiB):46 Used (GiB):0 Free (GiB):46
Bricks:
What can I try to resolve this?
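A hedged set of checks to start with (the pod name is a placeholder; the commands are standard GlusterFS/kubectl tooling), verifying that peers are connected and that the management hostnames resolve from inside the pods:

# Hedged diagnostic sketch; <glusterfs-pod> is one of the GlusterFS pod names.
kubectl exec <glusterfs-pod> -- gluster peer status
kubectl exec <glusterfs-pod> -- gluster pool list
# Confirm the management hostname from the topology resolves inside the pod:
kubectl exec <glusterfs-pod> -- getent hosts 120fa67f-b5fe-4232-8c77-0c78e1c1c8ce.pub.cloud.scaleway.com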
gluster-kubernetes should clarify that it is intended to facilitate the hyper-converged scenario of GlusterFS + heketi running within Kubernetes and on Kubernetes nodes.
If all resources of Heketi and GlusterFS reside in one namespace, it might be easier to filter for them on metrics or logging.
Also, to stop and remove all resources gracefully, you can then just call:
kubectl delete namespace glusterfs
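A hedged sketch of deploying into a dedicated namespace and tearing it down again (the -n option is mentioned elsewhere in this tracker; verify the exact invocation against gk-deploy --help):

# Hedged sketch: keep all heketi/GlusterFS resources in one namespace.
kubectl create namespace glusterfs
./gk-deploy -g -n glusterfs topology.json
# Later, remove everything in one go:
kubectl delete namespace glusterfs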
Error from server: DaemonSet in version "v1beta1" cannot be handled as a DaemonSet: [pos 1495]: json: decode bool: got first char "
I am trying to create this PersistentVolumeClaim:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: claim1
  annotations:
    volume.beta.kubernetes.io/storage-class: slow
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 4Gi
When doing so this error shows up:
kubectl describe persistentvolumeclaim claim1 368ms
Name: claim1
Namespace: default
Status: Pending
Volume:
Labels: <none>
Capacity:
Access Modes:
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
48m 1m 29 {persistentvolume-controller } Warning ProvisioningFailed Failed to provision volume with StorageClass "slow": glusterfs: invalid option "secretNamespace" for volume plugin kubernetes.io/glusterfs
49m 7s 170 {persistentvolume-controller } Warning ProvisioningFailed Failed to provision volume with StorageClass "slow": glusterfs: invalid option "secretName" for volume plugin kubernetes.io/glusterfs
I set up this StorageClass before:
apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  name: slow
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: "http://heketi.gluster.svc:8080"
  # restuser: "admin"
  secretNamespace: "gluster"
  secretName: "heketi-secret"
A topology load failure should not let the deployment proceed; it should fail with a warning and be able to resume from the load command.
Heketi supports multiple clusters. We should be able to do the following (see the sketch below):
Get the topology file for the new cluster.
Add labels for the specific nodes.
Then run heketi topology load.
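A hedged sketch of those steps (the storagenode=glusterfs label value follows the deploy logs quoted above; node names and the topology file are placeholders):

# Hedged sketch: label the new nodes so the GlusterFS pods can be scheduled on
# them, then load the expanded topology into heketi.
kubectl label nodes <new-node-1> <new-node-2> <new-node-3> storagenode=glusterfs
heketi-cli topology load --json=new-cluster-topology.json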
Here is a new one
TASK [setup] *******************************************************************
ok: [master]
TASK [master : kubeadm init] ***************************************************
fatal: [master]: FAILED! => {"failed": true, "msg": "the field 'args' has an invalid value, which appears to include a variable that is undefined. The error was: 'dict object' has no attribute 'ipv4'\n\nThe error appears to have been in '/home/screeley/git/gluster-kubernetes-dev/fix_metalink_error/gluster-kubernetes/vagrant/roles/master/tasks/main.yml': line 1, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: kubeadm init\n ^ here\n"}
to retry, use: --limit @/home/screeley/git/gluster-kubernetes-dev/fix_metalink_error/gluster-kubernetes/vagrant/site.retry
PLAY RECAP *********************************************************************
master : ok=24 changed=22 unreachable=0 failed=1
node0 : ok=23 changed=22 unreachable=0 failed=0
node1 : ok=23 changed=22 unreachable=0 failed=0
node2 : ok=23 changed=22 unreachable=0 failed=0
node3 : ok=23 changed=22 unreachable=0 failed=0
Ansible failed to complete successfully. Any error output should be
visible above. Please fix these errors and try again.
Hi. Why do you run SSH in the GlusterFS containers? Maybe I am being naive, but wouldn't it be better to use kubectl exec to run commands in the containers? I am asking because you run the DaemonSet with host networking and in privileged mode, which makes me question the security of this provisioning strategy. In some OpenStack installations the firewall can sometimes fail to apply the security groups, and if that happens it would expose the SSH servers to the internet.
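For illustration, a hedged sketch of what running commands through the API server instead of sshd could look like (pod name and namespace are placeholders):

# Hedged sketch: execute gluster commands in the pod via kubectl exec rather
# than an SSH daemon inside the container.
kubectl exec -n glusterfs <glusterfs-pod> -- gluster volume list
kubectl exec -n glusterfs <glusterfs-pod> -- gluster peer status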
Currently heketi searches for this label.
heketi/heketi#622 introduces some changes for integrating with some cool Kube 1.5 features. We should decide if we want to go for them immediately, or introduce a version check and create another kube-templates directory for 1.5.
Related: #87
Currently we use generic Vagrant boxes, which we then set up as we want using Ansible. This model is completely unpredictable as time goes by because many things change. Instead, we need to change our tests to use prebuilt Vagrant boxes that have been set up for our tests. This means we also need a directory in Heketi so that anyone can build the boxes automatically; the boxes can then be submitted to Vagrant Atlas.
We should provide a document or set of documentation that clearly describes the deployment scenarios this project supports. This includes diagrams and descriptions of all the components in each scenario and the relationships between the components.
See: heketi/heketi#636
What is the story for removing a node from Kubernetes without impacting the live storage volumes managed by Heketi?
At the moment there seems to be no log output for the glusterfs pods.
The current ansible playbook builds the /etc/hosts file on the nodes by getting the IPs assigned to the eth1 device.
On my machines, this is the ip addr output for eth1:
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 52:54:00:2a:84:0c brd ff:ff:ff:ff:ff:ff
inet 192.168.10.227/24 brd 192.168.10.255 scope global dynamic eth1
valid_lft 2296sec preferred_lft 2296sec
inet 192.168.10.101/24 brd 192.168.10.255 scope global secondary eth1
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:fe2a:840c/64 scope link
valid_lft forever preferred_lft forever
As can be seen, this interface has two IPs. The one that is added to /etc/hosts is not the one in the topology file, which causes the gk-deploy script to fail.
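A hedged sketch of picking only the primary (non-secondary) IPv4 address on eth1 when building /etc/hosts, so it matches the topology file (how this would be wired into the playbook is not shown):

# Hedged sketch: print only the primary IPv4 address on eth1.
ip -4 -o addr show dev eth1 | awk '!/secondary/ {sub(/\/.*/, "", $4); print $4; exit}'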
We've started work on a Helm Chart based off the manifests here.
So far it works with a few changes for the standard token and API locations, but it doesn't persist the database or load the topology automatically. I thought I'd raise a ticket early for tracking and input, but the addition of etcd and DaemonSet features to Heketi should let us wrap this up and push it upstream.
https://github.com/AcalephStorage/charts/tree/glusterfs/incubator/glusterfs
Update Jan 11 2016: The purpose of this issue has changed and is now focused on the updated issue title: "Explore 3rd party alternatives to providing our own vagrant k8s deployment". The migration-to-Ansible discussion has moved to #149.
I ended up getting this project working on my existing k8s cluster with Ubuntu 16.04 hosts. I'm really happy about that and excited about using it. Overall, this is a great project that helped me get started on understanding glusterfs and heketi. However, I do have some suggestions in the same spirit as #35.
When heketi/heketi#596 resolves and we get a new container image, update the GlusterFS definition to use DaemonSets instead of Deployments. For the OpenShift support, this means the use of Templates will also no longer be necessary.
Now that there is code and deployment logic in this repo, we need to make sure it keeps working, which requires CI.
I advise setting this up before accepting any new changes.
Long arguments seem to have issues:
$ ./gk-deploy --deploy-gluster --verbose
Unknown option 'deploy-gluster'.
$ ./gk-deploy -g --verbose 1
Unknown option 'erbose'.
Short versions work, but there seem to be some other issues:
./gk-deploy -g -v
Starting Kubernetes deployment.
serviceaccount "heketi-service-account" created
Found secret 'heketi-service-account-token-3juax' in namespace 'default' for heketi-service-account.
File "<stdin>", line 10
print node['node']['hostnames']['manage'][0]
^
SyntaxError: Missing parentheses in call to 'print'
Deploying GlusterFS pods on .
The Deployment "glusterfs-" is invalid: metadata.name: Invalid value: "glusterfs-": must match the regex [a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)* (e.g. 'example.com')
Waiting for GlusterFS pods to start ... Checking status of pods matching 'glusterfs=pod':
Checking status of pods matching 'glusterfs=pod':
[...]
These options have to be properly parsed and acted on.
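The Python traceback above is a Python 2 print statement being executed by Python 3; a hedged sketch of a version-agnostic form (the surrounding pipeline is illustrative, not the actual gk-deploy code):

# Hedged sketch -- not the actual gk-deploy code. Using the print() function
# form works under both Python 2 and Python 3.
cat topology.json | python -c '
import json, sys
topology = json.load(sys.stdin)
for cluster in topology["clusters"]:
    for node in cluster["nodes"]:
        print(node["node"]["hostnames"]["manage"][0])
'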
@MohamedAshiqrh has the patch; waiting for the PR :)
By default we should assume that we are meant to deploy the GlusterFS DaemonSet. However, we should prompt the user whether they want to deploy GlusterFS or not. This prompt should also include a small note about the firewall requirements on the nodes for GlusterFS. The -g option should still be retained, with the slight modification that it simply assumes 'yes' and skips the prompt entirely.
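A hedged sketch of such a prompt (variable names are placeholders; the 'assume yes when -g is given' behavior follows the description above):

# Hedged sketch: DEPLOY_GLUSTER would be pre-set to "yes" by the -g option,
# which skips the prompt entirely.
if [[ "${DEPLOY_GLUSTER}" != "yes" ]]; then
  echo "Note: GlusterFS nodes must allow the GlusterFS management and brick ports through their firewalls."
  read -rp "Deploy GlusterFS pods to the labelled nodes? [Y/n] " answer
  case "${answer}" in
    [Nn]*) DEPLOY_GLUSTER="no" ;;
    *) DEPLOY_GLUSTER="yes" ;;
  esac
fi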