Comments (21)
It's Chinese New Year these days, so I may be slow to respond...
> I've done more research and here's what I've found.
> - The kube-scheduler "--policy-config-file" that you're using is deprecated and will be going away in Kubernetes 1.23. You will need to migrate to "kube-scheduler --config" as documented here: https://kubernetes.io/docs/reference/scheduling/config/
> I haven't tested, but it could be like this:
> ---
> apiVersion: kubescheduler.config.k8s.io/v1beta3
> kind: KubeSchedulerConfiguration
> extenders:
> - urlPrefix: http://open-local-scheduler-extender.kube-system:23000/scheduler
>   filterVerb: predicates
>   prioritizeVerb: priorities
>   preemptVerb: ''
>   bindVerb: ''
>   weight: 10
>   enableHttps: false
>   nodeCacheCapable: true
>   ignorable: true
> profiles:
> - pluginConfig:
>   - name: InterPodAffinity
>     args:
>       apiVersion: kubescheduler.config.k8s.io/v1beta3
>       kind: InterPodAffinityArgs
>       hardPodAffinityWeight: 10
> - While k3s doesn't have a pod for kube-scheduler, we have the option of providing arguments to it. So assuming I create a scheduler.yaml on my server with the extenders here, I can do something like:
> --kube-scheduler-arg config=/etc/rancher/k3s/scheduler.yaml
> Will you consider removing the requirement of the init-job.yaml, and give us the option to configure kube-scheduler ourselves? Perhaps you could have it default to running the init-job, and let us disable it if we configure the scheduler separately?
This works fine; I have tested it in my cluster.
apiVersion: kubescheduler.config.k8s.io/v1beta1
kind: KubeSchedulerConfiguration
clientConnection:
  kubeconfig: /etc/kubernetes/scheduler.conf
extenders:
- urlPrefix: http://open-local-scheduler-extender.kube-system:23000/scheduler
  filterVerb: predicates
  prioritizeVerb: priorities
  weight: 10
  ignorable: true
  nodeCacheCapable: true
I have raised PR #106.
Sure, you can edit the nls resource to use your existing VG.
kubectl edit nls [node name]
apiVersion: csi.aliyun.com/v1alpha1
kind: NodeLocalStorage
spec:
  nodeName: [node name]
  listConfig:
    vgs:
      include:
      - [your VG name]
      - open-local-pool-[0-9]+
When your existing VG name appears in status.nodeStorageInfo.volumeGroups of this nls, you can use this VG to create your Persistent Volume.
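To check that from the command line, something like this should work (assuming jq is installed):
kubectl get nls [node name] -o json | jq '.status.nodeStorageInfo.volumeGroups'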
Thanks, I'll give this a try!
Is it possible to configure this in the values.yaml? My cluster is defined declaratively and I prefer to keep the state in my repo.
What if I do it this way:
- add crds
- create nls
- install helm chart
Will it respect my nls settings, or will it override them?
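For reference, the flow I mean, as a rough sketch (my-nls.yaml is a placeholder for my nls manifest, and the chart path is an assumption):
kubectl apply -f csi.aliyun.com_nodelocalstorages.yaml   # 1. add crds
kubectl apply -f my-nls.yaml                             # 2. create nls
helm install open-local ./helm                           # 3. install helm chart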
> Thanks, I'll give this a try!
> Is it possible to configure this in the values.yaml? My cluster is defined declaratively and I prefer to keep the state in my repo.
Sure! And contributions to this project are very welcome!
> What if I do it this way:
> - add crds
> - create nls
> - install helm chart
> Will it respect my nls settings, or will it override them?
No nls will be created or overridden if one already exists.
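After helm install you can confirm that a pre-created nls kept its include list, e.g.:
kubectl get nls [node name] -o jsonpath='{.spec.listConfig.vgs.include}'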
Thanks for your help. I was able to get it working after I added the CRDs from "csi.aliyun.com_nodelocalstorages.yaml", then defined the nls, and after that installed the helm chart.
However, the init-jobs are permanently pending due to an affinity issue, as they're set not to run on master. My setup is a single-node k3s cluster, so I only have a master node.
It appears to be working fine without the init-job, and I've raised a PR to allow disabling them: #105
But you need the init-job; it modifies the kube-scheduler.yaml file on each master node to configure the scheduler policy.
The files and dirs it's trying to modify don't exist on k3s. What's the scheduler policy needed for? I was able to create and delete PVCs without it.
Reading here https://kubernetes.io/docs/reference/scheduling/policies/, it says the scheduler policy is deprecated in 1.23 and you should use the KubeSchedulerConfiguration manifest instead: https://kubernetes.io/docs/reference/scheduling/config/.
I also don't agree with an installed app making changes to my nodes; I prefer to do this myself if needed.
Open-Local is a local disk management system. This means Open-Local must choose a node from the cluster, consider how much storage capacity on each node can be used, and decide which volume group it will use to create volumes and snapshots.
So the k8s scheduler must contain Open-Local's storage scheduling algorithm; we extend the filtering and scoring algorithms via a scheduler extender, as you can see in the init-job file.
open-local/helm/templates/init-job.yaml, lines 58 to 60 in fec04e4
What if you have just one node? Is it still needed then?
> What if you have just one node? Is it still needed then?
This has nothing to do with the number of nodes. You must configure this; otherwise the k8s cluster will have no local storage scheduling capability.
> It appears to be working fine without the init-job
Because your k3s cluster has only a master node and you specify the VG name in the storage class, it works fine. But it will behave abnormally if the cluster size exceeds one node.
Thanks. How can I get it working on my single-node cluster, considering there is a node taint which specifies it can only run on non-master nodes? It is stuck in Pending.
I'm happy to do a PR to address anything, but would appreciate your direction.
Should we add a variable in values.yaml for single-node master clusters? And then, if true, not apply that taint?
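For example, a hypothetical values.yaml toggle (names are placeholders, not existing chart options):
initJob:
  enabled: true        # set false to skip the init-job entirely (what #105 proposes)
  runOnMaster: false   # set true to drop the non-master affinity for single-node masters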
I'm not familiar with k3s. Is your kube-scheduler a static pod?
"k3s bundles the Kubernetes components (kube-apiserver, kube-controller-manager, kube-scheduler, kubelet, kube-proxy) into combined processes that are presented as a simple server and agent model.": https://www.suse.com/c/rancher_blog/introducing-k3s-the-lightweight-kubernetes-distribution-built-for-the-edge/
Unfortunately, I haven't done anything with the kube-scheduler until now.
I've done more research and here's what I've found.
- The kube-scheduler "--policy-config-file" that you're using is deprecated and will be going away in Kubernetes 1.23. You will need to migrate to "kube-scheduler --config" as documented here: https://kubernetes.io/docs/reference/scheduling/config/
I haven't tested, but it could be like this:
---
apiVersion: kubescheduler.config.k8s.io/v1beta3
kind: KubeSchedulerConfiguration
extenders:
- urlPrefix: http://open-local-scheduler-extender.kube-system:23000/scheduler
  filterVerb: predicates
  prioritizeVerb: priorities
  preemptVerb: ''
  bindVerb: ''
  weight: 10
  enableHttps: false
  nodeCacheCapable: true
  ignorable: true
profiles:
- pluginConfig:
  - name: InterPodAffinity
    args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      kind: InterPodAffinityArgs
      hardPodAffinityWeight: 10
- While k3s doesn't have a pod for kube-scheduler, we have the option of providing arguments to it. So assuming I create a scheduler.yaml on my server with the extenders here, I can do something like:
--kube-scheduler-arg config=/etc/rancher/k3s/scheduler.yaml
Will you consider removing the requirement of the init-job.yaml, and give us the option to configure kube-scheduler ourselves? Perhaps you could have it default to running the init-job, and let us disable it if we configure the scheduler separately?
I modified the parameters and successfully ran the k3s service. I have a cluster with a single master and two nodes.
open-local version: v0.3.3-dev
# After restarting the k3s service with --kube-scheduler-arg config, the k3s service restarts repeatedly
# But policy-config-file is OK
--kube-scheduler-arg policy-config-file=/etc/rancher/k3s/scheduler.json
scheduler.json:
{
  "kind": "Policy",
  "apiVersion": "v1",
  "extenders": [{
    "urlPrefix": "http://{{ .Values.extender.name }}.{{.Values.namespace}}:23000/scheduler",
    "filterVerb": "predicates",
    "prioritizeVerb": "priorities",
    "preemptVerb": "",
    "bindVerb": "",
    "weight": 10,
    "enableHttps": false,
    "nodeCacheCapable": true,
    "ignorable": true
  }],
  "hardPodAffinitySymmetricWeight": 10
}
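For reference, this is roughly how the rendered file is passed to k3s, based on the flag above (the Helm template variables must be replaced with real values, e.g. open-local-scheduler-extender and kube-system, before the scheduler can use the file):
k3s server \
  --kube-scheduler-arg policy-config-file=/etc/rancher/k3s/scheduler.json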
❯ kubectl get nls -ojson master1 | jq .status.filteredStorageInfo
{
  "devices": [
    "/dev/vdf"
  ],
  "updateStatusInfo": {
    "lastUpdateTime": "2022-02-09T07:07:49Z",
    "updateStatus": "accepted"
  },
  "volumeGroups": [
    "open-local-pool-0"
  ]
}
❯ kubectl get nls -ojson node1 | jq .status.filteredStorageInfo
{
  "devices": [
    "/dev/vdf"
  ],
  "updateStatusInfo": {
    "lastUpdateTime": "2022-02-09T07:07:48Z",
    "updateStatus": "accepted"
  },
  "volumeGroups": [
    "open-local-pool-0"
  ]
}
❯ kubectl get nls -ojson node2 | jq .status.filteredStorageInfo
{
  "devices": [
    "/dev/vdf"
  ],
  "updateStatusInfo": {
    "lastUpdateTime": "2022-02-09T07:07:47Z",
    "updateStatus": "accepted"
  },
  "volumeGroups": [
    "open-local-pool-0"
  ]
}
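A compact way to check all nodes at once (same fields as above):
kubectl get nls -o json | jq '.items[] | {node: .metadata.name, status: .status.filteredStorageInfo.updateStatusInfo.updateStatus}'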
However, the volume cannot be created normally:
master1 [~]$ kubectl describe pvc -n demo pvc-open-local-lvm-open-local-lvm-0-d1
Name: pvc-open-local-lvm-open-local-lvm-0-d1
Namespace: demo
StorageClass: open-local-lvm
Status: Pending
Volume:
Labels: liveit100.com/app=qvm
liveit100.com/pvc=open-local-lvm-0
Annotations: <none>
Finalizers: [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode: Filesystem
Used By: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal WaitForFirstConsumer 14s (x5 over 67s) persistentvolume-controller waiting for first consumer to be created before binding
master1 [~]$ kubectl describe pvc -n demo pvc-open-local-device-hdd-open-local-device-hdd-0-d1
Name: pvc-open-local-device-hdd-open-local-device-hdd-0-d1
Namespace: demo
StorageClass: open-local-device-hdd
Status: Pending
Volume:
Labels: liveit100.com/app=qvm
liveit100.com/pvc=open-local-device-hdd-0
Annotations: <none>
Finalizers: [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode: Filesystem
Used By: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal WaitForFirstConsumer 8s (x8 over 101s) persistentvolume-controller waiting for first consumer to be created before binding
But when I ran kubectl apply -f sts-nginx.yaml, the pod started successfully...
❯ kubectl get pod
NAME READY STATUS RESTARTS AGE
nginx-lvm-0 1/1 Running 0 5m13s
❯ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
html-nginx-lvm-0 Bound local-595ce706-4ecf-4689-a020-d47fe1a0b1f5 30Gi RWO open-local-lvm 5m17s
❯ kubectl describe pvc html-nginx-lvm-0
Name: html-nginx-lvm-0
Namespace: default
StorageClass: open-local-lvm
Status: Bound
Volume: local-595ce706-4ecf-4689-a020-d47fe1a0b1f5
Labels: app=nginx-lvm
Annotations: pv.kubernetes.io/bind-completed: yes
pv.kubernetes.io/bound-by-controller: yes
volume.beta.kubernetes.io/storage-provisioner: local.csi.aliyun.com
volume.kubernetes.io/selected-node: node2
Finalizers: [kubernetes.io/pvc-protection]
Capacity: 30Gi
Access Modes: RWO
VolumeMode: Filesystem
Used By: nginx-lvm-0
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal WaitForFirstConsumer 6m42s persistentvolume-controller waiting for first consumer to be created before binding
Warning ProvisioningFailed 6m5s (x6 over 6m41s) local.csi.aliyun.com_node2_38d1a19f-5a3a-48cf-8c73-d6da1f0bd18d failed to provision volume with StorageClass "open-local-lvm": rpc error: code = Internal desc = Parse lvm part schedule info error: rpc error: code = InvalidArgument desc = lvm schedule with error Post "http://open-local-scheduler-extender:23000/apis/scheduling/default/persistentvolumeclaims/html-nginx-lvm-0?volumeType=LVM&nodeName=node2": dial tcp 10.43.38.189:23000: connect: connection refused
Normal ExternalProvisioning 5m47s (x6 over 6m42s) persistentvolume-controller waiting for a volume to be created, either by external provisioner "local.csi.aliyun.com" or manually created by system administrator
Normal Provisioning 5m33s (x7 over 6m42s) local.csi.aliyun.com_node2_38d1a19f-5a3a-48cf-8c73-d6da1f0bd18d External provisioner is provisioning volume for claim "default/html-nginx-lvm-0"
Normal ProvisioningSucceeded 5m33s local.csi.aliyun.com_node2_38d1a19f-5a3a-48cf-8c73-d6da1f0bd18d Successfully provisioned volume local-595ce706-4ecf-4689-a020-d47fe1a0b1f5
Maybe I need some time to find out why
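Some checks that may help with the connection refused error above (the service name and namespace come from the urlPrefix used in scheduler.json):
kubectl -n kube-system get svc open-local-scheduler-extender
kubectl -n kube-system get endpoints open-local-scheduler-extender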
Because the PVC is set to WaitForFirstConsumer, it has to be used/referenced by a pod before it gets created.
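For example, a minimal consumer sketch (the pod name is hypothetical; the claim name is the pending PVC shown earlier):
apiVersion: v1
kind: Pod
metadata:
  name: pvc-consumer          # hypothetical name
  namespace: demo
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: pvc-open-local-lvm-open-local-lvm-0-d1   # the pending PVC from earlier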
> Because the PVC is set to WaitForFirstConsumer, it has to be used/referenced by a pod before it gets created.
Yes, you are right. However, the main reason was a problem with a previously installed service, which prevented the pod from starting normally, so the PVC kept waiting.
Now I have no problem here.