
Comments (21)

TheBeatles1994 commented on May 10, 2024

It's Chinese New Year these days, so I may be slow to respond...

TheBeatles1994 commented on May 10, 2024

I've done more research and here's what I've found.

  1. The kube-scheduler "--policy-config-file" flag that you're using is deprecated and is going away in Kubernetes 1.23. You will need to migrate to "kube-scheduler --config" as documented here: https://kubernetes.io/docs/reference/scheduling/config/

I haven't tested it, but it could look like this:

---
apiVersion: kubescheduler.config.k8s.io/v1beta3
kind: KubeSchedulerConfiguration
extenders:
- urlPrefix: http://open-local-scheduler-extender.kube-system:23000/scheduler
  filterVerb: predicates
  prioritizeVerb: priorities
  preemptVerb: ''
  bindVerb: ''
  weight: 10
  enableHttps: false
  nodeCacheCapable: true
  ignorable: true
profiles:
- pluginConfig:
  - name: InterPodAffinity 
    args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      kind: InterPodAffinityArgs
      hardPodAffinityWeight: 10
  2. While k3s doesn't have a pod for kube-scheduler, we do have the option of providing arguments to it. So assuming I create a scheduler.yaml on my server with the extenders above, I can do something like:
--kube-scheduler-arg config=/etc/rancher/k3s/scheduler.yaml

Will you consider removing the requirement of the init-job.yaml and giving us the option to configure kube-scheduler ourselves? Perhaps you could have it default to running the init-job, and let us disable it if we configure the scheduler separately?

This works fine; I have tested it in my cluster.

apiVersion: kubescheduler.config.k8s.io/v1beta1
kind: KubeSchedulerConfiguration
clientConnection:
  kubeconfig: /etc/kubernetes/scheduler.conf
extenders:
- urlPrefix: http://open-local-scheduler-extender.kube-system:23000/scheduler
  filterVerb: predicates
  prioritizeVerb: priorities
  weight: 10
  ignorable: true
  nodeCacheCapable: true

I have raised PR #106.

TheBeatles1994 commented on May 10, 2024

Sure, you can edit the nls resource to use your existing VG.

kubectl edit nls [node name]

apiVersion: csi.aliyun.com/v1alpha1
kind: NodeLocalStorage
spec:
  nodeName: [node name]
  listConfig:
    vgs: 
      include:
      - [your VG name]
      - open-local-pool-[0-9]+

When your existing VG name appears in status.nodeStorageInfo.volumeGroups of this nls, it means you can use this VG to create your PersistentVolume.
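
For example, a quick way to check that (using the same jq pattern that appears later in this thread):

kubectl get nls [node name] -ojson | jq .status.nodeStorageInfo.volumeGroups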


murkylife commented on May 10, 2024

Thanks, I'll give this a try!

Is it possible to configure this in the values.yaml? My cluster is defined declaratively, and I prefer to keep the state in my repo.

murkylife commented on May 10, 2024

How about if I do it this way:

  • add CRDs
  • create nls
  • install Helm chart

Will it respect my nls settings, or will it override them?

TheBeatles1994 commented on May 10, 2024

Thanks, I'll give this a try!

Is it possible to configure this in the values.yaml? My cluster is defined declaratively, and I prefer to keep the state in my repo.

Sure! And contributions to this project are very welcome!

TheBeatles1994 commented on May 10, 2024

How about if I do it this way:

  • add CRDs
  • create nls
  • install Helm chart

Will it respect my nls settings, or will it override them?

No nls will be created or overridden if an nls already exists.

murkylife commented on May 10, 2024

Thanks for your help. I was able to get it working after I added the CRDs from "csi.aliyun.com_nodelocalstorages.yaml", then defined the nls, and after that installed the helm chart.
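
A sketch of that ordering (the CRD path and chart location are assumptions; adjust them to your checkout of the open-local repo):

# 1. install the NodeLocalStorage CRD
kubectl apply -f csi.aliyun.com_nodelocalstorages.yaml

# 2. create the nls with your own VG before installing the chart
kubectl apply -f my-nls.yaml

# 3. install the chart from a local checkout
helm install open-local ./helm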

However, the init-jobs are permanently Pending due to an affinity issue: they're set not to run on the master. My setup is a single-node k3s cluster, so I only have a master node.

It appears to be working fine without the init-job, and I've raised a PR to allow disabling them: #105

TheBeatles1994 commented on May 10, 2024

But you need the init-job; it modifies the kube-scheduler.yaml file on each master node to configure the scheduler policy.

murkylife commented on May 10, 2024

The files and dirs it's trying to modify don't exist on k3s. What's the scheduler policy needed for? I was able to create and delete PVCs without it.

Reading here https://kubernetes.io/docs/reference/scheduling/policies/, it says the scheduler policy is deprecated in 1.23 and that you should use a KubeSchedulerConfiguration manifest instead: https://kubernetes.io/docs/reference/scheduling/config/

I also don't agree with an installed app making changes to my nodes; I prefer to do this myself if needed.

TheBeatles1994 commented on May 10, 2024

Open-Local is a local disk management system. This means Open-Local must choose a node from the cluster, consider how much storage capacity on each node is usable, and decide which volume group to use when creating volumes and snapshots.

So the k8s scheduler must include Open-Local's storage scheduling algorithm; we extend the filtering and scoring algorithms via the scheduler extender mechanism, as you can see in the init-job file:

"urlPrefix": "http://{{ .Values.extender.name }}.{{.Values.namespace}}:23000/scheduler",
"filterVerb": "predicates",
"prioritizeVerb": "priorities",


murkylife commented on May 10, 2024

What if I have just one node? Is this still needed?

TheBeatles1994 commented on May 10, 2024

What if I have just one node? Is this still needed?

This has nothing to do with the number of nodes. You must configure this; otherwise the k8s cluster will have no local-storage scheduling capability.

It appears to be working fine without the init-job

Because your k3s cluster has only a master node and you specify the VG name in the StorageClass, it works fine. But it will behave abnormally once the cluster grows beyond one node.

murkylife commented on May 10, 2024

Thanks. How can I get it working on my single-node cluster, considering the affinity rule that restricts the init-job to non-master nodes? It is stuck in Pending.

I'm happy to do a PR to address anything, but I would appreciate your direction.

Should we add a variable in values.yaml for single-node master clusters, and if it's set, not apply that restriction?

TheBeatles1994 commented on May 10, 2024

I'm not familiar with k3s. Is your kube-scheduler a static pod?

murkylife commented on May 10, 2024

"k3s bundles the Kubernetes components (kube-apiserver, kube-controller-manager, kube-scheduler, kubelet, kube-proxy) into combined processes that are presented as a simple server and agent model.": https://www.suse.com/c/rancher_blog/introducing-k3s-the-lightweight-kubernetes-distribution-built-for-the-edge/

Unfortunately, I haven't done anything with the kube-scheduler until now.

murkylife commented on May 10, 2024

I've done more research and here's what I've found.

  1. The kube-scheduler "--policy-config-file" flag that you're using is deprecated and is going away in Kubernetes 1.23. You will need to migrate to "kube-scheduler --config" as documented here: https://kubernetes.io/docs/reference/scheduling/config/

I haven't tested it, but it could look like this:

---
apiVersion: kubescheduler.config.k8s.io/v1beta3
kind: KubeSchedulerConfiguration
extenders:
- urlPrefix: http://open-local-scheduler-extender.kube-system:23000/scheduler
  filterVerb: predicates
  prioritizeVerb: priorities
  preemptVerb: ''
  bindVerb: ''
  weight: 10
  enableHttps: false
  nodeCacheCapable: true
  ignorable: true
profiles:
- pluginConfig:
  - name: InterPodAffinity 
    args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      kind: InterPodAffinityArgs
      hardPodAffinityWeight: 10
  2. While k3s doesn't have a pod for kube-scheduler, we do have the option of providing arguments to it. So assuming I create a scheduler.yaml on my server with the extenders above, I can do something like:
--kube-scheduler-arg config=/etc/rancher/k3s/scheduler.yaml
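
Equivalently, since k3s can also read its flags from /etc/rancher/k3s/config.yaml, the same argument could be set declaratively (a sketch, assuming a standard k3s server install):

# /etc/rancher/k3s/config.yaml
kube-scheduler-arg:
  - "config=/etc/rancher/k3s/scheduler.yaml"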

Will you consider removing the requirement of the init-job.yaml and giving us the option to configure kube-scheduler ourselves? Perhaps you could have it default to running the init-job, and let us disable it if we configure the scheduler separately?
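
For illustration, such a toggle might look like this in values.yaml (the initJob.enabled key is hypothetical, not the chart's actual API; PR #105 is the real change):

# values.yaml (hypothetical key, for illustration only)
initJob:
  enabled: false   # skip the job that patches kube-scheduler.yaml on masters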


Seryta commented on May 10, 2024

I modified the parameters and successfully ran the k3s service. It is a single-master, two-node cluster.
open-local version: v0.3.3-dev

# After restarting the k3s service with --kube-scheduler-arg config, the k3s service restarts repeatedly
# But policy-config-file works fine

--kube-scheduler-arg policy-config-file=/etc/rancher/k3s/scheduler.json

scheduler.json:

{
  "kind": "Policy",
  "apiVersion": "v1",
  "extenders": [{
    "urlPrefix": "http://{{ .Values.extender.name }}.{{.Values.namespace}}:23000/scheduler",
    "filterVerb": "predicates",
    "prioritizeVerb": "priorities",
    "preemptVerb": "",
    "bindVerb": "",
    "weight": 10,
    "enableHttps": false,
    "nodeCacheCapable": true,
    "ignorable": true
  }],
  "hardPodAffinitySymmetricWeight": 10
}

❯ kubectl get nls -ojson master1 | jq .status.filteredStorageInfo
{
  "devices": [
    "/dev/vdf"
  ],
  "updateStatusInfo": {
    "lastUpdateTime": "2022-02-09T07:07:49Z",
    "updateStatus": "accepted"
  },
  "volumeGroups": [
    "open-local-pool-0"
  ]
}

❯ kubectl get nls -ojson node1 | jq .status.filteredStorageInfo
{
  "devices": [
    "/dev/vdf"
  ],
  "updateStatusInfo": {
    "lastUpdateTime": "2022-02-09T07:07:48Z",
    "updateStatus": "accepted"
  },
  "volumeGroups": [
    "open-local-pool-0"
  ]
}

❯ kubectl get nls -ojson node2 | jq .status.filteredStorageInfo
{
  "devices": [
    "/dev/vdf"
  ],
  "updateStatusInfo": {
    "lastUpdateTime": "2022-02-09T07:07:47Z",
    "updateStatus": "accepted"
  },
  "volumeGroups": [
    "open-local-pool-0"
  ]
}

However, the volumes cannot be created normally:

master1 [~]$ kubectl describe pvc -n demo pvc-open-local-lvm-open-local-lvm-0-d1
Name:          pvc-open-local-lvm-open-local-lvm-0-d1
Namespace:     demo
StorageClass:  open-local-lvm
Status:        Pending
Volume:        
Labels:        liveit100.com/app=qvm
               liveit100.com/pvc=open-local-lvm-0
Annotations:   <none>
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      
Access Modes:  
VolumeMode:    Filesystem
Used By:       <none>
Events:
  Type    Reason                Age                From                         Message
  ----    ------                ----               ----                         -------
  Normal  WaitForFirstConsumer  14s (x5 over 67s)  persistentvolume-controller  waiting for first consumer to be created before binding

master1 [~]$ kubectl describe pvc -n demo pvc-open-local-device-hdd-open-local-device-hdd-0-d1
Name:          pvc-open-local-device-hdd-open-local-device-hdd-0-d1
Namespace:     demo
StorageClass:  open-local-device-hdd
Status:        Pending
Volume:        
Labels:        liveit100.com/app=qvm
               liveit100.com/pvc=open-local-device-hdd-0
Annotations:   <none>
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      
Access Modes:  
VolumeMode:    Filesystem
Used By:       <none>
Events:
  Type    Reason                Age                From                         Message
  ----    ------                ----               ----                         -------
  Normal  WaitForFirstConsumer  8s (x8 over 101s)  persistentvolume-controller  waiting for first consumer to be created before binding


Seryta commented on May 10, 2024

But when I run kubectl apply -f sts-nginx.yaml, the pod starts successfully...

❯ kubectl get pod                                                
NAME          READY   STATUS    RESTARTS   AGE
nginx-lvm-0   1/1     Running   0          5m13s
❯ kubectl get pvc
NAME                    STATUS   VOLUME                                       CAPACITY   ACCESS MODES   STORAGECLASS     AGE
html-nginx-lvm-0        Bound    local-595ce706-4ecf-4689-a020-d47fe1a0b1f5   30Gi       RWO            open-local-lvm   5m17s

❯ kubectl describe pvc html-nginx-lvm-0                          
Name:          html-nginx-lvm-0
Namespace:     default
StorageClass:  open-local-lvm
Status:        Bound
Volume:        local-595ce706-4ecf-4689-a020-d47fe1a0b1f5
Labels:        app=nginx-lvm
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: local.csi.aliyun.com
               volume.kubernetes.io/selected-node: node2
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      30Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Used By:       nginx-lvm-0
Events:
  Type     Reason                 Age                    From                                                             Message
  ----     ------                 ----                   ----                                                             -------
  Normal   WaitForFirstConsumer   6m42s                  persistentvolume-controller                                      waiting for first consumer to be created before binding
  Warning  ProvisioningFailed     6m5s (x6 over 6m41s)   local.csi.aliyun.com_node2_38d1a19f-5a3a-48cf-8c73-d6da1f0bd18d  failed to provision volume with StorageClass "open-local-lvm": rpc error: code = Internal desc = Parse lvm part schedule info error: rpc error: code = InvalidArgument desc = lvm schedule with error Post "http://open-local-scheduler-extender:23000/apis/scheduling/default/persistentvolumeclaims/html-nginx-lvm-0?volumeType=LVM&nodeName=node2": dial tcp 10.43.38.189:23000: connect: connection refused
  Normal   ExternalProvisioning   5m47s (x6 over 6m42s)  persistentvolume-controller                                      waiting for a volume to be created, either by external provisioner "local.csi.aliyun.com" or manually created by system administrator
  Normal   Provisioning           5m33s (x7 over 6m42s)  local.csi.aliyun.com_node2_38d1a19f-5a3a-48cf-8c73-d6da1f0bd18d  External provisioner is provisioning volume for claim "default/html-nginx-lvm-0"
  Normal   ProvisioningSucceeded  5m33s                  local.csi.aliyun.com_node2_38d1a19f-5a3a-48cf-8c73-d6da1f0bd18d  Successfully provisioned volume local-595ce706-4ecf-4689-a020-d47fe1a0b1f5

Maybe I need some time to find out why.
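
The "connection refused" in the ProvisioningFailed event suggests the extender service had no ready endpoints at that moment. A quick check (assuming the extender runs in kube-system as in the configs above; the label selector is a guess and should be adjusted to the chart's labels):

kubectl -n kube-system get endpoints open-local-scheduler-extender
kubectl -n kube-system get pods -l app=open-local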


murkylife commented on May 10, 2024

Because the PVC is set to WaitForFirstConsumer, it has to be used/referenced by a pod before it gets created.
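
A minimal consumer that would trigger binding (a sketch; the pod name and image are made up, the claimName matches the Pending claim above):

apiVersion: v1
kind: Pod
metadata:
  name: pvc-consumer
  namespace: demo
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: pvc-open-local-lvm-open-local-lvm-0-d1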


Seryta commented on May 10, 2024

Because the PVC is set to WaitForFirstConsumer, it has to be used/referenced by a pod before it gets created.

Yes, you are right. However, the main reason I got stuck is that a previously installed service had a problem, which prevented the pod from starting normally, so the PVC kept waiting.

Now I have no problem here.
