
external-health-monitor's Introduction

Volume Health Monitor

The Volume Health Monitor is part of the Kubernetes implementation of the Container Storage Interface (CSI). It was introduced as an Alpha feature in Kubernetes v1.19. In Kubernetes 1.21, the feature went through a second Alpha due to a design change.

Overview

The Volume Health Monitor is implemented in two components: External Health Monitor Controller and Kubelet.

When this feature was first introduced in Kubernetes 1.19, there was an External Health Monitor Agent that monitored volume health from the node side. In the Kubernetes 1.21 release, the node-side volume health monitoring logic was moved to Kubelet to avoid duplicate CSI RPC calls.

  • External Health Monitor Controller:

    • The external health monitor controller will be deployed as a sidecar together with the CSI controller driver, similar to how the external-provisioner sidecar is deployed.
    • It triggers controller RPCs to check the health condition of the CSI volumes.
    • The external controller sidecar also watches for node failure events; this node-watcher component can be enabled via the enable-node-watcher flag.
  • Kubelet:

    • In addition to the volume stats it already collects, Kubelet will also check the volume's mounting condition from the same CSI node RPC and log events on Pods if the volume condition is abnormal.

The Volume Health Monitoring feature needs to invoke the following CSI interfaces:

  • External Health Monitor Controller:
    • ListVolumes (If both ListVolumes and ControllerGetVolume are supported, ListVolumes will be used)
    • ControllerGetVolume
  • Kubelet:
    • NodeGetVolumeStats
    • This feature in Kubelet is controlled by an Alpha feature gate CSIVolumeHealth.

Compatibility

This information reflects the head of this branch.

Compatible with CSI Version | Container Image
CSI Spec v1.3.0 | registry.k8s.io/sig-storage/csi-external-health-monitor-controller

Driver Support

Currently, the CSI volume health monitoring interfaces are only implemented in the Mock Driver and the CSI Hostpath driver.

Usage

The External Health Monitor Controller needs to be deployed together with the CSI driver.

The Alpha feature gate CSIVolumeHealth needs to be enabled for the node-side monitoring to take effect.
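For example, the gate can be set via the kubelet flag below on each node (a sketch; how kubelet flags are configured depends on how your cluster is provisioned):

kubelet --feature-gates=CSIVolumeHealth=true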

Build && Push Image

You can run the command below in the root directory of the project.

make container GOFLAGS_VENDOR=$( [ -d vendor ] && echo '-mod=vendor' )

And then, you can tag and push the csi-external-health-monitor-controller image to your own image repository.

docker tag csi-external-health-monitor-controller:latest <custom-image-repo-addr>/csi-external-health-monitor-controller:<custom-image-tag>
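Then push the image to that repository (same placeholder address and tag as above):

docker push <custom-image-repo-addr>/csi-external-health-monitor-controller:<custom-image-tag>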

External Health Monitor Controller

cd external-health-monitor
kubectl create -f deploy/kubernetes/external-health-monitor-controller

You can run the kubectl get pods command to confirm that the controller is deployed on your cluster successfully.
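For example (the pod name prefix below is an assumption based on the deployment name; adjust it to your manifests):

kubectl get pods | grep external-health-monitor-controller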

Check logs of external health monitor controller as follows:

  • kubectl logs <leader-of-external-health-monitor-controller-container-name> -c csi-external-health-monitor-controller

Check for events on PVCs or Pods that report an abnormal volume condition when the volume you are using becomes unhealthy.
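For example, to look for such events on a PVC (the PVC name is a placeholder):

kubectl describe pvc <pvc-name>
kubectl get events --field-selector involvedObject.kind=PersistentVolumeClaim,involvedObject.name=<pvc-name>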

csi-external-health-monitor-controller-sidecar-command-line-options

Important optional arguments that are highly recommended to be used

  • leader-election: Enables leader election. This is useful when there are multiple replicas of the same external-health-monitor-controller running for one CSI driver. Only one of them may be active (=leader). A new leader will be re-elected when the current leader dies or becomes unresponsive for ~15 seconds.

  • leader-election-namespace <namespace>: The namespace where the leader election resource exists. Defaults to the pod namespace if not set.

  • leader-election-lease-duration <duration>: Duration, in seconds, that non-leader candidates will wait to force acquire leadership. Defaults to 15 seconds.

  • leader-election-renew-deadline <duration>: Duration, in seconds, that the acting leader will retry refreshing leadership before giving up. Defaults to 10 seconds.

  • leader-election-retry-period <duration>: Duration, in seconds, the LeaderElector clients should wait between tries of actions. Defaults to 5 seconds.

  • http-endpoint: The TCP network address where the HTTP server for diagnostics, including metrics and leader election health check, will listen (example: :8080 which corresponds to port 8080 on local host). The default is empty string, which means the server is disabled.

  • metrics-path: The HTTP path where prometheus metrics will be exposed. Default is /metrics.

  • worker-threads: Number of worker threads for running volume checker when CSI Driver supports ControllerGetVolume, but not ListVolumes. The default value is 10.
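As a sketch, the recommended options above might be combined in the sidecar container's invocation like this (all values are illustrative, not prescriptive):

csi-external-health-monitor-controller \
  --csi-address=/run/csi/socket \
  --leader-election \
  --leader-election-namespace=kube-system \
  --http-endpoint=:8080 \
  --metrics-path=/metrics \
  --worker-threads=10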

Other recognized arguments

  • kubeconfig <path>: Path to Kubernetes client configuration that the external-health-monitor-controller uses to connect to the Kubernetes API server. When omitted, default token provided by Kubernetes will be used. This option is useful only when the external-health-monitor-controller does not run as a Kubernetes pod, e.g. for debugging.

  • resync <duration>: Internal resync interval when the monitor controller re-evaluates all existing resource objects that it was watching and tries to fulfill them. It does not affect retries of failed calls! It should be used only when there is a bug in the Kubernetes watch logic. The default is ten minutes.

  • csi-address <path-to-csi>: This is the path to the CSI driver socket inside the pod that the external-health-monitor-controller container will use to issue CSI operations (/run/csi/socket is used by default).

  • version: Prints the current version of external-health-monitor-controller.

  • timeout <duration>: Timeout of all calls to the CSI driver. It should be set to a value that accommodates the majority of ListVolumes and ControllerGetVolume calls. 15 seconds is used by default.

  • list-volumes-interval <duration>: Interval for monitoring volume health condition by invoking the ListVolumes RPC. You can adjust it to change the frequency of the evaluation process. Five minutes by default if not set.

  • enable-node-watcher <boolean>: Enable node-watcher. node-watcher evaluates volume health condition by checking node status periodically.

  • monitor-interval <duration>: Interval of monitoring volume health condition when CSI Driver supports ControllerGetVolume, but not ListVolumes. It is also used by nodeWatcher. You can adjust it to change the frequency of the evaluation process. One minute by default if not set.

  • volume-list-add-interval <duration>: Interval of listing volumes and adding them to the queue when CSI driver supports ControllerGetVolume, but not ListVolumes.

  • node-list-add-interval <duration>: Interval of listing nodes and adding them to the queue. It is used together with monitor-interval and enable-node-watcher by nodeWatcher.

  • metrics-address: (deprecated) The TCP network address where the Prometheus metrics endpoint will run (example: :8080, which corresponds to port 8080 on local host). The default is the empty string, which means the metrics and leader election check endpoint is disabled.
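For example, when debugging outside of a Kubernetes pod, the sidecar could be started with an explicit kubeconfig using the options above (a sketch; the binary path and values are illustrative):

./csi-external-health-monitor-controller \
  --kubeconfig=$HOME/.kube/config \
  --csi-address=/run/csi/socket \
  --monitor-interval=1m \
  --timeout=15s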

Community, discussion, contribution, and support

Learn how to engage with the Kubernetes community on the community page.

You can reach the maintainers of this project at:

Code of conduct

Participation in the Kubernetes community is governed by the Kubernetes Code of Conduct.

external-health-monitor's People

Contributors

amolmote, andyzhangx, animeshk08, bells17, chrishenzie, cyb70289, ddebroy, dependabot[bot], fengzixu, humblec, ialidzhikov, jsafrane, k8s-ci-robot, kazimsarikaya, mauriciopoppe, mowangdk, msau42, mucahitkurt, namrata-ibm, nickrenren, nikhita, pohly, raunakshah, saad-ali, sneha-at, spiffxp, sunnylovestiramisu, windayski, xing-yang, zhucan


external-health-monitor's Issues

Investigate the "similar event collapsing" feature on the event broadcaster

There is a monitor interval for the controller and one for the agent to control how often
to check the volume health. It is configurable, with 1 minute as the default. We will consider
changing the default to 5 minutes to avoid overloading the K8s API server.

When scaled out across many nodes, even low-frequency checks can produce a high volume of
events. To control this, we should use options on the event recorder to control QPS per key.
That way we can collapse similar events and keep a slow update cadence per key.

Switch from k8s.gcr.io to registry.k8s.io

From kubernetes-announce:
"On the 3rd of April 2023, the old registry k8s.gcr.io will be frozen and no further images for Kubernetes and related subprojects will be pushed to the old registry.

This registry registry.k8s.io replaced the old one and has been generally available for several months. We have published a blog post about its benefits to the community and the Kubernetes project."

Add command line arguments to configure leader election options in external-health-monitor-controller

The leader election package specifies 3 variables related to the time intervals for acquiring and renewing leases. We should make these options configurable in external-health-monitor-controller. See the detailed description in the following issue:
kubernetes-csi/external-provisioner#556

See a PR that addressed this:
kubernetes-csi/external-provisioner#643

We need to make the changes here:
https://github.com/kubernetes-csi/external-health-monitor/blob/master/cmd/csi-external-health-monitor-controller/main.go

Release external-health-monitor to v0.1.1

Reason

When I integrated the external-health-monitor with the host-path driver, I found that its release script fetches the RBAC YAML files of CSI components based on the image tag. Because external-health-monitor has been released as v0.1.0, it will take https://raw.githubusercontent.com/kubernetes-csi/external-health-monitor/v0.1.0/deploy/kubernetes/external-health-monitor-controller/rbac.yaml

But there is a bug in the RBAC file of external-health-monitor-controller. It was fixed in this commit: 78fb385#diff-4f92853217f76a2e31d2f601fcf8a532c199a96999ed40318cb2c6a93c7d9604L38

So I want to do a new release to resolve the issue I encountered.

include additional context in NodeGetVolumeStats method to help volume_condition logic

I'd love to see the readonly attribute of the publish request passed into this method. A specific use case I'm addressing: when an iSCSI connection drops and then comes back online, the mount becomes ro instead of rw. However, without knowing whether it is supposed to be ro, I can't say for certain whether it's unhealthy or not.

In my case I support staging, so on the same node it's theoretically possible (at least from a csi perspective...maybe not k8s) to have the same volume published in one case ro and in another case rw. I support this by staging rw and then doing a bind mount with ro if appropriate. I'm not sure if the stats method gets called on the staged target or the published target, but that could make the logic more complex.

Thanks!

When using a Block PV, staging_target_path is wrong

My k8s version is 1.20.11, so I still need to deploy the csi-external-health-monitor-agent service.

When I use block type PV, the staging_target_path address provided by kubelet is /var/lib/kubelet/plugins/kubernetes.io/csi/volumeDevices/staging/pvc-9fb798fe-7e26-4488-85c4-4c8e660f7351

The address provided by csi-external-health-monitor-agent during monitoring is /var/lib/kubelet/plugins/kubernetes.io/csi/volumeDevices/staging/pvc-9fb798fe-7e26-4488-85c4-4c8e660f7351

Update README with support matrix and flags

We need to update README with the following information:

Add a compatibility matrix as follows:
https://github.com/kubernetes-csi/external-snapshotter#compatibility

Compatible with CSI Version | Container Image | Min K8s Version | Recommended K8s Version
CSI Spec v1.3.0 | k8s.gcr.io/sig-storage/csi-external-health-monitor-controller | 1.19 | 1.19
CSI Spec v1.3.0 | k8s.gcr.io/sig-storage/csi-external-health-monitor-agent | 1.19 | 1.19

Document all the flags as follows:
https://github.com/kubernetes-csi/external-snapshotter#snapshot-controller-command-line-options
https://github.com/kubernetes-csi/external-snapshotter#csi-external-snapshotter-sidecar-command-line-options
We should have sections for csi-external-health-monitor-controller-sidecar-command-line-options and csi-external-health-monitor-agent-sidecar-command-line-options.

"ControllerGetVolume" reports "Not Found" is invalid with local storage backend

Reproduce:

  1. Deploy the external-provisioner in leader-election mode with replicas >= 2.
  2. Disable the "ControllerListVolume" capability in the hostpath driver.
  3. Deploy the hostpath driver as a DaemonSet.
  4. Use "ControllerGetVolume" to get the volume's status.
  5. Deploy the health monitor sidecar in leader-election mode.
  6. Create a PVC on local storage.

Fix:

  1. If the backend storage is local, we should filter the PVs by node name and only monitor the PVs on the same node as the sidecar.

Suggestion:

  1. If the backend storage is a local disk, then regardless of whether "ControllerGetVolume" or "ControllerListVolume" is used, the sidecar should only monitor the local volumes on its own node.

CC @xing-yang

Record an event when an error occurs in the health monitoring controller and agent

Currently, an event will be recorded on the PVC/Pod when the controller/agent has successfully retrieved an abnormal volume condition from the storage system. However, when other errors occur in the controller/agent, those errors are logged but not recorded as events. Before moving to beta, the controller/agent should be modified to record an event when such errors occur.

PV health monitoring does not work after the PV status changes

What happened?

When the state of a PV goes from Bound -> Available -> Bound, the health-monitor sidecar does not continue monitoring the PV's health.

What did you expect to happen?

PV health monitoring should always work while the PV status is Bound.

How can we reproduce it (as minimally and precisely as possible)?

  1. external-health-monitor works with a CSI plugin that does not support ListVolumes but supports ControllerGetVolume.
  2. Create a PVC; the PV is provisioned successfully, with reclaimPolicy=Retain in the StorageClass.
  3. In some cases you need to delete the PVC and delete spec.claimRef in the PV; the PV status then changes to Available.
  4. Re-create the deleted PVC; the PVC status changes to Bound.
  5. Mock the volume as unhealthy; the external-health-monitor does not emit an abnormal event.

This case exists in practice: for example, when resizing the volume capacity fails, you need to delete the PVC and create it again to restore the status of the PVC and PV.

Anything else we need to know?

In pkg/controller/pv_monitor_controller.go:229, the controller gets a PV from ctrl.pvQueue to check. When pv.Status.Phase is not Bound, ctrl.pvEnqueued[pv] remains true, so the PV is never re-enqueued into ctrl.pvQueue.

Environment versions

kubernetes is v1.28.0
external-health-monitor is v0.9.0

/kind bug
/assign

Only watch Pods and Nodes when Node Watcher is enabled

In the external-health-monitor-controller, we always watch all PVCs, Pods, Nodes:
https://github.com/kubernetes-csi/external-health-monitor/blob/v0.3.0/cmd/csi-external-health-monitor-controller/main.go#L210

We only need to watch them to support the Node Watcher functionality, which is disabled by default. Watching them unconditionally has caused scalability problems.
https://github.com/kubernetes-csi/external-health-monitor/blob/v0.3.0/cmd/csi-external-health-monitor-controller/main.go#L66

We should change the code to only watch Pods and Nodes when the Node Watcher component is enabled.

kubernetes/kubernetes#102452 (comment)

expose volume health abnormalities as prometheus metrics

We are planning to implement the volume health interface within our CSI plugins. Since the volume abnormality signals are exposed as events on PVCs (from the controller) or Pods (from the node agent), they can't be exposed as Prometheus metrics, because kube-state-metrics doesn't support exporting k8s event resources due to high cardinality (there is no mention of events in its documentation). Adding PV failure metrics to Prometheus is essential for setting up alerts in a multi-tenant k8s cluster.

Is it possible to expose these volume health abnormalities as Prometheus metrics from the external-health-monitor controller and the node agent itself?

Initial Thread:
https://kubernetes.slack.com/archives/C09QZFCE5/p1620665651243600

don't requeue notfound volumes

Hello,

The requeue at https://github.com/kubernetes-csi/external-health-monitor/blob/master/pkg/agent/pv_monitor_agent.go#L209 is incorrect: if the error returned at https://github.com/kubernetes-csi/external-health-monitor/blob/master/pkg/agent/pv_monitor_agent.go#L204 is not nil and is NotFound, we should not re-queue the check.

Edit: Why?
When a CronJob or Job completes, the pod still exists on the node but its volumes have been unpublished. The code still tries to get stats for the completed pods, even though no volume is mounted.

worker threads mechanism does not allow for small number of volumes

The external-health-monitor's worker-thread mechanism (10 threads by default) is blind to the number of volumes. If there are fewer than 10 volumes, there is a burst of repeated probes of the same volume: if there is only one volume, there is a rapid-fire succession of 10 probes of that same volume every minute.

This is not a high priority issue.
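Given the behavior described above, one possible mitigation (an assumption, not a documented fix) is to size the worker pool to the expected number of volumes, for example:

csi-external-health-monitor-controller --worker-threads=1 ...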

VolumePath for Block PV is incorrect

When using a Block PV, this error occurs:

E0901 10:33:25.153213       1 server.go:110] GRPC error: rpc error: code = NotFound
desc = Could not get file information from /var/lib/kubelet/pods/a38a090d-deb6-457b-be20-dd3d5be90fad/volumes/kubernetes.io~csi/pvc-16d05a9d-d67a-45d5-8ac6-7c06778a9150/mount:
stat /var/lib/kubelet/pods/a38a090d-deb6-457b-be20-dd3d5be90fad/volumes/kubernetes.io~csi/pvc-16d05a9d-d67a-45d5-8ac6-7c06778a9150/mount: no such file or directory

This is caused by CheckNodeVolumeStatus only considering filesystem PVs; a Block PV has a different path.

Broken link to the `contributor cheat sheet` needs to be fixed

Bug Report

I have observed the same kind of issue in various kubernetes-csi projects.
It happens because, after localization, too many modifications were made in the various directories.
I have observed the same issue on this page as well.

It has one broken link to the contributor cheat sheet, which needs to be fixed.
I will look into further CSI repos as well and try to fix it as soon as I can.

/kind bug
/assign
