Comments (17)

tzifudzi commented on June 2, 2024

I took a further look at the code, and I'd like to add an additional observation.

Based on the current implementation, the NodeStageVolume and NodeUnstageVolume methods (in nodeserver.go) use a locking mechanism to ensure that only one operation runs at a time for a given volume. Looking more closely at the NodePublishVolume method, and given that the two operations are carried out very close together in time, could this issue be mitigated by applying a similar mutex lock to the NodePublishVolume operation?
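
For illustration, the kind of per-volume locking described above could look roughly like the following. This is a minimal sketch, not the driver's actual code: the `volumeLocks` type, its methods, and the `publish` wrapper are hypothetical names introduced only for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// volumeLocks tracks volumes that currently have an operation in flight.
// Hypothetical helper for illustration; not the driver's actual type.
type volumeLocks struct {
	mu    sync.Mutex
	inUse map[string]struct{}
}

func newVolumeLocks() *volumeLocks {
	return &volumeLocks{inUse: make(map[string]struct{})}
}

// tryAcquire returns false if another operation already holds the volume.
func (l *volumeLocks) tryAcquire(volumeID string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if _, busy := l.inUse[volumeID]; busy {
		return false
	}
	l.inUse[volumeID] = struct{}{}
	return true
}

// release marks the volume as free again.
func (l *volumeLocks) release(volumeID string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	delete(l.inUse, volumeID)
}

// publish stands in for a NodePublishVolume-style handler guarded by the lock.
// The mount callback is a placeholder for the real mount work.
func publish(l *volumeLocks, volumeID string, mount func() error) error {
	if !l.tryAcquire(volumeID) {
		return fmt.Errorf("an operation for volume %q is already in progress", volumeID)
	}
	defer l.release(volumeID)
	return mount()
}

func main() {
	locks := newVolumeLocks()
	err := publish(locks, "vol-1", func() error {
		// A nested call for the same volume is rejected while the first holds the lock.
		return publish(locks, "vol-1", func() error { return nil })
	})
	fmt.Println(err)
}
```

A lock like this only rejects operations that overlap in time. It would not change anything if the first publish had already finished before the second one started.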

from csi-driver-smb.

andyzhangx commented on June 2, 2024

There is a retry if NodePublishVolume fails the first time. Does the retry work?

tzifudzi commented on June 2, 2024

@andyzhangx Because the issue is difficult to reproduce, it's hard to say for certain, but I will try to look deeper into the logs to determine whether a retry is attempted and whether it's successful.

For now, what I can say is that it appears a retry is not attempted. The SMB CSI driver tries to recover by unmounting and remounting, but that fails because of an access denied error, so the binding of the volume for the pod fails and the entire container creation fails.

andyzhangx commented on June 2, 2024

@tzifudzi are the two NodePublishVolume operations for the same pod?

At 16:00:27.749206, the symlink c:\var\lib\kubelet\pods\aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaaa\volumes\kubernetes.io~csi\something-pv-csi\mount is created, while at 16:00:27.857327, ReadDir on that path returns an "Access is denied" error. That means that after the symlink is created, the SMB file share access is not set up immediately. Not sure why.

I1208 16:00:27.749206    1664 safe_mounter_windows.go:175] Mount: old name: \var\lib\kubelet\plugins\kubernetes.io\csi\smb.csi.k8s.io\bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb\globalmount. new name: c:\var\lib\kubelet\pods\aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaaa\volumes\kubernetes.io~csi\something-pv-csi\mount
W1208 16:00:27.857327    1664 nodeserver.go:400] ReadDir c:\var\lib\kubelet\pods\aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaaa\volumes\kubernetes.io~csi\something-pv-csi\mount failed with open c:\var\lib\kubelet\pods\aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaaa\volumes\kubernetes.io~csi\something-pv-csi\mount: Access is denied., unmount this directory

tzifudzi commented on June 2, 2024

@andyzhangx Thank you for the prompt response.

> the two NodePublishVolume operations are for the same pod?

  • I can say with certainty that the operations are indeed for the same pod. They have the same pod ID and pod name.

> that means after symlink is created, the smb file share access is not set up immediately, not sure why

  • Perhaps the symlink is not ready by the time the second call arrives. I am not certain, but my concern is that the short interval between these operations might not allow enough time for the subsequent call to execute correctly. Implementing some form of safety check could possibly prevent this scenario.

vinayaksakharkar commented on June 2, 2024

@andyzhangx are you planning an official release for the change below soon?
#720 [from andyzhangx/reduce-mount-lock]

andyzhangx commented on June 2, 2024

> @andyzhangx are you planning official release for the below change soon ? #720 [from andyzhangx/reduce-mount-lock]

@vinayaksakharkar I will cut a new release in the middle of this month.

vinayaksakharkar commented on June 2, 2024

> > @andyzhangx are you planning official release for the below change soon ? #720 [from andyzhangx/reduce-mount-lock]
>
> @vinayaksakharkar I will cut a new release in the middle of this month.

Thank you @andyzhangx

Chil49 commented on June 2, 2024

@tzifudzi
I might have a similar issue to yours. I am seeing access denied while the pod is trying to start. The SMB controller logs show that the node publish volume globalmount was successfully mounted, but the pod fails saying "failed to create containerd task failed to create shim task failed to eval symlinks for mount source \\foldera\golderb\folderc\folderd access denied".

Can you please suggest something or did you ever manage to find a workaround?

vinayaksakharkar commented on June 2, 2024

@Chil49 We haven't tried the driver with the changes made by @andyzhangx on the master branch. We are waiting for @andyzhangx to cut a release in the middle of this month. In the meantime, as a workaround, you can mount the SMB share on the node and use a host path to mount it inside the pod, if that helps.

tzifudzi commented on June 2, 2024

@Chil49, the access denied error you're encountering does suggest a similarity to the issue I experienced. However, your specific error message differs from mine. While I haven't found the root cause of the issue, I suspect that implementing a concurrency lock might help mitigate it. As a temporary workaround, I've found that deleting and then relaunching the pod can resolve the issue. The problem appears to be transient, occurring roughly once in 500 runs, and a retry usually succeeds.

@andyzhangx not sure how you prefer to move forward, but I suggest leaving the issue open so people can upvote if they encounter similar errors. I would have tried to help and make the changes, but I don't have the free time to root cause and resolve the issue.

andyzhangx commented on June 2, 2024

Per #721 (comment), I don't think adding a lock would help, since the logs above show that the first mount succeeded, and then the second mount started.

tzifudzi commented on June 2, 2024

Got it @andyzhangx. I can't provide any further troubleshooting information at this point, nor do I have time to experiment with any code changes that could resolve this. So I will leave it up to you regarding whether the issue should remain open or it can be closed.

vinayaksakharkar commented on June 2, 2024

@andyzhangx can you please cut the release in the middle of the month, as per your previous comment? We are interested in testing it.
We are facing this issue in our environment. @tzifudzi helped us raise this issue.

vinayaksakharkar commented on June 2, 2024

@andyzhangx Are you still planning to cut a release with the code changes you made for this issue, as per your previous comment? You mentioned the middle of this month.

andyzhangx commented on June 2, 2024

> @andyzhangx Are you still planning cut release for code changes you made it for this issue as per your previous comment. You did mention in the middle of this month.

Yes, we are waiting for this PR to merge: kubernetes/k8s.io#6282. After that, the new image will be ready.

andyzhangx commented on June 2, 2024

v1.14.0 is already released
