Giter VIP home page Giter VIP logo

Comments (31)

marcquark avatar marcquark commented on May 24, 2024 1

fwiw, it happened again a few days ago. Is there a way to enable verbose/debug logging?

from directory-connector.

marcquark avatar marcquark commented on May 24, 2024 1

@cscharf apologies for the delayed response. I sent you some details via Email.

Regarding the creation time, we're out of luck. It's after the modify time, i assume that's because the file was copied/moved to another location. We're going to provide the timestamps when it happens the next time around

from directory-connector.

marcquark avatar marcquark commented on May 24, 2024 1
Inode: 2359304   Type: regular    Mode:  0664   Flags: 0x80000
Generation: 2610896047    Version: 0x00000000:00000001
User:  1000   Group:  1000   Project:     0   Size: 33
File ACL: 0
Links: 1   Blockcount: 8
Fragment:  Address: 0    Number: 0    Size: 0
 ctime: 0x5fd18faa:a2182ff8 -- Thu Dec 10 04:02:02 2020
 atime: 0x5fd18faa:c9320eb0 -- Thu Dec 10 04:02:02 2020
 mtime: 0x5fd18faa:a2182ff8 -- Thu Dec 10 04:02:02 2020
crtime: 0x5f6c7438:1681dd58 -- Thu Sep 24 12:26:00 2020
Size of extra inode fields: 32
Inode checksum: 0xeeb33b88
EXTENTS:
(0):5902345

Sent the log file to your support team

from directory-connector.

marcquark avatar marcquark commented on May 24, 2024 1

looks like the release fixed this issue. thanks!

from directory-connector.

kspearrin avatar kspearrin commented on May 24, 2024

Is there something on the system that runs at 0:30AM? Does bwdc run then?

from directory-connector.

marcquark avatar marcquark commented on May 24, 2024

Nope, no other cronjobs or backup jobs.
There's a bwdc cronjob every 15 minutes, so yes, it does run at 0:30AM

from directory-connector.

marcquark avatar marcquark commented on May 24, 2024

happened again on the 30th. timestamp of the empty data.json is 8:15 AM local time. it only contains this

{
  "installedVersion": "2.6.2"
}

from directory-connector.

kspearrin avatar kspearrin commented on May 24, 2024

Can you try the latest version that was just released (2.7.0) and see if the issue persists?

from directory-connector.

marcquark avatar marcquark commented on May 24, 2024

Installed v2.7.0 yesterday. Will report back if it happens again. It's seemingly random though :/ I guess if it doesn't come up in the next 2 months we can assume it's fixed.

from directory-connector.

kspearrin avatar kspearrin commented on May 24, 2024

Hey @marcquark , sorry for the delayed response here.

Does it reset to this same value every time?

  "installedVersion": "2.6.2"
}

Or did the version change with this last reset since upgrading last time?

from directory-connector.

marcquark avatar marcquark commented on May 24, 2024

@kspearrin i didn't have a backup of the last occurrence, so couldn't check. It happened again today and the content of the new file was just

{
  "installedVersion": "2.7.0"
}

from directory-connector.

cscharf avatar cscharf commented on May 24, 2024

Hi @marcquark , was that timestamp similar to the prior ones, at 0:30 AM local time (or what is the last modified timestamp on that file)? Are the created/modified timestamps the same (trying to determine if the file was overwritten OR modified/written to)? I'm setting up a test environment now and taking a look at the BWDC code to see what mechanisms, if any, I can track down that may be at the root cause.

from directory-connector.

marcquark avatar marcquark commented on May 24, 2024

@cscharf it looks like this time it occurred more or less in the middle of the day - at least the last modified date shows 10:44 AM. however that could be misleading. i'm going to double check with my colleague who observed the bug the last time around whether he made any changes to the file besides moving it to another folder. i assume the answer to be "no" though

stat unfortunately doesn't give me a "Birth" date of the file, it just shows a dash. any advice on how to get the creation date on an ext4 filesystem? still on Ubuntu 18.04

i'll make sure to keep an eye on the exact date(s) next time the bug surfaces. also should soon find time to update to the most recent release.

from directory-connector.

cscharf avatar cscharf commented on May 24, 2024

Thanks @marcquark, I appreciate it. I've got an Ubuntu 18.04 VM setup to try to force re-creation of this while I look into some other things. I'm not a whiz-bang-pro at Linux in general but I'll see what I can find as far as finding a creation date on an ext4 filesystem.

from directory-connector.

cscharf avatar cscharf commented on May 24, 2024

@marcquark , apparently in ext4 you are able to get file creation time: https://tecadmin.net/file-creation-time-linux/

from directory-connector.

cscharf avatar cscharf commented on May 24, 2024

Also, wondering if you could share your chron job script (scrubbed/redacted) either here or if you could email it to me as a text file at cscharf at bitwarden I would appreciate it. I'm having a hard time recreating this and I know it's happening at random and while I have a theory of "what" is happening, the "how" is where I think more detail or operational context may help.

from directory-connector.

cscharf avatar cscharf commented on May 24, 2024

@marcquark , I've not been able to reproduce this however I'll be testing another theory tomorrow with some new info. Because I believe this is due to some odd file access issue where the lowdb storage JSON file that's getting reset is being created fresh as if the file didn't exist... perhaps the cron job itself is running longer than 15 minutes and the next cron scheduled kicks off and there's a file access collision (which would explain the randomness of this whole thing), and perhaps the "midnight"ish reset might coincide with some other system updates on the directory where some latency would cause the job to run longer? These are guesses, but I need to get a large directory or something setup tomorrow where I can simulate this scenario to see if file locks or other access issues might be at play where I can test various collision scenarios with an automated cron setup.

In the meantime if you could check your cron or crontab logs/output to see if during those offending times when the file got wiped if there was a job start/end time overlapping each other that would support this theory (before I put in the time to setup those types of experiments).

from directory-connector.

marcquark avatar marcquark commented on May 24, 2024

Hi Chad,

i already checked all the surroundings - ultimately i cannot tell for sure what's going on, and whether or not the problem is related to conflicting runs of the cronjob. By the looks of everything though, it's not. I would expect the behaviour to surface more frequently when the cronjob runs every 1m instead of every 15m. But i already tested that, nothing really changed.

So at the moment there's still a lot of guesswork and uncertainty, and like you say the effort to test a hypothesis is pretty big. IMHO a debug log, ideally with timestamps, is much needed and would most likely help us get to the root of the problem quicker. I'm fine waiting for a bit. So if you see a chance to improve the logging capabilities in an upcoming release, please do prioritize that over solving this particular issue. Once we have better logs i can collect those along with the exact timestamps of the issue appearing...

from directory-connector.

cscharf avatar cscharf commented on May 24, 2024

Excellent and reasonable request, thanks Marc, we can absolutely do that.

from directory-connector.

cscharf avatar cscharf commented on May 24, 2024

Okay, so there is more verbose logging which can be accessed by using the following environment variable: BITWARDENCLI_CONNECTOR_DEBUG=true (pretty self-explanatory, however will return INFO+ level logging, not DEBUG). This will enable output for things primarily done during the directory sync itself, i.e.:

  • "Group search: {path} => {filter}"
  • "Deleted user search: {deletedPath} => {deletedFilter}"
  • "User search: {path} => {filter}"

I dont' see any calls at the DEBUG level, so it should be fine without those, although if you really wanted it (not recommended for prod) you can use the environment variable, BWCLI_ENV=development.

From what I see, there isn't a lot of INFO level logging wired in, so I'll be adding that where it seems to make sense.

from directory-connector.

cscharf avatar cscharf commented on May 24, 2024

If there are any actions or visibility you would like outside of this particular issue, please let me know what/where and I'll make sure to stick it in there really quick while I'm working on this overall.

from directory-connector.

cscharf avatar cscharf commented on May 24, 2024

@marcquark , you should be able to update to the latest version of the bwdc CLI to get the additional logging around the data.json file. This should help pinpoint any issues leading to the original issue. Please let me know if you hit any snags or have any questions.

from directory-connector.

cscharf avatar cscharf commented on May 24, 2024

Make that v2.8.1 which is not yet fully released, my bad, should be able to get it out tomorrow.

from directory-connector.

marcquark avatar marcquark commented on May 24, 2024

@cscharf installed v2.8.1 and wrapped it in a script that adds timestamps to each line. will wait for the error to resurface and report back. thank you for adding this so quickly :-)

from directory-connector.

cscharf avatar cscharf commented on May 24, 2024

Thanks @marcquark , that's actually super helpful, at least now it would seem this is occurring somehow on process.exit(1), where we'll just need to look into the circumstances around that occurrence but more importantly how that's reacting with the lowdb data file when the process is immediately exited. Will start looking into this more.

from directory-connector.

MGibson1 avatar MGibson1 commented on May 24, 2024

@marcquark

I've been trying to force a lowdb collision to hopefully wipe out a test database instance and thus far I've been unsuccessful.

I've started thinking about weird systems issues that may be happening. Does the machine you're running bwdc from use a network attached drive or a separate, perhaps transient partition for the bwdc app directory?

from directory-connector.

marcquark avatar marcquark commented on May 24, 2024

@MGibson1 unfortunately none of that is the case. The setup is very basic, it's a KVM VM with a single qcow2 virtual disk. Ubuntu installation, docker CE and a straightforward default Bitwarden installation. I've created another unprivileged user "bwdc" which runs bwdc. It's all on the root partition.

I tried correlating the times bwdc failed with other things going on in the infrastructure already (network, storage, hypervisors). No luck, i can't find anything interesting going on. The weird thing is it sometimes happens in quick succession, and then it runs fine for weeks or months.

One last thing that springs to mind that i didn't check is whether it could correlate to whenever Bitwarden does its database backups.
Another thing worth trying might be to throttle the IO on your testing system artificially. On KVM it's very easy to limit the maximum IOPS of a given virtual disk. That's a nice and easy way to simulate poor performance scenarios. Maybe tone it down far enough for BWDC to barely work anymore, then hammer away at it by launching it a couple hundred/thousand times in quick succession with a script?

from directory-connector.

MGibson1 avatar MGibson1 commented on May 24, 2024

that's a good idea, I'll see what I can do to slow my system down and try to simulate a low-resource scenario. Those db backups can be pretty intensive so if those correlate there may be some heavy demand on the disk.

from directory-connector.

MGibson1 avatar MGibson1 commented on May 24, 2024

I was still unable to reproduce this despite manipulating my VMs IO and CPU usage.

My best guess right now is the same as @cscharf's that the cron processes are colliding with each other and causing lowdb to wipe out its own database. There's some chatter about this in a lowdb issue here: typicode/lowdb#277.

As an attempt to solve, the above PRs lock the lowdb file for directory connector whenever the database is accessed. There is debug-level logging around locking which may be helpful to ID exactly when the file is cleared if it happens again.

What I hope will happen is the file will be locked and the colliding application will politely wait its turn. There is a logarithmic wait associated with this that should time out after about 1 min and result in a terminal Error being thrown. If that happens, we'll get some stderr output message like failed to get lockfile. If that happens the db should be preserved since it's thrown prior to accessing the file.

Since I couldn't reproduce all of this is theoretical, so I'm going to keep this issue open in case it happens again.

A release with these changes should be out this week.

from directory-connector.

cscharf avatar cscharf commented on May 24, 2024

@marcquark , v2.9.0 is live which includes Matt's changes described above.

from directory-connector.

marcquark avatar marcquark commented on May 24, 2024

thanks guys! i've installed 2.9.0 now. let's see how it goes. it's a pity that it sometimes runs for weeks or even months without issues, sometimes it happens after a few days.

from directory-connector.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.