
Comments (9)

otoolep commented on June 29, 2024

How would that get influenced by daylight saving time/ summer time? It's UTC anyways I guess?

Yeah, would be UTC. Need to update the code actually.


otoolep commented on June 29, 2024

a777e1b


otoolep commented on June 29, 2024

I think that naming convention (or something very similar) makes sense.

One question for you, @peter-fe -- if you look at the Restore functionality (see https://rqlite.io/docs/guides/backup/#restoring-from-cloud-storage), it requires an explicit path name. Would you expect this to change somehow too, i.e. to pull down the latest version if there are multiple files with a timestamped name?

In other words how would your "restore" workflow operate if you enable "dynamic" naming?


peter-fe commented on June 29, 2024

Hi @otoolep,

Thank you for the quick response!
Good question, and maybe there's no perfect answer I can offer, but here are some options we can discuss.

Option 1

Technically, to not break current behaviour, using the last backup would work.

In that case I would completely ignore the dynamic part, assuming it could potentially be more generic than a timestamp, like a date, epoch, counter, ...
So I'd leave out the dynamic part and opt for the 'Last-Modified' object metadata if possible.

e.g.:
mypath/mydatabase_{{datetime}}.sqlite.gz => would become mypath/mydatabase_*.sqlite.gz for listing and picking the latest-modified object.
This way there would be no need to rely on the dynamic part for sorting.
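To make Option 1 concrete, here is a rough sketch of the listing step using the aws-sdk-go-v2 S3 client; the prefix stands in for the static part of the configured path, and the whole thing is illustrative only, not rqlite's actual code:

package backup

import (
	"context"
	"fmt"
	"time"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

// latestBackupKey lists objects under prefix (e.g. "mypath/mydatabase_") and
// returns the key with the most recent Last-Modified time, so the dynamic part
// of the name never has to be parsed or sorted.
func latestBackupKey(ctx context.Context, client *s3.Client, bucket, prefix string) (string, error) {
	var latestKey string
	var latestTime time.Time

	p := s3.NewListObjectsV2Paginator(client, &s3.ListObjectsV2Input{
		Bucket: aws.String(bucket),
		Prefix: aws.String(prefix),
	})
	for p.HasMorePages() {
		page, err := p.NextPage(ctx)
		if err != nil {
			return "", err
		}
		for _, obj := range page.Contents {
			if obj.LastModified != nil && obj.LastModified.After(latestTime) {
				latestTime = *obj.LastModified
				latestKey = aws.ToString(obj.Key)
			}
		}
	}
	if latestKey == "" {
		return "", fmt.Errorf("no backups found under %q", prefix)
	}
	return latestKey, nil
}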

Option 2

In my opinion it would make more sense to be stricter when dynamic file names come into play, and not allow them at all in auto-recovery cases, only fully specified files, e.g. check for placeholders when parsing the config.

TL;DR

Assuming that the last backup is the right choice would not fit the logically-corrupted-DB use cases, since the corruption may already be part of that backup.

From my experience it mostly boils down to PITR use cases where I explicitly define which backup file to use, be it the file from last week, ...

In any case, workarounds using features like versioning would also require some additional effort during recovery: picking the version of choice and replacing the fixed-named file with the version you have chosen.
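For comparison, that versioning workaround would mean looking up a version ID out-of-band and fetching that specific version of the fixed-named object; a hedged sketch, again assuming aws-sdk-go-v2 and placeholder names:

package backup

import (
	"context"
	"io"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

// downloadBackupVersion fetches one specific version of a fixed-named backup
// object. Bucket, key and versionID are placeholders chosen by the operator.
func downloadBackupVersion(ctx context.Context, client *s3.Client, bucket, key, versionID string) (io.ReadCloser, error) {
	out, err := client.GetObject(ctx, &s3.GetObjectInput{
		Bucket:    aws.String(bucket),
		Key:       aws.String(key),       // e.g. "mypath/mydatabase.sqlite.gz"
		VersionId: aws.String(versionID), // picked from the object's version history
	})
	if err != nil {
		return nil, err
	}
	return out.Body, nil
}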

Does that make sense?
Happy to discuss further options or iterate over existing ones.


otoolep commented on June 29, 2024

#1773 should address this. However I'm proposing the following filename format:

`mypath/20240502114500_mydatabase.sqlite.gz`.

In addition, timestamping would be enabled by setting a flag in the auto-backup config, not by a variable in the filename itself.
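A minimal sketch of how such a name might be produced in Go, assuming a hypothetical boolean flag in the auto-backup config (the helper and flag names are illustrative, not rqlite's actual code):

package backup

import (
	"path"
	"time"
)

// backupObjectName builds an object name like "mypath/20240502114500_mydatabase.sqlite.gz".
// With timestamp=false the current fixed-name behaviour is kept. The time is always
// taken in UTC, so daylight saving time never affects the name.
func backupObjectName(dir, base string, timestamp bool, now time.Time) string {
	if !timestamp {
		return path.Join(dir, base)
	}
	return path.Join(dir, now.UTC().Format("20060102150405")+"_"+base)
}

For example, backupObjectName("mypath", "mydatabase.sqlite.gz", true, time.Now()) would yield a name in the format shown above.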

Any objections @peter-fe ?


peter-fe commented on June 29, 2024

@otoolep, no objections.

It will fit most use cases I guess, sticking with timestamps as a fixed source of entropy.

Using a fixed format may limit some edge cases, but avoiding unexpected input is far more beneficial :)

How would that get influenced by daylight saving time/ summer time? It's UTC anyways I guess?


jtackaberry commented on June 29, 2024

Though it would be possible to overcome this (partly) by using versions on S3 (where supported) to keep a longer history in case of logical corruption to jump back to a valid backup, the permission aspect will stay in place.

@peter-fe can you elaborate more on what you mean by "the permission aspect will stay in place"? What object storage are you using?

I ask because the bucket versioning approach seems comprehensive to me for supporting WORM-style backups, where versions are immutable from rqlite's perspective.

I tested this policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:PutObject",
                "s3:GetObject"
            ],
            "Resource": [
                "arn:aws:s3:::rqlite",
                "arn:aws:s3:::rqlite/*"
            ]
        }
    ]
}

Notably, there are no permissions for s3:DeleteObject or s3:DeleteObjectVersion (and, tangentially, s3:PutLifecycleConfiguration, which would allow a bypass). rqlite is happy with this: new backup uploads succeed and create a new version of the object. And the user is not able to delete anything.
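For illustration, this is roughly what that looks like through the S3 API: on a version-enabled bucket, every PutObject to the same key returns a fresh VersionId and no delete permission is ever needed. A sketch only, assuming aws-sdk-go-v2 and placeholder bucket/key names:

package backup

import (
	"bytes"
	"context"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

// uploadBackup writes a backup to a fixed key. On a version-enabled bucket the
// returned VersionId is non-empty and differs on every upload, so older backups
// remain retrievable even though the key never changes.
func uploadBackup(ctx context.Context, client *s3.Client, bucket, key string, data []byte) (string, error) {
	out, err := client.PutObject(ctx, &s3.PutObjectInput{
		Bucket: aws.String(bucket), // e.g. "rqlite"
		Key:    aws.String(key),    // e.g. "backups/mydatabase.sqlite.gz"
		Body:   bytes.NewReader(data),
	})
	if err != nil {
		return "", err
	}
	return aws.ToString(out.VersionId), nil
}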

So I'm mainly struggling to understand the use case. I think perhaps your "where supported" comment (in reference to bucket versioning) is a good clue, which is why I'm curious what object storage you're using such that bucket versioning and a more restrictive IAM policy don't solve your use case.


peter-fe commented on June 29, 2024

@jtackaberry, you are right, I was referring to bucket versioning. The "where supported" part was just to indicate my lack of knowledge as to whether this is something all flavours of S3 support similarly (AWS, Azure, GCP, Minio, ...).
I was testing on GCS (Google Cloud Storage) using HMAC keys and the S3 compatibility mode, with permissions to list, read and create objects, like in the AWS policy above.
Here the second backup to the same file name failed, as no delete permission was given.

To be honest, the versioning option was something I was thinking about while writing the issue, but I didn't give it a second test run, as I assumed it was the delete that produces a new version, not a subsequent write without explicit delete permission.

From the GCS docs

Note: In order to replace existing objects, both storage.objects.create and storage.objects.delete permissions are required

But anyway, would it be that bad to have it more explicit as part of the filename, not being bound to S3 versioning, and perhaps with more streamlined access to those files as opposed to restoring from a version? Many vendors seem to follow this approach, putting some timestamp in their backup files.

If versioning is the proposed way, and works everywhere equally, feel free to decline the FR and just extend the docs; that's all fine with me as well.


jtackaberry commented on June 29, 2024

Thanks @peter-fe. I'm entirely unfamiliar with GCS so I'm quite happy to accept that it works differently from AWS S3, Wasabi, and Minio, which is what I tested with.

If GCS needs delete permissions to replace an existing object even with versioning enabled, your use case is pretty clear, and the feature makes perfect sense. Thanks!

