

Welcome to z3

z3 is a ZFS-to-S3 backup tool. It is essentially plumbing around zfs send and zfs receive, so you should have at least a basic understanding of what those commands do.

z3 was developed by the awesome engineering team at Presslabs, a Managed WordPress Hosting provider.

For more open-source projects, check Presslabs Code.

Usage

z3 status shows the current state: which snapshots you have on S3 and on the local ZFS dataset.

z3 backup performs full or incremental backups of your dataset.

z3 restore restores your dataset to a certain snapshot.

See z3 SUBCOMMAND --help for more info.

Installing

pip install z3

z3 is tested on Python 2.7 with the latest boto 2 and with boto 2.2.2 (the python-boto version shipped on Ubuntu 12.04).

Optional dependencies

# Install pv to get some progress indication while uploading.
apt-get install pv

# Install pigz to provide the pigz compressors.
apt-get install pigz

# Install gnupg to provide public-key encryption and compression with gpg.
apt-get install gnupg gnupg-agent

Configuring

Most options can be configured as command-line flags, environment variables, or in a config file, in that order of precedence. The config file is read from /etc/z3_backup/z3.conf if it exists; some defaults are provided by the tool. BUCKET, S3_KEY_ID, and S3_SECRET can't be provided on the command line. For a list of all options see z3/sample.conf.

You'll usually want z3 to back up only certain snapshots (hourly/daily/weekly). To do that, specify a SNAPSHOT_PREFIX (defaults to zfs-auto-snap:daily).
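For reference, a minimal /etc/z3_backup/z3.conf might look like this (the bucket name and credentials are placeholders; see z3/sample.conf for the full list of options):

```ini
[main]
BUCKET=my-backup-bucket
S3_KEY_ID=my-key-id
S3_SECRET=my-secret
FILESYSTEM=tank/spam
SNAPSHOT_PREFIX=zfs-auto-snap:daily
COMPRESSOR=pigz4
```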

Defaults for SNAPSHOT_PREFIX and COMPRESSOR can be set per filesystem like so:

[fs:tank/spam]
SNAPSHOT_PREFIX=delicious-daily-spam
COMPRESSOR=pigz4

[fs:tank/ham]
SNAPSHOT_PREFIX=weekly-non-spam

Dataset Size, Concurrency and Memory Usage

Since the data is streamed from zfs send, it gets read into memory in chunks. z3 estimates a good chunk size for you: no smaller than 5 MiB, and large enough to produce at most 9999 chunks. These are S3 limitations for multipart uploads. Here are some example chunk sizes for different dataset sizes:

  • 50 GiB: 5 MiB
  • 500 GiB: 53 MiB
  • 1 TiB: 110 MiB
  • 2 TiB: 220 MiB

Multiply that by CONCURRENCY to know how much memory your upload will use.
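The sizing rule described above can be sketched as follows (an illustrative reimplementation, not z3's actual code; the function names are hypothetical):

```python
import math

MIN_CHUNK = 5 * 1024 * 1024   # S3 minimum part size for multipart uploads
MAX_PARTS = 9999              # maximum number of parts z3 will produce

def estimate_chunk_size(stream_size):
    """Pick a chunk no smaller than 5 MiB and large enough that the
    whole stream fits in at most 9999 parts."""
    return max(MIN_CHUNK, int(math.ceil(stream_size / float(MAX_PARTS))))

def estimated_memory(stream_size, concurrency):
    # memory use is roughly one chunk buffered per concurrent uploader
    return estimate_chunk_size(stream_size) * concurrency
```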

Usage Examples

Status

# show global options
z3 --help

# show status of backups for default dataset
z3 status

# show status for other dataset; only snapshots named daily-spam-*
z3 --dataset tank/spam --snapshot-prefix daily-spam- status

Backup

# show backup options
z3 backup --help

# inspect the commands for an incremental backup of the latest snapshot,
# using the pigz4 compressor
z3 backup --compressor pigz4 --dry-run
# then perform the backup
z3 backup --compressor pigz4

# inspect the commands for a full backup of a specific snapshot
z3 backup --full --snapshot the-part-after-the-at-sign --dry-run
# then perform the backup
z3 backup --full --snapshot the-part-after-the-at-sign

Restore

# see restore options
z3 restore --help

# inspect the commands that would restore a dataset to a certain snapshot
z3 restore the-part-after-the-at-sign --dry-run
# then perform the restore
z3 restore the-part-after-the-at-sign

# force rollback of filesystem (zfs recv -F)
z3 restore the-part-after-the-at-sign --force

Encryption

Encryption of stored objects in S3 is normally provided through the AWS Key Management Service (KMS). Alternatively, you can use gnupg for public-key encryption by specifying gpg as the COMPRESSOR and the public key to use as GPG_RECIPIENT. Note: the compression and crypto algorithms used by gpg are derived from the public-key preferences of GPG_RECIPIENT. Here is a usage example:

# inspect the key preferences for z3_backup
#   based on preference order, gpg will use AES256 cipher, and ZLIB compression
gpg --edit-key z3_backup

gpg> showpref
[ultimate] (1). z3_backup
     Cipher: AES256, AES192, AES, 3DES
     Digest: SHA256, SHA384, SHA512, SHA224, SHA1
     Compression: ZLIB, BZIP2, ZIP, Uncompressed
     Features: MDC, Keyserver no-modify

gpg> quit

# the following assumes that you have z3_backup in your gnupg public-key ring
# inspect the commands for an incremental backup of the latest snapshot,
# using the gpg compressor
z3 backup --compressor gpg --gpg-recipient z3_backup --dry-run
# after inspecting the commands that would be executed, perform the backup
z3 backup --compressor gpg --gpg-recipient z3_backup

# the following assumes that you have z3_backup in your gnupg private-key ring
# inspect the commands that would restore a dataset to a certain snapshot
z3 restore the-part-after-the-at-sign --dry-run
# after inspecting the commands that would be executed, perform the restore
z3 restore the-part-after-the-at-sign

Other Commands

Other command-line tools are provided:

pput reads a stream from standard input and uploads the data to S3.

z3_ssh_sync is a convenience tool for pushing ZFS snapshots to another host. If you need replication you should check out zrep; this tool exists because we already had zrep between two nodes and needed a way to push backups to a third machine.

z3_get is called by z3 restore to download a backup.

Development Overview

Running the tests

The test suite uses pytest. Some of the tests upload data to S3, so you need to set up the following environment variables:

export S3_KEY_ID=""
export S3_SECRET=""
export BUCKET="mytestbucket"

To skip tests that use S3:

py.test --capture=no --tb=native _tests/ -k "not with_s3"

The Data

Snapshots are obtained using zfs send, optionally piped through a compressor (pigz by default), and finally piped to pput. Incremental snapshots are always handled individually, so if multiple snapshots need sending since the last backup, they get exported as individual increments (multiple calls of the form zfs send -i dataset@snapA dataset@snapB).
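That per-pair handling can be illustrated with a small sketch (a hypothetical helper, not z3's actual code): given an ordered list of snapshots since the last backup, one `zfs send -i` is issued per consecutive pair.

```python
def incremental_send_commands(dataset, snapshots):
    """Build one `zfs send -i` command per consecutive snapshot pair,
    mirroring how each incremental is exported individually."""
    return [
        "zfs send -i %s@%s %s@%s" % (dataset, older, dataset, newer)
        for older, newer in zip(snapshots, snapshots[1:])
    ]
```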

Your snapshots end up as individual keys in an S3 bucket, with a configurable prefix (S3_PREFIX). S3 key metadata is used to record whether a snapshot is full (isfull="true") or incremental. The parent of an incremental snapshot is identified by the parent attribute.

S3 and ZFS snapshots are matched by name.

Health checks

The S3 health checks are very rudimentary: if a snapshot is incremental, check that its parent exists and is healthy. Full backups are always assumed healthy.

If backup or restore encounters an unhealthy snapshot, it aborts execution.
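The check amounts to walking the parent chain. A minimal sketch (illustrative only; the dicts mimic the S3 key metadata with its isfull and parent attributes):

```python
def is_healthy(name, snapshots):
    """A full backup is assumed healthy; an incremental is healthy
    only if its parent chain exists and ends in a full backup."""
    snap = snapshots.get(name)
    while snap is not None:
        if snap.get("isfull") == "true":
            return True
        parent = snap.get("parent")
        if parent not in snapshots:
            return False  # broken chain: parent missing from S3
        snap = snapshots[parent]
    return False

snaps = {
    "snapA": {"isfull": "true"},
    "snapB": {"isfull": "false", "parent": "snapA"},
    "snapC": {"isfull": "false", "parent": "gone"},
}
```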

pput

pput is a simple tool with one job: read data from stdin and upload it to S3. It's usually invoked by z3.

Consistency is important; it's better to fail hard when something goes wrong than to silently upload inconsistent or partial data. There are a few anticipated errors (if a part fails to upload, it is retried MAX_RETRY times). Any other problem is unanticipated, so the tool just crashes.

TL;DR Fail early, fail hard.

z3's People

Contributors

aacaz, amecea, bogdanpetrea, jgartrel, junousi, rciorba, sitwatchcloudrise, vasiioandre, vtemian


z3's Issues

Doesn't work for (at least) London and Frankfurt regions

With a London bucket:

[root@backup1 ~]# z3 backup --full --dry-run
Traceback (most recent call last):
  File "/usr/bin/z3", line 9, in <module>
    load_entry_point('z3==0.1.14', 'console_scripts', 'z3')()
  File "/usr/lib/python2.7/site-packages/z3/snap.py", line 562, in main
    bucket = boto.connect_s3(s3_key_id, s3_secret, **extra_config).get_bucket(bucket)
  File "/usr/lib/python2.7/site-packages/boto/s3/connection.py", line 471, in get_bucket
    return self.head_bucket(bucket_name, headers=headers)
  File "/usr/lib/python2.7/site-packages/boto/s3/connection.py", line 518, in head_bucket
    response.status, response.reason, body)
boto.exception.S3ResponseError: S3ResponseError: 400 Bad Request

[root@backup1 ~]# export S3_USE_SIGV4="True"

[root@backup1 ~]# z3 backup --full --dry-run
Traceback (most recent call last):
  File "/usr/bin/z3", line 9, in <module>
    load_entry_point('z3==0.1.14', 'console_scripts', 'z3')()
  File "/usr/lib/python2.7/site-packages/z3/snap.py", line 562, in main
    bucket = boto.connect_s3(s3_key_id, s3_secret, **extra_config).get_bucket(bucket)
  File "/usr/lib/python2.7/site-packages/boto/__init__.py", line 140, in connect_s3
    return S3Connection(aws_access_key_id, aws_secret_access_key, **kwargs)
  File "/usr/lib/python2.7/site-packages/boto/s3/connection.py", line 196, in __init__
    "When using SigV4, you must specify a 'host' parameter."
boto.s3.connection.HostRequiredError: BotoClientError: When using SigV4, you must specify a 'host' parameter.

I believe you are suffering from these bugs:
boto/boto#2916
boto/boto#2741

AttributeError: 'NoneType' object has no attribute 'name'

When running:
root@cobalt:~# z3 backup
I receive an error:
Traceback (most recent call last):
  File "/usr/local/bin/z3", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/dist-packages/z3/snap.py", line 53, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/z3/snap.py", line 607, in main
    dry=args.dry, compressor=compressor, parseable=args.parseable)
  File "/usr/local/lib/python2.7/dist-packages/z3/snap.py", line 504, in do_backup
    uploaded = pair_manager.backup_incremental(snap_name=snap_name, dry_run=dry)
  File "/usr/local/lib/python2.7/dist-packages/z3/snap.py", line 387, in backup_incremental
    z_snap.parent.name, z_snap.name),
AttributeError: 'NoneType' object has no attribute 'name'

below is my config:
[main]
FILESYSTEM=zpool0/config
BUCKET=cobalt-backups
S3_KEY_ID=##########
S3_SECRET=######################
COMPRESSOR=pigz4
HOST=s3.wasabisys.com
CONCURRENCY=64
MAX_RETRIES=3
S3_PREFIX=z3-backup/
SNAPSHOT_PREFIX=zfs-auto-snap_hourly

I verified that the aws s3 tooling works, and root@cobalt:~# z3 status

backup status for zpool0/config@zfs-auto-snap_hourly* on cobalt-backups/z3-backup/
NAME | PARENT | TYPE | HEALTH | LOCAL STATE | SIZE
zfs-auto-snap_hourly-2018-06-21-1617 | - | missing | - | ok |
zfs-auto-snap_hourly-2018-06-21-1717 | - | missing | - | ok |
zfs-auto-snap_hourly-2018-06-21-1817 | - | missing | - | ok |

seems to work. Any help troubleshooting would be appreciated.

Does this support the China region?

I tried to test z3 in the China region; the test failed with a 403 Forbidden.
When I use a bucket in the Tokyo region instead, the test succeeds.

Traceback (most recent call last):
  File "/usr/local/bin/z3", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/dist-packages/z3/snap.py", line 53, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/z3/snap.py", line 587, in main
    bucket = boto.connect_s3(s3_key_id, s3_secret, **extra_config).get_bucket(bucket)
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/connection.py", line 509, in get_bucket
    return self.head_bucket(bucket_name, headers=headers)
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/connection.py", line 542, in head_bucket
    raise err
boto.exception.S3ResponseError: S3ResponseError: 403 Forbidden

z3: Feature request: allow adding metadata

Hi Team,

Is it possible with z3 to add metadata to a backup? We have a use case where we upload large files and then have to update the metadata after upload by copying the file again in S3. S3 only allows setting metadata at upload time; updating metadata on an existing object requires copying the object again.

Any thoughts or workarounds for this? Copying the same object in S3 after upload just to update the metadata is a time-consuming operation.

Thanks
Rahul

Rclone?

Is supporting rclone out of scope? It would allow much more than s3.

z3 fails when doing anything other than 'status'!

Python version: 2.7.16
OS: Linux (Debian 10.4)

The following error occurs whether z3 was obtained via pip install (0.2.0) on the OS and Python version specified above, or via a git clone of the latest repository here on github.com.

root:~# z3 backup --dry-run
Traceback (most recent call last):
  File "/usr/local/bin/z3", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/dist-packages/z3/snap.py", line 53, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/z3/snap.py", line 607, in main
    dry=args.dry, compressor=compressor, parseable=args.parseable)
  File "/usr/local/lib/python2.7/dist-packages/z3/snap.py", line 504, in do_backup
    uploaded = pair_manager.backup_incremental(snap_name=snap_name, dry_run=dry)
  File "/usr/local/lib/python2.7/dist-packages/z3/snap.py", line 387, in backup_incremental
    z_snap.parent.name, z_snap.name),
AttributeError: 'NoneType' object has no attribute 'name'

Again, I get this error whether using the version installed with pip on my Linux distribution OR the version from git. z3 status works and shows snapshots I have not yet backed up (I haven't used z3 at all; this is my first experience):

-> # z3 status
backup status for mydatastorage/testtank@snap* on MY-S3-BUCKET/z3-backup/
NAME  | PARENT | TYPE    | HEALTH | LOCAL STATE | SIZE
snap1 | -      | missing | -      | ok          |
snap2 | -      | missing | -      | ok          |

My /etc/z3_backup/z3.conf file looks like so:

[main]
BUCKET=MY-S3-BUCKET
S3_KEY_ID=my-key
S3_SECRET=my-secret
CONCURRENCY=4
CHUNK_SIZE=256M
MAX_RETRIES=3
COMPRESSOR=pigz1
S3_STORAGE_CLASS=STANDARD_IA
SNAPSHOT_PREFIX=snap
FILESYSTEM=mydatastorage/testtank

Any ideas what is wrong here?

Thanks

IndexError: list index out of range

Just trying to do a basic backup, I get a mostly unhelpful stack trace:

[root@pkgrepo1 ecbuild]# z3 backup 
Traceback (most recent call last):
  File "/usr/local/bin/z3", line 9, in <module>
    load_entry_point('z3==0.1.13', 'console_scripts', 'z3')()
  File "/usr/local/lib/python2.7/site-packages/z3/snap.py", line 575, in main
    dry=args.dry, compressor=compressor, parseable=args.parseable)
  File "/usr/local/lib/python2.7/site-packages/z3/snap.py", line 480, in do_backup
    uploaded = pair_manager.backup_incremental(snap_name=snap_name, dry_run=dry)
  File "/usr/local/lib/python2.7/site-packages/z3/snap.py", line 341, in backup_incremental
    z_snap = self._snapshot_to_backup(snap_name)
  File "/usr/local/lib/python2.7/site-packages/z3/snap.py", line 270, in _snapshot_to_backup
    z_snap = self.zfs_manager.get_latest()
  File "/usr/local/lib/python2.7/site-packages/z3/snap.py", line 214, in get_latest
    return self._snapshots.values()[-1]
IndexError: list index out of range

If I execute the same command that is ultimately called internally, that works:

zfs list -Ht snap -o 'name,used,refer,mountpoint,written'
pkgrepo@zfs-auto-snap_daily-2016-06-01-2022 0   27.2K   -   27.2K
pkgrepo/mrepo@zfs-auto-snap_daily-2016-06-01-2022   75.1K   17.9G   -   17.9G
pkgrepo/mrepo-cache@zfs-auto-snap_daily-2016-06-01-2022 141K    305M    -   305M
pkgrepo/repo@zfs-auto-snap_daily-2016-06-01-2022    709K    81.3G   -   81.3G
pkgrepo/www@zfs-auto-snap_daily-2016-06-01-2022 7.35M   1.67G   -   1.67G

Any suggestions?

pput should have a --file parameter

pput should use a --file parameter instead of --file-descriptor. If --file is - (the default), it should read from stdin. --file-descriptor is not very unixy.

Written arguments produce errors

Attempting to run z3 on my Solaris test box before rolling it out to production, I get the error below:

root@solaris-test:~# z3 status
backup status for None@zfs-auto-snap:daily* on moose1/z3-backup/
bad property list: invalid property 'written'
For more info, run: zfs help list
Traceback (most recent call last):
  File "/usr/bin/z3", line 9, in <module>
    load_entry_point('z3==0.2.0', 'console_scripts', 'z3')()
  File "/usr/lib/python2.7/site-packages/z3/snap.py", line 53, in wrapper
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/z3/snap.py", line 596, in main
    filesystem=args.filesystem)
  File "/usr/lib/python2.7/site-packages/z3/snap.py", line 485, in list_snapshots
    for s3_snap, z_snap in pair_manager.list():
  File "/usr/lib/python2.7/site-packages/z3/snap.py", line 283, in list
    for z_snap in self.zfs_manager.list():
  File "/usr/lib/python2.7/site-packages/z3/snap.py", line 229, in list
    return self._snapshots.values()
  File "/usr/lib/python2.7/site-packages/z3/snap.py", line 25, in cacheing_wrapper
    val = func(self, *a, **kwa)
  File "/usr/lib/python2.7/site-packages/z3/snap.py", line 226, in _snapshots
    return self._build_snapshots(self._fs_name)
  File "/usr/lib/python2.7/site-packages/z3/snap.py", line 207, in _build_snapshots
    fs_snaps = self._parse_snapshots().get(fs_name, {})
  File "/usr/lib/python2.7/site-packages/z3/snap.py", line 184, in _parse_snapshots
    snap = self._list_snapshots()
  File "/usr/lib/python2.7/site-packages/z3/snap.py", line 174, in _list_snapshots
    'name,used,refer,mountpoint,written'])
  File "/usr/lib/python2.7/subprocess.py", line 219, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['zfs', 'list', '-Ht', 'snap', '-o', 'name,used,refer,mountpoint,written']' returned non-zero exit status 2

It would appear the issue is centered around the "written" property.

root@solaris-test:~# zfs list -Ht snap -o name,used,refer,mountpoint,written
bad property list: invalid property 'written'

Remove the "written" property and the command produces the following output:

root@solaris-test:~# zfs list -Ht snap -o name,used,refer,mountpoint
rpool/ROOT/solaris@install 59.9M 2.55G -
rpool/ROOT/solaris@2019-05-22-20:48:38-219 76.1M 2.63G -
rpool/ROOT/solaris/var@install 140M 314M -
rpool/ROOT/solaris/var@2019-05-22-20:48:38-219 29.9M 204M -

IAM Role Support

Has anyone determined a way to use an IAM role attached to an EC2 instance to provide S3 permissions versus using an IAM user access/secret key? We have requirements to rotate any in use IAM user key and using a role attached to the instance would negate this need.

z3 restore fails when using HOST= config option

When using HOST=s3.us-west-2.amazonaws.com in the configuration file, z3 restore fails.

# z3 restore  zfs-auto-snap:daily
Traceback (most recent call last):
  File "/usr/local/bin/z3_get", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/dist-packages/z3/get.py", line 25, in main
    s3 = boto3.client('s3', aws_access_key_id=cfg['S3_KEY_ID'], aws_secret_access_key=cfg['S3_SECRET'], **extra_config)
  File "/usr/local/lib/python2.7/dist-packages/boto3/__init__.py", line 91, in client
    return _get_default_session().client(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/boto3/session.py", line 263, in client
    aws_session_token=aws_session_token, config=config)
  File "/usr/local/lib/python2.7/dist-packages/botocore/session.py", line 838, in create_client
    client_config=config, api_version=api_version)
  File "/usr/local/lib/python2.7/dist-packages/botocore/client.py", line 86, in create_client
    verify, credentials, scoped_config, client_config, endpoint_bridge)
  File "/usr/local/lib/python2.7/dist-packages/botocore/client.py", line 328, in _get_client_args
    verify, credentials, scoped_config, client_config, endpoint_bridge)
  File "/usr/local/lib/python2.7/dist-packages/botocore/args.py", line 85, in get_client_args
    client_cert=new_config.client_cert)
  File "/usr/local/lib/python2.7/dist-packages/botocore/endpoint.py", line 261, in create_endpoint
    raise ValueError("Invalid endpoint: %s" % endpoint_url)
ValueError: Invalid endpoint: s3.us-west-2.amazonaws.com

This appears to be an issue with using boto for backup vs. using boto3 for restore:

z3/z3/get.py

Lines 21 to 22 in afbd1d9

if 'HOST' in cfg:
    extra_config['endpoint_url'] = cfg['HOST']

Specifically, I think the expectation for HOST= is a hostname, not a URL, which is ultimately confusing for the user.
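One way to smooth this over would be to accept either form and normalize it to the full URL that boto3's endpoint_url expects (a hypothetical helper, not part of z3):

```python
def to_endpoint_url(host):
    """boto3's endpoint_url must be a full URL; accept either a bare
    hostname (as HOST= currently suggests) or an http(s) URL."""
    if host.startswith(("http://", "https://")):
        return host
    return "https://" + host
```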

Large upload failed - allow custom options for "zfs send"

I have been using z3 for over a month successfully. Thank you! I have recently run into a problem and have a potential enhancement suggestion.

I have a large dataset that is stored with compression on. Recently, full uploads to S3 started failing; the error from S3 indicates the dataset exceeded 5 TB (6.3 TB). Two issues and a suggestion:

  1. Firstly, I expected pput to break this into chunks, but it did not seem to do that:
    zfs send 'pool/data@zfs-auto-snap_daily-2018-12-23-0832' | pput --quiet --estimated 6947160277512 --meta size=6947160277512 --meta isfull=true z3-backup/pool/data@zfs-auto-snap_daily-2018-12-23-0832

  2. Secondly, the dataset itself, in its compressed state, is ~2.6 TB. So I modified the snap.py code to add "-Lce" to "zfs send". This seems better; the upload is still running.
    zfs send -Lce 'pool/data@zfs-auto-snap_daily-2018-12-23-0832' | pput --quiet --estimated 2655377785352 --meta size=2655377785352 --meta isfull=true z3-backup/pool/data@zfs-auto-snap_daily-2018-12-23-0832

  3. It would be good to allow custom options to be passed to zfs send as part of z3 backup, e.g. --zfs-options "Lce".

Ideas around retention - deleting of old data

Hi guys, I hope I haven't missed it somewhere, but I can't find how to keep data for only a certain number of days, e.g.:

keep 4 full backups
keep 28 diff backups

As of now, z3 will keep backing up without removing older backups. What do you use to retain data for only a specific time?

Cheers

error: unrecognized arguments

Not sure why, but argument parsing doesn't seem to work:

# z3 status --filesystem rpool
usage: z3 [-h] [--s3-prefix S3_PREFIX] [--filesystem FILESYSTEM]
          [--snapshot-prefix SNAPSHOT_PREFIX]
          {backup,restore,status} ...
z3: error: unrecognized arguments: --filesystem rpool

but the env var does:

# FILESYSTEM=rpool z3 status
backup status for rpool@zfs-auto-snap_daily* on worker-prd03-backups/z3-backup/
NAME                                | PARENT | TYPE | HEALTH | LOCAL STATE | SIZE
zfs-auto-snap_daily-2018-10-22-2327 |        | full | ok     | ok          | 0.04 M

verbosity

Moved from https://github.com/PressLabs/operations/issues/202#issuecomment-179790556 by @zalmanu
Regarding the pput logging, I think it would be clearer if it behaved like this:

  • by default, it should output which file(s) it is going to upload and where (the S3 bucket)
  • when passing the --quiet argument, it should be completely quiet and output nothing (useful when using it as part of other scripts); right now it shows the upload summary
  • it should have a -v/--verbose argument which causes it to output what it currently outputs by default: number of workers, chunk size, parts count, etc.

What do you think @rciorba?

ability to sync filesystems

Since zrep is iffy about multiple destinations and we only care about daily snapshots, we need an easy way to send snapshots to the backup machines.

z3 does not support MinIO

I'd like to use z3 with MinIO, but I think it's currently not possible. The main problem is that z3 always uses virtual-hosted-style addressing, while MinIO by default only supports path-style addressing.

Technical details

Here's an example of how to use boto3 to access MinIO, but z3 uses an older version of boto. I think that with the older version it's possible to enable path-style addressing by passing calling_format = boto.s3.connection.OrdinaryCallingFormat() to boto.connect_s3(); however, z3 exposes no way to do so.
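The difference between the two addressing styles can be illustrated with plain URL construction (the host names below are placeholders):

```python
def virtual_hosted_url(bucket, key, host="s3.amazonaws.com"):
    # virtual-hosted style: the bucket becomes part of the hostname
    return "https://%s.%s/%s" % (bucket, host, key)

def path_style_url(bucket, key, host="minio.example.com:9000"):
    # path style: the bucket is the first path segment
    # (what MinIO expects by default)
    return "https://%s/%s/%s" % (host, bucket, key)
```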

Plans for python3

Are there plans to port to python3 ?

On debian:bullseye you can pip install z3, but because it runs under Python 3 you are greeted with:

z3 status
Traceback (most recent call last):
  File "/usr/local/bin/z3", line 5, in <module>
    from z3.snap import main
  File "/usr/local/lib/python3.9/dist-packages/z3/snap.py", line 14, in <module>
    from z3.config import get_config
  File "/usr/local/lib/python3.9/dist-packages/z3/config.py", line 1, in <module>
    import ConfigParser
ModuleNotFoundError: No module named 'ConfigParser'

This is due to the renaming of the ConfigParser module to configparser; there could be other issues too. Is anyone planning to scrub z3 for Python 3 support?
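For what it's worth, this particular breakage could be papered over with the usual compatibility import (a sketch only; a full port would need to address other 2-to-3 differences as well):

```python
try:
    import configparser                  # Python 3 name
except ImportError:
    import ConfigParser as configparser  # Python 2 name

# either way, code can now refer to configparser.ConfigParser()
```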

Mounting ZFS snapshots from S3

A ZFS snapshot, once taken, is never modified again: snapshots are a set of blocks that can no longer change.

Object storage is optimized for data that is written once, read many times, and deleted eventually. Snapshots fully meet these assumptions.

In the case of Amazon S3, reading data can be comparable in speed to reading from disks, which are themselves also internally connected via the network.

Have you considered implementing ZFS snapshot mounting directly from S3, to enable previewing backup data? If necessary, a local cache for some blocks could be considered. A similar approach is used by TrailDB ( http://tech.nextroll.com/blog/data/2016/11/29/traildb-mmap-s3.html ) and MezzFS ( https://medium.com/netflix-techblog/mezzfs-mounting-object-storage-in-netflixs-media-processing-platform-cda01c446ba ) with very interesting results.

It is worth noting that volume snapshots in AWS EBS are mounted online, with the necessary blocks downloaded locally only at the time of access ( https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-initialize.html ). Implementing the ability to read backups would open up similar possibilities.

I realize it will not be possible to read individual blocks when compression is used. However, I think the potentially unlimited capacity for snapshots and free access to them offsets this limitation, especially since ZFS can support compression and block-level encryption on its own.
