
Blacklabelops Volumerize


Blacklabelops backup and restore solution for Docker volumes. It is based on the command line tool Duplicity, dockerized and parameterized for easier use and configuration.

This is not a tool that can safely back up data from running databases. Always stop all containers that write to your data before taking a backup, otherwise you risk unexpected data corruption.

Also note: the easier the tool, the easier it is to lose data! Always verify that the tool works correctly by inspecting the backup data itself, e.g. the S3 bucket. Double-check the configuration and enable the safety options this image offers, e.g. attaching volumes read-only.

Features:

  • Multiple Backends
  • Cron Schedule
  • Start and Stop Containers

Supported backends:

  • Filesystem
  • Amazon S3
  • DropBox
  • Google Drive
  • ssh/scp
  • rsync

and many more: Duplicity Supported Backends
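Backends are addressed by duplicity-style target URLs. A few illustrative formats (taken from duplicity's backend documentation; host, bucket and path names are placeholders):

```
file:///backup
s3://s3.eu-central-1.amazonaws.com/bucket_name
scp://user@other.host/some_dir
rsync://user@other.host::/module/some_dir
```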

Volume Backups Tutorials

Docker Volume Backups on:

Backblaze B2: Readme

Amazon S3: Readme

Dropbox: Readme

Google Drive: Readme

Mega: Readme

Make It Short

You can make backups of your Docker application volume just by typing:

$ docker run --rm \
    --name volumerize \
    -v yourvolume:/source:ro \
    -v backup_volume:/backup \
    -v cache_volume:/volumerize-cache \
    -e "VOLUMERIZE_SOURCE=/source" \
    -e "VOLUMERIZE_TARGET=file:///backup" \
    blacklabelops/volumerize backup

This hooks up your volume named yourvolume and backs it up to the volume backup_volume.

How It Works

The container has a default startup mode. Any specific behavior is configured by defining environment variables at container startup (docker run). By default the container starts in daemon mode and performs incremental daily backups.

Your application data must be saved inside a Docker volume. You can list your volumes with the Docker command docker volume ls. You have to attach the volume to the backup container using the -v option. Choose an arbitrary folder name inside the container and add the :ro option to make the source read-only.

Example using Jenkins:

$ docker run \
     -d -p 80:8080 \
     --name jenkins \
     -v jenkins_volume:/jenkins \
     blacklabelops/jenkins

Starts Jenkins and stores its data inside the Docker volume jenkins_volume.

Now attach the Jenkins data to folders inside the container and tell blacklabelops/volumerize to back up folder /source to folder /backup.

$ docker run -d \
    --name volumerize \
    -v jenkins_volume:/source:ro \
    -v backup_volume:/backup \
    -v cache_volume:/volumerize-cache \
    -e "VOLUMERIZE_SOURCE=/source" \
    -e "VOLUMERIZE_TARGET=file:///backup" \
    blacklabelops/volumerize

This starts Volumerize in daemon mode. The volume jenkins_volume is now mounted at folder /source and backup_volume at folder /backup inside the container.

You can execute commands inside the container, e.g. doing an immediate backup or even restore:

$ docker exec volumerize backup

Will trigger a backup.

Backup Multiple volumes

The container backs up a single source folder, see the environment variable VOLUMERIZE_SOURCE. If you want to back up multiple volumes, you have to mount them all under the same source folder.

Example:

  • Volume: application_data
  • Volume: application_database_data
  • Volume: application_configuration

Now start the container and hook them up under the same parent folder /source.

$ docker run -d \
    --name volumerize \
    -v application_data:/source/application_data:ro \
    -v application_database_data:/source/application_database_data:ro \
    -v application_configuration:/source/application_configuration:ro \
    -v backup_volume:/backup \
    -v cache_volume:/volumerize-cache \
    -e "VOLUMERIZE_SOURCE=/source" \
    -e "VOLUMERIZE_TARGET=file:///backup" \
    blacklabelops/volumerize

Will run Volumerize on the common parent folder /source.

Backup Restore

A restore is simple. First stop your Volumerize container, then start another container with the same environment variables and the same volumes, but without read-only mode on the source! Using the same mounts is important in order to get the same directory structure as when you did your backup.

Tip: for the restore, add the read-only option to the backup volume instead!

Example:

You did your backups with the following settings:

$ docker run -d \
    --name volumerize \
    -v jenkins_volume:/source:ro \
    -v backup_volume:/backup \
    -v cache_volume:/volumerize-cache \
    -e "VOLUMERIZE_SOURCE=/source" \
    -e "VOLUMERIZE_TARGET=file:///backup" \
    blacklabelops/volumerize

Then stop the backup container and restore with the following command. The only difference is that we remove the read-only option :ro from the source volume and add it to the backup volume:

$ docker stop volumerize
$ docker run --rm \
    -v jenkins_volume:/source \
    -v backup_volume:/backup:ro \
    -v cache_volume:/volumerize-cache \
    -e "VOLUMERIZE_SOURCE=/source" \
    -e "VOLUMERIZE_TARGET=file:///backup" \
    blacklabelops/volumerize restore
$ docker start volumerize

This triggers a one-time restore. The container executing the restore command is removed afterwards (--rm).

You can restore from a particular backup by adding a time parameter to the restore command. For example, using restore -t 3D at the end of the above command will restore the backup from 3 days ago. See the Duplicity manual for the accepted time formats.

To see the available backups, use the command list before doing a restore.

Dry run

You can pass the --dry-run parameter to the restore command in order to test the restore functionality:

$ docker run --rm \
    -v jenkins_volume:/source \
    -v backup_volume:/backup:ro \
    -v cache_volume:/volumerize-cache \
    -e "VOLUMERIZE_SOURCE=/source" \
    -e "VOLUMERIZE_TARGET=file:///backup" \
    blacklabelops/volumerize restore --dry-run

But in order to see the differences between backup and source you need the verify command:

$ docker run --rm \
    -v jenkins_volume:/source \
    -v backup_volume:/backup:ro \
    -v cache_volume:/volumerize-cache \
    -e "VOLUMERIZE_SOURCE=/source" \
    -e "VOLUMERIZE_TARGET=file:///backup" \
    blacklabelops/volumerize verify

Periodic Backups

The default cron setting for this container is: 0 0 4 * * *. That's four o'clock in the morning UTC. You can set your own schedule with the environment variable VOLUMERIZE_JOBBER_TIME.

You can set the time zone with the environment variable TZ.

The syntax is different from cron because I use Jobber as a cron tool: Jobber Time Strings
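Jobber time strings have six fields (seconds, minutes, hours, day of month, month, day of week) instead of cron's five. A few illustrative schedules (my reading of the Jobber documentation; verify against it):

```
0 0 4 * * *       every day at 04:00:00
0 0 4,16 * * *    twice a day, at 04:00 and 16:00
0 0 4 * * 1       every Monday at 04:00:00
```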

Example:

$ docker run -d \
    --name volumerize \
    -v jenkins_volume:/source:ro \
    -v backup_volume:/backup \
    -v cache_volume:/volumerize-cache \
    -e "TZ=Europe/Berlin" \
    -e "VOLUMERIZE_SOURCE=/source" \
    -e "VOLUMERIZE_TARGET=file:///backup" \
    -e "VOLUMERIZE_JOBBER_TIME=0 0 3 * * *" \
    blacklabelops/volumerize

Backs up at three o'clock in the morning according to German local time.
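For long-running setups the same schedule can be expressed as a Compose service. A sketch (untested; the volume names are the ones used above and must exist or be declared):

```yaml
version: "2"
services:
  volumerize:
    image: blacklabelops/volumerize
    environment:
      - TZ=Europe/Berlin
      - VOLUMERIZE_SOURCE=/source
      - VOLUMERIZE_TARGET=file:///backup
      - VOLUMERIZE_JOBBER_TIME=0 0 3 * * *
    volumes:
      - jenkins_volume:/source:ro
      - backup_volume:/backup
      - cache_volume:/volumerize-cache
volumes:
  jenkins_volume:
    external: true
  backup_volume:
  cache_volume:
```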

Docker Container Restarts

This image can stop and start Docker containers before and after a backup. The containers are specified with the environment variable VOLUMERIZE_CONTAINERS. Just enter their names in a space-separated list.

Example:

  • Docker container application with name application
  • Docker container application database with name application_database

Note: Needs the parameter -v /var/run/docker.sock:/var/run/docker.sock in order to be able to start and stop containers on the host.

Example:

$ docker run -d \
    --name volumerize \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v jenkins_volume:/source:ro \
    -v backup_volume:/backup \
    -v cache_volume:/volumerize-cache \
    -e "VOLUMERIZE_SOURCE=/source" \
    -e "VOLUMERIZE_TARGET=file:///backup" \
    -e "VOLUMERIZE_CONTAINERS=application application_database" \
    blacklabelops/volumerize

The stop/start routine is applied by the following scripts: backup, backupFull, restore and periodicBackup.

Test the routine!

$ docker exec volumerize backup

Additional Docker CLI API configurations

If the Docker host version is earlier than 1.12, include the following Docker API setting. Volumerize uses Docker CLI 1.12, which speaks Docker API version 1.24; you need to set the API version compatible with your Docker host, e.g. Docker host version 1.11 uses API 1.23:

docker version
Client:
 Version:      1.11.2
 API version:  1.23
 Go version:   go1.8
 Git commit:   5be46ee-synology
 Built:        Fri May 12 16:36:47 2017
 OS/Arch:      linux/amd64

Server:
 Version:      1.11.2
 API version:  1.23
 Go version:   go1.8
 Git commit:   5be46ee-synology
 Built:        Fri May 12 16:36:47 2017
 OS/Arch:      linux/amd64

Then add the following -e argument:

$ docker run -d \
    --name volumerize \
    -v /var/run/docker.sock:/var/run/docker.sock \
    ...
    ...
    -e "DOCKER_API_VERSION=1.23" \
    ...
    ...
    blacklabelops/volumerize

Additional Docker considerations

Warning: make sure your container is running under the correct restart policy. Tools like Docker, Docker-Compose, Docker-Swarm, Kubernetes and Cattle may restart a container even after Volumerize has stopped it. Backups taken while the instance is running can result in corrupted backups and even corrupted data. Always make sure that docker stop really stops the instance and that the underlying deployment technology will not restart it. You can test this by running docker stop and checking with docker ps that the container stays stopped.

Duplicity Parameters

Under the hood blacklabelops/volumerize uses duplicity. See here for duplicity command line options: Duplicity CLI Options

You can pass duplicity options through Volumerize via the environment variable VOLUMERIZE_DUPLICITY_OPTIONS. The options are added to all blacklabelops/volumerize commands and scripts. E.g. the option --dry-run puts the whole container into demo mode, as all duplicity commands are only simulated.

Example:

$ docker run -d \
    --name volumerize \
    -v jenkins_volume:/source:ro \
    -v backup_volume:/backup \
    -v cache_volume:/volumerize-cache \
    -e "VOLUMERIZE_SOURCE=/source" \
    -e "VOLUMERIZE_TARGET=file:///backup" \
    -e "VOLUMERIZE_DUPLICITY_OPTIONS=--dry-run" \
    blacklabelops/volumerize

Will only operate in dry-run simulation mode.

Symmetric Backup Encryption

You can encrypt your backups by setting a secure passphrase inside the environment variable PASSPHRASE.

Creating a secure passphrase:

$ docker run --rm blacklabelops/volumerize openssl rand -base64 128

Prints an appropriate password on the console.
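You can also generate the passphrase locally and keep it out of your shell history by writing it to an env file. A sketch (the file name volumerize.env is my own choice, not something the image expects):

```shell
# Generate a random passphrase and store it in an env file
# that is readable only by the current user.
PASSPHRASE="$(openssl rand -base64 48 | tr -d '\n')"
umask 077
printf 'PASSPHRASE=%s\n' "$PASSPHRASE" > volumerize.env
```

Then pass it to the container with docker run --env-file volumerize.env instead of an inline -e argument.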

Example:

$ docker run -d \
    --name volumerize \
    -v jenkins_volume:/source:ro \
    -v backup_volume:/backup \
    -v cache_volume:/volumerize-cache \
    -e "VOLUMERIZE_SOURCE=/source" \
    -e "VOLUMERIZE_TARGET=file:///backup" \
    -e "PASSPHRASE=Jzwv1V83LHwtsbulVS7mMyijStBAs7Qr/V2MjuYtKg4KQVadRM" \
    blacklabelops/volumerize

Same functionality as described above but all backups will be encrypted.

Asymmetric Key-Based Backup Encryption

You can encrypt your backups with secure secret keys.

You need:

  • A key, specified by the environment variable VOLUMERIZE_GPG_PRIVATE_KEY
  • A key passphrase, specified by the environment variable PASSPHRASE

Creating a key? Install GnuPG on your computer and type:

$ gpg2 --full-gen-key
Please select what kind of key you want:
   (1) RSA and RSA (default)
   (2) DSA and Elgamal
   (3) DSA (sign only)
   (4) RSA (sign only)
Your selection? 1
RSA keys may be between 1024 and 4096 bits long.
What keysize do you want? (2048)
Requested keysize is 2048 bits   
Please specify how long the key should be valid.
         0 = key does not expire
      <n>  = key expires in n days
      <n>w = key expires in n weeks
      <n>m = key expires in n months
      <n>y = key expires in n years
Key is valid for? (0)
Key does not expire at all
Is this correct? (y/N) y

GnuPG needs to construct a user ID to identify your key.

Real name: YourName
Email address: [email protected]
Comment:                            
You selected this USER-ID:
    "YourName <[email protected]>"

Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit? O
$ gpg2 --export-secret-keys --armor [email protected] > MyKey.asc

Note: currently, this image only supports keys without a password. The import routine is at fault; it would otherwise always prompt for the password.

You need to get the key id:

$ gpg -k [email protected] | head -n 2 | tail -n 1 | awk '{print $1}'

Example:

$ docker run -d \
    --name volumerize \
    -v jenkins_volume:/source:ro \
    -v backup_volume:/backup \
    -v cache_volume:/volumerize-cache \
    -v $(pwd)/MyKey.asc:/key/MyKey.asc \
    -e "VOLUMERIZE_SOURCE=/source" \
    -e "VOLUMERIZE_TARGET=file:///backup" \
    -e "VOLUMERIZE_GPG_PRIVATE_KEY=/key/MyKey.asc" \
    -e "GPG_KEY_ID=<MyKeyID>" \
    -e "PASSPHRASE=" \
    blacklabelops/volumerize

This will import a key without a password set.

Test the routine!

$ docker exec volumerize backup

Enforcing Full Backups Periodically

The default behavior is that the initial backup is a full backup. Afterwards, Volumerize will perform incremental backups. You can enforce another full backup periodically by specifying the environment variable VOLUMERIZE_FULL_IF_OLDER_THAN.

The format is a number followed by one of the characters s, m, h, D, W, M, or Y (indicating seconds, minutes, hours, days, weeks, months, or years).

Examples:

  • After three days: 3D
  • After one month: 1M
  • After 55 minutes: 55m
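The accepted pattern is easy to validate before starting the container. A small sketch (the helper name valid_interval is my own, not part of the image):

```shell
# Returns success if the argument matches the interval format:
# a number followed by one of s, m, h, D, W, M or Y.
valid_interval() {
  echo "$1" | grep -Eq '^[0-9]+[smhDWMY]$'
}

valid_interval "7D" && echo "7D is a valid interval"
```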

Volumerize Example:

$ docker run -d \
    --name volumerize \
    -v jenkins_volume:/source:ro \
    -v backup_volume:/backup \
    -v cache_volume:/volumerize-cache \
    -e "TZ=Europe/Berlin" \
    -e "VOLUMERIZE_SOURCE=/source" \
    -e "VOLUMERIZE_TARGET=file:///backup" \
    -e "VOLUMERIZE_FULL_IF_OLDER_THAN=7D" \
    blacklabelops/volumerize

Will enforce a full backup after seven days.

For the difference between a full and incremental backup, see Duplicity's documentation.

Post scripts and pre scripts (prepost strategies)

Pre-scripts must be located at /preexecute/$duplicity_action/$your_scripts_here.

Post-scripts must be located at /postexecute/$duplicity_action/$your_scripts_here.

$duplicity_action folder must be named backup, restore or verify.

Note: backup action is the same for the scripts backup, backupFull, backupIncremental and periodicBackup.

All .sh files located in the $duplicity_action folder will be executed in alphabetical order.

When using prepost strategies, this will be the execution flow: pre-scripts -> stop containers -> duplicity action -> start containers -> post-scripts.

Some premade strategies are available at prepost strategies.
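A minimal pre-backup hook might look like the sketch below. It creates the script locally; inside the container it would live under /preexecute/backup/ and be picked up automatically. The file name 10-announce.sh is my own choice; only the folder and the .sh suffix matter:

```shell
# Create a hook script that runs before every backup action.
mkdir -p preexecute/backup
cat > preexecute/backup/10-announce.sh <<'EOF'
#!/bin/sh
# Runs before the duplicity backup action starts.
echo "pre-backup hook: starting at $(date -u '+%Y-%m-%d %H:%M:%S') UTC"
EOF
chmod +x preexecute/backup/10-announce.sh
```

Mount the folder into the container with -v $(pwd)/preexecute:/preexecute to activate it.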

Container Scripts

At container startup this image creates some convenience scripts. Under the hood blacklabelops/volumerize uses duplicity. To pass script parameters, see here for duplicity command line options: Duplicity CLI Options

Available scripts:

  • backup: Creates a backup with the container's configuration
  • backupFull: Creates a full backup with the container's configuration
  • backupIncremental: Creates an incremental backup with the container's configuration
  • list: Lists all available backups
  • verify: Compares the latest backup to your local files
  • restore: Be careful! Triggers an immediate forced restore from the latest backup
  • periodicBackup: The same script that is triggered by the periodic schedule
  • startContainers: Starts the specified Docker containers
  • stopContainers: Stops the specified Docker containers
  • remove-older-than: Deletes backups older than the given time (Time formats)
  • cleanCacheLocks: Cleans up old cache locks
  • prepoststrategy $execution_phase $duplicity_action: Executes all .sh files for the specified execution phase and duplicity action in alphabetical order

$execution_phase must be preAction or postAction.

$duplicity_action must be backup, verify or restore.

Example triggering script inside running container:

$ docker exec volumerize backup

Executes the script backup inside the container named volumerize.

Example passing script parameter:

$ docker exec volumerize backup --dry-run

--dry-run will simulate, not execute, the backup procedure.

Build The Project

Check out the project on GitHub.

Multiple Backups

You can specify multiple backup jobs in one container with enumerated environment variables. Each environment variable must be suffixed with a number starting at 1, e.g. VOLUMERIZE_SOURCE1, VOLUMERIZE_SOURCE2 or VOLUMERIZE_SOURCE3.

The following environment variables can be enumerated:

  • VOLUMERIZE_SOURCE
  • VOLUMERIZE_TARGET
  • VOLUMERIZE_CACHE
  • VOLUMERIZE_INCLUDE

When using multiple backup jobs you will have to specify a cache directory for each job. The minimum required environment variables for each job are:

  • VOLUMERIZE_SOURCE
  • VOLUMERIZE_TARGET
  • VOLUMERIZE_CACHE

The included helper scripts also change their behavior when you use enumerated environment variables: by default each script runs on all backup jobs.

Example: Executing the script backup will backup all jobs.

The first parameter of each script can be a job number, e.g. 1, 2 or 3.

Example: Executing the script backup 1 will only trigger backup on job 1.

Full example for multiple job specifications:

$ docker run -d \
    --name volumerize \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v jenkins_volume:/source:ro \
    -v jenkins_volume2:/source2:ro \
    -v backup_volume:/backup \
    -v backup_volume2:/backup2 \
    -v cache_volume:/volumerize-cache \
    -v cache_volume2:/volumerize-cache2 \
    -e "VOLUMERIZE_CONTAINERS=jenkins jenkins2" \
    -e "VOLUMERIZE_SOURCE1=/source" \
    -e "VOLUMERIZE_TARGET1=file:///backup" \
    -e "VOLUMERIZE_CACHE1=/volumerize-cache" \
    -e "VOLUMERIZE_SOURCE2=/source2" \
    -e "VOLUMERIZE_TARGET2=file:///backup2" \
    -e "VOLUMERIZE_CACHE2=/volumerize-cache2" \
    blacklabelops/volumerize

Build the Image

$ docker build -t blacklabelops/volumerize .

Run the Image

$ docker run -it --rm blacklabelops/volumerize bash

Contributors

alexsat, auphofbsf, benjamin-mack-fnt, cybre-finn, davidjameshowell, eedrah, flowgunso, jwillmer, mikafouenski, miladiir, rockerboo, shnorbluk, srzorro, sven-weber, zhangriyueming


Issues

GUI support (please...)

I really love this well-made image for docker :)

How about supporting a simple web interface?
Maybe that could make more people (including me..) join this great project XD

Option passed to duplicity causes error (only when running in volumerize)

I have the following environment vars:

VOLUMERIZE_SOURCE: /source
VOLUMERIZE_TARGET: pexpect+scp://app:[REDACTED]@server-backup//media/offsite-backups/data
VOLUMERIZE_DUPLICITY_OPTIONS: --scp-command "scp -l 100000"
PASSPHRASE: [REDACTED]

Now if I open a shell into the running container, I get the following when running ./backup:

bash-5.0# ./backup
Usage: 
  duplicity [full|incremental] [options] source_dir target_url
  duplicity [restore] [options] source_url target_dir
  duplicity verify [options] source_url target_dir
  duplicity collection-status [options] target_url
  duplicity list-current-files [options] target_url
  duplicity cleanup [options] target_url
  duplicity remove-older-than time [options] target_url
  duplicity remove-all-but-n-full count [options] target_url
  duplicity remove-all-inc-of-but-n-full count [options] target_url

Backends and their URL formats:
  cf+http://container_name
  file:///some_dir
  ftp://user[:password]@other.host[:port]/some_dir
  ftps://user[:password]@other.host[:port]/some_dir
  hsi://user[:password]@other.host[:port]/some_dir
  imap://user[:password]@other.host[:port]/some_dir
  rsync://user[:password]@other.host[:port]::/module/some_dir
  rsync://user[:password]@other.host[:port]/relative_path
  rsync://user[:password]@other.host[:port]//absolute_path
  s3://other.host[:port]/bucket_name[/prefix]
  s3+http://bucket_name[/prefix]
  scp://user[:password]@other.host[:port]/some_dir
  ssh://user[:password]@other.host[:port]/some_dir
  swift://container_name
  tahoe://alias/directory
  webdav://user[:password]@other.host/some_dir
  webdavs://user[:password]@other.host/some_dir
  gdocs://user[:password]@other.host/some_dir
  pydrive://[email protected]/some_dir
  mega://user[:password]@other.host/some_dir
  copy://user[:password]@other.host/some_dir
  dpbx:///some_dir
  onedrive://some_dir
  azure://container_name
  b2://account_id[:application_key]@bucket_name/[some_dir/]
  mf://user[:password]@other.host/some_dir

Commands:
  cleanup <target_url>
  collection-status <target_url>
  full <source_dir> <target_url>
  incr <source_dir> <target_url>
  list-current-files <target_url>
  restore <source_url> <target_dir>
  remove-older-than <time> <target_url>
  remove-all-but-n-full <count> <target_url>
  remove-all-inc-of-but-n-full <count> <target_url>
  verify <target_url> <source_dir>

duplicity: error: no such option: -l

I then put an echo before line 34 in the backup script, copy and paste the printed command into the shell and the backup executes as expected.

Please note that the pexpect+scp:// url depends on my pull request #74 in order to work.

Restore not working

Project looks cool and feature rich. However, I'm failing to get it working on a reduced test case to see how it works in practice.

test.zip

If you read the readme I've explained the series of steps taken. I'm sure there's something really obvious I've missed.

Extract of the readme for your ease:

A simple container that just logs the current time to a file time.txt in an attached volume.

We can use this to test that volumerize does indeed work.

Right run the container:

docker-compose up

You will see it generating time stamps. At some point we decide it's time to back it up.

We stop our container:

docker-compose stop

We start the volumerize container:

docker run -d \
    --name volumerize \
    -v test-volume:/source/test-volume:ro \
    -v backup_volume:/backup \
    -v cache_volume:/volumerize-cache \
    -e "VOLUMERIZE_SOURCE=/source" \
    -e "VOLUMERIZE_TARGET=file:///backup" \
    blacklabelops/volumerize

We run a backup:

docker exec volumerize backup

We can stop volumerize and start our container again:

docker stop volumerize
docker-compose up

All our data should still be there:

docker-compose exec test cat /home/test/time.txt

Now let's destroy our container and its volumes:

docker-compose down -v

We can start the container again to check that the data was lost:

docker-compose up
docker-compose exec test cat /home/test/time.txt

Right we've lost our data. Let's restore it.

Stop our container and start volumerize.

docker-compose down
docker run --rm \
    -v test-volume:/source/test-volume \
    -v backup_volume:/backup:ro \
    -v cache_volume:/volumerize-cache \
    -e "VOLUMERIZE_SOURCE=/source" \
    -e "VOLUMERIZE_TARGET=file:///backup" \
    blacklabelops/volumerize restore

We can start the container again to check that the data was recovered:

docker-compose up
docker-compose exec test cat /home/test/time.txt

At this point I end up with time.txt not containing the original lines from before I forcibly took down the test-volume by running docker-compose down -v

Allow listing of available backups and script to restore from a specific date

I think I can make the pull request for this, but I might need your help verifying it.

I was going to use this to show the backups:

duplicity collection-status file:///backup

and then look into if I can reuse your "restore" script to choose the date. If I can, then I'll just add documentation saying this.

Watch this space for the pull request shortly.

Collect Exit Code

In issue #43 all errors are ignored in order to have containers started regardless of errors.

In this issue the original return code of duplicity should be preserved, and duplicity's return code should become each script's return code.

Is it possible to add an environment variable to control how many full backups to keep?

Hello,

Thanks for the great tool. I started using it today, but I'm missing the option to only keep 2 (or n) full backups.

I'm currently using the following environment variables:
-e "VOLUMERIZE_JOBBER_TIME=0 */5 * * * *" \ (for testing purposes, an incremental backup every 5 minutes)
-e "VOLUMERIZE_FULL_IF_OLDER_THAN=1h" (for testing purposes, a fresh full backup after 1 hour)

here is what I found in the duplicity man page:
remove-all-inc-of-but-n-full [--force]
Delete incremental sets of all backups sets that are older than the count:th last full backup (in other words, keep only old full backups and not their increments). count must be larger than zero. A value of 1 means that only the single most recent backup chain will be kept intact. Note that --force will be needed to delete the files instead of just listing them.

As I'm new to docker, Is there a way to achieve this with the current version of volumerize or is it something that needs to be added?

Thx for your work!

Bye,
Jack

Using this image with Hubic

Hi! Thanks for the great work! But it's not clear to me how to use backends that are mentioned in the Duplicity manual but not described in the backend tutorials in this repo. For example: Hubic.

Thanks for help!

Repository Deprecated

I will stop supporting and implementing Docker images starting in early 2019.

Repositories will be transferred to blacklabelops-legacy and archived for read only purpose.

The following repositories and images are affected:

January 2019

  • blacklabelops/crowd
  • blacklabelops/nginx
  • blacklabelops/bitbucket
  • blacklabelops/java
  • blacklabelops/letsencrypt
  • blacklabelops/postgres

March 2019

  • blacklabelops/baseimages
  • blacklabelops/confluence
  • blacklabelops/jira
  • blacklabelops/volumerize
  • blacklabelops/logrotate

<Snip>

Sorry, I realised I made a user error shortly after creating the issue. If possible, please delete this issue :)

Notification on successful backup

Great work! You make an otherwise tedious backup task really simple.

I was wondering if it is somehow possible to automatically send notifications to an email address on successful (or failed) backups?

Refactoring Image Scripts

Image scripts are currently written via cat >> into the image's file system.

This is hard to maintain and extend.

The next version's scripts will use environment variables to function. There is no reason to generate static scripts.

Environment variable BACKUP_TYPE its not working as expected

If the container is executed as docker run <args> blacklabelops/volumerize [backup/restore/etc], the BACKUP_TYPE env var is set correctly. But if the action is not specified (i.e. you let the cron job do its thing), it is not set correctly.

stderr warning on backup

There is a stderr warning for urllib3 at the end of each backup:

/usr/lib/python2.7/site-packages/requests/__init__.py:80: RequestsDependencyWarning: urllib3 (1.23) or chardet (3.0.4) doesn't match a supported version!

Multiple volumes backup + S3

Does multiple volume backup work with s3?

The following writes directly to the root folder of VOLUMERIZE_TARGET, in this case the root folder of BUCKET.

docker run -d \
    --name volumerize \
    -v volume1:/source/volume1:ro \
    -v volume2:/source/volume2:ro \
    -e "VOLUMERIZE_SOURCE=/source" \
    -e "VOLUMERIZE_TARGET=s3://s3.eu-central-1.amazonaws.com/BUCKET" \
    -e "AWS_ACCESS_KEY_ID=KEY_ID" \
    -e "AWS_SECRET_ACCESS_KEY=SECRET" \
    blacklabelops/volumerize

A nicer layout would have been:

Bucket:

  • root
    • volume1
      • backup 1
      • backup 2
    • volume2
      • backup 1
      • backup 2

Restore with overwriting files

My goal is to restore with overwriting of existing files. I made a backup, changed a file inside the container, and restored; but the file is not overwritten. Only if I delete the file is it restored. I use volumerize with docker-compose with the following config.

  backups:
    container_name: volumerize
    image: myvolumerize
    restart: unless-stopped
    environment:
        - TZ=Europe/Warsaw
        - VOLUMERIZE_SOURCE=/source
        - VOLUMERIZE_TARGET=file:///backups
        - VOLUMERIZE_JOBBER_TIME=0 40 12 * * *
        - VOLUMERIZE_CONTAINERS=backend frontend postgres redis solr
        - DOCKER_API_VERSION=1.39
    volumes:
        - /var/run/docker.sock:/var/run/docker.sock
        - ~/data/backup-cache:/volumerize-cache
        ...
        - env:/source/src/.env
        - _gunicorn_access:/source/src/._gunicorn_access.txt
        - gunicorn_log:/source/src/gunicorn_log.txt

I checked: if I back up a single file, overwriting is not possible (maybe I'm mistaken), but if it is a folder, the changed file is overwritten on restore.

Define Host Path

Is it possible to define Local Host Path as VOLUMERIZE_TARGET?

Internal CA certificate cannot be verified

Hello

I'm trying to use volumerize (docker) with an internal server, which has a certificate signed by a company-internal CA.

I'm trying to pass the certificate in via environment variable, but the duplicity process is not picking this up, or not using this mechanism.

What is the correct way to have volumerize/duplicity trust the internal CA certificate ?

Volumerize command:

 docker run
	--rm
	--name volumerize-s3_minio.company.lan_volumerize.mybuntu-3.appgrafana
	-v appgrafana_grafana:/source/appgrafana_grafana:ro
	-v appgrafana_opentsdb:/source/appgrafana_opentsdb:ro
	-v appgrafana_whisper:/source/appgrafana_whisper:ro
	-v /home/git/CA-Deployer/root-certificates/rnd-ca.cer:/etc/pki/server-cert.cer
	-e SSL_CERT_FILE=/etc/pki/server-cert.cer
	-e SSL_CERT_DIR=/etc/pki
	-v volumerize_s3_minio.company.lan_volumerize.mybuntu-3.appgrafana_cache:/volumerize_cache
	-v volumerize_s3_minio.company.lan_volumerize.mybuntu-3.appgrafana_credentials:/credentials
	-e VOLUMERIZE_SOURCE=/source
	-e VOLUMERIZE_TARGET=s3://minio.company.lan/volumerize.mybuntu-3.appgrafana
	-e AWS_ACCESS_KEY_ID=$KEY
	-e AWS_SECRET_ACCESS_KEY=$SECRET
	-e VOLUMERIZE_FULL_IF_OLDER_THAN=6D
	--env-file /tmp/tmp.tr6tIIyrjK blacklabelops/volumerize:1.1.0 backup

The output I get is

- name: VolumerizeBackupJob
  cmd: /etc/volumerize/periodicBackup
  time: '0 0 4 * * *'
  onError: Continue
  notifyOnError: false
  notifyOnFailure: false

Traceback (most recent call last):
  File "/usr/bin/duplicity", line 1546, in <module>
    with_tempdir(main)
  File "/usr/bin/duplicity", line 1540, in with_tempdir
    fn()
  File "/usr/bin/duplicity", line 1375, in main
    action = commandline.ProcessCommandLine(sys.argv[1:])
  File "/usr/lib/python2.7/site-packages/duplicity/commandline.py", line 1126, in ProcessCommandLine
    backup, local_pathname = set_backend(args[0], args[1])
  File "/usr/lib/python2.7/site-packages/duplicity/commandline.py", line 1015, in set_backend
    globals.backend = backend.get_backend(bend)
  File "/usr/lib/python2.7/site-packages/duplicity/backend.py", line 223, in get_backend
    obj = get_backend_object(url_string)
  File "/usr/lib/python2.7/site-packages/duplicity/backend.py", line 209, in get_backend_object
    return factory(pu)
  File "/usr/lib/python2.7/site-packages/duplicity/backends/_boto_single.py", line 166, in __init__
    self.resetConnection()
  File "/usr/lib/python2.7/site-packages/duplicity/backends/_boto_single.py", line 191, in resetConnection
    location=self.my_location)
  File "/usr/lib/python2.7/site-packages/boto/s3/connection.py", line 616, in create_bucket
    data=data)
  File "/usr/lib/python2.7/site-packages/boto/s3/connection.py", line 668, in make_request
    retry_handler=retry_handler
  File "/usr/lib/python2.7/site-packages/boto/connection.py", line 1071, in make_request
    retry_handler=retry_handler)
  File "/usr/lib/python2.7/site-packages/boto/connection.py", line 1030, in _mexe
    raise ex
SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:661)

I have confirmed that the container's Python does see the certificate:

$> docker exec -it volumerize-s3_minio.company.lan_volumerize.mybuntu-3.appgrafana bash
bash-4.3# ps aux
PID   USER     TIME   COMMAND
    1 root       0:00 /bin/tini -- /opt/volumerize/docker-entrypoint.sh backup
    6 root       0:00 {backup} /bin/bash /etc/volumerize/backup
   29 root       0:02 python2 /usr/bin/duplicity --allow-source-mismatch --arch
   34 root       0:00 bash
   40 root       0:00 ps aux
bash-4.3# python -c "import ssl; print(ssl.get_default_verify_paths())" | sed 's/,/\n/g'
DefaultVerifyPaths(cafile='/etc/pki/server-cert.cer'
 capath='/etc/ssl/certs'
 openssl_cafile_env='SSL_CERT_FILE'
 openssl_cafile='/etc/ssl/cert.pem'
 openssl_capath_env='SSL_CERT_DIR'
 openssl_capath='/etc/ssl/certs')
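One possible explanation, based on the traceback: duplicity's S3 backend goes through boto, which loads its CA bundle from its own configuration rather than honoring `SSL_CERT_FILE`. A hedged, untested workaround sketch is to point boto at the internal CA via `/etc/boto.cfg` (the cert path mirrors the `docker run` above):

```shell
# Sketch (assumption, not verified against this image): boto honors
# ca_certificates_file in its config file, so mounting a boto.cfg that
# names the internal CA may make the S3 backend trust it.
cat > /tmp/boto.cfg <<'EOF'
[Boto]
ca_certificates_file = /etc/pki/server-cert.cer
EOF
# then add to the docker run above:
#   -v /tmp/boto.cfg:/etc/boto.cfg:ro
```

If this works, it confirms the problem is boto's bundle rather than the OpenSSL defaults that Python reports.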

B2 backup broken because of known bug

Local and Remote metadata are synchronized, no sync needed.
Last full backup left a partial set, restarting.
Last full backup date: Wed Sep 12 16:30:18 2018
RESTART: The first volume failed to upload before termination.
         Restart is impossible...starting backup from beginning.
Local and Remote metadata are synchronized, no sync needed.
Last full backup date: none
No signatures found, switching to full backup.
Attempt 1 failed. AttributeError: B2ProgressListener instance has no attribute '__exit__'
Attempt 2 failed. AttributeError: B2ProgressListener instance has no attribute '__exit__'

This should be fixed in duplicity 0.7.18.

Can we update?

https://askubuntu.com/questions/1067296/how-to-fix-duplicity-b2-backblaze-error
https://bugs.launchpad.net/duplicity/+bug/1785520

Support Multiple Backup Jobs

Requirements:

  • Enable multiple job definitions per environment variables.
  • Scripts should decide between single execution (legacy mode) and execute multiple jobs.
  • Scripts should be able to return error code when any job fails.
  • Scripts should be able to execute all jobs or a specific job.

x86_64 only: can it be made architecture-independent?

Hello
Could you modify its Dockerfile to make it architecture-independent?

I think you should find a solution to replace this line:
curl -fSL "https://get.docker.com/builds/Linux/x86_64/docker-${DOCKER_VERSION}.tgz" -o /tmp/docker.tgz &&

Thanks & regards

Dealing with duplicity cache lock (.lock file)

@blacklabelops
While I was testing volumerize with a small amount of storage, the duplicity backup process was interrupted because of the storage limitation.

This left a .lock file in the /volumerize-cache directory, which causes the error below:

Another instance is already running with this archive directory
If you are sure that this is the  only instance running you may delete
the following lockfile and run the command again :
/volumerize-cache/3fe07cc0f71075f95f411fb55ec60120/lockfile.lock

How about adding a script that removes stale lock files before starting the backup?
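A minimal sketch of such a pre-backup step (not part of the image; the cache path follows the image's `/volumerize-cache` convention, and the "only safe when nothing else runs" check is a simple `pgrep`):

```shell
# Hedged sketch: delete stale duplicity lockfiles before a backup,
# but only when no duplicity process is currently running.
clean_stale_locks() {
    cache_dir="$1"
    [ -d "$cache_dir" ] || return 0
    if pgrep -x duplicity >/dev/null 2>&1; then
        echo "duplicity is running, keeping lockfiles" >&2
        return 1
    fi
    # lockfile.lock lives under a per-backup hash directory in the cache
    find "$cache_dir" -name 'lockfile.lock' -type f -delete
}

clean_stale_locks /volumerize-cache
```

This only handles locks left behind by a crashed run; a lock belonging to a live process must never be deleted, which is what the `pgrep` guard approximates.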

VOLUMERIZE_CONTAINERS stop start order

I have 2 containers:

  • container_db
  • container_server --link container_db:db

I have volumerize container with variable VOLUMERIZE_CONTAINERS="container_server container_db"

I've noticed that the containers in the variable VOLUMERIZE_CONTAINERS are stopped and started in the same order (the order in the list).

Following the example, container_server is stopped first, then container_db. That's the desired behaviour.

However, when restarting the containers, container_server is started first, but there is an error that says:

Error response from daemon: Cannot link to a non running container: /container_db AS /container_server/db
Error: failed to start containers: container_server

I think containers should be stopped in the indicated order and restarted in reverse order. What do you think?
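The proposed behavior can be sketched in shell. The `docker stop`/`docker start` calls are left as comments so the snippet runs standalone; container names are from the example above:

```shell
# Sketch: stop containers in the listed order, start them in reverse,
# so container_db is already running before container_server links to it.
reverse_list() {
    out=""
    for item in "$@"; do out="$item${out:+ $out}"; done
    printf '%s\n' "$out"
}

CONTAINERS="container_server container_db"
for c in $CONTAINERS; do echo "stop $c"; done                   # docker stop "$c"
for c in $(reverse_list $CONTAINERS); do echo "start $c"; done  # docker start "$c"
```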

Display the name of the source when executing multiple volume backup

When executing a multiple volume backup, the console displays a lot of information but not the source volume, e.g.:

Local and Remote metadata are synchronized, no sync needed.
Last full backup date: Tue Apr 21 10:45:21 2020
--------------[ Backup Statistics ]--------------
StartTime 1587737867.32 (Fri Apr 24 10:17:47 2020)
EndTime 1587737867.46 (Fri Apr 24 10:17:47 2020)
ElapsedTime 0.14 (0.14 seconds)
SourceFiles 651
SourceFileSize 9548079 (9.11 MB)
NewFiles 0
NewFileSize 0 (0 bytes)
DeletedFiles 0
ChangedFiles 0
ChangedFileSize 0 (0 bytes)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 0
RawDeltaSize 0 (0 bytes)
TotalDestinationSizeChange 20 (20 bytes)
Errors 0
-------------------------------------------------

When backing up a lot of volumes, the name of the source is critical.
Please add it to the log.

Containers are not restarted on failure

I've noticed that when a backup fails (in my case it's due to the underlying mount point being unavailable), the docker containers that are automatically stopped are not restarted:

Local and Remote metadata are synchronized, no sync needed.
Last full backup left a partial set, restarting.
Last full backup date: Thu Sep  6 16:51:36 2018
RESTART: The first volume failed to upload before termination.
         Restart is impossible...starting backup from beginning.
Local and Remote metadata are synchronized, no sync needed.
Last full backup date: none
No signatures found, switching to full backup.
Giving up after 1 attempts. IOError: [Errno 2] No such file or directory: '/backup/duplicity-full.20180906T165246Z.vol1.difftar.gz'

Would it be possible to ensure that docker containers are started back up even when there is an error backing up?
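One way to guarantee this, sketched below, is registering the restart step as an EXIT trap so it runs whether the backup succeeds or fails. The `docker stop`/`docker start` calls are replaced by `echo` stand-ins here so the pattern runs as-is:

```shell
# Sketch of the trap pattern: restart_containers fires even when the
# backup command fails. In the real script, the echoes would be
# docker stop/start loops over $CONTAINERS.
CONTAINERS="container_db container_server"
restart_containers() { echo "restart: $CONTAINERS"; }

run_backup() {
    trap restart_containers EXIT   # fires on success, error, or signal
    echo "stop: $CONTAINERS"
    false                          # stands in for a failing duplicity run
}
( run_backup ) || true             # subshell so the trap fires right here
```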

Support Restore of multiple volumes at once

I see in the docs that backing up multiple volumes at once is supported by mounting multiple local volumes into the /source folder. I haven't found a way, however, to restore those individual volumes back to individual volumes instead of one big volume with multiple sub-folders. Right now I just have a big script that restores each individually, but it would be nice if there were a way to do this natively.
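The "big script" approach can be sketched generically, assuming each volume was mounted as a subfolder of /source during backup: restore everything into a scratch directory, then copy each subfolder back into its own volume. The docker calls stay commented and the paths are illustrative:

```shell
# Sketch: restore the whole backup into a scratch dir, then fan the
# per-volume subfolders back out into individual named volumes.
RESTORE_DIR=$(mktemp -d)
# docker run --rm -v "$RESTORE_DIR":/source ... blacklabelops/volumerize restore

for vol in "$RESTORE_DIR"/*/; do
    [ -d "$vol" ] || continue
    name=$(basename "$vol")
    echo "would copy $vol back into volume $name"
    # docker run --rm -v "$name":/dest -v "$vol":/src:ro alpine cp -a /src/. /dest/
done
```

This relies on the convention that the subfolder name under /source matches the volume name, which is how the multi-volume backup examples mount them.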

Adding `remove-all-but-n-full`, `remove-all-inc-of-but-n-full` scripts

@blacklabelops
Maybe this issue is related to issue #5.
This request simply adds two removal methods to the image, so that most issues dealing with removal will be solved :)

I don't know much about shell scripting, but I made an example to explain the request.

e.g.
at /imagescripts/create_scripts.sh

cat > ${VOLUMERIZE_SCRIPT_DIR}/remove-all-but-n-full <<_EOF_
#!/bin/bash
set -o errexit
exec ${DUPLICITY_COMMAND} remove-all-but-n-full ${PARAMETER_PROXY} ${DUPLICITY_OPTIONS} ${VOLUMERIZE_INCUDES} ${VOLUMERIZE_TARGET}
_EOF_

cat > ${VOLUMERIZE_SCRIPT_DIR}/remove-all-inc-of-but-n-full <<_EOF_
#!/bin/bash
set -o errexit
exec ${DUPLICITY_COMMAND} remove-all-inc-of-but-n-full ${PARAMETER_PROXY} ${DUPLICITY_OPTIONS} ${VOLUMERIZE_INCUDES} ${VOLUMERIZE_TARGET}
_EOF_

e.g.
Removing everything except one full backup set, every day at 3 a.m.:

$ docker run -d \
    --name volumerize \
    -v jenkins_volume:/source:ro \
    -v backup_volume:/backup \
    -e "VOLUMERIZE_SOURCE=/source" \
    -e "VOLUMERIZE_TARGET=file:///backup" \
    -e "JOB_NAME2=RemoveOldFullBackups" \
    -e "JOB_COMMAND2=/etc/volumerize/remove-all-but-n-full 1" \
    -e "JOB_TIME2=0 0 3 * * *" \
    blacklabelops/volumerize

It was just my imagination :)
Can it be implemented?

Question: Should I run one instance of volumerize for each container I want backed up?

Hey, I just stumbled upon this application and it looks very useful, but I'm not sure how I should use it.

Let's say I want to back up the data of two applications running in Docker.

Should I add all volumes across the two applications to a single volumerize instance, or run separate instances, for example a volumerize_app1 and a volumerize_app2 container?

Stop and Start Containers in Docker Swarm mode

Hi,

This is more of a question than an issue. I used volumerize to back up some Docker Compose managed named container volumes and it works great. Now I have moved to Docker Swarm and I am running into problems.

One of them was getting the right volume names, which I resolved by putting the volumerize container in the same stack.

The second problem, not resolved yet, is stopping and starting containers. Since containers managed by Swarm have names that cannot be determined in advance, I cannot use the VOLUMERIZE_CONTAINERS variable.

Any clue on how to work around this?
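One possible workaround sketch: Swarm task containers carry their service name in the `com.docker.swarm.service.name` label, so the runtime container name or id can be resolved just before the backup. `mystack_db` is a hypothetical `stack_service` name:

```shell
# Sketch: resolve the first running task container of a swarm service
# by its service-name label, so VOLUMERIZE_CONTAINERS can be filled in
# at backup time instead of being hardcoded.
resolve_swarm_container() {
    docker ps -q --filter "label=com.docker.swarm.service.name=$1" | head -n 1
}
# e.g.: VOLUMERIZE_CONTAINERS="$(resolve_swarm_container mystack_db)"
```

Note that stopping a task container may just cause Swarm to reschedule it; scaling the service to 0 replicas before the backup is the more Swarm-native approach.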

/credentials folder is not used for Amazon S3 backups

It looks like the /credentials folder is only used by Google Drive and not Amazon S3, but in backends/AmazonS3/README.md all docker run examples contain a reference to a volume called volumerize_credentials mounted to that folder.

Is this intended for a future feature where the access key and secret key are saved there, or is it a typo caused by duplicating the Google Drive readme when creating the docs?

Restore only partially working

Right, I'm being a bit nuts here and playing with volumerize to back up a Ghost blog (not sure this is a good idea). In any case, I'm experiencing issues. Backing up and restoring has been working great on my developer machine.

Then I wished to deploy it to my server, so I created a docker-machine on DigitalOcean and brought the service up on it. After that I did a restore from a backup I had on Dropbox. The MySQL database was correctly restored; however, the Ghost volumes were not. I tried to restore them again and got lots of warnings about pre-existing files.

docker run --rm \
    --name volumerize \
    -v server_heiberg-volume:/source/server_heiberg-volume \
    -v backup_volume:/backup \
    -v cache_volume:/volumerize-cache \
    -e "VOLUMERIZE_SOURCE=/source" \
    -e "VOLUMERIZE_TARGET=dpbx:///server" \
    -e "DPBX_ACCESS_TOKEN=q3dUBFFaGn0AAAAAAABpsyQ0kqDy7GJTpcs1rNUf_6JL2WhNfTT5ITqLMYQdSETK" \
    blacklabelops/volumerize restore
---

- name: VolumerizeBackupJob
  cmd: /etc/volumerize/periodicBackup
  time: '0 0 4 * * *'
  onError: Continue
  notifyOnError: false
  notifyOnFailure: false

Local and Remote metadata are synchronized, no sync needed.
Last full backup date: Fri Sep 14 03:32:46 2018
Error '[Errno 17] File exists' processing server_heiberg-volume/config.development.json
Error '[Errno 17] File exists' processing server_heiberg-volume/content/themes/casper
Error '[Errno 17] File exists' processing server_heiberg-volume/content.orig/themes/casper
Error '[Errno 17] File exists' processing server_heiberg-volume/current
Error '[Errno 17] File exists' processing server_heiberg-volume/versions/2.0.3/node_modules/.bin/acorn
Error '[Errno 17] File exists' processing server_heiberg-volume/versions/2.0.3/node_modules/.bin/analytics
...
Error '[Errno 17] File exists' processing server_heiberg-volume/versions/2.0.3/node_modules/static-eval/node_modules/escodegen/node_modules/.bin/esvalidate
Error '[Errno 17] File exists' processing server_heiberg-volume/versions/2.0.3/node_modules/superagent/node_modules/.bin/mime
Error '[Errno 17] File exists' processing server_heiberg-volume/versions/2.0.3/node_modules/tar/node_modules/.bin/mkdirp
Error '[Errno 17] File exists' processing server_heiberg-volume/versions/2.0.3/node_modules/write/node_modules/.bin/mkdirp
Error '[Errno 17] File exists' processing server_heiberg-volume/versions/2.0.3/node_modules/yargs/node_modules/.bin/window-size

I did some digging and found:

${DUPLICITY_COMMAND} restore --force ${PARAMETER_PROXY} ${DUPLICITY_OPTIONS} ${VOLUMERIZE_INCUDES} ${VOLUMERIZE_TARGET} ${VOLUMERIZE_SOURCE} || true
which suggests this is really a duplicity error.

Which lead me to https://askubuntu.com/questions/594879/duplicity-command-line-restore-usage

first.. duplicity will never overwrite data while restoring. that is a design choice to protect your data. what you've got to do is to restore to a temporary folder and copy/move the restored data manually to your desired location.

That's not exactly what I was expecting from this tool. The weird thing is that everything was working just fine on my local machine. What am I doing wrong here?
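A commonly suggested workaround sketch (a guess at the cause, not a confirmed fix): make the restore target empty before re-running the restore, since duplicity reports `File exists` for paths already present. The helper clears a directory's contents, including dotfiles, without removing the directory itself; verify the backup before pointing this at real data.

```shell
# Sketch: empty the target directory so the restore starts clean.
# DANGEROUS on real data; check the backup first.
wipe_dir() {
    d="$1"
    [ -d "$d" ] || return 1
    find "$d" -mindepth 1 -maxdepth 1 -exec rm -rf {} +
}
# inside the container, before re-running the restore:
# wipe_dir /source/server_heiberg-volume
```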

Pre-execute and post-execute scripts

Add the ability to execute scripts before and after volumerize does its job.

In my setup I need to back up MariaDB and MongoDB containers, but instead of saving the volumes as-is I need a way to back up the databases in processed form (i.e. mongodump and mysqldump) and save the final processed result with volumerize.

So, instead of adding a special way to do it for Mongo and another for MySQL, I think it's better if there is some way to execute arbitrary scripts, so that someone with a different need (like copying some files around) can implement their own pre-execute and post-execute strategies.
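A hedged sketch of such a pre-backup hook, run from an external wrapper rather than as a built-in feature: dump the databases into a folder under VOLUMERIZE_SOURCE so the backup contains consistent dumps instead of live datafiles. The container names (`mariadb`, `mongodb`) are illustrative:

```shell
# Sketch: a pre-execute hook that writes database dumps into the
# backed-up source folder before "volumerize backup" runs.
DUMP_DIR="${VOLUMERIZE_SOURCE:-/source}/dumps"

dump_databases() {
    mkdir -p "$DUMP_DIR"
    # hypothetical container names; mysqldump/mongodump write to stdout
    docker exec mariadb sh -c \
        'exec mysqldump --all-databases -uroot -p"$MYSQL_ROOT_PASSWORD"' \
        > "$DUMP_DIR/mariadb.sql"
    docker exec mongodb mongodump --archive > "$DUMP_DIR/mongo.archive"
}
# call dump_databases before "volumerize backup", e.g. from a wrapper script
```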

[feature] automatic named volume backup?

I recently tried bivac, and the idea of automatically including all named volumes, combined with a blacklist option, made a lot of sense to me. I often create new volumes for containers, and having to manually add them to a backup list is error-prone.

So maybe it's worth thinking about adding a similar feature to volumerize, since local filesystem backups unfortunately do not work in bivac.

healthcheck dry-run

Hi there,

I want to docker-health-check this backup service itself. Therefore I try to run the script in --dry-run mode:
test: "/opt/volumerize/docker-entrypoint.sh backup --dry-run"

This seems to work fine, unless I am additionally using the VOLUMERIZE_CONTAINERS environment variable. So I have to ask: is stopping and starting the containers defined in VOLUMERIZE_CONTAINERS during a dry run really desired behavior? I would like those stops and starts to be skipped in dry-run mode, so that this statement can be used as a health check.

[enhancement] Container restart

Hi,
I have tried volumerize and it's very useful for saving my volumes.
I'm using the container restart option and I noticed a small problem.
When it saves multiple containers, it stops them all, saves all of them, and restarts them all at the end.
It takes a long time to save everything, and during that time they are all stopped.
Would it be possible to save the containers one by one (stop -> save -> start, stop -> save -> start, etc.)?

Sorry for the bad English, I'm French you know :)

Thomas

"Local and Remote metadata are synchronized, no sync needed"

Hello!
I tried the Jenkins example:

  1. Launch a Jenkins container
  2. Add some items
  3. Back up
  4. Add some more items
  5. Stop the Jenkins container
  6. Restore

Then I get the following message:

docker run --rm \
    -v jenkins_volume:/source \
    -v b_jenkins_volume:/backup:ro \
    -v cache_volume:/volumerize-cache \
    -e "VOLUMERIZE_SOURCE=/source" \
    -e "VOLUMERIZE_TARGET=file:///backup" \
    blacklabelops/volumerize restore

  • name: VolumerizeBackupJob
    cmd: /etc/volumerize/periodicBackup
    time: '0 0 4 * * *'
    onError: Continue
    notifyOnError: false
    notifyOnFailure: false

/usr/lib/python2.7/site-packages/requests/__init__.py:80: RequestsDependencyWarning: urllib3 (1.23) or chardet (3.0.4) doesn't match a supported version!
  RequestsDependencyWarning)
Local and Remote metadata are synchronized, no sync needed.
Last full backup date: Thu Jun 7 14:34:23 2018

  7. Start the Jenkins container

All items are still in the Jenkins dashboard.
Any ideas?

Thanks in advance!

ARM support

I want to build the Docker image on my arm32v7 server and see that the Dockerfile uses "blacklabelops/alpine:3.8" as the base image. Is there some magic behind this image, or can it easily be replaced with the standard Alpine image?

Run as particular user

By default this seems to run as the root user, so the backup files are created with root-only permissions.

I want to then sync the backup files remotely, which requires that a different user be given access to them. Could we get an option to specify the user to run the duplicity commands as?
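A workaround sketch while the image runs duplicity as root: chown the backup volume's contents to the sync user after each run. The helper name is hypothetical, and 1000:1000 is an assumed uid:gid for the sync user:

```shell
# Sketch: hand the root-owned backup files over to a non-root user
# after each backup, using a throwaway alpine container.
fix_backup_ownership() {
    docker run --rm -v backup_volume:/backup alpine chown -R "$1" /backup
}
# e.g. after a backup:
# fix_backup_ownership 1000:1000
```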
