prune_backups Go

A small tool to prune a bunch of backup directories to the typical pattern of one per hour for a day, one per day for a month, and then one per month.

What does the tool do?

The tool prune_backups takes one directory name as a command-line argument. It looks for subdirectories in the given directory whose names start with the pattern YYYY-MM-DD_HH-mm. The tool interprets these directory names as dates and keeps exactly one directory for the current hour, one for the previous hour, and so on. Within each time slot it always keeps the latest directory and moves all others to a newly created subdirectory 'to_delete'. The tool only moves directories and does not actually delete anything!

What will a pruned directory look like?

Scenario: This example assumes you have a cron job running hourly in the 49th minute, with each run creating a separate backup directory (for example with rsync --link-dest). Today is the 17th of June 2024, and it is 09:54 in the morning when you run prune_backups. It will leave your backup directory (-dir parameter) with the following directory layout:

Directory name Directory name (cntd.) Directory name (cntd.) Directory name (cntd.)
🟨 2024-06-17_09-49/ 🟨 2024-06-16_18-49/ 🟦 2024-06-09_23-49/ 🟦 2024-05-25_23-49/
🟨 2024-06-17_08-49/ 🟨 2024-06-16_17-49/ 🟦 2024-06-08_23-49/ 🟦 2024-05-24_23-49/
🟨 2024-06-17_07-49/ 🟨 2024-06-16_16-49/ 🟦 2024-06-07_23-49/ 🟦 2024-05-23_23-49/
🟨 2024-06-17_06-49/ 🟨 2024-06-16_15-49/ 🟦 2024-06-06_23-49/ 🟦 2024-05-22_23-49/
🟨 2024-06-17_05-49/ 🟨 2024-06-16_14-49/ 🟦 2024-06-05_23-49/ 🟦 2024-05-21_23-49/
🟨 2024-06-17_04-49/ 🟨 2024-06-16_13-49/ 🟦 2024-06-04_23-49/ 🟦 2024-05-20_23-49/
🟨 2024-06-17_03-49/ 🟨 2024-06-16_12-49/ 🟦 2024-06-03_23-49/ 🟦 2024-05-19_23-49/
🟨 2024-06-17_02-49/ 🟨 2024-06-16_11-49/ 🟦 2024-06-02_23-49/ 🟦 2024-05-18_23-49/
🟨 2024-06-17_01-49/ 🟨 2024-06-16_10-49/ 🟦 2024-06-01_23-49/ 🟦 2024-05-17_23-49/
🟨 2024-06-17_00-49/ 🟦 2024-06-15_23-49/ 🟦 2024-05-31_23-49/ 🟩 2024-04-30_23-49/
🟨 2024-06-16_23-49/ 🟦 2024-06-14_23-49/ 🟦 2024-05-30_23-49/ 🟩 2024-03-31_23-49/
🟨 2024-06-16_22-49/ 🟦 2024-06-13_23-49/ 🟦 2024-05-29_23-49/ 🟩 2024-02-29_23-49/
🟨 2024-06-16_21-49/ 🟦 2024-06-12_23-49/ 🟦 2024-05-28_23-49/ 🟪 to_delete/
🟨 2024-06-16_20-49/ 🟦 2024-06-11_23-49/ 🟦 2024-05-27_23-49/ 🟫 some_other_directory/
🟨 2024-06-16_19-49/ 🟦 2024-06-10_23-49/ 🟦 2024-05-26_23-49/ 🟫 latest -> 2024-06-17_09-49/
  • 🟨 Your backup directory will contain (up to) 24 directories for the last 24h. If there are multiple directories for a certain hour, prune_backups will keep the latest directory (determined by name, not by metadata) and prune the other directories for this hour. If there is no backup directory for a certain hour, that hour will simply be skipped; i.e. there won't be any extra hourly backups appended for compensation after the 24h mark.
  • 🟦 Your backup directory will contain (up to) 30 directories for the last 30 days. If there are multiple directories for a certain day, prune_backups will keep the last hourly directory (determined by name, not by metadata) and prune the other directories for this day. If there is no backup directory for a certain day, that day will simply be skipped; i.e. there won't be any extra daily backups appended for compensation after the 30 days mark.
  • 🟩 Your backup directory will contain directories for each month before that. If there are multiple directories for a certain month, prune_backups will keep the last daily directory (determined by name, not by metadata) and prune the other directories for this month. If there is no backup directory for a certain month, that month will simply be skipped; i.e. there won't be any extra daily backups kept for compensation in neighboring months or any magic like that.
  • 🟪 The directory to_delete is created by prune_backups inside the backup directory, and all pruned directories are moved there. You can change the name of this directory with the to_directory parameter. Please note that this directory should reside in the same filesystem as your backup directory.
  • 🟫 Files, softlinks, or directories with other naming schemes than YYYY-MM-DD* will remain untouched.

What do I need to run it?

You only need the Go compiler, and only once, to build the executable. It is available for a wide variety of operating systems (Windows, Linux, macOS, etc.) and processor architectures (x86, x64, ARM32, ARM64, etc.). See https://go.dev/dl/

How do I run it?

Download the code and run "go build prune_backups.go" once. This will build a platform-specific command-line executable for you. You can run this executable in the shell, in scripts, or in cron jobs, as you like.

Example: prune_backups -dir=/mnt/backups

Longer Example (with context):

#!/bin/sh

my_backup_storage_dir=/srv/backup/mywebserver
current_snapshot_dir=$(date +%Y-%m-%d_%H-%M)

rsync -avR --checksum --delete --link-dest=$my_backup_storage_dir/latest [email protected]:/var/www $my_backup_storage_dir/_$current_snapshot_dir

cd $my_backup_storage_dir

# To make incomplete backups identifiable by name, the directory name starts with an underscore (_) while rsync runs. Now rename it to mark the backup as complete.
mv _$current_snapshot_dir $current_snapshot_dir

# make the latest snapshot easily referable
ln -nsf $current_snapshot_dir latest

# prune old backups
prune_backups -dir=$my_backup_storage_dir

# uncomment the following line if you really want to delete the old backups
# rm -rf $my_backup_storage_dir/to_delete

(You would run this script hourly via cron on your backup server to back up your web server.)

What is the exact naming pattern? And how do I change this?

The exact naming pattern is YYYY-MM-DD_HH-mm, where

  • YYYY is the 4-digit year,
  • MM the 2-digit† month,
  • DD the 2-digit† day,
  • HH the 2-digit† hour (24h format), and
  • mm the 2-digit† minute of the time the backup was created.

You cannot change this pattern unless you change the code. However, the tool will also work when you don't have the minutes or hours in your directory names, i.e. a naming pattern of YYYY-MM-DD is sufficient. In this case the tool simply will not prune hourly backups for you.

† Please be aware that the tool needs a leading zero (for example 06 for June, not 6).

prune_backups's People

Contributors

tomtonic

prune_backups's Issues

Unexpected behavior for 'yesterday' if some backups are missing

If today is the 5th and hourly backups exist, but none for the period between 24 hours ago and the end of the 4th, then NO backup will be kept for the 4th. Even if the program works as specified, this behavior is counter-intuitive.

Expected behavior: Keep at least one directory for the 4th.

Check: The same behavior is to be expected on the border of daily to monthly backups for the last month.
