snapdiff

snapdiff compares two snapshots of a directory tree that were captured at different points in time. (Think of a “snapshot” as a backup of the original directory tree, in the sense of a full copy.) The comparison gives a high-level view of how the directory tree has evolved over time.

Learn more in this blog post.

Example

Say you want to compare two snapshots, one taken on 2023-09-01 and another taken on 2023-10-01:

$ snapdiff 2023-09-01/ 2023-10-01/

                           FILES             BYTES
                                     G   M   K   B
TOTAL       Snap 1        87,243    98,407,188,994
            Snap 2        87,319    98,591,738,372
            
OF WHICH    Identical     87,134    97,551,550,976
            Moved             38       134,217,728
            Added             87       234,881,024
            Deleted           11        50,331,648
            Modified         147       671,088,644 (+282,172)

The categories are defined as:

Identical: both snapshots contain a file at the same path with the same contents.
Moved: both snapshots contain a file with the same contents, but at different paths.
Added: the second snapshot contains a file whose path or contents is not present in the first snapshot.
Deleted: the first snapshot contains a file whose path or contents is not present in the second snapshot.
Modified: both snapshots contain a file at the same path, but with different contents.

Note: the file count doesn’t include folders.
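
To make the category definitions more concrete, here is a minimal sketch of how such a classification could work. It is not snapdiff's actual implementation: the `Snapshot` type (relative path mapped to a content fingerprint), the `Counts` struct, and the `classify` and `snapshot` functions are all hypothetical names introduced for illustration. It also assumes one particular reading of the definitions, namely that a file counts as “Added” or “Deleted” only if neither its path nor its contents appear in the other snapshot.

```rust
use std::collections::{BTreeMap, HashSet};

/// Hypothetical content fingerprint, e.g. a hash of the file's bytes.
type Fingerprint = u64;

/// A snapshot modelled as: relative path -> content fingerprint.
type Snapshot = BTreeMap<String, Fingerprint>;

#[derive(Debug, Default, PartialEq, Eq)]
struct Counts {
    identical: usize,
    moved: usize,
    added: usize,
    deleted: usize,
    modified: usize,
}

/// Classifies files under the assumed reading of the category definitions.
fn classify(snap1: &Snapshot, snap2: &Snapshot) -> Counts {
    let contents1: HashSet<Fingerprint> = snap1.values().copied().collect();
    let contents2: HashSet<Fingerprint> = snap2.values().copied().collect();
    let mut counts = Counts::default();

    for (path, fp2) in snap2 {
        match snap1.get(path) {
            // Same path, same contents.
            Some(fp1) if fp1 == fp2 => counts.identical += 1,
            // Same path, different contents.
            Some(_) => counts.modified += 1,
            // New path, but the contents already existed in the first snapshot.
            None if contents1.contains(fp2) => counts.moved += 1,
            // Neither path nor contents existed in the first snapshot.
            None => counts.added += 1,
        }
    }

    for (path, fp1) in snap1 {
        // Only count as deleted if the contents did not survive elsewhere.
        if !snap2.contains_key(path) && !contents2.contains(fp1) {
            counts.deleted += 1;
        }
    }

    counts
}

/// Small helper to build a snapshot from literal entries.
fn snapshot(entries: &[(&str, Fingerprint)]) -> Snapshot {
    entries.iter().map(|&(p, f)| (p.to_string(), f)).collect()
}

fn main() {
    let snap1 = snapshot(&[("a.txt", 1), ("b.txt", 2), ("c.txt", 3), ("d.txt", 4)]);
    let snap2 = snapshot(&[("a.txt", 1), ("b.txt", 20), ("moved/c.txt", 3), ("e.txt", 5)]);
    println!("{:?}", classify(&snap1, &snap2));
}
```

In this toy input, every category is hit exactly once. Edge cases such as duplicated contents within one snapshot are resolved arbitrarily by the sketch and may be handled differently by the real tool.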

Usage

snapdiff
    [--report PATH]
    [--include-dot-paths]
    [--include-symlinks]
    [--workers N] OR [--workers N1:N2]
    [--no-color]
    SNAP1 SNAP2

Run snapdiff --help for all details.

Build from Sources

Prerequisites: Rust toolchain (see Cargo.toml for required version).

Compile via cargo build --release, which produces the binary at target/release/snapdiff.

About

snapdiff was created by Jan Heuermann. The sources are available under the terms of the MIT license.

snapdiff's Issues

Define and implement behaviour for symlinks

For my own purposes, I rely on snapdiff’s default behaviour, which is to skip symlinks. Therefore, I’m actually not sure how well things work when symlinks are enabled. (Think: circular references.)

It probably doesn’t make sense to resolve symlinks and count the bytes of the target file. (Also note that the symlink could point to a file outside of the snapshot directory, in which case we certainly don’t want to count these bytes.) But it might still be of interest to see whether a symlink has changed its target, or whether it still points to the same file.

I’m wondering, though, whether it’s reasonable to include symlinks in the regular file count without incrementing the byte count. So, e.g., if a symlink’s target was changed, increment the “modified” file counter, but not the “modified” byte count.

Otherwise, it might be necessary to introduce a completely separate category or counting logic for symlinks.
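
The first option (count symlinks as files, but without bytes) could look roughly like the following sketch. The `Entry` and `Tally` types and the `is_modified` and `symlink_target` functions are hypothetical and not taken from snapdiff's code base. The sketch assumes that symlinks are compared by their unresolved target path, read via `std::fs::read_link`, so that links are never followed and circular references cannot cause trouble.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical model of one snapshot entry.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Entry {
    /// Regular file with a content fingerprint and its size in bytes.
    File { fingerprint: u64, size: u64 },
    /// Symlink, identified only by the (unresolved) path it points to.
    Symlink { target: PathBuf },
}

/// Hypothetical per-category counter (e.g. for "modified").
#[derive(Debug, Default, PartialEq, Eq)]
struct Tally {
    files: u64,
    bytes: u64,
}

impl Tally {
    /// Every entry counts as one file; only regular files add bytes.
    fn record(&mut self, entry: &Entry) {
        self.files += 1;
        if let Entry::File { size, .. } = entry {
            self.bytes += *size;
        }
    }
}

/// Two entries at the same path are "modified" if they differ in kind,
/// fingerprint, size, or symlink target.
fn is_modified(old: &Entry, new: &Entry) -> bool {
    old != new
}

/// Returns the link target if `path` is a symlink, without following it.
fn symlink_target(path: &Path) -> io::Result<Option<PathBuf>> {
    let meta = fs::symlink_metadata(path)?;
    if meta.file_type().is_symlink() {
        Ok(Some(fs::read_link(path)?))
    } else {
        Ok(None)
    }
}

fn main() {
    let old = Entry::Symlink { target: PathBuf::from("docs/v1") };
    let new = Entry::Symlink { target: PathBuf::from("docs/v2") };

    let mut modified = Tally::default();
    if is_modified(&old, &new) {
        modified.record(&new);
    }
    println!("{:?}", modified);

    // The current directory is normally not a symlink, so this prints None.
    if let Ok(target) = symlink_target(Path::new(".")) {
        println!("{:?}", target);
    }
}
```

Whether a symlink that is replaced by a regular file (or vice versa) should also count as “modified” is a design decision that this sketch makes implicitly through the `!=` comparison.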

(Originally from #2.)

Define and implement behaviour for hardlinks

I also haven’t considered hardlinks so far, so their behaviour is not well-defined.

Hardlinks are quite tricky, because their behaviour depends on how the snapshots were created, and on whether you compare two snapshots against each other or one snapshot against the original directory tree.

If you do cp -R for creating a snapshot, then all hardlinks from the original directory tree are created as individual files in the snapshot.
If you do rsync -r --hard-links for creating a snapshot, then all hardlinks from the original directory tree are cloned (as hardlinks) in the snapshot.

The other issue is that it’s more complex to determine the “redundant” hardlinks within the same snapshot in the first place.
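
As a rough illustration of what finding “redundant” hardlinks within one snapshot could involve, the following sketch groups paths by their device id and inode number. This is only one possible approach and is Unix-specific; the `FileId` alias and the `hardlink_groups` and `file_id` functions are hypothetical names, not part of snapdiff.

```rust
use std::collections::BTreeMap;
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::{Path, PathBuf};

/// (device id, inode number): identifies one physical file on Unix.
type FileId = (u64, u64);

/// Groups paths by file id and keeps only groups with more than one path,
/// i.e. sets of paths that are hardlinks to the same underlying file.
fn hardlink_groups(files: Vec<(PathBuf, FileId)>) -> Vec<Vec<PathBuf>> {
    let mut by_id: BTreeMap<FileId, Vec<PathBuf>> = BTreeMap::new();
    for (path, id) in files {
        by_id.entry(id).or_default().push(path);
    }
    by_id
        .into_values()
        .filter(|paths| paths.len() > 1)
        .collect()
}

/// Reads the file id from the file system without following symlinks.
fn file_id(path: &Path) -> io::Result<FileId> {
    let meta = fs::symlink_metadata(path)?;
    Ok((meta.dev(), meta.ino()))
}

fn main() {
    // Made-up ids for illustration; in practice they would come from `file_id`.
    let files = vec![
        (PathBuf::from("a.txt"), (1, 100)),
        (PathBuf::from("b.txt"), (1, 200)),
        (PathBuf::from("link-to-a.txt"), (1, 100)),
    ];
    for group in hardlink_groups(files) {
        println!("{:?}", group);
    }

    if let Ok(id) = file_id(Path::new(".")) {
        println!("file id of current directory: {:?}", id);
    }
}
```

A tool could then count the bytes of each group only once. Whether that is desirable depends on how the snapshots were created, as described above.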

I’m not sure yet what the best solution is here. I’m also not sure how common this problem actually is, or whether solving it is worth the additional complexity.
