snapdiff

snapdiff compares two snapshots of a directory tree, captured at different points in time. (Think of a “snapshot” as a backup of the original directory tree, in the sense of a full copy.) The comparison gives a high-level view of how the directory tree has evolved over time.

Learn more in this blog post.

Example

Say you want to compare two snapshots, one taken on 2023-09-01 and another taken on 2023-10-01:

$ snapdiff 2023-09-01/ 2023-10-01/

                           FILES             BYTES
                                     G   M   K   B
TOTAL       Snap 1        87,243    98,407,188,994
            Snap 2        87,319    98,591,738,372
            
OF WHICH    Identical     87,134    97,551,550,976
            Moved             38       134,217,728
            Added             87       234,881,024
            Deleted           11        50,331,648
            Modified         147       671,088,644 (+282,172)

The categories are defined as:

  • Identical: both snapshots contain a file at the same path with the same contents.
  • Moved: both snapshots contain a file with the same contents, but at different paths.
  • Added: the second snapshot contains a file whose path or contents is not present in the first snapshot.
  • Deleted: the first snapshot contains a file whose path or contents is not present in the second snapshot.
  • Modified: both snapshots contain a file at the same path, but with different contents.

Note: the file count doesn’t include folders.
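
The category definitions above can be sketched in Rust. This is a minimal illustration under stated assumptions, not snapdiff’s actual implementation: the names Snapshot, Counts, and classify are hypothetical, each file is represented by an assumed precomputed content fingerprint (a u64), and the sketch assumes a file counts as Moved only when its contents reappear at a path that the first snapshot did not have. Byte counts are left out for brevity.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical per-category file counts (byte counts omitted).
#[derive(Debug, Default, PartialEq, Eq)]
pub struct Counts {
    pub identical: usize,
    pub moved: usize,
    pub added: usize,
    pub deleted: usize,
    pub modified: usize,
}

/// Hypothetical snapshot model: relative path -> content fingerprint.
pub type Snapshot = BTreeMap<String, u64>;

pub fn classify(snap1: &Snapshot, snap2: &Snapshot) -> Counts {
    let mut counts = Counts::default();

    // Contents that appear in snap2 under a path snap1 does not have.
    // These are candidates for pairing with files that vanished from snap1.
    let mut unmatched_new: HashMap<u64, usize> = HashMap::new();
    for (path, hash) in snap2 {
        if !snap1.contains_key(path) {
            *unmatched_new.entry(*hash).or_insert(0) += 1;
        }
    }

    for (path, hash1) in snap1 {
        match snap2.get(path) {
            // Same path, same contents.
            Some(hash2) if hash1 == hash2 => counts.identical += 1,
            // Same path, different contents.
            Some(_) => counts.modified += 1,
            // Path is gone: either the contents moved elsewhere, or the file was deleted.
            None => {
                let remaining = unmatched_new.entry(*hash1).or_insert(0);
                if *remaining > 0 {
                    *remaining -= 1;
                    counts.moved += 1;
                } else {
                    counts.deleted += 1;
                }
            }
        }
    }

    // Files in snap2 that were matched neither by path nor by contents.
    counts.added = unmatched_new.values().sum();
    counts
}
```

Calling classify(&snap1, &snap2) on two such maps returns one Counts value. Pairing moved files by fingerprint in this greedy way is only one possible design; how duplicates within a snapshot should be counted is a separate decision that this sketch does not settle.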

Usage

snapdiff
    [--report PATH]
    [--include-dot-paths]
    [--include-symlinks]
    [--workers N] OR [--workers N1:N2]
    [--no-color]
    SNAP1 SNAP2

Run snapdiff --help for all details.

Build from Sources

Prerequisites: Rust toolchain (see Cargo.toml for required version).

Compile via cargo build --release. (This produces the binary at target/release/snapdiff.)

About

snapdiff was created by Jan Heuermann. The sources are available under the terms of the MIT license.

snapdiff's Issues

Define and implement behaviour for symlinks

For my own purposes, I rely on snapdiff’s default behaviour, which is to skip symlinks. Therefore, I’m actually not sure how well things work when they are enabled. (Think: circular references.)

It probably doesn’t make sense to resolve symlinks and count the bytes of the target file. (Also note that the symlink could point to a file outside of the snapshot directory, in which case we certainly don’t want to count these bytes.) But it still might be of interest to see whether a symlink has changed its target, or whether it still points to the same file.

I’m wondering whether it’s reasonable enough to include symlinks in the regular file count without incrementing the byte count, though. So, e.g., if a symlink target was changed, increment the “modified” file counter, but not the “modified” byte size.

Otherwise, it might be necessary to introduce a completely separate category or counting logic for symlinks.
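
To make the idea concrete, here is a rough Rust sketch of comparing symlinks by their raw target path, without resolving them. It is only an illustration of the proposal above, not existing snapdiff behaviour: LinkChange, compare_link_targets, and link_target are hypothetical names. Because the link is never followed, circular references and targets outside the snapshot directory do not need to be visited, and no bytes are counted.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical outcome of comparing a symlink at the same path in both snapshots.
#[derive(Debug, PartialEq, Eq)]
pub enum LinkChange {
    Identical,
    Modified,
    Added,
    Deleted,
}

/// Compares the raw (unresolved) link targets.
/// `None` means there is no symlink at that path in the respective snapshot.
pub fn compare_link_targets(old: Option<&Path>, new: Option<&Path>) -> Option<LinkChange> {
    match (old, new) {
        (Some(a), Some(b)) if a == b => Some(LinkChange::Identical),
        (Some(_), Some(_)) => Some(LinkChange::Modified),
        (None, Some(_)) => Some(LinkChange::Added),
        (Some(_), None) => Some(LinkChange::Deleted),
        (None, None) => None,
    }
}

/// Reads the target stored in the symlink itself, without following it.
/// Returns `None` if `path` is not a symlink or cannot be read.
pub fn link_target(path: &Path) -> Option<PathBuf> {
    std::fs::read_link(path).ok()
}
```

A caller could increment the matching file counter for each returned LinkChange and leave the byte counters untouched.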

(Originally from #2.)

Define and implement behaviour for hardlinks

I also haven’t considered hardlinks so far, so their behaviour is not well-defined.

Hardlinks are quite tricky, because their behaviour depends on how the snapshots were created, or whether you compare two snapshots against each other, or whether you compare one snapshot against the original directory tree.

  • If you do cp -R for creating a snapshot, then all hardlinks from the original directory tree are created as individual files in the snapshot.
  • If you do rsync -r --hard-links for creating a snapshot, then all hardlinks from the original directory tree are cloned (as hardlinks) in the snapshot.

The other issue is that it’s more complex to determine the “redundant” hardlinks within the same snapshot in the first place.

I’m not sure yet what the best solution is here. I’m also not sure how common this problem actually is, or whether the benefit of solving it is worth the additional complexity.
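
As a starting point for the discussion, the following Rust sketch shows one way to find “redundant” hardlinks within a directory, assuming a Unix platform where files that share the same device and inode number are hardlinks to the same data. The function name hardlink_groups is hypothetical, the sketch only looks at the files directly inside one directory (no recursion), and it is not part of snapdiff.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt; // Unix-only: provides dev() and ino()
use std::path::{Path, PathBuf};

/// Groups the regular files directly inside `dir` by (device, inode).
/// Every returned group holds two or more paths that are hardlinks to the same data.
pub fn hardlink_groups(dir: &Path) -> io::Result<Vec<Vec<PathBuf>>> {
    let mut by_inode: HashMap<(u64, u64), Vec<PathBuf>> = HashMap::new();

    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        // symlink_metadata does not follow symlinks, so symlinks are skipped below.
        let meta = fs::symlink_metadata(&path)?;
        if meta.is_file() {
            by_inode
                .entry((meta.dev(), meta.ino()))
                .or_default()
                .push(path);
        }
    }

    // Keep only inodes that are reachable via more than one path.
    let mut groups: Vec<Vec<PathBuf>> = by_inode
        .into_values()
        .filter(|paths| paths.len() > 1)
        .collect();

    // Sort for a deterministic result.
    for group in &mut groups {
        group.sort();
    }
    groups.sort();
    Ok(groups)
}
```

With a snapshot created via cp -R, this would report no groups for files that were hardlinked in the original tree, whereas a snapshot created via rsync -r --hard-links would preserve them as groups.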
