scorch (Silent CORruption CHecker)

scorch is a tool to catalog files and their hashes to help in discovering file corruption, missing files, duplicate files, etc.

Usage

usage: scorch [<options>] <instruction> [<directory>]

scorch (Silent CORruption CHecker) is a tool to catalog files and hashes
to help in discovering file corruption, missing files, duplicates, etc.

positional arguments:
  instruction:             * add: compute and store hashes for all found files
                           * append: compute and store for newly found files
                           * backup: backs up selected database
                           * restore: restore backed up database
                           * list-backups: list database backups
                           * diff-backup: show diff between current & backup DB
                           * hashes: print available hash functions
                           * check: check stored hashes against files
                           * update: update metadata of changed files
                           * check+update: check and update if new
                           * cleanup: remove hashes of missing files
                           * delete: remove hashes for found files
                           * list-dups: list files w/ dup hashes
                           * list-missing: list files no longer on filesystem
                           * list-solo: list files w/ no dup hashes
                           * list-unhashed: list files not yet hashed
                           * list: md5sum'ish compatible listing
                           * in-db: show if hashed files exist in DB
                           * found-in-db: print files found in DB
                           * notfound-in-db: print files not found in DB
  directory:               Directory or file to scan

optional arguments:
  -d, --db=:               File to store hashes and other metadata in.
                           (default: /var/tmp/scorch/scorch.db)
  -v, --verbose:           Make `instruction` more verbose. Actual behavior
                           depends on the instruction. Can be used multiple
                           times.
  -q, --quote:             Shell quote/escape filenames when printed.
  -r, --restrict=:         * sticky: restrict scan to files with sticky bit
                           * readonly: restrict scan to readonly files
  -f, --fnfilter=:         Restrict actions to files which match regex
  -F, --negate-fnfilter    Negate the fnfilter regex match
  -s, --sort=:             Sorting routine on input & output (default: natural)
                           * random: shuffled / random
                           * natural: human-friendly sort, ascending
                           * reverse-natural: human-friendly sort, descending
                           * radix: RADIX sort, ascending
                           * reverse-radix: RADIX sort, descending
                           * time: sort by file mtime, ascending
                           * reverse-time: sort by file mtime, descending
  -m, --maxactions=:       Max actions to take before exiting (default: maxint)
  -M, --maxdata=:          Max bytes to process before exiting (default: maxint)
  -b, --break-on-error:    Any error or hash failure will exit
  -D, --diff-fields=:      Fields to use to indicate a file has 'changed' and
                           should be rehashed. Combine with ','.
                           (default: size)
                           * size
                           * inode
                           * mtime
                           * mode
  -H, --hash=:             Hash algo. Use 'scorch hashes' to get available algos.
                           (default: md5)
  -h, --help:              Print this message

exit codes:
  *  0 : success, behavior executed, something found
  *  1 : processing error
  *  2 : error with command line arguments
  *  4 : hash mismatch
  *  8 : found
  * 16 : not found, nothing processed
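
The non-zero exit codes are all powers of two, which suggests they may be combined into a single status. The sketch below decodes a status on that assumption; the bitmask interpretation and the helper name `describe_exit_code` are not stated in the usage text and are purely illustrative.

```python
# Labels copied from the "exit codes" list in the usage text.
EXIT_FLAGS = {
    1: "processing error",
    2: "error with command line arguments",
    4: "hash mismatch",
    8: "found",
    16: "not found, nothing processed",
}


def describe_exit_code(code):
    """Return the labels whose bit is set in `code`.

    Assumption: non-zero codes can be OR'ed together. If scorch only ever
    returns one value at a time, this still yields a single label.
    """
    if code == 0:
        return ["success"]
    return [label for bit, label in EXIT_FLAGS.items() if code & bit]
```

A caller could pass the return code of a `scorch check` run to `describe_exit_code` to decide, for example, whether a hash mismatch occurred.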

Database

Format

The file is simply CSV compressed with gzip.

$ # file, hash digest, size, mode, mtime, inode
$ zcat /var/tmp/scorch/scorch.db
/tmp/files/a,md5:d41d8cd98f00b204e9800998ecf8427e,0,33188,1546377833.3844686,123456
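
Because the database is gzip-compressed CSV, it can be read with standard tools. The following is a minimal Python sketch, assuming the six-column layout shown above; the names `Entry` and `read_db` are hypothetical and not part of scorch.

```python
import csv
import gzip
from collections import namedtuple

# Column order taken from the comment above:
# file, hash digest, size, mode, mtime, inode
Entry = namedtuple("Entry", "path algo digest size mode mtime inode")


def read_db(db_path):
    """Parse a gzip-compressed scorch database into a list of Entry tuples."""
    entries = []
    with gzip.open(db_path, "rt", newline="") as handle:
        for path, hashval, size, mode, mtime, inode in csv.reader(handle):
            # The hash column looks like "md5:<hex digest>".
            algo, _, digest = hashval.partition(":")
            entries.append(
                Entry(path, algo, digest, int(size), int(mode), float(mtime), int(inode))
            )
    return entries
```

The mode column is the numeric file mode in decimal; 33188 in the sample row is 100644 in octal.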

--db argument

The --db argument takes more than a path.

  • /tmp/test/myfiles.db : Full path. Used as is.
  • /tmp/test : If /tmp/test is a directory -> /tmp/test/scorch.db
  • /tmp/test/ : Force interpretation as directory -> /tmp/test/scorch.db
  • /tmp/test : /tmp/test is not a directory -> /tmp/test.db
  • ./test : Any relative path containing a '/'. The current working directory is prepended and the rules above apply.
  • test : No forward slashes -> /var/tmp/scorch/test.db

If there is no extension then .db will be added.
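
The rules above can be approximated in a few lines of Python. This is a sketch of the documented behavior only, not scorch's actual implementation; `resolve_db_path` and its `isdir` and `cwd` parameters are hypothetical and exist so the logic can be exercised without touching the filesystem.

```python
import os

DEFAULT_DIR = "/var/tmp/scorch"  # default database directory from the usage text


def resolve_db_path(arg, isdir=os.path.isdir, cwd=None):
    """Map a --db argument to a database file path per the documented rules."""
    if "/" not in arg:
        # No forward slashes: place the file in the default directory.
        path = os.path.join(DEFAULT_DIR, arg)
    else:
        base = cwd if cwd is not None else os.getcwd()
        # Relative paths containing '/' are anchored at the working directory.
        path = arg if os.path.isabs(arg) else os.path.join(base, arg)
        # A trailing '/' forces directory interpretation.
        if arg.endswith("/") or isdir(path):
            return os.path.join(os.path.normpath(path), "scorch.db")
        path = os.path.normpath(path)
    # Add .db only when the name has no extension.
    _, ext = os.path.splitext(path)
    return path if ext else path + ".db"
```

For example, `resolve_db_path("test")` returns `/var/tmp/scorch/test.db`, matching the last rule in the list.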

Upgrade

If you're using an older version of scorch with the default database at /var/tmp/scorch.db, just copy or move the file to /var/tmp/scorch/scorch.db. The old format was not compressed, but scorch can still read it uncompressed and will compress it on write.

Backup / Restore

To simplify backing up the scorch database there is a backup command. Without a directory argument, the backup is stored in the same directory as the database. If directories are given as arguments, the backup is stored there instead.

$ scorch -v backup
/var/tmp/scorch/scorch.db.backup_2019-07-29T02:35:46Z
$ scorch -v backup /tmp
/tmp/scorch.db.backup_2019-07-29T02:36:12Z
$ scorch list-backups
/var/tmp/scorch/scorch.db.backup_2019-07-29T02:35:46Z
$ scorch list-backups /tmp
/tmp/scorch.db.backup_2019-07-29T02:36:12Z
/tmp/scorch.db.backup_2019-07-29T02:13:34Z
$ scorch restore /tmp/scorch.db.backup_2019-07-29T02:36:12Z
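
Backup names end in a UTC timestamp, so the output of list-backups can be ordered by time before choosing one to restore. The sketch below assumes the `.backup_<timestamp>Z` suffix seen in the output above; `backup_time` and `newest_backup` are hypothetical helper names.

```python
from datetime import datetime, timezone

MARKER = ".backup_"  # separator observed in the backup file names above


def backup_time(path):
    """Extract the UTC timestamp embedded in a backup file name."""
    stamp = path.rsplit(MARKER, 1)[1]
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def newest_backup(paths):
    """Return the path with the most recent embedded timestamp."""
    return max(paths, key=backup_time)
```

The selected path could then be passed to `scorch restore`.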

Example

$ ls -lh /tmp/files
total 0
-rw-rw-r-- 1 nobody nogroup 0 May  3 16:30 a
-rw-rw-r-- 1 nobody nogroup 0 May  3 16:30 b
-rw-rw-r-- 1 nobody nogroup 0 May  3 16:30 c

$ scorch -v -d /tmp/hash.db add /tmp/files
1/3 /tmp/files/c: d41d8cd98f00b204e9800998ecf8427e
2/3 /tmp/files/a: d41d8cd98f00b204e9800998ecf8427e
3/3 /tmp/files/b: d41d8cd98f00b204e9800998ecf8427e

$ scorch -v -d /tmp/hash.db check /tmp/files
1/3 /tmp/files/a: OK
2/3 /tmp/files/b: OK
3/3 /tmp/files/c: OK

$ echo asdf > /tmp/files/d

$ scorch -v -d /tmp/hash.db list-unhashed /tmp/files
/tmp/files/d

$ scorch -v -d /tmp/hash.db append /tmp/files
1/1 /tmp/files/d: 2b00042f7481c7b056c4b410d28f33cf

$ scorch -v -d /tmp/hash.db list-dups /tmp/files
d41d8cd98f00b204e9800998ecf8427e /tmp/files/a /tmp/files/b /tmp/files/c

$ echo foo > /tmp/files/a
$ scorch -v -d /tmp/hash.db check+update /tmp/files
1/4 /tmp/files/b: OK
2/4 /tmp/files/c: OK
3/4 /tmp/files/a: FILE CHANGED
 - size: 0B -> 4B
 - mtime: Tue Jan  1 16:23:57 2019 -> Tue Jan  1 16:24:09 2019
 - hash: d41d8cd98f00b204e9800998ecf8427e -> d3b07384d113edec49eaa6238ad5ff00
4/4 /tmp/files/d: OK

$ scorch -v -d /tmp/hash.db list /tmp/files | cut -d: -f2- | md5sum -c
/tmp/files/c: OK
/tmp/files/d: OK
/tmp/files/a: OK
/tmp/files/b: OK
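
The digests in this example can be reproduced independently. The following Python sketch hashes a file in chunks and compares it with an `algo:digest` string in the form stored in the database; it illustrates the general idea of a check rather than scorch's own code, and `hash_file` and `matches` are hypothetical names.

```python
import hashlib


def hash_file(path, algo="md5", chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so large files are not loaded at once."""
    digest = hashlib.new(algo)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, stored):
    """Compare a file against a stored value such as 'md5:d41d8cd9...'."""
    algo, _, expected = stored.partition(":")
    return hash_file(path, algo) == expected
```

An empty file hashes to d41d8cd98f00b204e9800998ecf8427e with md5, which is why a, b, and c are reported as duplicates above.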

Automation

A typical setup is initialized manually using add or append. Once the database has been created, a cron job can be set up to check, update, append, and clean up the database. If scorch is not run in verbose mode, only differences and failures are printed, and the output of the job will be emailed to the user (if cron is set up to do so).

#!/bin/sh

scorch check+update /tmp/files
scorch append /tmp/files
scorch cleanup /tmp/files
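
The same three steps could also be driven from Python when the exit status of each step matters. This is a rough sketch, assuming scorch is on the PATH; `run_maintenance` and its `run` parameter are hypothetical, and the instructions are the ones used in the script above.

```python
import subprocess

# Same instructions, in the same order, as the shell script above.
STEPS = ("check+update", "append", "cleanup")


def run_maintenance(directory, run=subprocess.call):
    """Run each step and return {instruction: exit code} for non-zero codes.

    `run` is injectable so the function can be tried without scorch installed.
    A non-zero code is not necessarily a failure; see the exit code list.
    """
    nonzero = {}
    for instruction in STEPS:
        code = run(["scorch", instruction, directory])
        if code != 0:
            nonzero[instruction] = code
    return nonzero
```

Calling `run_maintenance("/tmp/files")` from a cron-invoked script would run all three steps even if an earlier one reports a mismatch.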

Support

Contact / Issue submission

Support development

This software is free to use and released under a very liberal license. That said, if you like this software and would like to support its development, donations are welcome.

  • PayPal: [email protected]
  • Patreon: https://www.patreon.com/trapexit
  • Bitcoin (BTC): 12CdMhEPQVmjz3SSynkAEuD5q9JmhTDCZA
  • Bitcoin Cash (BCH): 1AjPqZZhu7GVEs6JFPjHmtsvmDL4euzMzp
  • Ethereum (ETH): 0x09A166B11fCC127324C7fc5f1B572255b3046E94
  • Litecoin (LTC): LXAsq6yc6zYU3EbcqyWtHBrH1Ypx4GjUjm
