archiver3's People

Contributors

r3cgm

archiver3's Issues

turn off journaling with ext4

Consider removing the journaling feature of ext4 if it makes sense to do so. For example, since we (will) have such good data validation going on already, is it worth the performance penalty to have journaling in play?

backup checksum

Add an optional feature to enable the --checksum option when performing the backup rsync. With this option rsync decides which files to transfer by comparing checksums rather than size and modification time, so every file is read in full on both the sending and the receiving side. This makes the feature expensive, mainly in disk I/O and CPU time rather than in bandwidth. It is also somewhat superseded by the verify.py utility, which performs this sort of validation anyway. But if you have resources to spare, this could be a good option to have.
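
A minimal sketch of how the option could be made opt-in when the rsync argument list is assembled. The function name and the other flags (--archive, --partial) are assumptions for illustration; the flags backup.py actually passes are not shown in this issue.

```python
def build_rsync_command(source, destination, checksum=False):
    """Return an rsync argument list, adding --checksum only on request."""
    # --archive and --partial are assumed defaults, not taken from backup.py.
    command = ["rsync", "--archive", "--partial"]
    if checksum:
        # Compare files by checksum instead of size and modification time.
        command.append("--checksum")
    command.extend([source, destination])
    return command
```

Keeping the flag off by default means existing backups behave exactly as before.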

check archive for no files

The code assumes that the archive directory is populated with files. Handle the scenario where the archive has no files gracefully.
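
One possible guard, sketched in Python 3 with a hypothetical helper name, is to check for at least one file before any sizing or provisioning starts:

```python
import os


def archive_has_files(archive_dir):
    """Return True if any non-directory entry exists under archive_dir."""
    for _root, _dirs, files in os.walk(archive_dir):
        if files:
            return True
    return False
```

The caller could then print a clear status message and exit early instead of continuing with a zero-sized archive.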

error checking for next free loop device

When using losetup -f to determine the next free loopback device, handle the case where none are available. Older kernels ship with a default of 8 loop devices. The limit can be raised with the loop module's max_loop parameter or by creating additional device nodes with mknod.
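
A rough sketch of the error check, assuming Python 3.7+ and that losetup signals failure through a non-zero exit status or empty output. The exception class and function name are hypothetical, and the run parameter exists only so the logic can be exercised without root.

```python
import subprocess


class NoFreeLoopDevice(RuntimeError):
    """Raised when losetup does not report a free loop device."""


def next_free_loop_device(run=subprocess.run):
    """Return the device path printed by `losetup -f`, or raise."""
    result = run(["losetup", "-f"], capture_output=True, text=True)
    device = result.stdout.strip()
    if result.returncode != 0 or not device:
        # Surface losetup's own message when it provides one.
        raise NoFreeLoopDevice(result.stderr.strip() or "no free loop device")
    return device
```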

yubikey integration

Provide for use of a Yubikey to perform second factor (or even possibly 'only' factor) decryption of volumes.

show more inline help

Show an ASCII diagram of the ArchiveR3 structure. For example:

container > loopback > encrypt > map > format > mount

Basically, show some sort of hierarchical way for users to understand what the underlying structure is.
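
As a sketch of what the help text could print, the layers listed above can be rendered as an indented ASCII tree. The function name and the exact drawing characters are only suggestions.

```python
LAYERS = ["container", "loopback", "encrypt", "map", "format", "mount"]


def render_layers(layers=LAYERS):
    """Return an indented ASCII tree, one layer per line, outermost first."""
    lines = []
    for depth, name in enumerate(layers):
        # The outermost layer has no branch marker; each inner layer is
        # indented one step further than its parent.
        prefix = "" if depth == 0 else "    " * (depth - 1) + "+-- "
        lines.append(prefix + name)
    return "\n".join(lines)
```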

progress update when encrypting volume

When encrypting a volume via tc-play, there is typically no output generated. Since this operation could conceivably take > 24 hours on a 50+ gig file, not having any kind of status update is annoying. Imagine there was a problem that happened 2 hours in and you had no idea... you might wait until 36 hours had passed before grudgingly stopping the process, and you would have wasted a day and a half. Find some way to print progress updates.

negative sizing for small archives

For small directories, it is possible to get negative size calculations. Fix this.

             Estimated Consumption: -0.90% 8192/-909262 8K/-909262B

warn if remote backup directory low on space

Perform a check when a new archive is being transferred, or when an existing backup is being extended in size, if the remote backup directory is not big enough to handle it.
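
A minimal sketch of such a check, assuming the remote backup directory is mounted locally (as /mnt/remote is in other reports here) and Python 3. The helper name and the reserve_bytes safety margin are assumptions.

```python
import shutil


def has_room(backup_dir, required_bytes, reserve_bytes=0):
    """Return True if backup_dir's filesystem can hold required_bytes."""
    free = shutil.disk_usage(backup_dir).free
    return free >= required_bytes + reserve_bytes
```

When the check fails, the script could warn and stop before dd or rsync starts writing.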

detect if files are not accessible

When determining the size of files in directories, we are not accounting for files which are not accessible (such as those contained in a subdirectory which we do not have permission to enter).
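
One way to detect this, sketched in Python 3: os.walk silently skips directories it cannot read unless an onerror callback is supplied, so the callback can collect them. The function below is hypothetical and reports bytes rather than the 512-byte blocks used by dir_size.

```python
import os


def dir_size_with_errors(top):
    """Return (total_bytes, skipped), where skipped lists unreadable paths."""
    skipped = []
    total = 0
    # onerror receives the OSError raised while listing a directory.
    walker = os.walk(top, onerror=lambda err: skipped.append(err.filename))
    for root, _dirs, files in walker:
        for name in files:
            path = os.path.join(root, name)
            try:
                total += os.lstat(path).st_size
            except OSError:
                skipped.append(path)
    return total, skipped
```

A non-empty skipped list could be reported as a warning so the user knows the size estimate is incomplete.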

breaking out of directory sizing causes unclean abort

If you hit Ctrl-C...

                           Archive: /mnt/remote/r3cgm/dir/
      Sizing /mnt/remote/r3cgm/dir/: ^CTraceback (most recent call last):

  File "./backup.py", line 370, in
    backup.main()
  File "./backup.py", line 342, in main
    rc = self.backup()
  File "./backup.py", line 162, in backup
    arc_block = dir_size(archive_dir, block_size=512)
  File "/home/user/ArchiveR3/ArchiveR3.py", line 94, in dir_size
    status_result('UNEXPECTED ERROR' + sys.exc_info()[0], 3)
TypeError: cannot concatenate 'str' and 'type' objects
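
The TypeError comes from concatenating a string with the exception class returned by sys.exc_info()[0]. A possible fix, sketched here with hypothetical helper names, is to use the class name in the message and to catch KeyboardInterrupt separately so that Ctrl-C ends the sizing step cleanly.

```python
import sys


def describe_current_exception():
    """Return a printable label for the exception currently being handled."""
    exc_type = sys.exc_info()[0]
    name = exc_type.__name__ if exc_type else "unknown"
    return "UNEXPECTED ERROR: " + name


def run_sizing(sizing_func, *args):
    """Call sizing_func, turning Ctrl-C into a clean message and None."""
    try:
        return sizing_func(*args)
    except KeyboardInterrupt:
        print("\nInterrupted by user during sizing.")
        return None
```

In the real code the interrupt branch would presumably call status_result with an appropriate exit code instead of returning None.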

tc-play api instead of cli

Consider using the tc-play API instead of the command-line interface. And, if abandoning the CLI, it may be possible to also remove the dependency for expect.

calibrate upload rate against availability

Perform a bandwidth test to see what the upload rate currently is, e.g. 6mbit/s. Take that result and use it to calculate some fractional upload rate that rsync can use, such as 85%.

Also consider a dependent feature where the rsync is periodically terminated and restarted to allow for recalibration of the upload rate. Since partial transfers are preserved, this should not harm the file upload in progress. And this would help with the use case where local non-backup network needs might increase, at the same time that a long sustained backup was in progress and taking all that bandwidth.
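
The fractional rate calculation could look roughly like this. It assumes the measured rate is in megabits per second (10^6 bits) and that rsync interprets an unsuffixed --bwlimit value in units of 1024 bytes per second. The function name is hypothetical.

```python
def rsync_bwlimit_kib(measured_mbit_per_s, fraction=0.85):
    """Convert a measured upload rate into a value for rsync's --bwlimit."""
    bytes_per_s = measured_mbit_per_s * 1_000_000 / 8
    # Round down, but never return 0, which rsync treats as "no limit".
    return max(1, int(bytes_per_s * fraction / 1024))
```

For the 6 mbit/s example, 85% works out to a limit of 622, which would be passed as --bwlimit=622.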

perform remote creation of dd image

Although the encryption and transfer phases of a backup need to be performed locally for security reasons, there is no harm in creating the zeroed-out file via dd remotely. For example, we can ssh to the remote host and run the dd command there. That could yield a 50x speed increase during this phase of the backup. A container currently being dd'ed the old-fashioned way will take 16 hours to provision at 50 gigs.
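
A sketch of how the remote command could be assembled, assuming shell access to the remote host and a dd that accepts bs=1M. The function name, host, and path are placeholders.

```python
import shlex


def remote_dd_command(host, container_path, size_mib):
    """Return an ssh argument list that zero-fills container_path on host."""
    # Quote the path because the remote side runs the string through a shell.
    remote = "dd if=/dev/zero of={} bs=1M count={}".format(
        shlex.quote(container_path), int(size_mib)
    )
    return ["ssh", host, remote]
```

The resulting list can be handed to subprocess without shell=True on the local side.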

global verbose attribute

Refactor the code such that the ArchiveR3 module features a verbose attribute which can be set during argument processing, rather than making this an option that gets passed around to various subroutines to let them know they should be verbose.
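
A minimal sketch of the idea, with a module-level flag standing in for the proposed ArchiveR3.verbose attribute. The names log and parse_args and the -v/--verbose option are assumptions.

```python
import argparse

# Module-level flag; stands in for an attribute on the ArchiveR3 module.
verbose = False


def log(message):
    """Print message only when the global flag is set; return the flag."""
    if verbose:
        print(message)
    return verbose


def parse_args(argv):
    """Parse arguments and set the global verbose flag once."""
    global verbose
    parser = argparse.ArgumentParser()
    parser.add_argument("-v", "--verbose", action="store_true")
    args = parser.parse_args(argv)
    verbose = args.verbose
    return args
```

Subroutines then call log() directly and no longer need a verbose parameter.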

convert SHA-1 hashing checksums to SHA-3

SHA-1 is insecure. Although this probably does not matter given the context, it'd be a good idea to go to SHA-3. However, the Python version in use has no builtin library for SHA-3 (hashlib only gained the sha3_* functions in Python 3.6), so we'll stick with SHA-1 for now.
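
For reference, on Python 3.6 or later the switch could be as small as the sketch below. The function name and chunk size are assumptions, and the digest length changes from 40 to 64 hex characters, so stored checksums would need regenerating.

```python
import hashlib


def file_digest_sha3(path, chunk_size=1024 * 1024):
    """Return the SHA3-256 hex digest of the file at path."""
    digest = hashlib.sha3_256()
    with open(path, "rb") as handle:
        # Read in chunks so large archives do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```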

verify archive creation ownership and permissions

If for example the backup filesystem is mounted in such a way that newly created files become owned by root instead of the normal user executing the backup script, subsequent operations like attempting to mount the encrypted volume might fail.
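
A possible check right after the container file is created, sketched for POSIX systems with a hypothetical helper name:

```python
import os


def owned_and_writable(path):
    """Return True if path belongs to the current user and is read/writable."""
    info = os.stat(path)
    same_owner = info.st_uid == os.geteuid()
    return same_owner and os.access(path, os.R_OK | os.W_OK)
```

If this returns False, the script could stop with a message about the mount options instead of failing later while mounting the encrypted volume.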

password echoed in plaintext

Need to fix the timing on the expect script internally, such that it does not accidentally echo the password to the console:

spawn sudo tcplay -c -d /dev/loop0 -a whirlpool -b AES-256-XTS
Passphrase:
Repeat passphrase: asoentusnoaehuntoaehuntoahusnoaehu-files.archive

rsync output not captured properly

The rsync call produces output which seems to imply the process is forked. Also, attempting to print it out crashes the program. Get a handle on all output and do something useful with it.

refactor Popen calls to use shell=False

Now that a helper utility has been located to properly tokenize the arguments to Popen, convert the existing shell=True calls to shell=False, because that is the recommended way to do it.
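
The issue does not name the helper utility. Assuming it is something like the standard library's shlex.split, a converted call might look like this sketch (Python 3.7+, hypothetical function name):

```python
import shlex
import subprocess


def run_command(command_line):
    """Tokenize command_line and run it without invoking a shell."""
    argv = shlex.split(command_line)
    # shell=False is the default; the argument list is passed as-is.
    return subprocess.run(argv, capture_output=True, text=True, check=False)
```

Because no shell is involved, characters such as ; or $ in file names are passed through literally rather than interpreted.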

support for dark-colored terminals

Having just switched the ANSI colors to dark (assuming light background), we should not just assume that everyone has the same sort of terminal. Add a config file option to support dark terminals.
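
A sketch of how a config option could select the palette. The option values "light" and "dark", the role names, and the specific ANSI codes are all assumptions chosen for illustration.

```python
ANSI_PALETTES = {
    # Standard colors read well on a light background.
    "light": {"ok": "\033[32m", "error": "\033[31m"},
    # Bright variants read better on a dark background.
    "dark": {"ok": "\033[92m", "error": "\033[91m"},
}
RESET = "\033[0m"


def colorize(text, role, background="light"):
    """Wrap text in the ANSI color for role, based on terminal background."""
    palette = ANSI_PALETTES.get(background, ANSI_PALETTES["light"])
    return palette[role] + text + RESET
```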

password entry for encrypted container fails due to race condition

By using the system 'expect' tool, there is a race condition during the mounting of the encrypted container (via 'tcplay -m') where control has not yet been passed by the child process to STDIN before the parent python process issues the password. This results in the password being displayed directly on the screen, and also prevents the backup from proceeding. This race condition has been solved by introducing a 50ms delay automatically in the Python pexpect library.

reprovision does not honor --auto

Make it so that if an existing container is too small for an archive, the reprovisioning of the container happens automatically if the --auto or --reprovision arguments are specified.
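
The decision could be isolated in one small function, as in this sketch. Only the --auto and --reprovision flag names come from the issue; the function names and the byte-based size comparison are assumptions.

```python
import argparse


def build_parser():
    """Return a parser with the two flags that permit reprovisioning."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--auto", action="store_true")
    parser.add_argument("--reprovision", action="store_true")
    return parser


def should_reprovision(args, container_bytes, archive_bytes):
    """Return True when the container is too small and a flag allows it."""
    too_small = container_bytes < archive_bytes
    return bool(too_small and (args.auto or args.reprovision))
```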

resize archive should extend rather than replace existing

Part of the archive container creation process is to use 'dd' to create a file filled with 0's. When an existing container becomes too small and an increase is needed, take the existing archive and pad with 0's rather than scrapping the entire archive and starting over from scratch.
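
A sketch of the padding step only, with a hypothetical function name. It appends zero bytes to the existing file and leaves the current contents untouched. Growing the encrypted volume and the filesystem inside the container would still need separate handling, which this does not attempt.

```python
import os


def extend_with_zeros(path, new_size, chunk_size=1024 * 1024):
    """Append zero bytes to path until it is new_size bytes long."""
    current = os.path.getsize(path)
    if new_size < current:
        raise ValueError("new_size is smaller than the existing file")
    zeros = b"\0" * chunk_size
    remaining = new_size - current
    with open(path, "ab") as handle:
        while remaining > 0:
            step = min(chunk_size, remaining)
            handle.write(zeros[:step])
            remaining -= step
    return new_size - current
```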

benchmark crypt speed to give some sense for how long the operation will take

Encrypting containers which are 100's of gigs in size could conceivably take over a day, so perform a benchmark before getting started so that the user has some sense for how long the operation will take. Command-line tests show that encrypting a 50 meg container should provide a reasonable balance between benchmark time and results accuracy.
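
Once the benchmark has been timed, the estimate is a linear extrapolation. The sketch below assumes encryption time scales linearly with container size, which may not hold exactly, and uses hypothetical function names.

```python
def estimate_seconds(sample_bytes, sample_seconds, target_bytes):
    """Extrapolate the time for target_bytes from a timed sample."""
    if sample_bytes <= 0 or sample_seconds <= 0:
        raise ValueError("sample size and duration must be positive")
    return target_bytes * sample_seconds / sample_bytes


def format_duration(seconds):
    """Format a duration in seconds as hours and minutes."""
    hours, rest = divmod(int(seconds), 3600)
    return "{}h {:02d}m".format(hours, rest // 60)
```

The formatted estimate can be printed before the real encryption starts so the user knows roughly how long to wait.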
