bscp-tool / bscp Goto Github PK

Bscp – Secure and efficient copying of block devices

Home Page: https://bscp.njh.eu

License: ISC License

Python 69.66% HTML 30.34%

bscp's Introduction

Bscp – Secure and efficient copying of block devices

Please find the project website and documentation at

https://bscp.njh.eu

bscp's Issues

bscp is slower than rsync

I'm copying a bunch of LVs from the local machine to a remote machine. My old script did the following:

1) Get the name of the next LV
2) Snapshot the LV
3) Calculate the number of "chunks" in the LV (chunksize = 1G)
   3a) Use dd to read the chunk from the snapshot, and write it to a ramdrive on the local system
   3b) Use ssh to the remote system, and use dd to read the chunk from the remote LV to a ramdrive on the remote
   3c) Depending on time of day and day of week, either set the rate to 1.5m or unlimited
   3d) Use rsync with the rate to update the remote copy of the "chunk"
   3e) Use dd to write the remote chunk back to the LV
   3f) Delete the local chunk copy
   3g) Loop back to 3a for the next chunk
4) Delete the snapshot
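
As an illustration of step 3 above, here is a minimal sketch of how a volume could be split into 1 GiB chunks. The function name `chunk_ranges` and the example volume size are hypothetical and not taken from the original script; the offsets it yields would correspond to the skip/seek positions handed to dd.

```python
def chunk_ranges(device_size, chunk_size=1 << 30):
    """Yield (offset, length) pairs that cover device_size bytes.

    chunk_size defaults to 1 GiB, matching the "chunksize = 1G" in the list above.
    The last chunk is shorter when the size is not a multiple of chunk_size.
    """
    offset = 0
    while offset < device_size:
        length = min(chunk_size, device_size - offset)
        yield offset, length
        offset += length


# Hypothetical 2.5 GiB volume: two full chunks plus one partial chunk.
example_size = 5 * (1 << 29)
ranges = list(chunk_ranges(example_size))
for offset, length in ranges:
    print(offset, length)
```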

The "new" script seemed like a much better solution, since it would save steps 3a, 3b, and 3e, which should all be somewhat "slow" compared to just reading the devices and writing the changes. It simply replaced all of step 3 with bscp to copy the snapshot to the remote.

The old script took approx 36 hours to complete (max 36 hours, min 29.5 hours, avg 33.8 hours over 10 runs), and my hope was that bscp would help to reduce it to closer to 24 hours (or less, since it no longer sets a bwlimit during the sync copy). However, it is taking significantly longer: the current run has taken over 72 hours and is still not complete. This isn't the first run; I just don't have the logs from the previous runs.
Can anyone advise if they have seen bscp being (significantly) less efficient than rsync, or am I doing something wrong? Is there an alternative?

For reference, the maximum speed of the connection is 20Mbps, and the total size of all LVs being copied is 5.4TB.
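
To put these figures in perspective, here is a rough back-of-the-envelope calculation. It assumes decimal units (1 TB = 10**12 bytes, 1 Mbps = 10**6 bit/s), a fully saturated link, and no protocol overhead, none of which is stated in the report.

```python
# Assumptions (not from the report): decimal units, saturated link, no overhead.
total_bytes = 5.4e12           # 5.4 TB across all LVs
link_bits_per_second = 20e6    # 20 Mbps

full_copy_seconds = total_bytes * 8 / link_bits_per_second
full_copy_hours = full_copy_seconds / 3600


def max_changed_fraction(window_hours):
    """Largest share of the data that could cross the link within window_hours."""
    return window_hours / full_copy_hours


print("full copy: %.0f hours" % full_copy_hours)
print("36 h window: at most %.1f%% of the data" % (100 * max_changed_fraction(36)))
```

Under these assumptions a complete transfer would take about 600 hours, so a run that finishes in roughly 36 hours can have sent at most about 6% of the data. That suggests the runtime depends heavily on how fast both sides can read and compare the devices, not only on the link.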

Switch to a shorter GitHub repository name and project domain name

Switch from the pretty lengthy and confusing GitHub-pages-induced repository name:

bscp-tool.github.io

to the more obvious plain:

bscp

All existing GitHub clones using previous repository URIs should continue to work:

These should all redirect to the new repository URI:

https://github.com/bscp-tool/bscp

This simplification means, however, that we can't use our current project domain anymore:

https://bscp-tool.github.io

Keeping this would require maintaining a separate repository with that long name, as enforced by GitHub Pages. This goes against our design of keeping the code and documentation in one repository. One possibility would be to keep that second repository in sync by some CI triggers, but that would add yet another moving part to the system and might cause trouble in the future. Moreover, recreating that repository would mean we lose the redirect of the repository URIs that would otherwise be automatically generated by GitHub upon the renaming.

Luckily, there is a simpler solution to this messy situation: we can just use an entirely different project domain, add a CNAME file to the repository, and be done with it. For this purpose, I'm donating the following subdomain of my private main domain to this project:

https://bscp.njh.eu

As this new project domain is even shorter than our previous GitHub subdomain, this will be a win-win situation.

For your attention: @advanced-schema @alkmim @art0int @countingpine @imbmf @jhcloos @koollman @leprelnx @urjaman @xtaran

Destination bigger than source

Hi Volker, in the past I used blocksync.py for these tasks, but it imposes a mandatory condition: the source must be identical in size to the target. I was looking for an alternative and found your utility on GitHub. After testing it, I saw that it is possible to copy to a bigger destination device, but I don't know if that is safe, because I did not find anything on the web about it.

RuntimeError: Checksum mismatch after transfer

I don't know if the syntax is right.
I'm using:

./bscp /dev/sda2 10.32.0.255:/dev/sda2

And I get:

Traceback (most recent call last):
  File "./bscp", line 162, in <module>
    (in_total, out_total, size) = bscp(local_filename, remote_host, remote_filename, blocksize, hashname)
  File "./bscp", line 142, in bscp
    raise RuntimeError('Checksum mismatch after transfer')
RuntimeError: Checksum mismatch after transfer

No other output before that.
I have Python 2.7.15rc1 on Ubuntu 18.

Poll: Use stronger hash algorithm by default?

@art0int @geaplanet @introlinux @rtmf

Bscp uses SHA-1 by default, which is no longer considered a strong crypto hash. Should we switch to SHA-256 by default?

Or, are you already using SHA-256 (or stronger) via the "HASH" command line option anyway?
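
For context, a small sketch of what the switch would mean per block. It relies on `hashlib.new`, which the code quoted in another issue on this page shows bscp using; the block size below is a hypothetical value chosen for illustration, not necessarily bscp's default.

```python
import hashlib

BLOCKSIZE = 64 * 1024  # hypothetical block size, for illustration only


def block_digest(hashname, block):
    """Digest of one block, computed with the named hashlib algorithm."""
    return hashlib.new(hashname, block).digest()


block = bytes(BLOCKSIZE)  # one all-zero block
for hashname in ("sha1", "sha256"):
    digest = block_digest(hashname, block)
    print(hashname, len(digest), "bytes per block digest")
```

SHA-1 digests are 20 bytes and SHA-256 digests are 32 bytes, so the per-block digest that has to be exchanged grows by 12 bytes, which is small compared to any realistic block size.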

Please share your opinions!

Lack of any progress information

As this is designed for copying block devices, it's highly likely that we are talking about multiple gigabytes of data, so there should really be some form of progress information.

Remote script failed to execute properly

CentOS 7:

lvcreate -L 1M -n bscp-test staticnode13
lvcreate -L 2M -n bscp-test2 staticnode13
/tmp/bscp /dev/staticnode13/bscp-test localhost:/dev/staticnode13/bscp-test2 4194304 sha1
results in:
Traceback (most recent call last):
  File "/tmp/bscp", line 176, in <module>
    (in_total, out_total, size) = bscp(local_filename, remote_host, remote_filename, blocksize, hashname)
  File "/tmp/bscp", line 132, in bscp
    raise RuntimeError('Remote script failed to execute properly')
RuntimeError: Remote script failed to execute properly

It makes no difference whether you use LVM-backed, ZFS-backed, or file-backed raw devices, or whether you talk to sshd via localhost or an external address.

What exactly does "Remote script failed to execute properly" mean?

I understand that it's coming from

    sanity_digest = hashlib.new(hashname, remote_filename).digest()
    remote_digest = io.read(len(sanity_digest))
    if remote_digest != sanity_digest:
        raise RuntimeError('Remote script failed to execute properly')

But I have no idea why :-)
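
The quoted check can be reproduced in isolation. The sketch below wraps it in a hypothetical helper, `check_handshake`, and uses `.encode()` so that it runs on Python 3 (the quoted code appears to target Python 2). It only demonstrates the mechanism: the first bytes read from the remote side must equal the digest of the remote filename, so anything else arriving first on that stream triggers the error. The stray banner text is invented for the demonstration, and this is one possible cause, not a confirmed diagnosis of this report.

```python
import hashlib
import io


def check_handshake(stream, hashname, remote_filename):
    """Hypothetical helper mirroring the quoted sanity check."""
    sanity_digest = hashlib.new(hashname, remote_filename.encode()).digest()
    remote_digest = stream.read(len(sanity_digest))
    if remote_digest != sanity_digest:
        raise RuntimeError('Remote script failed to execute properly')


name = '/dev/staticnode13/bscp-test2'
expected = hashlib.new('sha1', name.encode()).digest()

# A clean stream that starts with the expected digest passes silently.
check_handshake(io.BytesIO(expected), 'sha1', name)

# Invented stray output in front of the digest makes the same check fail.
handshake_error = None
try:
    check_handshake(io.BytesIO(b'stray banner text\n' + expected), 'sha1', name)
except RuntimeError as exc:
    handshake_error = str(exc)
print(handshake_error)
```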

Bandwidth limit?

Any suggestions on how to limit bandwidth?

Can only work in one direction?

Why?...

Documentation says:

Usage
bscp SRC HOST:DEST [BLOCKSIZE] [HASH]

What if I need to copy a remote block device into a local file? This fails:

$ /path/to/bscp hostname:/dev/vg/vm-vmname.snap vmname.img
Usage:

    bscp SRC HOST:DEST [BLOCKSIZE] [HASH]

Is there any reason it can't work both ways?

Future of bscp

I'd like to hear opinions about the future development of the bscp tool.

Bscp was initially created to fill a gap where Rsync performed really badly at the time: backing up single, large block devices over the network, transferring only changed blocks. This was useful to transfer disk images of virtual machines, even encrypted ones.
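
The idea of transferring only changed blocks can be sketched as follows. This is a simplified local illustration, not bscp's actual implementation: the function name `changed_blocks` is hypothetical, both streams are read in the same process, and in a networked setup the destination digests would be computed on the remote side so that only digests and differing blocks cross the link.

```python
import hashlib
import io


def changed_blocks(src, dst, blocksize, hashname="sha256"):
    """Return indices of blocks whose digests differ between two binary streams."""
    changed = []
    index = 0
    while True:
        src_block = src.read(blocksize)
        if not src_block:
            break  # end of source reached
        dst_block = dst.read(blocksize)
        src_digest = hashlib.new(hashname, src_block).digest()
        dst_digest = hashlib.new(hashname, dst_block).digest()
        if src_digest != dst_digest:
            changed.append(index)  # only this block would need to be sent
        index += 1
    return changed


source = io.BytesIO(b"aaaabbbbccccdddd")
target = io.BytesIO(b"aaaaXXXXccccdddd")
print(changed_blocks(source, target, blocksize=4))
```

With these two 16-byte streams and a block size of 4, only the second block (index 1) differs, so only that block would have to be transferred.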

Regarding #14, Rsync seems to have caught up in the meantime, so maybe this tool is no longer needed at all?

In case there are still users who have important use cases for Bscp: Would anyone step up to take maintainership of this project? I'm willing to create an organization on GitHub and to move Bscp to it, given that one or two people join in.

Otherwise, I'd propose to close and archive this project, and to put a notice into the README redirecting people to Rsync.

What do you think?

By default what size of blocksize is taken?

If you don't pass a blocksize, which size is used?
