btrfs-snapshots-diff's People

Contributors

darkdragon-001, sysnux

btrfs-snapshots-diff's Issues

tests output multiple lines to stderr

The tests print the following messages to stderr:

$ ./tests.sh > test.txt
ERROR: Could not open: No such file or directory
At subvol btrfs-diff-tests.child
Found a valid Btrfs stream header, version 1
At subvol btrfs-diff-tests.child
Found a valid Btrfs stream header, version 1
./tests.sh: line 51: jq: command not found
At subvol btrfs-diff-tests.child
Found a valid Btrfs stream header, version 1
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
BrokenPipeError: [Errno 32] Broken pipe

Expected

No output should be observed in stderr.
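This expectation could be checked automatically. The following is only a minimal sketch: the helper name stderr_of is hypothetical, and it assumes the test script can be started as a normal subprocess.

```python
import subprocess


def stderr_of(cmd):
    """Run cmd, discard its stdout, and return whatever it wrote to stderr."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode stderr to str
        check=False,              # a non-zero exit code is not our concern here
    )
    return result.stderr
```

A check such as `assert stderr_of(["./tests.sh"]) == ""` would then fail whenever any of the messages above reappears.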

Enhancement: tests.sh output should be programmatically compared with expected outputs

Currently, test outputs can only be reviewed with the following procedure:

./tests.sh > tests.output
git diff
git checkout -- tests.output

An example output of git diff is:

diff --git a/tests.output b/tests.output
index 4025582..5c5e8e6 100644
--- a/tests.output
+++ b/tests.output
@@ -7,160 +7,160 @@ btrfs-snapshots-diff.py normal output:
 ======================================
 
 btrfs-diff-tests.child
-	snapshot: uuid=181877fd3bc2444b91d237a8bf59048f, ctransid=75437, clone_uuid=b7512557e0d2384eaea45e3649dc5682, clone_ctransid=75433
+	snapshot: uuid=b05cf7fd116eb9418f8dd50569f4fbe3, ctransid=75849, clone_uuid=beec252fcd969e4d975d0802a5b2e70e, clone_ctransid=75845
 
 __sub_root__
-	times a=2021/01/13 07:55:40 m=2021/01/13 07:55:44 c=2021/01/13 07:55:44
+	times a=2021/01/13 09:56:28 m=2021/01/13 09:56:33 c=2021/01/13 09:56:33
 
...
 dir
-	renamed from "o258-75437-0"
+	renamed from "o258-75849-0"
 	owner 1000:1000
 	mode 755
-	times a=2021/01/13 07:55:44 m=2021/01/13 07:55:44 c=2021/01/13 07:55:44
-o259-75437-0
+	times a=2021/01/13 09:56:33 m=2021/01/13 09:56:33 c=2021/01/13 09:56:33
+o259-75849-0
 	[18, 20] ====================
 
 
 btrfs-snapshots-diff.py CSV output:
 ===================================
-snapshot;clone_ctransid=75433;clone_uuid=b7512557e0d2384eaea45e3649dc5682;ctransid=75437;path=btrfs-diff-tests.child;uuid=181877fd3bc2444b91d237a8bf59048f
-utimes;atime=1610513740.878372;ctime=1610513744.8304334;mtime=1610513744.8304334;path=
-mkfile;path=o257-75437-0
-rename;path=o257-75437-0;path_to=hardlink
+snapshot;clone_ctransid=75845;clone_uuid=beec252fcd969e4d975d0802a5b2e70e;ctransid=75849;path=btrfs-diff-tests.child;uuid=b05cf7fd116eb9418f8dd50569f4fbe3
+utimes;atime=1610520988.8380048;ctime=1610520993.3627512;mtime=1610520993.3627512;path=
+mkfile;path=o257-75849-0
+rename;path=o257-75849-0;path_to=hardlink
 link;path=file2;path_link=hardlink
...
 
 btrfs-snapshots-diff.py JSON output:
@@ -169,25 +169,25 @@ btrfs-snapshots-diff.py JSON output:
   {
     "command": "snapshot",
     "path": "btrfs-diff-tests.child",
-    "uuid": "181877fd3bc2444b91d237a8bf59048f",
-    "ctransid": 75437,
-    "clone_uuid": "b7512557e0d2384eaea45e3649dc5682",
-    "clone_ctransid": 75433
+    "uuid": "b05cf7fd116eb9418f8dd50569f4fbe3",
+    "ctransid": 75849,
+    "clone_uuid": "beec252fcd969e4d975d0802a5b2e70e",
+    "clone_ctransid": 75845
   },

As can be seen, specific values (mtime, atime, some UUIDs, some placeholder file names, etc.) naturally change between tests.sh runs.

TODO

Find a way to replace the run-specific values in the expected output with calculated values so that it matches the newly created output.
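One possible approach, instead of recalculating the values, is to mask them on both sides before comparing. The sketch below is an assumption, not part of the project: the function name normalize and the placeholder tokens are made up, and the patterns are derived only from the values visible in the diff above.

```python
import re

# Patterns for values that differ on every tests.sh run (taken from the diff above).
_VOLATILE = [
    # 32 hex digit UUIDs
    (re.compile(r'\b[0-9a-f]{32}\b'), '<UUID>'),
    # human-readable timestamps such as 2021/01/13 07:55:40
    (re.compile(r'\b\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\b'), '<TIME>'),
    # ctransid / clone_ctransid in normal, CSV and JSON output
    (re.compile(r'\b((?:clone_)?ctransid"?[=:] ?)\d+'), r'\1<TRANSID>'),
    # epoch timestamps in the CSV output (atime=, ctime=, mtime=)
    (re.compile(r'\b([acm]time=)\d+(?:\.\d+)?'), r'\1<EPOCH>'),
    # temporary names such as o258-75437-0 (middle part is the transaction id)
    (re.compile(r'\bo(\d+)-\d+-(\d+)\b'), r'o\1-<TRANSID>-\2'),
]


def normalize(text):
    """Replace run-specific values with fixed placeholders."""
    for pattern, replacement in _VOLATILE:
        text = pattern.sub(replacement, text)
    return text
```

Comparing `normalize(expected)` with `normalize(actual)` would then ignore the differences shown in the diff while still catching structural changes.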

Doesn't work with btrfs-progs 4.12?

Great tool, but it no longer works with btrfs-progs 4.12. Perhaps this started when I upgraded Ubuntu to 17.10.

The error message was:

ERROR: not dumping send stream into a terminal, redirect it into a file
Error: <class 'subprocess.CalledProcessError'>
executing "btrfs send -p /srv/_btrbk_snapshots/video.20180319T0303 /srv/_btrbk_snapshots/video.20180323T0303 --no-data -f /tmp/snaps-diff"

I looked into man btrfs send. The subvolume has to be the last parameter! (Has this recently changed in btrfs-progs?) So I changed your code accordingly:

<             cmd = ['btrfs', 'send', '-p', args.parent, args.child, '--no-data',
<                    '-f', '/tmp/snaps-diff']
---
>             cmd = ['btrfs', 'send', '-p', args.parent, '--no-data',
>                    '-f', '/tmp/snaps-diff', args.child]

Works again :-)
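The same change, written as a small helper so the ordering is explicit. This is just a sketch: build_send_cmd is a hypothetical name, and the flags are the ones already used in the command above.

```python
def build_send_cmd(parent, child, out_file='/tmp/snaps-diff'):
    """Build the btrfs send argument list with every option before the subvolume."""
    return ['btrfs', 'send', '-p', parent, '--no-data', '-f', out_file, child]
```

Keeping the subvolume as the last element means the options are never mistaken for positional arguments.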

Tests do not pass

Currently the tests throw an exception:

./tests.sh
Create subvolume './btrfs-diff-tests'
Create a readonly snapshot of './btrfs-diff-tests' in './btrfs-diff-tests.parent'
ERROR: Could not open: No such file or directory
~/.sbin/erik-sync/smith-sync/btrfs-snapshots-diff/btrfs-diff-tests ~/.sbin/erik-sync/smith-sync/btrfs-snapshots-diff
~/.sbin/erik-sync/smith-sync/btrfs-snapshots-diff
Create a readonly snapshot of './btrfs-diff-tests' in './btrfs-diff-tests.child'
btrfs-snapshots-diff.py normal output:
======================================
  File "/home/ceremcem/.sbin/erik-sync/smith-sync/btrfs-snapshots-diff/btrfs-snapshots-diff.py", line 97
    raise ValueError(f'Unexpected attribute {self.send_attrs[attr]}')
                                                                   ^
SyntaxError: invalid syntax

btrfs-snapshots-diff.py CSV output:
===================================
  File "/home/ceremcem/.sbin/erik-sync/smith-sync/btrfs-snapshots-diff/btrfs-snapshots-diff.py", line 97
    raise ValueError(f'Unexpected attribute {self.send_attrs[attr]}')
                                                                   ^
SyntaxError: invalid syntax

btrfs-snapshots-diff.py JSON output:
====================================
./tests.sh: line 51: jq: command not found
  File "/home/ceremcem/.sbin/erik-sync/smith-sync/btrfs-snapshots-diff/btrfs-snapshots-diff.py", line 97
    raise ValueError(f'Unexpected attribute {self.send_attrs[attr]}')
                                                                   ^
SyntaxError: invalid syntax

Delete subvolume (commit): '/home/ceremcem/.sbin/erik-sync/smith-sync/btrfs-snapshots-diff/btrfs-diff-tests'
Delete subvolume (commit): '/home/ceremcem/.sbin/erik-sync/smith-sync/btrfs-snapshots-diff/btrfs-diff-tests.parent'
Delete subvolume (commit): '/home/ceremcem/.sbin/erik-sync/smith-sync/btrfs-snapshots-diff/btrfs-diff-tests.child'

SyntaxError: Non-ASCII character

The following error is thrown because the © character is used here without an encoding declaration:

File ".../btrfs-snapshots-diff.py", line 37
SyntaxError: Non-ASCII character '\xc2' in file .../btrfs-snapshots-diff.py on line 38, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

Optional --child snapshot is confusing

It's confusing to omit the --child parameter because it's not clear how the CHILD snapshot is created (or what is done).

Proposal

A diff tool should not attempt to auto-generate the second reference; that is confusing behavior. The optional --child feature should be removed and --child should become a required parameter.
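With argparse, the proposal could look roughly like this. It is a sketch only: build_parser and the help texts are made up, and the short options -p and -c are assumed from the invocations shown in other issues.

```python
import argparse


def build_parser():
    """Command-line parser in which both snapshots are mandatory."""
    parser = argparse.ArgumentParser(description='Diff two btrfs snapshots')
    parser.add_argument('-p', '--parent', required=True, help='parent snapshot')
    parser.add_argument('-c', '--child', required=True, help='child snapshot')
    return parser
```

Calling the tool without --child would then end with a usage error instead of silently creating a snapshot.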

Add to documentation: Use cases

I'm planning to use this tool to create another implementation of btrfs send | btrfs receive mechanism. See this issue for the rationale.

This tool might be useful while investigating the snapshots to determine which snapshot bloats the disk space.

What else?

(Should we add such a section to the documentation?)

Reading symlinks

Symlinks create inode data which must be read using self._tlv_get_u64() instead of self._tlv_get_string() in L163.

I tried to read out the resulting inode using btrfs inspect-internal inode-resolve -v INODE BTRFS-ROOT, but it returned only the path of the symlink and not of its destination...

ValueError: Unexpected attribute BTRFS_SEND_A_FILE_OFFSET


Traceback (most recent call last):
  File "/home/user/@temp/btrfs-snapshots-diff.py", line 441, in <module>
    modified, commands = stream.decode()
                         ^^^^^^^^^^^^^^^
  File "/home/user/@temp/btrfs-snapshots-diff.py", line 322, in decode
    idx2, path = self._tlv_get_string(
                 ^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/@temp/btrfs-snapshots-diff.py", line 134, in _tlv_get_string
    raise ValueError('Unexpected attribute %s' % self.send_attrs[attr])
ValueError: Unexpected attribute BTRFS_SEND_A_FILE_OFFSET

JSON export

As I wanted to create a BTRFS two-way sync tool, I also needed to parse these streams.

I have a version which outputs JSON data with a summary of the stream content. If it is useful for someone, I can send a pull request with this output format.

My version is available here: https://gitlab.grifart.cz/jkuchar1/btrfs-send-parser

CSV output is broken if a filename contains a semicolon

Reproduction

touch "$HOME/hello;there;my;file;is;here"

CSV output becomes:

home/ceremcem/hello;renamed from "o7899611-71742-0";owner 1000:1000;mode 644;times a=2021/01/11 12:36:48 m=2021/01/11 12:36:48 c=2021/01/11 12:36:48
o7899612-71742-0;mkfile;rename to "home/ceremcem/hello;there;my;file;is;here"

The first column is inaccurate.
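A possible fix is to let Python's csv module do the quoting instead of joining fields by hand. A minimal sketch, assuming the rows are available as lists of strings; to_csv and from_csv are hypothetical helper names.

```python
import csv
import io


def to_csv(rows):
    """Serialize rows with ';' as delimiter, quoting fields that contain it."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=';', quoting=csv.QUOTE_MINIMAL,
                        lineterminator='\n')
    writer.writerows(rows)
    return buf.getvalue()


def from_csv(text):
    """Parse the output of to_csv back into a list of rows."""
    return list(csv.reader(io.StringIO(text), delimiter=';'))
```

A path such as hello;there;my;file;is;here is then written inside double quotes, so a CSV reader using the same delimiter gets it back as a single column.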

Unexpected attribute BTRFS_SEND_A_FILE_OFFSET

Using kernel 4.8.3 and btrfs-progs 4.7.2 I get the following error when I do a diff:

$> python2 btrfs-snapshots-diff.py -p /mnt/snapshots/\@home/2016-07-28T10\:12\:53+0100 -c /mnt/snapshots/\@home/2016-10-20T03\:10\:16+0100
At subvol /mnt/snapshots/@home/2016-10-20T03:10:16+0100
Found a valid Btrfs stream header, version 1
Traceback (most recent call last):
  File "btrfs-snapshots-diff.py", line 377, in <module>
    modified, commands = stream.decode()
  File "btrfs-snapshots-diff.py", line 270, in decode
    'BTRFS_SEND_A_PATH', idx + self.l_head)
  File "btrfs-snapshots-diff.py", line 99, in _tlv_get_string
    raise ValueError('Unexpected attribute %s' % self.send_attrs[attr])
ValueError: Unexpected attribute BTRFS_SEND_A_FILE_OFFSET

Is this by any chance a new attribute that's crept into the send format?
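Whatever the cause turns out to be, one way to make the parser less sensitive to which attribute comes first is to read all attributes of a command into a dictionary and look them up by type. This is only a sketch under assumptions: the function read_tlvs is hypothetical, the TLV header layout (little-endian u16 type, u16 length) and the two attribute numbers are quoted from memory of the kernel's send stream format for version 1 streams and should be verified.

```python
import struct

# Attribute numbers as recalled from the kernel's send.h; treat them as assumptions.
BTRFS_SEND_A_PATH = 15
BTRFS_SEND_A_FILE_OFFSET = 18


def read_tlvs(payload):
    """Split one command payload into a {attr_type: raw_bytes} mapping."""
    attrs = {}
    offset = 0
    while offset + 4 <= len(payload):
        # Each attribute starts with a 4-byte header: type and data length.
        attr, length = struct.unpack_from('<HH', payload, offset)
        offset += 4
        attrs[attr] = payload[offset:offset + length]
        offset += length
    return attrs
```

The decoder could then ask for `attrs[BTRFS_SEND_A_PATH]` regardless of the position of the path attribute within the command.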

generate GNU diffs

Since you made all the effort of decoding the stream, what about another output format: GNU diff?

This would allow extracting patch files from btrfs snapshots :)
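A rough idea of how this might work for a single text file, once the tool has reported its path as modified. This is a sketch, not a proposal for the actual implementation: unified_diff_for is a hypothetical helper, and it assumes both snapshots are mounted and readable, because a stream created with --no-data does not contain the file contents.

```python
import difflib
import os


def unified_diff_for(parent_root, child_root, rel_path):
    """Return a unified diff of rel_path between two snapshot directories."""
    def read_lines(root):
        full = os.path.join(root, rel_path)
        if not os.path.isfile(full):
            return []  # file was created or deleted between the snapshots
        with open(full, encoding='utf-8', errors='replace') as handle:
            return handle.readlines()

    return ''.join(difflib.unified_diff(
        read_lines(parent_root),
        read_lines(child_root),
        fromfile=os.path.join('a', rel_path),
        tofile=os.path.join('b', rel_path),
    ))
```

Binary files and metadata changes (owner, mode, renames) would need separate handling.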

struct.error: unpack requires a buffer of 10 bytes

Hello,

When passing a file that is still being written, it would be nice to have a user-friendly message.

sudo btrfs send --no-data -p ~/d/.snapshots/one/ /mnt/d/.snapshots/two/ > diff  & # still running...

sudo ./bin/btrfs-snapshots-diff.py --by_path -t -f diff
Traceback (most recent call last):
  File "./bin/btrfs-snapshots-diff.py", line 653, in <module>
    main()
  File "./bin/btrfs-snapshots-diff.py", line 620, in main
    commands, paths = stream.decode(bogus=args.bogus)
  File "./bin/btrfs-snapshots-diff.py", line 152, in decode
    l_cmd, cmd, _ = unpack('<IHI', self.stream[offset : offset + self.l_head])
struct.error: unpack requires a buffer of 10 bytes

maybe this could be:

if fuser diff >/dev/null 2>&1; then
	echo "It looks like the file you referenced is still being saved"
else
	echo "It looks like the file you referenced is corrupt"
fi
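Inside the script itself, the friendlier message could come from a length check before unpacking. This is a sketch only: read_cmd_header and TruncatedStreamError are hypothetical names, and the '<IHI' format is the one shown in the traceback above.

```python
import struct

# Command header as unpacked in the traceback: length, command, and a third
# field that the original code ignores (10 bytes in total).
CMD_HEADER = struct.Struct('<IHI')


class TruncatedStreamError(ValueError):
    """Raised when the stream ends in the middle of a command header."""


def read_cmd_header(stream, offset):
    """Unpack one command header, or fail with a readable message."""
    if offset + CMD_HEADER.size > len(stream):
        raise TruncatedStreamError(
            'Stream ends at byte %d, inside a command header: the file may '
            'be incomplete (still being written) or corrupt' % len(stream))
    return CMD_HEADER.unpack_from(stream, offset)
```

The caller could catch TruncatedStreamError in main() and print the message instead of a traceback.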
