Comments (8)

tpwrules commented on June 13, 2024

Good to know you can replicate it. Once the problem is figured out, I'd like to be sure that it can't create bogus archives. If it crashes and fails, that's fine. But if it can silently skip files or something, that's a problem.

mhx commented on June 13, 2024

The fix was actually much simpler than I thought.

When I introduced the optimization, I had to change the map that keeps track of "unique file sizes" so that it instead keeps track of unique (size, 4 KiB hash) pairs. The code uses this map to determine whether:

  1. a (size, hash) pair has never been seen before,
  2. the pair has been seen before exactly once (in which case it needs to kick off hashing jobs for both the first file and the current file),
  3. the pair has been seen before more than once (in which case only the current file needs to be hashed).

This part worked just fine. However, there is a second map that keeps track of a "latch" used to synchronize insertion into a third map that tracks all duplicate files. That whole shebang is necessary because inode objects need to be allocated as soon as possible in order to perform additional work (such as categorization or similarity hashing) asynchronously. So when we discover an unseen (size, hash) pair, we know for sure that this must be a new inode. Any subsequent files with the same (size, hash) pair could refer to the same inode, though. In order to avoid creating an additional inode for a duplicate file, we must ensure that the first file (for which we have already created an inode) is inserted into the duplicate files table first. But since hashing is done asynchronously, hashing of the first file could finish after hashing of subsequent files. Hence the latch, which all subsequent files wait for before checking the duplicate files table, and which is released only after the first file has been added to the duplicate files table.
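
To make the synchronization concrete, here is a minimal sketch of such a one-shot latch in C++. This only illustrates the scheme described above; the name one_shot_latch and the implementation are not taken from the dwarfs sources.

#include <condition_variable>
#include <mutex>

// One-shot latch: released exactly once, waited on by any number of threads.
class one_shot_latch {
 public:
  // Called once the first file of a (size, hash) group has been fully hashed
  // and inserted into the duplicate files table.
  void release() {
    {
      std::lock_guard<std::mutex> lock(mx_);
      released_ = true;
    }
    cv_.notify_all();
  }

  // Called by every subsequent file of the group before it consults the
  // duplicate files table, so it cannot race ahead of the first file.
  void wait() {
    std::unique_lock<std::mutex> lock(mx_);
    cv_.wait(lock, [this] { return released_; });
  }

 private:
  std::mutex mx_;
  std::condition_variable cv_;
  bool released_{false};
};

With one such latch per group, a subsequent file only looks at the duplicate files table once the first file's entry is guaranteed to be there.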

The bug was that the latch map was still keyed only on size rather than on (size, hash). This meant there could be collisions in that map after the optimization was added, and there was no check to ensure there were none (there is a check now). As a result, a file could end up waiting on the wrong latch. That "wrong" latch could be released earlier than the "correct" one, so the subsequent file didn't find its hash in the duplicates map yet. It was then falsely assumed not to be a duplicate, which triggered the creation of a new inode object.

Ultimately, despite this, all duplicates will end up correctly in the duplicates map. However, there are now two or more inodes associated with them. Up until the "finalizing" stage, the files know which inode they belong to, but the inodes don't yet know the full list of files. During "finalization", the files from the duplicates map are assigned to the inode referenced by the first file (assuming all files reference the same inode). In the case of the bug, one or more inodes that were created in error will end up with no files. And these "empty" inodes will eventually trigger the exception.
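
For illustration, here is a rough sketch of what that finalization step amounts to; the type and function names are assumptions made for this example, and the real logic in file_scanner.cpp and inode_manager.cpp differs in detail.

#include <memory>
#include <stdexcept>
#include <vector>

struct file_info;

struct inode_info {
  std::vector<file_info*> files;  // filled in during finalization
};

struct file_info {
  std::shared_ptr<inode_info> ino;  // assigned when the file is first scanned
};

// Each group of duplicate files is assigned to the inode referenced by its
// first file. An inode that was created in error belongs to a subsequent
// file of some group, so its `files` vector stays empty.
void finalize(std::vector<std::vector<file_info*>> const& duplicate_groups) {
  for (auto const& group : duplicate_groups) {
    auto ino = group.front()->ino;
    for (auto* f : group) {
      ino->files.push_back(f);
    }
  }
}

// The later consistency check that trips over such empty inodes.
void check(inode_info const& ino) {
  if (ino.files.empty()) {
    throw std::runtime_error("inode has no file");
  }
}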

To reiterate: even in the presence of this bug, no individual files or data were lost. It was only that unnecessary inode objects were created and some additional unnecessary work was done.

The fix was simply to change the latch map from using size keys to using (size, hash) keys.
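
Conceptually, the fix is just a change of the key type of that one map. The member names below are invented for illustration (the actual code in file_scanner.cpp differs) and the snippet reuses the one_shot_latch sketch from above.

#include <cstdint>
#include <map>
#include <memory>
#include <utility>

class one_shot_latch;  // see the sketch above

// Before (buggy): one latch per file size, so two groups with equal size but
// different 4 KiB hashes could end up sharing (and releasing) a latch:
//   std::map<uint64_t, std::shared_ptr<one_shot_latch>> latches_;

// After (fixed): one latch per (size, 4 KiB hash) group, consistent with the
// map that tracks unique (size, hash) pairs:
std::map<std::pair<uint64_t, uint64_t>, std::shared_ptr<one_shot_latch>> latches_;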

mhx commented on June 13, 2024

Ouch, that takes some time to reproduce unfortunately, sorry.

The stack trace sadly isn't of much help since the damage is done earlier, quite likely in the "finalizing file inodes" stage. That's where the files are assigned to the inode objects, and at least one of these inode objects ends up with no file at all.

I have a hunch that this could be related either to an optimization introduced between the 0.8.0 and 0.9.0 releases (which skips hashing of large files if they have a unique size) or a relatively large change to handle file errors more gracefully, introduced between 0.7.x and 0.8.0.

I'd appreciate it if you could check whether this problem is reproducible with dwarfs-0.8.0. If it is, it'd be interesting to see whether it also reproduces with dwarfs-0.7.5. Please also add --log-level=verbose for the 0.8.0 release (0.7.5 doesn't support that yet).

In the meantime, I'll be running mkdwarfs on a multi-TiB directory here to see if I can reproduce the problem myself.

mhx commented on June 13, 2024

Got it:

root@balrog data # mkdwarfs -i backup -o /dev/null --force --order=revpath --log-level=verbose
V 05:45:46.814369 [mkdwarfs_main.cpp:1221] [--order]
V ...............                            default: revpath
V 05:45:46.814435 [mkdwarfs_main.cpp:1231] [--max-lookback-blocks]
V ...............                            default: 1
V 05:45:46.814447 [mkdwarfs_main.cpp:1241] [--window-size]
V ...............                            default: 12
V 05:45:46.814458 [mkdwarfs_main.cpp:1251] [--window-step]
V ...............                            default: 3
V 05:45:46.814467 [mkdwarfs_main.cpp:1261] [--bloom-filter-size]
V ...............                            default: 4
V 05:45:46.815249 [mkdwarfs_main.cpp:1298] [--compression]
V ...............                            default: zstd [level=22]
I 05:45:46.815908 [scanner.cpp:575] scanning "/raid/data/backup"
I 08:38:58.377506 [scanner.cpp:597] assigning directory and link inodes...
I 08:39:01.342210 [scanner.cpp:607] waiting for background scanners...
I 08:39:01.342283 [scanner.cpp:611] scanning CPU time: 1.593h
I 08:39:01.342316 [scanner.cpp:614] finalizing file inodes...
V 08:39:01.342345 [file_scanner.cpp:403] finalized 0 hardlinks [2.783us]
V 08:39:01.434909 [file_scanner.cpp:429] finalized 176532 unique files [92.54ms]
V 08:39:01.435841 [file_scanner.cpp:429] finalized 0 non-unique files [16.56us]
V 08:39:19.765972 [file_scanner.cpp:429] finalized 5644384 non-unique files [18.33s]
I 08:39:19.791115 [scanner.cpp:626] saved 593.7 GiB / 3.941 TiB in 7327217/13148135 duplicate files
I 08:39:19.996119 [scanner.cpp:647] assigning device inodes...
I 08:39:20.235349 [scanner.cpp:653] assigning pipe/socket inodes...
I 08:39:20.474160 [scanner.cpp:658] building metadata...
I 08:39:20.474215 [scanner.cpp:697] building blocks...
I 08:39:20.474273 [scanner.cpp:661] saving names and symlinks...
I 08:39:20.475152 [scanner.cpp:795] waiting for segmenting/blockifying to finish...
V 08:39:20.624835 [inode_manager.cpp:780] ordering 5820918 inodes by reverse path name...
F 08:39:20.825246 [worker_group.cpp:284] exception thrown in worker thread: dwarfs::runtime_error: inode has no file (any) [inode_manager.cpp:205]
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
scanning: /raid/data/backup/tux/tmp/logitechmediaserver-8.2.1-1652959290/lib/Tie/IxHash.pm
1345324 dirs, 81424/0 soft/hard links, 13148135/13148135 files, 0 other
original size: 3.941 TiB, hashed: 1.042 TiB (12971603 files, 14.48 MiB/s)
scanned: 0 B (0 files, 0 B/s), categorizing: 0 B/s
saved by deduplication: 593.7 GiB (7327217 files), saved by segmenting: 0 B
filesystem: 0 B in 0 blocks (0 chunks, 5820917 fragments, 5820918/5820918 inodes)
compressed filesystem: 0 blocks/0 B written
▏                                                                                                                                                         ▏  0% 🌑
*** Aborted at 1714293560 (Unix time, try 'date -d @1714293560') ***
*** Signal 6 (SIGABRT) (0x200c) received by PID 8204 (pthread TID 0x7f05727ee6c0) (linux TID 8483) (maybe from PID 8204, UID 0) (code: -6), stack trace: ***
    @ 00000000009a1470 (not found)
    @ 0000000000d432af (not found)
    @ 0000000000d7154c (not found)
    @ 0000000000d4322d (not found)
    @ 0000000000420ef1 (not found)
    @ 00000000004e6ea5 (not found)
    @ 000000000043b5c9 (not found)
    @ 0000000000571ab4 (not found)
    @ 0000000000cdc4c3 (not found)
    @ 0000000000d6fc7b (not found)
    @ 0000000000dc275b (not found)
Aborted

So don't worry about investigating further; I can run the tests mentioned above myself.

mhx commented on June 13, 2024

Good to know you can replicate it. Once the problem is figured out, I'd like to be sure that it can't create bogus archives. If it crashes and fails, that's fine. But if it can silently skip files or something, that's a problem.

Yes, absolutely agree.

mhx commented on June 13, 2024

I have a hunch that this could be related either to an optimization introduced between the 0.8.0 and 0.9.0 releases (which skips hashing of large files if they have a unique size) [...]

It is definitely that optimization. I can now reproduce this with a small set of data. There's even an assert() that triggers much earlier in a debug build.

The good news is: it's not a race condition and it's not a problem that can occur undetected. If you run into this problem, mkdwarfs will crash.

The problem occurs if there are multiple files of the same size, some of which are identical and some of which are different but share the same first 4 KiB. Before the optimization introduced in 0.9.0, if there was more than one file of the same size, all files of that size would be fully hashed to determine duplicates. This is extremely time-consuming if you're dealing with e.g. raw images, which tend to be identical in size. So with the optimization, the scanner only fully hashes those files that match in both size and the hash of the first 4 KiB.
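
As a hedged sketch of that selection logic (all names below are invented for this example, and the real scanner works incrementally as files are discovered rather than in one batch), the files that still need a full content hash could be picked roughly like this:

#include <cstddef>
#include <cstdint>
#include <fstream>
#include <functional>
#include <map>
#include <string>
#include <vector>

struct file_entry {
  std::string path;
  uint64_t size;
};

// Hash of the first 4 KiB; std::hash over the raw bytes stands in for the
// hash function actually used by mkdwarfs.
uint64_t head_hash(std::string const& path) {
  std::ifstream in(path, std::ios::binary);
  std::string head(4096, '\0');
  in.read(head.data(), static_cast<std::streamsize>(head.size()));
  head.resize(static_cast<std::size_t>(in.gcount()));
  return std::hash<std::string>{}(head);
}

// Returns the groups of files that still need a full content hash. A file
// with a unique size is skipped without reading any data; a file with a
// unique (size, 4 KiB hash) pair is skipped after reading only 4 KiB.
std::vector<std::vector<std::string>>
full_hash_candidates(std::vector<file_entry> const& files) {
  std::map<uint64_t, std::vector<std::string>> by_size;
  for (auto const& f : files) {
    by_size[f.size].push_back(f.path);
  }

  std::vector<std::vector<std::string>> groups;
  for (auto const& size_group : by_size) {
    if (size_group.second.size() < 2) {
      continue;  // unique size: this file cannot have a duplicate
    }
    std::map<uint64_t, std::vector<std::string>> by_head;
    for (auto const& path : size_group.second) {
      by_head[head_hash(path)].push_back(path);
    }
    for (auto& head_group : by_head) {
      if (head_group.second.size() >= 2) {
        // same size and same first 4 KiB: only these are fully hashed
        groups.push_back(std::move(head_group.second));
      }
    }
  }
  return groups;
}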

I think I know roughly why this goes wrong, but I need to write some tests first that reliably trigger the issue before attempting a fix.

mhx commented on June 13, 2024

Once the problem is figured out, I'd like to be sure that it can't create bogus archives.

BTW, here's an easy way to check the integrity of a DwarFS image using dwarfsck:

$ cd data
$ diff <(dwarfsck /tmp/data.dwarfs --list | sort) <(find * | sort)
$ dwarfsck /tmp/data.dwarfs --checksum=sha256 | sha256sum --check

dwarfsck --list lists all files/dirs in the image. The diff line above can be used to check that the image contains exactly the same set of entries as the source directory, i.e. that the image is complete.

dwarfsck --checksum computes a hash for each regular file from the actual data in the image. Piping it into sha256sum --check verifies those hashes against the original files in the source directory, i.e. that all file data in the image can be read back correctly.

mhx commented on June 13, 2024

Fixed in v0.9.9.
