
dwarfs's People

Contributors

concatime, kspalaiologos, maxirmx, mhx, mrwitek, rarogcmex, txkxgit


dwarfs's Issues

docker container

It took me 20 minutes to install everything for the project on one of my computers as a test, and I do not want to do it again.

A simple docker container with a volume would solve all of these problems. A separate project for just the mounting part would also help.

Decompress/extracting dwarfs images?

I apologize if there is a way to do this and I'm just not seeing it after reading the documentation a few times over, but there doesn't appear to be any direct way to extract the contents of a created dwarfs image, which is a serious missing feature.

Of course, this can be done by mounting the FUSE filesystem and copying the files out, but it seems like a big oversight (assuming there is actually no way to do this and I'm not blind).

Add comparison with `erofs` (in mainline Linux since 5.x)

Apparently the erofs read-only file-system has made it into the mainline stable Linux kernel (since 5.x):

https://www.kernel.org/doc/html/latest/filesystems/erofs.html

Therefore it would be nice to compare dwarfs against erofs, especially since its purpose seems similar (i.e. performance).

From my own limited experimentation, erofs seems to be about twice as fast as squashfs.

For example, compressing dwarfs' own source and build folders (with a block size of 4 KiB and using LZ4 or LZ4HC where possible) yields the following:

# mkdwarfs -i . -o /tmp/dwfs.img -l 9 -S 12 -C lz4hc --no-owner --no-time
197M    /tmp/dwfs.img

# mkdwarfs -i . -o /tmp/dwfs.img-d --no-owner --no-time
114M    /tmp/dwfs.img-d

# mkfs.erofs -z lz4,9 -x -1 -T 0 -E force-inode-extended -- /tmp/erofs.img .
216M    /tmp/erofs.img

# mksquashfs . /tmp/sqfs.img -b 4K -reproducible -mkfs-time 0 -all-time 0 -no-exports -no-xattrs -all-root -progress -comp lz4 -Xhc -noappend
196M    /tmp/sqfs.img

Mounting them and reading yields:

  • erofs ~900 MiB/s, no difference on repeats;
  • squashfs ~400 MiB/s, no difference on repeats;
  • dwarfs (4KB) ~250 MiB/s, and on repeat ~900 MiB/s;
  • dwarfs (all defaults) ~300 MiB/s, and on repeat ~900 MiB/s;

Thus, and this is a guess, I would say that erofs is about as fast as dwarfs, at least on repeated reads (although all images were stored on tmpfs).

Enhance `mkdwarfs` to support specifying a list of files to include (similar to `cpio`)

A very nice feature of cpio (actually its only mode of operation) and of tar (via --files-from) is the option of specifying a list of files to include (instead of recursing through the root folder).

Such a feature would allow one to easily exclude certain files from the source, without having to resort to, for example, rsync to build a temporary tree.

This could work in conjunction with -i as follows: any file within the list is treated as relative to the -i folder, regardless of whether it starts with /, ./ or a plain path, and a warning is issued if an entry tries to traverse outside the -i folder. For example, given that -i source is used:

  • whatever is actually source/whatever;
  • ./whatever is the same as above;
  • /whatever is the same as above;
  • ../whatever would issue an error as it tries to escape the source;
  • a/b/../../c is actually source/c, although it could issue a warning;
  • /some-folder (given it is a folder) would not be recursed, but only itself is created within the resulting image; (it is assumed that one would add other files afterwards);

Also it would be nice to have an option to zero-terminate the entries in the files list instead of newline-terminating them.
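The path-normalization rules above could be sketched roughly as follows (a hypothetical helper illustrating only the proposed semantics, not actual mkdwarfs code):

```python
import os

def resolve_entry(root, entry):
    """Normalize one file-list entry relative to the source root.

    Leading '/', './', or a bare path are all treated as relative to
    the root; entries that escape the root raise an error.
    """
    rel = entry.lstrip("/")            # '/whatever' -> 'whatever'
    norm = os.path.normpath(rel)       # 'a/b/../../c' -> 'c'
    if norm == ".." or norm.startswith(".." + os.sep):
        raise ValueError("entry escapes source root: " + repr(entry))
    return os.path.join(root, norm)
```

With `-i source`, this maps `whatever`, `./whatever`, and `/whatever` all to `source/whatever`, and rejects `../whatever`.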


The above could be quite simple to implement; however, an even more useful option would be something like this:

  • in the Linux kernel there is a small tool, gen_init_cpio.c (https://github.com/torvalds/linux/blob/master/usr/gen_init_cpio.c#L452), which takes a file describing how a cpio archive (to be used for the initramfs) should be created (see the source code at the linked line for the file syntax); thus, in addition to the file-list feature above, such a "file-system" descriptor would allow one to create a file-system with any layout, without needing root credentials on their machine;
  • as an extension to the above, perhaps JSON would be a better choice; :)
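As a purely hypothetical illustration of the JSON variant (field names invented here, loosely mirroring gen_init_cpio's dir/file/slink entry types):

```json
{
  "entries": [
    { "type": "dir",   "path": "/dev",      "mode": "0755", "uid": 0, "gid": 0 },
    { "type": "file",  "path": "/etc/motd", "source": "files/motd", "mode": "0644" },
    { "type": "slink", "path": "/bin/sh",   "target": "busybox", "mode": "0777" }
  ]
}
```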

Image contents are not accessible when mounting on 0.5.2 and up

As the title says, mounting any dwarfs image results in a seemingly successful mount, with all the filesystem contents visible, but no files within are accessible. Notably, this issue does not occur with the -f flag enabled, something I noticed when trying to get debug output. Also note that my distribution uses fuse version 29. This happens with both the prebuilt dwarfs2 binary from the releases page and my own local builds.

dwarfsextract aborts instead of skipping when corrupt files are encountered

Extracting 2b2tca.dwarfs...
E 03:24:38.703663 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.714209 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.714279 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
[... the same "LZMA: decompression failed (data is corrupt)" error repeated ~60 more times ...]
I 03:24:41.164877 blocks created: 1298
I 03:24:41.164926 blocks evicted: 1290
I 03:24:41.164955 request sets merged: 9576
I 03:24:41.164983 total requests: 150010
I 03:24:41.165008 active hits (fast): 22247
I 03:24:41.165033 active hits (slow): 122699
I 03:24:41.165063 cache hits (fast): 3184
I 03:24:41.165090 cache hits (slow): 582
I 03:24:41.165117 total bytes decompressed: 87106428928
I 03:24:41.180369 average block decompression: 100.0%
I 03:24:41.180425 fast hit rate: 16.953%
I 03:24:41.180464 slow hit rate: 82.182%
I 03:24:41.180502 miss rate: 0.865%
dwarfs::runtime_error: extraction aborted

The image mounts just fine with dwarfs, and almost every file reads just fine. However, as soon as dwarfsextract encounters invalid data, it seems to completely bail out, unlike dwarfs. I would use dwarfs to extract instead, but its performance seems to be orders of magnitude slower.

I think the expected behavior ought to be to print a warning and skip the file instead.
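The warn-and-skip behavior could look roughly like this (a minimal sketch with a hypothetical extract_all helper; dwarfsextract's real internals are C++ and will differ):

```python
def extract_all(entries, decompress, write_out, log):
    """Skip corrupt entries instead of aborting the whole extraction.

    'decompress' is assumed to raise RuntimeError on corrupt data; we
    log a warning, remember the failure, and keep going.
    """
    failed = []
    for name, blob in entries:
        try:
            write_out(name, decompress(blob))
        except RuntimeError as exc:
            log("W skipping %s: %s" % (name, exc))
            failed.append(name)
    return failed
```

At the end, the list of skipped files could be reported so the user knows which paths were lost to corruption.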

PS: dwarfsextract doesn't seem to give realtime progress updates like mkdwarfs does either :(

Add new memory management options for FUSE driver

Granted you might be right. Next time that I try DwarFS, I'll issue a sysctl -q vm.drop_caches=3 which if I'm not mistaken should drop the kernel file-system caches.


(In what follows I refer to the dwarfs image as just image, and to the uncompressed files exposed through the mount point as files.)

However, on the same topic, wouldn't it be useful to have the following complementary options:

  • whether to let the kernel cache the files (not the image), like all normal file-systems do; (I think this is the default;)
  • whether dwarfs daemon accesses the image without using the kernel cache (either via O_DIRECT or by using madvise with MADV_DONTNEED in case of mmap access after a block was used);

At the moment I think that both the files and the image are eventually cached by the kernel, thus increasing the memory pressure of the system.

However by using the two proposed options, one could fine tune the CPU / memory usage to fit one's particular use-case:

  • disable the kernel cache for the files, but enable the kernel cache for the image: one trades some CPU to save some memory; (useful for example when the application reading the files already has its own caches;)
  • (my proposed default) enable the kernel cache for files, but disable the kernel cache for the image, one saves on memory for the image but trades some CPU (less than in the previous case); (I think this would be the closest thing to how a normal file-system works, only actual files are cached, but not the block device data;)
  • disable the kernel cache for both files and image, one would heavily trade CPU for minimal memory usage; (this would be useful for example when one needs only a single pass of the stored files;)
  • (the current default?) enable the kernel cache for both files and image, one would trade memory for minimal CPU usage;

Originally posted by @cipriancraciun in #9 (comment)
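The "no kernel cache for the image" mode could be sketched like this (hypothetical helper, not dwarfs code), using posix_fadvise(POSIX_FADV_DONTNEED) after each block read; the O_DIRECT and madvise(MADV_DONTNEED) variants discussed above would be similar:

```python
import os

def read_block_dropping_cache(path, offset, length):
    """Read one image block, then drop it from the kernel page cache."""
    fd = os.open(path, os.O_RDONLY)
    try:
        data = os.pread(fd, length, offset)
        # Hint that this range of the image won't be needed again soon,
        # so the kernel can evict it and reduce memory pressure.
        os.posix_fadvise(fd, offset, length, os.POSIX_FADV_DONTNEED)
        return data
    finally:
        os.close(fd)
```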

How to disable building binaries for fuse 2?

If both the fuse2 and fuse3 libraries are installed, DwarFS will build binaries for both (/usr/sbin/dwarfs for fuse3 and /usr/sbin/dwarfs2 for fuse2). How can I override that behaviour and build dwarfs for only fuse3 or only fuse2?

dwarfs attempts to download googletest during compilation even if it's already installed

If tests are enabled with -DWITH_TESTS=ON and googletest ( dev-cpp/gtest ) is installed globally, dwarfs' cmake tries to download it regardless, which causes a build failure with network-sandbox.

-- Configuring done
-- Generating done
-- Build files have been written to: /var/tmp/portage/sys-fs/dwarfs-0.5.2-r1/work/dwarfs-0.5.2/googletest-download
[1/9] Creating directories for 'googletest'
[2/9] Performing download step (git clone) for 'googletest'
FAILED: googletest-prefix/src/googletest-stamp/googletest-download 
cd /var/tmp/portage/sys-fs/dwarfs-0.5.2-r1/work/dwarfs-0.5.2 && /usr/bin/cmake -P /var/tmp/portage/sys-fs/dwarfs-0.5.2-r1/work/dwarfs-0.5.2/googletest-download/googletest-prefix/tmp/googletest-gitclone.cmake && /usr/bin/cmake -E touch /var/tmp/portage/sys-fs/dwarfs-0.5.2-r1/work/dwarfs-0.5.2/googletest-download/googletest-prefix/src/googletest-stamp/googletest-download
Cloning into 'googletest-src'...
fatal: unable to access 'https://github.com/google/googletest.git/': Could not resolve host: github.com
Cloning into 'googletest-src'...
fatal: unable to access 'https://github.com/google/googletest.git/': Could not resolve host: github.com
Cloning into 'googletest-src'...
fatal: unable to access 'https://github.com/google/googletest.git/': Could not resolve host: github.com
-- Had to git clone more than once:
          3 times.
CMake Error at googletest-download/googletest-prefix/tmp/googletest-gitclone.cmake:31 (message):
  Failed to clone repository: 'https://github.com/google/googletest.git'

Solution: it should use the system googletest library.

mkdwarfs aborted with SIGBUS after around 13 hours of runtime

Here's the log:

nabla@satella /media/veracrypt1/squash $ mkdwarfs -i /media/veracrypt1/squash/mp/ -o "/run/media/nabla/General Store/TEMP/everything.dwarfs"
I 17:46:07.266160 scanning /media/veracrypt1/squash/mp/
E 18:14:36.699276 error reading entry: readlink('/media/veracrypt1/squash/mp//raid0array0-2tb-2018.sqsh/Program Files (x86)/Internet Explorer/ExtExport.exe'): Invalid argument
I 19:27:28.763515 assigning directory and link inodes...
I 19:27:29.319281 waiting for background scanners...
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
scanning: /media/veracrypt1/squash/mp//pucktop-echidna-dec2020.sqsh/.local/share/Steam/steamapps/common/Half-Life 2/hl2/bin/server.so
694746 dirs, 299340/1488 soft/hard links, 1254591/5749940 files, 0 other
original size: 1.352 TiB, dedupe: 200.6 GiB (364325 files), segment: 0 B
filesystem: 0 B in 0 blocks (0 chunks, 888778/5384127 inodes)
compressed filesystem: 0 blocks/0 B written
▏                                                                                                                                ▏  0% /
*** Aborted at 1619766236 (Unix time, try 'date -d @1619766236') ***
*** Signal 7 (SIGBUS) (0x7fe8f3af8000) received by PID 15018 (pthread TID 0x7fe94c3e8640) (linux TID 15042) (code: nonexistent physical address), stack trace: ***
/usr/lib64/libfolly.so.0.58.0-dev(+0x2b64bf)[0x7fe9599e54bf]
/usr/lib64/libfolly.so.0.58.0-dev(_ZN5folly10symbolizer21SafeStackTracePrinter15printStackTraceEb+0x31)[0x7fe959924471]
/usr/lib64/libfolly.so.0.58.0-dev(+0x1f6112)[0x7fe959925112]
/lib64/libc.so.6(+0x396cf)[0x7fe9592286cf]
/usr/lib64/libxxhash.so.0(XXH3_64bits_update+0x774)[0x7fe958c6d584]
/usr/lib64/libdwarfs.so(+0x788cd)[0x7fe959e4f8cd]
/usr/lib64/libdwarfs.so(_ZN6dwarfs4file4scanERKSt10shared_ptrINS_4mmifEERNS_8progressE+0x95)[0x7fe959e5b525]
/usr/lib64/libdwarfs.so(+0xe9a89)[0x7fe959ec0a89]
/usr/lib64/libdwarfs.so(+0xf7f6b)[0x7fe959ecef6b]
/usr/lib/gcc/x86_64-pc-linux-gnu/10.2.0/libstdc++.so.6(+0xd315f)[0x7fe95949f15f]
/lib64/libpthread.so.0(+0x7fbd)[0x7fe959142fbd]
/lib64/libc.so.6(clone+0x3e)[0x7fe9592ee26e]
(safe mode, symbolizer not available)
Bus error

I left this running while trying to compress over 8 TiB of data, and after about 13 hours of scanning, it just sort of crashed and gave up. I don't really want to run it again to debug it, so I'm just going to leave this here.

Running Gentoo Linux on a Ryzen 5 3600 with 64 GB of memory, if that helps.

Sorry about the lack of information. I'd really like to provide more, and if there's anything you'd like me to try to resolve this, let me know (I really like dwarfs, and was hoping it would work for this obscenely large dataset too!). Just keep in mind that I'm probably not going to wait 13 hours again unless I know it works :/

EDIT: Forgot to specify my version number. Whoops. I'm using 0.5.4-rc2 from the GURU repository here: https://github.com/gentoo/guru/blob/master/sys-fs/dwarfs/dwarfs-0.5.4-r2.ebuild - I did build with -O3, but dwarfs seems to work just fine with smaller inputs, so I don't know. Specifically, I'm using this: https://github.com/InBetweenNames/gentooLTO

Clear compile-time jemalloc definition

Good daytime!
I'm a Gentoo GURU project (semi-official) maintainer, and I've ported dwarfs to Gentoo.
According to the dependency list, dwarfs requires libjemalloc-dev, but it builds successfully without it. Is jemalloc really needed?
Could you add a clear compile-time cmake option (something like the current WITH_LUA) to control whether jemalloc is actually used?

CMake fails to detect missing `sparsehash` dependency

I'm trying to build 0.2.0 on openSUSE Tumbleweed; cmake runs successfully, but the build then breaks stating that it can't find the sparsehash dependency:

[ 92%] Linking CXX static library libfolly.a
/tmp/dwarfs-0.2.0/src/dwarfs/block_manager.cpp:32:10: fatal error: sparsehash/dense_hash_map: No such file or directory
   32 | #include <sparsehash/dense_hash_map>
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
make[2]: *** [CMakeFiles/dwarfs.dir/build.make:108: CMakeFiles/dwarfs.dir/src/dwarfs/block_manager.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs....
[ 92%] Built target folly
make[1]: *** [CMakeFiles/Makefile2:284: CMakeFiles/dwarfs.dir/all] Error 2
make: *** [Makefile:171: all] Error 2

I think an extra check in CMake should solve this.

Mkdwarfs crashes when creating images using compression levels 6+

Hello!

Mkdwarfs crashes for me when I use compression levels 6 and higher; lower levels work fine. This is the error I get:

*** Aborted at 1616416371 (Unix time, try 'date -d @1616416371') ***
*** Signal 4 (SIGILL) (0x4b9ec2) received by PID 24703 (pthread TID 0x25a2140) (linux TID 24703) (code: illegal operand), stack trace: ***
Illegal instruction (core dumped)

mkdwarfs-coredump.zip

I'm using the latest (0.4.1) static executables from the releases page. OS is Arch Linux.

SIGBUS happened again - twice!

⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
scanning: /media/veracrypt3/dwarfs/mount//2b2tca.dwarfs/2b2tca/backups/backup-2b2tca-10june2019/main_map_nether/DIM-1/region/r.11.-2.mca
678293 dirs, 297774/10 soft/hard links, 46525/5413471 files, 0 other
original size: 91.28 GiB, dedupe: 24.74 GiB (15061 files), segment: 0 B
filesystem: 0 B in 0 blocks (0 chunks, 31454/5398400 inodes)
compressed filesystem: 0 blocks/0 B written
▏                                                                                                                                                                                      ▏  0% /
*** Aborted at 1623187919 (Unix time, try 'date -d @1623187919') ***
*** Signal 7 (SIGBUS) (0x7f3559c9b000) received by PID 5233 (pthread TID 0x7f35853eb700) (linux TID 5254) (code: nonexistent physical address), stack trace: ***
Bus error (core dumped)

The same issue as described in issue #45 happened again, on the pre-compiled 0.5.5 release binaries. Twice, actually: the first time was on a completely different system, but the second time I was able to get a core dump. Even better, I actually think I know what's causing it.

When I ran mksquashfs instead of mkdwarfs on the exact same data, this happened:

nabla@satella /media/veracrypt3/dwarfs $ doas mksquashfs /media/veracrypt3/dwarfs/mount/ /media/veracrypt2/LiterallyEverything-08-Jun-2021.sqsh -comp zstd -Xcompression-level 22 -b 1M
Parallel mksquashfs: Using 12 processors
Creating 4.0 filesystem on /media/veracrypt2/LiterallyEverything-08-Jun-2021.sqsh, block size 1048576.
Read failed because Input/output error
Failed to read file /media/veracrypt3/dwarfs/mount//2b2tca.dwarfs/2b2t.ca world and plugins/main map/region/r.-105.2.mca, creating empty file
Read failed because Input/output error
Failed to read file /media/veracrypt3/dwarfs/mount//2b2tca.dwarfs/2b2t.ca world and plugins/main map/region/r.-13.5.mca, creating empty file
[... 13 more files failed to read with the same Input/output error; progress-bar lines omitted ...]

So, here's my theory (and I'm bad at these theories, so please take it with a grain of salt): mksquashfs probably reads these files with regular old open() and read() calls, so whenever it encounters an I/O error it can just skip the file and create an empty one as if nothing happened. But mkdwarfs, as @mhx mentioned in the previous issue about this, makes extensive use of mmap, so perhaps every time an I/O error occurs, the region of memory backed by the file it's trying to read becomes inaccessible, and a SIGBUS is raised instead?

Perhaps this SIGBUS could be caught, and behavior similar to mksquashfs's could be preserved, whereby the file is simply skipped and replaced with an empty one; or maybe the file could be re-read several times before giving up and moving on?
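As a rough illustration of the mksquashfs-style fallback (hypothetical helper, not dwarfs code): plain read() reports EIO as an error return rather than faulting with SIGBUS the way a touched mmap page does, so the caller can retry and then skip:

```python
import os
import time

def read_with_retry(path, retries=3, delay=0.1):
    """Retry transient I/O errors, then skip the file (mksquashfs-style).

    read() surfaces EIO as an OSError, so we can retry a few times and
    finally substitute empty content instead of crashing.
    """
    for _ in range(retries):
        try:
            with open(path, "rb") as f:
                return f.read()
        except OSError:
            time.sleep(delay)
    return b""  # give up: store an empty file, as mksquashfs does
```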

Also of note: these I/O errors are coming from a mounted DwarFS filesystem. When I got the SIGBUS error in #45, I was trying to read from a bunch of SquashFS filesystems, not DwarFS filesystems, so this could be an issue with the physical disk (although I still think DwarFS should definitely be robust enough to skip these errors rather than completely bailing out, as I do actually need to recover these files).

Even more bizarrely, despite both SquashFS and DwarFS failing consistently when trying to read roughly the same files, when I re-mounted the 2b2tca.dwarfs filesystem in question, I was able to read all of its content without any I/O errors at all. Truly baffling.

Anyway, I have replied to the email I sent @mhx of the core dump for the previous issue with the new core dump (which is actually much smaller this time), so hopefully it's possible to figure out where this is specifically happening.

systemd recipe?

I'm struggling to come up with a proper systemd recipe; I'm a systemd noob.
When starting it as a systemd service dependent on local-fs, my mount is not readable by the local user.

systemctl enable dwarfs-mount
systemctl start dwarfs-mount
$ ls -al
drwxr-xr-x.  5 rurban 69632 Nov 30 13:48 perl
d??????????  ? ?          ?            ? perl.s

When starting it locally, it works fine.

sudo cat /etc/systemd/user/dwarfs-mount.service
[Unit]
Description=Local DwarfFS Mounts
Documentation=man:dwarfs(1) https://github.com/mhx/dwarfs
DefaultDependencies=no
#ConditionKernelCommandLine=
OnFailure=emergency.target
Conflicts=umount.target
# Run after core mounts
After=-.mount var.mount
After=systemd-remount-fs.service
# But we run *before* most other core bootup services that need write access to /etc and /var
#Before=local-fs.target umount.target
#Before=systemd-random-seed.service plymouth-read-write.service systemd-journal-flush.service
#Before=systemd-tmpfiles-setup.service

[Service]
Type=oneshot
User=rurban
Group=root
RemainAfterExit=yes
ExecStart=/usr/local/bin/dwarfs /usr/src/perl/perl.dwarfs /usr/src/perl.s
StandardInput=null
StandardOutput=journal
StandardError=journal+console

[Install]
WantedBy=local-fs.target

Or should I maybe just add it to my fstab?

EDIT: Beware: with a syntax error in such a system unit depending on local-fs, you can lock yourself out and will need to boot from USB. Emergency mode (i.e. rescue.target) will not work.
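To answer the fstab question: an /etc/fstab entry using the paths from the unit above might look like the following. This is a sketch; I haven't verified the exact option set the dwarfs mount helper accepts, and fuse.dwarfs relies on the generic FUSE fstab convention of invoking the dwarfs binary from PATH.

```
# /etc/fstab — mount a DwarFS image at boot (sketch, options unverified)
/usr/src/perl/perl.dwarfs  /usr/src/perl.s  fuse.dwarfs  ro,allow_other,nofail  0  0
```

nofail avoids the lock-out scenario from the EDIT above, since boot proceeds even if the mount fails; allow_other needs user_allow_other enabled in /etc/fuse.conf.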

Error: variable ‘folly::symbolizer::Symbolizer symbolizer’ has initializer but incomplete type

I'm trying to create ebuild for 0.3.0 dwarfs version. And here is an error:
/usr/bin/x86_64-pc-linux-gnu-g++ -DDWARFS_HAVE_LIBLZ4 -DDWARFS_HAVE_LIBLZMA -DDWARFS_HAVE_LIBZSTD -DDWARFS_STATIC_BUILD=OFF -DDWARFS_USE_JEMALLOC -DDWARFS_VERSION="" -DFMT_LOCALE -DFMT_SHARED -DGFLAGS_IS_A_DLL=0 -Ddwarfs_EXPORTS -Iinclude -I/usr/include/libiberty -isystem folly -isystem thrift -isystem fbthrift -isystem zstd/lib -isystem xxHash -isystem . -march=skylake -mtune=skylake -O2 -pipe -mmmx -msse -msse2 -msse3 -mssse3 -mcx16 -msahf -maes -mpclmul -mpopcnt -mabm -mfma -mbmi -msgx -mbmi2 -mavx -mavx2 -msse4.2 -msse4.1 -mlzcnt -mrtm -mhle -mrdrnd -mf16c -mfsgsbase -mrdseed -mprfchw -madx -mfxsr -mxsave -mxsaveopt -mclflushopt -mxsavec -mxsaves --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=6144 -fPIC -Wall -Wextra -pedantic -pthread -std=c++17 -MD -MT CMakeFiles/dwarfs.dir/src/dwarfs/logger.cpp.o -MF CMakeFiles/dwarfs.dir/src/dwarfs/logger.cpp.o.d -o CMakeFiles/dwarfs.dir/src/dwarfs/logger.cpp.o -c src/dwarfs/logger.cpp
src/dwarfs/logger.cpp: In member function ‘virtual void dwarfs::stream_logger::write(dwarfs::logger::level_type, const string&, const char*, int)’:
src/dwarfs/logger.cpp:102:51: error: variable ‘folly::symbolizer::Symbolizer symbolizer’ has initializer but incomplete type
102 | Symbolizer symbolizer(LocationInfoMode::FULL);
| ^

Investigate memory consumption when compressing large files

I have to admit I've been doing most of my compression tasks on machines with 64GB of memory, so optimizing for low memory consumption hasn't really been a priority yet. There are some knobs you might be able to turn, though. I'm not sure large files per se are an issue, but a large number of files definitely is. You might be able to tweak --memory-limit a bit, which determines how many uncompressed blocks can be queued. If you lower this limit, the compressor pool may run out of blocks more quickly, resulting in overall slower compression. Reducing the number of workers (-N) might also help a bit.

A small update on this (apologies that this is on an unrelated issue). I did some experimentation and found that lowering the memory limit and the number of workers works in some instances but not in others. Large files seem to be the biggest hold-up; in particular, there was one instance where a 3.1 GB file seemingly had no way of being compressed via dwarfs with my 16 GB of memory (even with very low options like -L1m -N1).

What I did find instead was that using the -l0 option and then recompressing the image works in these cases without issue. Creating the initial image with -S24 results in very well recompressed files in these instances: the 3.1 GB file compressed down to 2.3 GB, whereas the default block size for -l0 resulted in a 2.6 GB file (which is approximately what mksquashfs -comp zstd -b 1M -Xcompression-level 22 also gave me).
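For reference, the two-step approach would look roughly like this (the paths are placeholders; --recompress is the mkdwarfs option for re-packing an existing image, so the expensive compression pass no longer has to hold the original input files in memory):

```
# Step 1: build the image quickly with no compression, but large blocks
mkdwarfs -i input_dir -o temp.dwarfs -l0 -S24

# Step 2: recompress the existing image block by block
mkdwarfs -i temp.dwarfs -o final.dwarfs --recompress
```

This trades some temporary disk space for a much smaller peak memory footprint.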

Originally posted by @Phantop in #33 (comment)

Possibility to build atop exact system libraries (unbundle them)

At the moment dwarfs uses its own bundled libraries. That's bad practice if we are building on a source-based system like Gentoo: I've just found out that revdep-rebuild (a utility which checks that all packages are consistent) triggers on dwarfs:
RarogCmexDell ~ # revdep-rebuild --pretend

  • This is the new python coded version
  • Please report any bugs found using it.
  • The original revdep-rebuild script is installed as revdep-rebuild.sh
  • Please file bugs at: https://bugs.gentoo.org/
  • Collecting system binaries and libraries
  • Checking dynamic linking consistency
  • Assign files to packages

emerge --pretend --oneshot --complete-graph=y sys-fs/dwarfs:0

These are the packages that would be merged, in order:

Calculating dependencies... done!
[ebuild R ~] sys-fs/dwarfs-0.3.1-r2

So it would be good to unbundle at least the xxhash and zstd sources (at the moment they are pulled in via CMake; I'm not able to patch the CMake files because I don't know CMake), like the previous 0.2.4 version, which did not rely on bundled libraries.

The ebuild to play around with is in the GURU repository:
eselect repository enable guru

Add rubygem-ronn dependency

to the README, and somehow integrate the man/Makefile into cmake.
I had to do it manually, and only then was I able to run a sudo make install.

Create empty files when unable to access them

It would be useful to have an additional command-line option to tell mkdwarfs to create empty files when it can't access them. mksquashfs does this by default:

Failed to read file dir/filename, creating empty file

Useful for preserving the file structure of an input directory even when not all files are readable (for instance, if a file is owned by another user).

Not a highly important feature, of course, but it would be nice to have it, if it's not too hard to implement.
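Until such an option exists, the behavior can be approximated outside of mkdwarfs by staging a mirror of the input where any unreadable file is replaced by an empty placeholder, and then pointing mkdwarfs at the mirror. A rough sketch (directory and file names are examples):

```shell
#!/bin/sh
set -eu
tmp=$(mktemp -d)
mkdir -p "$tmp/src/sub"
echo data > "$tmp/src/sub/file.txt"
dst="$tmp/mirror"
# Recreate the directory tree of the input
(cd "$tmp/src" && find . -type d) | while IFS= read -r d; do
  mkdir -p "$dst/$d"
done
# Copy each file; fall back to an empty placeholder if it cannot be read
(cd "$tmp/src" && find . -type f) | while IFS= read -r f; do
  cp "$tmp/src/$f" "$dst/$f" 2>/dev/null || : > "$dst/$f"
done
echo "mirror staged at $dst"
```

Readable files could use hard links instead of cp to avoid duplicating data, at the cost of requiring the mirror to live on the same filesystem.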

FUSE daemon `-o cachesize` issue

So I've created quite a largish image of about 800 MiB, which uncompressed is around 2.5 GiB.

I've tried starting dwarfs with -o cachesize=128m, ran find /tmp/dwarfs -type f -exec md5sum {} +, and after it was done the FUSE daemon process still retained ~800 MiB to ~1 GiB of RAM. (This is not virtual memory, but the RES column of htop, which reports the actual memory committed. When dwarfs starts it reports ~150 MiB.)

Now given that there are no longer any open files, and the fact that I've already passed through all the files, there shouldn't be any uncompressed blocks (and thus decompression state) lingering around.

Even using an uncompressed image (i.e. -l 0) of ~400 MiB uncompressed size, results in ~400 MiB RAM usage.

dwarfs throws link errors on arm64

make[1]: *** [CMakeFiles/Makefile2:459: CMakeFiles/dwarfsextract.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
/usr/bin/ld: metadata_v2.cpp:(.text._ZN6dwarfs9metadata_INS_19debug_logger_policyEEC2ERNS_6loggerEN5folly5RangeIPKhEES9_RKNS_16metadata_optionsEib[_ZN6dwarfs9metadata_INS_19debug_logger_policyEEC2ERNS_6loggerEN5folly5RangeIPKhEES9_RKNS_16metadata_optionsEib]+0x65c): undefined reference to `fmt::v8::vformat[abi:cxx11](fmt::v8::basic_string_view<char>, fmt::v8::basic_format_args<fmt::v8::basic_format_context<fmt::v8::appender, char> >)'
/usr/bin/ld: metadata_v2.cpp:(.text._ZN6dwarfs9metadata_INS_19debug_logger_policyEEC2ERNS_6loggerEN5folly5RangeIPKhEES9_RKNS_16metadata_optionsEib[_ZN6dwarfs9metadata_INS_19debug_logger_policyEEC2ERNS_6loggerEN5folly5RangeIPKhEES9_RKNS_16metadata_optionsEib]+0x6d0): undefined reference to `fmt::v8::vformat[abi:cxx11](fmt::v8::basic_string_view<char>, fmt::v8::basic_format_args<fmt::v8::basic_format_context<fmt::v8::appender, char> >)'
/usr/bin/ld: libdwarfs.a(metadata_v2.cpp.o):metadata_v2.cpp:(.text._ZNK6dwarfs9metadata_INS_19debug_logger_policyEE4dumpERSoiRKNS_15filesystem_infoERKSt8functionIFvRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEjEE[_ZNK6dwarfs9metadata_INS_19debug_logger_policyEE4dumpERSoiRKNS_15filesystem_infoERKSt8functionIFvRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEjEE]+0x4b8): more undefined references to `fmt::v8::vformat[abi:cxx11](fmt::v8::basic_string_view<char>, fmt::v8::basic_format_args<fmt::v8::basic_format_context<fmt::v8::appender, char> >)' follow
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [CMakeFiles/dwarfsbench.dir/build.make:118: dwarfsbench] Error 1
make[1]: *** [CMakeFiles/Makefile2:599: CMakeFiles/dwarfsbench.dir/all] Error 2
/usr/bin/ld: folly/folly/experimental/exception_tracer/libfolly_exception_tracer.a(ExceptionStackTraceLib.cpp.o)(.debug_info+0x38): R_AARCH64_ABS64 used with TLS symbol _ZN12_GLOBAL__N_17invalidE
/usr/bin/ld: folly/folly/experimental/exception_tracer/libfolly_exception_tracer.a(ExceptionStackTraceLib.cpp.o)(.debug_info+0x5a): R_AARCH64_ABS64 used with TLS symbol _ZN12_GLOBAL__N_118uncaughtExceptionsE
/usr/bin/ld: folly/folly/experimental/exception_tracer/libfolly_exception_tracer.a(ExceptionStackTraceLib.cpp.o)(.debug_info+0x74): R_AARCH64_ABS64 used with TLS symbol _ZN12_GLOBAL__N_116caughtExceptionsE
/usr/bin/ld: folly/libfolly.a(CacheLocality.cpp.o)(.debug_info+0x13c2b): R_AARCH64_ABS64 used with TLS symbol _ZZN5folly18SequentialThreadId3getEvE5local
/usr/bin/ld: folly/libfolly.a(AsyncStack.cpp.o)(.debug_info+0x64): R_AARCH64_ABS64 used with TLS symbol _ZN5folly12_GLOBAL__N_127currentThreadAsyncStackRootE
/usr/bin/ld: folly/folly/experimental/exception_tracer/libfolly_exception_tracer.a(ExceptionTracerLib.cpp.o)(.debug_info+0x132bf): R_AARCH64_ABS64 used with TLS symbol _ZZN5folly15SharedMutexImplILb0EvSt6atomicNS_24SharedMutexPolicyDefaultEE26tls_lastDeferredReaderSlotEvE2tl
/usr/bin/ld: folly/folly/experimental/exception_tracer/libfolly_exception_tracer.a(ExceptionTracerLib.cpp.o)(.debug_info+0x1355d): R_AARCH64_ABS64 used with TLS symbol _ZZN5folly15SharedMutexImplILb0EvSt6atomicNS_24SharedMutexPolicyDefaultEE21tls_lastTokenlessSlotEvE2tl
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [CMakeFiles/mkdwarfs.dir/build.make:118: mkdwarfs] Error 1
make[1]: *** [CMakeFiles/Makefile2:564: CMakeFiles/mkdwarfs.dir/all] Error 2
/usr/bin/ld: folly/libfolly.a(SharedMutex.cpp.o)(.debug_info+0x6105): R_AARCH64_ABS64 used with TLS symbol _ZZN5folly15SharedMutexImplILb1EvSt6atomicNS_24SharedMutexPolicyDefaultEE21tls_lastTokenlessSlotEvE2tl
/usr/bin/ld: folly/libfolly.a(SharedMutex.cpp.o)(.debug_info+0x613c): R_AARCH64_ABS64 used with TLS symbol _ZZN5folly15SharedMutexImplILb1EvSt6atomicNS_24SharedMutexPolicyDefaultEE26tls_lastDeferredReaderSlotEvE2tl
/usr/bin/ld: folly/libfolly.a(SharedMutex.cpp.o)(.debug_info+0x61a3): R_AARCH64_ABS64 used with TLS symbol _ZZN5folly15SharedMutexImplILb0EvSt6atomicNS_24SharedMutexPolicyDefaultEE21tls_lastTokenlessSlotEvE2tl
/usr/bin/ld: folly/libfolly.a(SharedMutex.cpp.o)(.debug_info+0x61da): R_AARCH64_ABS64 used with TLS symbol _ZZN5folly15SharedMutexImplILb0EvSt6atomicNS_24SharedMutexPolicyDefaultEE26tls_lastDeferredReaderSlotEvE2tl
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [CMakeFiles/dwarfs_compat_test.dir/build.make:121: dwarfs_compat_test] Error 1
make[1]: *** [CMakeFiles/Makefile2:528: CMakeFiles/dwarfs_compat_test.dir/all] Error 2
make: *** [Makefile:163: all] Error 2

Sharing the test archive

I noticed that, during the comparison with zpaq, the "placebo" compression mode (-m5) was used while, in reality, the default one (-m1) is almost always used.
Could you please share the file you used for testing, so I can do some analysis?
Thanks

`mkdwarfs` refuses to start if the source is a symlink to a folder

As the title says, if the given source argument is a symlink to a folder, mkdwarfs fails with an error stating it wants a folder.

As a workaround one could call mkdwarfs -i .../folder/ -o .../output.

I would suggest allowing (perhaps with a warning) a symlink to a folder as a source argument.
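Until that's changed, resolving the symlink before invoking mkdwarfs also works; for example (the mkdwarfs invocation at the end is a hypothetical illustration):

```shell
#!/bin/sh
set -eu
tmp=$(mktemp -d)
mkdir "$tmp/real_folder"
ln -s real_folder "$tmp/link_to_folder"
# mkdwarfs wants an actual directory; resolve the symlink first:
src=$(realpath "$tmp/link_to_folder")
echo "resolved source: $src"
# mkdwarfs -i "$src" -o image.dwarfs   # hypothetical invocation
```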

Garbled output due to progress bar and mismatched `TERM`

For some reason, if running mkdwarfs over SSH under screen from urxvt (i.e. urxvt -e ssh user@host, then run screen, then run mkdwarfs), the progress bar gets garbled and outputs â¯â¯â¯â¯â¯â¯.

I've run strace and the progress bar seems to be written as:

[pid 17539] write(2, "\342\216\257\342\216\257\342\216\257\342\216\257\342\216\257\342\216\257\342\216\257\342\216\257\342\216\257\342\216\257\342\216"..., 1206)

Running with an empty environment (i.e. env -i) and even setting TERM to any of vt100, linux, rxvt-unicode, screen, screen.rxvt, xterm doesn't seem to fix it.

Granted if one doesn't use screen the progress bar looks OK. (Although locally on my laptop, with a newer screen it does seem to work just fine.)

My assumption is that the progress bar characters trick screen into displaying wrong characters.

Perhaps add an option to use an ASCII-only progress bar or only VT100-compliant codes. Alternatively, add an option to print the progress from time to time as simple print statements, as opposed to the current nice progress dashboard.
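In the meantime, since the progress output goes to stderr, making stderr a non-terminal might already force a plain fallback; this is a sketch I haven't tested under screen:

```
# Pipe stderr through cat (or redirect it into a file); mkdwarfs should
# then see that stderr is not a tty and skip the Unicode progress bar.
mkdwarfs -i input_dir -o image.dwarfs 2>&1 | cat
```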

mkdwarfs 0.5.0 crashes at creating images

Unlike the previous time, now it crashes with any compression level.

*** Aborted at 1617652815 (Unix time, try 'date -d @1617652815') ***
*** Signal 4 (SIGILL) (0x4fd686) received by PID 9296 (pthread TID 0x2c0e1c0) (linux TID 9296) (code: illegal operand), stack trace: ***
Illegal instruction (core dumped)

mkdwarfs-0.5.0-coredump.zip

This is with the static 0.5.0 binary from the releases page. Manually compiled dynamically linked build works fine.

`cmake` won't configure on Debian

After cloning the repository and following the steps, I get the following error:

 0 [16:04] ~/workspace % git clone --recurse-submodules https://github.com/mhx/dwarfs
[...]
 0 [16:04] ~/workspace % cd dwarfs
 0 [16:04] ~/workspace/dwarfs@main % mkdir build
 0 [16:04] ~/workspace/dwarfs@main % cd build
 0 [16:04] workspace/dwarfs@main/build % cmake .. -DWITH_TESTS=1
-- The C compiler identification is GNU 10.2.1
-- The CXX compiler identification is GNU 10.2.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Setting build type to 'Release'
CMake Error at cmake/version.cmake:35 (message):
  missing version files
Call Stack (most recent call first):
  CMakeLists.txt:61 (include)


-- Configuring incomplete, errors occurred!
See also "/home/palaiologos/workspace/dwarfs/build/CMakeFiles/CMakeOutput.log".
 1 [16:04] workspace/dwarfs@main/build %

Attaching my CMakeOutput.log. I'm using the following to build dwarfs:

 129 [16:08] workspace/dwarfs@main/build % cmake --version
cmake version 3.18.4

CMake suite maintained and supported by Kitware (kitware.com/cmake).
 0 [16:08] workspace/dwarfs@main/build % git --version
git version 2.32.0.rc0
 0 [16:08] workspace/dwarfs@main/build % gcc --version
gcc (Debian 10.2.1-6) 10.2.1 20210110
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

 0 [16:08] workspace/dwarfs@main/build % clang --version
Debian clang version 11.0.1-2
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
 0 [16:08] workspace/dwarfs@main/build %

The release bundles don't seem to contain dependencies

I've just downloaded both the 0.2.2 and 0.2.3 tar.gz bundles from GitHub (the releases pane on the right), and when trying to configure with cmake, it complains that it fails to find CMakeLists.txt in both folly and fbthrift. Inspecting those folders, they are empty.

Doing a git clone and a submodule update does seem to fix the issue.

Thus it would be a good idea to include those two dependencies in the bundles, or to update the README to point out to users how to populate those folders.

arch does not match x86_64 — it seems the architecture isn't always detected correctly

There is a bug fix which disables the SSE2/AVX2 compile flags for the LtHash SIMD code when the arch does not match x86_64:

-- arch does not match x86_64, skipping setting SSE2/AVX2 compile flags for LtHash SIMD code

In certain configurations the autodetection does not work as expected :)
It is disabled on an Intel Skylake (Gentoo):

uname -a
Linux RCEngine 5.11.0-pf6-RarogCmex #1 SMP PREEMPT Fri Apr 2 10:42:06 +05 2021 x86_64 Intel(R) Core(TM) i5-6600K CPU @ 3.50GHz GenuineIntel GNU/Linux

cat /proc/cpuinfo

cpuinfo.txt

Comparison with wimlib

DwarFS seems pretty nice. File access within the archive is quick. The main feature seems to be deduplication. Judging by the resulting file sizes, I'm guessing this is whole-file deduplication rather than block-based?

Downside... DwarFS seems slow to make, compared to both wimlib wimcapture and squashfs.

Testing with a copy of every released Wine version, extracted by doing for tag in $(git tag); do git archive --prefix=$tag/ $tag | tar -xC /mnt/wine; done (requires, naturally, the Wine git repository):

$ time mkdwarfs -i wine -o wine.dwarfs
10:04:50.260867 scanning wine
10:04:59.692484 waiting for background scanners...
14:10:05.911222 assigning directory and link inodes...
14:10:06.312963 finding duplicate files...
14:10:11.475558 saved 53.69 GiB / 64.85 GiB in 2907224/3117413 duplicate files
14:10:11.475645 ordering 210189 inodes by similarity...
14:10:11.642889 210189 inodes ordered [167.2ms]
14:10:11.642926 assigning file inodes...
14:10:11.644702 building metadata...
14:10:11.644753 building blocks...
14:10:11.644802 saving names and links...
14:10:12.103973 updating name and link indices...
14:43:56.247557 waiting for block compression to finish...
14:43:56.247884 saving chunks...
14:43:56.275000 saving directories...
14:43:58.693785 waiting for compression to finish...
14:43:58.813425 compressed 64.85 GiB to 183.4 MiB (ratio=0.00276233)
14:43:59.328251 filesystem created without errors [1.675e+04s]
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
waiting for block compression to finish
scanned/found: 362904/362904 dirs, 0/0 links, 3117413/3117413 files
original size: 64.85 GiB, dedupe: 53.69 GiB (2907224 files), segment: 6.309 GiB
filesystem: 4.847 GiB in 311 blocks (1460981 chunks, 210189/210189 inodes)
compressed filesystem: 311 blocks/183.4 MiB written
█████████████████████████████████████████████████████████████████████████▏100% /

real	279m9.270s
user	26m17.945s
sys	3m53.332s
$ time mksquashfs wine wine.squashfs -comp zstd
Parallel mksquashfs: Using 12 processors
Creating 4.0 filesystem on wine.squashfs, block size 131072.
[=================================================================================|] 3284743/3284743 100%

Exportable Squashfs 4.0 filesystem, zstd compressed, data block size 131072
	compressed data, compressed metadata, compressed fragments,
	compressed xattrs, compressed ids
	duplicates are removed
Filesystem size 2074564.87 Kbytes (2025.94 Mbytes)
	3.04% of uncompressed filesystem size (68204545.10 Kbytes)
Inode table size 31817047 bytes (31071.33 Kbytes)
	28.29% of uncompressed inode table size (112449867 bytes)
Directory table size 28385936 bytes (27720.64 Kbytes)
	41.57% of uncompressed directory table size (68284423 bytes)
Number of duplicate files found 2907225
Number of inodes 3480317
Number of files 3117413
Number of fragments 47404
Number of symbolic links  0
Number of device nodes 0
Number of fifo nodes 0
Number of socket nodes 0
Number of directories 362904
Number of ids (unique uids + gids) 1
Number of uids 1
	chungy (1000)
Number of gids 1
	chungy (1000)

real	153m19.319s
user	89m22.676s
sys	2m14.197s
$ time wimcapture --unix-data --solid wine wine.wim
Scanning "wine"
64 GiB scanned (3117413 files, 362904 directories)    
Using LZMS compression with 12 threads
Archiving file data: 11 GiB of 11 GiB (100%) done

real	79m20.722s
user	42m30.817s
sys	1m37.350s
$ du wine.*
184M	wine.dwarfs
2.0G	wine.squashfs
173M	wine.wim

wimlib is significantly faster to create this massive archive than DwarFS, and the resulting file size is marginally smaller. Git itself stores the Wine history in about 310MB, though that's not the fairest of comparisons given git's delta-based storage and the inclusion of every interim commit between the releases too.

DwarFS still beats out this particular WIM archive for performance as a mounted file system, because I used solid compression and random access in wimlib is not fast in this circumstance. I also think (correct me if I'm wrong!) that a solid archive was the better comparison, since DwarFS seems to group like files together and compress them as one unit (311 blocks in this particular file system). wimcapture's mode compresses each stream individually and the archive size balloons up to 2.4GB while making random access much quicker.

Question about read-only vs read-write

Hi,

DwarFS looks absolutely brilliant. But are there any plans to make it read-write, or is the plan to keep it as a read-only file system?

I would love to try it out for checking out several large (and similar, but different) git repositories, and then building them. We have several 300GB repositories at work, but most developers only have a 1T disk, so it quickly fills up.

Would you recommend an overlay filesystem on top of DwarFS for this use case, for now?
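For the record, the overlay setup I have in mind would look roughly like this (paths are placeholders; it requires root and kernel overlayfs support):

```
# Read-only lower layer from DwarFS, writable upper layer on disk
dwarfs repo.dwarfs /mnt/repo-ro
mkdir -p /tmp/upper /tmp/work
mount -t overlay overlay \
      -o lowerdir=/mnt/repo-ro,upperdir=/tmp/upper,workdir=/tmp/work \
      /mnt/repo-rw
```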

Thanks!

Extra non-flag arguments are ignored (instead of issuing a warning or error)

At least for mkdwarfs, if one passes an extra argument, for example mkdwarfs -i ... -o ... Z (i.e. the Z), it is simply ignored without failing.

Although this is not a major issue, especially in the case of wrapper scripts and automation tools, a small bug in the user's code could, for example, fail to prepend -S to the user input, and that input would then just be silently ignored by the tool.

Segfault encountered while creating a large DwarFS image

I was trying to create an image containing the latest tip of the CDNJS repository (https://github.com/cdnjs/cdnjs), and I encountered the following error twice:

waiting for block compression to finish
scanned/found: 544213/544213 dirs, 121/121 links, 7511842/7511842 files
original size: 235.1 GiB, dedupe: 77.02 GiB (6225274 files), segment: 63.22 GiB
filesystem: 94.81 GiB in 6068 blocks (9108158 chunks, 1286568/1286568 inodes)
compressed filesystem: 6068 blocks/9.235 GiB written
ERROR: std::out_of_range: _Map_base::at
Command exited with non-zero status 1

(If you want to check out the repository yourself, I strongly suggest using a shallow checkout, --depth 1 --single-branch, and preparing for a lot of waiting...) :)

build failed with boost-1.77.0: block_compressor.cpp:400:10: error: no type named 'mutex' in namespace 'std'

build.log

Tail of error:

/var/tmp/portage/sys-fs/dwarfs-0.5.6-r1/work/dwarfs-0.5.6/src/dwarfs/block_compressor.cpp:400:10: error: no type named 'mutex' in namespace 'std'
    std::mutex mx_;
    ~~~~~^
/var/tmp/portage/sys-fs/dwarfs-0.5.6-r1/work/dwarfs-0.5.6/src/dwarfs/block_compressor.cpp:433:15: error: no type named 'mutex' in namespace 'std'
  static std::mutex s_mx;
         ~~~~~^
/var/tmp/portage/sys-fs/dwarfs-0.5.6-r1/work/dwarfs-0.5.6/src/dwarfs/block_compressor.cpp:386:22: error: expected ';' after expression
      std::lock_guard lock(mx_);
                     ^
                     ;
/var/tmp/portage/sys-fs/dwarfs-0.5.6-r1/work/dwarfs-0.5.6/src/dwarfs/block_compressor.cpp:386:12: error: no member named 'lock_guard' in namespace 'std'
      std::lock_guard lock(mx_);
      ~~~~~^

It happens after a boost update:

[I] dev-libs/boost
Available versions: 1.76.0-r1(0/1.76.0)^t (~)1.77.0-r2(0/1.77.0)^t {bzip2 context debug doc icu lzma mpi +nls numpy python static-libs +threads tools zlib zstd ABI_MIPS="n32 n64 o32" ABI_S390="32 64" ABI_X86="32 64 x32" PYTHON_TARGETS="python3_8 python3_9 python3_10"}
Installed versions: 1.77.0-r2(0/1.77.0)^t(16:50:23 09/18/21)(bzip2 context icu lzma nls zlib zstd -debug -doc -mpi -numpy -python -tools ABI_MIPS="-n32 -n64 -o32" ABI_S390="-32 -64" ABI_X86="64 -32 -x32" PYTHON_TARGETS="python3_9 -python3_8 -python3_10")
Homepage: https://www.boost.org/
Description: Boost Libraries for C++

FUSE graceful exit on initialization error

I've tried to use -o mlock=must and (as it should) failed due to the per user limits.

However dwarfs (both the FUSE v2 and v3 drivers) aborted with an exception and failed to properly unmount the filesystem.

I think the FUSE driver should first try to execute all initialization steps, and only mount the filesystem if all of them succeed.

Add an option for static executables

After I successfully built 0.2.0, I tried to see how many dynamic libraries dwarfs depends on. Unfortunately there are quite a few, most of which aren't installed by default...

It would be lovely to have an option to build a statically linked executable that can be easily moved from one Linux instance to another.

For example, one important use case for dwarfs would be server deployments, where one could build a static image of the application (imagine a large Python/Ruby virtualenv and an application with lots of files and assets).

$ readelf -d ./dwarfs

 0x0000000000000001 (NEEDED)             Shared library: [libboost_date_time.so.1.74.0]
 0x0000000000000001 (NEEDED)             Shared library: [libboost_filesystem.so.1.74.0]
 0x0000000000000001 (NEEDED)             Shared library: [libboost_program_options.so.1.74.0]
 0x0000000000000001 (NEEDED)             Shared library: [libboost_system.so.1.74.0]
 0x0000000000000001 (NEEDED)             Shared library: [libfmt.so.7]
 0x0000000000000001 (NEEDED)             Shared library: [libdouble-conversion.so.3]
 0x0000000000000001 (NEEDED)             Shared library: [libgflags.so.2.2]
 0x0000000000000001 (NEEDED)             Shared library: [libglog.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libunwind.so.8]
 0x0000000000000001 (NEEDED)             Shared library: [liblz4.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [liblzma.so.5]
 0x0000000000000001 (NEEDED)             Shared library: [libzstd.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libfuse3.so.3]
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
$ readelf -d ./mkdwarfs

 0x0000000000000001 (NEEDED)             Shared library: [libboost_date_time.so.1.74.0]
 0x0000000000000001 (NEEDED)             Shared library: [libboost_filesystem.so.1.74.0]
 0x0000000000000001 (NEEDED)             Shared library: [libboost_program_options.so.1.74.0]
 0x0000000000000001 (NEEDED)             Shared library: [libboost_system.so.1.74.0]
 0x0000000000000001 (NEEDED)             Shared library: [libfmt.so.7]
 0x0000000000000001 (NEEDED)             Shared library: [libdouble-conversion.so.3]
 0x0000000000000001 (NEEDED)             Shared library: [libgflags.so.2.2]
 0x0000000000000001 (NEEDED)             Shared library: [libglog.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libcrypto.so.1.1]
 0x0000000000000001 (NEEDED)             Shared library: [libunwind.so.8]
 0x0000000000000001 (NEEDED)             Shared library: [liblz4.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [liblzma.so.5]
 0x0000000000000001 (NEEDED)             Shared library: [libzstd.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]

unbundling

Please provide a way to unbundle folly, fbthrift, fsst and parallel-hashmap. I'm packaging this for Gentoo and I'd really appreciate it.

Speeding up mount times without sacrificing compression

So, I've been using DwarFS for a while now, and I'm loving it - I've had success with compressing my huge multi-terabyte backups with version 0.5.5, and all is well, except for one kind of massive problem: The mount times.... are atrocious!

I have a directory containing 1.5 TiB worth of separate DwarFS archives, but I'm going to focus on just one of them specifically for this example, 3TBDRV-PartC-05-Jun-2021.dwarfs, which is 129.1 GiB large, and 191.5 GiB uncompressed, with 452854 files. And oh boy, look at this:

I 17:17:22.057967 file system initialized [4460s]

That took over an hour to mount! (Ryzen 5 3600, 64 GB of RAM, DwarFS archive was stored on a Toshiba PC P300 3 TB hard disk)

In fact, this kind of extremely slow mounting time was consistent with all of the other archives too, even some stored on other drives:

I 16:03:03.018227 file system initialized [709.9ms]
I 16:03:03.849098 file system initialized [1.53s]
I 16:03:33.590178 file system initialized [31.27s]
I 16:03:55.387080 file system initialized [53.07s]
I 16:04:05.065723 file system initialized [62.75s]
I 16:04:29.313615 file system initialized [87s]
I 16:04:31.894768 file system initialized [89.57s]
I 16:04:37.869410 file system initialized [95.55s]
I 16:04:47.800754 file system initialized [105.5s]
I 16:07:38.591166 file system initialized [276.3s]
I 16:10:38.510593 file system initialized [456.2s]
I 16:11:29.503998 file system initialized [507.2s]
I 16:13:05.969340 file system initialized [603.7s]
I 16:26:54.248746 file system initialized [1432s]
I 16:27:27.300756 file system initialized [1465s]
I 16:30:56.814199 file system initialized [1675s]
I 16:45:29.011485 file system initialized [2547s]
I 16:45:40.813002 file system initialized [2558s]
I 16:46:33.700585 file system initialized [2611s]
I 17:12:16.627675 file system initialized [4154s]
I 17:13:47.702170 file system initialized [4245s]
I 17:17:22.057967 file system initialized [4460s]
I 17:27:46.640421 file system initialized [5084s]

I found this in the mkdwarfs documentation:

The metadata has been optimized for very little redundancy and leaving it uncompressed, the default for all levels below 7, has the benefit that it can be mapped to memory and used directly. This improves mount time for large file systems compared to e.g. an lzma compressed metadata block. If you don't care about mount time, you can safely choose lzma compression here, as the data will only have to be decompressed once when mounting the image.

If I'm reading this right, for all compression levels at or above 7, DwarFS takes all of the metadata and decompresses it all at once at mount time. Is there any way this could be improved? Decompressing everything at once seems like kind of a bad idea. I don't want to outright disable metadata compression, as there's kind of a lot of it and I get the feeling it would benefit from being compressed, but these mount times really are excessive - it actually kind of makes SquashFS preferable to DwarFS for the goal of having a compressed read-only filesystem that mounts quickly and is accessible quickly.

I'll admit I don't know much about DwarFS' actual internals, but how about this: what if the metadata was compressed in multiple separate chunks/blocks of a fixed size, and only the blocks that are actually needed get decompressed at any given time? Perhaps this could be made optional, or even the default at level 7 while 8 and 9 could compress the metadata all at once?

I'm not entirely sure about the specifics of how these mount times could be improved, but if level 7 is going to be the default, it should at least optimize the metadata for faster access than this without completely disabling compression, as compressing the metadata probably helps a lot with DwarFS' excellent space efficiency.

Or maybe compressing the metadata isn't worth it? The statement "The metadata has been optimized for very little redundancy" in the documentation seems to imply that compressing it doesn't help that much. Are there any comparisons between uncompressed and compressed metadata? How worthwhile is compressing it, and should it continue to be enabled by default?

Static build fails for dwarfsextract

The static build fails for dwarfsextract due to unresolved references from libarchive.a.
I guess this happens because libarchive from the focal binary package is statically linked against more dependencies than are specified in static_link.sh (as per https://launchpad.net/ubuntu/focal/+source/libarchive).

So the dwarfs static build requires either a custom libarchive.a that matches the supported formats, or a larger set of dependencies (the unresolved BZ2_*, lzma_* and nettle_* symbols come from libbz2, liblzma and libnettle respectively).

[430/431] Linking CXX executable dwarfsextract
FAILED: dwarfsextract
: && /bin/bash /mnt/d/Projects/5.Projects/tebako/deps/src/_dwarfs/cmake/static_link.sh dwarfsextract CMakeFiles/dwarfsextract.dir/src/dwarfsextract.cpp.o && :
...
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_write_set_format_xar.o): in function `compression_code_bzip2':
(.text+0xc5d): undefined reference to `BZ2_bzCompress'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_write_set_format_xar.o): in function `xar_options':
(.text+0x11b1): undefined reference to `lzma_cputhreads'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_write_set_format_xar.o): in function `compression_end_bzip2':
(.text+0x17ec): undefined reference to `BZ2_bzCompressEnd'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_write_set_format_xar.o): in function `xar_compression_init_encoder':
(.text+0x20fa): undefined reference to `BZ2_bzCompressInit'
/usr/bin/ld: (.text+0x2261): undefined reference to `lzma_stream_encoder_mt'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_cryptor.o): in function `aes_ctr_encrypt_counter':
(.text+0x52): undefined reference to `nettle_aes_set_encrypt_key'
/usr/bin/ld: (.text+0x6d): undefined reference to `nettle_aes_encrypt'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_cryptor.o): in function `pbkdf2_sha1':
(.text+0x2a1): undefined reference to `nettle_pbkdf2_hmac_sha1'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_sha512final':
(.text+0x11): undefined reference to `nettle_sha512_digest'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_sha384update':
(.text+0x32): undefined reference to `nettle_sha512_update'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_sha512init':
(.text+0x49): undefined reference to `nettle_sha512_init'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_sha384final':
(.text+0x71): undefined reference to `nettle_sha384_digest'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_sha384init':
(.text+0x89): undefined reference to `nettle_sha384_init'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_sha256final':
(.text+0xb1): undefined reference to `nettle_sha256_digest'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_sha256update':
(.text+0xd2): undefined reference to `nettle_sha256_update'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_sha256init':
(.text+0xe9): undefined reference to `nettle_sha256_init'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_sha1final':
(.text+0x111): undefined reference to `nettle_sha1_digest'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_sha1update':
(.text+0x132): undefined reference to `nettle_sha1_update'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_sha1init':
(.text+0x149): undefined reference to `nettle_sha1_init'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_ripemd160final':
(.text+0x171): undefined reference to `nettle_ripemd160_digest'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_ripemd160update':
(.text+0x192): undefined reference to `nettle_ripemd160_update'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_ripemd160init':
(.text+0x1a9): undefined reference to `nettle_ripemd160_init'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_md5final':
(.text+0x1d1): undefined reference to `nettle_md5_digest'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_md5update':
(.text+0x1f2): undefined reference to `nettle_md5_update'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_md5init':
(.text+0x209): undefined reference to `nettle_md5_init'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_sha512update':
(.text+0x232): undefined reference to `nettle_sha512_update'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_hmac.o): in function `__hmac_sha1_init':
(.text+0x92): undefined reference to `nettle_hmac_sha1_set_key'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_hmac.o): in function `__hmac_sha1_final':
(.text+0x4d): undefined reference to `nettle_hmac_sha1_digest'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_hmac.o): in function `__hmac_sha1_update':
(.text+0x6e): undefined reference to `nettle_hmac_sha1_update'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_write_set_format_7zip.o): in function `compression_code_bzip2':
(.text+0x32d): undefined reference to `BZ2_bzCompress'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_write_set_format_7zip.o): in function `compression_end_bzip2':
(.text+0xc0c): undefined reference to `BZ2_bzCompressEnd'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_write_set_format_7zip.o): in function `_7z_compression_init_encoder':

Ability to mount images by offset

It would be useful (for situations where a dwarfs image is embedded in another file) to be able to mount dwarfs images at a given byte offset. Like:

dwarfs -o offset=123 file mountpoint
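Until such an option exists, the offset of an embedded image could be located by scanning the container for the image's header magic. A minimal sketch; treating b"DWARFS" as the on-disk magic is an assumption here, so verify it against the actual format before relying on it:

```python
def find_image_offset(path: str, magic: bytes = b"DWARFS") -> int:
    """Return the byte offset of the first embedded image in a container file.

    The b"DWARFS" header magic is an assumption for illustration only.
    """
    with open(path, "rb") as f:
        data = f.read()  # fine for small containers; mmap for large ones
    off = data.find(magic)
    if off < 0:
        raise ValueError("no embedded image found in " + path)
    return off
```

The result could then be fed to the proposed option, e.g. dwarfs -o offset=123 file mountpoint, with 123 replaced by the discovered value.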

Enhance `mkdwarfs` to support file permission normalization

Just like --no-owner and --no-time are useful for building "generic" images, it would also be useful to have an option that normalizes the file-system permissions. (At the moment they are taken verbatim.)

Perhaps the easiest solution is the following:

  • add an option like --perms-norm that only cares whether any executability bit is set (be it user, group or others), and thus creates entries like r-x r-x r-x or r-- r-- r--;
  • add another option like --perms-umask that takes an octal value and caps the permissions; for example, --perms-umask 007 would only generate r-x r-x --- or r-- r-- ---;
  • (each option could be used independently of the other;)
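The proposed semantics could be sketched as follows. The option names come from the proposal above; the function itself is hypothetical, not mkdwarfs code:

```python
def normalize_mode(mode: int, umask: int = 0) -> int:
    """Sketch of the proposed --perms-norm / --perms-umask semantics.

    --perms-norm: grant read to user/group/others, plus execute to all
    three if *any* execute bit was set; --perms-umask then masks bits off.
    """
    any_exec = bool(mode & 0o111)          # x set for user, group or others?
    base = 0o444 | (0o111 if any_exec else 0)
    return base & ~umask & 0o777
```

So a 0o755 file becomes r-x r-x r-x (0o555), a 0o600 file becomes r-- r-- r-- (0o444), and applying umask 007 on top strips the "others" bits.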

Bug report: dwarfs fails on tiny window sizes

I'm having a consistent problem with all filesystems created with -W values smaller than 8. When I try to copy the mounted filesystem to another location, or read several files sequentially, the process soon hangs indefinitely, and the process manager shows dwarfs at 0% CPU usage. I'm attaching a small sample file created with the options -S 26 -B 8 -W 4.
I'm using dwarfs v0.5.6-16-g7345578 (FUSE version 35).
