
pete4abw / lrzip-next


Long Range Zip. Updated and Enhanced version of ckolivas' lrzip project. Lots of new features. Better compression. Actively maintained.

Home Page: https://github.com/pete4abw/lrzip-next

License: GNU General Public License v2.0

C 61.75% Shell 1.16% C++ 31.44% Assembly 4.29% Makefile 0.52% M4 0.84%
bzip2 bzip3 c compression gzip lrzip lrzip-fe lrzip-next lzma lzma-sdk lzo rzip zpaq zstd zstd-zstandard-compress-decompress

lrzip-next's Introduction

Peter Hyman's Github

About Me

Program development has mostly been a hobby, not a vocation. I am an expert in C, spent some years with Java, and program and configure databases.

Linux and me

My journey to Linux began with the alternative Windows desktop called DesqView/X by Quarterdeck Systems, then SCO/Xenix, and finally, my distro of choice, Slackware, in the late '90s, which I continue to use and support to this day. I run Windows only when I have to, and then only in a VM.

Professional

I have been a consultant where I used C and designed and configured databases in a variety of industry verticals:

  • Television, Film, Media, and Advertising
  • Recording arts
  • Banking, Insurance, and Financial Services
  • Healthcare and Life Sciences

Some career highlights:

  • Developed one of the first Ratings Analysis Systems for the Cable TV Industry
  • Debugged and reprogrammed a global stock market index database that was providing inaccurate results by 200 basis points
  • Designed Executive Compensation Systems used by a multi-national in 63 countries
  • Designed and Developed a Music Video Licensing System
  • Was Product Manager at an Enterprise Quality Management Software company
  • More recently, have been involved in Regulatory Compliance for Medical Device, Biotech, and Pharma Companies designing and configuring reporting systems for Adverse Events, Device Registrations, CAPA, etc.

Github Activities

lrzip-next is the project I work on the most. It is a detached fork of the lrzip long range data compression program by Con Kolivas. I began contributing to that project in 2007 and eventually, my modifications became too divergent to manage two forks. So, lrzip-next was born in 2019 and continues to this day. Features include:

  • Updated APIs for LZMA and ZPAQ
  • Enhanced and more readable print and info output
  • Updated x86 ASM routines for LZMA
  • Additions of new compression backends, bzip3 and zstd

See the FEATURES file for more info.

bitpacker is a C project that packs 7-bit ASCII strings into 8-bit bytes. This is accomplished by successively shifting bytes one bit at a time, rotating through the array of bytes. The result is a string obfuscated with unprintable characters. While this is not encryption, the result can be made printable by using Base64 encoding. So a string like password becomes unreadable, and with Base64 it becomes printable again and can be used. It is useful because you control what the password will be, yet it is still packed and then obfuscated. Even if someone decoded the Base64 string, the password would still not be readable. This was really a small mental exercise for an old-school programmer!

$ ./bitp p password
Base64 encoding of password is 4YefPvv5ZA==

$ ./bitp u 4YefPvv5ZA==
base64 Decoded packed password is: password
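
For illustration, here is a minimal C sketch of packing 7-bit ASCII into dense bytes. It is not the actual bitpacker source (which rotates bits through the whole array as described above), and the function names are made up for the example.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Pack 7-bit ASCII characters into a dense byte stream; returns the packed
 * length (ceil(7*n/8) bytes, so "password" packs 8 chars into 7 bytes). */
static size_t pack7(const char *in, size_t n, uint8_t *out)
{
	uint32_t acc = 0;	/* bit accumulator */
	int bits = 0;		/* valid bits currently in acc */
	size_t i, o = 0;

	for (i = 0; i < n; i++) {
		acc = (acc << 7) | (uint32_t)(in[i] & 0x7F);	/* append 7 bits */
		bits += 7;
		while (bits >= 8) {				/* flush whole bytes */
			bits -= 8;
			out[o++] = (uint8_t)(acc >> bits);
		}
	}
	if (bits)						/* left-align the tail bits */
		out[o++] = (uint8_t)(acc << (8 - bits));
	return o;
}

int main(void)
{
	uint8_t packed[32];
	size_t len = pack7("password", strlen("password"), packed);

	printf("%zu chars packed into %zu bytes\n", strlen("password"), len);
	return 0;
}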

makesbld is a shell program to create Slackware build files. Similar to Gentoo ebuilds, it uses a library of shell functions to systematize package automation.

Kernel-Install is a shell script to automate upgrading and installing kernels.

lrzip-fe is a shell script using Dialog to launch lrzip-next with all options menu-selectable (although this project is a little behind right now).

Other projects are also available for perusal but are not as actively maintained. ps2lrz is an interesting one because it can rewrite lrzip and lrzip-next headers or just decode them. Interesting stuff!

Personal

I majored in Piano Performance and minored in Music Theory at Oberlin Conservatory of Music. Earned a Masters in Cable Communications at New York University. I enjoy travel. A lot.

Contact

I may be reached through the Issues and Discussions tabs on any repo, or by email.

lrzip-next's People

Contributors

areading, ckolivas, cspiegel, danieldjewell, emallickhossain, ghost, haneefmubarak, ib, irrequietus, jaalto, kata198, lr4d, maeyanie, orthographic-pedant, patterner, pete4abw, zetok


lrzip-next's Issues

:lady_beetle: [Outputted text during decompression of a ZPAQ compressed file overlaps other text]

lrzip-next Version

lrzip version 0.8.8

lrzip-next command line

lrzip -U -T -z -p 1 -L 9 -vv

What happened?

NOTE: lrzip-next is symlinked as lrzip.

When I decompress any ZPAQ compressed file I get...

Validating file for consZPAQncy.1:30% pressing...

The main issue is easily spotted. Further along, "ZPAQ: ..%" will appear in the middle of, or somewhere between, the "1: ..% 2: ..%" progress entries, and so on.

There's another issue related to this, except it happens when decompressing with multiple threads. I had a file that would show this, but I deleted it as I'm reworking the file. Once I have output showing the issue I will add it here, so please don't close. I will try to get that bit added ASAP. I assume it's reproducible? My file was relatively big, and the original data was 50GB+, so basically it happens when using the maximum number of threads.

What was expected behavior?

ZPAQ text should be on the right side of the validation text, or below.

Steps to reproduce

I usually see this with ZPAQ only; I have not gotten this issue with LZMA (I have not used any other algorithm).

Relevant log output

No response

Please provide system details

OS Distro:
Ubuntu (Server)

Kernel Version (uname -a):
Linux server 5.4.0-91-generic #102-Ubuntu SMP Fri Nov 5 16:31:28 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

System ram (free -h): 8GB RAM, 4GB Swap (zram not configured)

Additional Context

No response
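
On the overlap itself: a common way to keep progress lines from different threads from interleaving is to funnel all terminal writes through one mutex. The following is only a minimal illustrative sketch, not lrzip-next's actual output code; the lock and function names are made up.

#include <pthread.h>
#include <stdio.h>

/* Illustrative only: serialize progress output so a "ZPAQ: nn%" line from one
 * thread cannot land in the middle of the "1: nn% 2: nn%" line of another. */
static pthread_mutex_t output_lock = PTHREAD_MUTEX_INITIALIZER;

static void progress_line(const char *tag, int pct)
{
	pthread_mutex_lock(&output_lock);
	fprintf(stderr, "\r%s: %3d%%", tag, pct);	/* one writer at a time */
	fflush(stderr);
	pthread_mutex_unlock(&output_lock);
}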

:lady_beetle: ZPAQ Compression/Decompression Broken

lrzip-next Version

0.7.63+ Maybe earlier

lrzip-next command line

lrzip-next -d | -t file.tar.lrz

What happened?

Decompressing a ZPAQ lrzip-next archive fails with an MD5 mismatch. Stored MD5s are OK. The decompressed file is broken. This came out of the blue and is a new issue.

What was expected behavior?

MD5s should match. Stored MD5s are correct.

Steps to reproduce

Create a ZPAQ lrzip-next archive
Decompress or Test

Relevant log output

Detected lrzip version 0.7 file.
100%    1363.42 /   1363.42 MB  1:100%  2:100%  3:100%  4:100%  5:100%  6:100%  7:100%  8:100%  9:100%  
Average DeCompression Speed: 19.471MB/s

MD5 CHECK FAILED.
Stored:134a2942867678fa1c3d4284c8b738b2
Output file:901042f7a62b3b7493fc3e885c60af47
Fatal error - exiting

Please provide system details

Slackware-x86_64 current
Kernel Version (uname -a): 5.13.2+
System ram (free -h): $ free -h
total used free shared buff/cache available
Mem: 15Gi 895Mi 11Gi 341Mi 3.0Gi 13Gi
Swap: 15Gi 654Mi 15Gi

Additional Context

Trying to pin down when this broke. ZPAQ code has not changed for a while. Made some changes to the temp buffer prior to writing out.

lrzip-next 0.8.8 - autogen fails

with macOSX 12.3 (developer beta), autogen fails with the following:

zsh: ./autogen.sh: bad interpreter: /bin/env: no such file or directory

I do know for a fact that it did work before, so there is a regression somewhere.

Bring liblrzip up to date

With the enhancements in version 0.7x of lrzip, several options are omitted from the lrzip library, liblrzip; in particular, pre-compression filtering and the ability to override default dictionary sizes. The following options need to be accounted for:

--x86
--arm
--armt
--ppc
--sparc
--ia64
--delta [1..32]

and

--dictsize 12-30 (expressed as 2^ds)

:lady_beetle: Regression test fails with lrzip-next

lrzip-next Version

whats-next branch post 0.8.10

lrzip-next command line

test/regressiontest.sh

What happened?

Since lrzip-next uses file validation methods AND changes have been made to the failure methods, the regression test output is wrong and the program fails.

What was expected behavior?

Regression test should pass

Steps to reproduce

  1. run test/regressiontest.sh

Relevant log output

> MD5:bf44bc2b3112da6502aecd17955cda5f
> MD5:bf44bc2b3112da6502aecd17955cda5f
> MD5:bf44bc2b3112da6502aecd17955cda5f
> MD5:bf44bc2b3112da6502aecd17955cda5f
> MD5:06facaf41e7ea1fdae6ae3c6f321a57d
48c287,289
< OK
---
> Output filename is: testfile.lrz
> Invalid chunk data
> Fatal error - exiting


Please provide system details

OS Distro: Slackware 64 current
Kernel Version (uname -a): 5.16.14
System ram (free -h): 16GB


Additional Context

New tests will be needed.
ETA unknown.
LRZIP environment variable must be set to NOCONFIG
or
no lrzip.conf file should exist.

Large chunks cause segfault in final MD5 computation

In testing larger files at a very low compression level, a bug was uncovered when finalizing the MD5 computation. It occurs when the chunk size is greater than the per-thread buffer size. Notice in this example the chunk size is 9GB+, but the per-thread buffer size is ~300MB. This could result in a segfault or, at worst, an incorrect MD5 computation! (A sketch of the incremental-hashing idea follows the log below.)

$ lrzip-next -vvf -L1 root.tar
Using configuration file /home/peter/.lrzip/lrzip.conf
The following options are in effect for this COMPRESSION.
Threading is ENABLED. Number of CPUs detected: 8
Detected 16557715456 bytes ram
Nice Value: 19
Show Progress
Max Verbose
Overwrite Files
Temporary Directory set as: ./
Compression mode is: LZMA. LZ4 Compressibility testing enabled
Compression level 1
RZIP Compression level 1
Initial LZMA Dictionary Size: 65536
Heuristically Computed Compression Window: 105 = 10500MB
Storage time in seconds 1376342898
Output filename is: root.tar.lrz
Warning, unable to set owner on root.tar.lrz
File size: 9204787200
Enabling sliding mmap mode and using mmap of 5519237120 bytes with window of 9204787200 bytes
Succeeded in testing 5519237120 sized mmap for rzip pre-processing
Will take 1 pass
Chunk size: 9204787200
Byte width: 5
Per Thread Memory Overhead is 7061504
Succeeded in testing 2823172778 sized malloc for back end compression
Using up to 9 threads to compress up to 306624361 bytes each.
Beginning rzip pre-processing phase
hashsize = 131072.  bits = 17. 2MB
....
Starting lzma back end compression thread 7...
Total: 99%  Chunk: 99%
Starting thread 8 to compress 306624361 bytes from stream 1
lz4 testing OK for chunk 306624361. Compressed size = 77.03% of test size 10485760, 1 Passes
Starting lzma back end compression thread 8...
87380 total hashes -- 2 in primary bucket (0.002%)
Malloced 5519237120 for checksum ckbuf
Compthread 3 seeking to 1960824561 to store length 5
Compthread 3 seeking to 2045033667 to write header
Thread 3 writing 178302266 compressed bytes from stream 1
Compthread 3 writing data at 2045033683
Compthread 4 seeking to 2045033678 to store length 5
Compthread 4 seeking to 2223335949 to write header
Thread 4 writing 80158001 compressed bytes from stream 1
Compthread 4 writing data at 2223335965
Compthread 5 seeking to 2223335960 to store length 5
Compthread 5 seeking to 2303493966 to write header
Thread 5 writing 108708283 compressed bytes from stream 1
Compthread 5 writing data at 2303493982
Compthread 6 seeking to 2303493977 to store length 5
Compthread 6 seeking to 2412202265 to write header
Thread 6 writing 102632830 compressed bytes from stream 1
Compthread 6 writing data at 2412202281
Compthread 7 seeking to 2412202276 to store length 5
Compthread 7 seeking to 2514835111 to write header
Thread 7 writing 95655657 compressed bytes from stream 1
Compthread 7 writing data at 2514835127
Segmentation fault
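
Regarding the MD5 finalization above: the general shape of a fix is to feed the digest incrementally in bounded slices rather than assuming the whole chunk fits in a single per-thread buffer. A hedged sketch using libgcrypt's gcry_md API; the slice size and names are illustrative, not the actual lrzip-next variables.

#include <gcrypt.h>
#include <stdint.h>
#include <string.h>

/* Illustrative: hash a chunk in bounded slices so the MD5 never depends on a
 * single allocation as large as the chunk itself. */
static void md5_chunk(const unsigned char *chunk, int64_t chunk_len,
		      unsigned char digest[16])
{
	const int64_t slice = 64 * 1024 * 1024;	/* 64MB per update */
	gcry_md_hd_t hd;
	int64_t off, n;

	gcry_md_open(&hd, GCRY_MD_MD5, 0);
	for (off = 0; off < chunk_len; off += slice) {
		n = chunk_len - off < slice ? chunk_len - off : slice;
		gcry_md_write(hd, chunk + off, (size_t)n);
	}
	memcpy(digest, gcry_md_read(hd, GCRY_MD_MD5), 16);
	gcry_md_close(&hd);
}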

libzpaq detects macOS as Windows

It seems that the #ifdef in libzpaq.cpp is not extensive enough to work on macOS compilers.

libzpaq.cpp#L31 has the #ifdef and here is an article I found about the issue: http://nadeausoftware.com/articles/2012/01/c_c_tip_how_use_compiler_predefined_macros_detect_operating_system#UNIX
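
For reference, a minimal sketch of a platform guard that only treats genuine Windows toolchains as Windows and lets macOS fall through to the unix path; this is illustrative, not the exact libzpaq change, and the ZPAQ_WINDOWS/ZPAQ_UNIX macro names are made up.

/* Only real Windows compilers define _WIN32/_WIN64; macOS defines __APPLE__
 * and __MACH__, so it should take the unix branch, not VirtualAlloc. */
#if defined(_WIN32) || defined(_WIN64)
#  define ZPAQ_WINDOWS 1
#elif defined(__APPLE__) || defined(__unix__) || defined(__linux__)
#  define ZPAQ_UNIX 1
#else
#  error "Unknown platform"
#endif

#ifdef ZPAQ_WINDOWS
#  include <windows.h>	/* VirtualAlloc / VirtualFree and the MEM_* macros */
#else
#  include <sys/mman.h>	/* mmap / munmap for executable memory */
#endif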


After resolving this, make still fails with the following error:

/bin/sh ../../../libtool  --tag=CXX   --mode=compile clang++ -DHAVE_CONFIG_H -I. -I../../..     -g -O2  -O3 -c -o libzpaq.lo libzpaq.cpp
libtool: compile:  clang++ -DHAVE_CONFIG_H -I. -I../../.. -g -O2 -O3 -c libzpaq.cpp  -fno-common -DPIC -o .libs/libzpaq.o
libzpaq.cpp:87:25: error: use of undeclared identifier 'MEM_RELEASE'
      VirtualFree(p, 0, MEM_RELEASE);
                        ^
libzpaq.cpp:98:37: error: use of undeclared identifier 'MEM_RESERVE'
    p=(U8*)VirtualAlloc(0, newsize, MEM_RESERVE|MEM_COMMIT,
                                    ^
libzpaq.cpp:98:49: error: use of undeclared identifier 'MEM_COMMIT'
    p=(U8*)VirtualAlloc(0, newsize, MEM_RESERVE|MEM_COMMIT,
                                                ^
libzpaq.cpp:99:25: error: use of undeclared identifier 'PAGE_EXECUTE_READWRITE'
                        PAGE_EXECUTE_READWRITE);
                        ^
4 errors generated.
make[2]: *** [libzpaq.lo] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2

:lady_beetle: Compression ratio incorrect when using tar -I

lrzip-next Version

0.8.12-2-78272f8

lrzip-next command line

tar -I lrzip-next -cf /tmp/gc.tar.lrz .

What happened?

When using tar -I lrzip-next ..., the computed compression ratio and bpb are incorrect.

Compression Ratio: 95.108. bpb: 0.084. Average Compression Speed: 21.150MB/s.
Total time: 00:00:19.93

$ lrzip-next -vvi /tmp/gc.tar.lrz
...
Decompressed file size: 443,729,920
Compressed file size: 111,626,277
Compression ratio: 3.975x, bpb: 2.013

What was expected behavior?

Either show correct ratio and bpb or show N/A. Since tar -I lrzip-next pipes data to lrzip-next and lrzip-next pipes data to the tar.lrz file, the input size should be accumulated properly or not at all.

Steps to reproduce

Run tar -I lrzip-next with some verbosity followed by lrzip-next -i on the resulting file.

Relevant log output

No response

Please provide system details

OS Distro: SlackWare 15+
Kernel Version (uname -a): 5.17.5
System ram (free -h): 16G

Additional Context

I thought I had this problem nailed down. Still an open item.

Allow multiple files in lrzip

A long-missed feature of lrzip is the ability to compress more than one file at a time. While tar --use-compress-program=lrzip can work, the compression window gets reduced as a result. lrzip accepts wildcards for compression or decompression, but each file is compressed individually. A better way would be to construct a list of the selected files, whether from wildcards or multiple file entries on the command line, and add that list of files to an lrz file, chaining them one by one just as stream blocks are chained.
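
Purely to illustrate the chaining idea, here is a hypothetical C layout for one member of a multi-file archive; nothing like this exists in the lrz format today, and the names and field sizes are invented.

#include <stdint.h>

/* Hypothetical on-disk entry for a multi-file lrz archive: each entry names
 * one member and points at the next, mirroring how stream blocks are chained. */
struct lrz_file_entry {
	uint16_t name_len;	/* length of the file name stored inline after these fields */
	uint64_t orig_size;	/* uncompressed size of this member */
	uint64_t next_offset;	/* offset of the next entry; 0 marks the last one */
};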

:lady_beetle: BZIP3 Test | Decompression fails with segfault

lrzip-next Version

0.9.2+ (bzip3_poc branch)

lrzip-next command line

lrzip-next -t | d file.lrz

What happened?

When compressing with -B, the file compresses fine but fails on test or decompression.

What was expected behavior?

File passes test or decompresses

Steps to reproduce

Obvious.

Relevant log output

$ lrzip-next -vvt file.lrz
Using configuration file /home/peter/.lrzip/lrzip.conf
The following options are in effect for this INTEGRITY TEST.
Threading is ENABLED. Number of CPUs detected: 8
Detected 16,558,223,360 bytes ram
Nice Value: 19
Show Progress
Max Verbose
Test file integrity
Temporary Directory set as: /tmp/
Malloced 5,519,405,056 for tmp_outbuf
Detected lrzip version 0.9 file.
SHA256 being used for integrity testing.
Validating file for consistency...[OK]
Detected lrzip version 0.9 file.
Decompressing...
Reading chunk_bytes at 20
Expected size: 2,063,063,040
Chunk byte width: 4
Reading eof flag at 21
EOF: 1
Reading expected chunksize at 22
Chunk size: 2,063,063,040
Reading stream 0 header at 27
Reading stream 1 header at 40
Reading ucomp header at 180,964,164
Fill_buffer stream 0 c_len 7,163,384 u_len 14,178,163 last_head 0
Starting thread 0 to decompress 7,163,384 bytes from stream 0
Thread 0 decompressed 14,178,163 bytes from stream 0
Taking decompressed data from thread 0
Reading ucomp header at 53
Fill_buffer stream 1 c_len 6,481,206 u_len 33,550,336 last_head 6,481,245
Starting thread 1 to decompress 6,481,206 bytes from stream 1
Reading ucomp header at 6,481,272
Fill_buffer stream 1 c_len 9,253,807 u_len 33,550,336 last_head 15,735,065
Starting thread 2 to decompress 9,253,807 bytes from stream 1
Reading ucomp header at 15,735,092
Fill_buffer stream 1 c_len 6,955,531 u_len 33,550,336 last_head 22,690,609
Starting thread 3 to decompress 6,955,531 bytes from stream 1
Reading ucomp header at 22,690,636
Fill_buffer stream 1 c_len 6,890,209 u_len 33,550,336 last_head 29,580,831
Starting thread 4 to decompress 6,890,209 bytes from stream 1
Reading ucomp header at 29,580,858
Fill_buffer stream 1 c_len 5,434,365 u_len 33,550,336 last_head 35,015,209
Starting thread 5 to decompress 5,434,365 bytes from stream 1
Reading ucomp header at 35,015,236
Fill_buffer stream 1 c_len 5,287,005 u_len 33,550,336 last_head 40,302,227
Starting thread 6 to decompress 5,287,005 bytes from stream 1
Reading ucomp header at 40,302,254
Fill_buffer stream 1 c_len 6,031,949 u_len 33,550,336 last_head 46,334,189
Starting thread 7 to decompress 6,031,949 bytes from stream 1
Reading ucomp header at 46,334,216
Fill_buffer stream 1 c_len 6,107,371 u_len 33,550,336 last_head 52,441,573
Starting thread 8 to decompress 6,107,371 bytes from stream 1
Reading ucomp header at 52,441,600
Fill_buffer stream 1 c_len 5,147,508 u_len 33,550,336 last_head 57,589,094
Starting thread 9 to decompress 5,147,508 bytes from stream 1
Segmentation fault


Please provide system details

OS Distro: Slackware x64 Current
Kernel Version (uname -a): 6.0.0
System ram (free -h): 
           total        used        free      shared  buff/cache   available

Mem: 15Gi 1.4Gi 4.0Gi 477Mi 10Gi 13Gi
Swap: 15Gi 34Mi 15Gi



Additional Context

Includes contribution by Kamila Szewczyk (@kspalaiologos) for bzip3 integration.

ANN: It's Here! Multiple Hash Algos and AES256 Encryption

A new branch called WhatsNext will be uploaded soon. It contains a LOT of changes that include the ability to select from 13 different hash algorithms!

  • MD5
  • RIPEMD
  • SHA256
  • SHA384
  • SHA512
  • SHA3_256
  • SHA3_512
  • SHAKE128_16 (16 Byte output XOF Function)
  • SHAKE128_32 (32 Byte output)
  • SHAKE128_64 (64 Byte output)
  • SHAKE256_16 (16 Byte output)
  • SHAKE256_32 (32 Byte output)
  • SHAKE256_64 (64 Byte output)

Encryption options now include

  • AES128 (currently used)
  • AES256

Hashes can be selected using the -H# option and the encryptions can be selected with -E#.

Verbose output has been updated. Check for changes often as this is a work in progress. Not in the main branch.

This branch's encryption method will be INCOMPATIBLE with any past encrypted archives.

Any plans to release a pre-compiled binary?

You were helping me in the main LRzip repo.

I found this when looking at your account and it looks real interesting.

Any plans to release a pre-compiled version (or even better, a dockerized version combined with lrzip-fe)?

I have no idea where to start with compiling; I have not done it in years, and back then it was a clear case of failing a few dozen times without a clue why it finally worked, then deciding it was just not worth the hassle, lol.

Would this version net me any real-world gains in compression / speed for my 2.5TB file using LZMA compression?

lrzip-fe also looks FANTASTIC! While I will happily run scripts and such from CLI, give me a GUI for anything that I don't do repeatedly any day of the week.

Thanks for the help BTW. I love learning.

Should we get rid of bzip and gzip?

lzo is fastest
zpaq is slowest but could have best compression
lzma is what lrzip-next is all about

Question is: does anyone use bzip2 or gzip compression with lrzip-next? If not, why keep it?

Thoughts?

BZIP3 Wrapper functions and state locking not necessary

Testing the removal of all the BZIP3 wrapper functions and variables in stream.c. Threads are themselves locked when calling the compression/decompression functions, so locking all threads and bzip3 variables a second time is unnecessary and a waste of memory. Calling bz3_new and bz3_free, and calling bz3_encode_block and bz3_decode_block from within the bz3 compression/decompression functions, is enough. @kspalaiologos, interested in your review.

Changes pushed to bzip3_poc tree.
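
A minimal sketch of the direct-call approach described above, using the bzip3 calls named there (bz3_new, bz3_encode_block, bz3_free); the wrapper name and error handling are illustrative.

#include <libbz3.h>
#include <stdint.h>

/* Illustrative per-thread compression call: the calling thread is already
 * locked, and the bz3_state is private to the call, so no extra locking or
 * shared bzip3 variables are needed. */
static int32_t bzip3_compress_block(uint8_t *buf, int32_t len, int32_t block_size)
{
	struct bz3_state *state = bz3_new(block_size);
	int32_t c_len;

	if (!state)
		return -1;
	/* buf must be able to hold at least bz3_bound(len) bytes. */
	c_len = bz3_encode_block(state, buf, len);
	bz3_free(state);
	return c_len;
}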

lrzip-next -i should work even if file is encrypted

While there are some difficulties currently in dealing with piped encrypted lrz files, there should be no reason why the command

lrzip-next -i file.lrz

should not work, even if the file is encrypted.

Asking for the password, or accepting it on the command line (-efoo or --encrypt foo), should work equally well for decryption.

Something to look at...

For consideration. Consider adding a comments field to be stored.

This would mean a header change, but I am considering adding a comments field. Maybe a max of 64 bytes, but it could contain descriptive information about the compressed file or the command-line options used. The field length would be variable and maybe even bit-packed to save more space. Just a thought for now. Comments?
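
As a rough sketch of the idea (the helper name and layout are purely illustrative, not a proposed format): a one-byte length followed by up to 64 bytes of comment text appended to the header.

#include <stdint.h>
#include <string.h>

#define MAX_COMMENT 64	/* the 64-byte cap mentioned above */

/* Hypothetical: append a length-prefixed comment to a header buffer and
 * return the number of bytes written; a zero length byte means "no comment". */
static int put_comment(uint8_t *hdr, const char *comment)
{
	uint8_t len = 0;

	if (comment) {
		size_t l = strlen(comment);
		len = (uint8_t)(l > MAX_COMMENT ? MAX_COMMENT : l);
	}
	hdr[0] = len;
	if (len)
		memcpy(hdr + 1, comment, len);
	return 1 + len;
}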

:lady_beetle: tar -I lrzip-next -x|t will fail (Compression works fine)

lrzip-next Version

0.9.0

lrzip-next command line

tar -I lrzip-next -tvvf file.tar.lrz

What happened?

tar -I lrzip-next fails when extracting or testing (listing) files; it returns an error due to faulty piping.

$ tar -I lrzip-next -tvvf lrzip-next.tar.lrz
Using configuration file /home/peter/.lrzip/lrzip.conf
The following options are in effect for this DECOMPRESSION.
Threading is ENABLED. Number of CPUs detected: 8
Detected 16,558,071,808 bytes ram
Nice Value: 19
Show Progress
Verbose
Temporary Directory set as: ./
Unknown hash, falling back to CRC
Outputting to stdout.
Detected lrzip version 0.9 file.
(null) being used for integrity testing.
tar: Child died with signal 11
tar: Error is not recoverable: exiting now

What was expected behavior?

File extraction or listing would succeed

Steps to reproduce

Execute command as above.

Relevant log output

No response

Please provide system details

OS Distro: Slackware
Kernel Version (uname -a): 5.18.10
System ram (free -h):

$ free -h
               total        used        free      shared  buff/cache   available
Mem:            15Gi       1.3Gi        12Gi       421Mi       2.0Gi        13Gi
Swap:           15Gi          0B        15Gi

Additional Context

Note to self: review lrzip.c in the decompress_file function to see why TEST_ONLY and STDOUT are missing the mark.

In the meantime (ITMT), output lrzip-next to tar like this:

$ lrzip-next -d -o - lrzip-next.tar.lrz | tar -tvvf -
Using configuration file /home/peter/.lrzip/lrzip.conf
The following options are in effect for this DECOMPRESSION.
Threading is ENABLED. Number of CPUs detected: 8
Detected 16,558,071,808 bytes ram
Nice Value: 19
Show Progress
Verbose
Output Filename Specified: -
Temporary Directory set as: ./
Outputting to stdout.
Detected lrzip version 0.9 file.
SHA256 being used for integrity testing.
Validating file for consistency...[OK]
Detected lrzip version 0.9 file.
drwxr-xr-x peter/users       0 2022-06-22 08:17 lrzip-next/
-rw-r--r-- peter/users    4112 2021-03-05 06:31 lrzip-next/README-NOT-BACKWARD-COMPATIBLE
-rw-r--r-- peter/users     431 2021-03-05 06:31 lrzip-next/.gitignore
drwxr-xr-x peter/users       0 2022-06-21 11:16 lrzip-next/man/
-rw-r--r-- peter/users    5010 2022-06-21 09:03 lrzip-next/man/lrznunzip.1

Unable to compile on macOS Ventura

lrzip-next Version

latest master branch

lrzip-next command line

Not applicable

What happened?

In order to get lrzip-next to work, I read in a previous issue (by @demhademha) that at least version 5 of bash is required.
I modified autogen.sh and gitdesc.sh to use /opt/homebrew/bin/bash

What was expected behavior?

Although configure works, a multitude of errors is produced by make due to the missing version information, exactly the same as in the previous issue.

Steps to reproduce

Mentioned above

Relevant log output

Running autoreconf -if...
fatal: No names found, cannot describe anything.
./util/gitdesc.sh: line 63: [: -gt: unary operator expected
[the two lines above repeat many times]
glibtoolize: putting auxiliary files in '.'.
glibtoolize: copying file './ltmain.sh'
glibtoolize: putting macros in AC_CONFIG_MACRO_DIRS, 'm4'.
glibtoolize: copying file 'm4/libtool.m4'
glibtoolize: copying file 'm4/ltoptions.m4'
glibtoolize: copying file 'm4/ltsugar.m4'
glibtoolize: copying file 'm4/ltversion.m4'
glibtoolize: copying file 'm4/lt~obsolete.m4'
fatal: No names found, cannot describe anything.
./util/gitdesc.sh: line 63: [: -gt: unary operator expected
[the two lines above repeat many times]
configure.ac:18: installing './compile'
configure.ac:17: installing './missing'
src/Makefile.am: installing './depcomp'

Please provide system details

Apple Silicon M1, Mac running macOS Ventura, 13.1

Additional Context

Not applicable

Determine Optimum Delta Offset

Let's come up with a way to determine which delta offset might be the best fit for compression. I'm thinking of using a method similar to the lzo test in lrzip. It would only have to be done once and iterate from, say, 1-16. Right now the delta offset defaults to 1, but it can extend all the way to 256 as described in the docs. See this post for a more detailed delta analysis.
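
A hedged sketch of such a test loop, assuming an LZ4 quick-compressibility probe similar in spirit to the existing lz4 test; the function name, sample handling, and 16-offset cap are illustrative.

#include <lz4.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative: delta-filter a sample of the input at each candidate offset
 * and keep the offset whose filtered output LZ4-compresses the smallest. */
static int best_delta_offset(const uint8_t *sample, int len)
{
	int off, i, csize, best_off = 1, best_size = INT_MAX;
	uint8_t *tmp = malloc(len);
	char *cbuf = malloc(LZ4_compressBound(len));

	if (!tmp || !cbuf)
		goto out;	/* error handling kept minimal for the sketch */
	for (off = 1; off <= 16; off++) {
		memcpy(tmp, sample, off);	/* first `off' bytes pass through */
		for (i = off; i < len; i++)
			tmp[i] = sample[i] - sample[i - off];
		csize = LZ4_compress_default((const char *)tmp, cbuf, len,
					     LZ4_compressBound(len));
		if (csize > 0 && csize < best_size) {
			best_size = csize;
			best_off = off;
		}
	}
out:
	free(tmp);
	free(cbuf);
	return best_off;
}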

windows x64 binary

lrzip-next Version

latest version

Feature Suggestion

Could you make a binary version for Windows? I don't know how to compile code.

Steps to reproduce

No response

Relevant log output

No response

Please provide system details

OS Distro:
Kernel Version (uname -a):
System ram (free -h):

Additional Context

No response

Use libgcrypt instead of standalone AES and SHA code

I realize that libgcrypt recently had a big bug in 1.9.0, but I would like to look at using its code to replace AES and SHA code snippets. Comments?

EDIT: libgcrypt can also be used for MD5 and CRC. This will eliminate the need for multiple source files and provide a generalized interface for all hash/encryption activities.

EDIT: Issues. Compatibility with existing archives. For example, cannot use SHA3 or alternate hash because archive hashes are fixed. Decrypt will be impossible.

BUT otoh using a generalized library will allow for user-selected hash and encryption.

This will take some work and lots of investigation.
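
A hedged sketch of what the generalized interface could look like with libgcrypt; the function names, key/IV sizes, and algorithm constants are illustrative choices, not a compatibility-checked design.

#include <gcrypt.h>
#include <stddef.h>

/* Illustrative: one call path for hashing and one for encryption, replacing
 * the standalone MD5/SHA/AES source files with libgcrypt. */
static void hash_buffer(int algo, const void *buf, size_t len, unsigned char *out)
{
	gcry_md_hash_buffer(algo, out, buf, len);	/* e.g. algo = GCRY_MD_SHA256 */
}

static gcry_error_t encrypt_buffer(const unsigned char key[32],
				   const unsigned char iv[16],
				   unsigned char *buf, size_t len)
{
	gcry_cipher_hd_t hd;
	gcry_error_t err;

	/* len is assumed already padded to the 16-byte AES block size. */
	err = gcry_cipher_open(&hd, GCRY_CIPHER_AES256, GCRY_CIPHER_MODE_CBC, 0);
	if (err)
		return err;
	gcry_cipher_setkey(hd, key, 32);
	gcry_cipher_setiv(hd, iv, 16);
	err = gcry_cipher_encrypt(hd, buf, len, NULL, 0);	/* in-place */
	gcry_cipher_close(hd);
	return err;
}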

:bulb: "Split" archive support.

lrzip-next Version

lrzip version 0.8.8-3-9f2257d

Feature Suggestion

(I have proposed it on ZPAQ forums and got rejected.)

After deduplication and before compression, instead of outputting a single file, it outputs a "metadata-only" file (data.lrz.m) and a directory (data.lrz.d) of blocks (with the filenames describing the block number and state (compressed/uncompressed/incompressible), like 0-12345_Uncompressed.dat).
When needed, do a stream compression of each block and delete the uncompressed ones.
This is similar to RAR volumes but the block size is not fixed and incompressible data is taken care of.

The reason:

On large (more than a few gibibytes) datasets, compression would take a long time to complete (especially when using ZPAQ). Currently, it is nearly impossible to pause or resume this process, let alone keep it running between reboots.
With this, after deduplication and filtering, you can just compress the blocks separately. If you have to reboot, only the blocks that are currently compressing have to be restarted, instead of the whole dataset.
Also, since the blocks are standalone, you can compress them in different locations and on different computers, with different methods, making storage and backups easier.

An example:

To compress:

$ lrzip-next --split-archive -n -o ./output/data.lrz.m ./data
$ lrzip-next --split-archive --lzma ./output/data.lrz.m

(The second command should ignore missing blocks since they may be elsewhere.)

To extract:

$ lrzip-next --split-archive -d -o ./data ./output/data.lrz.m

Steps to reproduce

No response

Relevant log output

No response

Please provide system details

OS Distro: Debian Sid
Kernel Version (uname -a): Linux localhost 5.17.3 #6 SMP PREEMPT Tue Apr 19 00:23:50 +08 2022 x86_64 GNU/Linux
System ram (free -h):

               total        used        free      shared  buff/cache   available
Mem:           125Gi       1.5Gi       122Gi       7.1Mi       1.2Gi       123Gi
Swap:             0B          0B          0B

Additional Context

No response

STDOUT storing expected filesize fails when #chunks >1

In stream.c, a magic header is written once, after all streams in a chunk have been processed by a backend compressor. At that point control->st_size is written along with the magic header, but it only reflects the initial chunk size. Subsequent chunk sizes are not written to the magic header. All is well when there is one chunk. But multiple chunks? Zero gets written.

	if (!ctis->chunks++) {
		int j;

		if (TMP_OUTBUF) {
			lock_mutex(control, &control->control_lock);
			if (!control->magic_written)
				write_magic(control);
			unlock_mutex(control, &control->control_lock);

			if (unlikely(!flush_tmpoutbuf(control))) {
				print_err("Failed to flush_tmpoutbuf in compthread\n");
				goto error;
			}
		}

It's clear we can store the expected size of a STDOUT file. The question is how to do so for larger files exceeding one chunk.
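
One loosely sketched direction, assuming the expected-size field could be deferred until the final chunk is known (names are hypothetical, and this ignores the real difficulty that the header has already gone out the pipe):

#include <stdint.h>

/* Hypothetical running total: each chunk adds its size, and the value is only
 * the true expected size once the EOF chunk has been accounted for. */
struct size_accum {
	int64_t total;		/* bytes seen so far across all chunks */
	int complete;		/* set once the final chunk has been added */
};

static void account_chunk(struct size_accum *a, int64_t chunk_size, int is_last)
{
	a->total += chunk_size;
	if (is_last)
		a->complete = 1;	/* only now may total be written as st_size */
}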

lrzuntar fails on m1 macOS

lrzip-next version: 0.8.9
command used: lrzuntar -vv *.lrz
output:

The following options are in effect for this DECOMPRESSION.
Threading is ENABLED. Number of CPUs detected: 8
Detected 8,589,934,592 bytes ram
Nice Value: 19
Show Progress
Max Verbose
Overwrite Files
Temporary Directory set as: /var/folders/f1/lghqjk9d30g0g7c1bbs438ww0000gn/T/
Output filename is: usr.tar

Malloced 2,863,300,608 for tmp_outbuf
Detected lrzip version 0.8 file.
MD5 being used for integrity testing.
Validating file for consistency...[OK]
Detected lrzip version 0.8 file.
Decompressing...
Reading chunk_bytes at 18
Expected size: 563,200
Chunk byte width: 3
Reading eof flag at 19
EOF: 1
Reading expected chunksize at 20
Chunk size: 563,200
Reading stream 0 header at 24
Reading stream 1 header at 34
Reading ucomp header at 44
Fill_buffer stream 0 c_len 6,790 u_len 17,968 last_head 0

Note that lrzip-next -d *.lrz does not work either.
regards

LzmaDecOpt.asm segfaults in copy_match

See FIXME in LzmaDecOpt.asm file.

Thread 2 "lrzip" hit Breakpoint 1, 0x0000000000459840 in _LzmaDec_DecodeReal_3 ()
(gdb) c
Continuing.

Thread 2 "lrzip" received signal SIGSEGV, Segmentation fault.
0x000000000045a2aa in copy_match.out ()

Here is the section from copy_match where the error occurs. Search for FIXME.

; *** FIXME***
; after some iterations, invalid memory address in RDI t0_R
; Dump of assembler code for function copy_match.out:
;   0x000000000045a2a7 <+0>:     add    %r12,%rdi
;=> 0x000000000045a2aa <+3>:     movzbl (%rdi),%ebx    
;   0x000000000045a2ad <+6>:     add    %rdx,%rdi
;   0x000000000045a2b0 <+9>:     neg    %rdx
(gdb) i r r12 rdi ebx rdx
;r12            0x7ffff00008c0   140737219922112
;rdi            0x7ffefa146981   140733094062465 *** this shows error "Cannot access memory at 0x#####
;ebx            0x0      0
;rdx            0x2      2
; *** END FIXME ***
        add     t0_R, dic
        movzx   sym, byte [t0_R]
        add     t0_R, cnt_R
        neg     cnt_R
        ; lea     r1, [dicPos - 1]
copy_common:
        dec     dicPos
        ; cmp   LOC rep0, 1
        ; je    rep0Label

        ; t0_R - src_lim
        ; r1 - dest_lim - 1
        ; cnt_R - (-cnt)

        IsMatchBranch_Pre
        inc     cnt_R
        jz      copy_end

Possible undefined behaviour

Discussed in https://github.com/pete4abw/lrzip-next/discussions/41

Originally posted by demhademha August 16, 2021
So, I pipe tar to lrzip-next, like this:
tar -cf - bootstrap/* | lrzip-next -vv -z -L9 -p1 -w 80 -o bootstrap.tar.lrz
However, tar produces the following output:

Bad checksum: 0x66adc716 - expected: 0xfff82df8 e, check file directly.Decompressing...

Any ideas why this is happening?
regards

Add bzip3 as a submodule

Adding bzip3 as a git submodule (as explained e.g. here) would be a better idea than copying the entire source code to the trunk - it would allow for easier updates and code management.

:lady_beetle: Formatted i64 output incorrect

lrzip-next Version

v0.8.8

lrzip-next command line

lrzip-next -vt file.lrz

What happened?

When performing lrzip-next -t file.lrz and there is not adequate space, the available space is shown as a negative number.
$ lrzip-next -vt file.lrz
Using configuration file /home/peter/.lrzip/lrzip.conf
The following options are in effect for this INTEGRITY TEST.
Threading is ENABLED. Number of CPUs detected: 8
Detected 16,556,953,600 bytes ram
Nice Value: 19
Show Progress
Verbose
Test file integrity
Temporary Directory set as: ./
Detected lrzip version 0.8 file.
Inadequate free space to test file. Space needed: 1,270,085,632. Space available: -1,831,952,384.
Try setting TMP=dirname and select a larger volume.
Fatal error - exiting

This is a result of using a 32-bit formatting macro for a 64-bit data type. Apparently, there are several data type mismatches throughout the code. Even using int data types instead of unsigned ones for all large numbers is limiting for large values.

All output formatting is being reviewed and will be pushed when complete.
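
For reference, the portable way to print 64-bit values in C is the <inttypes.h> format macros rather than a 32-bit format; a minimal standalone example (not the project's actual formatting macros):

#include <inttypes.h>
#include <stdio.h>

int main(void)
{
	int64_t needed = 5565052928LL, avail = 2463014912LL;

	/* A 32-bit format specifier wraps these values; PRId64 does not. */
	printf("Space needed: %" PRId64 ". Space available: %" PRId64 ".\n",
	       needed, avail);
	return 0;
}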

Correcting it looks like this:

$ /tmp/lrzip-next/src/lrzip-next -vt file.lrz
Using configuration file /home/peter/.lrzip/lrzip.conf
The following options are in effect for this INTEGRITY TEST.
Threading is ENABLED. Number of CPUs detected: 8
Detected 16,556,953,600 bytes ram
Nice Value: 19
Show Progress
Verbose
Test file integrity
Temporary Directory set as: ./
Detected lrzip version 0.8 file.
Inadequate free space to test file. Space needed: 5,565,052,928. Space available: 2,463,014,912.
Try setting TMP=dirname and select a larger volume.
Fatal error - exiting

What was expected behavior?

long int or 64 bit ints should be printed properly.

Steps to reproduce

lrzip-next -t file.lrz

Relevant log output

see above

Please provide system details

OS Distro: Slackware
Kernel Version: 5.15.11+
System ram (free -h): 16GB

Additional Context

No response

:lady_beetle: bzip3_poc possible data type mismatch

lrzip-next Version

0.9.3

lrzip-next command line

N/A

What happened?

bzip3 expects the size of the data to be compressed, and returns the size of the compressed data, as int32_t, yet the lrzip-next cthread structure stores compressed and uncompressed data sizes as int64_t. While it is unlikely that the size of a single block will exceed 2GB, it is not impossible; lrzip-next now supports only 64-bit processors. For bzip3 to work consistently, either type casting or error checking has to be introduced (a bounds-check sketch follows the listing below). It appears bzip2 compression/decompression may also be impacted, although it appears to use uint32_t. Some re-evaluation of the use of int64_t in compression/decompression may need to be done.

  74 static struct compress_thread {
  75         uchar *s_buf;   /* Uncompressed buffer -> Compressed buffer */
  76         uchar c_type;   /* Compression type */
  77         i64 s_len;      /* Data length uncompressed */
  78         i64 c_len;      /* Data length compressed */
  79         cksem_t cksem;  /* This thread's semaphore */
  80         struct stream_info *sinfo;
  81         int streamno;   
  82         uchar salt[SALT_LEN];
  83 } *cthreads;
109 /* ** LOW LEVEL APIs ** */                                                                                                                       
110 
111 /**
112  * @brief Encode a single block. Returns the amount of bytes written to `buffer'.
113  * `buffer' must be able to hold at least `bz3_bound(size)' bytes. The size must not
114  * exceed the block size associated with the state.
115  */
116 BZIP3_API int32_t bz3_encode_block(struct bz3_state * state, uint8_t * buffer, int32_t size);
117 
118 /**
119  * @brief Decode a single block.
120  * `buffer' must be able to hold at least `orig_size' bytes. The size must not exceed the block size
121  * associated with the state.
122  * @param size The size of the compressed data in `buffer'
123  * @param orig_size The original size of the data before compression.
124  */
125 BZIP3_API int32_t bz3_decode_block(struct bz3_state * state, uint8_t * buffer, int32_t size, int32_t orig_size);
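One possible guard, sketched only for illustration: it assumes the i64 lengths from the cthread entries must be range-checked before being passed to the int32_t parameters of the bz3 block API shown above. The helper name is made up and this is not the committed fix.

#include <stdint.h>

/* Hypothetical helper, not lrzip-next code: refuse to hand bzip3 a chunk
 * whose 64-bit length does not fit in the int32_t the block API expects. */
static int bz3_len_fits(int64_t s_len)
{
        return s_len > 0 && s_len <= INT32_MAX;
}

Calling code would check bz3_len_fits() on the thread's s_len and fail (or fall back to another backend) before casting to int32_t, rather than truncating silently.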

What was expected behavior?

No data type mismatches

Steps to reproduce

Review source in lrzip_private.h, stream.c.

Relevant log output

No response

Please provide system details

64 bit systems

Additional Context

Type matching is important to prevent possible data overflows.

Finish removing stale library code

There's still some work to remove all vestiges of the abandoned lrzip-next library code. Callback functions and data members of the control structure have to be purged, as well as some defined functions. Most of these have been identified and will be cleared at some point. I just want to test a lot to make sure nothing else breaks!

Examine lrzip-next memory usage defaults

Currently, when lrzip or lrzip-next runs on a file, it allocates 2/3 of total ram for the compression window and 1/3 for backend compression and rzip, regardless of file size.

When getting data from STDIN and writing to a file, lrzip-next allocates half that.
When getting data from STDIN and writing to STDOUT, lrzip-next allocates a quarter of that (sketched below).
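A minimal sketch of those defaults as described above (illustrative only; the names are made up and the real code computes this during option setup):

#include <stdint.h>

/* Illustrative only: split total ram per the described defaults. */
static void default_ram_split(int64_t total_ram, int from_stdin, int to_stdout,
                              int64_t *window_ram, int64_t *backend_ram)
{
        int64_t usable = total_ram;

        if (from_stdin && to_stdout)
                usable /= 4;            /* STDIN -> STDOUT: a quarter */
        else if (from_stdin)
                usable /= 2;            /* STDIN -> file: half */

        *window_ram  = usable * 2 / 3;  /* compression window */
        *backend_ram = usable / 3;      /* backend compression and rzip */
}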

These defaults are hard coded and sometimes result in swap space being used.

The question here is: can a more optimal way of using ram be devised?

Are all these MD5 tests really necessary?

HAS_MD5, NO_MD5: a confusing mish-mash of code.

include/lrzip_private.h:260:#define NO_MD5              (!(HASH_CHECK) && !(HAS_MD5))
include/lrzip_private.h:334:#define HAS_MD5             (control->flags & FLAG_MD5)
lrzip.c:155:    if (!NO_MD5)
lrzip.c:1113:   if (ofs >= infile_size - (HAS_MD5 ? MD5_DIGEST_SIZE : 0))
lrzip.c:1233:   if (HAS_MD5) {
lrzip.c:1537:   if (NO_MD5)
lrzip.c:1539:   if (HAS_MD5)
runzip.c:175:   if (!HAS_MD5)
runzip.c:178:   if (!NO_MD5)
runzip.c:241:           if (!HAS_MD5)
runzip.c:244:           if (!NO_MD5)
runzip.c:376:   if (!HAS_MD5) {
runzip.c:410:   if (!NO_MD5) {
runzip.c:456:   if (!NO_MD5) {
runzip.c:460:           if (HAS_MD5) {
runzip.c:518:                   if (!HAS_MD5)
rzip.c:588:     if (!NO_MD5)
rzip.c:758:                     if (!NO_MD5)
rzip.c:765:             if (!NO_MD5)
rzip.c:962:     if (!NO_MD5) {
rzip.c:1211:    if (!NO_MD5) {

Not sure how this ever made it into the code. Maybe it had to do with the 2011 note about Apple not supporting MD5.

In advance of allowing user-selected hashing, the question arises: is hashing necessary at all? The backend compression algos all report errors when a chunk can't be decompressed. The first step is to make hashing the default and get rid of all this cruft code. Then we can decide whether hashing is needed at all.

Thoughts?

BUG: yasm does not compile. -g -F options fail.

lrzip-next is indeed very fast. I had one problem compiling the ASM parts with yasm.

I have some test vectors or sample files if you would like to try: https://drive.google.com/file/d/1I-fBRiTEgAEQ6yCW155My1R9fWk3b39T/view

Compile output with the yasm parameter failure:

/bin/sh ../../../libtool --tag=CC --mode=link gcc -D_REENTRANT -D_7ZIP_LARGE_PAGES -I../../../src -I../include -D_LZMA_DEC_OPT -g -O2 -pthread -o liblzma.la Alloc.lo Bra86.lo Bra.lo BraIA64.lo CpuArch.lo Delta.lo LzFind.lo LzFindMt.lo LzmaDec.lo LzmaEnc.lo LzmaLib.lo Threads.lo -lpthread -lgcrypt -lgpg-error -llz4 -llzo2 -lbz2 -lz -lm -lpthread
libtool: link: ar cr .libs/liblzma.a .libs/Alloc.o .libs/Bra86.o .libs/Bra.o .libs/BraIA64.o .libs/CpuArch.o .libs/Delta.o .libs/LzFind.o .libs/LzFindMt.o .libs/LzmaDec.o .libs/LzmaEnc.o .libs/LzmaLib.o .libs/Threads.o
libtool: link: ranlib .libs/liblzma.a
libtool: link: ( cd ".libs" && rm -f "liblzma.la" && ln -s "../liblzma.la" "liblzma.la" )
make[3]: Leaving directory '/home/bit/source/lrzip-next/src/lzma/C'
Making all in ASM
make[3]: Entering directory '/home/bit/source/lrzip-next/src/lzma/ASM'
yasm -Dx64 -f elf64 -g -F dwarf -I ./x86/ -o LzmaDecOpt.o /home/bit/source/lrzip-next/src/lzma/ASM/x86/LzmaDecOpt.asm
yasm -Dx64 -f elf64 -g -F dwarf -I ./x86/ -o LzFindOpt.o /home/bit/source/lrzip-next/src/lzma/ASM/x86/LzFindOpt.asm
yasm: yasm: option -g' needs an argument!option -g' needs an argument!

yasm: yasm: warning: unrecognized option -F'warning: unrecognized option -F'

yasm: yasm: warning: can open only one input file, only the last file will be processedwarning: can open only one input file, only the last file will be processed

make[3]: *** [Makefile:606: LzFindOpt.lo] Error 1
make[3]: *** Waiting for unfinished jobs....
make[3]: *** [Makefile:600: LzmaDecOpt.lo] Error 1
make[3]: Leaving directory '/home/bit/source/lrzip-next/src/lzma/ASM'
make[2]: *** [Makefile:433: all-recursive] Error 1
make[2]: Leaving directory '/home/bit/source/lrzip-next/src/lzma'
make[1]: *** [Makefile:510: all-recursive] Error 1
make[1]: Leaving directory '/home/bit/source/lrzip-next'
make: *** [Makefile:421: all] Error 2
I did not look into it further.

Originally posted by @chargen in #69 (comment)

Memory overruns

When running the p7zip-16.02 branch, swap space is used, and I can't quite figure out why. However, if swap is disabled, the program still runs fine. Help wanted!

ZPAQ TODO: Get show_progress to work on compression

In libzpaq.cpp: Compressor::compress(int n)

/* TODO
 * Need to show progress
      // ver 7.15 uses read instead of get
      // need to show progress in this way
      if (!(i % 128))
              show_progress(i);
*/

show_progress is defined in libzpaq.h in the lrzip block at the end.

Documentation not installed under lrzip-next directory

Looks like I messed up the installation of docs. Instead of all being under lrzip-next-version..., they are scattered about. Not sure when I introduced this bug; it will be fixed. I have been focusing on the larger elements, like the new branch for Dictionary Size.

ll -R usr/doc
usr/doc:
total 188
-rw-r--r-- 1 root root   913 Apr 25 15:58 AUTHORS
-rw-r--r-- 1 root root   219 Apr 25 15:58 BUGS
-rw-r--r-- 1 root root 18092 Apr 25 15:58 COPYING
-rw-r--r-- 1 root root 53258 Apr 25 15:58 ChangeLog
-rw-r--r-- 1 root root  4112 Apr 25 15:58 README-NOT-BACKWARD-COMPATIBLE
-rw-r--r-- 1 root root  1363 Apr 25 15:58 README.Assembler
-rw-r--r-- 1 root root  6972 Apr 25 15:58 README.NEW.BENCHMARK.ALGO.md
-rw-r--r-- 1 root root  4158 Apr 25 15:58 README.SDK19_COMPARISON.md
-rw-r--r-- 1 root root  6823 Apr 25 15:58 README.benchmarks
-rw-r--r-- 1 root root  1159 Apr 25 15:58 README.filters
-rw-r--r-- 1 root root  5714 Apr 25 15:58 README.lzo_compresses.test.txt
-rw-r--r-- 1 root root  6997 Apr 25 15:58 README.md
-rw-r--r-- 1 root root   718 Apr 25 15:58 TODO
-rw-r--r-- 1 root root 23128 Apr 25 15:58 WHATS-NEW
drwxr-xr-x 2 root root  4096 Apr 25 15:58 lrzip-next-0.7.50
-rw-r--r-- 1 root root  2017 Apr 25 15:58 lrzip.conf.example
drwxr-xr-x 2 root root  4096 Apr 25 15:58 lzma
-rw-r--r-- 1 root root  3297 Apr 25 15:58 magic.header.txt
drwxr-xr-x 2 root root  4096 Apr 25 15:58 zpaq

usr/doc/lrzip-next-0.7.50:
total 72
-rw-r--r-- 1 root root   558 Apr 25 15:57 AUTHORS.gz
-rw-r--r-- 1 root root   192 Apr 25 15:57 BUGS.gz
-rw-r--r-- 1 root root  6832 Apr 25 15:57 COPYING.gz
-rw-r--r-- 1 root root 19556 Apr 25 15:57 ChangeLog.gz
-rw-r--r-- 1 root root  5838 Apr 25 15:57 INSTALL.gz
-rw-r--r-- 1 root root  8489 Apr 25 15:57 OLDREADME.md.gz
-rw-r--r-- 1 root root  1646 Apr 25 15:57 README-NOT-BACKWARD-COMPATIBLE.gz
-rw-r--r-- 1 root root  3295 Apr 25 15:57 README.md.gz
-rw-r--r-- 1 root root   449 Apr 25 15:57 TODO.gz
-rw-r--r-- 1 root root   108 Apr 25 15:57 VERSION.gz

usr/doc/lzma:
total 100
-rw-r--r-- 1 root root  3127 Apr 25 15:58 Methods.txt
-rw-r--r-- 1 root root  1848 Apr 25 15:58 README
-rw-r--r-- 1 root root  1220 Apr 25 15:58 README-Alloc
-rw-r--r-- 1 root root   368 Apr 25 15:58 README.ASMDecompress
-rw-r--r-- 1 root root   401 Apr 25 15:58 README.threading
-rw-r--r-- 1 root root 13750 Apr 25 15:58 lzma-history.txt
-rw-r--r-- 1 root root 12452 Apr 25 15:58 lzma-sdk.txt
-rw-r--r-- 1 root root 35585 Apr 25 15:58 lzma-specification.txt
-rw-r--r-- 1 root root 10060 Apr 25 15:58 lzma.txt

usr/doc/zpaq:
total 216
-rw-r--r-- 1 root root   2741 Apr 25 15:58 COPYING
-rw-r--r-- 1 root root   3791 Apr 25 15:58 readme.txt
-rw-r--r-- 1 root root 209486 Apr 25 15:58 zpaq206.pdf

:bulb: Improve stream_bufsize for stream.c:open_stream_out

lrzip-next Version

0.8.5

Feature Suggestion

The open_stream_out() function sets the optimum segment size per chunk. For zpaq, it cannot be larger than the zpaq block size itself minus 4096 (see #46). But for lzma the segment size is too small. If control->overhead times the number of threads is less than the limit imposed by control->usable_ram/testbufs, then the overhead should be the maximum segment size per thread; it is already computed. So, the solution resides in this block.

1095                 if (ZPAQ_COMPRESS && (limit/control->threads > 0x100000<<control->zpaq_bs))
1096                         stream_bufsize = round_up_page(control, MAX((0x100000<<control->zpaq_bs)-0x1000, STREAM_BUFSIZE));
1097                 else
1098                         stream_bufsize = round_up_page(control, MAX(limit/control->threads, STREAM_BUFSIZE));
1099
1100                 if (control->threads > 1)
1101                         print_maxverbose("Using up to %'d threads to compress up to %'"PRId64" bytes each.\n",
1102                                 control->threads, stream_bufsize);
1103                 else
1104                         print_maxverbose("Using only 1 thread to compress up to %'"PRId64" bytes\n",
1105                                 stream_bufsize);

Some work is ongoing at line 1097 to see whether a specific lzma adjustment for maximum segment size can be computed there. The thought is that control->overhead should be the actual segment size per thread, since it is already computed. A rough sketch of the idea follows.
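A hypothetical sketch of that idea only, reusing the names from the excerpt above and assuming an LZMA_COMPRESS test exists alongside ZPAQ_COMPRESS; this is not the committed change:

                if (ZPAQ_COMPRESS && (limit/control->threads > 0x100000<<control->zpaq_bs))
                        stream_bufsize = round_up_page(control, MAX((0x100000<<control->zpaq_bs)-0x1000, STREAM_BUFSIZE));
                else if (LZMA_COMPRESS && control->overhead * control->threads < limit)
                        /* overhead is already the computed per-thread maximum */
                        stream_bufsize = round_up_page(control, MAX(control->overhead, STREAM_BUFSIZE));
                else
                        stream_bufsize = round_up_page(control, MAX(limit/control->threads, STREAM_BUFSIZE));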

More later

Steps to reproduce

N/A

Relevant log output

N/A

Please provide system details

N/A

Additional Context

Work on open_stream_out() has been one of the earliest motivators for lrzip-next. Maximizing its effectiveness has been a priority. See Wiki Article.

lrzip-next -i emits incorrect MD5 when file encrypted

In encrypted archives, even the MD5 sum is encrypted. lrzip-next -i does not decrypt the MD5 as is done when decrypting or testing. The fix is trivial and I will push soon.

$ lrzip-next -tvv file.lrz (encrypted)
Decrypting data        
MD5: c2138c19760399ef091417968fb4186e
[OK]

$ lrzip-next -ivv file.lrz (encrypted)
Decompressed file size: Unavailable
Compressed file size: 2158099168
Compression ratio: Unavailable
MD5 used for integrity testing
MD5: 89eb8d585fde29e83742c78567007c0c

Code fragment missing from get_fileinfo

if (ENCRYPT)
   // pass decrypt flag
   if (unlikely(!lrz_decrypt(control, md5_stored, MD5_DIGEST_SIZE, control->salt_pass, LRZ_DECRYPT)))
      return -1;

:bulb: Upgrade lzma sdk to 21.07

lrzip-next Version

v0.8.8

Feature Suggestion

LZMA 21.07 is released.

Steps to reproduce

No response

Relevant log output

No response

Please provide system details

No response

Additional Context

Currently evaluating relevance to lrzip-next. Since a lot of the SDK is used to support 7-Zip, xz, and Windows, we have to see whether any of these changes are meaningful. If so, a new branch will open.

:lady_beetle: lrzip-next fails when using -t option in write-protected dir

lrzip-next Version

0.9.1

lrzip-next command line

lrzip-next -t file.lrz

What happened?

This is a low-probability bug. When running lrzip-next -t on a file while the user is in a write-protected directory, lrzip-next will fail. This happens whether file.lrz is in the write-protected dir or in a non-write-protected dir. If the TMP variable is NOT set, lrzip-next tries to create its temporary file in the current dir, which fails.

WARNING: Failed to create out tmpfile: ./lrzipout.4GQLS0, will fail if cannot perform compression entirely in ram
Detected lrzip version 0.9.1 file.
MD5 being used for integrity testing.
Decompressing...
Failed to write literal buffer of size 339
Bad file descriptor
Fatal error - exiting

This happens even though the warning indicates a temp file may not be necessary.

What was expected behavior?

Testing of the lrz file should continue if it can be decompressed entirely in ram, OR
lrzip-next should fail outright.

Steps to reproduce

In a write-protected directory, run command
lrzip-next -t file.lrz

Relevant log output

$ /share/software/Kernel/linux-5.x$ lrzip-next -vvt /tmp/v*lrz
Using configuration file /home/peter/.lrzip/lrzip.conf
The following options are in effect for this INTEGRITY TEST.
Threading is ENABLED. Number of CPUs detected: 8
Detected 16,558,260,224 bytes ram
Nice Value: 19
Show Progress
Max Verbose
Test file integrity
Temporary Directory set as: ./
WARNING: Failed to create out tmpfile: ./lrzipout.4dl460, will fail if cannot perform compression entirely in ram

Malloced 5,519,417,344 for tmp_outbuf
Failed to fstatvfs in decompress_file
Fatal error - exiting

Please provide system details

OS Distro: Slackware64-current
Kernel Version (uname -a): 5.19.1
System ram (free -h):

               total        used        free      shared  buff/cache   available
Mem:            15Gi       1.3Gi        11Gi       483Mi       2.2Gi        13Gi
Swap:           15Gi          0B        15Gi

Additional Context

When the TMP environment variable is set, this error does not occur; the temp file is created in the TMP directory. However, when TMP is not set, the temp file is created in the current directory. Also, the fallback implied by the warning (running the test entirely in RAM) is not occurring. Either the code should fail completely, or it needs changing to allow running in RAM. This error also occurs in the main branch of lrzip, version 0.651.

Not sure the best way to address this.

:lady_beetle: Crash when writing >2GB memory to fd_out

lrzip-next Version

0.9

lrzip-next command line

lrzip-next -t | -d file.lrz

What happened?

In the write_fdout() function, large (> 2GB) buffers fail on the first write. This should not happen. The write() function has a limit of about 2 GB per call but returns the number of bytes written. If there is no error, write_fdout() should continue, using the return value to reduce the number of bytes remaining to write. Lines 588 and 589 are not correct in forcing a fatal outcome. Also, the formatting parameter PRId32 is not correct, since ssize_t is a 64-bit int here. The fatal call is only correct when an error is actually returned; otherwise the loop should just continue (optionally reporting bytes written).

589                     if (unlikely(ret != nmemb))
590                             fatal("Failed to write %'"PRId32" bytes to fd_out in write_fdout\n", nmemb);
581     bool write_fdout(rzip_control *control, void *buf, i64 len)
582     {
583             uchar *offset_buf = buf;
584             ssize_t ret, nmemb;
585
586             while (len > 0) {
587                     nmemb = len;
588                     ret = write(control->fd_out, offset_buf, (size_t)nmemb);
589                     if (unlikely(ret != nmemb))
590                             fatal("Failed to write %'"PRId32" bytes to fd_out in write_fdout\n", nmemb);
591                     len -= ret;
592                     offset_buf += ret;
593             }
594             return true;
595     }
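A hedged sketch of the expected behavior (not the committed fix, which reportedly also adds a write_1g protection helper): only a negative return from write() is fatal, a short write simply continues the loop, and the byte count is printed with a 64-bit format:

bool write_fdout(rzip_control *control, void *buf, i64 len)
{
        uchar *offset_buf = buf;
        ssize_t ret;

        while (len > 0) {
                ret = write(control->fd_out, offset_buf, (size_t)len);
                if (unlikely(ret == -1))        /* only a real write error is fatal */
                        fatal("Failed to write %'"PRId64" bytes to fd_out in write_fdout\n", len);
                len -= ret;                     /* a short write just loops again */
                offset_buf += ret;
        }
        return true;
}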

What was expected behavior?

write_fdout() should keep calling write() until all bytes are written. The fatal call is not correct unless write() returns an error.

Steps to reproduce

Decompress file > 2GB uncompressed.

Relevant log output

Taking decompressed data from thread 2
 52%    2794.18 /   5326.77 MB
Unable to decompress entirely in ram, will use physical files
Failed to write 1,224,443,427 bytes to fd_out in write_fdout
Deleting broken file /tmp/bzip3_poc/kernel.5.x.tar.L9
Fatal error - exiting


Please provide system details

OS Distro:  Slackware
Kernel Version (uname -a): 6.0.2
System ram (free -h): 
               total        used        free      shared  buff/cache   available
Mem:            15Gi       848Mi       7.5Gi       353Mi       7.0Gi        13Gi
Swap:           15Gi       1.0Gi        14Gi



Additional Context

Lots of corrections are coming, including use of the 32-bit protection code `write_1g` and fixes for the incorrect formatting using `PRId32`.

This should not affect quality of compressed archives.

:bulb: [Get info about deduplication]

lrzip-next Version

N/A

Feature Suggestion

If I'm not mistaken, I know ZPAQ (by itself) shows the new size after deduplication. So, maybe we could have a feature where we can know how much data is "removed." This could be a feature where it's printed out with everything else, like -v for example, or maybe run it by itself with no compression. I'm not totally sure if it's even a good thing to do, but I'm sure it'd be nice to see if deduplication will even get rid of a significant amount. If my assumption is correct, and there's not much to deduplicate, maybe using any method won't make a difference. I have some files that probably don't need compression but could benefit a bit from deduplication, for example, 1-5GB. Hopefully you get my point? Anyway, I love this fork, much more info and better utilization of LZMA dictionaries :P

Steps to reproduce

No response

Relevant log output

No response

Please provide system details

No response

Additional Context

No response

nasm 2.11 fails in make due to order of options

Reported by @Clingto

Apparently, nasm 2.15 is more flexible about the order of command line options. Testing with the much earlier nasm 2.11 reveals a few problems using -g and -I.

nasm -Dx64 -g -F dwarf -f elf64 -I ./x86 -o 7zCrcOpt_asm.o /tmp/build_bin/src/lzma/ASM/x86/7zCrcOpt_asm.asm
nasm: fatal: unrecognized debug format `dwarf' for output format `bin'

Reordering options AND adding a trailing slash to the -I./x86 option is successful.

nasm -Dx64 -f elf64 -g -F dwarf -I ./x86/ -o 7zCrcOpt_asm.o /tmp/build_bin/src/lzma/ASM/x86/7zCrcOpt_asm.asm
nasm -Dx64 -f elf64 -g -F dwarf -I ./x86/ -o LzmaDecOpt.o /tmp/build_bin/src/lzma/ASM/x86/LzmaDecOpt.asm
/bin/sh ../../../libtool  --tag=CC   --mode=link gcc  -g -O2 -pthread   -o liblzmaASM.la    7zCrcOpt_asm.lo LzmaDecOpt.lo -lgcrypt -lgpg-error -llz4 -llzo2 -lbz2 -lz -lm -lpthread 
libtool: link: ar cru .libs/liblzmaASM.a .libs/7zCrcOpt_asm.o .libs/LzmaDecOpt.o 
libtool: link: ranlib .libs/liblzmaASM.a
libtool: link: ( cd ".libs" && rm -f "liblzmaASM.la" && ln -s "../liblzmaASM.la" "liblzmaASM.la" )

Try these patches:

diff --git a/configure.ac b/configure.ac
index 682d81f..916acbe 100644
--- a/configure.ac
+++ b/configure.ac
@@ -67,11 +67,11 @@ if test x"$ASM" = x"yes"; then
 ## only for x86 and x86_64
        case $host in
                i?86-*)
-                       ASM_OPT="-g -F dwarf -f elf"
+                       ASM_OPT="-f elf -g -F dwarf"
                        USE_64=no
                        ;;
                x86_64-*)
-                       ASM_OPT="-Dx64 -g -F dwarf -f elf64"
+                       ASM_OPT="-Dx64 -f elf64 -g -F dwarf"
                        USE_64=yes
                        ;;
                *) 
diff --git a/src/lzma/ASM/Makefile.am b/src/lzma/ASM/Makefile.am
index d48723d..8d91204 100644
--- a/src/lzma/ASM/Makefile.am
+++ b/src/lzma/ASM/Makefile.am
@@ -5,7 +5,7 @@ ABSSRC = @abs_srcdir@/x86
 ASM_7z = 7zCrcOpt_asm
 ASM_H  = $(SRC)/7zAsm.asm
 ASM_S  = $(SRC)/$(ASM_7z).asm
-ASM_OPT += -I $(SRC)
+ASM_OPT += -I $(SRC)/
 ## For LZMA Assembler Decompressor
 if USE_X64
   ASM_De = LzmaDecOpt

lrzip_private.h: does this make any sense?

193 #ifdef leto32h
194 # define le32toh(x) leto32h(x)
195 # define le64toh(x) leto64h(x)
196 #endif
197
198 #ifndef le32toh
199 # if __BYTE_ORDER == __LITTLE_ENDIAN
200 #  define htole32(x) (x)
201 #  define le32toh(x) (x)
202 #  define htole64(x) (x)
203 #  define le64toh(x) (x)
204 # elif __BYTE_ORDER == __BIG_ENDIAN
205 #  define htole32(x) bswap_32 (x)
206 #  define le32toh(x) bswap_32 (x)
207 #  define htole64(x) bswap_64 (x)
208 #  define le64toh(x) bswap_64 (x)
209 #else
210 #error UNKNOWN BYTE ORDER
211 #endif
212 #endif

If leto32h is defined, le32toh and le64toh are simply aliased to leto32h/leto64h and the endianness block below is skipped entirely; the "UNKNOWN BYTE ORDER" error only aborts compilation when le32toh is undefined and __BYTE_ORDER matches neither case. Is the first test, #ifdef leto32h, even needed? Can we remove it and just test endianness?

199 #if __BYTE_ORDER == __LITTLE_ENDIAN
200 #  define htole32(x) (x)
201 #  define le32toh(x) (x)
202 #  define htole64(x) (x)
203 #  define le64toh(x) (x)
204 #elif __BYTE_ORDER == __BIG_ENDIAN
205 #  define htole32(x) bswap_32 (x)
206 #  define le32toh(x) bswap_32 (x)
207 #  define htole64(x) bswap_64 (x)
208 #  define le64toh(x) bswap_64 (x)
209 #endif

I don't see why not. Comments? Am I missing something?

compat: -c,-C options

While I see merit in trying to make options similar to gzip (and many already are), the idea of having both -c and -C is not good IMO. Since we are still in development versions, instead of having synonyms for options, maybe in this branch redefine check as a long option only, since it has no parallel in gzip. Until we get to version 1.0 (production), we do have some flexibility in naming and changing options. JM2C.

I do like the --fast and --best options.
