For some reason, it looks like the version of LZ4 you have implemented seems not compa

LZ4 format not compatible? about swcompression HOT 8 CLOSED

tsolomko commented on June 19, 2024

LZ4 format not compatible?

from swcompression.

Comments (8)

tsolomko commented on June 19, 2024

Hmm, I see. I will think about adding such functionality in the future.

> I would have thought that it would be pretty easy to just "toggle" appending/prepending the frame metadata

Generally, it is not. As I mentioned before, one input may result in the creation of several blocks. This can happen if the input is "sufficiently large", the precise definition of which depends on your uncompressed block size settings. In SWCompression the default block size is 4MB, which incidentally is the strict upper limit on the block size in the reference implementation. I do wonder what the aforementioned LZ4_compress_default function from lz4.h does if the input is larger than 4MB. It is also unclear to me how it handles non-compressible inputs.
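To make the "several blocks" point concrete, here is a minimal sketch of the arithmetic. It assumes the encoder simply cuts the input into consecutive chunks of at most the maximum block size; the function names are made up for illustration and are not part of SWCompression or lz4. The size codes follow the LZ4 frame format, where the block maximum size is stored in the BD byte of the frame header (0x70 in the frame dump posted in this thread).

```cpp
#include <cstddef>
#include <cstdint>

// Decodes the "block maximum size" code stored in bits 6-4 of the BD byte
// of an LZ4 frame header. Returns 0 for codes the frame format leaves undefined.
std::size_t maxBlockSizeFromBD(uint8_t bd) {
    switch ((bd >> 4) & 0x07) {
        case 4: return 64u * 1024;        // 64 KB
        case 5: return 256u * 1024;       // 256 KB
        case 6: return 1024u * 1024;      // 1 MB
        case 7: return 4u * 1024 * 1024;  // 4 MB
        default: return 0;
    }
}

// Number of data blocks needed when the input is cut into chunks of at most
// blockSize bytes (ceiling division). blockSize must be non-zero.
std::size_t blocksNeeded(std::size_t inputSize, std::size_t blockSize) {
    return (inputSize + blockSize - 1) / blockSize;
}
```

Under that assumption, anything up to 4MB fits in a single block and one extra byte forces a second block, which is why the single-block workaround only holds for small inputs.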

Meanwhile, assuming that the size of your input is always smaller than 4MB and the input is always compressible, I can suggest the following workaround to extract only the block data:

let data = // ... data.count must be smaller than 4 * 1024 * 1024
let compressedData = LZ4.compress(data: data)
let block = compressedData[(compressedData.startIndex + 11)..<(compressedData.endIndex - 8)]
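The fixed offsets 11 and 8 only hold for the default frame options (no content size, no dictionary ID, no block checksums, content checksum enabled). As a hedged alternative, the sketch below reads the frame header instead of hard-coding the offsets. It is written in C++ so it could also run on the peripheral side; `firstBlockOfFrame` is a hypothetical helper, not an API of SWCompression or lz4, and the field layout is taken from the LZ4 frame format description.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the payload of the first data block of an LZ4
// frame. Returns an empty vector if the frame is malformed, has no data
// blocks, or stores the first block uncompressed.
std::vector<uint8_t> firstBlockOfFrame(const std::vector<uint8_t>& frame) {
    const std::vector<uint8_t> none;
    if (frame.size() < 7) return none;
    // Magic number 0x184D2204, stored little-endian.
    if (frame[0] != 0x04 || frame[1] != 0x22 || frame[2] != 0x4D || frame[3] != 0x18) return none;

    const uint8_t flg = frame[4];
    std::size_t pos = 6;          // magic (4) + FLG (1) + BD (1)
    if (flg & 0x08) pos += 8;     // optional content size
    if (flg & 0x01) pos += 4;     // optional dictionary ID
    pos += 1;                     // header checksum byte
    if (frame.size() < pos + 4) return none;

    // 4-byte little-endian block size field.
    const uint32_t field = static_cast<uint32_t>(frame[pos])
                         | static_cast<uint32_t>(frame[pos + 1]) << 8
                         | static_cast<uint32_t>(frame[pos + 2]) << 16
                         | static_cast<uint32_t>(frame[pos + 3]) << 24;
    pos += 4;
    if (field == 0) return none;            // EndMark: the frame has no data blocks
    if (field & 0x80000000u) return none;   // high bit set: block is stored uncompressed

    const std::size_t size = field & 0x7FFFFFFFu;
    if (frame.size() - pos < size) return none;
    return std::vector<uint8_t>(frame.begin() + pos, frame.begin() + pos + size);
}
```

This still returns only the first block, so the same caveat applies: it is a workaround for small, compressible inputs, not a general frame-to-block converter.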


tsolomko commented on June 19, 2024

I don't know anything about Apple's LZ4_RAW, but it definitely should be compatible with the reference implementation. Do you have any specific examples of incompatibility?


mickeyl commented on June 19, 2024

Ok, here's some details. Consider the following example using upstream lz4.c and lz4.h:

-(void)testCompression {

    auto string = std::string("HALLO HALLO HALLO HALLO HALLO HALLO HALLO HALLO HALLO HALLO");
    uint8_t buffer[100];

    auto compressed = ::LZ4_compress_default(string.c_str(), (char*)buffer, int(string.end() - string.begin()), sizeof(buffer));
    
    for (int i = 0; i < compressed; ++i) {
        printf("%02X, ", buffer[i]);
    }
    printf("\n");
}

This prints out 6F, 48, 41, 4C, 4C, 4F, 20, 06, 00, 1D, 50, 48, 41, 4C, 4C, 4F, .

A corresponding Swift program:

import SWCompression
import Foundation

let string = "HALLO HALLO HALLO HALLO HALLO HALLO HALLO HALLO HALLO HALLO"
let data = string.data(using: .utf8)!
let compressed = LZ4.compress(data: data)
for byte in compressed {
    print(String(format: "%02X, ", byte), terminator: "")
}
print("")

emits 04, 22, 4D, 18, 64, 70, B9, 10, 00, 00, 00, 6F, 48, 41, 4C, 4C, 4F, 20, 06, 00, 1D, 50, 48, 41, 4C, 4C, 4F, 00, 00, 00, 00, 4B, B9, 24, 1F,.

It looks like your algorithm implementation is appending/prepending some metadata.


tsolomko commented on June 19, 2024

Right, the function LZ4_compress_default from the reference implementation produces compressed blocks whereas SWCompression produces compressed frames which consist of blocks (in principle, more than one) and various metadata that is required to decompress these blocks. You may note that the output of LZ4_compress_default is contained in its entirety within SWCompression's output, which confirms this.
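The containment claim can be checked mechanically with the two byte dumps posted in this thread. The snippet below is only an illustration: `findSubsequence` and the two constant names are made up for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Output of LZ4_compress_default as posted in this thread (a bare block).
const std::vector<uint8_t> kBlockFromReference = {
    0x6F, 0x48, 0x41, 0x4C, 0x4C, 0x4F, 0x20, 0x06, 0x00, 0x1D, 0x50,
    0x48, 0x41, 0x4C, 0x4C, 0x4F};

// Output of SWCompression's LZ4.compress as posted in this thread (a frame).
const std::vector<uint8_t> kFrameFromSWCompression = {
    0x04, 0x22, 0x4D, 0x18, 0x64, 0x70, 0xB9, 0x10, 0x00, 0x00, 0x00,
    0x6F, 0x48, 0x41, 0x4C, 0x4C, 0x4F, 0x20, 0x06, 0x00, 0x1D, 0x50,
    0x48, 0x41, 0x4C, 0x4C, 0x4F, 0x00, 0x00, 0x00, 0x00, 0x4B, 0xB9, 0x24, 0x1F};

// Returns the offset of needle inside haystack, or -1 if it does not occur.
long findSubsequence(const std::vector<uint8_t>& haystack, const std::vector<uint8_t>& needle) {
    auto it = std::search(haystack.begin(), haystack.end(), needle.begin(), needle.end());
    return it == haystack.end() ? -1 : static_cast<long>(it - haystack.begin());
}
```

Calling `findSubsequence(kFrameFromSWCompression, kBlockFromReference)` locates the block at offset 11, with 8 bytes of frame trailer left after it.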

To quote the LZ4 manual:

Blocks are different from Frames (doc/lz4_Frame_format.md).
Frames bundle both blocks and metadata in a specified manner.
Embedding metadata is required for compressed data to be self-contained and portable.
Frame format is delivered through a companion API, declared in lz4frame.h.

As such, to obtain the exact same output from the reference implementation, you would need to use a function from lz4frame.h.


mickeyl commented on June 19, 2024

I see. Would it be possible to add a parameter to your implementation so that we can choose the output format between frames and blocks?


tsolomko commented on June 19, 2024

I am not sure there is any value in choosing between frames and blocks. There are various configurable parameters of the LZ4 algorithm (block size, block dependency, dictionary compression) that can result in different outputs (which may even include a different number of blocks!) for the same input. So to decompress any isolated block you would still need to provide some external knowledge to the decompressor about how these blocks were compressed. At this point you would be reimplementing (at least partially) the LZ4 frame format, but in some other form.

So my question to you is: why do you require the capability to compress into blocks? If you would like to reduce the size of the output, you can have a look at this function. By default, the blockChecksums and contentSize parameters are false, while contentChecksum is true, so you can additionally set the latter to false to further reduce the amount of metadata.
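As a rough guide to how much metadata those parameters control, here is a sketch that adds up the frame overhead from the field sizes in the LZ4 frame format description. The struct and function names are hypothetical; only the option names blockChecksums, contentSize and contentChecksum come from this thread, and dictionaryID is added as an assumption to cover the optional dictionary ID field.

```cpp
#include <cstddef>

// Hypothetical description of the frame options that affect metadata size.
struct FrameOptions {
    bool blockChecksums;
    bool contentSize;
    bool contentChecksum;
    bool dictionaryID;
};

// Bytes of frame metadata surrounding blockCount data blocks.
std::size_t frameOverheadBytes(const FrameOptions& o, std::size_t blockCount) {
    std::size_t n = 4 + 3;                 // magic number; FLG, BD and header checksum bytes
    if (o.contentSize) n += 8;             // optional 8-byte content size
    if (o.dictionaryID) n += 4;            // optional 4-byte dictionary ID
    n += blockCount * (4 + (o.blockChecksums ? 4 : 0));  // per-block size field and optional checksum
    n += 4;                                // EndMark
    if (o.contentChecksum) n += 4;         // optional 4-byte content checksum
    return n;
}
```

With the defaults quoted above and a single block this gives 19 bytes, which matches the 35-byte frame versus 16-byte block in this thread. Turning contentChecksum off brings it down to 15 bytes.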


mickeyl commented on June 19, 2024

I'm using LZ4 to compress/uncompress binary protocol data that is sent to a micro-controller (ESP32) via BLE. Every saved byte is valuable in this scenario. The central (iOS) would be using SWCompression, the peripheral (FreeRTOS) would be using upstream lz4. I would have thought that it would be pretty easy to just "toggle" appending/prepending the frame metadata, but if it's not, I might as well be using upstream lz4 on iOS.


mickeyl commented on June 19, 2024

Awesome, that's a quick fix that will do it for now. Thanks a lot.
