
Comments (8)

tsolomko commented on June 19, 2024

Hmm, I see. I will think about adding such functionality in the future.

I would have thought that it would be pretty easy to just "toggle" appending/prepending the frame metadata

Generally, it is not. As I mentioned before, one input may result in the creation of several blocks. This can happen if the input is "sufficiently large", the precise definition of which depends on your uncompressed block size setting. In SWCompression the default block size is 4MB, which incidentally is also the strict upper limit on the block size in the reference implementation. I do wonder what the aforementioned LZ4_compress_default function from lz4.h does if the input is larger than 4MB. It is also unclear to me how it handles non-compressible inputs.

Meanwhile, assuming that the size of your input is always smaller than 4MB and that the input is always compressible, I can suggest the following workaround to extract only the block data:

let data = // ... data.count must be smaller than 4 * 1024 * 1024
let compressedData = LZ4.compress(data: data)
let block = compressedData[(compressedData.startIndex + 11)..<(compressedData.endIndex - 8)]
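
The offsets 11 and 8 above hold only for a frame with a single compressed block and the default metadata. As a hedged alternative, the sketch below derives the block boundaries from the frame header instead of hardcoding them. It works on a plain byte array, and FrameError and extractSingleBlock are hypothetical names introduced here for illustration, not part of SWCompression. The field layout follows my reading of the LZ4 frame format document and has only been checked against the example frame posted in this thread.

```swift
// Hypothetical helper, not part of SWCompression: locate the single block in an LZ4 frame.
enum FrameError: Error {
    case badMagic, truncated, storedBlock, multipleBlocks
}

func extractSingleBlock(from frame: [UInt8]) throws -> [UInt8] {
    guard frame.count >= 7 else { throw FrameError.truncated }
    // Magic number 0x184D2204, stored little-endian.
    guard Array(frame[0..<4]) == [0x04, 0x22, 0x4D, 0x18] else { throw FrameError.badMagic }

    let flg = frame[4]
    let hasBlockChecksum = (flg & 0x10) != 0
    let hasContentSize = (flg & 0x08) != 0
    let hasDictionaryID = (flg & 0x01) != 0

    // magic (4) + FLG (1) + BD (1) + optional fields + header checksum (1)
    var offset = 6
    if hasContentSize { offset += 8 }
    if hasDictionaryID { offset += 4 }
    offset += 1

    func readUInt32LE(at index: Int) throws -> UInt32 {
        guard index + 4 <= frame.count else { throw FrameError.truncated }
        var value: UInt32 = 0
        for k in 0..<4 { value |= UInt32(frame[index + k]) << (8 * k) }
        return value
    }

    let sizeField = try readUInt32LE(at: offset)
    // The top bit of the size field marks a block that was stored uncompressed.
    guard (sizeField & 0x8000_0000) == 0 else { throw FrameError.storedBlock }
    let blockStart = offset + 4
    let blockEnd = blockStart + Int(sizeField)
    guard blockEnd <= frame.count else { throw FrameError.truncated }

    // After the block (and its optional checksum) a zero end mark must follow.
    let next = blockEnd + (hasBlockChecksum ? 4 : 0)
    guard try readUInt32LE(at: next) == 0 else { throw FrameError.multipleBlocks }

    return Array(frame[blockStart..<blockEnd])
}

// The frame bytes posted in this thread.
let frame: [UInt8] = [
    0x04, 0x22, 0x4D, 0x18, 0x64, 0x70, 0xB9, 0x10, 0x00, 0x00, 0x00,
    0x6F, 0x48, 0x41, 0x4C, 0x4C, 0x4F, 0x20, 0x06, 0x00, 0x1D, 0x50,
    0x48, 0x41, 0x4C, 0x4C, 0x4F, 0x00, 0x00, 0x00, 0x00, 0x4B, 0xB9, 0x24, 0x1F,
]
let extracted = try extractSingleBlock(from: frame)
print(extracted.count) // 16
```

Throwing on a stored block or on a second block makes the two assumptions of the workaround explicit, so a violation fails loudly instead of producing a wrong slice.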

from swcompression.

tsolomko commented on June 19, 2024

I don't know anything about Apple's LZ4_RAW, but it definitely should be compatible with the reference implementation. Do you have any specific examples of incompatibility?


mickeyl commented on June 19, 2024

Ok, here are some details. Consider the following example using upstream lz4.c and lz4.h:

-(void)testCompression {

    auto string = std::string("HALLO HALLO HALLO HALLO HALLO HALLO HALLO HALLO HALLO HALLO");
    uint8_t buffer[100];

    auto compressed = ::LZ4_compress_default(string.c_str(), (char*)buffer, int(string.end() - string.begin()), sizeof(buffer));
    
    for (int i = 0; i < compressed; ++i) {
        printf("%02X, ", buffer[i]);
    }
    printf("\n");
}

This prints out 6F, 48, 41, 4C, 4C, 4F, 20, 06, 00, 1D, 50, 48, 41, 4C, 4C, 4F, .
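
For readers who want to see what these sixteen bytes encode, here is a minimal sketch of an LZ4 block decoder in Swift. It is an illustration only: readLength and decodeLZ4Block are hypothetical names, there is no bounds checking, and the token layout follows my understanding of the LZ4 block format rather than any code from lz4.c or SWCompression.

```swift
// Reads an LZ4 length: a 4-bit base value, extended by following bytes while they equal 255.
func readLength(base: Int, from block: [UInt8], at index: inout Int) -> Int {
    var length = base
    guard base == 15 else { return length }
    var byte: UInt8 = 255
    while byte == 255 {
        byte = block[index]
        index += 1
        length += Int(byte)
    }
    return length
}

// Minimal block decoder sketch: assumes well-formed input and does no bounds checking.
func decodeLZ4Block(_ block: [UInt8]) -> [UInt8] {
    var out: [UInt8] = []
    var i = 0
    while i < block.count {
        let token = block[i]
        i += 1

        // High nibble of the token: number of literal bytes to copy verbatim.
        let literalLength = readLength(base: Int(token >> 4), from: block, at: &i)
        out.append(contentsOf: block[i..<(i + literalLength)])
        i += literalLength

        // The last sequence consists of literals only.
        if i >= block.count { break }

        // Two-byte little-endian offset, counted back from the end of the output.
        let offset = Int(block[i]) | (Int(block[i + 1]) << 8)
        i += 2

        // Low nibble of the token: match length minus 4.
        let matchLength = readLength(base: Int(token & 0x0F), from: block, at: &i) + 4

        // Copy byte by byte because the match may overlap the bytes being written.
        let start = out.count - offset
        for k in 0..<matchLength {
            out.append(out[start + k])
        }
    }
    return out
}

let block: [UInt8] = [
    0x6F, 0x48, 0x41, 0x4C, 0x4C, 0x4F, 0x20, 0x06,
    0x00, 0x1D, 0x50, 0x48, 0x41, 0x4C, 0x4C, 0x4F,
]
let decoded = String(decoding: decodeLZ4Block(block), as: UTF8.self)
print(decoded) // the original 59-character "HALLO HALLO ..." string
```

Read this way, the block is six literals ("HALLO "), a 48-byte match at offset 6, and five trailing literals ("HALLO"), which adds up to the 59 input bytes.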

A corresponding Swift program:

import SWCompression
import Foundation

let string = "HALLO HALLO HALLO HALLO HALLO HALLO HALLO HALLO HALLO HALLO"
let data = string.data(using: .utf8)!
let compressed = LZ4.compress(data: data)
for byte in compressed {
    print(String(format: "%02X, ", byte), terminator: "")
}
print("")

emits 04, 22, 4D, 18, 64, 70, B9, 10, 00, 00, 00, 6F, 48, 41, 4C, 4C, 4F, 20, 06, 00, 1D, 50, 48, 41, 4C, 4C, 4F, 00, 00, 00, 00, 4B, B9, 24, 1F,.

It looks like your algorithm implementation is appending/prepending some metadata.


tsolomko commented on June 19, 2024

Right, the function LZ4_compress_default from the reference implementation produces compressed blocks, whereas SWCompression produces compressed frames, which consist of blocks (in principle, more than one) and various metadata that is required to decompress these blocks. You may note that the output of LZ4_compress_default is contained in its entirety within SWCompression's output, which confirms this.

To quote the LZ4 manual:

Blocks are different from Frames (doc/lz4_Frame_format.md).
Frames bundle both blocks and metadata in a specified manner.
Embedding metadata is required for compressed data to be self-contained and portable.
Frame format is delivered through a companion API, declared in lz4frame.h.

As such, to obtain the exact same output from the reference implementation, you would need to use a function from lz4frame.h.
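
The containment mentioned above can be checked mechanically. The sketch below searches for the block bytes inside the frame bytes, both copied from the outputs posted in this thread, and reports how much metadata surrounds them. firstRange(of:in:) is a hypothetical helper written for this illustration.

```swift
// Output of LZ4_compress_default, as posted in this thread.
let block: [UInt8] = [
    0x6F, 0x48, 0x41, 0x4C, 0x4C, 0x4F, 0x20, 0x06,
    0x00, 0x1D, 0x50, 0x48, 0x41, 0x4C, 0x4C, 0x4F,
]
// Output of SWCompression's LZ4.compress(data:), as posted in this thread.
let frame: [UInt8] = [
    0x04, 0x22, 0x4D, 0x18, 0x64, 0x70, 0xB9, 0x10, 0x00, 0x00, 0x00,
    0x6F, 0x48, 0x41, 0x4C, 0x4C, 0x4F, 0x20, 0x06, 0x00, 0x1D, 0x50,
    0x48, 0x41, 0x4C, 0x4C, 0x4F, 0x00, 0x00, 0x00, 0x00, 0x4B, 0xB9, 0x24, 0x1F,
]

// Hypothetical helper: naive search for the first occurrence of `needle` in `haystack`.
func firstRange(of needle: [UInt8], in haystack: [UInt8]) -> Range<Int>? {
    guard !needle.isEmpty, needle.count <= haystack.count else { return nil }
    for start in 0...(haystack.count - needle.count) {
        let end = start + needle.count
        if haystack[start..<end].elementsEqual(needle) {
            return start..<end
        }
    }
    return nil
}

if let range = firstRange(of: block, in: frame) {
    print("leading metadata bytes:", range.lowerBound)                // 11
    print("trailing metadata bytes:", frame.count - range.upperBound) // 8
}
```

For this input the frame adds 19 bytes around a 16-byte block.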


mickeyl commented on June 19, 2024

I see. Would it be possible to add a parameter to your implementation so that we can choose the output format between frames and blocks?


tsolomko commented on June 19, 2024

I am not sure there is any value in choosing between frames and blocks. There are various configurable parameters of the LZ4 algorithm (block size, block dependency, dictionary compression) that can result in different outputs (possibly even a different number of blocks!) for the same input. So to decompress any isolated block you would still need to provide the decompressor with some external knowledge about how the blocks were compressed. At that point you would be reimplementing the LZ4 frame format (at least partially), just in some other form.

So my question to you is: why do you require the capability to compress into blocks? If you would like to reduce the size of the output, you can have a look at this function. By default, the blockChecksums and contentSize parameters are false, while contentChecksum is true, so you can additionally set the latter to false to further reduce the amount of metadata.
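
To put rough numbers on the metadata, the sketch below adds up the frame overhead for a frame that holds exactly one block. FrameOptions and singleBlockOverhead are hypothetical names and do not mirror SWCompression's API. The field sizes are taken from my reading of the LZ4 frame format document, and the defaults match the ones described above, with the dictionary ID assumed to be absent.

```swift
// Hypothetical description of the frame settings; not an SWCompression type.
struct FrameOptions {
    var blockChecksums = false
    var contentChecksum = true
    var contentSize = false
    var dictionaryID = false // assumption: no dictionary ID is written by default
}

// Number of metadata bytes in a frame that contains exactly one block.
func singleBlockOverhead(_ options: FrameOptions) -> Int {
    var bytes = 4 + 1 + 1 + 1                 // magic number, FLG, BD, header checksum
    if options.contentSize { bytes += 8 }     // optional content size field
    if options.dictionaryID { bytes += 4 }    // optional dictionary ID field
    bytes += 4                                // block size field
    if options.blockChecksums { bytes += 4 }  // optional per-block checksum
    bytes += 4                                // end mark
    if options.contentChecksum { bytes += 4 } // optional content checksum
    return bytes
}

var options = FrameOptions()
print(singleBlockOverhead(options)) // 19

options.contentChecksum = false
print(singleBlockOverhead(options)) // 15
```

Under these assumptions, disabling the content checksum saves 4 bytes per frame, while a bare block carries none of these 15 to 19 bytes.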


mickeyl commented on June 19, 2024

I'm using LZ4 to compress/uncompress binary protocol data that is sent to a micro-controller (ESP32) via BLE. Every saved byte is valuable in this scenario. The central (iOS) would be using SWCompression, the peripheral (FreeRTOS) would be using upstream lz4. I would have thought that it would be pretty easy to just "toggle" appending/prepending the frame metadata, but if it's not, I might as well be using upstream lz4 on iOS.


mickeyl commented on June 19, 2024

Awesome, that's a quick fix that will do it for now. Thanks a lot.

