Comments (4)

milesgranger avatar milesgranger commented on June 12, 2024

It seems we can do that, and this is where the correct use of decompress_len, as mentioned in #35, belongs. Once we add this functionality, we will have the following scenarios:

  1. User-provided bytearray (with or without output_len, as we can also use decompress_len for the estimate):
    • We can de/compress, then resize the resulting bytearray to the actual size if needed. Super!
  2. User-provided bytes:
    • If they also provided output_len, we're good for de/compression.
    • It appears decompress_len gives an exact answer; so long as that call succeeds, we can do a single allocation for decompression.
    • max_compress_len gives only the maximum compressed size, so for compression they would likely get trailing null bytes back.

The _into addition for raw doesn't affect any of these points, so all good there.
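To make the two scenarios concrete, here is a minimal sketch in plain Python. It is not cramjam's implementation: zlib stands in for snappy so the snippet runs with the standard library only, and both `compress_into` and `max_compress_len` below are hypothetical helpers written for this illustration (the bound used is just a generous guess for the sketch, not the real snappy formula).

```python
import zlib


def max_compress_len(n: int) -> int:
    """Hypothetical, deliberately generous upper bound used only in this sketch."""
    return n + n // 2 + 64


def compress_into(data: bytes, out: bytearray) -> int:
    """Hypothetical helper: write compressed `data` into `out`, return bytes written.

    zlib stands in for snappy here so the example needs no third-party package.
    """
    compressed = zlib.compress(data)
    if len(compressed) > len(out):
        raise ValueError("output buffer too small")
    # Same-length slice assignment: the buffer keeps its original size.
    out[: len(compressed)] = compressed
    return len(compressed)


data = b"parquet page " * 100

# Scenario 1: bytearray output. Over-allocate, write, then shrink in place.
buf = bytearray(max_compress_len(len(data)))
n = compress_into(data, buf)
del buf[n:]  # resize to the actual compressed size
roundtrip = zlib.decompress(bytes(buf))

# Scenario 2: fixed-size output (what an immutable bytes object amounts to).
# The length cannot change afterwards, so the unused tail stays zero-filled.
fixed = bytearray(max_compress_len(len(data)))
n2 = compress_into(data, fixed)
as_bytes = bytes(fixed)
trailing = as_bytes[n2:]
```

In the first scenario the caller ends up with exactly `n` bytes; in the second, everything after `n2` is null padding, which is the trailing-null-bytes situation described above.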

For the second part, cramjam tries to follow the layout of the Rust crate it uses for snappy. For example, snappy.de/compress_raw uses the de/encoders from the snap::raw module.
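On the raw format itself: per the snappy format description, a raw stream starts with the uncompressed length stored as a little-endian varint, which would explain why decompress_len can report an exact size without decompressing anything. A minimal pure-Python sketch of reading that preamble follows; `read_uvarint` is a hypothetical helper for illustration, not a function from cramjam or snap.

```python
from typing import Tuple


def read_uvarint(buf: bytes) -> Tuple[int, int]:
    """Decode a little-endian base-128 varint from the start of `buf`.

    Returns (value, number_of_bytes_consumed). Hypothetical helper for
    illustration; limited to 5 bytes, enough for a 32-bit length.
    """
    value = 0
    shift = 0
    for i, byte in enumerate(buf):
        if shift > 28:
            raise ValueError("varint too long for a 32-bit length")
        value |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if byte & 0x80 == 0:  # high bit clear: this was the last byte
            return value, i + 1
        shift += 7
    raise ValueError("truncated varint")


# Hand-built preambles (not real compressed data): 300 encodes as AC 02,
# and 2097150 encodes as FE FF 7F.
small = read_uvarint(bytes([0xAC, 0x02, 0x00]))
large = read_uvarint(bytes([0xFE, 0xFF, 0x7F]))
```

With the length known up front, a decompressor can allocate the output once at exactly that size, which is the single-allocation case mentioned earlier.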

from cramjam.

martindurant avatar martindurant commented on June 12, 2024

I'll just give a 👍 in here - snappy-raw is the most important one to get fast from parquet's point of view. I'm happy about the naming convention.

milesgranger avatar milesgranger commented on June 12, 2024

I don't know if we can get the de/compress_raw functions to support output_len, as the raw de/compression functions in the crate only return a new buffer, unlike the others, which can take any writeable object. I can dig into the source later and see whether it would even be possible, but I suspect the author had a good reason, since the other portions of the crate do implement reader/writer parameters for framed de/compression.

The PR referenced here does implement de/compress_raw_into, so I hope that is good for you in the meantime.

I would also point out that, while I don't know what data sizes you're working with, in the benchmarks the current de/compress_raw variants are extremely close to python-snappy and even edge it out in a couple of cases.

milesgranger avatar milesgranger commented on June 12, 2024

As of #45, de/compress_raw_into follows the same API as the other variants, and de/compress_raw supports output_len as well.
