zzlib's Introduction

zzlib

Copyright (c) 2019-2024 by François Galea (fgalea à free.fr)

This is a pure Lua implementation of a depacker for the zlib DEFLATE(RFC1951)/GZIP(RFC1952) file format. zzlib also allows the decoding of zlib-compressed data (RFC1950). Also featured is basic support for extracting files from DEFLATE-compressed ZIP archives (no support for encryption).

The implementation is pretty fast. It makes use of the built-in bit32 (PUC-Rio Lua) or bit (LuaJIT) libraries for bitwise operations. Typical run times to depack lua-5.3.3.tar.gz on a single Core i7-6600U are 0.87s with Lua ≤ 5.2, 0.50s with Lua 5.3, and 0.17s with LuaJIT 2.1.0.

zzlib is distributed under the WTFPL licence. See the COPYING file or http://www.wtfpl.net/ for more details.

Usage

There are various ways of using the library.

Stream from a GZIP file

-- import the zzlib library
zzlib = require("zzlib")

-- get the unpacked contents of the file in the 'output' string
local output,err = zzlib.gunzipf("input.gz")
if not output then error(err) end

Read GZIP/zlib data from a string

Read a file into a string, call the depacker, and get a string with the unpacked file contents, as follows:

-- import the zzlib library
zzlib = require("zzlib")
...

if use_gzip then
  -- unpack the gzip input data to the 'output' string
  output = zzlib.gunzip(input)
else
  -- unpack the zlib input data to the 'output' string
  output = zzlib.inflate(input)
end
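
The step of reading the file into a string is elided above. A minimal sketch of one way to do it with the standard io library follows; the helper name read_file is hypothetical and not part of zzlib:

```lua
-- Hypothetical helper (not part of zzlib): read a whole file into a string.
-- The file is opened in binary mode so compressed bytes are not altered
-- on platforms that translate line endings.
local function read_file(path)
  local f, err = io.open(path, "rb")
  if not f then
    return nil, err
  end
  local data = f:read("*a")  -- "*a" reads the entire file
  f:close()
  return data
end
```

The resulting string could then be passed as input to zzlib.gunzip or zzlib.inflate as shown above.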

Extract a file from a ZIP archive stored in a string

Read a file into a string, call the depacker, and get a string with the unpacked contents of the chosen file, as follows:

-- import the zzlib library
zzlib = require("zzlib")
...

-- extract a specific file from the input zip file
output = zzlib.unzip(input,"lua-5.3.4/README")

Process the list of files from a ZIP archive stored in a string

The zzlib.files() iterator function lets you iterate over the whole list of files in a ZIP archive, as follows:

for _,name,offset,size,packed,crc in zzlib.files(input) do
  print(string.format("%10d",size),name)
end

During such a loop, the packed boolean variable is set to true if the current file is packed. You may then decide to unpack it using this function call:

output = zzlib.unzip(input,offset,crc)

If the file is not packed, then you can directly extract its contents using string.sub:

output = input:sub(offset,offset+size-1)
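
Putting the loop and the two extraction paths together might look like the sketch below. The helper name extract_all is hypothetical; it takes the zzlib module as its first argument and uses only the zzlib.files and zzlib.unzip calls shown above:

```lua
-- Hypothetical helper (not part of zzlib): return a table mapping each file
-- name in the archive to its contents.
-- 'zz' is the table returned by require("zzlib"); 'input' is the ZIP data.
local function extract_all(zz, input)
  local contents = {}
  for _, name, offset, size, packed, crc in zz.files(input) do
    if packed then
      -- compressed entry: let zzlib inflate it and check the CRC
      contents[name] = zz.unzip(input, offset, crc)
    else
      -- stored entry: the bytes can be copied directly
      contents[name] = input:sub(offset, offset + size - 1)
    end
  end
  return contents
end
```

A call such as extract_all(require("zzlib"), input) would then give access to every file by name. Note that this keeps all unpacked files in memory at once, which may not suit large archives.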

zzlib's People

Contributors

rilipco, samhocevar, zerkman

zzlib's Issues

Add support for RFC1950?

I guess this library can only decompress RFC1952 (aka gzip) and RFC1951 (deflate) compressed data, right?
How about adding RFC1950 (zlib)?

Consideration of file data descriptor - workaround

Hi,
The zzlib source code states that data descriptors are not supported. This is a problem in some use cases I have met. Below is a proposal (written for the file zzlib-bit32.lua but, I suppose, also applicable to zzlib-bwo.lua). It also includes the workaround for the signed crc32 value (I apologize for posting this late, knowing that you updated zzlib recently).

function zzlib.unzip(buf,filename)
  local p = 1
  local quit = false
  while not quit do
    local head = buf:int4le(p)
    if head == 0x04034b50 then
      -- local file header signature
      local flag = buf:int2le(p+6)
      local method = buf:int2le(p+8)
      local crc = buf:int4le(p+14)
      local csize = buf:int4le(p+18)
      local namelen = buf:int2le(p+26)
      local extlen = buf:int2le(p+28)
      local name = buf:sub(p+30,p+29+namelen)
      --ADD-1 >>
      local NxtHdrPos=-1 -- init with out of range value
      p = p+30+namelen+extlen
      --ADD-1 <<
      if bit.band(flag,1) ~= 0 then
        error("no support for encrypted files")
      elseif method ~= 8 and method ~= 0 then
        error("unsupported compression method")
      elseif bit.band(flag,8) ~= 0 then
        --DEL-1 >>
        -- error("no support for the data descriptor record")
        --DEL-1 <<
        --ADD-2 >>
        -- SOURCES
        -- [0] https://stackoverflow.com/questions/38437826/how-to-create-zip-file-with-data-descriptor-section
        -- [1] https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT
        -- [2] https://users.cs.jmu.edu/buchhofp/forensics/formats/pkzip.html
        NxtHdrPos = string.find( buf, '\x50\x4b\x03\x04', p, true ) -- look ahead for the signature that starts the next local file header
        if( NxtHdrPos==nil ) then error('unable to find '..filename..' in the zip' )
        else
          local DataDescrPos = NxtHdrPos-12 -- jump back to the start of the data descriptor, given its size in bytes, see [1] paragraph 4.3.9
          crc = buf:int4le(DataDescrPos) -- overwrite the previous crc, which is null, see [2] paragraph Data descriptor
          csize = buf:int4le(DataDescrPos+4) -- overwrite the previous size, which is null, see [2] paragraph Data descriptor
        end
        --ADD-2 <<
      end

      if name == filename then
        local result
        if method == 0 then
          -- no compression
          result = buf:sub(p,p+csize-1)
        else
          -- DEFLATE compression
          local bs = bitstream_init(buf)
          bs.pos = p
          result = inflate_main(bs)
        end

        --ADD >>
        local crc_res = crc32(result)
        if( crc_res<0 ) then
          local bitmax32 = 4294967296
          crc_res = bitmax32 + crc_res
        end
        if crc ~= crc_res then
        --ADD <<
        --DEL >>
        --if crc ~= crc32(result) then
        --DEL <<
          error("checksum verification failed")
        end
        return result
      end
      --DEL-2 >>
      -- p = p+csize
      --DEL-2 <<
      --ADD-3 >>
      if( bit.band(flag,8) ~= 0 )
      then p = NxtHdrPos
      else p = p+csize
      end
      --ADD-3 <<
    else
      -- other header: end of the list of files
      quit = true
    end
  end
end

This "enhancement" allows zzlib to decompress '.XLSX' files (which are ZIP files under a different extension) saved by the Calc application (from LibreOffice).
It is worth noting that '.XLSX' files saved by the Excel application (from MS Office) do not need this, as they do not use data descriptors (a different implementation strategy allowed by the ZIP specification).

Feel free to adapt this proposal, or to reject it if you think it is useless.
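
For reference, section 4.3.9 of the APPNOTE describes the data descriptor as three 4-byte little-endian fields (CRC-32, compressed size, uncompressed size), optionally preceded by the signature 0x08074b50. The sketch below shows how those fields could be read given the position of the next header. It assumes a non-ZIP64 archive; the names u32le and read_data_descriptor are hypothetical, and plain string.byte is used instead of zzlib's own helpers:

```lua
-- Hypothetical helper: read an unsigned 32-bit little-endian integer
-- from string 's' starting at 1-based position 'pos'.
local function u32le(s, pos)
  local b1, b2, b3, b4 = s:byte(pos, pos + 3)
  return b1 + b2 * 256 + b3 * 65536 + b4 * 16777216
end

-- Hypothetical helper: read the data descriptor that ends just before
-- 'next_hdr_pos' (the position of the next header signature).
local function read_data_descriptor(buf, next_hdr_pos)
  local pos = next_hdr_pos - 12  -- the three 4-byte fields
  -- The optional signature, when present, sits in the 4 bytes before them.
  local has_signature = pos > 4 and u32le(buf, pos - 4) == 0x08074b50
  return {
    has_signature = has_signature,
    crc = u32le(buf, pos),
    csize = u32le(buf, pos + 4),
    usize = u32le(buf, pos + 8),
  }
end
```

The signature test is only a heuristic, since compressed data could by chance end with the same four bytes; comparing csize against the actual distance from the start of the file data would be a stronger check.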

Support archives with comments.

You can reliably strip away comments (at least based off of my testing) thus allowing this project to work as intended.

local start = string.find(data, 'PK\5\6')
if start then
    data = string.sub(data, 1, start + 19)..'\0\0'
end

with data being the bitstream's buffer.
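
Wrapped as a function, the snippet above might look like the following sketch. The name strip_zip_comment is hypothetical. The end-of-central-directory record has 20 bytes of fixed fields before its 2-byte comment length, which is why the data is cut at start + 19 and a zero length is appended:

```lua
-- Hypothetical helper: return 'data' with the ZIP archive comment removed.
-- Like the snippet above, it uses the first occurrence of the
-- end-of-central-directory signature "PK\5\6".
local function strip_zip_comment(data)
  local start = string.find(data, "PK\5\6", 1, true)  -- plain-text search
  if not start then
    return data  -- no end-of-central-directory record found: leave untouched
  end
  -- keep everything up to the 20th byte of the record, then write a
  -- comment length of zero (two zero bytes)
  return string.sub(data, 1, start + 19) .. "\0\0"
end
```

Because the search takes the first match, an archive whose compressed data happens to contain the same four bytes would be truncated incorrectly; searching backwards from the end of the data would be more robust.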

Lua 5.3 - no bit32

It looks like Lua 5.3 deprecates bit32 in favour of integer bitwise operators in 64 bits (unless compiled in compatibility mode). I took a quick look at doing something like

local bit = {
  band = function (a, b) return a & b end,
  -- ... etc ...
}

Integer overflow might be handled differently (64 versus 32 bits), so I wasn't sure whether that was safe, or whether using all 64 bits might make this library faster. Any thoughts on a good way to handle platforms without bit32?
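
One possible way to address the 64-bit concern is to mask every result back to 32 bits. The sketch below assumes Lua 5.3 or later (it does not parse on earlier versions), covers only two-argument calls with non-negative shift counts, and is not necessarily how zzlib handles this:

```lua
-- Sketch of a bit32-like table for Lua 5.3+, built on the native integer
-- operators. Results are masked so they stay in the range 0..0xFFFFFFFF,
-- as values returned by bit32 would.
local MASK = 0xFFFFFFFF

local bit = {
  band   = function(a, b) return (a & b) & MASK end,
  bor    = function(a, b) return (a | b) & MASK end,
  bxor   = function(a, b) return (a ~ b) & MASK end,
  bnot   = function(a)    return (~a) & MASK end,
  -- bits shifted past position 31 are discarded by the mask
  lshift = function(a, n) return (a << n) & MASK end,
  -- mask first so that no bits above position 31 are shifted in
  rshift = function(a, n) return (a & MASK) >> n end,
}
```

Whether the extra masking costs more than it saves compared with working on unmasked 64-bit values would need measuring.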

Unknown files inside ZIP file

First of all, thank you very much for the effort invested in the zzlib library.

How can I read the list of files inside a ZIP? The method zzlib.unzip(inputbuffer,filename) returns the content of the given filename, but I have to know the filename in advance. How can I find out which filenames exist inside the ZIP archive?

TIA

Lua 5.1 with ZeroBrane does not work with zzlib, proposed workaround

Hi,

  • Firstly, thank you for the library.
  • Secondly, I am using the ZeroBrane IDE 1.90 (ZB for short) for development.
    My code analyzes files contained in a ZIP archive.
    It works fine with zzlib under the Lua 5.3 and Lua 5.2 interpreters provided by ZB,
    but it fails with the Lua 5.1 interpreter.

Using the ZB debugger, I found that the problem occurs in zzlib-bit32.lua (which is logical given the interpreter in use),
more specifically in the CRC comparison in your code: if crc ~= crc32(result)
The crc read from the ZIP is a positive integer, while the computed crc32(result) returns a negative integer in my use case, so the comparison fails.

In fact crc32(result) is bit-for-bit identical to crc in its low 32 bits (bits 0 to 31) and differs only above them, because sign propagation produces the negative value.

I do not know why the difference occurs, but I use the following hack as a workaround (if it helps):
....
      result = inflate_main(bs)
    end
    -- >> workaround >>
    local crc_res = crc32(result)
    if( crc_res<0 ) then
      local bitmax32 = 4294967296
      crc_res = bitmax32 + crc_res
    end
    if crc ~= crc_res then
    -- if crc ~= crc32(result) then
    -- << workaround <<
....

  • Thirdly, a minor observation about the CRC computation in zzlib-bit32.lua (line 298) and zzlib-bwo.lua (line 297): in one file I read (c xor crc) and 0xff, while in the other I read c xor (crc and 0xff).
    I think they are equivalent, but it was surprising at first sight while looking for the root cause of the previous problem ^^

thanks again.
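
The sign correction in the workaround above can also be written with the modulo operator, which in Lua always yields a non-negative result for a positive divisor. A minimal sketch follows; the helper name to_unsigned32 is hypothetical:

```lua
-- Hypothetical helper: map a possibly negative 32-bit value to its
-- unsigned equivalent in the range 0..4294967295.
-- For negative n this is the same as adding 4294967296 (2^32).
local function to_unsigned32(n)
  return n % 4294967296
end
```

With such a helper, the comparison could be written as crc ~= to_unsigned32(crc32(result)), leaving values that are already non-negative unchanged.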

[Feature] Compress data

Would you accept a PR including a new function used to compress data?

Thank you for this project!
