ymnk / jzlib
re-implementation of zlib in pure Java
Home Page: http://www.jcraft.com/jzlib/
License: Other
Current stream inflate code doesn't handle unwrapped data.
There are several corner cases. zlib doesn't always return Z_STREAM_END, and it doesn't always set the DONE state.
I tried to put together a patch, but haven't been able to cover all the cases of block size and compressible/incompressible.
I ported my native C zlib code that uses Inflater directly and it works, but my use case is more restrictive (a single block of compressed data) so I don't have to handle all the cases that the inflate stream does.
I don't have an adequate patch but the updated test code at
includes the corner cases that fail.
I have not been given permission to disclose the sender's name, but here is a quote from an anonymous message:
I'm exploring jzlib (version 1.1.1) as a replacement for the JDK deflater.
It seems that under high load (many threads deflating), there is a lot of
lock contention on the static synchronized method
com.jcraft.jzlib.Tree#gen_codes().
This is unexpected to me. I would not expect a single point of contention
in a compression/decompression library, as subsequent compression calls
should be completely independent.
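A lock like this is only needed when a method touches shared scratch state. The sketch below is not jzlib's code; it is a minimal illustration, assuming the canonical Huffman code assignment described in RFC 1951, of a code generator that keeps its working arrays local to the call, so concurrent callers share nothing. The class and method names are hypothetical, and the bit reversal that a real deflater applies to the codes is omitted.

```java
final class GenCodesSketch {
    static final int MAX_BITS = 15;

    /**
     * Assigns canonical Huffman codes from code lengths.
     * lengths[i] is the bit length of symbol i (0 means unused).
     * All scratch arrays are local, so no synchronization is needed.
     */
    static int[] genCodes(int[] lengths) {
        int[] blCount = new int[MAX_BITS + 1];
        for (int len : lengths) {
            blCount[len]++;
        }
        blCount[0] = 0; // unused symbols do not consume code space

        // Per-call scratch array (hypothetical replacement for shared state).
        int[] nextCode = new int[MAX_BITS + 1];
        int code = 0;
        for (int bits = 1; bits <= MAX_BITS; bits++) {
            code = (code + blCount[bits - 1]) << 1;
            nextCode[bits] = code;
        }

        int[] codes = new int[lengths.length];
        for (int n = 0; n < lengths.length; n++) {
            int len = lengths[n];
            if (len != 0) {
                codes[n] = nextCode[len]++;
            }
        }
        return codes;
    }

    public static void main(String[] args) {
        // Code lengths from the worked example in RFC 1951, section 3.2.2.
        int[] codes = genCodes(new int[] {3, 3, 3, 3, 3, 2, 4, 4});
        System.out.println(java.util.Arrays.toString(codes));
    }
}
```

Because each call allocates its own small arrays, the method can be called from many threads at once without a static lock, at the cost of a little extra allocation per call.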
Rather than throwing exceptions that indicate the nature of the problem, Deflater conflates six error conditions into one error code (-2).
These should be six different exceptions.
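As a rough sketch of what distinct errors could look like: the report does not list the six conditions, so the checks below are illustrative only, based on the parameter ranges given in the zlib manual for deflateInit2. The class `DeflateParamsSketch` and its `validate` method are hypothetical and not part of jzlib.

```java
final class DeflateParamsSketch {
    static final int Z_DEFLATED = 8;
    static final int Z_DEFAULT_COMPRESSION = -1;

    /**
     * Validates deflate parameters one by one, so the caller learns
     * which argument was rejected instead of receiving a bare -2.
     */
    static void validate(int level, int method, int windowBits,
                         int memLevel, int strategy) {
        if (level != Z_DEFAULT_COMPRESSION && (level < 0 || level > 9)) {
            throw new IllegalArgumentException("level must be -1 or 0..9: " + level);
        }
        if (method != Z_DEFLATED) {
            throw new IllegalArgumentException("method must be Z_DEFLATED (8): " + method);
        }
        if (windowBits < 8 || windowBits > 15) {
            throw new IllegalArgumentException("windowBits must be 8..15: " + windowBits);
        }
        if (memLevel < 1 || memLevel > 9) {
            throw new IllegalArgumentException("memLevel must be 1..9: " + memLevel);
        }
        if (strategy < 0 || strategy > 4) {
            throw new IllegalArgumentException("strategy must be 0..4: " + strategy);
        }
    }

    public static void main(String[] args) {
        validate(6, Z_DEFLATED, 15, 8, 0); // accepted
        try {
            validate(6, Z_DEFLATED, 16, 8, 0);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Dedicated exception types per condition would work equally well; the point is only that each failure carries its own message.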
Could you add Z_BLOCK mode for inflate? It would enable implementing RandomAccess as described in https://github.com/madler/zlib/blob/master/examples/zran.c
Thanks
I could not find a way to get the compression level of a given incoming gzip-compressed data stream. Perhaps I missed something?
io = new GzipInputStream(my data);
io.readHeader();
io.????; // how to query the stream for compression level?
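For context, the gzip format (RFC 1952) does not record the exact compression level. The closest thing is the XFL byte at offset 8 of the header, where 2 means the compressor used its slowest, maximum-compression setting and 4 means its fastest. The sketch below reads that byte from raw header bytes; the class and method names are hypothetical, and whether jzlib exposes this field through its own API is not something this sketch assumes.

```java
final class GzipXflSketch {
    /**
     * Interprets the XFL byte (offset 8) of a gzip header.
     * Only a coarse hint is available; many writers leave it at 0.
     */
    static String describeLevelHint(byte[] header) {
        if (header.length < 10
                || (header[0] & 0xff) != 0x1f
                || (header[1] & 0xff) != 0x8b) {
            throw new IllegalArgumentException("not a gzip header");
        }
        int xfl = header[8] & 0xff;
        switch (xfl) {
            case 2:
                return "maximum compression";
            case 4:
                return "fastest";
            default:
                return "unspecified";
        }
    }

    public static void main(String[] args) {
        // Hand-built 10-byte header: magic, CM=8, FLG=0, MTIME=0, XFL=2, OS=3.
        byte[] header = {0x1f, (byte) 0x8b, 8, 0, 0, 0, 0, 0, 2, 3};
        System.out.println(describeLevelHint(header));
    }
}
```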
I think you have the same bug as detailed here:
adamhathcock/sharpcompress#118
While there is a constructor on ZInputStream that takes a nowrap argument, it is not used to actually configure the underlying InflaterInputStream to support headerless compressed data.
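To illustrate what the nowrap flag is expected to do, here is a small round trip using the JDK's java.util.zip classes as a stand-in (not jzlib): with nowrap set, the deflater emits a raw deflate stream without the zlib header and checksum, and the inflater must be told the same thing to read it back. The class and helper names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class NowrapSketch {
    /** Compresses input; nowrap=true omits the zlib header and Adler-32 trailer. */
    static byte[] deflate(byte[] input, boolean nowrap) {
        Deflater d = new Deflater(Deflater.DEFAULT_COMPRESSION, nowrap);
        d.setInput(input);
        d.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!d.finished()) {
            int n = d.deflate(buf);
            out.write(buf, 0, n);
        }
        d.end();
        return out.toByteArray();
    }

    /** Decompresses data; nowrap must match how the data was produced. */
    static byte[] inflate(byte[] data, boolean nowrap) {
        Inflater inf = new Inflater(nowrap);
        inf.setInput(data);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        try {
            while (!inf.finished()) {
                int n = inf.inflate(buf);
                if (n == 0 && (inf.needsInput() || inf.needsDictionary())) {
                    throw new IllegalStateException("truncated or dictionary-dependent input");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inf.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] text = "hello hello hello".getBytes(StandardCharsets.UTF_8);
        byte[] raw = deflate(text, true);
        System.out.println(new String(inflate(raw, true), StandardCharsets.UTF_8));
    }
}
```

A stream wrapper that accepts a nowrap argument would need to pass it through to the underlying inflater in the same way for headerless data to decode.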
In com.jcraft.jzlib.JZlib:
// compression strategy
static final public int Z_FILTERED=1;
static final public int Z_HUFFMAN_ONLY=2;
static final public int Z_DEFAULT_STRATEGY=0;
zlib defines more strategies (Z_RLE and Z_FIXED).
Is it possible to add the missing ones?
JZlib 1.1.3 currently fails to detect the zlib error "invalid distance too far back", which can lead to corrupt results when inflating, without an exception being thrown. This specifically seems to happen with Inflaters created with a "nowrap == true" parameter that also require a custom dictionary.
The bug seems to result from an omitted distance check in the method "inflate_fast" in the class "InfCodes.java". The pertinent code currently reads:
...
// copy all or what's left
if(q-r>0 && c>(q-r)){
  do{
    s.window[q++] = s.window[r++];
  }
  while(--c!=0);
}
else {
  System.arraycopy(s.window, r, s.window, q, c);
  q += c;
  r += c;
  c=0;
}
break;
...
Changing this code to something like the following seems to fix the issue:
...
// copy all or what's left
if(q-r>0 && c>(q-r)){
  do{
    s.window[q++] = s.window[r++];
  }
  while(--c!=0);
}
else {
  if (q - r < 0) {
    mode = BADCODE;
    z.msg = "invalid distance too far back";
    return Z_DATA_ERROR;
  }
  System.arraycopy(s.window, r, s.window, q, c);
  q += c;
  r += c;
  c=0;
}
break;
...
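The general idea behind such a check can be sketched independently of jzlib's circular window. The example below is a simplified illustration, assuming a linear output buffer with no preset dictionary: a back-reference is rejected when its distance reaches before the start of the data produced so far. The class and method names are hypothetical.

```java
final class MatchCopySketch {
    /**
     * Copies len bytes starting dist bytes behind pos, writing at pos.
     * Returns the new write position. Rejects distances that point
     * before the start of the buffer.
     */
    static int copyMatch(byte[] out, int pos, int dist, int len) {
        if (dist <= 0 || dist > pos) {
            throw new IllegalStateException("invalid distance too far back");
        }
        if (pos + len > out.length) {
            throw new IllegalStateException("output buffer too small");
        }
        // Byte-by-byte on purpose: source and destination may overlap
        // when dist < len, which is legal in deflate.
        for (int i = 0; i < len; i++) {
            out[pos] = out[pos - dist];
            pos++;
        }
        return pos;
    }

    public static void main(String[] args) {
        byte[] out = new byte[9];
        out[0] = 'a';
        out[1] = 'b';
        out[2] = 'c';
        int end = copyMatch(out, 3, 3, 6);
        System.out.println(new String(out, 0, end, java.nio.charset.StandardCharsets.US_ASCII));
    }
}
```

With a preset dictionary, a real decoder also has to accept distances that reach into the dictionary, so the bound would be the amount of history available rather than `pos` alone.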
Atsuhiko Yamanaka:
Currently Netty uses a custom build of jzlib with some fixes:
https://github.com/netty/netty/tree/master/common/src/main/java/io/netty/util/internal/jzlib
Do you think you would be able to accept a pull request that incorporates these fixes
and release an updated jzlib to Maven Central?
Thank you.
Andrei.
Hello,
We encountered an error when trying to migrate our app from an old version of jzlib (1.0.2) to the latest one (1.1.3): the latest version is unable to parse some inputs that are parsed correctly by both the old version of jzlib and the native Java implementation of zlib.
Here is a sample of such input:
private static final byte[] INVALID = {120, -38, 98, 102, -30, 112, 76, 78, -50, 47, -51, 43, 41, 102, 100, 96, 96, -80, -25, -11, 77, 45, 46, 78, 76, 79, 117, 78, -51, 43, 73, 45, 98, 100, -80, -29, -25, 101, -51, -55, 79, -49, -52, 99, 100, 20, 41, 73, 73, -116, -49, -51, 79, -118, 79, -52, 75, 41, -54, -49, 76, -47, 51, 51, -41, -77, -112, 116, -124, 112, 20, -126, 93, -68, 21, -110, 74, 51, 115, 74, 20, -46, -14, -117, 20, 42, 44, -52, 88, 77, -12, 76, -12, -116, -22, -21, -103, 67, -4, -125, -21, 89, 83, -117, -77, -13, -117, 88, 10, -127, 96, 127, 123, 125, 61, 64};
public void testJzlib() throws IOException {
  final byte[] source = INVALID;
  ByteArrayInputStream input = new ByteArrayInputStream(source);
  InputStream stream = new com.jcraft.jzlib.ZInputStream(input);
  byte[] bytes = new byte[512];
  int read = stream.read(bytes);
  System.out.println("Read: " + read);
  System.out.println("Bytes: " + Arrays.toString(bytes));
}
public void testNative() throws IOException {
  final byte[] source = INVALID;
  ByteArrayInputStream input = new ByteArrayInputStream(source);
  InputStream stream = new java.util.zip.InflaterInputStream(input);
  byte[] bytes = new byte[512];
  int read = stream.read(bytes);
  System.out.println("Read: " + read);
  System.out.println("Bytes: " + Arrays.toString(bytes));
}
After running this code on jzlib 1.0.2 we got 118 bytes read (the same result as for native Java). But if we switch to jzlib 1.1.3, ZInputStream throws an EOFException (even though it reads the data correctly):
java.io.EOFException: Unexpected end of ZLIB input stream
at com.jcraft.jzlib.InflaterInputStream.fill(InflaterInputStream.java:186)
at com.jcraft.jzlib.InflaterInputStream.read(InflaterInputStream.java:106)
at com.jcraft.jzlib.ZInputStream.read(ZInputStream.java:92)
at java.io.FilterInputStream.read(FilterInputStream.java:107)
According to zlib manual:
The windowBits parameter is the base two logarithm of the window size (the size of the history buffer). It should be in the range 8..15 for this version of the library.
However, jzlib only supports the range 9..15, as in these lines:
jzlib/src/main/java/com/jcraft/jzlib/Deflater.java
Lines 109 to 111 in a21be20
If windowBits is 8, it returns an error, which is incorrect.
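For reference, the relationship between windowBits, the window size, and the first byte of a zlib stream can be sketched as follows. This is an illustration of the range quoted from the zlib manual and of the CMF byte layout in RFC 1950, not jzlib code; the class and method names are hypothetical.

```java
final class WindowBitsSketch {
    /** Returns the history window size in bytes for a valid windowBits. */
    static int windowSize(int windowBits) {
        if (windowBits < 8 || windowBits > 15) {
            throw new IllegalArgumentException("windowBits must be 8..15: " + windowBits);
        }
        return 1 << windowBits;
    }

    /**
     * Returns the zlib CMF header byte: low nibble is the method (8 = deflate),
     * high nibble is CINFO = windowBits - 8.
     */
    static int cmfByte(int windowBits) {
        windowSize(windowBits); // reuse the range check
        return ((windowBits - 8) << 4) | 8;
    }

    public static void main(String[] args) {
        System.out.printf("windowBits=8  size=%d cmf=0x%02x%n", windowSize(8), cmfByte(8));
        System.out.printf("windowBits=15 size=%d cmf=0x%02x%n", windowSize(15), cmfByte(15));
    }
}
```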
Please see https://code.google.com/p/android/issues/detail?id=63873 because JZlib seems to have the same issue.
For jruby/jruby#4835 we are trying to modularize JRuby. This requires most dependencies to also be modularized.
We are willing to help jzlib add at least an automatic module name, or a full module-info. We are waiting to hear the status of the library and whether it should be moved to a new maintainer.
src/main/java/com/jcraft/jzlib/Deflate.java:1292: error: [IdentityBinaryExpression] A binary expression where both operands are the same is usually incorrect; the value of this expression is equivalent to window[++scan] == window[++match].
} while (window[++scan] == window[++match] &&
^
(see http://errorprone.info/bugpattern/IdentityBinaryExpression)
src/main/java/com/jcraft/jzlib/Deflate.java:1703: error: [SelfAssignment] Variable assigned to itself
dest.d_buf = dest.d_buf;
^
(see http://errorprone.info/bugpattern/SelfAssignment)
Did you mean to remove this line?
Are these actual issues that should be fixed? Happy to submit a PR.
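On the first warning, it may help to note that the repeated operands have side effects: each `++scan` and `++match` advances an index, so two textually identical comparisons joined by `&&` look at different array positions. The sketch below is a hypothetical, reduced version of that pattern, not the code from Deflate.java, and only shows that the repeated expression is not equivalent to a single one.

```java
final class UnrolledCompareSketch {
    /**
     * Evaluates the same pre-increment comparison twice.
     * Returns {result (1 or 0), final scan, final match}.
     */
    static int[] compareTwice(byte[] w, int scan, int match) {
        boolean both = w[++scan] == w[++match]
                && w[++scan] == w[++match];
        return new int[] {both ? 1 : 0, scan, match};
    }

    public static void main(String[] args) {
        byte[] w = "xabxac".getBytes(java.nio.charset.StandardCharsets.US_ASCII);
        // First pair ('a','a') matches, second pair ('b','c') does not.
        int[] r = compareTwice(w, 0, 3);
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```

Under that reading the first report would be a false positive, while the self-assignment is a separate question.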
We are planning to use JZlib in our project. I wanted to know the project status as there is not much activity on issues or code updates. Can you let me know the status of this project?
Filed upstream as JENKINS-19473. We switched to JZlib to avoid an apparent livelock in java.util.zip. Unfortunately this has caused errors when sending certain (large) files over a GZip-compressed stream using JZlib 1.1.2 (or 1.1.1).
The root issue appears to be this exception:
java.lang.ArrayIndexOutOfBoundsException: 677
at com.jcraft.jzlib.Tree.d_code(Tree.java:149)
at com.jcraft.jzlib.Deflate.compress_block(Deflate.java:696)
at com.jcraft.jzlib.Deflate._tr_flush_block(Deflate.java:902)
at com.jcraft.jzlib.Deflate.flush_block_only(Deflate.java:777)
at com.jcraft.jzlib.Deflate.deflate_slow(Deflate.java:1200)
at com.jcraft.jzlib.Deflate.deflate(Deflate.java:1586)
at com.jcraft.jzlib.Deflater.deflate(Deflater.java:140)
at com.jcraft.jzlib.DeflaterOutputStream.deflate(DeflaterOutputStream.java:129)
at com.jcraft.jzlib.DeflaterOutputStream.write(DeflaterOutputStream.java:102)
at com.jcraft.jzlib.DeflaterOutputStream.write(DeflaterOutputStream.java:85)
dist >= 0x8000 causes this because of _dist_code[256+((dist)>>>7)], but the definition of that distance
((pending_buf[d_buf+lx*2]<<8)&0xff00)|
(pending_buf[d_buf+lx*2+1]&0xff)
could presumably produce values up to 0xffff. Somehow that does not happen normally, only for certain files.
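The arithmetic can be checked in isolation. The sketch below reproduces the two expressions quoted above, assuming the lookup table has 512 entries as zlib's distance-code table does; the class and method names are hypothetical. It shows that a distance of 0x8000 or more yields an index of at least 512, and that an index of 677 corresponds to a distance around 53888.

```java
final class DistCodeIndexSketch {
    // Assumed table size (zlib's DIST_CODE_LEN).
    static final int DIST_CODE_LEN = 512;

    /** Rebuilds a distance from two buffer bytes, as in the quoted expression. */
    static int decodeDist(byte hi, byte lo) {
        return ((hi << 8) & 0xff00) | (lo & 0xff);
    }

    /** Index into the distance-code table for a given distance. */
    static int tableIndex(int dist) {
        return dist < 256 ? dist : 256 + (dist >>> 7);
    }

    static boolean inBounds(int dist) {
        return tableIndex(dist) < DIST_CODE_LEN;
    }

    public static void main(String[] args) {
        System.out.println(tableIndex(0x7fff) + " " + inBounds(0x7fff));
        System.out.println(tableIndex(0x8000) + " " + inBounds(0x8000));
        System.out.println(decodeDist((byte) 0xff, (byte) 0xff));
    }
}
```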
The originally reported bug is actually a different stack trace:
java.lang.ArrayIndexOutOfBoundsException: 65536
at com.jcraft.jzlib.Deflate._tr_tally(Deflate.java:635)
at com.jcraft.jzlib.Deflate.deflate_slow(Deflate.java:1177)
at com.jcraft.jzlib.Deflate.deflate(Deflate.java:1586)
at com.jcraft.jzlib.Deflater.deflate(Deflater.java:140)
at com.jcraft.jzlib.DeflaterOutputStream.deflate(DeflaterOutputStream.java:129)
at com.jcraft.jzlib.DeflaterOutputStream.write(DeflaterOutputStream.java:102)
which I can reproduce inside Jenkins but not in a standalone test case (perhaps due to differences in buffering?). In this code
pending_buf[l_buf+last_lit] = (byte)lc;
pending_buf is of length 0x10000, yet l_buf is 0xc000 and last_lit is 0x4000, so the index is 0x10000 (65536), one past the end of the array.
The bug only seems to affect certain highly repetitive, large files. For example, ubuntu-13.04-server-amd64.iso at 702 MB ran through fine as long as I let it go (for several minutes).
I am able to reproduce the problem in the form of a JUnit test (in Java, sorry, not brushing up on Scala just for this!):
package com.jcraft.jzlib;

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.util.zip.CheckedOutputStream;
import java.util.zip.Checksum;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;
import org.junit.Test;
import static org.junit.Assert.*;

public class DeflateTest {
  @Test public void JENKINS_19473() throws Exception {
    PipedOutputStream pos = new PipedOutputStream();
    InputStream pis = new PipedInputStream(pos);
    Checksum csOut = new CRC32();
    OutputStream gos = new GZIPOutputStream(pos);
    final OutputStream cos = new CheckedOutputStream(gos, csOut);
    Thread t = new Thread() {
      @Override public void run() {
        try {
          InputStream fis = new FileInputStream("…/jzlib.fail");
          try {
            int c;
            while ((c = fis.read()) != -1) {
              cos.write(c);
            }
          } finally {
            fis.close();
          }
          cos.close();
        } catch (IOException x) {
          x.printStackTrace();
        }
      }
    };
    t.start();
    InputStream gis = new GZIPInputStream(pis);
    Checksum csIn = new CRC32();
    InputStream cis = new CheckedInputStream(gis, csIn);
    while (cis.read() != -1) {/* discard */}
    t.join();
    assertEquals(csOut.getValue(), csIn.getValue());
  }
}
The test file is available in compressed form here. (Use gunzip before trying to use.)