femtozip's People

Contributors

ehrmann, gtoubassi, mmank, tmthrgd

femtozip's Issues

Java Femtozip gets into a nasty infinite loop on corrupt input data

I have a slightly proprietary example I can give you (contact me at ehrmann+1923 <at> gmail).

I compressed a byte array, Base64-encoded the result, converted the encoded string to lower case, decoded it back to a byte array, and then tried to decompress it. compressionModel.decompress(data) took much longer than it should have, and then the JVM ran out of memory. Femtozip may be correctly decoding what becomes a massive byte array, but it could also be a bug.

A nice workaround might be to have a maxExpectedSize parameter on decompress to guard against this.
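
A minimal sketch of the guard idea, written in C++ here although the report concerns the Java library. The class name BoundedSink and the maxExpectedSize limit are hypothetical, and nothing below is femtozip API. The assumption is that a decoder would write its output through such a sink, so a corrupt stream fails fast instead of exhausting memory.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical output buffer that refuses to grow past a caller-supplied limit.
class BoundedSink {
public:
    explicit BoundedSink(std::size_t maxExpectedSize) : limit_(maxExpectedSize) {}

    // Appends len bytes, or throws std::length_error (leaving the buffer
    // unchanged) if the total size would exceed the limit.
    void write(const char *data, std::size_t len) {
        // out_.size() never exceeds limit_, so the subtraction cannot underflow.
        if (len > limit_ - out_.size()) {
            throw std::length_error("decompressed output exceeds maxExpectedSize");
        }
        out_.insert(out_.end(), data, data + len);
    }

    const std::vector<char> &bytes() const { return out_; }

private:
    std::size_t limit_;
    std::vector<char> out_;
};
```

A caller that knows its payloads are small would pick a limit a little above the largest legitimate size and treat the exception as "input is corrupt".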

Unable to use Femtozip on fresh Ubuntu install

Hello,

After installing the required dependencies on a fresh Ubuntu install, I get this error when I try to use the fzip command-line tool:

fzip: error while loading shared libraries: libfzip.so.0: cannot open shared object file: No such file or directory

C++ library and PHP PECL

Could not find an email address to contact "gtoubassi" so leaving a note here.

  1. The wiki mentions it's in use in production at a PHP site, but there is no PHP module. Is this built but not released?

  2. I'm not a C/C++ programmer and tried to tackle creating a PHP PECL extension but ran into the following problems:

A) The C++ library has no error checking or exception handling for all error cases.
B) Throws error on invalid model file (passing a non-existing file). This was fixed by checking file.good().
C) Throws error on broken model file (passing a file but not a valid model). I could not catch the error in C++ using try/catch so not sure what's going on. Again, I have little experience in C++.
D) Throws error on decompression of a string that is not a valid femtozip compressed string.

My half-baked PHP PECL module works only in a perfect environment where everything passed to it is valid. I would like it to handle real-world usage, where bad data is thrown at it, and to recover rather than crash.

Would love to have femtozip working in production and need some assistance.
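
For item B, the file.good() check mentioned above might look like the following sketch. The helper name read_model_bytes is hypothetical and not part of femtozip. It only covers a missing or unreadable file. A file that opens but is not a valid model (item C) would need validation inside the library's own loader, and a crash such as a segmentation fault is not a C++ exception, which may be why try/catch did not help.

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: reads a whole file into `out`.
// Returns false instead of throwing if the file cannot be opened or read.
inline bool read_model_bytes(const std::string &path, std::vector<char> &out) {
    std::ifstream file(path, std::ios::binary);
    if (!file.good()) {
        return false;  // missing or unreadable file
    }
    out.assign(std::istreambuf_iterator<char>(file),
               std::istreambuf_iterator<char>());
    return !file.bad();
}
```

An extension could call this first and report a normal PHP error when it returns false, before any bytes reach the model-loading code.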

CPP Femtozip Infinite Loop in IntSet.h

When provided with a large quantity of files/data, femtozip gets into an infinite loop. This issue can be tracked down to the insert(int, int*, int*, size_t) method in IntSet.h.

inline int insert(int n, int *b, int *end, size_t capacity) {
    int *p = b + (n % capacity);
    while (*p != -1) {
        if (*p == n) {
            return 0;
        }
        p++;
        if (p == end) {
            p = b;
        }
    }  
    *p = n;
    return 1;
}

The loop while (*p != -1) never terminates because *p is never equal to -1 or n.
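
The probe can only stop at an empty slot (-1) or at a slot that already holds n. If every slot in [b, end) holds some other value, neither condition can occur and the pointer wraps around forever. Whether a full table is what happens with large inputs is not confirmed in this report. A minimal sketch of a bounded variant, assuming end == b + capacity and that -1 marks an empty slot as in the snippet above; the name insert_bounded and the -1 "table full" return value are hypothetical:

```cpp
#include <cstddef>

// Hypothetical bounded variant of the insert() shown above.
// Returns 1 if n was inserted, 0 if n was already present, and -1 if all
// `capacity` slots were probed without finding n or an empty slot.
// Assumes end == b + capacity and that -1 marks an empty slot.
inline int insert_bounded(int n, int *b, int *end, std::size_t capacity) {
    if (capacity == 0) {
        return -1;
    }
    // The explicit cast matches the implicit int-to-size_t conversion that
    // happens in the original expression n % capacity.
    int *p = b + (static_cast<std::size_t>(n) % capacity);
    for (std::size_t probes = 0; probes < capacity; ++probes) {
        if (*p == -1) {
            *p = n;
            return 1;
        }
        if (*p == n) {
            return 0;
        }
        if (++p == end) {
            p = b;  // wrap around to the start of the table
        }
    }
    return -1;  // every slot holds some other value
}
```

With this shape the caller can react to -1 by growing and rehashing the table instead of spinning.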

Java generated CompressionModel in C++ Library

Hi,

first of all: Kudos go to the library developers.

I stumbled upon the following issue: I'm trying to load a CompressionModel that I saved with the Java library into the C++ library. This doesn't seem to work. I guess that's because Java stores bytes differently than C++.

Any hints how I can make this work?

Thanks a lot,
Simon

Unable to create sdch dictionary by "fzip ...."

Hi, after building FemtoZip I tried to create an SDCH dictionary with "fzip --model /tmp/sdch-data/dict --build --dictonly --maxdict 96000 /tmp/sdch-data/train". Unfortunately, it failed with "Segmentation fault (core dumped)". The debugger output is as follows:
Program terminated with signal 11, Segmentation fault.
#0 0x00007f9ef217f41f in bsarray (buf=0x0, p=0x17ba1a0, n=0) at sarray.c:155
155 c = buf[n-1] << 8;
I don't know where things go wrong. Can you help me? Thanks a lot.

FemtoZipCompressionModel.encodeSubstring encodes a four-byte offset

Why does this method encode a four-byte offset when offset is only two bytes?

public void encodeSubstring(int offset, int length, Object context) {
        ...
        offset = -offset;
        if (offset < 1 || offset > (2<<15)-1) {
            throw new IllegalArgumentException("Offset " + offset + " out of range [1, 65535]");
        }
        encoder.encodeSymbol(offset & 0xf);
        encoder.encodeSymbol((offset >> 4) & 0xf);
        encoder.encodeSymbol((offset >> 8) & 0xf);
        encoder.encodeSymbol((offset >> 12) & 0xf);
        ...
}
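
Each encodeSymbol call in the snippet receives a value masked with 0xf, which is a 4-bit nibble, so the four calls together carry 4 x 4 = 16 bits. That matches a two-byte offset rather than a four-byte one, and (2<<15)-1 evaluates to 65535, the largest 16-bit value. How the encoder treats each symbol afterwards is not shown in this issue. A minimal C++ sketch of the same split and its inverse, with hypothetical function names:

```cpp
#include <array>
#include <cstdint>

// Splits a 16-bit offset into four 4-bit symbols, least-significant nibble
// first, mirroring the masks and shifts in the Java snippet above.
inline std::array<std::uint8_t, 4> split_nibbles(std::uint16_t offset) {
    return std::array<std::uint8_t, 4>{{
        static_cast<std::uint8_t>(offset & 0xf),
        static_cast<std::uint8_t>((offset >> 4) & 0xf),
        static_cast<std::uint8_t>((offset >> 8) & 0xf),
        static_cast<std::uint8_t>((offset >> 12) & 0xf)}};
}

// Reassembles the 16-bit offset from the four nibbles.
inline std::uint16_t join_nibbles(const std::array<std::uint8_t, 4> &s) {
    return static_cast<std::uint16_t>(s[0] | (s[1] << 4) | (s[2] << 8) |
                                      (s[3] << 12));
}
```

For example, the offset 0xABCD splits into the symbols 0xD, 0xC, 0xB, 0xA and joins back to 0xABCD.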

Maven build

For build and unit testing purposes, I migrated the build of the Java project from Ant to Maven.

Would you accept a PR?
