
parquet's Issues

Update dependencies to fix CVE-2020-13949

In Apache Thrift 0.9.3 to 0.13.0, malicious RPC clients could send short messages which would result in a large memory allocation, potentially leading to denial of service.

The fix is in Thrift 0.14.0.

Nested (not repeated) generated code is invalid

For now, only flat Parquet schemas work correctly.

The generator code that used to emit the logic for writing values into the input struct needs to be restored for fields that are nested but not repeated.

Update go.mod to include dependency versions

This library does not work with the latest Apache Thrift library, which broke its API by adding new arguments to exported functions. As a result, the library cannot be installed with go get, because the older, compatible Thrift version is not selected by default. Pinning the required versions in go.mod (or updating this library to be compatible with the new API while still declaring those versions in go.mod) would make the library immediately usable again.

Concurrent usage

Hi, I have a use case where I'm writing billions of data entries into a file. I couldn't find any information about this in the README, so I wanted to ask whether it's safe to use *ParquetWriter across multiple goroutines to speed up the process, and whether you have any recommendations for doing that. I'm not sure, but perhaps Add could be called concurrently, with Write called every so often depending on the RowGroup size.
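The issue does not say whether *ParquetWriter is safe for concurrent use. Assuming it is not, one common pattern is to let many goroutines produce records while a single goroutine owns the writer. The sketch below shows that fan-in pattern; `rowWriter`, `memWriter`, `fanIn`, and `produce` are hypothetical names, and the `Add`/`Write` signatures are guesses that may not match the library.

```go
package main

import (
	"fmt"
	"sync"
)

// rowWriter is a hypothetical stand-in for the part of *ParquetWriter
// used here; the real method signatures may differ.
type rowWriter interface {
	Add(rec interface{})
	Write() error
}

// memWriter is a fake writer that only counts, so the sketch is runnable.
type memWriter struct {
	pending int // records added since the last Write
	flushed int // records flushed so far
	flushes int // number of Write calls
}

func (m *memWriter) Add(rec interface{}) { m.pending++ }

func (m *memWriter) Write() error {
	m.flushed += m.pending
	m.pending = 0
	m.flushes++
	return nil
}

// fanIn drains recs on the calling goroutine only, so the writer is never
// touched concurrently. It calls Write once per rowGroupSize records and
// once more at the end for any remainder.
// Note: on a Write error it returns early without draining recs, so a real
// implementation would also need to signal producers to stop.
func fanIn(w rowWriter, recs <-chan interface{}, rowGroupSize int) error {
	buffered := 0
	for rec := range recs {
		w.Add(rec)
		buffered++
		if buffered == rowGroupSize {
			if err := w.Write(); err != nil {
				return err
			}
			buffered = 0
		}
	}
	if buffered > 0 {
		return w.Write()
	}
	return nil
}

// produce starts `workers` goroutines that each send perWorker records,
// and closes the channel once all of them are done.
func produce(workers, perWorker int) <-chan interface{} {
	out := make(chan interface{}, 64)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for j := 0; j < perWorker; j++ {
				out <- [2]int{id, j}
			}
		}(i)
	}
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}

func main() {
	w := &memWriter{}
	if err := fanIn(w, produce(4, 250), 100); err != nil {
		fmt.Println("write failed:", err)
		return
	}
	fmt.Println(w.flushed, w.flushes)
}
```

This keeps the expensive per-record work (parsing, transforming) parallel while serializing only the calls into the writer. Whether that helps in practice depends on where the time is actually spent.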
