Comments (10)

dg3feiko commented on June 22, 2024

This is the schema parquet.js generates for a record that contains a list of elements. The record:

{
  mylist: [{"foo":"abc", "bar":"abc"}, {"foo":"abc", "bar":"abc"}]
}

The generated schema:

message root {
  repeated group mylist {
    required binary foo (UTF8);
    required binary bar (UTF8);
  }
}

The schema PrestoDB/Hive expects is:

message root {
  required group mylist (LIST) {
    repeated group list {
      required group element {
        required binary foo (UTF8);
        required binary bar (UTF8);
      }
    }
  }
}
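
To make the difference between the two layouts concrete, here is a minimal sketch of how the sample record would have to be nested to line up with the three-level schema. `toThreeLevelList` is a hypothetical helper written for this illustration, not a parquetjs function, and reshaping the data does not by itself add the LIST annotation; it only shows the nesting.

```javascript
// Hypothetical helper (not part of parquetjs): wraps a plain array in the
// { list: [ { element: ... } ] } shape that mirrors the three-level LIST schema.
function toThreeLevelList(items) {
  return { list: items.map((element) => ({ element })) };
}

// The sample record from above, in its flat "repeated group" form.
const record = {
  mylist: [
    { foo: 'abc', bar: 'abc' },
    { foo: 'abc', bar: 'abc' },
  ],
};

// Same data, reshaped so each row sits under mylist.list[i].element.
const reshaped = { ...record, mylist: toThreeLevelList(record.mylist) };

console.log(JSON.stringify(reshaped, null, 2));
```

Each original list entry ends up one level deeper, under `list` and then `element`, which is the extra structure the expected schema describes.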

from parquetjs.

shyim commented on June 22, 2024

Hey @dg3feiko,
have you found a working solution for that problem?

ZJONSSON commented on June 22, 2024

@shyim @dg3feiko Did you check out #67? It might be related.

shyim commented on June 22, 2024

I installed your version as mentioned in the comment, with

npm install zjonsson/parquetjs#07fb2fd8fc03bf2b57243531eaf91f2d60f5e460

I generated new files and copied them to the S3 bucket, but I still have problems with the Athena query.

ZJONSSON commented on June 22, 2024

There is also #43. You could try installing a fork that has all my outstanding PRs here merged to master (including #43):

npm install zjonsson/parquetjs

shyim commented on June 22, 2024

With your latest fork I can select simple fields in the first tier, but when I select a struct, Athena crashes with the message: HIVE_CURSOR_ERROR: Can not read value at 0 in block 0

bwisitero commented on June 22, 2024

I used 0.8.0 to convert a flat JSON file to Parquet and verified that I can write it and read it back. I uploaded it to S3 and used Glue to create the Athena table. For some reason I'm unable to query the data, though; I get GENERIC_INTERNAL_ERROR: 0
Is anybody else using this converter for Athena?

justinsoliz commented on June 22, 2024

I gave this a try recently in AWS with Athena + Presto using the latest from zjonsson/parquetjs.

Root level primitives worked but nested lists failed:

Expected LIST column column to only have one field, but has x fields
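
That message points at the rule the expected schema follows: a group annotated as LIST should contain exactly one child field, the repeated group. Below is a rough sketch of such a check, assuming a made-up plain-object description of a schema node (`name`, `repetition`, `annotation`, `fields`). Neither that shape nor `checkListGroup` comes from parquetjs or Presto; they are only here to illustrate the rule.

```javascript
// Hypothetical schema-node shape, invented for this sketch:
//   { name, repetition, annotation?, fields? }
// Returns a list of human-readable problems; an empty list means the node passes.
function checkListGroup(node) {
  if (node.annotation !== 'LIST') return [];
  const fields = node.fields || [];
  if (fields.length !== 1) {
    return [`LIST column ${node.name} should have one field, but has ${fields.length}`];
  }
  if (fields[0].repetition !== 'repeated') {
    return [`LIST column ${node.name}: the single inner field should be repeated`];
  }
  return [];
}

// A LIST group holding the struct fields directly (two children): fails the check.
const flatList = {
  name: 'mylist',
  repetition: 'required',
  annotation: 'LIST',
  fields: [
    { name: 'foo', repetition: 'required' },
    { name: 'bar', repetition: 'required' },
  ],
};

// The three-level layout: one repeated child that wraps the element group.
const threeLevelList = {
  name: 'mylist',
  repetition: 'required',
  annotation: 'LIST',
  fields: [
    {
      name: 'list',
      repetition: 'repeated',
      fields: [{ name: 'element', repetition: 'required' }],
    },
  ],
};

console.log(checkListGroup(flatList));
console.log(checkListGroup(threeLevelList));
```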

gbassan-br commented on June 22, 2024

+1
Does anyone have an answer?

ZJONSSON commented on June 22, 2024

I encountered the same issue and spent some time getting it to work. Here is a solution that seems to work, at least for my case of lists with structs: ZJONSSON#34
A test case going from parquetjs to Athena can be found here: https://github.com/ZJONSSON/parquetjs/blob/9cee1592ce41e8dbca088fa2330b48ceb2d1de1a/test/list.js
