pjebs commented on May 14, 2024

Can you send me the file?

from dataframe-go.

tanyaofei commented on May 14, 2024

1.parquet.zip

pjebs commented on May 14, 2024

Can you create the DataFrame with this package, export it to Parquet, and then try to import it back?

tanyaofei commented on May 14, 2024

I tried that first. The result looks like a broken Parquet file with the content "PAR1":

func main() {
    df := dataframe.NewDataFrame(dataframe.NewSeriesString("A", nil, []string{"1", "2", "3"}))
    file, _ := os.Create("1.parquet")
    _ = exports.ExportToParquet(context.Background(), file, df)
}

pjebs commented on May 14, 2024

A Parquet file is not text-based. Can you try importing the file back?
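
For context, seeing "PAR1" in the file is expected rather than a sign of corruption: the Parquet format places the 4-byte marker "PAR1" at both the start and the end of every file. The following is a minimal sketch of a sanity check based on that fact. The function name hasParquetMagic and the sample bytes are hypothetical, and passing this check does not prove that a file is otherwise valid.

```go
package main

import (
	"bytes"
	"fmt"
)

// magic is the 4-byte marker found at the start and end of a Parquet file.
var magic = []byte("PAR1")

// hasParquetMagic reports whether data begins and ends with "PAR1".
// It is only a sanity check: the footer metadata is not inspected.
func hasParquetMagic(data []byte) bool {
	// Smallest possible layout: leading magic, 4-byte footer length, trailing magic.
	if len(data) < 12 {
		return false
	}
	return bytes.HasPrefix(data, magic) && bytes.HasSuffix(data, magic)
}

func main() {
	// Synthetic bytes for illustration only; this is not a real Parquet file.
	sample := append([]byte("PAR1"), make([]byte, 8)...)
	sample = append(sample, magic...)

	fmt.Println(hasParquetMagic(sample))                // true
	fmt.Println(hasParquetMagic([]byte("A\n1\n2\n3\n"))) // false
}
```

To check a file on disk, the same function can be applied to the bytes returned by os.ReadFile.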

tanyaofei commented on May 14, 2024

    df := dataframe.NewDataFrame(dataframe.NewSeriesString("A", nil, []string{"1", "2", "3"}))
    file, _ := os.Create("1.parquet")
    _ = exports.ExportToParquet(context.Background(), file, df)

    fr, _ := local.NewLocalFileReader("1.parquet")
    df, err := imports.LoadFromParquet(context.Background(), fr)
    if err != nil {
        panic(err)
    }
    fmt.Println(df)

OUTPUT:

panic: seek 1.parquet: invalid argument

goroutine 1 [running]:
main.main()
        .../main.go:21 +0x465
Exiting.

Error at imports/parquet.go, line 40: pr, err := reader.NewParquetReader(src, nil, int64(runtime.NumCPU()))

tanyaofei commented on May 14, 2024

My parquet-go version is v1.6.2: github.com/xitongsys/parquet-go v1.6.2

pjebs commented on May 14, 2024

I tried opening your file and it worked:

package main

import (
	"context"
	"fmt"

	"github.com/rocketlaunchr/dataframe-go/imports"
	"github.com/xitongsys/parquet-go-source/local"
)

var ctx = context.Background()

func main() {
	fr, _ := local.NewLocalFileReader("1.parquet")
	defer fr.Close()

	df, err := imports.LoadFromParquet(ctx, fr)
	if err != nil {
		panic(err)
	}

	fmt.Println(df)
}

OUTPUT:

+-----+--------+-------+---------+
|     |   A    |   B   |    C    |
+-----+--------+-------+---------+
| 0:  |   a    |   2   |   10    |
| 1:  |   b    |   3   |   20    |
| 2:  |   c    |   4   |   NaN   |
| 3:  |   d    |   1   |   NaN   |
+-----+--------+-------+---------+
| 4X3 | STRING | INT64 | FLOAT64 |
+-----+--------+-------+---------+

tanyaofei commented on May 14, 2024

Can you tell me your parquet-go version?

pjebs commented on May 14, 2024

module main

go 1.18

require (
	github.com/rocketlaunchr/dataframe-go v0.0.0-00010101000000-000000000000
	github.com/xitongsys/parquet-go-source v0.0.0-20200509081216-8db33acb0acf
)

require (
	github.com/apache/thrift v0.0.0-20181112125854-24918abba929 // indirect
	github.com/goccy/go-json v0.7.6 // indirect
	github.com/golang/snappy v0.0.0-20180518054509-2e65f85255db // indirect
	github.com/google/go-cmp v0.4.0 // indirect
	github.com/guptarohit/asciigraph v0.5.1 // indirect
	github.com/juju/clock v0.0.0-20190205081909-9c5c9712527c // indirect
	github.com/juju/errors v0.0.0-20200330140219-3fe23663418f // indirect
	github.com/juju/loggo v0.0.0-20200526014432-9ce3a2e09b5e // indirect
	github.com/juju/utils/v2 v2.0.0-20200923005554-4646bfea2ef1 // indirect
	github.com/klauspost/compress v1.9.7 // indirect
	github.com/mattn/go-runewidth v0.0.7 // indirect
	github.com/olekukonko/tablewriter v0.0.4 // indirect
	github.com/rocketlaunchr/mysql-go v1.1.3 // indirect
	github.com/xitongsys/parquet-go v1.5.2 // indirect
	golang.org/x/crypto v0.0.0-20200820211705-5c72a883971a // indirect
	golang.org/x/exp v0.0.0-20200331195152-e8c3332aa8e5 // indirect
	golang.org/x/net v0.0.0-20200904194848-62affa334b73 // indirect
	golang.org/x/sync v0.0.0-20200317015054-43a5402ce75a // indirect
	gopkg.in/yaml.v2 v2.3.0 // indirect
)

tanyaofei commented on May 14, 2024

With github.com/apache/thrift v0.0.0-20181112125854-24918abba929 and github.com/xitongsys/parquet-go v1.5.2, it works.

pjebs commented on May 14, 2024

In the release notes:

[v1.6.0](https://github.com/xitongsys/parquet-go/releases/tag/v1.6.0)
Big changes in the type. Not compatiable with before.

I may need to update this package to use v1.6+ instead of v1.5.

I have no idea why it is not using v1.5 for you, since that version is registered in the go.mod file.

tanyaofei commented on May 14, 2024

v1.5 works fine. Maybe I installed parquet-go before installing dataframe-go; I'm not sure.

tanyaofei commented on May 14, 2024

It seems the problem is solved, so I should close this issue.

pjebs commented on May 14, 2024

Maybe you directly imported "github.com/rocketlaunchr/dataframe-go/imports" without importing "github.com/rocketlaunchr/dataframe-go". Since there is no go.mod file inside the github.com/rocketlaunchr/dataframe-go/imports directory, Go just downloaded and used the latest version of parquet-go.

tanyaofei commented on May 14, 2024

Here are my shell records:

➜  go get -u github.com/rocketlaunchr/dataframe-go
go: downloading github.com/rocketlaunchr/dataframe-go v0.0.0-20211025052708-a1030444159b
go: downloading golang.org/x/exp v0.0.0-20200331195152-e8c3332aa8e5
go: downloading github.com/google/go-cmp v0.4.0
go: downloading github.com/guptarohit/asciigraph v0.5.1
go: downloading github.com/olekukonko/tablewriter v0.0.4
go: downloading golang.org/x/sync v0.0.0-20200317015054-43a5402ce75a
go: downloading github.com/olekukonko/tablewriter v0.0.5
go: downloading github.com/google/go-cmp v0.5.7
go: downloading github.com/mattn/go-runewidth v0.0.7
go: downloading github.com/mattn/go-runewidth v0.0.13
go: downloading golang.org/x/exp v0.0.0-20220328175248-053ad81199eb
go: downloading github.com/guptarohit/asciigraph v0.5.3
go: downloading github.com/rivo/uniseg v0.2.0
go: added github.com/google/go-cmp v0.5.7
go: added github.com/guptarohit/asciigraph v0.5.3
go: added github.com/mattn/go-runewidth v0.0.13
go: added github.com/olekukonko/tablewriter v0.0.5
go: added github.com/rivo/uniseg v0.2.0
go: added github.com/rocketlaunchr/dataframe-go v0.0.0-20211025052708-a1030444159b
go: added golang.org/x/exp v0.0.0-20220328175248-053ad81199eb
go: added golang.org/x/sync v0.0.0-20210220032951-036812b2e83c
➜  go get -u github.com/xitongsys/parquet-go/parquet                                     
go: downloading github.com/apache/thrift v0.16.0
go: upgraded github.com/apache/thrift v0.0.0-20181112125854-24918abba929 => v0.16.0
go: upgraded github.com/xitongsys/parquet-go v1.5.2 => v1.6.2
➜  go get -u github.com/xitongsys/parquet-go-source                                       
go: downloading github.com/xitongsys/parquet-go-source v0.0.0-20220315005136-aec0fe3e777c
go: upgraded github.com/xitongsys/parquet-go-source v0.0.0-20200817004010-026bad9b25d0 => v0.0.0-20220315005136-aec0fe3e777c

pjebs commented on May 14, 2024

You shouldn't have run the last two go get commands. Since they don't have a go.mod file, Go just assumed the latest version, hence: go: upgraded github.com/xitongsys/parquet-go v1.5.2 => v1.6.2
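
The effect of such an explicit upgrade can be illustrated with a simplified sketch of Go's minimal version selection: when several go.mod files require the same module, the build uses the highest of the required versions. Once the application's own go.mod records v1.6.2, that version wins over the v1.5.2 that dataframe-go asks for. The helper names parse, less and selectVersion are hypothetical, and pre-release and pseudo-versions are deliberately not handled.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// parse splits a version such as "v1.5.2" into its three numeric parts.
func parse(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("unsupported version %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("unsupported version %q: %w", v, err)
		}
		out[i] = n
	}
	return out, nil
}

// less reports whether version a is lower than version b.
func less(a, b [3]int) bool {
	for i := range a {
		if a[i] != b[i] {
			return a[i] < b[i]
		}
	}
	return false
}

// selectVersion returns the highest version among all requirements.
func selectVersion(required []string) (string, error) {
	if len(required) == 0 {
		return "", errors.New("no requirements")
	}
	best := required[0]
	bestParts, err := parse(best)
	if err != nil {
		return "", err
	}
	for _, v := range required[1:] {
		p, err := parse(v)
		if err != nil {
			return "", err
		}
		if less(bestParts, p) {
			best, bestParts = v, p
		}
	}
	return best, nil
}

func main() {
	// dataframe-go's go.mod asks for v1.5.2; the application's go.mod
	// asks for v1.6.2 after the explicit upgrade.
	v, err := selectVersion([]string{"v1.5.2", "v1.6.2"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("selected:", v) // selected: v1.6.2
}
```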

pjebs commented on May 14, 2024

From Go's point of view, when you do that, it's an unrelated package.

tanyaofei commented on May 14, 2024

Got it, thanks a lot.

chippyash commented on May 14, 2024

Hi, when is this lib going to be upgraded to use parquet-go >= v1.6.2, please? Having to pin v1.5.4 just broke all the tagging I was using, which assumed v1.6.2 :-(

pjebs commented on May 14, 2024

There is a backward-incompatible change in v1.6.2. Therefore I will need to explore it more deeply.

This package's go.mod is set to github.com/xitongsys/parquet-go v1.5.2, so it should work for you provided you don't try to independently go get the "github.com/rocketlaunchr/dataframe-go/imports" package.

Let the main package dictate the dependencies for the sub-packages.
