Comments (13)
I guess the docs could be made clearer by specifying that each key of DictateDataType is a column name (case-sensitive).
from dataframe-go.
Hi @MrPowers,
I can see the issue comes from how your code uses the package.
First of all, I would really recommend you change your variable name from dataframe to something else, maybe df.
Secondly, your dataframe has only 2 series (columns). Indexing starts from 0 (for the first) and goes to 1 (for the second).
So it should be
s := df.Series[1]
from dataframe-go.
So try this code instead:
s := df.Series[1]
applyFn := dataframe.ApplySeriesFn(func(val interface{}, row, nRows int) interface{} {
	return 2 * val.(int64)
})
dataframe.Apply(ctx, s, applyFn, dataframe.FilterOptions{InPlace: true})
fmt.Print(df.Table())
dataframe is supposed to be the name of the imported package.
import (
	dataframe "github.com/rocketlaunchr/dataframe-go"
	"github.com/rocketlaunchr/dataframe-go/imports"
)
from dataframe-go.
@propersam - thanks for the quick response. Really appreciate the help!
I will definitely make sure to get this in the README, so it's easier to follow:
import (
	dataframe "github.com/rocketlaunchr/dataframe-go"
	"github.com/rocketlaunchr/dataframe-go/imports"
)
I have another bug now, but it's separate, so I'll raise another issue. I'll send the README PR right now before I forget.
from dataframe-go.
We are very grateful for your interest in contributing to this project.
I dropped a comment on your PR.
We hope to get more interesting PRs from you in the future.
from dataframe-go.
Since this issue is solved, I guess it's safe to close it now.
from dataframe-go.
When you import, you need to use DictateDataType to specify that the favorite number column is an int64 and not a string. Then in Apply, you can multiply by 2.
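Why the column type matters for the multiplication can be shown with a minimal, standard-library-only sketch. The function `double` below is hypothetical and is not part of dataframe-go; it only mimics the body of an apply function that receives a cell as `interface{}`. A direct assertion such as `val.(int64)` panics when the cell actually holds a string, so the sketch uses the comma-ok form to report the mismatch instead.

```go
package main

import "fmt"

// double is a hypothetical stand-in for the body of an apply function.
// It doubles the cell only when the cell really holds an int64.
func double(val interface{}) (interface{}, error) {
	n, ok := val.(int64) // comma-ok form: no panic on a type mismatch
	if !ok {
		return nil, fmt.Errorf("expected int64, got %T", val)
	}
	return 2 * n, nil
}

func main() {
	// A cell from a column that was loaded as int64.
	fmt.Println(double(int64(21)))

	// A cell from a column that was loaded as string: the assertion fails.
	fmt.Println(double("21"))
}
```

If the column is loaded as a string, every call takes the error path, which is why the type has to be fixed at import time rather than inside the apply function.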
from dataframe-go.
@pjebs - thanks for the hint. I see DictateDataType in the code here.
Can you please provide me with a code snippet?
Let me know if you're open to updating this code snippet with DictateDataType so we can show users how to get the column types correct.
Might be cool to add some schema inference :) That's what's used a lot in the Spark / Scala world.
from dataframe-go.
Code snippet:
df, err := imports.LoadFromCSV(ctx, csvfile, imports.CSVLoadOptions{
	DictateDataType: map[string]interface{}{
		"firstName":      "",       // specify this column as string
		"favoriteNumber": int64(0), // specify this column as int64
	},
})
Note: firstName and favoriteNumber are just sample names I used. The actual names you put in should be the exact column names in the CSV file. They are case-sensitive.
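The case-sensitivity follows from the hints being stored in a Go map keyed by column name. The sketch below is a hypothetical, standard-library-only illustration of that idea and is not how dataframe-go is implemented: `parseWithTypes` and the sample CSV data are invented for this example. A hint whose key does not match the header exactly is never found, so the column silently stays a string.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strconv"
	"strings"
)

// parseWithTypes is a hypothetical helper, not dataframe-go code. It reads CSV
// text and converts each cell using a hint map keyed by the exact header name.
// Columns without a matching hint are kept as strings.
func parseWithTypes(data string, hints map[string]interface{}) ([]map[string]interface{}, error) {
	records, err := csv.NewReader(strings.NewReader(data)).ReadAll()
	if err != nil {
		return nil, err
	}
	if len(records) == 0 {
		return nil, nil
	}
	header := records[0]
	rows := make([]map[string]interface{}, 0, len(records)-1)
	for _, rec := range records[1:] {
		row := make(map[string]interface{}, len(header))
		for i, name := range header {
			switch hints[name].(type) { // map lookup is case-sensitive
			case int64:
				n, err := strconv.ParseInt(rec[i], 10, 64)
				if err != nil {
					return nil, fmt.Errorf("column %q: %w", name, err)
				}
				row[name] = n
			default:
				row[name] = rec[i]
			}
		}
		rows = append(rows, row)
	}
	return rows, nil
}

func main() {
	data := "firstName,favoriteNumber\nalice,10\n" // invented sample data

	// Key matches the header exactly: the cell becomes an int64.
	good, _ := parseWithTypes(data, map[string]interface{}{"favoriteNumber": int64(0)})
	fmt.Printf("%T\n", good[0]["favoriteNumber"])

	// Key differs only in case: the hint is ignored and the cell stays a string.
	bad, _ := parseWithTypes(data, map[string]interface{}{"favoritenumber": int64(0)})
	fmt.Printf("%T\n", bad[0]["favoriteNumber"])
}
```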
from dataframe-go.
Thank you both so much for the help!
@pjebs - I was thinking you might want to update this README example so folks can see how to use DictateDataType. Either way is good with me!
from dataframe-go.
@MrPowers How is the package serving you? Any other feedback? What is your use case?
from dataframe-go.
@pjebs - thanks for reaching out.
I am a Spark developer and love DataFrames. I am exploring DataFrame options in different languages for fun!
I just published a blog post on DataFrames in Go. Let me know what you think! I am happy to edit it based on your feedback!
It'd be awesome to add dataframe-go to qbench so we can see how the dataframe-go performance compares with qframe and gota.
You have a great team of open source developers! Looking forward to collaborating with you and making a Go DataFrame library that supports the Apache Arrow memory format and Parquet files (especially column pruning for those juicy performance gains). Thanks again for all the help!!
from dataframe-go.
@MrPowers I had a quick read of your blog post and made some adjustments.
- In the add-schema-inference branch there is schema inference. It just needs to be tested.
- The context requirement is idiomatic Go. It is very bad if context is not the first param, in case you need to cancel the operation because it's taking too long.
- You can use the NameToColumn function (or in the new PR, the MustNameToColumn function) to convert from name to column index.
- Filtering for your example would be done like this:
filterFn := dataframe.FilterDataFrameFn(func(vals map[interface{}]interface{}, row, nRows int) (dataframe.FilterAction, error) {
	if vals["is_even"] != nil && vals["is_even"].(int64) == 1 {
		return dataframe.KEEP, nil
	}
	return dataframe.DROP, nil
})
- The aims of the various packages are also different.
The primary aim of dataframe-go is to be flexible (and give the developer the power to do whatever they want). It also has a lot more features and probably more planned features (full time-series forecasting is coming, which you can see in one of the branches).
from dataframe-go.