nl253 / dataframe
Dataframe & Series library for statistics and tabular data manipulation (like pandas)
License: MIT License
Tests are failing because of this.
So that:
df[0] === df.cols[0]
This is a new feature and does not have any documentation.
Currently there is a draft implementation of overloading the in operator, but it doesn't seem to work properly. I got it to work with a string dataframe but not with a numeric one. You should be able to:
const hasRow = [1,2,3] in df;
You need to use the proxy that is already returned from the DataFrame.rows getter.
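One thing to keep in mind here: JavaScript converts the left operand of in to a property key before a Proxy's has trap sees it, so [1,2,3] arrives as the string "1,2,3". The sketch below assumes rows are plain arrays and compares them by their string form; makeRows is a hypothetical helper for illustration, not the library's implementation.

```javascript
// Hypothetical helper: wraps an array of rows in a Proxy whose `has` trap
// implements `row in rows`. Not the library's actual implementation.
function makeRows(rows) {
  return new Proxy(rows, {
    has(target, key) {
      // `in` coerces its left operand to a property key, so the array
      // [1, 2, 3] reaches this trap as the string "1,2,3".
      if (typeof key === 'string' && target.some(row => String(row) === key)) {
        return true;
      }
      // Fall back to the default behaviour (indices, "length", ...).
      return Reflect.has(target, key);
    },
  });
}

const rows = makeRows([[1, 2, 3], ['a', 'b', 'c']]);
console.log([1, 2, 3] in rows);       // true
console.log(['a', 'b', 'c'] in rows); // true
console.log([9, 9, 9] in rows);       // false
```

Because the comparison happens on strings, this approach cannot tell the number 1 from the string "1", and a one-element row such as [0] is indistinguishable from the index 0.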
Don't re-implement tests for functionality that is forwarded to the Column. Test agg, call, matrix and other core methods in DataFrame.
Some properties should never be assigned to. Research which ones can be made read-only and Object.freeze the object they are on.
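As a starting point for that research, here is a minimal sketch of the two standard mechanisms. The frame object and its nRows and colNames properties are made up for illustration; the real DataFrame fields may differ.

```javascript
// Hypothetical stand-in for an object holding properties that should
// never be reassigned.
const frame = { colNames: ['firstName', 'age'], nRows: 2 };

// Option 1: make a single property read-only.
Object.defineProperty(frame, 'nRows', { writable: false, configurable: false });

// Option 2: freeze a whole object (shallow: nested objects stay mutable).
Object.freeze(frame.colNames);

// Assignment to a read-only property throws in strict mode and is
// silently ignored otherwise; either way the value does not change.
try { frame.nRows = 99; } catch (e) { /* TypeError in strict mode */ }
console.log(frame.nRows); // 2

// Mutating methods on a frozen array throw a TypeError in both modes.
let pushFailed = false;
try { frame.colNames.push('x'); } catch (e) { pushFailed = e instanceof TypeError; }
console.log(pushFailed, frame.colNames.length); // true 2
```

Note that Object.freeze is shallow, so a frozen array of column arrays would still allow the inner arrays to be changed.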
Currently it's not clear where the default datasets come from, or how you can plug into this framework by changing a DataFrame.opts variable.
E.g.:
df.slice(/coordX/, /userNam/)
So that:
df.cols[0] === df.cols.firstName
df.toJSON, df.toHTML, df.toObj and df.toCSV have a lot in common. They all use the internal structure of the DataFrame object to convert it to another format when no arguments are supplied. If an argument is supplied, it is treated as the name of the file to which the result of the conversion is saved. The task is to generalise all of these methods and make them partial applications of one new method.
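A rough sketch of what that generalisation could look like. Everything here is an assumption made for illustration: convertWith and MiniFrame are hypothetical names, the data is assumed to be an object mapping column names to arrays, and the CSV writer is naive (no quoting or escaping).

```javascript
const fs = require('fs');

// Hypothetical generic method: `serialise` turns the frame's data into a
// string; if `fileName` is given, the result is also written to that file.
function convertWith(serialise) {
  return function (fileName) {
    const out = serialise(this.data);
    if (fileName !== undefined) fs.writeFileSync(fileName, out);
    return out;
  };
}

// Assumed internal structure: an object mapping column names to arrays.
class MiniFrame {
  constructor(data) { this.data = data; }
}

// Each to* method is the generic method with the serialiser fixed.
MiniFrame.prototype.toJSON = convertWith(data => JSON.stringify(data));
MiniFrame.prototype.toCSV = convertWith(data => {
  const names = Object.keys(data);
  const nRows = names.length === 0 ? 0 : data[names[0]].length;
  const lines = [names.join(',')];
  for (let i = 0; i < nRows; i++) {
    lines.push(names.map(n => data[n][i]).join(','));
  }
  return lines.join('\n');
});

const mf = new MiniFrame({ x: [1, 2], y: ['a', 'b'] });
console.log(mf.toCSV());
console.log(mf.toJSON()); // {"x":[1,2],"y":["a","b"]}
```

Adding toHTML or toObj would then only require a new serialiser function, with the file handling shared by all of them.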
This should be done once a higher level of API stability is reached, i.e. after all bugs have been dealt with and all features implemented.
The tests for the Column class were written quite a while ago and need to be reviewed.
You should be able to:
(df.cols[0] in df.cols) === true
Currently the API is clean in that there are very few methods with many if statements in them. However, if someone were to run them in a loop, the repeated argument checks would become costly. Ideally there would be a way to call selectByFunc directly instead of calling select and checking isFunc(params[0]) on every call.
In essence:
class DataFrame {
  select(...params) {
    // choose select* depending on the value of params
    // ...
  }
  selectByFunc(f) { /* ... */ }
  selectByIdx(n, m) { /* ... */ }
}
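To make the idea concrete, here is a small runnable sketch. The semantics are assumptions made purely for illustration: SelectDemo is a hypothetical class, selection is assumed to operate on columns, and selectByIdx(n, m) is assumed to mean the half-open range from n to m. The actual methods may behave differently.

```javascript
// Hypothetical class; stores columns as an array of arrays (an assumption).
class SelectDemo {
  constructor(cols) { this.cols = cols; }

  // Convenience entry point: inspects its arguments on every call.
  select(...params) {
    if (typeof params[0] === 'function') return this.selectByFunc(params[0]);
    return this.selectByIdx(...params);
  }

  // Direct entry points: no argument inspection, suited to hot loops.
  selectByFunc(f) {
    return new SelectDemo(this.cols.filter((col, idx) => f(col, idx)));
  }

  selectByIdx(n, m) {
    return new SelectDemo(this.cols.slice(n, m));
  }
}

const demo = new SelectDemo([[1, 2], [3, 4], [5, 6]]);
console.log(demo.select(0, 2).cols);                    // [ [ 1, 2 ], [ 3, 4 ] ]
console.log(demo.selectByFunc(col => col[0] > 1).cols); // [ [ 3, 4 ], [ 5, 6 ] ]
```

Callers who already know which variant they need can skip the dispatch in select, while everyone else keeps the single flexible method.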
Not all Column functionality is covered with tests.
So that:
df.row[0] === [df.cols[0][0], df.cols[1][0], ...]
This should be done by returning a Proxy, which makes it possible to overload the indexing operator.
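A minimal sketch of the Proxy approach, assuming the frame stores its data column-wise as an array of arrays. rowsProxy is a hypothetical helper, not the library's getter.

```javascript
// Assumed layout: data is stored column-wise as an array of arrays.
function rowsProxy(cols) {
  return new Proxy({}, {
    get(_target, key) {
      // Property keys arrive as strings (or symbols), so "0" must be parsed.
      if (typeof key === 'string' && /^\d+$/.test(key)) {
        const i = Number(key);
        return cols.map(col => col[i]);
      }
      if (key === 'length') return cols.length === 0 ? 0 : cols[0].length;
      return undefined;
    },
  });
}

const cols = [[1, 2, 3], ['a', 'b', 'c']];
const rowView = rowsProxy(cols);
console.log(rowView[0]);     // [ 1, 'a' ]
console.log(rowView.length); // 3
```

Each access builds a fresh array, so a strict === comparison between two row lookups is never true; rows would have to be compared element by element.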
Currently the API understands null as meaning "all columns", which may be confusing. "all" would be better but may clash with actual column names.
This would be something along the lines of:
const SortOrder = {
  ASCENDING: "asc",
  DESCENDING: "des"
  // ...
}
So that
df.cols.firstName = df.cols.firstName.map(name => name[0].toUpperCase().concat(name.slice(1)))
works.
I would prefer to avoid having to do this and have the error constructor do it all:
throw new Error(msg('you need to provide pairs of colId, newName (e.g. df.rename(1, "Width", -2, "Length"))'))
This also applies to other datatypes. Currently the convert method is not called (but should be) after a call to map.
Depends on #24.
This will ensure that df.colNames has a consistent type.
The constructor should accept filenames with the .zip, .gzip, .lzma and .7z extensions.
When the filename is passed to the constructor, the library should first look for an existing (already unarchived) file. If it is not present, the library should unarchive the file and place it in the same directory as the archive, under the same name minus the archive extension. If it is present, it is read as usual.
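The lookup logic described above could be sketched roughly as follows. unarchivedPath and resolveDataFile are hypothetical helpers, and the actual decompression is left to a caller-supplied unarchive function because the issue does not say which library would be used.

```javascript
const fs = require('fs');
const path = require('path');

// Extensions listed in the issue.
const ARCHIVE_EXTS = ['.zip', '.gzip', '.lzma', '.7z'];

// Returns the path where the unarchived file is expected, or null if
// `fileName` does not carry one of the archive extensions.
function unarchivedPath(fileName) {
  const ext = path.extname(fileName);
  if (!ARCHIVE_EXTS.includes(ext.toLowerCase())) return null;
  return path.join(path.dirname(fileName), path.basename(fileName, ext));
}

// Hypothetical helper: returns the path the constructor should read.
// `unarchive(archivePath, targetPath)` is supplied by the caller.
function resolveDataFile(fileName, unarchive) {
  const target = unarchivedPath(fileName);
  if (target === null) return fileName;                    // not an archive
  if (!fs.existsSync(target)) unarchive(fileName, target); // extract once
  return target;                                           // read as usual
}

console.log(unarchivedPath(path.join('data', 'iris.csv.zip')));
console.log(unarchivedPath('iris.csv')); // null
```

With this split, the first call for an archive triggers extraction and later calls find the extracted file and skip it.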
Think of ideas for how documentation can be accessed from the Node REPL.
Add a git / npm hook that runs the tests pre-push and pre-publish.
The new row-level select method should take as args: (val, idx) => bool with an optional column id to ensure val is the value of that column and not the whole row.
So that:
> df.summary()
{
  "mean": 123.12312,
  "var": 123.3112,
  "min": 12,
  "max": 111123,
  "std": 11.11
}
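For reference, a minimal sketch of how such a summary could be computed for a single numeric column. This is a stand-alone hypothetical function, not the library's method, and it uses the population variance since the issue does not say whether the sample (n - 1) form is intended.

```javascript
// Hypothetical stand-alone version of `summary` for one numeric column.
function summary(xs) {
  const n = xs.length;
  const mean = xs.reduce((a, b) => a + b, 0) / n;
  // Population variance (divides by n); an assumption, see above.
  const variance = xs.reduce((a, x) => a + (x - mean) ** 2, 0) / n;
  return {
    mean,
    var: variance,
    min: Math.min(...xs),
    max: Math.max(...xs),
    std: Math.sqrt(variance),
  };
}

console.log(summary([2, 4, 4, 4, 5, 5, 7, 9]));
// { mean: 5, var: 4, min: 2, max: 9, std: 2 }
```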