nteract / data-explorer
The Data Explorer is nteract's automatic visualization tool.
Home Page: https://data-explorer.nteract.io/
License: BSD 3-Clause "New" or "Revised" License
Add docs for the Data Explorer PalettePicker component.
I am new to the data-explorer and I noticed you can filter it by values in columns.
But is it also possible to filter all values within a range, so not just speed_observation=10, but something like speed_observation=[5,25]? This would be a very helpful feature; Google Colab's data_table has it.
Also, it would be great if the data explorer table could get other advanced filtering options like those in qgrid: filtering by a range of dates, by a list of strings, by a list of numerics, etc.
This is how filtering looks in qgrid. Thanks
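Until something like this exists in the component, the requested behavior can be approximated by filtering the rows before they reach the Data Explorer. The following is a minimal sketch, assuming the rows are plain dictionaries as in the `data` array of a data resource. The helper names `filter_by_range` and `filter_by_values` are hypothetical and are not part of the Data Explorer API.

```python
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]


def filter_by_range(rows: Iterable[Row], column: str, low: float, high: float) -> List[Row]:
    """Keep rows whose value in `column` lies in the inclusive range [low, high]."""
    kept = []
    for row in rows:
        value = row.get(column)
        # Missing and non-numeric values are skipped rather than raising.
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if is_number and low <= value <= high:
            kept.append(row)
    return kept


def filter_by_values(rows: Iterable[Row], column: str, allowed: Iterable[Any]) -> List[Row]:
    """Keep rows whose value in `column` is one of the `allowed` values."""
    allowed_values = tuple(allowed)
    return [row for row in rows if row.get(column) in allowed_values]


rows = [
    {"speed_observation": 3, "road": "A1"},
    {"speed_observation": 10, "road": "B2"},
    {"speed_observation": 25, "road": "A1"},
    {"speed_observation": 40, "road": "C3"},
]
in_range = filter_by_range(rows, "speed_observation", 5, 25)
on_roads = filter_by_values(rows, "road", ["A1", "C3"])
```

The same comparison works for ISO-formatted date strings if the numeric check is relaxed, since those sort lexicographically.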
Right now views are suppressed when the dimensions and metrics needed to enable them are not available (for instance, Network Viz is not available unless you have two dims). Instead, the view should show a message to the user saying "You can't display a chart like this unless your data has x, y & z".
Likewise, there should be upper limits on dataset size so that a view (like a giant network chart) is not enabled unless someone explicitly asks for it.
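One way to think about this is a small check that returns a human-readable reason instead of silently hiding the view. The sketch below is illustrative only: the `VIEW_REQUIREMENTS` table, its thresholds, and the `unavailable_reason` function are assumptions made for this example, not values or names taken from the Data Explorer source.

```python
from typing import Optional

# Hypothetical requirements per view; the numbers are placeholders.
VIEW_REQUIREMENTS = {
    "network": {"min_dimensions": 2, "min_metrics": 0, "max_rows": 5000},
    "scatter": {"min_dimensions": 0, "min_metrics": 2, "max_rows": 50000},
}


def unavailable_reason(view: str, n_dimensions: int, n_metrics: int,
                       n_rows: int, force: bool = False) -> Optional[str]:
    """Return a message explaining why `view` cannot be shown, or None if it can."""
    req = VIEW_REQUIREMENTS[view]
    if n_dimensions < req["min_dimensions"]:
        return (f"The {view} view needs at least {req['min_dimensions']} dimension(s); "
                f"this dataset has {n_dimensions}.")
    if n_metrics < req["min_metrics"]:
        return (f"The {view} view needs at least {req['min_metrics']} metric(s); "
                f"this dataset has {n_metrics}.")
    # Large datasets are only rendered when the user explicitly asks for it.
    if n_rows > req["max_rows"] and not force:
        return (f"This dataset has {n_rows} rows, which is above the {req['max_rows']}-row "
                f"limit for the {view} view. Enable it explicitly to render anyway.")
    return None
```

Returning a message rather than a boolean lets the UI keep the view button visible and show the reason when it is selected.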
Is there any way to run the Data Explorer in a notebook hosted on Google Colab?
When embedding the data-explorer in a page that uses URL params, like the Datasette project, using the "filters" bar can clobber state that was set by the parent page. It would be useful to let embedders of this component disable this behavior.
A disableFilterControls prop that defaults to false, which hides the show/hide filter button.
A disableSetUrlParams prop that defaults to false, which enables state to be saved internally to the component and doesn't modify the parent page's URL parameters.
Add Data Explorer docs.
Issue:
When the rows in an output's data are keyed by numeric index instead of field name, data-explorer can't render the output correctly.
"application/vnd.dataresource+json": {
  "schema": {
    "fields": [
      { "name": "name" },
      { "name": "type" },
      { "name": "note" }
    ]
  },
  "data": [
    { "0": "aa", "1": "bb", "2": "cc" }
  ]
}
Here is an example of the notebook. You can use this to reproduce the issue:
{
  "metadata": {
    "kernelspec": {
      "name": "SQL",
      "display_name": "SQL",
      "language": "sql"
    },
    "language_info": {
      "name": "sql",
      "version": ""
    }
  },
  "nbformat_minor": 2,
  "nbformat": 4,
  "cells": [
    {
      "cell_type": "code",
      "source": ["test"],
      "metadata": {
        "azdata_cell_guid": "286d911b-c759-489c-b7f8-5490479dbddd",
        "language": "sql",
        "tags": []
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "metadata": {},
          "execution_count": 4,
          "data": {
            "application/vnd.dataresource+json": {
              "schema": {
                "fields": [
                  { "name": "name" },
                  { "name": "type" },
                  { "name": "note" }
                ]
              },
              "data": [
                { "0": "aa", "1": "bb", "2": "cc" }
              ]
            }
          }
        }
      ],
      "execution_count": 4
    }
  ]
}
Note that if the data object uses the field names, it works correctly.
"data": [
  {
    "name": "aa",
    "type": "bb",
    "note": "cc"
  }
]
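As a possible workaround on the producer side, positional keys can be rewritten to field names before the output is handed to the Data Explorer. This is a minimal sketch, assuming the positional keys are the stringified positions of the fields in `schema.fields`, as in the example above. The function name `rows_keyed_by_field_name` is hypothetical.

```python
from typing import Any, Dict


def rows_keyed_by_field_name(resource: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a data resource whose rows are keyed by field name.

    Rows that already use the field names are copied unchanged. Rows keyed by
    "0", "1", ... are remapped using the order of schema.fields.
    """
    names = [field["name"] for field in resource["schema"]["fields"]]
    fixed_rows = []
    for row in resource["data"]:
        if all(name in row for name in names):
            fixed_rows.append(dict(row))
            continue
        remapped = {}
        for position, name in enumerate(names):
            key = str(position)
            if key in row:
                remapped[name] = row[key]
        fixed_rows.append(remapped)
    return {"schema": resource["schema"], "data": fixed_rows}


resource = {
    "schema": {"fields": [{"name": "name"}, {"name": "type"}, {"name": "note"}]},
    "data": [{"0": "aa", "1": "bb", "2": "cc"}],
}
fixed = rows_keyed_by_field_name(resource)
```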
Using the Data Explorer component, the onMetadataChange prop is called when the user changes the selected UI configuration of the component (such as by switching the chart type). It appears that the nteract notebook UI persists this metadata in a dx key in the root of the output's metadata within the notebook file.
notebook.cells[0].outputs[0].metadata.dx
My question is: Is this the recommended place to store the Data Explorer metadata, or would it be better to store the metadata under the MIME type considering it applies to only that part of the output?
notebook.cells[0].outputs[0].metadata["application/vnd.dataresource+json"].dx
I don't know if there is any practical need for this, but it seemed like something to consider after seeing the following example.
https://nbformat.readthedocs.io/en/latest/format_description.html#display-data
"metadata": {
  "image/png": {
    "width": 640,
    "height": 480
  }
}
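If both locations had to be supported during a transition, a reader could check the MIME-scoped key first and fall back to the root key. The sketch below operates on an output as a plain dictionary. The lookup order is an assumption made for illustration, not established nteract behavior, and `get_dx_metadata` is a hypothetical helper.

```python
from typing import Any, Dict

DATA_RESOURCE_MIME = "application/vnd.dataresource+json"


def get_dx_metadata(output: Dict[str, Any]) -> Dict[str, Any]:
    """Read Data Explorer metadata from an output, preferring the MIME-scoped key."""
    metadata = output.get("metadata", {})
    scoped = metadata.get(DATA_RESOURCE_MIME, {})
    if isinstance(scoped, dict) and "dx" in scoped:
        return scoped["dx"]
    # Fall back to the root-level key described above.
    return metadata.get("dx", {})


root_style = {"metadata": {"dx": {"view": "bar"}}}
scoped_style = {"metadata": {DATA_RESOURCE_MIME: {"dx": {"view": "line"}}}}
```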
DATA_EXPLORER_NONE_DIM and special characters inside) to prevent none should check for this special string sequence instead.
Repro: run the following in a cell
import pandas as pd

# Emit the table schema (data resource) representation alongside the HTML output.
pd.set_option("display.html.table_schema", True)


class Cmd:
    def __init__(self, name, params):
        self.name = name
        self.params = params

    def __repr__(self):
        return f'Cmd(name={self.name}, params={self.params})'


cell_payload = [
    Cmd(name='foo', params={'bar', 'baz'}),
    Cmd(name='foo', params={'bar', 'baz'})
]
# A single cell holding a list of objects triggers the rendering error.
pd.DataFrame({'param_session': [cell_payload]})
Then the following error appears, with a link to an error page that reports: Objects are not valid as a React child (found: object with keys {name}). If you meant to render a collection of children, use an array instead.
For reference, this is how Pandas would normally render the cell, when setting pd.set_option("display.html.table_schema", False)
Finally, here's what the output looks like in the ipynb file when the error occurs
"application/vnd.dataresource+json": {
  "schema": {
    "fields": [
      { "name": "index", "type": "integer" },
      { "name": "param_session", "type": "string" }
    ],
    "primaryKey": ["index"],
    "pandas_version": "0.20.0"
  },
  "data": [
    {
      "index": 0,
      "param_session": [
        { "name": "foo" },
        { "name": "foo" }
      ]
    }
  ]
}
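The payload above declares `param_session` as a string but ships a list of objects, which is what the renderer trips over. One possible mitigation is to stringify any non-scalar cell value before rendering. The following is a rough sketch that works on the `data` rows as plain dictionaries; `stringify_nested_cells` is a hypothetical helper and not how the Data Explorer handles this.

```python
import json
from typing import Any, Dict, List

SCALAR_TYPES = (str, int, float, bool, type(None))


def stringify_nested_cells(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Replace list, dict, and other non-scalar cell values with a JSON string."""
    cleaned = []
    for row in rows:
        cleaned.append({
            key: value if isinstance(value, SCALAR_TYPES)
            # default=repr covers objects that json cannot serialize directly.
            else json.dumps(value, default=repr)
            for key, value in row.items()
        })
    return cleaned


rows = [{"index": 0, "param_session": [{"name": "foo"}, {"name": "foo"}]}]
safe_rows = stringify_nested_cells(rows)
```

After this step every cell is a scalar, so a table cell can render it as text.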
For next release:
Expose a couple of buttons that export the chart as SVG and PNG.
Do you have this package available for Jupyter Lab?
Are there similar packages for Jupyter Lab?
Thank you!
Add docs for the Data Explorer VizControl component.
Hello,
The instructions say to go to the application folder:
cd applications/jupyter-extension
pip install -e .
jupyter serverextension enable nteract_on_jupyter
But where is this folder? If this is the Applications folder from root, I don't have anything in it.
Thanks for your help :)
When we decide to drop support for node < 14, we should be able to bump d3-scale with no issues.
df.describe(include="all") should run and be included as metadata for any dataframe that's being sent to the Data Explorer component.
Presently, only the root DataExplorer is tested (see https://github.com/nteract/data-explorer/blob/main/__tests__/index.spec.tsx).
It will be easier to catch and guard against component-specific issues below the level of the root DataExplorer if we test specific visuals and components (Plot Picker, etc.) individually.
Based on recent bugs, I think it would be valuable to implement basic tests for
This exercise will also help with writing component specific documentation.
Once this is done in a basic form for a few components, we'll have a good pattern in place that should be easy for new/first time contributors to mimic/add to.
I really like the data-explorer and would like to install it to work within JupyterLab, not nteract on Jupyter. Apparently, this is possible, as I see in this post.
But I find no installation instructions for integration into JupyterLab. Can you provide some? It seems that other people are interested in this functionality, too. Thanks
Is your feature request related to a problem? Please describe.
Some users want to be able to display multiple configurations of the Data Explorer for the same dataset. Currently, this requires outputting the DataFrame multiple times (in the same cell or multiple cells). Unfortunately, this approach duplicates the DataFrame schema/data, which bloats the notebook file. For larger datasets, this can cause the browser to struggle and significantly increases the load time of the notebook.
Describe the solution you'd like
One lightweight solution would be to allow multiple configurations of the Data Explorer in a single execution output by keeping track of an array of metadata configurations instead of a single configuration.
{
  metadata: {
    dx: { view: "bar" }
  }
}
becomes
{
  metadata: {
    dx: [
      { view: "bar" },
      { view: "line" }
    ]
  }
}
When there are multiple configurations, the UI would render multiple instances of the Data Explorer component instead of just one. Each instance would be passed the corresponding metadata along with the original data/schema. Presumably they would be laid out vertically, similar to what happens when you output the same DataFrame multiple times, though some treatment could be applied to separate them visually.
In terms of how the user is able to provide multiple configurations, this could be achieved through UI controls to add/remove and re-order configurations. You could also simplify such that this is only allowed via the programmatic configuration (see nteract/nteract#4377), especially as an early milestone.
I imagine the Data Explorer component that exists today wouldn't need to change much, but an additional wrapping component would be introduced. This could be done in user land, but we would need to align on the metadata standard.
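A wrapping component would need to accept both the existing single-object form and the proposed array form. That normalization step can be sketched as follows, using plain dictionaries for the output metadata. `dx_configurations` is a hypothetical name, and treating a missing or empty `dx` as one default configuration is an assumption.

```python
from typing import Any, Dict, List


def dx_configurations(metadata: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Normalize metadata.dx into a list with one entry per Data Explorer instance."""
    dx = metadata.get("dx")
    if dx is None:
        # No saved configuration: render a single instance with defaults.
        return [{}]
    if isinstance(dx, list):
        return [dict(config) for config in dx] or [{}]
    # Existing single-object form stays valid.
    return [dict(dx)]


single = dx_configurations({"dx": {"view": "bar"}})
multiple = dx_configurations({"dx": [{"view": "bar"}, {"view": "line"}]})
```

Each entry in the returned list would drive one rendered instance, all sharing the same data and schema.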
With large datasets, prompt the user to enable one of a set of standard sampling options, with messaging about how they could do sampling better. For instance, if you try to render a million points on the scatterplot, it shouldn't just say "Sorry, no"; it should say "Here we can show 50,000 of these points using one of the following three built-in sampling options (or you could sample the data yourself in a more effective way)".
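To make the idea concrete, here is a minimal sketch of three sampling strategies capped at a maximum number of points. The issue does not say which three options are meant, so the strategy names (`head`, `every_nth`, `random`), the `sample_rows` function, and the default cap are assumptions chosen for illustration.

```python
import random
from typing import Any, List, Sequence


def sample_rows(rows: Sequence[Any], max_points: int = 50000,
                strategy: str = "random", seed: int = 0) -> List[Any]:
    """Reduce `rows` to at most `max_points` items using the chosen strategy."""
    rows = list(rows)
    if len(rows) <= max_points:
        return rows
    if strategy == "head":
        # Keep the first max_points rows.
        return rows[:max_points]
    if strategy == "every_nth":
        # Keep evenly spaced rows across the whole dataset.
        step = len(rows) / max_points
        return [rows[int(i * step)] for i in range(max_points)]
    if strategy == "random":
        # Uniform random sample without replacement, kept in original order.
        rng = random.Random(seed)
        indexes = sorted(rng.sample(range(len(rows)), max_points))
        return [rows[i] for i in indexes]
    raise ValueError(f"Unknown sampling strategy: {strategy}")


points = list(range(1000))
first = sample_rows(points, max_points=100, strategy="head")
spaced = sample_rows(points, max_points=100, strategy="every_nth")
picked = sample_rows(points, max_points=100, strategy="random")
```

A fixed seed keeps the random sample stable between renders, which avoids the chart changing every time the notebook is reopened.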
Application or Package Used
Data Explorer
Describe the bug
Selecting a size for points in the Scatterplot chart type has no effect in Firefox. Works fine in Chrome.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The points should resize in Firefox. It should look pretty close to how it looks in Chrome.
And any other scatterplot functionality.
In the table view, I'd like to be able to quickly filter rows by some string. An example of this behavior can be found in PowerShell's ogv command.
Notice the filter bar at the top?
This is not the same as the filters that are already available in data explorer.
This is something faster than that. When I search for "foo" I get all the rows that have "foo" in any of the columns.
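The behavior described here amounts to a case-insensitive substring match against every column of a row. A minimal sketch over rows as plain dictionaries follows; `quick_filter` is a hypothetical name, and matching on the string form of each value is an assumption about what "in any of the columns" should mean for non-string data.

```python
from typing import Any, Dict, Iterable, List


def quick_filter(rows: Iterable[Dict[str, Any]], query: str) -> List[Dict[str, Any]]:
    """Keep rows where any column's string form contains `query`, ignoring case."""
    needle = query.casefold()
    if not needle:
        # An empty search box shows everything.
        return list(rows)
    return [
        row for row in rows
        if any(needle in str(value).casefold() for value in row.values())
    ]


rows = [
    {"name": "Foo Service", "status": "running"},
    {"name": "bar", "status": "stopped"},
    {"name": "baz", "status": "food truck"},
]
matches = quick_filter(rows, "foo")
```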
Is your feature request related to a problem? Please describe.
The bundled size of Data Explorer is currently 1.5MB, which is too large to be a reasonable component pulled in by other libraries.
Describe the solution you'd like
Need to reduce the size of the package. Open to any suggestions for scoping the requirements down.
Describe alternatives you've considered