sharmalab / datascope
Interactive linked visual query system for large datasets
Home Page: https://sharmalab.github.io/Datascope/
License: BSD 3-Clause "New" or "Revised" License
This may be related to #27, but having everything reanimate doesn't make sense here.
The join of two provided files and the manually joined data (~1k rows) both gave an error, which seems to suggest that the read/join operation is timing out.
There is likely more we can discover, and hopefully some fix we can use to prevent this.
I'm honestly not sure why we're testing against 0.1* when Node is on 6.x/7.x. That said, it looks like JSONStream isn't compatible with these versions.
My instinct is to change the Node versions to the current ones (I've already added the last stable). Before I consider removing the old versions, I would like to know why they're there.
like https://bl.ocks.org/jasondavies/1341281
concern with data in browser, maybe use crossfilter
Add support for kaplan meier curves.
Example: https://codepen.io/clindsey/details/zBGaaR
This error is flooding the console log. It should be fixed or handled.
d3.transform @ d3.js:806
Error: attribute transform: Expected transform function, "null".
The x-axis labels should be diagonally positioned like we do in a histogram.
TitancSurvivors and newDataSourceConfig fail silently. Why is this, and what should be done or logged?
Currently the statistics icon is present by default.
Even when there are no stats associated with an attribute like in the above example. We should only show the statistics icon when there are stats associated with the attribute. This also provides a visual cue for users about what attributes have statistics associated with them.
As part of at least one of our examples, we should have a bubbleChart.
The use case would be one where a user wants to do some simple number crunching on a column. Say splice data in a column and then do a unique count. Or maybe create a new column based on existing data.
This could be a major feature creep so we have to be careful. Maybe start with something simple and see how users respond to it.
We probably want a better documentation of the datascope api/routes
Since docs are spread across .md/html in repo and bitbucket wikis, we should consolidate them.
It looks like some of the code is formatted for jsdoc, so we should make sure that is consistent, and maybe add jsdoc to a release process.
We need to increase the size of the blue icon and also add some descriptive text that shows up on hover.
The table does not paginate properly as a result.
That is, launch with some variable already filtered, e.g. age between 30 and 40.
Calling a filter via http://localhost:3001/data/?filter={%22Age%22:[30,40]}&dataSourceName=main seems to allow this, but it is reset on each filter selection.
The goal here is to have URL parameters passed to Datascope which are not reset by user interactions in the instance.
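One way to keep a URL-supplied filter from being lost is to parse it once at launch and merge it into every later filter request. The following is only a sketch under that assumption: `parseInitialFilter` and `mergeFilters` are hypothetical helper names, not existing Datascope functions, and the rule that a user selection overrides the URL value for the same attribute is an assumption as well.

```javascript
// Hypothetical helper: read the initial filter from a launch URL.
// Returns an empty object when the parameter is missing or malformed.
function parseInitialFilter(url) {
  const raw = new URL(url).searchParams.get("filter");
  if (raw === null) return {};
  try {
    const parsed = JSON.parse(raw);
    const isPlainObject =
      parsed !== null && typeof parsed === "object" && !Array.isArray(parsed);
    return isPlainObject ? parsed : {};
  } catch (err) {
    return {}; // malformed JSON: fall back to no initial filter
  }
}

// Hypothetical helper: combine the launch filter with the user's selections.
// Assumption: a user selection on the same attribute overrides the URL value;
// attributes the user has not touched keep the URL-supplied range.
function mergeFilters(initialFilter, userFilter) {
  return Object.assign({}, initialFilter, userFilter);
}

const initialFilter = parseInitialFilter(
  "http://localhost:3001/data/?filter={%22Age%22:[30,40]}&dataSourceName=main"
);
// A later selection on another attribute no longer drops the Age range.
const combined = mergeFilters(initialFilter, { Sex: ["F"] });
console.log(JSON.stringify(combined));
```

Keeping the launch filter in a separate object, rather than writing it into the user's selection state, makes it straightforward to re-apply after every interaction.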
The table visualization does not support clickable rows. However the icon changes when one is browsing a table. This gives the user an impression that something is broken when they try to click on a row. Please fix asap
The DataTable is fairly slow on large datasets, with response times of 4-8 seconds. We need to look at an optimized pagination strategy.
Table pagination and backend processing is fairly unoptimized right now since we make a call to .top(Infinity) on one of the dimensions to get the entire data.
dc-js/dc.js#966 is a useful resource to solve this.
The idea is to use compressed encodings of the data that would be available in a dictionary format. Currently I'm using the dataDescription.json as the place to define the dictionary for each attribute. For example, for the attribute Specimen_Type the values 0, 1 and 2 correspond to tumor_tissue, normal_tissue and tumor_blood respectively:
{
"attributeName": "Specimen_Type",
"attributeType": [
"visual", "filtering"
],
"dictionary": {
"0": "tumor_tissue",
"1": "normal_tissue",
"2": "tumor_blood",
"3": "tumor_marrow"
}
}
This involves modifying AppStore.jsx, which is the front-end data store, to encode and decode data. Decoding is done when the /data?filter={} API is called to fetch data in encoded form from the server. Encoding is done on the filter JSON object.
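The encode and decode steps described above could look roughly like the sketch below. It reuses the Specimen_Type dictionary from the example; the function names (`decodeValue`, `buildEncoder`, `encodeFilter`) are hypothetical and not taken from AppStore.jsx, and the sketch assumes filter values are arrays of labels.

```javascript
// Dictionary copied from the dataDescription.json example above.
const dictionaries = {
  Specimen_Type: {
    "0": "tumor_tissue",
    "1": "normal_tissue",
    "2": "tumor_blood",
    "3": "tumor_marrow"
  }
};

// Decode one value received from the server; unknown codes pass through.
function decodeValue(dictionary, code) {
  const key = String(code);
  return Object.prototype.hasOwnProperty.call(dictionary, key)
    ? dictionary[key]
    : code;
}

// Build a label -> code lookup by inverting the dictionary once.
function buildEncoder(dictionary) {
  const inverse = {};
  for (const code of Object.keys(dictionary)) {
    inverse[dictionary[code]] = code;
  }
  return (label) =>
    Object.prototype.hasOwnProperty.call(inverse, label) ? inverse[label] : label;
}

// Encode every value of a filter object before it is sent to the server.
// Attributes without a dictionary are left untouched.
function encodeFilter(filter, dicts) {
  const encoded = {};
  for (const attribute of Object.keys(filter)) {
    const dictionary = dicts[attribute];
    encoded[attribute] = dictionary
      ? filter[attribute].map(buildEncoder(dictionary))
      : filter[attribute];
  }
  return encoded;
}

const encodedFilter = encodeFilter(
  { Specimen_Type: ["tumor_blood"], Age: [30, 40] },
  dictionaries
);
console.log(JSON.stringify(encodedFilter));
console.log(decodeValue(dictionaries.Specimen_Type, 1));
```

Note that the encoded codes come back as strings because they are the dictionary's JSON keys; whether the server expects strings or numbers is not stated in the issue.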
The fact that webpack succeeds isn't by itself a good indicator of a good commit/PR. Thus, we should at least unit test some core functionality.
Zooming out and in doesn't seem to change the grouping of items in marker maps.
Users may want to support the launch of a new entity using elements of a row or a cell. We should support this.
For certain visualizations, to allow for smoother interactions we might want to consider using "redo search in current area" or for our purposes "filter in current area".
So whenever a user zooms into say a parallel coordinate or a map they have the option to set the appropriate filters in the rest of the dashboard.
Right now the code for all the interactive filter types is in FilteringAttribute.jsx.
We should consider splitting it into separate React components as we've done for Visualizations. So we'll have a directory of React components with each component having a separate file, e.g. PieChart.jsx, BarChart.jsx, etc.
https://github.com/sharmalab/Datascope/wiki/interactiveFilters.json
I don't think this example is right
Allow visualizations like parallel coordinates by showing at least some individual data.
On occasion, DataScope reverts to the old color scheme. Or a hybrid of the old and new.
It looks like bubbleChart may not actually be implemented:
If a certain filter has undergone significant changes due to adjustments in other filters, then we should find a way to highlight it.
Ideally support for both, but at least one of:
cell as link
row as link
This would involve the /data request handler performing the filtering on Druid using something like Plywood.
Error: ENOENT: no such file or directory, open 'config/dashboard.json'
at Error (native)
at Object.fs.openSync (fs.js:641:18)
at Object.fs.readFileSync (fs.js:509:33)
at exports.getDashboardConfig (/Users/birm/Desktop/git/Datascope/routes/rest.js:44:16)
at Layer.handle [as handle_request] (/Users/birm/Desktop/git/Datascope/node_modules/express/lib/router/layer.js:95:5)
at next (/Users/birm/Desktop/git/Datascope/node_modules/express/lib/router/route.js:137:13)
at Route.dispatch (/Users/birm/Desktop/git/Datascope/node_modules/express/lib/router/route.js:112:3)
at Layer.handle [as handle_request] (/Users/birm/Desktop/git/Datascope/node_modules/express/lib/router/layer.js:95:5)
at /Users/birm/Desktop/git/Datascope/node_modules/express/lib/router/index.js:281:22
at Function.process_params (/Users/birm/Desktop/git/Datascope/node_modules/express/lib/router/index.js:335:12)
Crossfilter 1.4 has added support for a new top() method that includes the offset: https://github.com/crossfilter/crossfilter/wiki/API-Reference#wiki-dimension_top. This makes pagination easier for us since we don't have to unfurl the entire data to slice every time.
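A minimal sketch of offset-based paging, assuming a dimension whose `top(k, offset)` accepts an offset as described in the linked API reference. `getPage` is a hypothetical helper, and `makeStubDimension` is only a stand-in so the example runs without the library; with Crossfilter itself, a real dimension would be passed instead.

```javascript
// Hypothetical helper: fetch one page of rows from a dimension.
// Assumes dimension.top(k, offset) skips `offset` rows and returns the next k.
function getPage(dimension, pageIndex, pageSize) {
  const offset = pageIndex * pageSize;
  return dimension.top(pageSize, offset);
}

// Stand-in for a crossfilter dimension, used only to make this sketch runnable.
// It orders rows by `key` descending, mimicking the ordering of top().
function makeStubDimension(rows, key) {
  const sorted = rows.slice().sort((a, b) => b[key] - a[key]);
  return {
    top: (k, offset = 0) => sorted.slice(offset, offset + k)
  };
}

const rows = [];
for (let age = 1; age <= 10; age++) {
  rows.push({ Age: age });
}
const ageDimension = makeStubDimension(rows, "Age");

// Second page (index 1) with three rows per page.
const page = getPage(ageDimension, 1, 3);
console.log(page.map((row) => row.Age));
```

Requesting only `pageSize` rows per call avoids materializing the full result of `.top(Infinity)` for each table page.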
possibly using quadtree (https://bl.ocks.org/mbostock/4343214)
Include how to make new API and how to use existing
Node.js 8.0 is here
Should be able to respond to different size screens better.
Subtasks:
We need to set up our filters and visualizations so that we can exclude missing/invalid data.
We need to be able to show both the original and filtered data in a way which describes the filter in the context of the data.