sharmalab / datascope

Interactive linked visual query system for large datasets

Home Page: https://sharmalab.github.io/Datascope/

License: BSD 3-Clause "New" or "Revised" License

JavaScript 94.53% Python 0.11% CSS 3.10% HTML 2.23% Dockerfile 0.03%

nci-qin tcia-dac

datascope's People

Contributors

Stargazers

Watchers

Forkers

wlstks7 birm lakhansamani lastlegion pseudonerd loghijiaha iyuroch barshana-banerjee esskay0000 vinayaksh42 kapseliboi nehaar

datascope's Issues

Filter Animations should not run on filter

This may be related to #27, but having everything reanimate doesn't make sense here.

Data load fails for some cases

The join of the two provided files and the manually joined data (~1k rows) both gave an error that suggests the read/join operation is timing out.
There is likely more we can discover, and hopefully a fix we can use to prevent this.

Travis issue with old node versions: "Error: Cannot find module 'JSONStream'"

I'm honestly not sure why we're testing against 0.1* when Node is on 6.x/7.x. That said, it looks like JSONStream isn't compatible with these versions.
My instinct is to change the Node versions to the current ones (I've already added the latest stable). Before I consider removing the old versions, I would like to know why they're there.

rename ElasticX/ElasticY to Autoscale

Parallel Coordinates as Main Visualization

like https://bl.ocks.org/jasondavies/1341281

concern with data in browser, maybe use crossfilter

kaplan meier curve

Add support for kaplan meier curves.
Example: https://codepen.io/clindsey/details/zBGaaR
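
For reference, the estimator behind such a curve is small enough to sketch in plain JavaScript. This is only an illustration, not Datascope code: the function name `kaplanMeier` and the record shape `{ time, event }` (with `event` set to `true` for an observed event and `false` for a censored record) are assumptions.

```javascript
// Kaplan-Meier sketch: S(t) is the running product of (1 - d_i / n_i), where
// d_i is the number of events at time t_i and n_i the number still at risk.
// The { time, event } record shape is an assumption for illustration.
function kaplanMeier(records) {
  const sorted = [...records].sort((a, b) => a.time - b.time);
  const curve = [];
  let atRisk = sorted.length;
  let survival = 1;
  let i = 0;
  while (i < sorted.length) {
    const t = sorted[i].time;
    let events = 0;
    let leaving = 0;
    // Group all records that share the same time point.
    while (i < sorted.length && sorted[i].time === t) {
      if (sorted[i].event) events += 1;
      leaving += 1;
      i += 1;
    }
    if (events > 0) {
      survival *= 1 - events / atRisk;
      curve.push({ time: t, survival });
    }
    // Both events and censored records leave the risk set after time t.
    atRisk -= leaving;
  }
  return curve;
}
```

The returned points are the step positions of the curve, so a chart could draw them as a step line starting from a survival of 1 at time 0.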

Investigate and Repair 'Error: <g> attribute transform: Expected transform function, "null".'

This error is flooding the console log. It should be fixed or handled.

d3.transform @ d3.js:806
Error: attribute transform: Expected transform function, "null".

Fix x-axis labels on horizontal bar-charts

The x-axis labels should be diagonally positioned like we do in a histogram.

Some examples silently exit

TitancSurvivors and newDataSourceConfig fail silently. Why is this, and what should be done or logged?

Disable statistics icon if there are no stats available

Currently the statistics icon is present by default, even when there are no stats associated with an attribute. We should only show the statistics icon when there are stats associated with the attribute. This also gives users a visual cue about which attributes have statistics associated with them.

scatterPlot example

As part of at least one of our examples, we should have a bubbleChart.

Add a numeric console to DataScope

The use case would be one where a user wants to do some simple number crunching on a column. Say splice data in a column and then do a unique count. Or maybe create a new column based on existing data.
This could be a major feature creep so we have to be careful. Maybe start with something simple and see how users respond to it.

Better API Documentation

We probably want better documentation of the Datascope API/routes.

Consolidate Documentation

Since docs are spread across .md/html in repo and bitbucket wikis, we should consolidate them.
It looks like some of the code is formatted for jsdoc, so we should make sure that is consistent, and maybe add jsdoc to a release process.

Increase size of icon that minimizes/maximizes the filters

We need to increase the size of the blue icon and also add some descriptive text that shows up on hover.

Start parameter in table next url ignored

The table does not paginate properly as a result.

Persisting "State" at launch Filter Variables

That is, launch with some variable already filtered, e.g. age between 30 and 40.

Calling a filter via http://localhost:3001/data/?filter={%22Age%22:[30,40]}&dataSourceName=main seems to allow this, but it's reset on each filter selection.

The goal here is to have url parameters passed to datascope which are not reset with user interactions in the instance.
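
One possible shape for this, sketched under assumptions: the `filter` parameter is parsed once at launch and kept as a set of "pinned" filters that are merged into every later selection. The helper names `parseLaunchFilter` and `mergeFilters`, and the `Gender` attribute, are hypothetical and do not exist in Datascope.

```javascript
// Sketch: keep filters supplied in the launch URL "pinned" so that later
// user selections are merged with them instead of replacing them.
// parseLaunchFilter and mergeFilters are hypothetical helper names.
function parseLaunchFilter(launchUrl) {
  const raw = new URL(launchUrl).searchParams.get("filter");
  if (!raw) return {};
  try {
    return JSON.parse(raw);
  } catch (err) {
    // A malformed filter parameter falls back to "no pinned filters".
    return {};
  }
}

function mergeFilters(pinned, userSelection) {
  // Pinned entries are spread last so user interactions cannot reset them.
  return { ...userSelection, ...pinned };
}

const pinned = parseLaunchFilter(
  "http://localhost:3001/data/?filter={%22Age%22:[30,40]}&dataSourceName=main"
);
// "Gender" is a made-up attribute standing in for a user interaction.
const nextFilter = mergeFilters(pinned, { Gender: ["F"] });
```

Whether a pinned filter should be fully locked or only used as a starting value is a design choice this sketch does not settle.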

Crossfilter in Browser

Add Rest call for crossfilter
Add module for crossfilter
Add front-end for crossfilter

Map Visualization Documentation and Repair

Table visualization shows rows as clickable when they are not

The table visualization does not support clickable rows. However the icon changes when one is browsing a table. This gives the user an impression that something is broken when they try to click on a row. Please fix asap

DataTable pagination issues with large datasets

The DataTable is fairly slow on large datasets, with response times of 4-8 seconds. We need to look at an optimized pagination strategy.

Table pagination and backend processing is fairly unoptimized right now since we make a call to .top(Infinity) on one of the dimensions to get the entire data.

dc-js/dc.js#966 is a useful resource to solve this.

Compressing data using data dictionary

The idea is to use compressed encodings of the data that would be available in a dictionary format. Currently I'm using dataDescription.json as the place to define the dictionary for each attribute. For example, for the attribute Specimen_Type the values 0, 1 and 2 correspond to tumor_tissue, normal_tissue and tumor_blood respectively:

    {
        "attributeName": "Specimen_Type",
        "attributeType": [
            "visual", "filtering"
        ],
        "dictionary": {
            "0": "tumor_tissue",
            "1": "normal_tissue",
            "2": "tumor_blood",
            "3": "tumor_marrow"
        }
    }

This involves modifying AppStore.jsx, the front-end data store, to encode and decode data. Decoding is done when the /data?filter={} API is called to fetch data in encoded form from the server. Encoding is done on the filter JSON object.
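
A minimal sketch of the two directions, assuming the dictionary is read from the attribute entry shown above. The helper names `decodeValue` and `encodeFilterValues` are hypothetical and are not taken from AppStore.jsx.

```javascript
// Sketch of dictionary-based decoding/encoding. The attribute entry mirrors
// the dataDescription.json example; the helper names are hypothetical.
const specimenType = {
  attributeName: "Specimen_Type",
  dictionary: {
    "0": "tumor_tissue",
    "1": "normal_tissue",
    "2": "tumor_blood",
    "3": "tumor_marrow",
  },
};

// Decode: map an encoded value from the server to its readable label.
// Values without a dictionary entry are passed through unchanged.
function decodeValue(attribute, code) {
  const key = String(code);
  const known = Object.prototype.hasOwnProperty.call(attribute.dictionary, key);
  return known ? attribute.dictionary[key] : code;
}

// Encode: map readable labels in a filter back to their codes.
function encodeFilterValues(attribute, labels) {
  const reverse = new Map(
    Object.entries(attribute.dictionary).map(([code, label]) => [label, code])
  );
  return labels.map((label) => (reverse.has(label) ? reverse.get(label) : label));
}
```

Note that the codes come back as strings here because JSON object keys are strings; whether the server expects numbers instead is left open.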

More CI Tests

The fact that webpack succeeds isn't by itself a good indicator of a good commit/PR. Thus, we should at least unit test some core functionality.

Fix current tests
Add more tests

Marker Maps don't group items well

Zooming out and in doesn't seem to change the grouping of items in marker maps.

Add support for clickable cells/rows in Table Visualization

Users may want to support the launch of a new entity using elements of a row or a cell. We should support this.

`Redo search in this area` approach for certain visualizations

For certain visualizations, to allow for smoother interactions we might want to consider using "redo search in current area" or for our purposes "filter in current area".

So whenever a user zooms into say a parallel coordinate or a map they have the option to set the appropriate filters in the rest of the dashboard.

Backend Improvements

calls don't appear to be stateless
calls may be too specific to certain visualizations (e.g. table)
support for user queries on data through datascope
documentation

Refactor FilteringAttribute.jsx

Right now the code for all the interactive filter types is in FilteringAttribute.jsx.

We should consider splitting it into separate react components as we've done for Visualizations. So we'll have a directory of react components with each component having a separate file, e.g. PieChart.jsx, BarChart.jsx etc.

wiki/interactiveFilters.json is wrong

https://github.com/sharmalab/Datascope/wiki/interactiveFilters.json

I don't think this example is right

Data in browser

Allow visualizations like parallel coordinates by showing at least some individual data.

Navbar overlaps top of screen when narrow.

Remove old color scheme

On occasion, DataScope reverts to the old color scheme. Or a hybrid of the old and new.

Implement/Fix bubbleChart

It looks like bubbleChart may not actually be implemented:

https://github.com/sharmalab/Datascope/blob/3ec120f41c9d182102796dbf2a8e18222aab629d/public/javascripts/src/components/Visualizations/Visualization.jsx

Highlight filters that undergo significant change

If a certain filter has undergone significant changes due to adjustments in other filters, then we should find a way to highlight it.

Some examples seem to be missing (or incorrectly require) data

UCD: ENOENT: no such file or directory, open 'data/ucd_joined.json'
Druid: no such file or directory, open 'data/out.json' (Is this part of the druid project?)

Expand documentation for supporting custom shapes in GeoChoropleth maps

Image labels bigger for images

Support links in tables

Ideally support for both, but at least one of:
cell as link
row as link

Replacing crossfilter with druid

This would involve

Ingesting data into druid. We'll need to change the dataSource module to ingest data into druid instead of storing it in memory with crossfilter
We'll also need to modify how we process the filtering requests in /data request handler to perform the filtering on druid using something like Plywood.

dashboard.json should be optional, instead breaks

Error: ENOENT: no such file or directory, open 'config/dashboard.json'
at Error (native)
at Object.fs.openSync (fs.js:641:18)
at Object.fs.readFileSync (fs.js:509:33)
at exports.getDashboardConfig (/Users/birm/Desktop/git/Datascope/routes/rest.js:44:16)
at Layer.handle [as handle_request] (/Users/birm/Desktop/git/Datascope/node_modules/express/lib/router/layer.js:95:5)
at next (/Users/birm/Desktop/git/Datascope/node_modules/express/lib/router/route.js:137:13)
at Route.dispatch (/Users/birm/Desktop/git/Datascope/node_modules/express/lib/router/route.js:112:3)
at Layer.handle [as handle_request] (/Users/birm/Desktop/git/Datascope/node_modules/express/lib/router/layer.js:95:5)
at /Users/birm/Desktop/git/Datascope/node_modules/express/lib/router/index.js:281:22
at Function.process_params (/Users/birm/Desktop/git/Datascope/node_modules/express/lib/router/index.js:335:12)
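
One way to make the file optional is to treat ENOENT as "no dashboard configured" and let every other error surface. The sketch below is an assumption about how that could look, not the current code in routes/rest.js; `readOptionalConfig` and the empty-object default are hypothetical.

```javascript
const fs = require("fs");

// Sketch: treat a missing config file as "no dashboard" rather than an error.
// readOptionalConfig and the empty-object default are assumptions, not the
// current behaviour of getDashboardConfig in routes/rest.js.
function readOptionalConfig(path, fallback = {}) {
  try {
    return JSON.parse(fs.readFileSync(path, "utf8"));
  } catch (err) {
    if (err.code === "ENOENT") return fallback; // file absent: use default
    throw err; // malformed JSON or permission problems should still surface
  }
}
```

A handler could then call `readOptionalConfig("config/dashboard.json")` and respond with the default instead of throwing.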

Refactor datatables to use crossfilter dimension.top(k, offset) method

Crossfilter 1.4 has added support for a new top() method that includes the offset: https://github.com/crossfilter/crossfilter/wiki/API-Reference#wiki-dimension_top. This makes pagination easier for us since we don't have to unfurl the entire dataset and slice it every time.
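
As a rough sketch of the change, a table page could be fetched with a single `top(length, start)` call. `getPage` and its parameter names are hypothetical; the `fakeDimension` object is only a stand-in with the same `top(k, offset)` contract so the example runs without crossfilter installed.

```javascript
// Sketch: serve one table page via dimension.top(k, offset) instead of
// slicing the result of dimension.top(Infinity).
// getPage and its parameters are hypothetical; `dimension` is expected to be
// a crossfilter (>= 1.4) dimension.
function getPage(dimension, start, length) {
  return dimension.top(length, start);
}

// Stand-in with the same top(k, offset) contract, used here only so the
// example runs without crossfilter installed.
const fakeDimension = {
  rows: [50, 40, 30, 20, 10],
  top(k, offset = 0) {
    return this.rows.slice(offset, offset + k);
  },
};

const secondPage = getPage(fakeDimension, 2, 2);
```

Here `start` and `length` would presumably map to the paging parameters the table already sends with each request.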

Maps currently require scrolling to see whole map

Filter width seems quite constant

Handle Missing Data

We need to set up our filters and visualizations so that we can exclude missing/invalid data.
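
A possible starting point, sketched under assumptions: decide per value whether it counts as missing, then split rows so charts and filters only see the valid subset. The helper names and the list of "missing" markers (`""`, `na`, `n/a`, `null`) are hypothetical and would need to be agreed on.

```javascript
// Sketch: decide whether a value counts as missing, then split rows so that
// filters and charts can work on the valid subset only.
// The list of "missing" markers is an assumption and would need tuning.
function isMissing(value) {
  if (value === null || value === undefined) return true;
  if (typeof value === "number") return Number.isNaN(value);
  if (typeof value === "string") {
    return ["", "na", "n/a", "null"].includes(value.trim().toLowerCase());
  }
  return false;
}

function splitByMissing(rows, attributeName) {
  const valid = [];
  const missing = [];
  for (const row of rows) {
    (isMissing(row[attributeName]) ? missing : valid).push(row);
  }
  return { valid, missing };
}
```

Keeping the `missing` rows around, rather than dropping them, would leave room to show a count of excluded records next to each visualization.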

Contextual Filter Visualizations

We need to be able to show both the original and filtered data in a way which describes the filter in the context of the data.