rufuspollock-okfn / dataexplorer
View, visualize, clean and process data in the browser.
Home Page: http://explorer.okfnlabs.org
Part of #35 (Scripts & Scripting)
Use CodeMirror here. Some connection with script execution (#46) as part of same UI?
Some nice extras:
Also: rename dataset view to project view.
Currently have "transformations". Let's turn this into full-on scripting in the form of full JS.
Switch to using full Recline multiview so we have full viewer including:
Don't have login as default screen
Have nice code editing using codemirror. Even better would be to have a Run button to try out the code.
Auto save script (after every key stroke, every 30s, after every run?) to local storage so that if browser crashes or you close window you can restore later.
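A minimal sketch of the auto-save idea, assuming the script is stored as a small JSON record under a single key. The names `createScriptAutosaver` and `createMemoryStorage` are hypothetical and not part of the existing code; `storage` is anything with `setItem`/`getItem`, which in a browser would be `window.localStorage`.

```javascript
// Hypothetical helper (not from the dataexplorer code base).
// `storage` must offer setItem(key, value) and getItem(key).
function createScriptAutosaver(storage, key) {
  let lastSaved = null;
  return {
    // Persist the script text only if it changed since the last save.
    save(content) {
      if (content === lastSaved) return false;
      storage.setItem(key, JSON.stringify({ content, savedAt: Date.now() }));
      lastSaved = content;
      return true;
    },
    // Return the saved script text, or null if nothing usable is stored.
    restore() {
      const raw = storage.getItem(key);
      if (raw === null || raw === undefined) return null;
      try {
        return JSON.parse(raw).content;
      } catch (err) {
        return null;
      }
    },
  };
}

// In-memory stand-in for localStorage, so the sketch also runs outside a browser.
function createMemoryStorage() {
  const data = new Map();
  return {
    setItem: (k, v) => { data.set(k, String(v)); },
    getItem: (k) => (data.has(k) ? data.get(k) : null),
  };
}
```

Any of the triggers mentioned above (key stroke, 30s timer, run) could call `save()`. Because unchanged text is skipped, calling it often should be cheap.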
As a User I want to create a Project and associate files to that project (online or local) so that I can reload that project later automatically
A project has:
Notes:
As a User I want to write scripts for a project so that I can re-run those scripts later and thereby recreate the results (e.g. a specific visualization)
As a User I want to create type information about an object
As a User I want to export my data to an online service such as the DataHub
Notes:
As a User I want to Share my project with others so that they can see what I have done
e.g. ?backend=csv&url=....
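A minimal sketch of how such a share link could be built and read back, assuming only the two parameters shown above (`backend` and `url`). The function names are hypothetical. `URL` and `URLSearchParams` are standard in browsers and Node.

```javascript
// Hypothetical helpers for the ?backend=...&url=... link format.
function buildShareLink(base, source) {
  const link = new URL(base);
  // searchParams.set percent-encodes the values, so a data URL that
  // itself contains "?" or "&" survives the round trip.
  link.searchParams.set("backend", source.backend);
  link.searchParams.set("url", source.url);
  return link.toString();
}

function parseShareLink(link) {
  const params = new URL(link).searchParams;
  const backend = params.get("backend");
  const url = params.get("url");
  // Treat a link without both parameters as "no shared source".
  if (!backend || !url) return null;
  return { backend, url };
}
```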
this link: https://github.com/login/oauth/authorize?client_id=2bab62e2f6b27c3ebe1f&scope=repo,%20user
redirects to /src/transformer/?code=58296c98c7388a0a5cfb
but it should redirect to ?code=58296c98c7388a0a5cfb
i think you have to change the github app settings for 2bab62e2f6b27c3ebe1f
Workflow: (google chrome)
Create a project, go to Transform, be unhappy, click "My Projects", create a new project, click "Transform" -> the Transform view does not show; the list view stays.
Workaround: reload and then select the project from My Projects.
Need to save:
Already done this a bunch of times for recline so should not be too hard - if you are thinking of working on this check out the State related stuff in Recline!
Suggest using:
Just playing with data converter from a google spreadsheet. (https://docs.google.com/spreadsheet/ccc?key=0AlgwwPNEvkP7dGxsWFhoeWljWV9BNHVMbFRVRHQyZXc#gid=0) I would like to convert the tags to lower case. Therefore I run the following transform:
function(doc) {
  doc['tags'] = doc['tags'].toLowerCase();
  return doc;
}
If I click "Run on all records" it doesn't give me an indication the running is finished, can we have this?
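One way to give that indication is to have the run report a summary when it finishes. The sketch below is an assumption about how this could look, not the existing implementation: `runOnAllRecords` and its `onDone` callback are hypothetical names. It also guards the transform against records that have no `tags` value, which would otherwise throw.

```javascript
// Variant of the transform above that skips records without a string "tags" field.
function lowerCaseTags(doc) {
  if (typeof doc.tags === "string") {
    doc.tags = doc.tags.toLowerCase();
  }
  return doc;
}

// Hypothetical runner: applies the transform to every record, counts
// failures instead of aborting, and calls onDone with a summary the UI
// could display ("Done: 120 records, 0 failed").
function runOnAllRecords(records, transform, onDone) {
  let failed = 0;
  const results = records.map((doc) => {
    try {
      return transform(doc);
    } catch (err) {
      failed += 1;
      return doc; // keep the record as it was when the error occurred
    }
  });
  if (onDone) onDone({ total: records.length, failed });
  return results;
}
```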
For time being will just be the list of projects.
{
  projects: [
    {
      id: ...
      gist_id: ... # maybe the same
      state: active | deleted
    }
    ....
  ]
}
Name: DataExplorerConfig.json
Boot sequence:
Persistence is automatic on each change ...
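A minimal sketch of working with a config of this shape, assuming it is stored as JSON text and that `state` is either `active` or `deleted` as above. The function names are hypothetical, and where the text is loaded from or saved to is left out.

```javascript
// Parse the stored config text; fall back to an empty project list.
function parseConfig(text) {
  const config = text ? JSON.parse(text) : { projects: [] };
  if (!Array.isArray(config.projects)) config.projects = [];
  return config;
}

// Projects to show in the project list.
function activeProjects(config) {
  return config.projects.filter((p) => p.state === "active");
}

// Return a new config with one project flagged as deleted. Returning a
// fresh object makes it easy to persist on each change.
function markDeleted(config, id) {
  return {
    ...config,
    projects: config.projects.map((p) =>
      p.id === id ? { ...p, state: "deleted" } : p
    ),
  };
}
```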
Project objects encapsulating a given activity around a dataset
Steps to reproduce (chrome)
function(doc) {
  doc['tags'] = doc['tags'].toLowerCase();
  return doc;
}
This is an overview of how a user usually works with data (see attached diagram). Many formats and data services exist, so a modular architecture is needed for flexibility, which in turn gives the most useful user experience.
Data can generally be either serialized into a file and stored somewhere, or accessed using APIs.
CSV uses the memory backend; it is not logically a backend, just a format. A file/document backend with a configurable format would therefore be more flexible.
The user should be asked for additional input as little as possible. Reasonable defaults or auto-detection should be used. For exporting, a live preview of part of the data should be available.
Backends need data from the user to specify credentials or format options. That data can be viewed simply as a JSON object. A general form-building library like Alpaca (based on JSON Schema) or Backbone-Forms can be used. The advantage of this approach is that adding a new format or backend does not require writing UI code.
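To illustrate the idea, a backend could describe its options as a JSON Schema object, and generic code could derive defaults and missing fields from it. This is a sketch under assumptions: the option names (`delimiter`, `encoding`, `url`) and both helper functions are hypothetical, and no form library is involved here.

```javascript
// Hypothetical option description for a CSV format, written as JSON Schema.
const csvOptionsSchema = {
  type: "object",
  properties: {
    delimiter: { type: "string", title: "Delimiter", default: "," },
    encoding: { type: "string", title: "Encoding", default: "utf-8" },
    url: { type: "string", title: "File URL" },
  },
  required: ["url"],
};

// Collect the declared defaults so the user only fills in what is missing.
function defaultsFromSchema(schema) {
  const values = {};
  for (const [name, spec] of Object.entries(schema.properties || {})) {
    if ("default" in spec) values[name] = spec.default;
  }
  return values;
}

// List the required options that still have no value.
function missingRequired(schema, values) {
  return (schema.required || []).filter(
    (name) => values[name] === undefined || values[name] === ""
  );
}
```

A form library would render the same schema as input fields; the point of the sketch is only that the backend itself ships no UI code.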
Instead of just exporting and saving static data, it would be convenient to offer the option of sharing application state (stack of applied operations, queries, transformations, visualization options, etc.) via URL. This makes sharing easy, and if the data is corrected in the original source, all derived data is corrected too.
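A minimal sketch of state sharing via URL, assuming the whole state fits in a JSON object and is carried in the URL fragment. The function names and the fields of the example state are hypothetical.

```javascript
// Put the serialized state after "#", replacing any existing fragment.
function encodeState(baseUrl, state) {
  const withoutHash = baseUrl.split("#")[0];
  return withoutHash + "#" + encodeURIComponent(JSON.stringify(state));
}

// Read the state back; return null when the link carries no usable state.
function decodeState(link) {
  const index = link.indexOf("#");
  if (index === -1 || index === link.length - 1) return null;
  try {
    return JSON.parse(decodeURIComponent(link.slice(index + 1)));
  } catch (err) {
    return null;
  }
}
```

Because the link stores the source location and the operations rather than the data, opening it re-fetches the source, which is what lets corrections in the original data flow through.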
recline issues:
dataexplorer issues:
Have been intermittently getting an error where SlickGrid does not display and have in console: "Uncaught Error: Cannot find stylesheet." (slick.grid.min.js:39)
This is the primary work area. Want to get the layout right. Key principles:
Where do we display things e.g. grid and script editor together?
Want this so we can do login at any time without disturbing process in main window
When I click "My Projects" after logging in through GitHub, nothing happens. No errors appear in the console.
Chrome v23.0.1271.97 m
Nice page giving a quick intro and overview of what is on offer
We should be able to save and load clean up scripts from github
Part of #35 (scripts & scripting)
Should look like a gist pretty much :-)
{
  # aka name (but unique)
  id: ...
  # the content of the script
  content:
  language: javascript
}
Possible for the future
# e.g. transform, standard ...
type: ...
# for remote scripts (i.e. ones you import and reuse)
url:
Getting to the point where development will become unsustainable without tests ...
Doubt this is needed but worth recording anyway.
Not needed because you could just reload the source data and re-run the script ...
Assume accessible via CORS or on same domain
This would be awesome with gdocs as a backend!
Google now use OAuth. This is normally a PITA to support (witness the hassle to get login to github via oauth) but Google specifically support client side stuff:
The Google OAuth 2.0 Authorization Server supports JavaScript applications (JavaScript running in a browser). Like the other scenarios, this one begins by redirecting a browser (popup, or full-page if needed) to a Google URL with a set of query string parameters that indicate the type of Google API access the application requires. Google handles the user authentication, session selection, and user consent. The result is an access token. The client should then validate the token. After validation, the client includes the access token in a Google API request.
To find out we need the Google Docs on OAuth for Client Side Apps
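The flow quoted above can be sketched as two small steps: build the redirect URL, then read the access token from the URL fragment when the browser comes back. This is a sketch under assumptions, to be checked against Google's documentation: the parameter names follow the standard OAuth 2.0 implicit grant (`client_id`, `redirect_uri`, `response_type=token`, `scope`), the authorization endpoint is passed in rather than hard-coded, and both function names are hypothetical. Token validation is not shown.

```javascript
// Build the URL the browser (popup or full page) is sent to.
// `endpoint` is the provider's authorization URL; take it from the provider's docs.
function buildAuthUrl(endpoint, clientId, redirectUri, scope) {
  const url = new URL(endpoint);
  url.searchParams.set("client_id", clientId);
  url.searchParams.set("redirect_uri", redirectUri);
  url.searchParams.set("response_type", "token"); // implicit grant: token in fragment
  url.searchParams.set("scope", scope);
  return url.toString();
}

// After the redirect back, the token arrives in the fragment, e.g.
// "#access_token=...&token_type=Bearer&expires_in=3600".
function parseTokenFromFragment(fragment) {
  const params = new URLSearchParams(fragment.replace(/^#/, ""));
  const accessToken = params.get("access_token");
  if (!accessToken) return null;
  return {
    accessToken,
    expiresIn: Number(params.get("expires_in")) || null,
  };
}
```

In a browser the fragment would come from `window.location.hash` on the redirect page.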
For scripting to be really useful we need some standard functions
=> We need an ajax library - see #66