
discovery-app's People

Contributors

andrewsu, everaldorodrigo, flaneuse, gtsueng, jmcmurry, juliamullen, marcodarko, newgene, nikkibytes, remoteeng00


discovery-app's Issues

prepopulate dataset registration form based on a URL (to figshare, dataverse, NCBI GEO, etc.)

For people who have already registered their dataset in a repository (like figshare, dataverse, NCBI GEO, etc.), it would be nice to pre-populate our form with the matching fields. Note that some repositories (e.g., figshare, dataverse) already expose structured metadata via schema.org standards. For others (e.g., NCBI GEO), we'd need to provide the mapping ourselves, leveraging the work we did with metadataplus.
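For repositories that embed schema.org metadata in their landing pages, a minimal sketch of the extraction step might look like the following. It assumes the metadata is published as a `<script type="application/ld+json">` block and that the page HTML has already been downloaded; `FIELD_MAP` and `prefill_from_html` are hypothetical names, not part of the DDE code.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.documents.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks instead of failing the whole page


# Hypothetical mapping: schema.org property -> registration form field.
FIELD_MAP = {
    "name": "name",
    "description": "description",
    "identifier": "identifier",
    "url": "url",
}


def prefill_from_html(html):
    """Return form values taken from the first schema.org Dataset found in the page."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    for doc in parser.documents:
        if isinstance(doc, dict) and doc.get("@type") == "Dataset":
            return {field: doc[prop] for prop, field in FIELD_MAP.items() if prop in doc}
    return {}
```

Repositories without embedded metadata (the NCBI GEO case) would need a separate, hand-written mapping in place of `FIELD_MAP`.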

Suggested NIAID/DDWG 2020-10-20

Quick edit functionality seems to be pulling the wrong schema

When I try to edit a dataset's metadata, for instance to fix author.affiliation to make it an object, I get a validation error that creator is a required field. According to the schema, creator isn't required: author is. Maybe pulling the incorrect schema for validation?

error message:

'creator' is a required property
Hint: name,description,creator,publisher,identifier

use an already-registered dataset as a template

Something like "create a new dataset record like this". Not sure whether to display that on the existing dataset details page, or have a dataset selector on the dataset registration page, or something else... This can be accomplished by downloading and importing the JSON, so this would be an easier way of accomplishing that...
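As a rough sketch of what "create a new record like this" could do behind the scenes: copy the existing metadata and clear the fields that have to be unique per record. Which fields to clear is an assumption here (`FIELDS_TO_RESET` is a hypothetical list), as is the function name.

```python
import copy

# Hypothetical list of fields that should not carry over to the new record.
FIELDS_TO_RESET = ("identifier", "_id", "url")


def new_record_from_template(existing):
    """Return an independent draft based on an existing dataset record."""
    draft = copy.deepcopy(existing)  # deep copy so nested objects are not shared
    for field in FIELDS_TO_RESET:
        draft.pop(field, None)
    return draft
```

The deep copy matters because nested objects such as `author` would otherwise be shared between the template and the draft, so editing one would silently change the other.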

Write help guide about ways to use guide

Based off of feedback from the 2020-10-20 NIAID DDWG working group demo, it seems like it'd be helpful to outline all the ways the /guide can be used. A few use cases:

  1. Manually type everything into guide
  2. Upload a .json for a single dataset (and possibly add/verify/augment fields)
  3. Upload a .csv for a single dataset (and possibly add/verify/augment fields)
  4. Upload from Github
  5. Upload multiple datasets in a .json (and possibly add/verify/augment fields): see #12
  6. Upload multiple datasets in a .csv (and possibly add/verify/augment fields): see #12
  7. Modify a single property for a registered dataset (change in registered dataset list)
  8. Modify a bunch of properties for a registered dataset (import in guide)
  9. Use an existing dataset as a template; for instance, load everything, then change identifier and make minor adjustments to other fields: see #34
  10. Import dataset metadata from an NCBI repo and augment / conform to our schema: see #35

Bug: Attempting to derive a new schema from a non-schema.org schema throws a validation error.

Using the DDE playground to create a new schema seems to work fine when 'extending' it from a schema.org class. The schema generated in the preview can be saved to github, loaded into the schema playground and correctly visualized. Example schema.

This is not the case when you try to extend from a (non-schema.org) schema registered in the DDE playground such as the outbreak schema or even the BioMedicalDataset schema. Example 1: bte:BioMedicalDataset schema-derived class. Example 2: outbreak:Dataset schema-derived class.

The new schemas created from the registered class give the following error when you try to load them: "Failed because: field about in $validation is not correctly documented"
image

Minor UX: copy text input from measurementTechnique to "add custom value" input box when registering Dataset

When you search for a term against an ontology, it'd be nice to transfer the input search term to the "Didn't find what you are looking for? Enter value here:" input box.

Steps to reproduce:

  1. https://discovery.biothings.io/guide/niaid
  2. Add "metagenomics" as measurementTechnique
  3. Idealized behavior: "metagenomics" would already appear in "Didn't find what you are looking for? Enter value here:" so you can change or submit.

Improve controlled vocabulary selection

Suggestion from 2020-10-20 DDWG working group meeting.

When one enters a term that has a lot of hits, it's really hard to find what you're looking for in the long list of matching terms. For instance: if you enter "influenza a" for "infectiousAgent" in the NIAID dataset schema, the terms are cut off (since they're long):
Screen Shot 2020-10-20 at 1 17 32 PM

Suggest:

  1. Sorting the list by best match first; it looks like spaces are getting split in some cases (sorry, don't have an example), so you're getting partial matches for things like "influenza"
  2. Add a search box at the top to filter the results. For instance, if you typed "Washington", it'd only give you things like "Influenza A virus (A/Washington2958/2012(H1N1))"
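Both suggestions can be sketched in a few lines. This is only an illustration of the ranking and filtering idea, not how the ontology lookup currently works; the tiers (exact match, prefix, whole-phrase substring, everything else) are an assumption about what "best match" should mean.

```python
def rank_terms(query, terms):
    """Sort terms so the closest matches to the whole query phrase come first."""
    q = query.strip().lower()
    q_words = q.split()

    def score(term):
        t = term.lower()
        if t == q:
            tier = 0          # exact match
        elif t.startswith(q):
            tier = 1          # term starts with the full phrase
        elif q in t:
            tier = 2          # full phrase appears somewhere in the term
        else:
            tier = 3          # only partial overlap, if any
        # Count query words that do not appear anywhere in the term.
        missing = sum(1 for word in q_words if word not in t)
        # Shorter terms win ties so long strain names sink below the generic term.
        return (tier, missing, len(t))

    return sorted(terms, key=score)


def filter_terms(terms, needle):
    """Keep only terms containing the text typed into the filter box."""
    n = needle.strip().lower()
    return [term for term in terms if n in term.lower()]
```

Matching on the whole phrase before falling back to individual words avoids the partial "influenza" hits outranking "influenza a" results.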

improve ease of drag and drop for validation editor

Right now, it's difficult to drag and drop from the validation options to properties further down the list.
image

Either enable scrolling of the list itself (so user can scroll to property) and then do the drag and drop, or allow the drag/drop menu to float so that when user scrolls down the screen, the drag/drop menu doesn't move out of view.

error on dataset registration

I got to this error state when I click the "Register" button:

image

This is what I filled out for required fields (left all recommended ones blank):

image

Currently, it looks like there is a "silent" error -- I would expect that any issue would be flagged in the web interface...

Suggest already registered datasets to prevent duplicate datasets / use as a template

It's possible that users might try to register a dataset that already exists in the DDE, or they might want to work from a template of a similar dataset (see #34).

Similar to how Stack Overflow suggests you look at previously asked/answered questions rather than duplicating a question, it'd be cool as you enter data for it to suggest that the dataset might already be registered.

Suggested ways of detecting similarity:

  • name / title (exact / Jaccard similarity)
  • description
  • author/creator
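A minimal sketch of the name comparison, assuming token-set (Jaccard) similarity on lowercased words and an arbitrary threshold of 0.5; the function names and the threshold are illustrative only.

```python
import re


def tokens(text):
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two strings: shared tokens / all distinct tokens."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def suggest_duplicates(new_name, registered_names, threshold=0.5):
    """Return registered names that look similar to the one being entered, best first."""
    scored = [(jaccard(new_name, name), name) for name in registered_names]
    return [name for score, name in sorted(scored, reverse=True) if score >= threshold]
```

The same `jaccard` function could be applied to descriptions; author/creator matching probably needs exact comparison on a normalized name instead.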

Bug: new property on an extended schema throwing error when trying to visualize

Error message: "field analysisCode in $validation is not correctly documented"

Steps to reproduce:
Try to visualize https://raw.githubusercontent.com/SuLab/niaid-data-portal/master/schema/Demo-Extended-Dataset.jsonld on https://discovery.biothings.io/schema-playground

Steps to create .json schema

  1. https://discovery.biothings.io/registry
  2. Search for a dataset (NIAID); select NIAID dataset
  3. Create namespace, class
  4. Add custom property analysisCode of type schema:SoftwareSourceCode

Improve display of objects in Guide

  1. In the Guide, sometimes the description of the property isn't listed for nested objects, e.g. for NIAID dataset:

Screen Shot 2020-10-20 at 1 36 10 PM

Screen Shot 2020-10-20 at 1 35 09 PM

... and sometimes the description is given:

Screen Shot 2020-10-20 at 1 36 05 PM

  2. Add asterisks or similar to make explicit which properties are required for objects within a schema (e.g. funder.name, identifier are required; rest are optional for NIAID:funding)

  3. Specify the parent object in the field name somehow. For instance, NIAID has two descriptions: description and funding.description. When filling in the funding description, some users were confused about whether that was referring to the dataset description or the funding description

Based off of feedback from 2020-10-20 NIAID DDWG call.

Bug: When registering a dataset, children seem to be inheriting their parent's marginality

Example: Why is url a required property on creator in NIAID? It should be optional according to the schema:
https://github.com/SuLab/niaid-data-portal/blob/master/schema/NIAIDDataset.json

Steps to reproduce:

  1. https://discovery.biothings.io/guide/niaid
  2. Add a creator; try to bypass url.

See also: funding:description, funding:url, funding:parentOrganization, ...

dataset registration form improvements

I have two suggestions to improve the dataset registration form:

  • Form asks for "The identifier of the dataset if available", but it's a required field and the form won't allow moving on unless that's filled. So suggest removing "if available".
  • It is unclear what "Reusable person definition" means.

image

Bug: need to de-duplicate `required` field when extending a schema.

🐛 When extending a schema that has re-used a schema.org property, if you make one of those properties required, it'll duplicate it in the required object. This will throw an error when you try to visualize the schema. Need to de-duplicate the required list.

Steps to replicate:

  1. https://discovery.biothings.io/registry
  2. Search for a dataset (NIAID); select NIAID dataset
  3. Enter namespace, class
  4. Choose "name" to be re-used and required
  5. Save .json, upload to GitHub
  6. Try to visualize it on https://discovery.biothings.io/schema-playground
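One possible shape for the fix, sketched under the assumption that the schema is available as a plain dict before it is saved: walk the structure and de-duplicate every `required` list while keeping the original order. `dedupe_required` is a hypothetical helper name.

```python
def dedupe_required(schema):
    """Return a copy of the schema with duplicates removed from each `required` list."""
    if isinstance(schema, dict):
        cleaned = {key: dedupe_required(value) for key, value in schema.items()}
        required = cleaned.get("required")
        # Only touch lists of property names; a property that happens to be
        # called "required" would be a dict and is left alone.
        if isinstance(required, list) and all(isinstance(r, str) for r in required):
            cleaned["required"] = list(dict.fromkeys(required))  # order-preserving
        return cleaned
    if isinstance(schema, list):
        return [dedupe_required(item) for item in schema]
    return schema
```

`dict.fromkeys` keeps the first occurrence of each name, so `["name", "description", "name"]` becomes `["name", "description"]`.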

Allow removal or editing of the input type from the DDE schema editor interface

When creating a new property, the schema editor provides an easy-to-use interface, UNLESS, you happen to make a mistake at the input type step. Since there's no way to remove an erroneous input type, you have no choice but to submit the erroneous property, delete it, and start all over.
image

The temptation is to click in the box listing the input types and to hit backspace (in order to delete the erroneous input type), which doesn't delete it, but can send you back one page (in the browser) which can potentially cause the user to need to restart.

Schema playground: non-existing namespace shows cached schemas

This issue seems to have two behaviors:

  1. If accessing https://discovery.biothings.io/view/ctsa_blah/ for the first time, the page shows an empty schema list, which is correct, but it needs a more explicit message saying that the namespace does not exist.

  2. If accessing https://discovery.biothings.io/view/ctsa_blah/ after viewing another schema (e.g. by clicking one of the example links), the page somehow shows the content from the previous schema. It looks like it picked up the cached data from local storage.

Event Issue Tracking

When there's an API event (someone added a schema, deleted a dataset, ...), we send a notification through `class SchemaNotifier(Notifier):` to predefined channels such as `class N3CChannel(Channel):`.

What we are interested in next is making this one-way operation two-way, enabling things like:

  1. record the response from the notification channel, and save it to the corresponding document's _meta field.
  2. some notification channels may support interaction, for example, slack notifications can define buttons for users to click.
  3. other long-lived interaction sessions.

Note we are not planning on supporting anything but the first use case now, but ideally the structure should support further extension.
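To make the first use case concrete, here is a rough sketch of recording each channel's response under the document's `_meta` field. The classes below are simplified stand-ins written for this example, not the existing `Notifier`/`Channel` implementations, and the `notifications` key inside `_meta` is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Event:
    action: str   # e.g. "schema_added" or "dataset_deleted"
    doc_id: str


class Channel:
    """Simplified stand-in for a notification channel."""

    name = "base"

    def send(self, event):
        """Deliver the event and return the channel's response as a dict."""
        raise NotImplementedError


class EchoChannel(Channel):
    """Toy channel that just acknowledges the event."""

    name = "echo"

    def send(self, event):
        return {"status": "ok", "ref": f"{event.action}:{event.doc_id}"}


def notify_and_record(event, channels, document):
    """Send the event to every channel and store each response in document['_meta']."""
    responses = document.setdefault("_meta", {}).setdefault("notifications", {})
    for channel in channels:
        try:
            responses[channel.name] = channel.send(event)
        except Exception as exc:  # a failing channel should not block the others
            responses[channel.name] = {"status": "error", "detail": str(exc)}
    return document
```

Keying the responses by channel name leaves room for the later use cases: an interactive channel could update its own entry when a user responds, without changing the structure.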

Nested properties on Schema viewer don't show required fields

In the NiaidDataset, funding.funder, funding.identifier, and funding.funder.name should be required:

"monetaryGrant": {
  "type": "object",
  "@type": "MonetaryGrant",
  "description": "Funding that supports (sponsors) the collection of this dataset through some kind of financial contribution",
  "properties": {
    "funder": {
      "description": "An organization associated with a creator or funder of a dataset",
      "oneOf": [
        {
          "$ref": "#/definitions/funder"
        },
        {
          "type": "array",
          "items": {
            "$ref": "#/definitions/funder"
          }
        }
      ]
    },
    "description": {
      "type": "string",
      "description": "description about the funding award / grant"
    },
    "url": {
      "type": "string",
      "description": "award / grant URL"
    },
    "identifier": {
      "type": "string",
      "description": "Unique identifier(s) for the grant(s) used to fund the Dataset"
    }
  },
  "required": [
    "funder",
    "identifier"
  ]
}

The display doesn't show this:
Screen Shot 2020-12-11 at 12 29 24 PM
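For reference, the set of nested required fields the viewer would need to mark can be derived by walking each object's own `required` list. This is a sketch only: it handles inline `properties` and does not resolve `$ref` or `oneOf`, so `funding.funder.name` would need reference resolution on top of it. The function name is hypothetical.

```python
def required_paths(schema, prefix=""):
    """List dotted paths of properties that are required by their parent object."""
    paths = []
    required = schema.get("required", [])
    for name, subschema in schema.get("properties", {}).items():
        path = f"{prefix}{name}"
        if name in required:
            paths.append(path)
        # Recurse into inline nested objects, using the child's own `required` list.
        if isinstance(subschema, dict) and subschema.get("type") == "object":
            paths.extend(required_paths(subschema, prefix=path + "."))
    return paths
```

Because each level consults only its own `required` list, a nested property is marked required relative to its parent, regardless of whether the parent itself is required.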

switch dates to ISO 8601

When registering a dataset (I think in the DataDistribution section), I see this dialog box:

image

To promote good data practices, suggest switching to YYYY-MM-DD to follow ISO 8601 standard.
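A small sketch of the conversion and validation involved. The current input format is assumed to be MM/DD/YYYY purely for illustration (the dialog's actual format is only visible in the screenshot); the function names are hypothetical.

```python
import re
from datetime import date, datetime


def to_iso_date(text, input_format="%m/%d/%Y"):
    """Convert a date string in the assumed input format to YYYY-MM-DD."""
    return datetime.strptime(text, input_format).date().isoformat()


def is_iso_date(text):
    """Return True only for a real calendar date written as YYYY-MM-DD."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", text):
        return False
    try:
        date.fromisoformat(text)
    except ValueError:
        return False  # right shape but not a real date, e.g. 2020-02-30
    return True
```

The explicit pattern check is there because newer Python versions let `date.fromisoformat` accept other ISO 8601 spellings, while the form would want exactly YYYY-MM-DD.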

Schema editor is pulling outdated versions of schema.org schemas

https://discovery.biothings.io/editor shows both variableMeasured (right) and variablesMeasured (wrong) when you try to extend schema:Dataset.

CreativeWork shows 89 inherited properties, but I count 103 on the current Dataset page (counting by hand so might be off by +/- 1).

Suggest pulling schema.org properties from their latest definition at https://schema.org/version/latest/schemaorg-current-http.jsonld (schema.org for developers)
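To double-check the property counts without counting by hand, something like the sketch below could be run against the `@graph` array of that file. It assumes the JSON-LD uses `rdf:Property` / `rdfs:Class` types with `schema:domainIncludes` and `rdfs:subClassOf` references, which is my understanding of the published format but should be verified against the actual download. The function names are hypothetical.

```python
def as_list(value):
    """JSON-LD values may be a single item or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def ancestors(graph, class_id):
    """Return class_id followed by all of its superclasses."""
    parents = {
        node["@id"]: [p["@id"] for p in as_list(node.get("rdfs:subClassOf"))]
        for node in graph
        if "rdfs:Class" in as_list(node.get("@type"))
    }
    seen, stack = [], [class_id]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.append(current)
        stack.extend(parents.get(current, []))
    return seen


def properties_for(graph, class_id):
    """Map the class and each ancestor to the sorted properties declared on it."""
    lineage = ancestors(graph, class_id)
    result = {cls: [] for cls in lineage}
    for node in graph:
        if "rdf:Property" not in as_list(node.get("@type")):
            continue
        for domain in as_list(node.get("schema:domainIncludes")):
            if domain["@id"] in result:
                result[domain["@id"]].append(node["@id"])
    return {cls: sorted(props) for cls, props in result.items()}
```

With the downloaded file loaded via `json.load`, `properties_for(data["@graph"], "schema:Dataset")` would give per-class lists whose lengths can be compared with the counts shown in the editor.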

modifications to Discovery Guide

Currently, the "Discovery Guide" link in the header goes to https://discovery.biothings.io/best-practices. A few possible suggestions:

  1. Is /best-practices the best location for this page? "Best practices" and "Discovery Guide" seem like two different things...

  2. At the top of that page, perhaps have a short text description of what a "Discovery Guide" is (not obvious from that phrase alone). Basically explain that it is a wizard that allows dataset registration according to a given schema.

  3. Now that we have three guides, have links on this page to all three of them
