syngo2lego_data_conversion's Introduction

syngo2lego_data_conversion

A repo for conversion of syngo JSON to GO-CAM OWL

syngo2lego_data_conversion's People

Contributors

dosumis, dustine32, ftwkoopmans

syngo2lego_data_conversion's Issues

Some SynGO evidence codes are no longer children of IDA

In the SynGO data, the following evidence codes should be replaced so that they map up to IDA:

  • ECO:0000164 electrophysiology assay evidence > should be changed to ECO:0006006 electrophysiology assay evidence used in manual assertion
  • ECO:0001120 radioisotope assay evidence > should be changed to ECO:0001254 radioisotope assay evidence used in manual assertion
  • ECO:0005593 immunodetection assay evidence > should be changed to ECO:0007719 immunodetection assay evidence used in manual assertion
  • ECO:0006065 in vitro cell based assay evidence used in manual assertion > obsolete; should be replaced by ECO:0007695 cell-based assay evidence used in manual assertion
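The replacements listed above could be applied with a simple lookup table. This is only a minimal sketch: the ECO IDs come from the list, but the `remap_evidence` helper and the `"evidence"` field name are hypothetical and may not match the actual SynGO JSON layout.

```python
# Old ECO ID -> replacement ECO ID, copied from the list above.
ECO_REPLACEMENTS = {
    "ECO:0000164": "ECO:0006006",  # electrophysiology assay evidence
    "ECO:0001120": "ECO:0001254",  # radioisotope assay evidence
    "ECO:0005593": "ECO:0007719",  # immunodetection assay evidence
    "ECO:0006065": "ECO:0007695",  # obsolete in vitro cell based assay evidence
}


def remap_evidence(annotation):
    """Return a copy of an annotation dict with outdated ECO IDs replaced.

    Assumes (hypothetically) that the annotation stores its evidence
    codes as a list of ECO IDs under the "evidence" key.
    """
    updated = dict(annotation)
    updated["evidence"] = [
        ECO_REPLACEMENTS.get(eco_id, eco_id)  # leave unknown IDs untouched
        for eco_id in annotation.get("evidence", [])
    ]
    return updated
```

IDs that are not in the table pass through unchanged, so the helper can be run over all annotations without filtering first.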

Tagging @pgaudet @dustine32

Update extensions structure in JSON to explicitly define order in models

Currently, the JSON structure for terms under a relation in the extensions field only allows a list, which may be unordered. If we decide to change this spec to better represent the desired nesting structure of the SynGO GO-CAM models, I have some ideas.

The existing structure for SYNGO_2082 is:

"extensions": [
    {
        "occurs_in": [
            "UBERON:0001876",
            "GO:0098982"
        ]
    }
]

Perhaps, to represent the desired nesting order it should be:

"extensions": [
    {
        "occurs_in": {
            "GO:0098982": {
                "part_of": "UBERON:0001876"
            }
        }
    }
]

Here the type of each value would determine whether the chain continues: a dictionary means "keep going", whereas a string means "we're done".
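To make the "dictionary = keep going, string = we're done" idea concrete, here is a rough sketch of how a converter might walk the nested structure. The `flatten_chain` function is hypothetical and is not part of the existing conversion code.

```python
def flatten_chain(node):
    """Yield (relation, term) pairs in nesting order.

    `node` maps a relation name to either:
      - a string (a term ID): the chain terminates here, or
      - a dict mapping a term ID to a further nested `node`.
    """
    for relation, value in node.items():
        if isinstance(value, str):
            # String value: "we're done".
            yield (relation, value)
        else:
            # Dictionary value: "keep going".
            for term, nested in value.items():
                yield (relation, term)
                yield from flatten_chain(nested)


# The proposed SYNGO_2082 extension from above.
extension = {
    "occurs_in": {
        "GO:0098982": {
            "part_of": "UBERON:0001876"
        }
    }
}

chain = list(flatten_chain(extension))
# chain == [("occurs_in", "GO:0098982"), ("part_of", "UBERON:0001876")]
```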

There's also the method used in the ontobio python library for parsing GAF/GPAD, which structures extensions as described here.

After adjusting @cmungall's example a bit (replacing the "union_of" key with "extensions"):

"extensions": [
    {"intersection_of": [
        {"property": P1, "filler": F1},
        ...
    ]},
    ...
]

And then putting in our example values:

"extensions": [
    {"intersection_of": [
        {"property": "occurs_in", "filler": "GO:0098982"},
        ...
    ]},
    {"intersection_of": [
        {"property": "part_of", "filler": "UBERON:0001876"},
        ...
    ]},
    ...
]

Hm, though now I see that the ontobio solution doesn't really specify order either. Perhaps our use case in SynGO is so specialized that we shouldn't worry about changing the spec and should just handle this in code? What do you guys think, @ftwkoopmans @thomaspd?
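If the ordering were handled in code rather than in the spec, one option would be to sort the flat list by ontology prefix before building the chain. The sketch below assumes, based only on the SYNGO_2082 example, that GO terms come first and UBERON terms are nested under them via part_of. `PREFIX_ORDER`, `order_terms`, and `to_chain` are hypothetical names, and the rule may not hold for every SynGO model.

```python
# Assumed nesting priority: lower number = closer to the annotated term.
# Derived from the single SYNGO_2082 example, so treat it as a guess.
PREFIX_ORDER = {"GO": 0, "UBERON": 1}


def order_terms(terms):
    """Sort term IDs by ontology prefix; unknown prefixes go last."""
    return sorted(
        terms,
        key=lambda term: PREFIX_ORDER.get(term.split(":")[0], len(PREFIX_ORDER)),
    )


def to_chain(relation, terms, nested_relation="part_of"):
    """Turn an unordered term list into an ordered (relation, term) chain."""
    ordered = order_terms(terms)
    if not ordered:
        return []
    # The first term keeps the original relation; the rest are nested.
    return [(relation, ordered[0])] + [(nested_relation, t) for t in ordered[1:]]


chain = to_chain("occurs_in", ["UBERON:0001876", "GO:0098982"])
# chain == [("occurs_in", "GO:0098982"), ("part_of", "UBERON:0001876")]
```

This keeps the JSON spec unchanged, at the cost of encoding the nesting rule in the converter, where it is less visible to curators.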

Annotations in SynGO tool do not correspond to what gets exported

I am not sure whether this is the right tracker for this issue. Please let me know if I should raise it in a different tracker.

There appears to be a problem with:

  • either SynGO to GO-CAM conversion;
  • or GO-CAM to GPAD conversion.

What needs to be addressed:

  1. If SynGO VU curators made an annotation to "regulation of synapse disassembly/pruning", then this is what should be displayed in QuickGO and AmiGO (not direct annotations to "synapse disassembly/pruning").
  2. If a single annotation was made by a SynGO VU curator, this one single annotation should be displayed in QuickGO and AmiGO (not multiple copies of it, corresponding to multiple evidence codes).
  3. If annotations from 5 different PMIDs have the implementationReady status, then why do annotations from only 1 of these PMIDs get exported?
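For point 2, a quick way to check whether a single curated annotation has been split into several exported rows is to group the exported rows by everything except the evidence code. This is only a diagnostic sketch; the field names (`gene`, `term`, `reference`, `evidence`) are hypothetical and would need to be mapped to the actual GPAD columns.

```python
from collections import defaultdict


def group_by_annotation(rows):
    """Group exported rows that differ only in their evidence code.

    Each row is assumed to be a dict with hypothetical keys
    "gene", "term", "reference" and "evidence".
    Returns {(gene, term, reference): [evidence codes, ...]}.
    """
    grouped = defaultdict(list)
    for row in rows:
        key = (row["gene"], row["term"], row["reference"])
        grouped[key].append(row["evidence"])
    return dict(grouped)


def find_split_annotations(rows):
    """Return only the groups that appear more than once in the export."""
    return {k: v for k, v in group_by_annotation(rows).items() if len(v) > 1}
```

Any key returned by `find_split_annotations` corresponds to one gene/term/reference combination that was exported as multiple rows.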

Details:

When you search for 'synapse disassembly' in the SynGO VU GO annotation tool, the search reveals that there are 8 annotations made to GO term 'regulation of synapse disassembly' (renamed to: 'regulation of synapse pruning'), and one direct annotation to 'synapse disassembly' (renamed to: 'synapse pruning'):

[Screenshot: SynGO VU GO annotation tool, accessed 13 Aug 2018].

They all have the 'implementationReady' status, which suggests to me that they all would have been converted to GO-CAMs, and so further to GPADs, then imported by QuickGO and AmiGO.

A search in QuickGO for the listed PMIDs, filtered for 'SynGO' and for 'GO:0098883 synapse pruning' reveals that:

  • Only some of these annotations have been exported;
  • Each single annotation exported from the SynGO VU annotation tool was translated into multiple annotations;
  • The annotations originally made by curators using the SynGO VU annotation tool were made to the "regulation of synapse disassembly/pruning" terms, whereas the publicly displayed annotations are direct "synapse disassembly/pruning" annotations.

Specifically, annotations based on PMID:17143272 made for Epha4 and Cdk5 get displayed, but the 'regulation of...' is lost:
[Screenshot: QuickGO browser, accessed 14 Aug 2018].

A search in AmiGO (filtered for 'SynGO' and for 'GO:0098883 synapse pruning') reveals the same outcome:

  • Only some of these annotations have been exported;
  • Each single annotation exported from the SynGO VU annotation tool was translated into multiple annotations;
  • The annotations originally made by curators using the SynGO VU annotation tool were made to the "regulation of synapse disassembly/pruning" terms, whereas the publicly displayed annotations are direct "synapse disassembly/pruning" annotations.

The only difference between what is shown in QuickGO and AmiGO is that fewer multiples of each annotation get displayed in AmiGO:
[Screenshot: AmiGO2 browser, accessed 14 Aug 2018].

Btw, substituting 'regulation of synapse pruning' for 'synapse pruning' in these searches reveals no SynGO annotations at all.

cc
@RLovering
@ftwkoopmans
@cmungall
@dustine32
