geneontology / syngo2lego_data_conversion
A repo for conversion of syngo JSON to GO-CAM OWL
In the SynGO data, these evidence codes should be replaced so that they map up to IDA:
Tagging @pgaudet @dustine32
Currently, the JSON structure for terms under a relation in the extensions field only allows a list, which may be unordered. If we think we should change this spec to better represent the desired nesting structure of the SynGO GO-CAM models, I have some ideas.
The existing structure for SYNGO_2082 is:
"extensions": [
  {
    "occurs_in": [
      "UBERON:0001876",
      "GO:0098982"
    ]
  }
]
Perhaps, to represent the desired nesting order it should be:
"extensions": [
  {
    "occurs_in": {
      "GO:0098982": {
        "part_of": "UBERON:0001876"
      }
    }
  }
]
where the type of each value determines whether the chain terminates: a dictionary means "keep going", whereas a string means "we're done"?
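A minimal sketch of how a converter might read the nested form proposed above, assuming that each dictionary key is a term, each nested dictionary maps one relation to the next link, and a plain string ends the chain. The function name `flatten_extension` is hypothetical and is not part of the existing conversion code.

```python
def flatten_extension(relation, value):
    """Return the ordered (relation, term) pairs encoded by a nested extension.

    A string value ends the chain; a dictionary value means the chain continues.
    """
    if isinstance(value, str):
        # String = "we're done": this term is the last link.
        return [(relation, value)]

    chain = []
    for term, nested in value.items():
        # Dictionary = "keep going": record this term, then descend.
        chain.append((relation, term))
        for nested_relation, nested_value in nested.items():
            chain.extend(flatten_extension(nested_relation, nested_value))
    return chain


# The proposed structure for SYNGO_2082.
extensions = [
    {"occurs_in": {"GO:0098982": {"part_of": "UBERON:0001876"}}}
]

chains = [
    flatten_extension(relation, value)
    for extension in extensions
    for relation, value in extension.items()
]
print(chains)
# [[('occurs_in', 'GO:0098982'), ('part_of', 'UBERON:0001876')]]
```

The nesting itself carries the order, so the result does not depend on list ordering in the JSON.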
There's also the method used in the ontobio python library for parsing GAF/GPAD, which structures extensions as described here.
After adjusting @cmungall's example a bit (replacing the "union_of" key with "extensions"):
"extensions": [
  {"intersection_of": [
      {"property": P1, "filler": F1},
      ...
    ]
  },
  ...
]
And then putting in our example values:
"extensions": [
  {"intersection_of": [
      {"property": "occurs_in", "filler": "GO:0098982"},
      ...
    ]
  },
  {"intersection_of": [
      {"property": "part_of", "filler": "UBERON:0001876"},
      ...
    ]
  },
  ...
]
Hm, though now I see that the ontobio solution doesn't really specify order. Our use case in SynGO is so specialized that perhaps we shouldn't worry about changing the spec and should just handle this in code. What do you guys think, @ftwkoopmans @thomaspd?
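If the ordering were handled in code rather than in the spec, one option is to derive the nesting from the term prefixes. The sketch below assumes that a GO term should always come first under `occurs_in` and that anatomy terms (UBERON) then follow as `part_of` links. Both the `PREFIX_ORDER` table and the `nest_occurs_in` helper are hypothetical illustrations of that rule, not the converter's actual behavior.

```python
# Hypothetical priority table: a lower number sits closer to the annotation.
PREFIX_ORDER = {"GO": 0, "UBERON": 1}


def nest_occurs_in(terms):
    """Turn an unordered occurs_in list into an ordered (relation, term) chain."""
    ordered = sorted(
        terms,
        # Unknown prefixes sort after all known ones.
        key=lambda term: PREFIX_ORDER.get(term.split(":")[0], len(PREFIX_ORDER)),
    )
    if not ordered:
        return []
    # The first term keeps occurs_in; every later term hangs off it via part_of.
    return [("occurs_in", ordered[0])] + [("part_of", term) for term in ordered[1:]]


# The existing flat list for SYNGO_2082, in its original (unordered) form.
chain = nest_occurs_in(["UBERON:0001876", "GO:0098982"])
print(chain)
# [('occurs_in', 'GO:0098982'), ('part_of', 'UBERON:0001876')]
```

This keeps the JSON spec unchanged, at the cost of hard-coding an ordering rule that only fits models shaped like the SynGO ones.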
I am not sure whether this is the right tracker for this issue. Please let me know if I should raise it in a different tracker.
There appears to be a problem with:
What needs to be addressed:
Details:
When you search for 'synapse disassembly' in the SynGO VU GO annotation tool, the search reveals that there are 8 annotations made to GO term 'regulation of synapse disassembly' (renamed to: 'regulation of synapse pruning'), and one direct annotation to 'synapse disassembly' (renamed to: 'synapse pruning'):
[Screenshot: SynGO VU GO annotation tool, accessed 13 Aug 2018].
They all have the 'implementationReady' status, which suggests to me that they would all have been converted to GO-CAMs, then to GPADs, and then imported by QuickGO and AmiGO.
A search in QuickGO for the listed PMIDs, filtered for 'SynGO' and for 'GO:0098883 synapse pruning' reveals that:
Specifically, annotations based on PMID:17143272 made for Epha4 and Cdk5 get displayed, but the 'regulation of...' is lost:
[Screenshot: QuickGO browser, accessed 14 Aug 2018].
A search in AmiGO (filtered for 'SynGO' and for 'GO:0098883 synapse pruning') reveals the same outcome:
The only difference between QuickGO and AmiGO is that AmiGO displays fewer duplicates of each annotation:
[Screenshot: AmiGO2 browser, accessed 14 Aug 2018].
By the way, searching for 'regulation of synapse pruning' instead of 'synapse pruning' returns no SynGO annotations at all.
Build fail
src/simple_validator.py:19: UserWarning: ['ECO:0005014'] is not of type 'string'
https://travis-ci.org/geneontology/syngo2lego_data_conversion/builds/251482116
Should be fixed by using json from here:
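For context, the warning above means the validator expected a single string where the data contained a one-element list. As an alternative to regenerating the JSON, a pre-validation step could unwrap such values. This is only a sketch: the `evidence` field name and the `normalize_evidence` helper are assumptions for illustration and are not taken from `src/simple_validator.py`.

```python
def normalize_evidence(annotation, key="evidence"):
    """Unwrap a single-element list under `key` so the value is a plain string.

    `key` is a hypothetical field name; adjust it to the real JSON layout.
    """
    value = annotation.get(key)
    if isinstance(value, list):
        if len(value) != 1:
            # More than one code is ambiguous, so refuse to guess.
            raise ValueError(f"expected exactly one evidence code, got {value!r}")
        # Return a copy so the caller's input is left untouched.
        return {**annotation, key: value[0]}
    return annotation


fixed = normalize_evidence({"evidence": ["ECO:0005014"]})
print(fixed)
# {'evidence': 'ECO:0005014'}
```

Values that are already strings pass through unchanged, so the step is safe to run on data that has been fixed at the source.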