Comments (38)
@ukemi @loricorbani Awesome! I tested a one-off model for MGI:MGI:1917115 with this binding GPAD line:
MGI MGI:1917115 enables GO:0005515 MGI:MGI:5613094|PMID:24916387 ECO:0000353 PR:Q91WT8 20150211 MGI contributor=http://orcid.org/0000-0001-7476-6306
And the with/from value PR:Q91WT8 at least appears to be in NEO since it resolves the ID to a label:
I'll run the ShEx minerva validator on this model to see if it is valid. I'm currently having trouble getting the "Use reasoner" option to work in my Noctua instance, so heads up in case you try it too.
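For reference, the GPAD line above can be split into named fields roughly as follows. This is a minimal sketch, assuming the twelve-column, tab-separated GPAD 1.1 layout (the tabs and the two empty columns are not visible in the line as pasted). `parse_gpad_line` and the field names are illustrative and are not gocamgen's API.

```python
# Column names assumed from the GPAD 1.1 layout; illustrative only.
GPAD_COLUMNS = [
    "db", "db_object_id", "qualifier", "go_id", "references",
    "evidence", "with_from", "interacting_taxon", "date",
    "assigned_by", "extensions", "properties",
]


def parse_gpad_line(line):
    """Split one tab-separated GPAD 1.1 line into a dict of named fields."""
    values = line.rstrip("\n").split("\t")
    # Pad any missing trailing columns so every field name is present.
    values += [""] * (len(GPAD_COLUMNS) - len(values))
    record = dict(zip(GPAD_COLUMNS, values))
    # Pipe-separated multi-value fields become lists; empty fields become [].
    for key in ("references", "with_from"):
        record[key] = [v for v in record[key].split("|") if v]
    return record


# The line from this comment, rebuilt with explicit tabs and empty columns.
line = "\t".join([
    "MGI", "MGI:1917115", "enables", "GO:0005515",
    "MGI:MGI:5613094|PMID:24916387", "ECO:0000353", "PR:Q91WT8",
    "", "20150211", "MGI", "",
    "contributor=http://orcid.org/0000-0001-7476-6306",
])
record = parse_gpad_line(line)
print(record["with_from"])
```

The with/from list is the part that matters for this issue: each of its values is a candidate for a has_input edge.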
from gocamgen.
A few numbers from @mdolanme
Number of binding annotations using IPI and the 'with' column (non-chebi): 3968
Number whose values are already in the GPI file: 77
Number whose value will be in the GPI when we switch UniProtKB: to PR: : 3111
Number that are either non-mouse, don't have a PRO id or have typos: 780
Number of these that are UniProt: 726/780
Number of these that are Refseq: 37/780
Number of these that are PRO ids: 9/780
Number of these that are typos: 7/780
Number of these that are EMBL: 1/780
Number of these UniProt ids that are definitely non-mouse: 304/726
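As a quick sanity check, the counts above are internally consistent. The snippet below only re-adds the numbers quoted in this comment; the variable names are made up for illustration.

```python
# Counts copied from the comment above.
total_ipi_with = 3968          # binding annotations using IPI and 'with' (non-chebi)
already_in_gpi = 77            # values already in the GPI file
in_gpi_after_pr_switch = 3111  # values covered once UniProtKB: becomes PR:
problem_values = 780           # non-mouse, no PRO id, or typos

breakdown = {"UniProt": 726, "RefSeq": 37, "PRO": 9, "typo": 7, "EMBL": 1}

# The three top-level buckets account for every annotation.
assert already_in_gpi + in_gpi_after_pr_switch + problem_values == total_ipi_with
# The per-source breakdown accounts for every problem value.
assert sum(breakdown.values()) == problem_values

resolved_fraction = (already_in_gpi + in_gpi_after_pr_switch) / total_ipi_with
print(f"{resolved_fraction:.1%} of values resolve against the GPI after the switch")
```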
non_mouse_uniprot_with_ids.txt
Attached this list from @ukemi of non-mouse UniProtKB identifiers that appear in the with/from field in some MGI binding annotations. These are examples of task 4 in @ukemi's initial issue description, which should not be converted to has_input edges and instead live in the with/from field of the evidence individual.
04/29/2020 : for dustin
David H : add any comments about what is special about this version
http://www.informatics.jax.org/downloads/custom/mgi.gpa
http://www.informatics.jax.org/downloads/custom/mgi.gpa.gz
Hi @dustine32. In this version we have replaced the 'with' field values in binding annotations with PRO ids. They should now be valid for the Shex checks since they will be in our GPI file and therefore in NEO. @loricorbani replaced several thousand automatically and I replaced close to a thousand manually. I'd like to see how many models are still failing the Shex checks to determine where I have missed any. We can discuss in more detail on the GO-CAM specs call.
@dustine32 let me know if you need help with the noctua/minerva reasoner situation. It's been in flux, so if you are running from a dev branch you may need to adjust some command line params. Should be stabilizing soon...
Thanks @goodb ! Yeah, I figured I would eventually need to hit you up on gitter or something to sort that out. I can probably attack that after the noon call today.
05/15/2020 : for dustin
David H : add any comments about what is special about this version
http://www.informatics.jax.org/downloads/custom/mgi.gpa
http://www.informatics.jax.org/downloads/custom/mgi.gpa.gz
http://www.informatics.jax.org/downloads/custom/mgi.gpi
http://www.informatics.jax.org/downloads/custom/mgi.gpi.gz
Thanks @loricorbani. @dustine32, in this set of files we have changed the way we are retrieving PRO identifiers for proteins that correspond to MGI genes. I suggest that for further testing we use the gpi files supplied at this location.
05/22/2020 : for dustin
David H : add any comments about what is special about this version
http://www.informatics.jax.org/downloads/custom/mgi.gpa.gz
http://www.informatics.jax.org/downloads/custom/mgi.gpi.gz
These contain MGI/Production from 05/21.
Using PRO/GPI file.
With converted "contributor" values.
Hi @dustine32. In this version I have fixed the binding inputs for all of the MFs with the IPI evidence code that you are converting to inputs. Once you have updated to the complete cell and anatomy ontology, could you run these through the Shex validation again and I will clean up any stragglers. There still may be a few. Thanks!!!
non_mouse_uniprot_with_ids.txt
Attaching newest list of non-mouse identifiers described above in #79 (comment).
Hi @dustine32. I just noticed another issue in the models with respect to item 4. If you look at the model titled MGI:MGI:1929601, you will notice that in some cases, human proteins have been converted to inputs and are valid presumably because they are in Neo. These should be treated like the ones above #79 (comment). Clearly, my trying to find these manually is not an optimal strategy. Maybe we should switch to converting only identifiers that are found in the mouse GPI file to inputs and leaving all the rest in the 'with' field. This will get around the hand-built list in the comment above and the few more that I found today. I am pretty convinced that I have caught most of the 'true' errors that existed and those identifiers have all been corrected as either valid MGI identifiers or PR identifiers from the mouse GPI file. We can chat about this if it's not clear.
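The strategy proposed here (convert a with/from value to an input only if it is in the mouse GPI file, and leave everything else in the 'with' field) could look roughly like this. It is a sketch under stated assumptions, not gocamgen's implementation: `split_with_values` is a hypothetical helper, the set of GPI identifiers is assumed to have been loaded elsewhere, and `UniProtKB:P12345` is a placeholder standing in for a non-mouse identifier.

```python
def split_with_values(with_values, mouse_gpi_ids):
    """Partition with/from values by membership in the mouse GPI.

    Returns (inputs, kept): `inputs` would become has_input edges,
    `kept` would stay in the with/from field of the evidence individual.
    """
    inputs, kept = [], []
    for value in with_values:
        if value in mouse_gpi_ids:
            inputs.append(value)
        else:
            kept.append(value)
    return inputs, kept


# Hypothetical GPI contents; in practice these would come from the mouse GPI file.
mouse_gpi_ids = {"PR:Q91WT8", "MGI:MGI:1917115"}

inputs, kept = split_with_values(
    ["PR:Q91WT8", "UniProtKB:P12345"],  # second value is a placeholder
    mouse_gpi_ids,
)
print(inputs, kept)
```

Testing against the GPI set rather than against NEO is what keeps human proteins that happen to be in NEO from being converted, and it removes the need for a hand-built exclusion list.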
06/08/2020 : for dustin
David H : add any comments about what is special about this version
http://www.informatics.jax.org/downloads/custom/mgi.gpa
http://www.informatics.jax.org/downloads/custom/mgi.gpa.gz
This version should have fixed the Shex errors.
06/09/2020 : for dustin
David H : add any comments about what is special about this version
http://www.informatics.jax.org/downloads/custom/mgi.gpa
http://www.informatics.jax.org/downloads/custom/mgi.gpa.gz
It should be very special. It should pass both the logic and Shex checks.
06/15/2020 : for dustin
David H : add any comments about what is special about this version
http://www.informatics.jax.org/downloads/custom/mgi.gpa
http://www.informatics.jax.org/downloads/custom/mgi.gpa.gz
Fixed a typo.
06/29/2020 : for dustin
David H : add any comments about what is special about this version
http://www.informatics.jax.org/downloads/custom/mgi.gpa
http://www.informatics.jax.org/downloads/custom/mgi.gpa.gz
08/05/2020 : for dustin
This is GPI version 2 from MGI
David H : add any comments about what is special about this version
http://www.informatics.jax.org/downloads/custom/mgi2.gpi
This version does not have values in column 8 yet and does not have protein complexes yet. Once we (GOC) decide exactly what is supposed to go into column 8, we (MGI) will populate it. Adding the complexes should also be straightforward and we can run a test with them at some point to make sure they are then available for curation in Noctua. Thanks @loricorbani !!!!!
08/12/2020 : for dustin
This is GPI version 2 from MGI
David H : add any comments about what is special about this version
http://www.informatics.jax.org/downloads/custom/mgi2.gpi
This version has mouse protein complexes from PRO along with all the other changes from the previous version.
09/11/2020 : for dustin
This is GPI version 2 and GPAD version 2 from MGI
David H : add any comments about what is special about this version
http://www.informatics.jax.org/downloads/custom/mgi2.gpi
http://www.informatics.jax.org/downloads/custom/mgi2.gpad
This GPAD2.0 file has everything including all of the properties that we will use for the initial import. We can go over the details on the call next Tuesday.
09/17/2020 : for dustin
This is GPI version 2 and GPAD version 2 from MGI
David H : add any comments about what is special about this version
http://www.informatics.jax.org/downloads/custom/mgi2.gpi
MGI-curated only
http://www.informatics.jax.org/downloads/custom/mgi2.gpad
This file has been filtered so it only contains the annotations made by MGI curators using the MGI editorial interface.
09/24/2020 : for dustin
This is GPI version 2 and GPAD version 2 from MGI
David H : add any comments about what is special about this version
MGI-curated only
http://www.informatics.jax.org/downloads/custom/mgi2.gpad
Hi @dustine32 and @dougli1sqrd
In this version we fixed the bug where contributes_to (RO:0002326) wasn't being added to the file correctly.
10/07/2020 : for dustin
This is GPI version 2 and GPAD version 2 from MGI
David H : add any comments about what is special about this version
MGI-curated only
http://www.informatics.jax.org/downloads/custom/mgi2.gpad
Note that both of these files are filtered for annotations made by MGI curators. The previous versions were filtered for annotations made by MGI, but included ones made automatically by our orthology pipeline.
10/23/2020 : for dustin
This is GPI version 2 and GPAD version 2 from MGI
David H : add any comments about what is special about this version
MGI-curated only
http://www.informatics.jax.org/downloads/custom/mgi2.gpad
@dustine32 In this file I have (I hope) cleaned up all of the MGI annotations and @loricorbani has replaced all of the annotation extension relations with the new ones that we decided on. So this is a test for 'the real thing'.
New versions
!gaf-version: 2.2
!gpa-version: 2.0
!gpi-version: 2.0
David H : add any comments about what is special about this version
MGI-curated only
http://www.informatics.jax.org/downloads/custom/go_cam_mgi.gpad
http://www.informatics.jax.org/downloads/custom/go_cam_gene_association.mgi
http://www.informatics.jax.org/downloads/custom/mgi.gpi
@dustine32. This version has fixed the bug where we were using commas instead of pipes. It should resolve a lot of the nesting issues that I saw with our annotations because those will now be separated. Thanks @loricorbani
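To illustrate why the separator matters: in GAF/GPAD multi-valued fields a pipe conventionally separates independent values, while a comma joins values that belong together as one group. The sketch below assumes that convention; `split_multi_value` is a hypothetical helper, the `X:1`/`X:2` identifiers are placeholders, and the comment does not say which field was affected.

```python
def split_multi_value(field):
    """Split a multi-valued field into groups.

    Pipes separate independent groups; commas join the members of one group.
    Returns a list of groups, each a list of values.
    """
    return [
        [value.strip() for value in group.split(",") if value.strip()]
        for group in field.split("|")
        if group.strip()
    ]


# With a comma, both values land in one group (read together).
print(split_multi_value("X:1,X:2"))
# With a pipe, each value is its own group (read separately).
print(split_multi_value("X:1|X:2"))
```

A translator that builds one structure per group would nest everything together in the comma case and keep the two values apart in the pipe case, which matches the nesting issue described above.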
I have installed an automated script that will generate/copy the go_cam files from our production database to:
http://www.informatics.jax.org/downloads/custom
on a daily basis. You can pick them up at your convenience/when you need fresh copies.