Comments (23)
@goodb - Do you need any additional examples or input from @ukemi or me on this?
from gocamgen.
@vanaukenk yes, we never saw a fully worked example. We need examples of gpad rows that use this structure and that express the more challenging cases discussed at the meeting with a link to a demo go-cam that fits the structure.
from gocamgen.
I'll provide some examples. I'm wading through the MGI GPAD file.
from gocamgen.
Should I make the example as a development model on the production site or on the development site?
from gocamgen.
@ukemi I think you should use the production site with the development status tag. That will make things more stable over time. Generally speaking, I think you should always use production if you intend to have your models stay around for any length of time. (I know @kltm concurs on that.)
from gocamgen.
Here is an example of a pipe-separated 'with' field and a comma-separated 'with' field (field 7 in the rows below; e.g. MGI:MGI:4127851|MGI:MGI:4440463). Pinging @vanaukenk for additional comments. The model is 5c4605cc00000457.
The left-hand side shows an annotation whose pipe-separated 'with' field is broken up into two evidence statements. The two right-hand models show a comma-separated 'with' field in which all of the alleles support a single piece of evidence. There are two graphs for the latter annotation line because it has a pipe-separated annotation extension describing the role of Ctnnb1 in the specification of two different cell types.
pipe-separated values:
MGI | MGI:2442827 | acts_upstream_of_or_within | GO:0061512 | MGI:MGI:4439200|PMID:20159594 | ECO:0000315 | MGI:MGI:4127851|MGI:MGI:4440463 | | 20131001 | MGI | transports_or_maintains_localization_of(MGI:MGI:95728)
comma-separated values:
MGI | MGI:88276 | acts_upstream_of_or_within | GO:0001708 | MGI:MGI:3578464|PMID:15866163 | ECO:0000315 | MGI:MGI:2148591,MGI:MGI:2148594,MGI:MGI:2450929 | | 20050728 | MGI | results_in_specification_of(CL:0000138)|results_in_specification_of(CL:0000062)
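The splitting rule described above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: `split_with_field` and `split_extensions` are hypothetical helper names, not functions from gocamgen, and the sketch assumes a pipe always separates independent alternatives while a comma groups identifiers that belong together.

```python
from typing import List


def split_with_field(with_field: str) -> List[List[str]]:
    """Split a GPAD 'with' value into one identifier group per evidence statement.

    Pipes separate independent evidence statements; commas keep identifiers
    together as joint support for a single evidence statement.
    (Hypothetical helper, not part of gocamgen.)
    """
    groups = []
    for group in with_field.split("|"):
        ids = [ident.strip() for ident in group.split(",") if ident.strip()]
        if ids:
            groups.append(ids)
    return groups


def split_extensions(extension_field: str) -> List[str]:
    """Split a pipe-separated annotation extension; each part gets its own graph.

    (Hypothetical helper, not part of gocamgen.)
    """
    return [ext.strip() for ext in extension_field.split("|") if ext.strip()]


# The two 'with' values from the rows above.
pipe_groups = split_with_field("MGI:MGI:4127851|MGI:MGI:4440463")
comma_groups = split_with_field("MGI:MGI:2148591,MGI:MGI:2148594,MGI:MGI:2450929")
graphs = split_extensions(
    "results_in_specification_of(CL:0000138)|results_in_specification_of(CL:0000062)"
)
```

With these inputs, the pipe-separated value yields two single-identifier groups (two evidence statements), the comma-separated value yields one group of three alleles (one evidence statement), and the extension yields two entries (two graphs).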
from gocamgen.
Thanks @ukemi we can do this. One question/thought. Right now, the pattern for comma-separated values produces an OWL statement like this:
lego:evidence-with "MGI:MGI:2148591,MGI:MGI:2148594,MGI:MGI:2450929"
It is simply a text field with the identifiers concatenated together just as they are in the gpad. The data is there, but is not accessible to the OWL reasoning system at all in this form. In the future, assuming the 'with' pattern is kept, we might want to generate semantically meaningful statements instead. e.g. something along the lines of lego:evidence-with OWL:Intersection[MGI:MGI:2148591 & MGI:MGI:2148594 & MGI:MGI:2450929].
from gocamgen.
Or lego:evidence-with OWL:Union[MGI:MGI:2148591 & MGI:MGI:2148594 & MGI:MGI:2450929]? This statement means that all of the identifiers need to be considered as qualifying the evidence. I'm never really certain how to handle intersection versus union in cases like this. I agree, the string including the commas is undesirable. @balhoff, @cmungall, @kltm, I think we talked a bit about this at the hackathon.
from gocamgen.
If I understand the model, I don't think you would use union there as all of them are required for the statement to be true, so Intersection.
If we went down this path, we could model the pipe-separated case as one union block e.g. OWL:Union(MGI:MGI:4127851 | MGI:MGI:4440463) instead of creating multiple evidence statements.
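To make the proposed reading concrete, here is a rough Python sketch that turns a 'with' string into a readable logical expression, treating commas as AND (intersection) and pipes as OR (union). The function name `with_field_to_expression` and the `and`/`or` output notation are assumptions for illustration only; this is not the OWL serialization Minerva or gocamgen would actually emit.

```python
def with_field_to_expression(with_field: str) -> str:
    """Render a GPAD 'with' value as a boolean expression string.

    Commas become 'and' (all identifiers required together);
    pipes become 'or' (alternative supports).
    (Hypothetical helper; the notation is illustrative, not real OWL syntax.)
    """
    groups = []
    for group in with_field.split("|"):
        ids = [ident.strip() for ident in group.split(",") if ident.strip()]
        if ids:
            groups.append(ids)

    parts = []
    for ids in groups:
        expr = " and ".join(ids)
        # Parenthesize a conjunction only when it sits inside a disjunction.
        if len(ids) > 1 and len(groups) > 1:
            expr = f"({expr})"
        parts.append(expr)
    return " or ".join(parts)


comma_expr = with_field_to_expression("MGI:MGI:2148591,MGI:MGI:2148594,MGI:MGI:2450929")
pipe_expr = with_field_to_expression("MGI:MGI:4127851|MGI:MGI:4440463")
```

Here `comma_expr` is a single conjunction of the three alleles and `pipe_expr` is a disjunction of the two identifiers, which matches the Intersection and Union readings discussed above.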
from gocamgen.
This is a bit like what I am proposing for modeling complexes. Basically replacing a bunch of linked OWL individuals with logical statements in OWL. See geneontology/pathways2GO#34 (comment) for the Intersection model of a complex.
from gocamgen.
So in this case, what's the difference between the '|' and the '&'? I think you will need to explain this to us on a call.
from gocamgen.
I was using | to mean OR and & to mean AND
from gocamgen.
It makes sense to use OWL constructs here, but remember that the arguments must be classes. That is fine for genes.
Also, this has to work all the way up the stack: in the GE, in ART, and in Form.
from gocamgen.
@cmungall is this a change that should be made now, before the migration, or something that should be punted and done as a project later on? My intuition is the former because it involves changes in Noctua and Minerva code - and, as you allude to, the problem that we are not confident that all referred to entities will be classes. Nothing insurmountable but a fair chunk of work. Let us know what you think.
from gocamgen.
This would also be an opportunity to materialize an ontology to capture the lego relationships. (And perhaps do away with "With" in favor of something a little more meaningful to less informed users and their computers?)
I suggest pushing back after initial MOD conversion given timelines as I understand them. Could be a nice smallish hackathon group project at some point downstream.
from gocamgen.
That figure doesn't line up with what is currently coming out in the OWL export. For example, in http://noctua.geneontology.org/download/gomodel:5c4605cc00000457/owl I see http://geneontology.org/lego/evidence-with where it ought to be RO:0002614 according to that model, and http://geneontology.org/lego/evidence where it should be RO:0002612, etc. I guess the first step here would be to update the stack to reflect the model there.
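As a rough illustration of what "updating the stack" could mean at the data level, the sketch below rewrites the two lego predicates mentioned above to the RO identifiers named in this comment. The function name `migrate_predicates`, the triple-as-tuple representation, and the expansion of the RO CURIEs to OBO PURLs are all assumptions made for the example; the sample triple is invented and not taken from the model.

```python
from typing import Iterable, List, Tuple

Triple = Tuple[str, str, str]

# Mapping taken from the comment above; the PURL expansion of the RO
# CURIEs follows the usual OBO convention and is an assumption here.
LEGO_TO_RO = {
    "http://geneontology.org/lego/evidence-with": "http://purl.obolibrary.org/obo/RO_0002614",
    "http://geneontology.org/lego/evidence": "http://purl.obolibrary.org/obo/RO_0002612",
}


def migrate_predicates(triples: Iterable[Triple]) -> List[Triple]:
    """Replace lego predicates with their RO counterparts; leave others untouched.

    (Hypothetical helper, not part of Minerva or gocamgen.)
    """
    return [(s, LEGO_TO_RO.get(p, p), o) for s, p, o in triples]


# Invented sample triples, for illustration only.
sample = [
    ("_:evidence1", "http://geneontology.org/lego/evidence-with", "MGI:MGI:4127851"),
    ("_:axiom1", "http://geneontology.org/lego/evidence", "_:evidence1"),
    ("_:evidence1", "http://purl.org/dc/elements/1.1/source", "PMID:20159594"),
]
migrated = migrate_predicates(sample)
```

Note that this only renames predicates. It does not address the separate point that the objects are still string literals.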
from gocamgen.
Actually this is more complicated than I thought as we are using string literals for with as well as for reference.
I would like to better understand the issues which keep us using strings for contributor IDs and references, etc. As far as I know it leads to a display problem, which maybe we could work around with some "hide" annotations on those nodes. But I bet this has been discussed before.
from gocamgen.
Ah yes, it was only in my imagination that we had transitioned the lego relation to RO.
Yes, this should be easier when we start sending back inferred types. Should be easy to hide. Would also be good to change some assumptions in GE display and make it more activity-centric, cc @kltm
from gocamgen.
@cmungall @ukemi we need a decision here so that @dustine32 can finish his conversion script. I think I over-complicated this in the comment above where I mentioned that it's just a bunch of strings. Is it okay to leave them like that, as @ukemi describes above, for the purposes of the batch MOD conversion project? This won't make them any less useful than they currently are. The conversion to a more useful OWL representation could be done as part of a separate project that can happen later.
from gocamgen.
Yes, keep as strings. We know we need to change the OWL representation anyway (e.g. use IRIs rather than string literals for publications)
from gocamgen.
For testing purposes, check the mec-3 (WBGene00003167) GO-CAM import.
There are currently three BP annotations, all with WBVariations as pipe-separated With/From values, that are not imported at all.
from gocamgen.