Comments (17)
This report will be generated from the GO-CAM models that are meant to be as semantically correct as possible with respect to Reactome objectives. That is, it will not be based on models that have been adapted specifically to support GO curation objectives. For example, complexes will not be removed or decomposed, since Reactome is interested in these.
from pathways2go.
Columns for report:
Reactome id
Reactome label
Reactome node type [complex, reaction, pathway]
Reactome asserted GO classifications (both accessions and terms)
Rule-based GO classifications
OWL-inferred GO classifications
Note that the report should have one row per unique Reactome id.
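A minimal sketch of how rows could be collapsed to one per Reactome id, assuming each input row is a dict. The field names (`reactome_id`, `label`, `node_type`, `asserted_go`, `rule_based_go`, `owl_inferred_go`) and the GO values are hypothetical placeholders chosen for illustration; they are not taken from the pathways2go code or from the actual report.

```python
# Hypothetical names for the three GO classification columns of the report.
LIST_COLUMNS = ("asserted_go", "rule_based_go", "owl_inferred_go")


def merge_rows(rows):
    """Collapse report rows so that each Reactome id appears exactly once.

    GO classifications from duplicate rows are unioned per column,
    keeping the order in which they were first seen.
    """
    merged = {}
    for row in rows:
        rid = row["reactome_id"]
        if rid not in merged:
            merged[rid] = {
                "reactome_id": rid,
                "label": row["label"],
                "node_type": row["node_type"],
                **{col: [] for col in LIST_COLUMNS},
            }
        target = merged[rid]
        for col in LIST_COLUMNS:
            for term in row.get(col, []):
                if term not in target[col]:
                    target[col].append(term)
    return list(merged.values())


# Placeholder rows: the same reaction reported once with a process term,
# once with a function term, and once more as an exact repeat.
rows = [
    {"reactome_id": "R-HSA-156678", "label": "placeholder label",
     "node_type": "reaction", "owl_inferred_go": ["GO:bp_placeholder"]},
    {"reactome_id": "R-HSA-156678", "label": "placeholder label",
     "node_type": "reaction", "owl_inferred_go": ["GO:mf_placeholder"]},
    {"reactome_id": "R-HSA-156678", "label": "placeholder label",
     "node_type": "reaction", "owl_inferred_go": ["GO:bp_placeholder"]},
]
report = merge_rows(rows)
```

Keeping the classifications as lists inside a single row means a reaction that maps to both a biological process and a molecular function, or that appears in several pathways, still yields one row.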
manual_plus_inferred_mapping.txt
Here is the current report on inferred classes for Reactome entities. Ping @deustp01 @ukemi @thomaspd @cmungall
(Note that it does contain multiple rows for a single Reactome id when the entity is mapped to both a biological process and a molecular function, and when the same reaction appears in multiple pathways; e.g. R-HSA-156678 shows up in both situations.) @deustp01, depending on if/how you would like to use this information, I can adapt the output.
Notes on run:
- Inference is based on the September 2018 version of GOPlus, arachne_2.12, v1.2, and the October 4 version of human Reactome.
- Reactome mapping is done with the 'Direct Import' configuration (complexes and locations are not removed), as opposed to the NoctuaCuration configuration.
Adding ping to connect @fabregat to this thread.
Hi @goodb,
I was just taking a look at the report you enclosed above. It looks like some of the data is about Reactome Disease models. We are filtering those in the loads, aren't we?
We certainly don't want annotations from GO_CAM models that represent the disease state.
E.g.
Biological Process Defective ABCD1 causes adrenoleukodystrophy (ALD) R-HSA-5684045
Molecular Function Defective ABCD1 does not transfer LCFAs from cytosol to peroxisomal matrix R-HSA-5684043
Biological Process RNF mutants show enhanced WNT signaling and proliferation R-HSA-5340588
Molecular Function RNF43 frameshift mutants show enhanced WNT signaling R-HSA-5340587
Would the "disease" attribute of Reactome physical entities and events be useful as a filter? For GO, you'd only want instances for which the attribute value is NULL.
We need to learn more from you about the Reactome disease annotations. Are whole pathways labeled as disease pathways? Since GO only annotates 'normal' biology we wouldn't want pathways that represent a disease state. Mind you in the long run it would be fascinating to do the semantic comparison of the 'disease' pathways versus the 'healthy' pathways.
@ukemi (first note that a lot has changed since that run, so the inference report is likely going to be very different now).
Right now the code does not do any filtering of the disease models. It looks like the BioPAX for these models isn't really complete enough to develop a proper model anyway. For example, looking at 'Defective ABCD1 causes adrenoleukodystrophy (ALD)', there isn't any structured data about the disease, the mutant genes, or the relation to the normal pathway in the BioPAX export. If we ever do want to turn this information into GO-CAMs (which I personally think would be very valuable for building analyses), we'd need some work on their end to improve the BioPAX, or we'd need to access the data another way. (Ping @deustp01)
For now I'll add the disease filter to the converter. See #58
@deustp01 I don't see any 'disease' information coming through the BioPAX. If there were such a tag, that would be very helpful. My plan was to make use of the Disease pathway hierarchy and ignore any of the subpathways there. Judging visually from your browser, that looks like it ought to work.
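The subtree idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the hierarchy is given as a plain dict from a pathway id to its child pathway ids, and `DISEASE_ROOT`, `disease-group-A`, `NORMAL_ROOT`, and `normal-pathway-1` are hypothetical placeholder ids. It is not the converter's actual implementation.

```python
def collect_subtree(root, children):
    """Return the set containing root and every pathway reachable from it."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # a pathway can be reachable through more than one parent
        seen.add(node)
        stack.extend(children.get(node, ()))
    return seen


def filter_disease(pathway_ids, disease_root, children):
    """Keep only the pathways that are not in the disease subtree."""
    excluded = collect_subtree(disease_root, children)
    return [p for p in pathway_ids if p not in excluded]


# Toy hierarchy with placeholder ids; only R-HSA-5684045 comes from this thread.
children = {
    "DISEASE_ROOT": ["disease-group-A"],
    "disease-group-A": ["R-HSA-5684045"],
    "NORMAL_ROOT": ["normal-pathway-1"],
}
candidates = ["R-HSA-5684045", "normal-pathway-1", "disease-group-A"]
kept = filter_disease(candidates, "DISEASE_ROOT", children)
```

Walking the hierarchy once and testing membership in a set keeps the filter independent of whatever disease metadata does or does not survive the BioPAX export.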
> (which I personally think would be very valuable for building analyses)
me too!
> We need to learn more from you about the Reactome disease annotations. Are whole pathways labeled as disease pathways? Since GO only annotates 'normal' biology we wouldn't want pathways that represent a disease state. Mind you in the long run it would be fascinating to do the semantic comparison of the 'disease' pathways versus the 'healthy' pathways.
The relationship between disease pathways and their normal counterparts is complicated, and doesn't work very well for us. A plan to revise it substantially is in the works but it will be a really large effort so it's not clear when it is going to happen.
Meanwhile, there are about three kinds of disease pathway: loss-of-function, where a variant gene encodes a nonfunctional protein or no protein at all, so any reaction dependent on that protein fails (phenylketonuria, adrenoleukodystrophy); gain-of-function, where a variant gene has a novel function, so a reaction dependent on the protein is altered (constitutively active mutant forms of signaling proteins); and cases in which a pathogen introduces novel alien proteins into a human cell and those proteins mediate novel reactions with no normal human counterpart.
In every case, though, the basic unit of disease annotation is a disease pathway containing one or more disease reactions involving abnormal proteins and possibly abnormal molecules of other sorts, e.g., lipopolysaccharide. If a disease reaction has a normal counterpart, that is noted. All loss-of-function annotations, for example, point to the reaction that would have happened if the normal protein had been available. But I'm not sure how any of this is represented in the BioPAX export. Try getting a BioPAX download of an individual disease pathway and see what it contains.
I looked at one, and basically none of that information comes through: just the reactions involved and their participants. Even the mutant gene in the one I looked at was not there. So if and when we want to go down this road, we will need to think through how to do it. Perhaps another case for working on a BioPAX level 4...
Grasping at another straw here, do the modified-residue attributes of protein (entity with accessioned sequence) instances come through, specifically ones of the genetically modified residue subclass? That's how we annotate the sequence variants that differentiate a mutant disease protein from its canonical UniProt normal counterpart. Any protein with a non-null genetically modified residue attribute is a disease protein, and any reaction involving that protein is a disease reaction.
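The rule described here can be written down as a small sketch. The `Protein` and `Reaction` classes and their field names are hypothetical stand-ins for illustration; they are not Reactome's data model or a BioPAX API, and the residue description is only an example string.

```python
from dataclasses import dataclass, field


@dataclass
class Protein:
    """Hypothetical stand-in for an entity with accessioned sequence."""
    name: str
    genetically_modified_residues: list = field(default_factory=list)


@dataclass
class Reaction:
    """Hypothetical stand-in for a reaction and its protein participants."""
    reaction_id: str
    participants: list


def is_disease_protein(protein):
    # A non-empty genetically modified residue list marks a disease protein.
    return bool(protein.genetically_modified_residues)


def is_disease_reaction(reaction):
    # A reaction is a disease reaction if any participant is a disease protein.
    return any(is_disease_protein(p) for p in reaction.participants)


normal = Protein("normal protein (placeholder)")
mutant = Protein("mutant protein (placeholder)", ["L-arginine removal"])
normal_reaction = Reaction("reaction-1 (placeholder)", [normal])
disease_reaction = Reaction("reaction-2 (placeholder)", [normal, mutant])
```

With this rule, `is_disease_reaction(disease_reaction)` is true because one participant carries a genetically modified residue, while `is_disease_reaction(normal_reaction)` is false.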
We do get BioPAX "ModificationFeature" annotations on the mutants. These are linked to a SequenceSite and a SequenceModificationVocabulary annotation (e.g. L-arginine removal), which in turn is xrefed to something with db MOD and an id like MOD:01632.
Getting to this information is possible but a bit complex. My impression is that it would be easier and perhaps more consistent if we just use the disease subtree to filter these out for now.
The disease subtree should be equally reliable.