Comments (26)
Also keep in mind that at the moment, we still say 'NO' to drugs. So this should get lower priority than the other bugs.
from pathways2go.
I am fine with filtering at an earlier step. But just in case we ever wanted to include the drug reactions in a later iteration of this work, is it possible to write the filter in such a way that it could be ignored? @deustp01 are you ok with this?
from pathways2go.
@ukemi Oh yes, good point! We can add an option to -include_drug_rxns so that the default behavior is to filter these out but they can still be included on demand.
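A minimal sketch of that opt-in behavior, in Python for illustration only. The parser setup, the `select_reactions` helper, and the `is_drug` field are all hypothetical; only the flag name comes from this thread.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical CLI; only the flag name is taken from the discussion.
    parser = argparse.ArgumentParser(description="Hypothetical converter CLI")
    # store_true means the flag is off unless given, so filtering is the default.
    parser.add_argument(
        "-include_drug_rxns",
        action="store_true",
        help="keep drug reactions instead of filtering them out",
    )
    return parser


def select_reactions(reactions, include_drug_rxns=False):
    """Return the reactions to convert.

    Each reaction is assumed to be a dict with a boolean 'is_drug' entry.
    """
    if include_drug_rxns:
        return list(reactions)
    return [r for r in reactions if not r["is_drug"]]
```

Keeping the filter in one function behind one flag means a later iteration can include drug reactions without touching the rest of the conversion.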
from pathways2go.
Sounds good - nothing to add from here.
from pathways2go.
@ukemi @deustp01 Sort of related to the bug here, I've tried testing the -include_drug_rxns option and it exposes an issue with some ID prefixes including spaces when converted to an IRI. Specifically, "Guide to Pharmacology_4518" for DCA in our example model R-HSA-204174. I believe we should really be extracting the ChEBI ID (CHEBI:28240) for this but it's in a different "BioPAX place" than expected by the code (it's in a RelationshipXref of the SmallMolecule entity instead of a UnificationXref of the SmallMolecule entity's EntityReference).
For DCA and perhaps other drugs, should we be using the ChEBI ID if available?
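To make the two "BioPAX places" concrete, here is a rough sketch that checks the expected location first and then falls back to the other one. Plain dicts stand in for BioPAX objects. The key names, the "ChEBI" database label, and the assumption that the Guide to Pharmacology ID sits on the EntityReference are all illustrative, not taken from the actual conversion code.

```python
from typing import Optional


def find_chebi_id(small_molecule: dict) -> Optional[str]:
    """Return a ChEBI ID for a SmallMolecule-like dict, or None if absent."""
    # Expected place: a UnificationXref on the SmallMolecule's EntityReference.
    reference = small_molecule.get("entity_reference") or {}
    for xref in reference.get("unification_xrefs", []):
        if xref["db"] == "ChEBI":
            return xref["id"]
    # Fallback place: a RelationshipXref directly on the SmallMolecule entity.
    for xref in small_molecule.get("relationship_xrefs", []):
        if xref["db"] == "ChEBI":
            return xref["id"]
    return None


# Hypothetical stand-in for DCA as described above.
dca = {
    "entity_reference": {
        "unification_xrefs": [{"db": "Guide to Pharmacology", "id": "4518"}],
    },
    "relationship_xrefs": [{"db": "ChEBI", "id": "CHEBI:28240"}],
}
print(find_chebi_id(dca))
```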
from pathways2go.
For DCA and perhaps other drugs, should we be using the ChEBI ID if available?
This looks do-able in principle (ChEBI should either already have or be able to create instances for all our drugs) but it would be a horrible legacy clean-up because, of our 1186 referenceTherapeutic instances, only 13 now have ChEBI IDs as crossReference slot values.
Meanwhile 1085 have Guide to Pharmacology identifier values and 85 have PubChem identifier values. Our curation guideline (and also our collaboration with the GtoP team) mandates use of GtoP identifiers whenever possible, and GtoP in turn is willing to coin new ones for us as needed. Would it be helpful to get GtoP identifiers for all of our referenceTherapeutics? That's probably easier than getting and applying ChEBI IDs. Or does this miss the point - am I hacking around a problem or evading it rather than solving it properly?
from pathways2go.
Ultimately, this is a decision for @deustp01, but I do notice that Reactome sometimes has both the "Guide to Pharmacology" and a ChEBI identifier for the drugs I looked at (for example in the aspirin pathway). Sometimes they don't (Azathioprine pathway). As long as we can still identify them as drugs, I think the ChEBI identifier being primary would keep things parallel with our other data representation. @deustp01, do you systematically try to assign ChEBI identifiers to drugs? I notice that in R-HSA-9751051, the 6MP has a ref to GtP 7226 and the ChEBI id 50667 is not xref'd.
from pathways2go.
Sorry @deustp01! I think we were typing at the same time. You answered my question.
from pathways2go.
do you systematically try to assign ChEBI identifiers to drugs?
Posts crossed - no, it's the reverse: GtoP is mandated and ChEBI is optional.
from pathways2go.
a horrible legacy clean-up
There may be a workaround. I just checked a very small sample of GtoP pages (the one for aspirin) and see that the cross-reference section of that page gives the mapping to the ChEBI ID, so perhaps we can mine the information from there to populate our referenceTherapeutic instances with ChEBI IDs or to create a look-up table to be used at the stage of GO-CAM generation? (My hunch is that populating Reactome is the correct way to do it but populating GO-CAMs is easier for Reactome.)
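For illustration, the look-up table option could be as small as the sketch below. The table name and helper are hypothetical. The two entries are the GtoP/ChEBI pairs already mentioned in this thread (DCA and 6MP); a real table would have to be mined from GtoP or Reactome.

```python
from typing import Optional

# Hypothetical GtoP ID -> ChEBI ID table, seeded with the pairs from this thread.
GTOP_TO_CHEBI = {
    "4518": "CHEBI:28240",  # DCA
    "7226": "CHEBI:50667",  # 6MP
}


def chebi_for_gtop(gtop_id: str) -> Optional[str]:
    """Return the mapped ChEBI ID, or None when the drug has no known mapping."""
    return GTOP_TO_CHEBI.get(gtop_id)
```

Returning None for unmapped drugs leaves the caller free to fall back to the GtoP identifier.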
from pathways2go.
@deustp01 The (very) few drugs I've checked on Reactome all seem to have the ChEBI ID in their cross-references section.
I think if this is consistent for drugs I can add some logic to dig this out, no lookup table required.
@ukemi I'll just commit the orphan fix I have right now along with my ChEBI hack even if the hack is never used.
from pathways2go.
As far as I can tell, referenceTherapeutic instances with ChEBI crossReference slot values are very rare (10 of 1037 instances):
Aspirin is indeed one of the rare ones:
from pathways2go.
@deustp01 @ukemi I just came across the drug carbovir monophosphate in pathway R-HSA-2161541. This is recognized as a drug in Reactome but does not have either a "Guide to Pharmacology" or an "IUPHAR" prefixed xref - instead, it has PubChem Compound:135565291. (Again, the space in the prefix here is killing the OWL writer)
Should I add "PubChem Compound" to the list of "is a drug" prefixes?
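As a rough illustration of the prefix check in question, the sketch below treats an entity as a drug when any of its xrefs carries one of the "is a drug" prefixes. The function names and the "prefix:id" string format are assumptions. "PubChem Compound" is deliberately left out of the set, since whether it belongs there is exactly what is being asked.

```python
# Prefixes named in this thread as marking a drug; the set itself is a sketch.
DRUG_PREFIXES = {"Guide to Pharmacology", "IUPHAR"}


def split_xref(xref: str):
    """Split 'prefix:id' on the last colon; prefixes may contain spaces."""
    prefix, _, local_id = xref.rpartition(":")
    return prefix, local_id


def looks_like_drug(xrefs) -> bool:
    """True if any xref uses one of the 'is a drug' prefixes."""
    return any(split_xref(x)[0] in DRUG_PREFIXES for x in xrefs)
```

With this set, carbovir monophosphate (only a PubChem Compound xref) would not be flagged, which is the gap described above.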
from pathways2go.
@dustine32 Do you mean that all PubChem Compounds will be classified as drugs or things classified as drugs are allowed to have PubChem xrefs? I think the latter is certainly fine. @deustp01 do you know if there are 'physiological' chemicals that have a PubChem Compound xref in Reactome? I found some PubChem xrefs, but am probably searching very inefficiently.
from pathways2go.
do you know if there are 'physiological' chemicals that have a PubChem Compound xref in reactome?
I don't know what we have actually done, but historically we allow use of PubChem Compound instances as xrefs in cases where no ChEBI instance is available, and while we have tried to clean these up by getting needed ChEBI instances, I don't know if we have completed the job. Equally important, there is nothing to prevent someone from annotating a fascinating chemical tomorrow that is unknown to ChEBI.
from pathways2go.
So I don't think we can assume that all PubChem identifiers refer to drugs. Actually, I found some PubChem, but no other PubChem Compound cases.
from pathways2go.
it has PubChem Compound:135565291. (Again, the space in the prefix here is killing the OWL writer)
Are internal spaces in names always fatal to OWL? (I guess they are.) If so, are underscore characters OK, or would it be better to run the words together with capitalization? I.e., which is better for you, "PubChem_Compound", "PubChem Compound", or "PubChemCompound". At various places in Reactome, we use all three of these, as well as other characters like "(", as shown in our list of Reference database instances, but I suspect that it would not be too hard or too dangerous for us to standardize on one scheme, especially if this would bring us into compliance with an existing stable widely used community standard.
from pathways2go.
Naive question, but what's the difference between just PubChem and 'PubChem Compound'? Is it just due to inconsistency?
from pathways2go.
Never mind. I followed your link and answered my question.
from pathways2go.
PubChem and 'PubChem Compound'
PubChem comes in two sections, PubChem Compound and PubChem Substance. In my mind they are sort of like SwissProt (manually curated and more or less uniquified) and TrEMBL (automatically compiled so more extensive but with no guarantee of uniqueness or consistency), respectively, so as a matter of editorial policy we should use PubChem Compound, but the same resource gaps that push us to any kind of PubChem instead of ChEBI can also push us to PubChem Substance. And in fact I see 6 uses of PubChem Substance as a reference database for a reference therapeutic.
We should never use just plain "PubChem".
from pathways2go.
@ukemi I guess I simple-mindedly suggested that all "PubChem Compound" entities might be drugs but that's quite an assumption. Basically, I need something from this carbovir monophosphate class in the BioPAX to tell the GO-CAM conversion that it's a drug.
from pathways2go.
tell the GO-CAM conversion that it's a drug.
Apologies for the senior moment - we must have discussed this earlier in the "say no to drugs" process, but why doesn't the "type | chemical drug" attribute do the job? We probably need to say "no" to a few things that lack this type attribute but I expect we will want to say "no" to all that do have it?
from pathways2go.
@deustp01 from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4702940/
"PubChem consists of three inter-linked databases, Substance, Compound and BioAssay. The Substance database contains chemical information deposited by individual data contributors to PubChem, and the Compound database stores unique chemical structures extracted from the Substance database. Biological activity data of chemical substances tested in assay experiments are contained in the BioAssay database."
I agree that Compound is the right one to use, if possible.
from pathways2go.
I.e., which is better for you, "PubChem_Compound", "PubChem Compound", or "PubChemCompound".
@deustp01 Actually, I see "PubChem_Compound" is already used in the go_context.jsonld prefix file:
https://github.com/prefixcommons/biocontext/blob/8f2b3226d4a5e5151fd7940ca1f775b5b767fe74/registry/go_context.jsonld#LL149C10-L149C26
So "PubChem_Compound" would maybe be preferred as it's less likely to cause ID resolving issues.
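A small sketch of how the space problem could be sidestepped at conversion time: normalize the prefix to the underscore form, then expand it through a prefix map. The function names are hypothetical and the base IRI below is a placeholder, not the value in go_context.jsonld.

```python
# Placeholder prefix map; the real base IRI lives in go_context.jsonld.
PREFIX_MAP = {"PubChem_Compound": "http://example.org/pubchem_compound/"}


def normalize_prefix(prefix: str) -> str:
    """Replace runs of whitespace with a single underscore."""
    return "_".join(prefix.split())


def curie_to_iri(curie: str) -> str:
    """Expand 'prefix:id' to an IRI, raising KeyError for unknown prefixes."""
    prefix, _, local_id = curie.rpartition(":")
    base = PREFIX_MAP[normalize_prefix(prefix)]
    return base + local_id
```

Failing loudly on an unknown prefix is a design choice here, on the assumption that a missing mapping is better caught at conversion time than written into the OWL output.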
"type | chemical drug" attribute
@deustp01 Oh let me see if this is in the BioPAX. I assumed it was not.
from pathways2go.
@deustp01 Oh let me see if this is in the BioPAX. I assumed it was not.
Nope. I can't find anything like a type = ChemicalDrug or any text for "Drug", "drug", or "DRUG" in the BioPAX for this pathway. @deustp01 Is there some field that can be used in the BioPAX to carry this type info?
from pathways2go.
Tagging @adamjohnwright on this thread.
from pathways2go.