Comments (11)

deustp01 commented on July 29, 2024

Yes! We have treated case 2 as obviously true for a long time, but I'm not sure the code to enable it has ever been implemented in the GO-CAM conversion process. And when it is implemented, it should also generate a report of all the catalystActivity instances that it fixed in this way, to be fed back to Reactome to patch the Reactome annotations that are the single source of truth here. @dustine32 ?

from pathways2go.

dustine32 commented on July 29, 2024

@deustp01 Yes, we can log out the number 2 cases where the resulting "complex" only has a single has_part GP.

@thomaspd If you can find an example complex with the active unit specified that would help. I'll keep looking too.

dustine32 commented on July 29, 2024

Also, asking @huaiyumi for any examples of active subunit annotation in Reactome that I can look for in the BioPAX.

deustp01 commented on July 29, 2024

There are 1784 catalystActivity instances in our central database whose activeUnit slot is not null. Let me figure out who to ask here to get you a table of the subset of these instances that has actually been released. We should be able to generate a table of the dbID of each instance, its physicalEntity (the complex), its activeUnit (the individual EWAS gene product), and the dbID and name of the reaction in which it occurs. Are there other attributes you'd want in the table?

deustp01 commented on July 29, 2024

@dustine32 but meanwhile, here is a short arbitrary list of catalystActivity instances whose physicalEntity is a heteromeric complex and whose activeUnit is a protein monomer, as a starting point to begin to explore the BioPAX to see what can be done on point 2, above, and what a useful format would be for bulk processing.

https://reactome.org/content/schema/instance/browser/1806156
https://reactome.org/content/schema/instance/browser/5676051
https://reactome.org/content/schema/instance/browser/6798176
https://reactome.org/content/schema/instance/browser/1806283
https://reactome.org/content/schema/instance/browser/8868073
https://reactome.org/content/schema/instance/browser/5358378
https://reactome.org/content/schema/instance/browser/109879
https://reactome.org/content/schema/instance/browser/9836928

Each URL points to a page that lists the names and dbIDs of the heteromeric complex, the protein monomer activeUnit, and the reaction that the catalystActivity mediates.

I can also make a list of samples of catalystActivity instances where the physicalEntity is a set of heteromeric complexes and the activeUnit is a set of monomers or a set of subcomplexes, also of cases where the heteromeric complex involves both protein and non-protein (RNA or DNA or small-molecule) subunits, if any of those are of interest.

I hope, from this test material, we can figure out what you need in a comprehensive list.

ukemi commented on July 29, 2024

@dustine32 Does the catalyst activity here help? R-HSA-21271

dustine32 commented on July 29, 2024

@deustp01 @ukemi Thank you for these examples! I don't really need the full list of all activeUnits as these few helped me find where in the BioPAX I can expect to find them. An example for reaction R-HSA-1675883:

  <bp:Catalysis rdf:ID="Catalysis1698">
    <bp:controller rdf:resource="#Complex3671" />
    <bp:controlled rdf:resource="#BiochemicalReaction3397" />
    <bp:controlType rdf:datatype="http://www.w3.org/2001/XMLSchema#string">ACTIVATION</bp:controlType>
    <bp:xref rdf:resource="#RelationshipXref3080" />
    <bp:xref rdf:resource="#RelationshipXref3090" />
    <bp:dataSource rdf:resource="#Provenance1" />
    <bp:comment rdf:datatype="http://www.w3.org/2001/XMLSchema#string">activeUnit: #Protein9680</bp:comment>
  </bp:Catalysis>

Here, the activeUnit, which eventually points to PI4KB [Golgi membrane] (Homo sapiens) for complex ARF1/3:GTP:PI4KB, is embedded in a comment field. Not the greatest feeling about this placement but it'll definitely do for now!
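
A minimal sketch of how the activeUnit could be pulled out of that comment field, assuming the comment text always has the form "activeUnit: #<rdf:ID>" and that the export uses the standard BioPAX Level 3 and RDF namespaces (the snippet above does not show its namespace declarations, so both are assumptions). The function name active_units and the trimmed SAMPLE document are hypothetical, for illustration only.

```python
import re
import xml.etree.ElementTree as ET

# Assumed namespaces: the standard BioPAX Level 3 and RDF URIs.
BP = "http://www.biopax.org/release/biopax-level3.owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Trimmed copy of the Catalysis element quoted above, wrapped in an rdf:RDF root.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:bp="{BP}">
  <bp:Catalysis rdf:ID="Catalysis1698">
    <bp:controller rdf:resource="#Complex3671" />
    <bp:controlled rdf:resource="#BiochemicalReaction3397" />
    <bp:comment rdf:datatype="http://www.w3.org/2001/XMLSchema#string">activeUnit: #Protein9680</bp:comment>
  </bp:Catalysis>
</rdf:RDF>"""

# Assumed comment format: "activeUnit: #<rdf:ID of the active entity>".
ACTIVE_UNIT = re.compile(r"^\s*activeUnit:\s*#(\S+)\s*$")


def active_units(xml_text):
    """Map each Catalysis rdf:ID to (controller ID, list of activeUnit IDs)."""
    root = ET.fromstring(xml_text)
    result = {}
    for cat in root.iter(f"{{{BP}}}Catalysis"):
        cat_id = cat.get(f"{{{RDF}}}ID")
        controller = cat.find(f"{{{BP}}}controller")
        controller_id = None
        if controller is not None:
            controller_id = controller.get(f"{{{RDF}}}resource", "").lstrip("#")
        units = []
        # Only comments matching the activeUnit pattern are kept; others are ignored.
        for comment in cat.findall(f"{{{BP}}}comment"):
            match = ACTIVE_UNIT.match(comment.text or "")
            if match:
                units.append(match.group(1))
        result[cat_id] = (controller_id, units)
    return result


print(active_units(SAMPLE))
```

Resolving Protein9680 to a gene product identifier would be a further lookup in the same file and is not shown here.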

deustp01 commented on July 29, 2024

is embedded in a comment field

If I understand what you're saying correctly, yes, if you look at the instancebrowser view for the EWAS PI4KB [Golgi membrane] (Homo sapiens) its role as the activeUnit of a complex involved in catalysis is shown as a comment. But if you come at the annotation from the other direction - start with the catalystActivity instance 1-phosphatidylinositol 4-kinase activity of ARF1/3:GTP:PI4KB [Golgi membrane] then its role is shown as an attribute. Or am I misunderstanding the problem?

Also, it makes sense to me to work starting from reactions that have associated catalystActivities, systematically looking at what the physicalEntity of each catalystActivity is, and if that physicalEntity is not an EWAS or a set of EWASs, then proceeding further to see if it fits case 2 above.

Also also, if a by-product of this survey were a list of catalystActivities where the physicalEntity is a complex or set of complexes but the activeUnit slot is null, that list would be the starting point for re-curation to fill the empty slots. And if in each case the components of the complex could be checked in the central GO annotation file to see if any have been assigned the same GO molecular function as Reactome has assigned to the whole complex, that would make the re-curation process at Reactome much faster and more reliable. @ukemi I know we talked about something like this with Ben Good; I don't know how close he got to implementing it.

Or does this last part duplicate work you've already done to generate the tables described in #296 (which I haven't looked at yet)?
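
To make the survey idea concrete, here is a rough sketch of the sorting step, assuming each catalystActivity has already been reduced to a small record. The CatalystActivity record, the class labels in MONOMER_CLASSES, and the dbIDs in EXAMPLE are all invented for illustration; they are not Reactome schema names or real instances.

```python
from dataclasses import dataclass, field

# Hypothetical class labels; the real schema class names may differ.
MONOMER_CLASSES = {"EWAS", "SetOfEWAS"}


@dataclass
class CatalystActivity:
    db_id: int
    entity_class: str  # class of the physicalEntity
    active_units: list = field(default_factory=list)


def survey(activities):
    """Sort catalystActivity dbIDs into three bins for reporting."""
    report = {"monomer": [], "has_active_unit": [], "needs_recuration": []}
    for ca in activities:
        if ca.entity_class in MONOMER_CLASSES:
            # physicalEntity is already a gene product (or a set of them).
            report["monomer"].append(ca.db_id)
        elif ca.active_units:
            # Complex (or set of complexes) with the activeUnit slot filled.
            report["has_active_unit"].append(ca.db_id)
        else:
            # Complex with a null activeUnit slot: candidate for re-curation.
            report["needs_recuration"].append(ca.db_id)
    return report


# Invented records for illustration only; the dbIDs are not real Reactome IDs.
EXAMPLE = [
    CatalystActivity(1, "EWAS"),
    CatalystActivity(2, "Complex", ["subunit_a"]),
    CatalystActivity(3, "Complex"),
    CatalystActivity(4, "SetOfComplexes"),
]

print(survey(EXAMPLE)["needs_recuration"])
```

The "needs_recuration" bin is the list described above; cross-checking its members against GO annotations would be a separate step.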

deustp01 commented on July 29, 2024

And if in each case the components of the complex could be checked in the central GO annotation file to see if any have been assigned the same GO molecular function as Reactome has assigned to the whole complex, that would make the re-curation process at Reactome much faster and more reliable. @ukemi I know we talked about something like this with Ben Good; I don't know how close he got to implementing it.

That failed - in many cases the catalystActivity of the whole complex has been assigned to all of its protein components - but perhaps re-doing it with the cleaned-up fly set of complex component functions would yield good results.

deustp01 commented on July 29, 2024

Summarizing the discussion so far as a to-do list.

  • Identify cases where Reactome has assigned a catalystActivity to an entire heteromeric complex but experimental data identify one of the gene product components of the complex as capable of the catalystActivity. Annotate that component of the complex as its activeUnit.
  • In the case of a homomeric complex, creation of activeUnit annotations is formally redundant, but is it useful to support accurate parsing of Reactome BioPAX entries into GO-CAM models?
  • In cases where the catalystActivity is an emergent property of two or more gene product components of the complex, current GO documentation suggests that correct GO annotation is to assert that all of these components "contribute to" the catalystActivity. In Reactome, the activeUnit attribute of a catalystActivity instance can be multivalued. Could we enforce a new curation standard in Reactome, connected to a new rule for parsing BioPAX into GO-CAM, that if the activeUnit slot has a single gene product entry, that gene product "enables" the catalystActivity while if the slot has more than one value, each of those gene products "contributes to" the molecular activity?
  • To annotate regulatory subunits of multimeric complexes, Reactome allows creation of regulation instances whose physicalEntity is a complex and whose [regulatory] activeUnit is a gene product subunit of the complex, so regulatory abilities of individual subunits of the catalytic complex could be annotated just as catalytic abilities are, and emergent regulatory abilities of two or more subunits could be annotated just as emergent catalytic activities are.
  • Can we capture
  • In all of this, can the cleaned-up list of annotations of specific functions to specific components of Drosophila complexes (previous comment) be useful to predict specific functions of specific components of the homologous human complexes?
    Tag @huaiyumi @vanaukenk @sjm41 to be sure they're on this ticket
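
The parsing rule proposed in the third item could look roughly like this. It is only a sketch of the proposal, not something the current conversion code is known to do; the function name relations_for and the relation strings "enables" and "contributes_to" as plain labels are assumptions for illustration.

```python
def relations_for(active_units):
    """Assign a GO relation to each activeUnit under the proposed rule.

    One activeUnit       -> that gene product "enables" the activity.
    Two or more          -> each gene product "contributes_to" the activity.
    Empty activeUnit slot -> no assignment; the case needs other handling.
    """
    # Drop duplicate entries while preserving the original order.
    units = list(dict.fromkeys(active_units))
    if not units:
        return {}
    relation = "enables" if len(units) == 1 else "contributes_to"
    return {unit: relation for unit in units}


# Placeholder subunit names, not real annotations.
print(relations_for(["subunit_a"]))
print(relations_for(["subunit_a", "subunit_b"]))
```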

deustp01 commented on July 29, 2024

@deustp01 Yes, we can log out the number 2 cases where the resulting "complex" only has a single has_part GP.

@dustine32 @ukemi just to document the current state / need, here is an active unit in the first reaction of the "carnitine biosynthesis" pathway and in the derived GO-CAM. The physical entity is a complex involving one copy of one gene product and one copy each of a couple of chemical entities. Can the GO-CAM generation script be re-done to identify the gene product and make it the enabler?

[Screenshot, 2023-11-14: the active unit in the first reaction of the "carnitine biosynthesis" pathway and in the derived GO-CAM]

Or if that is hard or dangerous, can we plan to generate a list (partial is OK to start) of the number 2 cases that we can use to figure out how to bulk-edit the Reactome annotations to add the missing activeUnit annotations, so that the existing GO-CAM generation script can use them? (A practical issue here is whether David and I, as we (re)curate pathways, should add this information manually as part of our work, or leave it out because a script will soon be available to do it automatically.)
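
For discussion, a small sketch of what detecting the number 2 cases and listing them might involve, assuming each complex can be flattened to (name, kind) pairs. The function names, the "protein" and "small_molecule" labels, and the component names are placeholders, not the actual GO-CAM generation code or real Reactome data.

```python
def infer_enabler(components):
    """Return the sole gene product of a complex, or None if there is not exactly one.

    components: iterable of (name, kind) pairs; kind "protein" marks a gene product.
    Multiple copies of the same gene product count once.
    """
    gene_products = sorted({name for name, kind in components if kind == "protein"})
    return gene_products[0] if len(gene_products) == 1 else None


def case2_report(complexes):
    """List (complex ID, inferred enabler) for every complex with a single gene product."""
    rows = []
    for complex_id, components in complexes.items():
        enabler = infer_enabler(components)
        if enabler is not None:
            rows.append((complex_id, enabler))
    return rows


# Placeholder complexes: one gene product plus chemicals, and a two-protein complex.
EXAMPLE = {
    "complex_1": [("GP1", "protein"), ("chem_a", "small_molecule"), ("chem_b", "small_molecule")],
    "complex_2": [("GP2", "protein"), ("GP3", "protein")],
}

print(case2_report(EXAMPLE))
```

The rows from case2_report are the kind of list that could be fed back to Reactome for bulk editing of the missing activeUnit annotations.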
