Comments (19)

ukemi avatar ukemi commented on September 24, 2024

The with/from field qualifies (or was originally intended to qualify) the evidence, so the individuals should be collapsed, but each ref/evidence_code/with is a separate piece of evidence.

The one caveat to that is that I thought we had decided to express binding annotations in a formally correct way, with the bound entity being an input. Then we would spit them back out as they currently are in the with field. @vanaukenk is that still the plan?

from gocamgen.

vanaukenk avatar vanaukenk commented on September 24, 2024

@ukemi
Yes, that is still the plan for protein binding annotations. If we hadn't done this for protein binding, then not only would we have been inconsistent with GO-CAM, but we would not have been able to collapse the evidence for this particular term.

dustine32 avatar dustine32 commented on September 24, 2024

OK, I think we can still work with that protein binding caveat. I now see the protein binding section on the wiki.

So basically, DO split out distinct with/from values into multiple assertions (translated with different has_input entities) IF the primary term is protein binding (and descendants or just GO:0005515?), ELSE collapse these different with/from values onto the same assertion individual.

@ukemi @vanaukenk Sound correct?

ukemi avatar ukemi commented on September 24, 2024

Yes. That sounds correct.

dustine32 avatar dustine32 commented on September 24, 2024

Cool, thanks! I updated the header/line thingy in my first comment to clarify how with/from is being handled.

vanaukenk avatar vanaukenk commented on September 24, 2024

@dustine32

We will want to split out distinct With/From values into multiple assertions for GO:0005515 and also its children, as we may have annotations to terms like 'protein kinase binding' GO:0019901 that still refer to different entities in the With/From field.
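A rough sketch of that rule, in case it helps with testing. The parent map below is a hand-written toy fragment for illustration only (a real importer would ask the ontology for descendants of GO:0005515), and the function names `is_protein_binding` and `split_with_from` are made up here, not taken from gocamgen.

```python
PROTEIN_BINDING = "GO:0005515"

# Hand-written toy fragment of is_a parents, for illustration only.
# A real importer would query the ontology instead of hard-coding this.
TOY_PARENTS = {
    "GO:0019901": ["GO:0019900"],
    "GO:0019900": ["GO:0019899"],
    "GO:0019899": ["GO:0005515"],
}


def is_protein_binding(term, parents=TOY_PARENTS):
    """Return True if term is GO:0005515 or reaches it via parent links."""
    stack, seen = [term], set()
    while stack:
        current = stack.pop()
        if current == PROTEIN_BINDING:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return False


def split_with_from(term, with_from_values):
    """Group with/from values into the assertions they should produce.

    Protein binding (and children): one group per distinct value, since each
    bound entity becomes its own has_input. Anything else: a single group.
    """
    if is_protein_binding(term):
        return [[value] for value in with_from_values]
    return [list(with_from_values)]
```

With this, `split_with_from("GO:0019901", [...])` yields one group per With/From entity, while a term outside the protein binding branch keeps all of its With/From values on one assertion.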

I'll update the protein binding section of the wiki to make it clearer which of the options we chose.

vanaukenk avatar vanaukenk commented on September 24, 2024

@dustine32

I've updated the import rules section for protein binding. Please let me know if anything is unclear or doesn't look right to you:

http://wiki.geneontology.org/index.php/Noctua_MOD_Imports#Protein_Binding_Annotations

Thx.

dustine32 avatar dustine32 commented on September 24, 2024

Thanks @vanaukenk ! That definitely is more straightforward. I just wanted to make sure I understand the last point:

Evidence will not be combined for annotations to protein binding or its children

So multiple GPAD lines with same GP-term-with/from-etc (the header fields above) values won't be collapsed into the same assertion if their evidence code-references (line fields) vary? More simply, each protein binding GPAD line will have its own assertion individual in GO-CAM?

ukemi avatar ukemi commented on September 24, 2024

Is this true even for annotation lines that have the same value in the 'with' field?
Ex:

MGI MGI:1340046 enables GO:0005515 MGI:MGI:4845793|PMID:21068328 ECO:0000353 UniProtKB:Q8K1S1 20130311 MGI
MGI MGI:1340046 enables GO:0005515 MGI:MGI:4441002|PMID:20220021 ECO:0000353 UniProtKB:Q8K1S1 20130311 MGI

vanaukenk avatar vanaukenk commented on September 24, 2024

@dustine32
Yes, we can combine protein binding GPAD lines if they have the same GP-term-with/from but different references. @ukemi - is that what you meant to illustrate above?
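To make the combining concrete, here is a minimal sketch using the two MGI lines quoted above. The dict field names, the `collapse` function, and the one-element term set are assumptions for illustration. This is not how gocamgen is actually implemented.

```python
from collections import defaultdict

# Assumption: stand-in for "GO:0005515 and its children".
PROTEIN_BINDING_TERMS = {"GO:0005515"}


def collapse(lines):
    """Group GPAD-like records into assertions, each with a list of evidence.

    For protein binding terms the with/from value is part of the assertion
    key; for other terms it only travels along with the evidence.
    """
    assertions = defaultdict(list)
    for line in lines:
        bound = line["with_from"] if line["term"] in PROTEIN_BINDING_TERMS else None
        key = (line["gp"], line["relation"], line["term"], bound)
        evidence = (line["reference"], line["evidence"], line["with_from"])
        assertions[key].append(evidence)
    return dict(assertions)


lines = [
    {"gp": "MGI:1340046", "relation": "enables", "term": "GO:0005515",
     "reference": "MGI:MGI:4845793|PMID:21068328", "evidence": "ECO:0000353",
     "with_from": "UniProtKB:Q8K1S1"},
    {"gp": "MGI:1340046", "relation": "enables", "term": "GO:0005515",
     "reference": "MGI:MGI:4441002|PMID:20220021", "evidence": "ECO:0000353",
     "with_from": "UniProtKB:Q8K1S1"},
]

result = collapse(lines)
```

Both lines share the same GP, term, and With/From value, so they land under one key with two pieces of evidence. Changing the With/From on one of them would produce a second assertion.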

vanaukenk avatar vanaukenk commented on September 24, 2024

image
Another potential illustration.
These three annotations could be combined as evidence for a single Noctua instance since they refer to the same binding (EGL-1 binds CED-9) but just cite different references (and have different annotation dates).

vanaukenk avatar vanaukenk commented on September 24, 2024

Whereas in this example (just looking at the WB annotations, ignore the SWIE one), we would have two separate protein binding annotations:

image

One that combines three pieces of evidence with WB:WBGene00000418 in the With/From field, and one with a single piece of evidence and WB:WBGene00001170.

vanaukenk avatar vanaukenk commented on September 24, 2024

Here's an example where it looks like we could combine evidence for a cellular component annotation, but haven't yet:

image

image

ced-9 (WB:WBGene00000423)

lat-1 (WB:WBGene00002251) is another example of two CC annotations whose evidence could be merged into a single individual.

dustine32 avatar dustine32 commented on September 24, 2024

@vanaukenk Oh ok, that's what I was thinking too. Differing references alone shouldn't require multiple assertion individuals. I also forgot that all protein binding and descendant term annotations should be using the same IPI evidence code, so that removes one variable.

Thanks so much for clarifying!

dustine32 avatar dustine32 commented on September 24, 2024

@vanaukenk I think those two examples are now collapsing correctly. Here they are on my dev server:

vanaukenk avatar vanaukenk commented on September 24, 2024

@dustine32 - the two CC examples above are indeed now collapsing correctly. Thanks!

ukemi avatar ukemi commented on September 24, 2024

@vanaukenk have you set up a formal testing document? If not, I will have a shot at it.

vanaukenk avatar vanaukenk commented on September 24, 2024

Yes, I started with this spreadsheet here:

https://docs.google.com/spreadsheets/d/1XFuD6LOyFKXNk94jIK8zv1TrESfwCJo-RnrXQ3tzmJg/edit

dustine32 avatar dustine32 commented on September 24, 2024

@vanaukenk @ukemi The latest iteration of the WB and MGI models is now up on noctua-dev, so this can be tested there. It won't have the fix for the comma-separated with/from snafu that @ukemi pointed out here, but I've since fixed that on my USC server.

Here are some stats from the import attached to the PR.
