
go-pombase's Introduction

go-pombase

Code for generating process-centric GO-CAM models from GAFs.

Working toward creating GO-CAMs from a list of annotations (either a GAF or ontobio association objects directly). Right now, this takes a GO biological process (BP) term as input, does some heuristic gene set calculation, and generates a GO-CAM TTL file for the BP term's gene set. Separating this gene set logic from the annotation-to-GO-CAM logic is another goal.

Running

pip install -r requirements.txt

As this is currently coded for a specific use case, it can be run simply by providing a GO BP term, a source GAF filename, and a destination filename:

python3 generate_rdf.py -t "GO:0010971" -g "gene_association.pombase" -f "filename.ttl"

Because the source GAF filename is passed as an argument, the library can create GO-CAM models from any set of GAFs, not just those pertaining to S. pombe. The example GAF can be downloaded from ftp://ftp.geneontology.org/pub/go/gene-associations/.
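As a rough illustration of the three arguments used above, here is a minimal argparse sketch. Only the short flags -t, -g, and -f come from this README; the dest names, help strings, and the build_parser function are assumptions for illustration, and the actual generate_rdf.py may parse its arguments differently.

```python
import argparse


def build_parser():
    """Build a parser for the three flags shown in the README (hypothetical layout)."""
    parser = argparse.ArgumentParser(
        description="Generate a GO-CAM TTL file for a GO BP term from a GAF."
    )
    # dest names below are illustrative, not taken from the repository.
    parser.add_argument("-t", dest="bp_term", required=True,
                        help="GO biological process term, e.g. GO:0010971")
    parser.add_argument("-g", dest="gaf_source", required=True,
                        help="path to the source GAF file")
    parser.add_argument("-f", dest="filename", required=True,
                        help="destination TTL filename")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.bp_term, args.gaf_source, args.filename)
```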

Running for generating PomBase GO-CAM models

For my current purposes I run generate_pombase_model.py, specifying the BP term (-t), output filename (-f), and GAF input file (-g):

python3 generate_pombase_model.py -t 'GO:0031929' -f 'TOR signaling.ttl' -g 'gene_association.pombase'

Reusing computed gene-to-BP term dictionary data

You can also specify precomputed data (-j) to use in the first step, which speeds up processing during repeated runs (~1.5 min -> 10 sec):

python3 generate_pombase_model.py -j 'tad_go_gafs.json' -t 'GO:0031929' -f 'TOR signaling.ttl' -g 'gene_association.pombase'

To dump out this data into a reusable JSON, you can run:

python3 pombase_direct_bp_annots_query.py -j 'json_outfile.json' -g 'gene_association.pombase'

Here -j specifies the JSON output path.
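The speed-up from reusing the JSON comes from skipping the gene-to-BP computation on later runs. A minimal sketch of that load-or-compute pattern follows; load_or_compute and the compute callback are hypothetical names, not functions from this repository, and the scripts above keep dumping and loading as separate steps.

```python
import json
import os


def load_or_compute(json_path, compute):
    """Return cached data from json_path if it exists, else compute and cache it.

    `compute` is a zero-argument callable standing in for the slow
    gene-to-BP-term calculation (an assumption for this sketch).
    """
    if json_path and os.path.exists(json_path):
        with open(json_path) as handle:
            return json.load(handle)
    data = compute()
    if json_path:
        with open(json_path, "w") as handle:
            json.dump(data, handle)
    return data
```

The first call with a new path runs the computation and writes the file; later calls with the same path read the file instead.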

Dependencies

Requires ontobio.

go-pombase's People

Contributors

dougli1sqrd, dustine32

go-pombase's Issues

Should only use protein binding annotations if there’s no extension that already links the two gene products (via an MF node)

The code that generates models in generate_pombase_model.py currently adds "protein binding with" connections between two genes even if there is already an extension-derived connection (e.g. has_direct_input) between those two genes. For example:
[image: example model with redundant connections]
In this image there are redundant "with" connections (displayed as two triples: gene1 -enabled by- "protein binding" activity, and "protein binding" -has_input- gene2) between sty1-atf1 and sty1-wis1.
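One way to express the proposed rule is to drop any protein-binding pair whose two genes are already linked through an extension. The sketch below assumes gene pairs are available as simple tuples and treats a pair as unordered; both the data shapes and the function name drop_redundant_binding are assumptions, not part of the existing code.

```python
def drop_redundant_binding(binding_pairs, extension_pairs):
    """Return the protein-binding pairs not already linked by an extension.

    Both arguments are iterables of (gene1, gene2) tuples. Pairs are compared
    without regard to order, which is an assumption of this sketch.
    """
    linked = {frozenset(pair) for pair in extension_pairs}
    return [pair for pair in binding_pairs if frozenset(pair) not in linked]
```

With binding pairs (sty1, atf1) and (sty1, wis1), and extension-derived links for the same two pairs, the function returns an empty list, so no redundant "with" connections would be added.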

Standardize model titles

Follow something like:

PomBase_[BP term]_[BP term label]

Also be aware that when minerva dumps out an imported model it names the file [UUID].ttl, which can differ from the name of the originally imported file and can therefore result in duplicate models. I should figure out how to assign the correct UUID filename before import.
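The naming pattern above can be captured in a small helper. This is a sketch only: model_title is a hypothetical function, and it does not address the UUID filename issue.

```python
def model_title(bp_term, bp_label):
    """Build a model title of the form PomBase_[BP term]_[BP term label]."""
    return f"PomBase_{bp_term}_{bp_label}"


if __name__ == "__main__":
    print(model_title("GO:0031929", "TOR signaling"))
```

For the TOR signaling example used earlier, this yields PomBase_GO:0031929_TOR signaling.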

Only use annotations with experimental evidence codes (or ND)

Need to add an annotation filter that keeps only experimental evidence codes or ND:
EXP
IDA
IPI
IMP
IGI
IEP
ND

Example: in model generated for GO:0031929 - TOR signaling, “enzyme regulator activity” is used for ste20 even though its evidence code is "IBA".
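A minimal sketch of such a filter, working directly on GAF text lines rather than on ontobio association objects (an assumption made to keep the example self-contained). In the GAF format the evidence code is the seventh tab-separated column; filter_gaf_lines is a hypothetical name.

```python
# Evidence codes listed in the issue above: experimental codes plus ND.
ALLOWED_EVIDENCE_CODES = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP", "ND"}


def filter_gaf_lines(lines, allowed=ALLOWED_EVIDENCE_CODES):
    """Yield GAF annotation lines whose evidence code (column 7) is allowed."""
    for line in lines:
        if line.startswith("!"):
            # Skip GAF header and comment lines.
            continue
        columns = line.rstrip("\n").split("\t")
        if len(columns) > 6 and columns[6] in allowed:
            yield line
```

An annotation with evidence code IBA, like the ste20 example, would be dropped by this filter.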
