
Comments (4)

a-r-j commented on May 28, 2024

Hey, thanks for this Ryan! Looks exciting!

So, I think we should keep RNA secondary structure and 3D structure separate for now. The secondary structure code already works as a standalone feature (though it would be really nice to hook it up to Nussinov or bpRNA, the largest database I know of).

With respect to 3D graphs: I had a quick look at this, and I think it's actually quite straightforward, as most of the components are already implemented for protein structure graphs. Essentially, we can use the low-level API in graphein as building blocks and write a function more or less identical to the construct_graphs we use for proteins. The main things I've seen so far that need changing:

We need some granularity options for RNA graphs

Then we simply add a new function such as subset_structure_to_rna in this block, e.g.:

RNA_ATOMS = [
    "C1'",
    "C2",
    "C2'",
    "C3'",
    "C4",
    "C4'",
    "C5",
    "C5'",
    "C6",
    "C8",
    "N1",
    "N2",
    "N3",
    "N4",
    "N6",
    "N7",
    "N9",
    "O2",
    "O2'",
    "O3'",
    "O4",
    "O4'",
    "O5'",
    "O6",
    "OP1",
    "OP2",
    "P",
]


def subset_structure_to_rna(
    df: pd.DataFrame,
) -> pd.DataFrame:
    """
    Return a subset of an atomic dataframe that contains only atom names relevant for RNA structures.

    :param df: RNA structure dataframe to subset
    :type df: pd.DataFrame
    :returns: Subsetted RNA structure dataframe
    :rtype: pd.DataFrame
    """
    return filter_dataframe(
        df, by_column="atom_name", list_of_values=RNA_ATOMS, boolean=True
    )

but more flexible: rather than keeping RNA_ATOMS fixed, let users pass the atom names they want to subset to.
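
A minimal sketch of that more flexible variant, assuming plain pandas. The function name `subset_structure_by_atom_names`, its `atom_names` parameter, and the shortened default list are hypothetical, and the boolean mask only stands in for graphein's `filter_dataframe` helper so that the snippet runs on its own:

```python
from typing import List, Optional

import pandas as pd

# Shortened default for illustration only (hypothetical); the full
# RNA_ATOMS list above could be used as the default instead.
DEFAULT_RNA_ATOMS = ["P", "C4'", "N1", "N9"]


def subset_structure_by_atom_names(
    df: pd.DataFrame,
    atom_names: Optional[List[str]] = None,
) -> pd.DataFrame:
    """Keep only the rows whose ``atom_name`` is in ``atom_names``.

    Falls back to DEFAULT_RNA_ATOMS when no list is given.
    """
    if atom_names is None:
        atom_names = DEFAULT_RNA_ATOMS
    mask = df["atom_name"].isin(atom_names)
    return df.loc[mask].reset_index(drop=True)


# Toy atomic dataframe with one non-RNA atom name ("CA").
example = pd.DataFrame(
    {
        "atom_name": ["P", "C4'", "CA", "N9"],
        "residue_number": [1, 1, 2, 2],
    }
)

# Users choose their own subset, e.g. phosphorus atoms only.
phosphorus_only = subset_structure_by_atom_names(example, atom_names=["P"])
# Or rely on the default list.
default_subset = subset_structure_by_atom_names(example)
```

Passing the list as an argument keeps the default behaviour unchanged while letting callers pick any subset of atoms.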

The only other line that breaks is this one, and we can easily fix it by skipping the three_to_1 call when we're constructing an RNA graph. Then we're essentially good to go: the graph has been populated with nodes, and we can write whatever edge functions we like to go on top, as per the protein API.
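
One way to read that fix, as a sketch: make the three-letter to one-letter conversion conditional on the molecule type. Everything here is hypothetical (the `residue_label` helper, the `is_rna` flag, and the truncated lookup table); it is not graphein's actual code.

```python
# Truncated, illustrative three-letter to one-letter amino acid table.
THREE_TO_ONE = {"ALA": "A", "GLY": "G", "SER": "S"}


def residue_label(residue_name: str, is_rna: bool = False) -> str:
    """Return the label to store on a node for a given residue name."""
    if is_rna:
        # Standard RNA residue names in PDB files (A, C, G, U) are
        # already single letters, so the amino acid lookup is skipped.
        return residue_name
    return THREE_TO_ONE[residue_name]


protein_label = residue_label("ALA")
rna_label = residue_label("G", is_rna=True)
```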

What I'm unfamiliar with is how we coarsen the RNA graphs. What I've described above is the all-atom case. For proteins it's very common to take the alpha carbon trace as the representation for a residue-level graph, but I'm not sure what the standard is for RNA. In any case, we can leave this open to users via the granularity param. What do you think?


a-r-j commented on May 28, 2024

Implemented in 1.5.0


a-r-j commented on May 28, 2024

Came across this today: https://www.biorxiv.org/content/10.1101/2022.03.14.484334v1

Might be of interest to you @rg314


rg314 commented on May 28, 2024

Just to follow up on this... we found that the nussinov.py algo isn't great at predicting the dot-bracket notation. I suggest that we create a container running https://github.com/rg314/centroid-rna-package and ping it to get the centroid secondary structure. What do you think @a-r-j ?
