Comments (4)
Hey, thanks for this Ryan! Looks exciting!
So, I think we should keep RNA secondary structure & 3D structure separate for now. The secondary structure is functional as a standalone piece of functionality (though it would be really nice to hook it up to Nussinov or bpRNA - the largest database I know of).
With respect to 3D graphs - I had a quick look at this. I think it's actually quite straightforward, as most of the components are already implemented for protein structure graphs. Essentially, we can use the low-level API in graphein as building blocks and make a function more or less identical to the construct_graphs we use for proteins. The main things I saw so far that need changing:
We need some granularity options for RNA graphs.
Then, we simply add a new function convert_structure_to_rna in this block, e.g.
import pandas as pd

# Atom names that occur in RNA nucleotides (backbone, ribose and bases).
RNA_ATOMS = [
    "C1'",
    "C2",
    "C2'",
    "C3'",
    "C4",
    "C4'",
    "C5",
    "C5'",
    "C6",
    "C8",
    "N1",
    "N2",
    "N3",
    "N4",
    "N6",
    "N7",
    "N9",
    "O2",
    "O2'",
    "O3'",
    "O4",
    "O4'",
    "O5'",
    "O6",
    "OP1",
    "OP2",
    "P",
]


def subset_structure_to_rna(
    df: pd.DataFrame,
) -> pd.DataFrame:
    """
    Return a subset of atomic dataframe that contains only certain atom names relevant for RNA structures.

    :param df: Atomic structure dataframe to subset
    :type df: pd.DataFrame
    :returns: Subsetted RNA structure dataframe
    :rtype: pd.DataFrame
    """
    # filter_dataframe is the existing graphein helper used for the protein subsets;
    # boolean=True keeps the rows whose atom_name is in RNA_ATOMS.
    return filter_dataframe(
        df, by_column="atom_name", list_of_values=RNA_ATOMS, boolean=True
    )
but more flexible (not keeping RNA_ATOMS fixed, so users can subset as they wish).
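A minimal sketch of what the "more flexible" version might look like, assuming a plain pandas dataframe with an atom_name column. The atoms parameter and the DEFAULT_RNA_ATOMS name are hypothetical, and the sketch uses pandas directly rather than graphein's filter_dataframe so that it runs on its own:

```python
from typing import Iterable, Optional

import pandas as pd

# Hypothetical default for illustration: backbone and ribose atoms only.
# In practice this could just as well default to the full RNA_ATOMS list.
DEFAULT_RNA_ATOMS = [
    "P", "OP1", "OP2", "O5'", "C5'", "C4'",
    "O4'", "C3'", "O3'", "C2'", "O2'", "C1'",
]


def subset_structure_to_rna(
    df: pd.DataFrame,
    atoms: Optional[Iterable[str]] = None,  # hypothetical parameter
) -> pd.DataFrame:
    """Keep only rows whose atom_name is in `atoms` (or in the default list)."""
    keep = list(DEFAULT_RNA_ATOMS if atoms is None else atoms)
    return df[df["atom_name"].isin(keep)].reset_index(drop=True)
```

Calling it with no atoms argument keeps the default set, while something like subset_structure_to_rna(df, atoms=["P"]) would keep only the phosphates.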
The only other line that breaks is this one, and we can easily fix it by removing the three_to_1 call if we're constructing an RNA graph. Then we're good to go, essentially. The graph has been populated with the nodes, and we can write whatever edge functions we like to go on top, as per the protein API.
What I'm unfamiliar with is how we coarsen the RNA graphs. E.g. all-atom is what I've described above. For proteins it's obviously very normal to consider the alpha carbon trace as representative of a residue-level graph. I'm not sure what the standard for RNA is. In any case, we can leave this open to users with the granularity param. What do you think?
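To make the granularity idea concrete, here is a rough sketch of leaving the choice to the user. It assumes biopandas-style column names (chain_id, residue_number, atom_name, x_coord, y_coord, z_coord); the function name coarsen_rna and the accepted values are hypothetical, and using "P" as the default representative atom is only one possible choice, not an established graphein convention:

```python
import pandas as pd

COORDS = ["x_coord", "y_coord", "z_coord"]
RESIDUE_KEYS = ["chain_id", "residue_number"]


def coarsen_rna(df: pd.DataFrame, granularity: str = "P") -> pd.DataFrame:
    """Reduce an all-atom RNA dataframe to the requested granularity.

    "atom"      -> keep every atom (one node per atom)
    "centroids" -> one node per nucleotide at the mean atom position
    other value -> treated as an atom name, one such atom per nucleotide
    """
    if granularity == "atom":
        return df.copy()
    if granularity == "centroids":
        return df.groupby(RESIDUE_KEYS, as_index=False)[COORDS].mean()
    picked = df[df["atom_name"] == granularity]
    # Guard against repeated atoms (e.g. alternate locations) in a nucleotide.
    return picked.drop_duplicates(subset=RESIDUE_KEYS).reset_index(drop=True)
```

With this shape, coarsen_rna(df, "C4'") and coarsen_rna(df, "centroids") give two different nucleotide-level node sets from the same all-atom dataframe, so the question of which representation is standard can stay open.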
from graphein.
Implemented in 1.5.0
from graphein.
Came across this today: https://www.biorxiv.org/content/10.1101/2022.03.14.484334v1
Might be of interest to you @rg314
from graphein.
Just to follow up on this... we found that the nussinov.py algorithm isn't great at predicting the dot-bracket notation. I suggest that we create a container running https://github.com/rg314/centroid-rna-package and query it to get the centroid secondary structure. What do you think @a-r-j ?
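For context on why a Nussinov-style predictor tends to be weak: the textbook algorithm only maximises the number of nested base pairs and ignores stacking and loop energies. Below is a minimal sketch of that textbook recursion with a traceback to dot-bracket. It is not the code in graphein's nussinov.py; the pairing rules (Watson-Crick plus G-U wobble) and the min_loop parameter are assumptions made for illustration:

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov(seq: str, min_loop: int = 3) -> str:
    """Return a dot-bracket string with a maximal number of nested pairs."""
    n = len(seq)
    # dp[i][j] = maximum number of pairs within seq[i..j] (inclusive).
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case: j is unpaired
            for k in range(i, j - min_loop):  # case: j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best

    # Traceback: recover one optimal set of pairs.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if dp[i][j] == dp[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in PAIRS:
                left = dp[i][k - 1] if k > i else 0
                if dp[i][j] == left + 1 + dp[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return "".join(structure)
```

Because every pair scores the same, many structures tie for the optimum and the traceback returns just one of them, which is one reason an energy- or centroid-based predictor is likely to agree better with known structures.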
from graphein.