drug2ways's Introduction

drug2ways

Drug2ways is a Python package for reasoning over paths in biological networks for drug discovery.

Quickstart

Drug2ways supports generic network formats such as JSON, CSV, GraphML, and GML; see the drug2ways documentation for details. Ideally, the network should contain three types of nodes representing drugs, proteins, and indications/phenotypes. The hypothesis underlying this software is that, by reasoning over the multitude of possible paths between a given drug and indication, we can infer that the drug regulates the indication in the direction given by the signs of the most frequently occurring paths (i.e., majority rule). In other words, we assume that the more paths connect a drug to a phenotype, the more likely it is that the drug interacts with its target, and its target with intermediate nodes, to modulate that pathological phenotype. Based on this hypothesis, the software supports the applications outlined in the next section.
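The majority rule described above can be illustrated with a minimal, standard-library-only sketch. This is not the drug2ways implementation: the toy network, the node names, and the assumption that a path's sign is the product of its edge signs (+1 for activation, -1 for inhibition) are all illustrative.

```python
from collections import Counter

# Hypothetical signed network: +1 = activation, -1 = inhibition.
EDGES = {
    "drugA": [("protein1", -1), ("protein2", +1)],
    "protein1": [("protein3", +1), ("phenotypeX", +1)],
    "protein2": [("protein3", -1)],
    "protein3": [("phenotypeX", +1)],
}


def signed_simple_paths(edges, source, target, lmax):
    """Yield (path, sign) for every simple path with at most lmax edges."""
    stack = [([source], 1)]
    while stack:
        path, sign = stack.pop()
        node = path[-1]
        if node == target:
            yield path, sign
            continue
        if len(path) - 1 >= lmax:
            continue  # path already uses lmax edges, do not extend it
        for neighbor, edge_sign in edges.get(node, []):
            if neighbor not in path:  # keep paths simple (no repeated nodes)
                stack.append((path + [neighbor], sign * edge_sign))


def majority_effect(edges, drug, phenotype, lmax):
    """Return (label, n_activating, n_inhibiting) using a simple majority rule."""
    counts = Counter(
        sign for _, sign in signed_simple_paths(edges, drug, phenotype, lmax)
    )
    activating, inhibiting = counts[1], counts[-1]
    if activating > inhibiting:
        label = "activate"
    elif inhibiting > activating:
        label = "inhibit"
    else:
        label = "undetermined"
    return label, activating, inhibiting


print(majority_effect(EDGES, "drugA", "phenotypeX", lmax=3))
```

Lowering lmax shrinks the set of paths that take part in the vote, which is why the maximum path length is a required argument in the commands below.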

Citation

If you use drug2ways in your research, please cite our paper:

Daniel Rivas-Barragan, Sarah Mubeen, Francesc Guim-Bernat, Martin Hofmann-Apitius, and Daniel Domingo-Fernández (2020). Drug2ways: Reasoning over causal paths in biological networks for drug discovery. PLOS Computational Biology 16(12): e1008464; https://doi.org/10.1371/journal.pcbi.1008464

Applications

Drug2ways supports three applications:

Scripts and real examples: https://github.com/drug2ways/drug2ways/tree/master/examples

1. Identifying candidate drugs

The following command of the drug2ways command line interface (CLI) identifies candidate drugs. The minimum required inputs are the path to the network and its format, the path to the nodes considered as drugs, the path to the nodes considered as conditions/phenotypes, and the maximum length allowed for a given path (i.e., lmax). Run "python -m drug2ways explore --help" to see the optional arguments.

python -m drug2ways explore \
       --graph=<path-to-graph> \
       --fmt=<format> \
       --sources=<sources> \
       --targets=<targets> \
       --lmax=<lmax>

2. Optimization of drugs' effects

The following command of the drug2ways CLI searches for drugs that not only target a given disease but also activate or inhibit a set of phenotypes. It requires the same arguments as the explore command, but the targets file needs a second column that specifies the desired effect on each node (e.g., 'node1,activate'). See the examples directory for more information.

# Note: the targets file here has a second column with the desired effect,
# unlike the targets file used by the explore command
python -m drug2ways optimize \
       --graph=<path-to-graph> \
       --fmt=<format> \
       --sources=<sources> \
       --targets=<targets> \
       --lmax=<lmax>
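As a rough illustration of the two-column targets file, the sketch below writes 'node,effect' rows with the standard csv module. The node names are hypothetical, the 'inhibit' keyword is assumed by analogy with the 'activate' example, and whether the file needs a header row is not stated here, so check the examples directory before relying on this layout.

```python
import csv
import io

# Hypothetical nodes and desired effects; replace with nodes from your own network.
desired_effects = [
    ("node1", "activate"),
    ("node2", "inhibit"),  # assumed keyword, mirroring 'activate'
]


def write_targets(rows, handle):
    """Write one 'node,effect' row per target to an open text handle."""
    writer = csv.writer(handle, lineterminator="\n")
    for node, effect in rows:
        writer.writerow([node, effect])


# Written to an in-memory buffer here; pass an open file handle to write to disk.
buffer = io.StringIO()
write_targets(desired_effects, buffer)
targets_text = buffer.getvalue()
print(targets_text)
```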

3. Proposing combination therapies

The following command of the drug2ways CLI identifies candidate drugs for combination therapies. The minimum required inputs are the path to the network and its format, the path to the nodes considered as drugs, and the path to the nodes considered as conditions/phenotypes. As with the optimize command, the targets file needs a second column specifying the desired effect on each node (e.g., 'node1,activate'). In addition, the maximum length allowed for a given path (i.e., lmax) and the number of drugs per combination must be provided. Run "python -m drug2ways combine --help" to see the optional arguments.

python -m drug2ways combine \
       --graph=<path-to-graph> \
       --fmt=<format> \
       --sources=<sources> \
       --targets=<targets> \
       --lmax=<lmax> \
       --combination-length=<number>
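To get a feel for how the combination length affects the search space, the following sketch enumerates drug combinations with itertools. It only illustrates the combinatorics; how drug2ways enumerates or scores combinations internally is not described here, and the drug names are hypothetical.

```python
from itertools import combinations
from math import comb

# Hypothetical source nodes (drugs).
drugs = ["drugA", "drugB", "drugC", "drugD"]
combination_length = 2

# Every unordered set of `combination_length` distinct drugs.
candidate_sets = list(combinations(drugs, combination_length))

# The number of candidate sets is the binomial coefficient C(n, k).
expected = comb(len(drugs), combination_length)
print(len(candidate_sets), expected)
```

Because the number of candidate sets grows quickly with the number of drugs, a small combination length keeps the search tractable.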

Installation

The latest stable code can be installed from PyPI with:

python -m pip install drug2ways

The most recent code can be installed from the source on GitHub with:

python -m pip install git+https://github.com/drug2ways/drug2ways.git

For developers, the repository can be cloned from GitHub and installed in editable mode with:

git clone https://github.com/drug2ways/drug2ways.git
cd drug2ways
python -m pip install -e .

Requirements

click==7.1.1
tqdm==4.47.0
pandas==1.0.3
networkx>=2.4
numpy
scipy
statsmodels

drug2ways's People

Contributors

cthoyt, ddomingof, sarahbeenie, yojanagadiya

drug2ways's Issues

Error on missing nodes

Why do I get an error here?

    raise ValueError(
ValueError: ('The following target nodes are not in the graph (12/18):', "({'mondo:0010200', 'go:0050890', 'go:0072678', 'go:0042098', 'go:0001779', 'go:0008219', 'go:0042110', 'ncbigene:2806', 'ncbigene:2875', 'go:0072593', 'go:0046911', 'go:0001787'})")
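The message says that 12 of the 18 requested target nodes are not present in the loaded graph. One way to diagnose this before running drug2ways is to compare the identifiers in the targets file with the node names in the network. The sketch below is a standalone helper, not part of drug2ways, and uses hypothetical node names; it also reports targets that would match if letter case were ignored.

```python
def find_missing_nodes(graph_nodes, requested_nodes):
    """Return the requested nodes that are absent from the graph, sorted."""
    graph_nodes = set(graph_nodes)
    return sorted(node for node in set(requested_nodes) if node not in graph_nodes)


def case_insensitive_matches(graph_nodes, missing_nodes):
    """Map each missing node to a graph node that differs only by letter case."""
    lookup = {node.casefold(): node for node in graph_nodes}
    return {
        node: lookup[node.casefold()]
        for node in missing_nodes
        if node.casefold() in lookup
    }


# Hypothetical graph nodes and targets.
graph_nodes = {"ncbigene:2806", "GO:0008219", "drug:1"}
targets = ["ncbigene:2806", "go:0008219", "mondo:0010200"]

missing = find_missing_nodes(graph_nodes, targets)
near_matches = case_insensitive_matches(graph_nodes, missing)
print(missing, near_matches)
```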

Update PyBEL dependency

What's the rationale behind using a significantly outdated version of PyBEL?

If it's because of PathMe, then you're shooting the usability of the package in the foot.

Fix slow iteration during loading of network

If you iterate over df.values instead of df.iterrows(), loading can potentially be 100-1000x faster. The current code is:

for _, row in tqdm(df.iterrows(), total=df.shape[0], desc='Loading graph'):
    # Get node names from data frame
    sub_name = row[SOURCE]
    obj_name = row[TARGET]
    relation = row[RELATION]

Alternatively, just use the super-fast method for loading from a pandas dataframe provided by networkx:

https://networkx.github.io/documentation/networkx-1.10/reference/generated/networkx.convert_matrix.from_pandas_dataframe.html
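Both suggestions can be sketched as follows. The column names and the toy data frame are hypothetical stand-ins for the constants used in drug2ways. Note that the linked page documents networkx 1.10, where the function is called from_pandas_dataframe; in networkx 2.x it is named from_pandas_edgelist, which is what this sketch assumes.

```python
import networkx as nx
import pandas as pd

# Hypothetical column names standing in for the package constants.
SOURCE, TARGET, RELATION = "source", "target", "relation"

df = pd.DataFrame({
    SOURCE: ["drugA", "protein1"],
    TARGET: ["protein1", "phenotypeX"],
    RELATION: [-1, 1],
})

# Option 1: iterate over the underlying array instead of df.iterrows().
graph_from_values = nx.DiGraph()
for sub_name, obj_name, relation in df[[SOURCE, TARGET, RELATION]].values:
    graph_from_values.add_edge(sub_name, obj_name, relation=relation)

# Option 2: let networkx build the graph directly from the data frame.
graph_from_edgelist = nx.from_pandas_edgelist(
    df,
    source=SOURCE,
    target=TARGET,
    edge_attr=RELATION,
    create_using=nx.DiGraph(),
)

print(graph_from_edgelist.edges(data=True))
```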

Give results in reasonable data structure

Right now, the following code is taking highly structured data and putting it into a string that needs to be parsed again later

sorted_most_common_nodes = [
    f"{node} ({count})"
    for node, count in counter.most_common()
    # Threshold on absolute count and proportion
    if count > min_count and (count * 100) / total > min_proportion
]

If there are two pieces of data that need to go together, either:

  1. Don't use a dataframe as the data structure to hold it. JSON is probably better.
  2. Make two columns.
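Either option could look roughly like the sketch below, which keeps the node and its count as separate fields instead of formatting them into one string. The function name, the sample counts, and the extra 'proportion' field are illustrative and not part of the current code.

```python
import json
from collections import Counter


def most_common_nodes(counter, total, min_count, min_proportion):
    """Return one record per node that passes both thresholds."""
    return [
        {"node": node, "count": count, "proportion": count * 100 / total}
        for node, count in counter.most_common()
        # Threshold on absolute count and proportion
        if count > min_count and count * 100 / total > min_proportion
    ]


# Hypothetical counts of how often each node appears in the paths.
counter = Counter({"protein1": 8, "protein2": 3, "protein3": 1})
rows = most_common_nodes(counter, total=12, min_count=2, min_proportion=10)

# The records serialize to JSON directly, or can be passed to a data frame
# constructor to get one column per field.
print(json.dumps(rows, indent=2))
```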

Error reading example files

When trying to execute the example with the custom network, I found an error in graph_reader.py because the relation is defined as 'polarity' in the constants.py script, while in the .tsv file it only finds the column 'relation'. I suppose I could avoid this error by changing the column name of the .tsv or modifying the constants.py file. However, I find it strange that in none of the example networks is there a column named 'polarity' and I was wondering if there is something I am not taking into account.
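As a stopgap, the first workaround mentioned above (renaming the column in the .tsv) could be done along these lines. This is only a sketch: the 'source' and 'target' column names and the sample row are hypothetical, and it does not settle whether the file or constants.py should be considered correct.

```python
import csv
import io


def rename_column(tsv_text, old, new):
    """Return the TSV text with the header field `old` renamed to `new`."""
    rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    header = [new if name == old else name for name in rows[0]]
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows[1:])  # data rows are copied unchanged
    return out.getvalue()


# Hypothetical network file content with a 'relation' column.
example = "source\ttarget\trelation\ndrugA\tprotein1\t-1\n"
fixed = rename_column(example, "relation", "polarity")
print(fixed)
```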

Report name of pathway database that was being used for enrichment

The following code seems to use a hard-coded integer index to pick a pathway database, but there is no programmatic way to tell which database it is:

# TODO: currently using one pathway database
enrichment_results = pathway_enrichment(df, genesets[0])

Idea on how to fix it:

The genesets object could be a dictionary from Identifiers.org prefix to the actual data, rather than a list that doesn't say what database it is at each position. This means updating get_genesets

def get_genesets():
    """Get gene sets as dicts."""
    return (
        parse_gmt_file(KEGG_GENESETS),
        parse_gmt_file(REACTOME_GENESETS),
        parse_gmt_file(WIKIPATHWAYS_GENESETS),
    )

to return a dictionary, which seems pretty obvious, like this:

def get_genesets() -> Mapping[str, ...]:
    """Get gene sets as dicts."""
    return {
        'kegg.pathway': parse_gmt_file(KEGG_GENESETS),
        'reactome': parse_gmt_file(REACTOME_GENESETS),
        'wikipathways': parse_gmt_file(WIKIPATHWAYS_GENESETS),
    }

Or even better, you switch over to external ComPath code that abstracts all of this away from the identities of the databases, so as new ones get added they get put in analysis automatically.

example data 404

Hello,
All of the example data links return a 404 error. Could you share the data via another link? Thanks.
