decagon's Introduction

Decagon: Representation Learning on Multimodal Graphs

Overview

This repository contains the code necessary to run the Decagon algorithm. Decagon is a method for learning node embeddings in multimodal graphs, and is especially useful for link prediction in highly multi-relational settings. See our paper for details on the algorithm.

Usage: Polypharmacy

Here Decagon is applied to a pressing question in pharmacology: predicting the safety of drug combinations.

We construct a multimodal graph of protein-protein interactions, drug-protein target interactions, and polypharmacy side effects, which are represented as drug-drug interactions, where each side effect is an edge of a different type.
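As a rough illustration of this graph layout, the sketch below builds a tiny random example keyed by (source node type, target node type), with genes as type 0 and drugs as type 1 and one drug-drug matrix per side effect. The sizes, densities, and the helper name `random_symmetric_adj` are assumptions made for illustration, and dense NumPy arrays stand in for the sparse matrices a real dataset would need.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_drugs, n_side_effects = 6, 4, 3  # illustrative sizes (assumption)


def random_symmetric_adj(n, density, rng):
    """Hypothetical helper: random undirected graph as a symmetric 0/1 matrix."""
    upper = np.triu(rng.random((n, n)) < density, k=1).astype(np.int64)
    return upper + upper.T  # mirror the upper triangle; the diagonal stays empty


# Protein-protein interactions (undirected).
gene_adj = random_symmetric_adj(n_genes, 0.4, rng)
# Drug-protein target interactions, stored once per direction.
gene_drug_adj = (rng.random((n_genes, n_drugs)) < 0.3).astype(np.int64)
drug_gene_adj = gene_drug_adj.T.copy()
# One drug-drug adjacency matrix per side effect (each is its own edge type).
drug_drug_adj_list = [
    random_symmetric_adj(n_drugs, 0.5, rng) for _ in range(n_side_effects)
]

# Node type 0 = genes, node type 1 = drugs.
adj_mats = {
    (0, 0): [gene_adj],
    (0, 1): [gene_drug_adj],
    (1, 0): [drug_gene_adj],
    (1, 1): drug_drug_adj_list,
}

# Number of edge types stored for each pair of node types.
edge_type_counts = {key: len(mats) for key, mats in adj_mats.items()}
```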

Decagon uses graph convolutions to embed the multimodal graph in a compact vector space and then uses the learned embeddings to predict side effects of drug combinations.

Running the code

The setup for the polypharmacy problem on a synthetic dataset is outlined in main.py. It uses a small synthetic network example with five edge types. Run the code as follows:

$ python main.py

The full polypharmacy dataset (described in the paper) is available on the project website. To run the code on the full dataset, first download all data files from the project website. The polypharmacy dataset is already preprocessed and ready to use. After cloning the project, replace the synthetic example in main.py with the polypharmacy dataset and run the model.

Citing

If you find Decagon useful for your research, please consider citing this paper:

@article{Zitnik2018,
  title     = {Modeling polypharmacy side effects with graph convolutional networks.},
  author    = {Zitnik, Marinka and Agrawal, Monica and Leskovec, Jure},
  journal   = {Bioinformatics},
  volume    = {34},
  number    = {13},
  pages     = {457--466},
  year      = {2018}
}

Miscellaneous

Please send any questions you might have about the code and/or the algorithm to [email protected].

This code implements several different edge decoders (innerproduct, distmult, bilinear, dedicom) and loss functions (hinge loss, cross entropy). Many deep variants are possible and what works best might depend on a concrete use case.
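To make the four decoder names concrete, here is a minimal NumPy sketch of how each one could score a single pair of node embeddings for one edge type. It follows the commonly used definitions of these decoders; the function names and argument layout are assumptions for illustration, and the implementations in this repository may differ in detail.

```python
import numpy as np


def sigmoid(x):
    """Map a raw score to a probability-like value in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def inner_product_score(z_i, z_j):
    """Inner product: no relation-specific parameters."""
    return sigmoid(z_i @ z_j)


def distmult_score(z_i, z_j, r_diag):
    """DistMult: one diagonal weight vector per edge type."""
    return sigmoid(z_i @ (r_diag * z_j))


def bilinear_score(z_i, z_j, R):
    """Bilinear: one full d x d matrix per edge type."""
    return sigmoid(z_i @ R @ z_j)


def dedicom_score(z_i, z_j, R_global, d_r):
    """DEDICOM: a matrix shared by all edge types, rescaled on both sides
    by a diagonal vector specific to the edge type."""
    return sigmoid((z_i * d_r) @ R_global @ (d_r * z_j))
```

With an identity matrix and all-ones diagonal vectors, the three parameterized decoders reduce to the inner product, which is a quick sanity check when experimenting with them.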

Requirements

Decagon is tested to work under Python 2 and Python 3.

Recent versions of TensorFlow, sklearn, networkx, numpy, and scipy are required. All the required packages can be installed using the following command:

$ pip install -r requirements.txt

License

Decagon is licensed under the MIT License.

decagon's People

Contributors

agrawalm, marinkaz

decagon's Issues

Unsupported feed type

I keep getting the following error when I try and run the code on my laptop with no changes.

Train model
Traceback (most recent call last):
  File "C:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1334, in _do_call
    return fn(*args)
  File "C:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "C:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InternalError: Unsupported feed type

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 279, in <module>
    outs = sess.run([opt.opt_op, opt.cost, opt.batch_edge_type_idx], feed_dict=feed_dict)
  File "C:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 929, in run
    run_metadata_ptr)
  File "C:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "C:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1328, in _do_run
    run_metadata)
  File "C:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Unsupported feed type

I don't understand where the problem is. Please help.

Does Decagon regard the relation and its corresponding reverse one as different relations?

Typically, DDI prediction is a pairwise classification problem.
In the given toy example, you artificially generate a small graph.
My concern is that Decagon seems to treat a DDI and its reverse as two different edges, and I am unsure how to calculate the metrics from the predicted results for potential DDI triplets and their reverses.
For instance, "Drug A's metabolism is increased when combined with Drug B" (symbolized as triplet (A, metabolism increased, B)) is semantically equivalent to the reverse statement "Drug B can increase the metabolism of Drug A" (symbolized as triplet (B, increase metabolism, A)).

decagon/main.py

Lines 140 to 145 in 86ff6b1

adj_mats_orig = {
    (0, 0): [gene_adj, gene_adj.transpose(copy=True)],
    (0, 1): [gene_drug_adj],
    (1, 0): [drug_gene_adj],
    (1, 1): drug_drug_adj_list + [x.transpose(copy=True) for x in drug_drug_adj_list],
}

If the model gives different scores for the two triplets, how should the final metric values be calculated? By simply keeping the predicted results for both groups?
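One possible convention, which is an assumption here and not something the authors confirm, is to average the two directional scores so that each unordered drug pair contributes a single prediction to the metric. The helper names below are hypothetical, and `scores` stands for a square matrix of predicted scores for one side effect.

```python
import numpy as np


def symmetrize_scores(scores):
    """Average score(i, j) and score(j, i) into one value per unordered pair."""
    scores = np.asarray(scores, dtype=float)
    return 0.5 * (scores + scores.T)


def unordered_pair_scores(scores):
    """Return (i, j, score) triples with i < j from the symmetrized matrix."""
    sym = symmetrize_scores(scores)
    rows, cols = np.triu_indices(sym.shape[0], k=1)
    return list(zip(rows.tolist(), cols.tolist(), sym[rows, cols].tolist()))
```

Evaluating on the triples returned by `unordered_pair_scores` counts each drug pair once. Keeping both directions instead counts every pair twice, which leaves ranking-based metrics well defined but doubles the number of evaluated examples.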

Confusion about weights update

It seems that the model considers a single edge type on each batch iteration, so it only uses the corresponding W_r. Is only that matrix updated on that iteration?

Meaning of the variables

Hello! Could you tell me the meaning of every variable in main.py?
Thanks

A question about the 'adj_mats_orig'

@marinkaz Thank you so much for your work! Would you please explain why adj_mats_orig contains both gene_adj and gene_adj.transpose (and likewise for drug_drug_adj_list)? I think gene_adj alone is enough, as in the second snippet below.
Looking forward to your reply. Thank you!

# data representation
adj_mats_orig = {
    (0, 0): [gene_adj, gene_adj.transpose(copy=True)],
    (0, 1): [gene_drug_adj],
    (1, 0): [drug_gene_adj],
    (1, 1): drug_drug_adj_list + [x.transpose(copy=True) for x in drug_drug_adj_list],
}

# In my view, the data representation should be as follows.
adj_mats_orig = {
    (0, 0): [gene_adj],
    (0, 1): [gene_drug_adj],
    (1, 0): [drug_gene_adj],
    (1, 1): drug_drug_adj_list,
}

Data leakage problem in your model

The design of your adjacency matrix adj_mats_orig and the way you split the train/test sets will cause a serious data leakage problem in training. The validation and test sets are created independently for gene_adj and gene_adj.transpose(copy=True), so the edges in the validation/test sets of gene_adj are actually included in the training set of gene_adj.transpose(copy=True).

The same problem applies to the train/validation split between gene_drug_adj and drug_gene_adj. The validation edges from gene_drug_adj are actually used for training in drug_gene_adj, and vice versa.

Could you please clarify?
Thanks!

Originally posted by @hurleyLi in #7 (comment)
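The concern raised in this issue can be illustrated with a split that samples held-out edges once from the upper triangle of a symmetric matrix and then removes both directions from the training matrix. This is a minimal sketch of one way to avoid the overlap, not the repository's splitting code; the function name `split_undirected_edges` and the use of dense NumPy arrays are assumptions.

```python
import numpy as np


def split_undirected_edges(adj, test_frac, rng):
    """Hold out a fraction of undirected edges and drop both directions
    of each held-out edge from the training adjacency matrix."""
    adj = np.asarray(adj)
    # List each undirected edge exactly once, as (i, j) with i < j.
    rows, cols = np.nonzero(np.triu(adj, k=1))
    edges = np.stack([rows, cols], axis=1)
    n_test = int(round(test_frac * len(edges)))
    perm = rng.permutation(len(edges))
    test_edges = edges[perm[:n_test]]
    train_adj = adj.copy()
    # Remove (i, j) and (j, i) together so the transposed matrix cannot
    # reintroduce a held-out edge into training.
    train_adj[test_edges[:, 0], test_edges[:, 1]] = 0
    train_adj[test_edges[:, 1], test_edges[:, 0]] = 0
    return train_adj, test_edges
```

Because `train_adj` stays symmetric, its transpose carries no additional edges. For a bipartite pair such as gene_drug_adj and drug_gene_adj, the same idea would mean sampling the held-out edges once and removing them from both matrices.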

Is this repo still supported?

Thank you for providing this repo. Is it still maintained? There are many dependency issues in the requirements file, and resolving all of them is time-consuming.

AttributeError: module 'tensorflow' has no attribute 'app'

AttributeError: module 'tensorflow' has no attribute 'app'

(base) C:\Users\chert\decagon>python main.py
Traceback (most recent call last):
  File "main.py", line 14, in <module>
    from decagon.deep.optimizer import DecagonOptimizer
  File "C:\Users\chert\decagon\decagon\deep\optimizer.py", line 4, in <module>
    flags = tf.app.flags
AttributeError: module 'tensorflow' has no attribute 'app'

(base) C:\Users\chert\decagon>

How can I solve this error?

ERROR: Could not find a version that satisfies the requirement futures==3.2.0

How can I solve this error?

Using cached https://files.pythonhosted.org/packages/69/cb/f5be453359271714c01b9bd06126eaf2e368f1fddfff30818754b5ac2328/funcsigs-1.0.2-py2.py3-none-any.whl
Collecting futures==3.2.0 (from -r requirements.txt (line 8))
ERROR: Could not find a version that satisfies the requirement futures==3.2.0 (from -r requirements.txt (line 8)) (from versions: 0.2.python3, 0.1, 0.2, 1.0, 2.0, 2.1, 2.1.1, 2.1.2, 2.1.3, 2.1.4, 2.1.5, 2.1.6, 2.2.0, 3.0.0, 3.0.1, 3.0.2, 3.0.3, 3.0.4, 3.0.5, 3.1.0, 3.1.1)
ERROR: No matching distribution found for futures==3.2.0 (from -r requirements.txt (line 8))
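The version list in the log stops at 3.1.1, which suggests pip is running under Python 3. The `futures` package is a backport of `concurrent.futures` for Python 2, and that module is already part of the Python 3 standard library. One possible workaround, offered as an assumption and not as a maintainer-confirmed fix, is to add an environment marker so the pin only applies on Python 2. The helper name `restrict_to_python2` is hypothetical.

```python
def restrict_to_python2(requirement_lines, py2_only=("futures",)):
    """Append a Python 2 environment marker to pins that only exist for Python 2."""
    fixed = []
    for line in requirement_lines:
        stripped = line.strip()
        # Package name is the part before the "==" version pin.
        name = stripped.split("==")[0].strip().lower()
        if name in py2_only and ";" not in stripped:
            stripped += '; python_version < "3"'
        fixed.append(stripped)
    return fixed
```

Applied to the two requirements visible in the log, this leaves `funcsigs==1.0.2` unchanged and turns the failing line into `futures==3.2.0; python_version < "3"`, which pip skips on Python 3.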
