This is the PyTorch implementation for our paper published in Briefings in Bioinformatics.
iDPath is an interpretable deep learning-based path-reasoning framework to identify potential drugs for the treatment of diseases, by capturing the mechanism of drug action (MODA) based on simulating the paths from drugs to diseases in the human body.
The code has been tested under Python 3.7. The required packages are as follows:
- pytorch == 1.6.0
- numpy == 1.19.1
- sklearn == 0.23.2
- networkx == 2.5
- pandas == 1.1.2
- To install the required packages for running iDPath, first run the command below. If you run into any problems installing PyTorch, please refer to the official PyTorch website.
pip install -r requirements.txt
- You may need to download the following files to run iDPath:
  - Download the shortest paths between all the targets of drugs and diseases and put the two files (disease_path_dict.pkl and drug_path_dict.pkl) under the folder data/path.
  - Download the processed data and put these files under the folder data/processed.
  - Download the test data and put these files under the folder data/test.
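Before training, it may help to confirm that the downloaded files ended up where the scripts expect them. The following is a minimal sketch, not part of the iDPath code base: the helper name `missing_inputs` is hypothetical, and it checks only the two path files and the two folders named above, since the exact contents of data/processed and data/test are not listed here.

```python
from pathlib import Path

# Locations taken from the download instructions above.
REQUIRED_FILES = [
    "data/path/disease_path_dict.pkl",
    "data/path/drug_path_dict.pkl",
]
REQUIRED_DIRS = ["data/processed", "data/test"]


def missing_inputs(root="."):
    """Return the required files and folders that are absent or empty under root.

    Hypothetical helper for illustration; iDPath itself does not provide it.
    """
    root = Path(root)
    missing = [f for f in REQUIRED_FILES if not (root / f).is_file()]
    for d in REQUIRED_DIRS:
        folder = root / d
        # An empty folder is reported too, which suggests a skipped download.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(d)
    return missing


if __name__ == "__main__":
    problems = missing_inputs()
    if problems:
        print("Missing inputs:", ", ".join(problems))
    else:
        print("All expected inputs are in place.")
```

Run it from the repository root; an empty result means the two path files exist and both data folders contain at least one file.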
python train.py --config config/config.json
When training finishes, the parameters of the best model are saved to a file. Note its location (for example, saved/models/iDPath/0117_164440/model_best.pth) and use it for testing.
python test.py --config config/config.json --resume saved/models/iDPath/0117_164440/model_best.pth
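Because each training run appears to write its checkpoint into a new timestamped folder, locating the latest model_best.pth by hand can be tedious. The following is a minimal sketch, assuming the saved/models/iDPath/<run id>/model_best.pth layout suggested by the example path above. The helper name `latest_checkpoint` is hypothetical and not part of iDPath.

```python
from pathlib import Path


def latest_checkpoint(save_dir="saved/models/iDPath", name="model_best.pth"):
    """Return the most recently modified checkpoint under save_dir, or None.

    Hypothetical helper; assumes one sub-folder per run, each holding `name`.
    """
    candidates = list(Path(save_dir).glob("*/" + name))
    if not candidates:
        return None
    # Compare modification times rather than folder names: a run id such as
    # 0117_164440 appears to carry no year, so names may not sort by date.
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    checkpoint = latest_checkpoint()
    if checkpoint is None:
        print("No checkpoint found; run train.py first.")
    else:
        print("python test.py --config config/config.json --resume", checkpoint)
```

The script only prints the test command, so you can check the selected path before running it.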
- Data. To run inference on a new drug-disease pair, prepare a CSV file named test.csv under the folder data/test with the following fields, where the drug is denoted by its PubChem CID and the disease by its ICD10 code. Note that pairs whose drug or disease cannot be found in our dataset will be ignored.
drug_pubchemcid disease_icd10
CID-132971 ICD10-C61
- Pre-trained model. You can use your own pre-trained model or our prepared one, model_best.pth. Put config.json and model_best.pth in the folder data/test.
- Run. The argument K in inference_config.json controls how many top-k critical paths iDPath outputs. Use the following command to run the inference.
python inference.py --resume data/test/model_best.pth --config config/inference_config.json
- Result. After the inference is done, you will get a file named result.csv under the folder saved/models/iDPath/xxxx_xxxxxx (where xxxx_xxxxxx is the time of your run, used as the run ID). The result.csv file contains the predicted probability of therapeutic effect and the top-k critical paths for your input drug-disease pairs.
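The test.csv input from the Data step can also be produced programmatically. The following is a minimal sketch using only the standard library. It assumes a comma-separated file with the two header names shown above, and the helper name `write_test_csv` is hypothetical. The example pair is the one from the Data step.

```python
import csv
from pathlib import Path

# Header names as listed in the Data step.
FIELDS = ["drug_pubchemcid", "disease_icd10"]


def write_test_csv(pairs, path="data/test/test.csv"):
    """Write (PubChem CID, ICD10 code) pairs as a two-column CSV file.

    Hypothetical helper; the comma delimiter is an assumption.
    """
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(FIELDS)
        for cid, icd10 in pairs:
            # Identifiers keep their prefixes, as in the example row.
            writer.writerow([cid, icd10])
    return path


if __name__ == "__main__":
    write_test_csv([("CID-132971", "ICD10-C61")])
```

If inference.py expects a different delimiter, adjust the `csv.writer` call accordingly.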
Datasets used in the paper:
Distributed under the GPL-2.0 License. See LICENSE for more information.
Jiannan Yang - [email protected]