ps4dr's Introduction

PS4DR (Pathway Signatures for Drug Repositioning)

This package comprises a modular workflow designed to identify drug repositioning candidates using multi-omics data sets. A schematic figure of the workflow is presented below. The R scripts necessary to run the MSDRP pipeline are located in the R directory.

Figure 1. Design of the MSDRP workflow. Differentially expressed genes/proteins (i.e., DEG/DEP) from disease and drug perturbed profiles are passed as input together with GWAS data. Once the data is correctly formatted, users can define a custom pipeline, or series of steps in the workflow that will then be applied to the datasets. The steps performed in this pipeline constitute the optional portion of the workflow and involve filtering the -omics features coming from the dataset in order to reduce dimensionality by exclusively analyzing genes that have been associated with GWAS studies. Next, a previously selected pathway enrichment method is applied to DEG/DEP datasets deriving from both the disease and drug perturbed profiles to evaluate the direction of dysregulation for each affected pathway in each of these contexts. Finally, the workflow prioritizes drugs by finding the drugs that are predicted to invert the pathway signatures observed in the pathophysiology context.

Citation

If you use PS4DR in your work, please consider citing our preprint:

Installation

If devtools is not installed, do:

$ R -e 'install.packages("devtools", repos="http://cran.us.r-project.org")'

This library can be installed directly from GitHub using the following instructions adapted from R Package primer:

$ R -e 'library(devtools); install_github("ps4dr/ps4dr")'

Alternatively, ps4dr can be cloned and installed from source. First install its CRAN and Bioconductor dependencies:

$ R -e 'install.packages(c("cowplot", "BiocManager", "data.table", "doParallel", "dplyr", "ggplot2", "gridExtra", "Hmisc", "httr", "jsonlite", "pROC", "purrr", "qqplotr", "RColorBrewer", "RecordLinkage", "stringr", "tools", "tidyr", "VennDiagram"), repos="http://cran.us.r-project.org")'
$ R -e 'BiocManager::install(c("BiocParallel", "biomaRt", "graphite", "org.Hs.eg.db", "SPIA"))'

Then clone the repository and install the package:

$ git clone https://github.com/ps4dr/ps4dr.git
$ R -e 'library(devtools); install("ps4dr")'
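
The installation commands above assume that R and git are already available on the PATH. As a minimal sketch (the helper name `missing_tools` is hypothetical and not part of ps4dr), a short check can report which of these tools are absent before starting:

```shell
# Print the name of every listed command that is not available on PATH.
missing_tools() {
    for tool in "$@"; do
        # command -v is the POSIX way to test whether a command exists
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# The install steps above call R and git.
missing=$(missing_tools R git)
if [ -n "$missing" ]; then
    echo "Missing required tools:" $missing >&2
fi
```

If the check prints nothing, both tools were found and the install commands can be run as written.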

Reproduction

To run the entire pipeline:

$ sh ps4dr.sh

Alternatively, see the instructions to:

  1. Run all pre-processing scripts using the instructions at https://github.com/ps4dr/ps4dr/tree/master/R/preprocessing
  2. Run all analysis scripts using the instructions at https://github.com/ps4dr/ps4dr/tree/master/R/analysis

How to Modify the Workflow

Notes on how to change parts of the workflow:

  1. Selecting different gene sets (i.e., the "gene set intersection" part in the figure)
  2. Modifying the pathway enrichment analysis method (e.g., GSEA instead of SPIA)

Disclaimer

PS4DR is scientific software developed in an academic capacity, and thus comes with no warranty or guarantee of maintenance, support, or back-up of data.

ps4dr's People

Contributors

asifemon, cthoyt, ddomingof

ps4dr's Issues

Change storage of Rdata files to TSV

It's impossible to look into these files and see what's there. We need to make the file input/output more amenable to debugging by outputting as CSV or TSV and loading from those instead of using the R data loader. I understand this might make the scripts a bit slower, but it's worth it.

Error in reproduction

in 2.1 Calculating significant overlaps:


	Wilcoxon rank sum test with continuity correction

data:  disease_genes[same.disease == FALSE, -log10(p.adjusted)] and disease_genes[same.disease == TRUE, -log10(p.adjusted)]
W = 812133, p-value = 5.152e-06
alternative hypothesis: true location shift is not equal to 0

[1] 5.15187e-06
Error in ggplot(disease_genes, aes(x = same.disease, y = -log10(p.adjusted),  : 
  could not find function "ggplot"
Calls: print
Execution halted

Missing dependency in script 2.2

Error in library(org.Hs.eg.db) : 
  there is no package called ‘org.Hs.eg.db’
Execution halted

Is this because the dependency wasn't written on the readme?
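
One way to answer this kind of question is to compare the packages the scripts actually load with the packages named in the install instructions. The sketch below is an illustration, not part of ps4dr: the function name `list_r_packages` is hypothetical, and it assumes packages are loaded with single-line `library(...)` or `require(...)` calls. It does not detect `pkg::function` usage, and it also picks up calls that are commented out.

```shell
# Print the unique package names loaded via library(...) or require(...)
# in every .R file under the given directory.
list_r_packages() {
    find "$1" -type f -name '*.R' -exec cat {} + |
        # keep only the call prefix and the first argument
        grep -oE '(library|require)\([^),]+' |
        # drop the call prefix, then any quotes or spaces around the name
        sed -E "s/^(library|require)\(//; s/[\"' ]//g" |
        # fixed locale so the ordering is the same on every machine
        LC_ALL=C sort -u
}
```

Running `list_r_packages R` from the repository root prints one package per line, which can then be checked against the README and DESCRIPTION.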

Missing packages in calculation of Disease SPIAs

Need to update README and DESCRIPTION

...
Error in library(org.Hs.eg.db) : 
  there is no package called ‘org.Hs.eg.db’
Execution halted
\n2.3 Calculating Drug SPIAs\n
Error in library(doMC) : there is no package called ‘doMC’
Execution halted
\n2.4 Checking Distributions\n
Error in library(Hmisc) : there is no package called ‘Hmisc’
..
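
The log above shows steps 2.2 through 2.4 each failing in turn, which suggests the driver script keeps going after an error. A minimal sketch of a runner that stops at the first failing step is shown below. The function name `run_steps` is hypothetical, and this is not how `ps4dr.sh` is known to be written.

```shell
# Run each command in order and stop at the first one that fails,
# so that later steps never run on missing or partial results.
run_steps() {
    for step in "$@"; do
        echo "Running: $step"
        if ! sh -c "$step"; then
            echo "Step failed: $step" >&2
            return 1
        fi
    done
}

# Example call; the script names are placeholders, not real ps4dr paths:
# run_steps "Rscript first_step.R" "Rscript second_step.R"
```

Stopping early keeps the first error at the bottom of the log, where it is easiest to find.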

Package code

  • Make sure it's all installable
  • Make sure it can be run via the CLI or through top-level functions

Error in 2.1

2.1 Calculating significant overlaps

[1] "Using results folder at /Users/cthoyt/dev/hbp/ps4dr/data"
[1] "GWAS to DEGs Gene Set Overlap Signifcance Calculation"
  |======================================================================| 100%
[1] "Drug to Disease_Gene Set Overlap Signifcance Calculation"
  |======================================================================| 100%
Error in drugPdisease_genes$chembl.id : 
  $ operator is invalid for atomic vectors
Calls: sprintf -> unique
Execution halted

README doesn't tell users where to start

Am I supposed to explore the repository until I find something that looks meaningful?

What command am I supposed to run to install all of the required dependencies?

The workflow diagram is nice but it's a huge distraction. Installation and "Getting Started" sections need to come first.

There's no point in reviewing the manuscript until I can actually run this

Script fails

Idk what's being output here, there's too much garbage output (fixed by #15) and not enough useful logging, but it crashes because it assumes some file exists that isn't there

[1] "Using results folder at /home/choyt/dev/ps4dr/data"
[1] 2725
[1] 9353
[1] 673
Error in forderv(x, by = by, sort = FALSE, retGrp = TRUE) : 
  Column 9 of by= (9) is type 'list', not yet supported. Please use the by= argument to specify columns with types that are supported. See NEWS item in v1.12.2 for more information.
Calls: unique -> unique -> unique.data.table -> forderv
Execution halted
Error in library(org.Hs.eg.db) : 
  there is no package called ‘org.Hs.eg.db’
Execution halted
Error in library(doSNOW) : there is no package called ‘doSNOW’
Execution halted
Error in library(gridExtra) : there is no package called ‘gridExtra’
Execution halted
[1] "Using results folder at /home/choyt/dev/ps4dr/data"
Error in readChar(con, 5L, useBytes = TRUE) : cannot open the connection
Calls: load -> readChar
In addition: Warning message:
In readChar(con, 5L, useBytes = TRUE) :
  cannot open compressed file '/home/choyt/dev/ps4dr/data/spia_output/spia_kegg_47Diseases_drugPdisease_nopar.RData', probable reason 'No such file or directory'
Execution halted
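
The final error above comes from loading an `.RData` file that an earlier, failed step never produced. As a hedged sketch, a step could check its inputs first and report exactly which files are missing. The helper name `require_files` is hypothetical and not part of ps4dr.

```shell
# Return non-zero and name each missing file if any listed input is absent.
require_files() {
    rf_status=0
    for path in "$@"; do
        if [ ! -f "$path" ]; then
            echo "Missing input file: $path" >&2
            rf_status=1
        fi
    done
    return "$rf_status"
}

# Example call; input.RData and next_step.R are placeholder names:
# require_files "input.RData" && Rscript next_step.R
```

A check like this turns the `readChar` connection error into a message that names the missing input directly.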

Error when running STOPGAP

\n1.1 Running STOPGAP\n
[1] "Using results folder at /home/choyt/dev/ps4dr/data"
Error in source(stopgap_functions_path) : 
  /home/choyt/dev/ps4dr/R/preprocessing/STOPGAP2_functions.R:426:94: unexpected ','
425: # Write out the new GWAS data as "stopgap_4sources_dbSNP141.RData"
426: rsID.update <- function(gwas.file = file.path(gwasDataFolder,"stopgap_4sources_clean.RData")),
                                                                                                  ^
Execution halted

Error when retrieving DEGs

\n1.3 Retrieving DEGs\n
[1] "Using results folder at /home/choyt/dev/ps4dr/data"
Error in { : 
  task 1 failed - "Column 5 of by= (5) is type 'list', not yet supported. Please use the by= argument to specify columns with types that are supported. See NEWS item in v1.12.2 for more information."
Calls: %dopar% -> <Anonymous>
Execution halted
