scenic's Introduction

⚠️ WARNING
SCENIC is deprecated, use pySCENIC instead.

SCENIC

SCENIC (Single-Cell rEgulatory Network Inference and Clustering) is a computational method to infer Gene Regulatory Networks and cell types from single-cell RNA-seq data.

The description of the method and some usage examples are available in Nature Methods (2017).

There are currently implementations of SCENIC in R (this repository), in Python (pySCENIC), as well as wrappers to automate analyses with Nextflow (VSN-pipelines).

The output from any of the implementations can be explored either in R, Python or SCope (a web interface).

Tutorials

If you have access to Nextflow and a container system (e.g. Docker or Singularity), we recommend running SCENIC through the VSN-pipeline.

This option is especially useful for running SCENIC on large datasets, or in batch on multiple samples.

If you prefer to use R for the whole analysis, these are the main tutorials:

The tutorials in R include a more detailed explanation of the workflow and source code.

Python/Jupyter notebooks with examples running SCENIC in different settings are available in the SCENIC protocol repository.

Frequently asked questions: FAQ


News

2021/03/26:

2020/06/26:

  • The SCENICprotocol, including the Nextflow workflow and pySCENIC notebooks, is now officially released. For details, see the GitHub repository and the associated publication in Nature Protocols.

2019/01/24:

2018/06/20:

2018/06/01:

  • Updated SCENIC pipeline to support the new version of RcisTarget and AUCell.

2018/05/01:

2018/03/30: New releases

  • pySCENIC: lightning-fast Python implementation of the SCENIC pipeline.
  • Arboreto package including GRNBoost2 and scalable GENIE3:
    • Easy-to-install Python library that supports distributed computing.
    • It allows fast co-expression module inference (Step 1) on large datasets, and is compatible with both the R and Python implementations of SCENIC.
  • Drosophila databases for RcisTarget.

scenic's People

Contributors

ghuls, juhaa, mschilli87, s-aibar

scenic's Issues

What is the goal of dividing the original gene list into nParts pieces?

What is the function of the nParts parameter in the runGenie3 function? In practice, how should we choose nParts?

#SCENIC/R/runGenie3.R, line 30
genesSplit <- split(sort(rownames(exprMat)), 1:nParts)
I also don't understand why sort is used here. Shouldn't all genes be kept together to infer gene regulatory networks from gene expression data?
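
A note on what the quoted line does, sketched in base R with made-up gene names: `split()` recycles the grouping vector `1:nParts`, so the sorted genes are dealt out round-robin into `nParts` groups, and every gene still ends up in exactly one group. Whether each group then serves only as the set of target genes for one partial run (with all regulators kept as predictors) is an assumption about runGenie3 that this line alone does not confirm.

```r
# Made-up gene names standing in for rownames(exprMat).
genes <- c("Sox2", "Actb", "Gapdh", "Pax6", "Myc", "Tbp")
nParts <- 3

# Same expression as in the quoted line: the index 1:nParts is recycled,
# so genes are assigned to parts 1, 2, 3, 1, 2, 3, ...
genesSplit <- split(sort(genes), 1:nParts)
print(genesSplit)

# No gene is lost or duplicated by the split.
allGenesBack <- sort(unlist(genesSplit, use.names = FALSE))
print(identical(allGenesBack, sort(genes)))
```

The example uses a gene count that is a multiple of `nParts`; otherwise `split()` still works but warns that the data length is not a multiple of the split variable.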

Error loading library(RcisTarget.hg19.motifDatabases.20k)

Hello,

I've downloaded RcisTarget.hg19.motifDatabases.20k from the lab website and placed it in the 'data' directory created by the workflow published here, but calling:

library(RcisTarget.hg19.motifDatabases.20k) as you do here (or library('data/RcisTarget.hg19.motifDatabases.20k') or install.packages('data/RcisTarget.mm9.motifDatabases.20k ', repos = NULL))

doesn't work.

How can I reproduce the results/follow the rest of the tutorial?

Thanks,
Assaf

gene format issues

Dear SCENIC developers,

Thank you very much for sharing the scripts and organizing them into Rmd files! I am very new to bioinformatics and am having trouble running SCENIC because of the gene name format in my datasets. The gene names in my datasets are all upper case, while most databases capitalize only the first letter for mouse genes. I am currently trying to run SCENIC as separate steps to see whether I can apply toupper to the gene list in the database.

Would it be possible to update the SCENIC package so that gene names are capitalized in all the necessary databases?

Thank you!

Wei
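
As a possible workaround on the dataset side, the names can be matched case-insensitively against a reference list and replaced by the reference spelling, which also copes with symbols that do not follow the "first letter capitalized" pattern. A minimal base R sketch; both vectors are made-up examples, not taken from any real database.

```r
# Hypothetical example data: upper-case names from a dataset and
# mouse-style symbols from a reference list.
datasetGenes <- c("SOX2", "PAX6", "GAPDH", "1700001C02RIK", "NOTAGENE")
referenceGenes <- c("Sox2", "Pax6", "Gapdh", "Actb", "1700001C02Rik")

# Match case-insensitively, then take the spelling used by the reference.
idx <- match(toupper(datasetGenes), toupper(referenceGenes))
renamed <- referenceGenes[idx]
names(renamed) <- datasetGenes

# Genes without a counterpart in the reference come back as NA.
unmatched <- datasetGenes[is.na(idx)]
print(renamed)
print(unmatched)
```

If two reference symbols differ only by case, `match()` returns the first one, so such collisions would need to be checked separately.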

error at RcisTarget, R session Aborted

Hi there,
I ran the R code. At the step,
library(RcisTarget)
motifRankings <- importRankings(getDatabases(scenicOptions)
R session aborted.

I searched other issues for an answer. The reason might be that the databases are incomplete or corrupt. Following the tutorial, only dm6-5kb-upstream-full-tx-11species.mc8nr.feather was downloaded to my local folder. I tried to download dm6-regions-11species.mc8nr.feather.

However, I found that there is no region_based folder at this link: https://resources.aertslab.org/cistarget/databases/drosophila_melanogaster/dm6/flybase_r6.02/mc8nr/
I am not sure whether dm6-regions-11species.mc8nr.feather is necessary for SCENIC.
Could the missing dm6-regions-11species.mc8nr.feather be the reason the R session aborted?

Thank you!

runSCENIC_3_scoreCells + error

Hello,

I'm having a hard time troubleshooting this error:

Error in sample.int(length(x), size, replace, prob) : 
  cannot take a sample larger than the population when 'replace = FALSE'

which appears after running runSCENIC_3_scoreCells.

Thank you for your help,
Andrea

Could you specify the example data format in more detail?

I want to run a SCENIC analysis on my own expression set,
but I am having difficulty inputting my data.

Because the example files (CellLabels.tsv, expression.txt) were not provided,
I could not figure out how to create those two files from expression_mRNA_17-Aug-2014.txt (the example file).
Could you specify the example data format in more detail, like the image below (from Granatum)?

image

there's no geneFiltering function

Hi,

I was following the Running SCENIC tutorial, but I encountered an error when I ran:
genesKept <- geneFiltering(exprMat, scenicOptions=scenicOptions, minCountsPerGene=3*.01*ncol(exprMat), minSamples=ncol(exprMat)*.01)

It looks like a function within SCENIC, but I can't find it. Could you please tell me which package it is from? I have loaded all the packages suggested for this tutorial.

Thanks!

Best,
Xingyu

Access to actual motifs as opposed to motif-ID

Hello,

We've successfully run through the tutorial with our dataset and were able to detect interesting regulons. In our system, we would greatly benefit from examining the motif weight/probability matrices (in addition to their IDs or logos), as we would like to quantify CpG presence in those motifs.
We see in the RcisTarget folder structure that the "raw" database (used to generate the logos) is currently unavailable, would you please provide it for any of the working versions, particularly v8?

Thanks a lot!

"breaks not unique" when running "plot_aucTsne" in part 3.1

Thank you for your program.
I've been trying to diagnose this problem to no avail.
I made sure to set "stringsasfactor=FALSE"
and re-installed all the right versions strictly according to the tutorial,
but keep running into this problem.

Please help!

Memory usage while computing AUC

Hi,

I am testing SCENIC on my single-cell data (563 cells, full length scRNAseq, RPKM data).
I am facing a problem of RAM usage while computing the AUC during the runSCENIC_2_createRegulons() step.
The cluster I use does not allow more than 250GB of RAM per node, and this step seems to use more than 250GB:

TERM_SWAP: job killed after reaching LSF swap usage limit.
Exited with signal termination: Killed.

Resource usage summary:

    CPU time :               879.38 sec.
    Max Memory :             255511.41 MB
    Average Memory :         76464.67 MB
    Total Requested Memory : 250000.00 MB
    Delta Memory :           -5511.41 MB
    (Delta: the difference between total requested memory and actual max usage.)
    Max Swap :               266461 MB

    Max Processes :          33
    Max Threads :            34

I am following the tutorial but with my own data, so I use the default settings (except that I use 30 cores).
I tried to run SCENIC with half of the cells and got the same problem. I also tried to run SCENIC with only the genes that are differentially expressed between my cell clusters (~5K genes), and I ran into the same memory issue.

Is it normal that SCENIC uses so much RAM and Swap?

RcisTarget row.names issue

Hi,

I am running into the following error message:

runSCENIC_2_createRegulons(scenicOptions)
09:00 Step 2. Identifying regulons
tfModulesSummary:

 top5perTarget top10perTarget           w005          top50 top50perTarget
           198            531            870            944           1254
          w001
          1323
09:00 RcisTarget: Calculating AUC
Scoring database: [Source file: mm9-500bp-upstream-7species.mc9nr.feather]
Scoring database: [Source file: mm9-tss-centered-10kb-7species.mc9nr.feather]
10:02 RcisTarget: Adding motif annotation
Number of motifs in the initial enrichment: 2080070
Number of motifs annotated to the corresponding TF: 17945
10:09 RcisTarget: Prunning targets
[1] "error in running command"
Number of motifs that support the regulons: 17945
Error in .rowNamesDF<-(x, value = value) :
duplicate 'row.names' are not allowed
Calls: runSCENIC_2_createRegulons ... row.names<- -> row.names<-.data.frame -> .rowNamesDF<-
In addition: Warning message:
non-unique values when setting 'row.names':

sessionInfo()

R version 3.5.0 (2018-04-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server 7.2 (Maipo)

Matrix products: default
BLAS: /sc/wo/app/R/v3.5.0/lib64/R/lib/libRblas.so
LAPACK: /sc/wo/app/R/v3.5.0/lib64/R/lib/libRlapack.so

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base

other attached packages:
[1] SingleCellExperiment_1.2.0 SummarizedExperiment_1.10.1
[3] DelayedArray_0.6.1 BiocParallel_1.14.1
[5] matrixStats_0.53.1 Biobase_2.40.0
[7] GenomicRanges_1.32.3 GenomeInfoDb_1.16.0
[9] IRanges_2.14.10 S4Vectors_0.18.3
[11] BiocGenerics_0.26.0 RColorBrewer_1.1-2
[13] foreach_1.4.4 AUCell_1.2.4
[15] RcisTarget_1.0.2 SCENIC_0.99.0-03

loaded via a namespace (and not attached):
[1] lattice_0.20-35 htmltools_0.3.6 blob_1.1.1
[4] XML_3.98-1.11 rlang_0.2.1 R.oo_1.22.0
[7] later_0.7.3 pillar_1.2.3 DBI_1.0.0
[10] R.utils_2.6.0 bit64_0.9-7 GenomeInfoDbData_1.1.0
[13] zlibbioc_1.26.0 R.methodsS3_1.7.1 codetools_0.2-15
[16] memoise_1.1.0 httpuv_1.4.4.1 AnnotationDbi_1.42.1
[19] GSEABase_1.42.0 Rcpp_0.12.17 xtable_1.8-2
[22] promises_1.0.1 feather_0.3.1 graph_1.58.0
[25] annotate_1.58.0 XVector_0.20.0 mime_0.5
[28] bit_1.1-14 hms_0.4.2 digest_0.6.15
[31] shiny_1.1.0 grid_3.5.0 tools_3.5.0
[34] bitops_1.0-6 magrittr_1.5 RCurl_1.95-4.10
[37] RSQLite_2.1.1 tibble_1.4.2 pkgconfig_2.0.1
[40] Matrix_1.2-14 data.table_1.11.4 iterators_1.0.9
[43] R6_2.2.2 compiler_3.5.0

Any help would be appreciated!

Joe

Old SCENIC

Dear Aerts team,

Thank you for your great job on SCENIC(s) and the regulon approach.

I have been using pySCENIC and R SCENIC/AUCell for a few months. You recently updated many of your packages and their associated workflows and tutorials, completely removing the previous ones. There are many differences from the previous versions. I used to run pySCENIC and then switch to R SCENIC/AUCell for the analyses. I need to reproduce some old results, but I am running into analysis issues. To understand the origin of the problems, I would like to build on the old tutorials you provided.

To maintain compatibility with older workflows, could you please give access to these previous workflows and tutorials?

eg.
https://raw.githubusercontent.com/aertslab/SCENIC/master/inst/doc/Step3.1_NwActivity.html
https://raw.githubusercontent.com/aertslab/SCENIC/master/inst/doc/Step3.2_BinaryNwActivity.html

Many thanks,
Best regards.

GENIE3 installation link is not working

Hi,

When trying to install GENIE3 using the installation links it gives a 404 error.

trying URL 'http://bioconductor.org/packages/release/bioc/src/contrib/GENIE3_1.0.0.tar.gz'
Warning in install.packages :
cannot open URL 'http://bioconductor.org/packages/release/bioc/src/contrib/GENIE3_1.0.0.tar.gz': HTTP status was '404 Not Found'
Error in download.file(p, destfile, method, mode = "wb", ...) :
cannot open URL 'http://bioconductor.org/packages/release/bioc/src/contrib/GENIE3_1.0.0.tar.gz'

error in runSCENIC_3_scoreCells

Hi, SCENIC team
I got an error when I run runSCENIC_3_scoreCells

Error in signif(trhAssignment, 3):
non-numeric argument to mathematical function

Any suggestions as to where the error comes from?
Thanks!

PS: I saw several different errors reported for the runSCENIC_ functions. Could they be due to the OS environment?
I am using

Linux version 2.6.32-696.13.2.el6.x86_64 ([email protected]) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-18) (GCC) ) #1 SMP Thu Oct 5 21:22:16 UTC 2017

Can't load RcisTarget databases

Dear SCENIC team,

I'm following the tutorial for SCENIC and ran into this problem:

library(SCENIC)
dbFiles <- c("https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg38/refseq_r80/mc9nr/gene_based/hg38__refseq-r80__500bp_up_and_100bp_down_tss.mc9nr.feather","https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg38/refseq_r80/mc9nr/gene_based/hg38__refseq-r80__10kb_up_and_down_tss.mc9nr.feather")
for(featherURL in dbFiles){
download.file(featherURL, destfile=basename(featherURL))
descrURL <- gsub(".feather$", ".descr", featherURL)
if(file.exists(descrURL)) download.file(descrURL, destfile=basename(descrURL))
}
org="hgnc"
dbDir="/path/Rscenic"
myDatasetTitle="SCENIC"

scenicOoptions <- initializeScenic(org=org, datasetTitle = myDatasetTitle, dbDir=dbDir)

ERROR MESSAGE:
RcisTarget databases not found. Please, initialize them manually.

Am I missing anything? Both the *.feather files are located in /path/Rcenic

Thank you for your help!

Best,
Gwen
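
One thing worth checking in the loop above: `file.exists()` tests the local filesystem, so calling it on a URL returns FALSE and the `.descr` download is skipped. The report also writes `/path/Rcenic` in the text while `dbDir` is set to `/path/Rscenic`. The base R sketch below only verifies that the file names taken from the URLs are present in a directory. It uses a temporary directory and an empty placeholder file so that it runs anywhere; which file names `initializeScenic` looks for by default is not shown in the report and is not assumed here.

```r
# File names taken from the URLs in the report above.
dbFiles <- c(
  "hg38__refseq-r80__500bp_up_and_100bp_down_tss.mc9nr.feather",
  "hg38__refseq-r80__10kb_up_and_down_tss.mc9nr.feather"
)

# Stand-in for the real database folder (hypothetical; replace with your dbDir).
dbDir <- file.path(tempdir(), "Rscenic")
dir.create(dbDir, showWarnings = FALSE)

# Create one empty placeholder so the check has something to find.
file.create(file.path(dbDir, dbFiles[1]))

# file.exists() checks local paths only; a URL is never "found".
present <- file.exists(file.path(dbDir, dbFiles))
names(present) <- dbFiles
print(present)

missingDbs <- dbFiles[!present]
if (length(missingDbs) > 0) {
  message("Not found in ", dbDir, ": ", paste(missingDbs, collapse = ", "))
}
```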

runSCENIC_3_scoreCells.R Line84 non-numeric error

Hi,

I looked at lines 83-84 of runSCENIC_3_scoreCells.R:

trhAssignment <- getThresholdSelected(cells_AUCellThresholds)
trhAssignment <- signif(trhAssignment,3)

Line 83 returns a 'list', but doesn't 'signif' need a numeric input?
By the way, there are some 'NA' names in 'cells_AUCellThresholds'. Could I simply delete those items?

Thank you, sincerely.
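
The error in this report is what base R gives when `signif()` receives a list instead of a numeric vector. A minimal sketch with made-up thresholds follows; that `getThresholdSelected()` returned a list of single numbers (some with `NA` names) in the reporter's version is taken from the description above and not verified, and whether dropping the `NA`-named entries is appropriate for the analysis is a separate question.

```r
# Made-up stand-in for the list described in the report; one entry has an NA name.
trhAssignment <- list(0.12345, 0.06789, 0.5)
names(trhAssignment) <- c("Sox2", "Pax6", NA)

# signif() on a list fails; capture the message instead of stopping.
errMsg <- tryCatch(signif(trhAssignment, 3),
                   error = function(e) conditionMessage(e))
print(errMsg)

# Drop entries without a usable name, flatten to a numeric vector, then round.
keep <- !is.na(names(trhAssignment))
trhNumeric <- signif(unlist(trhAssignment[keep]), 3)
print(trhNumeric)
```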

Scale of GRNBoost Results

Hi! First, thanks for the detailed installation guide and tutorials; they were very useful! I was hoping you could help me understand how to integrate the data from the GRNBoost procedure, which speeds up the GENIE3 step.
Whereas GENIE3 returns a weighted adjacency matrix whose weights are fractional values between 0 and 1, GRNBoost returns a list with weights ranging up to about 1E9.
I was wondering whether something went wrong in the GRNBoost procedure (though no errors were reported), or whether there is a normalization step I am missing.

Thanks!

Error with AUCell.buildRankings

Hello,

While trying to reproduce the results found in the tutorials here I find that the 'AUCell.buildRankings' function does not exist and therefore I can't run the complete tutorials code.

library(AUCell)
aucellRankings <- AUCell.buildRankings(exprMat, nCores=4, plotStats=TRUE)
Error: could not find function "AUCell.buildRankings"

I made sure I have the latest version of the package installed:

devtools::install_github("aertslab/AUCell", build_vignettes=F)
Skipping install of 'AUCell' from a github remote, the SHA1 (568aa712) has not changed since last install.

How can it be fixed?

Thanks,
Assaf

Handling GRNboost output

Hi there. I've been going through the vignettes and I can't seem to figure out how to get scenicOptions to point to the tsv output file from GRNboost (or a data frame of the results after importing it into R).

Any advice on how to use GRNboost results in the SCENIC pipeline would be greatly appreciated!

bplapply error in runSCENIC_2_createRegulons

The early steps of Scenic appear to have worked well on my dataset of ~350 cells, using a server session with access to 50GB RAM. However, the following error appeared >12 hours into running the runSCENIC_2_createRegulons function:

> runSCENIC_2_createRegulons(scenicOptions)
08:44   Step 2. Identifying regulons
tfModulesSummary:

 top5perTarget          top50 top10perTarget           w005 top50perTarget
           417            443            835            959            963
          w001
           963
08:45   RcisTarget: Calculating AUC
Scoring database:  [Source file: mm9-500bp-upstream-7species.mc9nr.feather]
Scoring database:  [Source file: mm9-tss-centered-10kb-7species.mc9nr.feather]
22:46   RcisTarget: Adding motif annotation
Error: 'bplapply' receive data failed:
  error reading from connection

Has anyone encountered this before?

Issue while running SCENIC step2: regulons ("rankingWrapper")

Hello,

I've read your recent Nature Methods paper with great interest. Our lab studies transcriptional networks, and I am currently trying to apply it to my single-cell data set.

Thanks for the really detailed and easy-to-follow tutorial. I appreciate it a lot.

I've been able to make it as far as part-way through step2: regulons, but then got stuck by an unexpected error which I am not sure how to resolve. When I run the following:

library(RcisTarget)
load("int/2.1_tfModules_forMotifEnrichmet.RData")
org <- "mm9"
if(org=="mm9")
{
library(RcisTarget.mm9.motifDatabases.20k)

# Motif rankings (genes x motifs)
data(mm9_500bpUpstream_motifRanking)
data(mm9_10kbpAroundTss_motifRanking)
motifRankings <- list()
motifRankings[["500bp"]] <- mm9_500bpUpstream_motifRanking
motifRankings[["10kbp"]] <- mm9_10kbpAroundTss_motifRanking

# Motif annotation (TFs)
data(mm9_direct_motifAnnotation)
direct_motifAnnotation <- mm9_direct_motifAnnotation
data(mm9_inferred_motifAnnotation) # optional
inferred_motifAnnotation <- mm9_inferred_motifAnnotation
}
motifs_AUC <- lapply(motifRankings, function(ranking) calcAUC(tfModules, ranking, aucMaxRank=0.01*nrow(ranking@rankings), nCores=4, verbose=FALSE))

This is where I get the following error:
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘getMaxRank’ for signature ‘"rankingWrapper"’
After reading some of the previous posts, I tried the following, thinking that rankingWrapper is the problematic part here:

motifRankings <- rankingWrapper(rankings=mm9_10kbpAroundTss_motifRanking,
rowType="gene", colType="motif", org="mouse", genome="mm9", maxRank = 5000, description="")

However, it is not able to find the function rankingWrapper.....

Any help would be appreciated!

I'm running the script on an HPC cluster, R version 3.4.0, RcisTarget_0.99.5.

Thanks,
Sohyon

runSCENIC_3_scoreCells cut.default error

Breaks-related error during runSCENIC_3_scoreCells:

> runSCENIC_3_scoreCells(scenicOptions, exprMat)
14:21   Step 3. Analyzing the network activity in each individual cell

Number of regulons to evaluate on cells: 336
Biggest (non-extended) regulons:
         Taf1 (2113g)
         Elf1 (2069g)
         Fli1 (1707g)
         Elf4 (1511g)
         Elk3 (1287g)
         Ep300 (1147g)
         Gabpa (1043g)
         Elk4 (993g)
         Elk1 (971g)
         Etv6 (623g)
Quantiles for the number of genes detected by cell:
(Non-detected genes are shuffled at the end of the ranking. Keep it in mind when choosing the threshold for calculating the AUC).
   min     1%     5%    10%    50%   100%
1556.0 1636.8 1782.0 1910.0 2386.0 7637.0
Using 20 cores.
Using 20 cores.
Error in cut.default(a, breaks = 100) : 'breaks' are not unique
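
One possible cause to rule out, stated as an assumption rather than a diagnosis: a regulon whose AUC has no spread across cells (for example, 0 in every cell) leaves `cut()` with a zero-width range to divide into 100 intervals, and whether that fails may depend on the R version. A minimal base R check on a made-up AUC matrix (regulons in rows, cells in columns) that lists such rows:

```r
# Made-up AUC values; "Pax6" and "Myc" are constant across cells.
aucMat <- rbind(
  Sox2 = c(0.10, 0.25, 0.05),
  Pax6 = c(0, 0, 0),
  Myc  = c(0.3, 0.3, 0.3)
)

# A row is "constant" when its minimum and maximum coincide.
isConstant <- apply(aucMat, 1, function(x) diff(range(x)) == 0)
constantRows <- rownames(aucMat)[isConstant]
print(constantRows)
```

If such rows turn up in a real AUC matrix, they are natural candidates to inspect before the plotting step.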

runSCENIC_3_scoreCells error

I get this error when I want to run the third function in the SCENIC wrapper

library(SCENIC)
scenicOptions <- readRDS("int/scenicOptions.Rds")
scenicOptions@settings$verbose <- TRUE
scenicOptions@settings$nCores <- 20
scenicOptions@settings$seed <- 123

runSCENIC_1_coexNetwork2modules(scenicOptions)
runSCENIC_2_createRegulons(scenicOptions)
runSCENIC_3_scoreCells(scenicOptions, exprMat)
13:58	Creating TF modules
             [,1]
nTFs         1006
nTargets    10418
nGeneSets    6036
nLinks    4169753

 top5perTarget          top50 top10perTarget top50perTarget           w001           w005 
           415            529            826           1006           1006           1006 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
    2.0     8.0    24.0   159.8   106.0  3396.0 
    min      1%      5%     10%     50%    100% 
2322.00 2373.64 2777.20 2954.80 3546.00 5308.00 

Error in gzfile(file, "rb") : cannot open the connection

Cell identity tutorial

Dear team. Great tutorial and package!
When I create the regulon heatmaps, is there a way to add my previously determined identities of each cell?
I couldn't find my way around the "cellInfo" and "colVars" objects. What would be the most efficient approach at this stage, given that I have a vector with the identity of each cell (from another source) and the matrix of relevant regulons?
Thanks in advance, Idan.

Can I skip the step of filtering by TFinDB

Hi,

Thank you for the useful package.
In Step 2:

1.3 Keep only the motifs annotated to the initial TF

motifEnrichment_selfMotifs <- motifEnrichment[which(motifEnrichment$TFinDB != ""),, drop=FALSE]

Several TFs that are important for cell development (already verified by my experiments) are filtered out by this step because their TFinDB field is empty.
My research subject is a fairly novel cell system, so the database may lack relevant information.
My question is: can I skip this step? Will it affect the downstream analysis?
Or do you have any other suggestions for modifying the parameters or databases?

Error on the cluster with runSCENIC_2_createRegulons

Hi,
I am trying to run the SCENIC pipeline on a high-performance cluster and keep getting the same error at the "RcisTarget: Adding motif annotation" step. The error is:
Error in serialize(data, node$con, xdr = FALSE) : error writing to connection
Calls: runSCENIC_2_createRegulons ... .send_EXEC -> <Anonymous> -> sendData.SOCK0node -> serialize
I tried running with various numbers of cores and memory allocations on a single node but keep getting the same error. Any suggestions on settings to use on the cluster to avoid this issue?
Thank you!

Differences in data scales

Hi,

The input to GENIE3 can be log-scale normalized UMI counts; however, step 3.1 uses just the normalized counts as input. The wrapper function takes the log-scale normalized UMI counts as input, but never requires the normalized UMI counts or complains that they are missing.

Does this matter? For example, I see that the 'nGene' per cell calculation of 3.1 assumes there are no negative values, but we know this is likely not the case with log-scaled values.

Thank you in advance,
Chris

Error in GENIE run

Hello,

I get the following when I run GENIE:
Error in weightMatrix[regulatorNames, ] <- weightMatrix.reg :
number of items to replace is not a multiple of replacement length

Any idea what it might be?

Thanks,
Assaf

Error in executing the motifs_AUC step

Hi,
I'm trying to implement SCENIC, but regardless of which dataset I use, I keep encountering an error with the following command in Step 2:
motifs_AUC <- lapply(motifRankings, function(ranking) calcAUC(tfModules, ranking, aucMaxRank=0.01*nrow(ranking@rankings), verbose=FALSE))
The error message I get is:
Error in calcAUC(tfModules, ranking, aucMaxRank = 0.01 * nrow(ranking@rankings), :
Fewer than 80% of the genes in the gene sets are included in the rankings. Check wether the gene IDs in the 'rankings' and 'geneSets' match.
Do you have any suggestion or advice on overcoming this problem?
Thank you!
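
The message points at a mismatch between the gene IDs in the gene sets and those in the rankings, so a quick first check is the fraction of gene-set genes that can be found among the ranking's gene names. A minimal base R sketch with made-up names; in a real session the two vectors would come from the loaded `tfModules` and rankings, and how to extract them is left out here because it depends on the package version.

```r
# Made-up example: gene set in upper case, rankings in mouse-style case.
geneSet <- c("SOX2", "PAX6", "MYC", "GAPDH", "TBP")
rankingGenes <- c("Sox2", "Pax6", "Myc", "Gapdh", "Actb")

# Fraction of gene-set genes found with exact matching.
exactOverlap <- mean(geneSet %in% rankingGenes)

# Fraction found when case is ignored; a large gap between the two
# suggests a capitalization (or organism) mismatch.
caseInsensitiveOverlap <- mean(toupper(geneSet) %in% toupper(rankingGenes))

print(c(exact = exactOverlap, ignoringCase = caseInsensitiveOverlap))
```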

matrixWrapper() function

Hi, I'm trying to run SCENIC on an x86_64-pc-linux-gnu (64-bit) machine using R version 3.4.3 (2017-11-30) -- "Kite-Eating Tree". I have several issues:

  1. The SCENIC package would not install, saying it requires an older version of AUCell.

  2. I get as far as lines 2-19 of Step2_Regulons.Rmd, where R throws the following error:
    Warning in doTryCatch(return(expr), name, parentenv, handler) :
    restarting interrupted promise evaluation
    Error in matrixWrapper(matrix = aucMatrix, rowType = "gene-set", colType = "motif", :
    could not find function "matrixWrapper"
    Quitting from lines 2-19 (Step2_Regulons.Rmd)

I could not find the matrixWrapper function on the web. Could you please help?

best,

roman

New release of SCENIC

Hi,

I have problems installing the packages. Some links are not working, and when I install the dependencies from Bioconductor, installing SCENIC with devtools gives an error about incompatible versions (about AUCell, I think, since SCENIC needs the 0.99 version while Bioconductor has 1.0...). When I install the 0.99 version from the link you provided, I still get an error about it while installing SCENIC. I am also not sure whether RcisTarget is available in Bioconductor yet.
You mentioned that all required packages would be available in Bioconductor on the 31st of October, and that SCENIC would be updated in early November to work with the Bioconductor versions.
I don't think this has happened yet!
Can you give a date for the new release that won't have problems with different versions of the required packages?

Many thanks!
Ati

RcisTarget error at motif annotation step

Hi,

I am getting the error below running

runSCENIC_2_createRegulons(scenicOptions)
05:53 Step 2. Identifying regulons
tfModulesSummary:

 top5perTarget top10perTarget           w005          top50 top50perTarget
           198            531            870            944           1254
          w001
          1323
05:53 RcisTarget: Calculating AUC
Scoring database: [Source file: mm9-500bp-upstream-7species.mc9nr.feather]
Scoring database: [Source file: mm9-tss-centered-10kb-7species.mc9nr.feather]
06:41 RcisTarget: Adding motif annotation
Error in .local(x, ...) :
cannot create 286 workers; 125 connections available in this session
Calls: runSCENIC_2_createRegulons ... bplapply -> bplapply -> bpstart -> bpstart -> .local
Execution halted

sessionInfo()

R version 3.5.0 (2018-04-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server 7.2 (Maipo)

Matrix products: default
BLAS: /sc/wo/app/R/v3.5.0/lib64/R/lib/libRblas.so
LAPACK: /sc/wo/app/R/v3.5.0/lib64/R/lib/libRlapack.so

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base

other attached packages:
[1] SingleCellExperiment_1.2.0 SummarizedExperiment_1.10.1
[3] DelayedArray_0.6.1 BiocParallel_1.14.1
[5] matrixStats_0.53.1 Biobase_2.40.0
[7] GenomicRanges_1.32.3 GenomeInfoDb_1.16.0
[9] IRanges_2.14.10 S4Vectors_0.18.3
[11] BiocGenerics_0.26.0 RColorBrewer_1.1-2
[13] foreach_1.4.4 AUCell_1.2.4
[15] RcisTarget_1.0.2 SCENIC_0.99.0-03

loaded via a namespace (and not attached):
[1] lattice_0.20-35 htmltools_0.3.6 blob_1.1.1
[4] XML_3.98-1.11 rlang_0.2.1 R.oo_1.22.0
[7] later_0.7.3 pillar_1.2.3 DBI_1.0.0
[10] R.utils_2.6.0 bit64_0.9-7 GenomeInfoDbData_1.1.0
[13] zlibbioc_1.26.0 R.methodsS3_1.7.1 codetools_0.2-15
[16] memoise_1.1.0 httpuv_1.4.4.1 AnnotationDbi_1.42.1
[19] GSEABase_1.42.0 Rcpp_0.12.17 xtable_1.8-2
[22] promises_1.0.1 feather_0.3.1 graph_1.58.0
[25] annotate_1.58.0 XVector_0.20.0 mime_0.5
[28] bit_1.1-14 hms_0.4.2 digest_0.6.15
[31] shiny_1.1.0 grid_3.5.0 tools_3.5.0
[34] bitops_1.0-6 magrittr_1.5 RCurl_1.95-4.10
[37] RSQLite_2.1.1 tibble_1.4.2 pkgconfig_2.0.1
[40] Matrix_1.2-14 data.table_1.11.4 iterators_1.0.9
[43] R6_2.2.2 compiler_3.5.0

Any help would be appreciated!

Joe
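
The error reports that 286 workers were requested while only 125 connections were available in the session, so one thing to try is requesting fewer workers than that limit. A minimal base R sketch for choosing such a number follows; the safety margin of 5 is an arbitrary assumption, and the final comment reuses the `scenicOptions@settings$nCores` slot that appears in another report on this page.

```r
# Numbers taken from the error message above.
requestedWorkers <- 286
availableConnections <- 125

# Arbitrary margin (assumption) to leave a few connections for files and logs.
margin <- 5

# detectCores() can return NA on some platforms, hence na.rm = TRUE.
nCores <- min(requestedWorkers,
              availableConnections - margin,
              parallel::detectCores(),
              na.rm = TRUE)
print(nCores)

# With a scenicOptions object loaded, the value could then be set with:
# scenicOptions@settings$nCores <- nCores
```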

runSCENIC_3_scoreCells error with semi transparency

Hi,

When I run runSCENIC_3_scoreCells on my data file, it works, but I get the following warning message:

In plot.xy(xy, type, ...): semi transparency is not supported on this device: reported only once per page

I have tried to install and load the Cairo package, but it does not resolve the issue.
Could you give me some advice?

Best,
Sarah

Error in t-SNE plots based on AUC score

Hi,
Thanks for the great tutorial on applications of AUCell, as well as the recently updated website providing a user-friendly interface for SCope!

Using the packages your team has kindly provided, I am very interested in performing intra- and inter-species gene set analysis on my datasets. However, we have run into a small problem when trying to get the code to work, as described below, and would greatly appreciate any assistance or feedback:

We have been trying to graph t-SNE plots based on the AUC score using the same dataset that was published on the AUCell website, more specifically, this line
“passThreshold <- getAUC(cells_AUC)[geneSetName,] > selectedThresholds[geneSetName]
if(sum(passThreshold) >0 )
{
aucSplit <- split(getAUC(cells_AUC)[geneSetName,], passThreshold)

# Assign cell color
cellColor <- c(setNames(colorPal_Neg[cut(aucSplit[[1]], breaks=nBreaks)], names(aucSplit[[1]])), 
setNames(colorPal_Pos[cut(aucSplit[[2]], breaks=nBreaks)], names(aucSplit[[2]])))

# Plot
plot(cellsTsne, main=geneSetName,
sub="Pink/red cells pass the threshold",
col=cellColor[rownames(cellsTsne)], pch=16) 

}”
Here, we ran into an error saying “Error in xy.coords(x,y,xlabel,ylabel,log):’x’ is a list, but does not have components ‘x’ and ‘y.’” Could I check if we are missing out on some lines of analyses?

Thank you so much once again for your time!
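
The quoted message is what base R reports when a plotting function receives a list that has no `x` and `y` components, so one thing to check is whether `cellsTsne` is a list (for example, a full t-SNE result object) rather than a two-column matrix of coordinates. A minimal sketch under that assumption; the list below, with its coordinates in an element named `Y`, is a hypothetical stand-in and not the reporter's actual object.

```r
# Hypothetical stand-in for a t-SNE result stored as a list
# (coordinates in an element named Y, plus other settings).
set.seed(1)
tsneResult <- list(Y = matrix(rnorm(20), ncol = 2), perplexity = 3)

# Passing the whole list to the plotting machinery fails.
errMsg <- tryCatch(grDevices::xy.coords(tsneResult),
                   error = function(e) conditionMessage(e))
print(errMsg)

# Extracting the two-column matrix and naming its rows by cell works.
cellsTsne <- tsneResult$Y
rownames(cellsTsne) <- paste0("cell", seq_len(nrow(cellsTsne)))
coords <- grDevices::xy.coords(cellsTsne)
print(length(coords$x))
```

Naming the rows matters for the quoted code as well, since it indexes the colors with `rownames(cellsTsne)`.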

File name inaccuracies and code block order swapped

Hi,

Thank you for the package development. I wanted to bring to your attention several file naming errors associated with saving/loading intermediate data sets.

In Step 1.2 the intermediate output is saved as: load("int/1.7_tfModules_withCorr.RData")

But in Step 2 the intermediate output is loaded as: load("int/1.8_tfModules_withCorr.RData")

The same type of error exists with file "int/1.4_GENIE3_linkList.RData".

Also in Step 2, the code block order seems to be wrong. The object 'allTFs' was loaded in a much earlier step, but not loaded in Step 2 document, so I had to modify the pipeline.

tfModules_withCorr <- tfModules_withCorr[which(as.character(tfModules_withCorr$TF) %in% allTFs),]

Feather file is old and R crashed

Hi,

I'm trying to repeat the steps as directed. Everything's fine until
library(RcisTarget)
motifRankings <- importRankings(getDatabases(scenicOptions)[[1]])

The error showed "This Feather file is old and will not be readable beyond the 0.3.0 release" and then R crashed.

The feather files I used are
image

My previous code is:
library(SCENIC)
org="hgnc" # or hgnc, or dmel
dbDir="databases" # RcisTarget databases location
myDatasetTitle="human_data" # choose a name for your analysis
scenicOptions <- initializeScenic(org=org, dbDir=dbDir, datasetTitle=myDatasetTitle, nCores=4)

It seems that the feather files were generated by an old version of the feather package and are no longer readable by the newest version (0.3.1). I've checked the feather GitHub repository, and they say there is no backwards compatibility.
Any suggestions?

Thanks!

Can we 'plotTsne' with our own tsne data

Hi all, after completing the GRN analysis (regulon scoring), I would like to plot the GRN scores on a t-SNE plot obtained from the Seurat package. But in the tutorial

https://htmlpreview.github.io/?https://github.com/aertslab/SCENIC/blob/master/inst/doc/SCENIC_Running.html#optional-creatingcomparing-t-snes

I found that the 'plotTsne_compareSettings' function seems to plot based on t-SNE results calculated by SCENIC itself. I can't find where SCENIC stores the t-SNE data (in Seurat, it is stored in object@dr$tsne@cell.embeddings).
Does anyone know how to plot 'GRN from SCENIC' + 't-SNE from Seurat'? Thanks all!
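Independently of any SCENIC plotting function, externally computed coordinates can be coloured by a per-cell score with base R graphics. The following is a minimal sketch with simulated data: `tsne_coords` stands in for a cells-by-2 coordinate matrix exported from another tool, `regulon_activity` stands in for one regulon's score per cell, and `score_to_colors` is a hypothetical helper. The key step is aligning the two objects by cell name before plotting.

```r
set.seed(1)
n_cells <- 50

# Simulated stand-in for coordinates exported from another tool (cells x 2 matrix).
tsne_coords <- matrix(rnorm(n_cells * 2), ncol = 2,
                      dimnames = list(paste0("cell", seq_len(n_cells)),
                                      c("tSNE_1", "tSNE_2")))

# Simulated stand-in for one regulon's activity per cell,
# deliberately stored in a different cell order.
regulon_activity <- setNames(runif(n_cells), sample(rownames(tsne_coords)))

# Hypothetical helper: map numeric scores onto a colour gradient.
score_to_colors <- function(scores, n_bins = 20,
                            palette = c("grey90", "orange", "red3")) {
  ramp <- colorRampPalette(palette)(n_bins)
  bins <- cut(scores, breaks = n_bins, labels = FALSE, include.lowest = TRUE)
  setNames(ramp[bins], names(scores))
}

# Align both objects by cell name, then plot.
shared_cells <- intersect(rownames(tsne_coords), names(regulon_activity))
cell_colors <- score_to_colors(regulon_activity[shared_cells])
plot(tsne_coords[shared_cells, ], col = cell_colors, pch = 16,
     main = "Regulon activity (simulated)")
```

With real data, the scores would be one row of the regulon activity matrix and the coordinates would be the embedding extracted from the Seurat object. Both must use the same cell identifiers.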

utilizing PANDA instead of GRNBoost2

Is it possible to utilize output from PANDA for the initial GRN links instead of GENIE3 or GRNBoost2? At first glance, the data formats seem fairly similar, but I'm not clear on whether any conversion is required or not.
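If a conversion turns out to be needed, it may amount to renaming and filtering columns. The sketch below rests on two assumptions that should be checked against the installed versions: that the PANDA export is an edge table with columns named tf, gene, motif and force, and that the downstream steps expect a link list with columns TF, Target and weight. `panda_to_linklist` is a hypothetical helper and the edge table is toy data. PANDA edge weights can be negative and sit on a different scale from GENIE3 importance scores, so any weight thresholds applied later would likely need revisiting.

```r
# Toy edge table; the column names (tf, gene, motif, force) are an ASSUMPTION
# about the PANDA export format.
panda_edges <- data.frame(
  tf    = c("TF_A", "TF_A", "TF_B", "TF_B"),
  gene  = c("gene1", "gene2", "gene1", "gene3"),
  motif = c(1, 0, 1, 0),
  force = c(2.5, -0.7, 0.3, 1.1),
  stringsAsFactors = FALSE
)

# Hypothetical helper: reshape the edge table into a TF / Target / weight link list.
# Edges at or below min_weight are dropped; this cutoff is an illustrative choice.
panda_to_linklist <- function(edges, min_weight = 0) {
  link_list <- data.frame(TF = edges$tf, Target = edges$gene,
                          weight = edges$force, stringsAsFactors = FALSE)
  link_list <- link_list[link_list$weight > min_weight, , drop = FALSE]
  link_list <- link_list[order(link_list$weight, decreasing = TRUE), , drop = FALSE]
  rownames(link_list) <- NULL
  link_list
}

linkList <- panda_to_linklist(panda_edges)
print(linkList)
```

Whether edges scored this way behave sensibly in the later module-building steps is not something this sketch can confirm.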

Step 2 Error

Hi,

I keep running into the error "failed to stop 'SOCKcluster' cluster: ignoring SIGPIPE signal" while running step 2 of SCENIC. The error appears after the message "RcisTarget: Adding motif annotation". Could you please advise on a way to work around it?

Thanks

Error on runSCENIC_3_scoreCells

I am getting the following error when I run runSCENIC_3_scoreCells. Could you please tell me how to fix it?

runSCENIC_3_scoreCells(scenicOptions, exprMat_filtered)
22:36 Step 3. Analyzing the network activity in each individual cell

Number of regulons to evaluate on cells: 415
Biggest (non-extended) regulons:
EGR1 (4140g)
ELF1 (4100g)
ATF3 (4088g)
ETS2 (3401g)
BCLAF1 (2974g)
ETV6 (2873g)
POLR2A (2530g)
ETV5 (2436g)
ELK3 (1792g)
REL (1741g)
Quantiles for the number of genes detected by cell:
(Non-detected genes are shuffled at the end of the ranking.Keep in mind when choosing the threshold for calculating the AUC).
min 1% 5% 10% 50% 100%
159.00 325.66 717.30 842.20 1863.00 5663.00
Using 2 cores.
Using 2 cores.
Error in sample.int(length(x), size, replace, prob) :
cannot take a sample larger than the population when 'replace = FALSE'
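The final message comes from base R's sampling routine: it is raised whenever more items are requested than exist and sampling is done without replacement. Which call inside the pipeline triggers it is not visible from this log, so the sketch below only reproduces the message in isolation and shows the usual guard of capping the requested size. The names `population` and `requested` are illustrative.

```r
population <- paste0("cell", 1:5)  # only 5 items available
requested <- 10                    # more than the population size

# Reproduce the error and capture its message instead of stopping the script.
msg <- tryCatch({
  sample(population, requested)
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# Guard: never request more items than are available.
safe_n <- min(requested, length(population))
subset_cells <- sample(population, safe_n)
print(length(subset_cells))
```

In practice this suggests checking whether some step draws a fixed number of cells or genes from an input that is smaller than that number.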

error in SCENIC run for Rtsne (duplicate error)

Error.txt

Hi,

While running SCENIC, I got the error shown in the attached text file.

So, I need to pass the argument

check_duplicates = FALSE
to "Rtsne"

I tried to add this argument in the runSCENIC.R script (as attached) but could not get it to work.

Could you tell me how I can do that while running the wrapper function?

thanks
runSCENIC.txt
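An alternative to changing the wrapper is to find and drop duplicated rows in the matrix before it reaches the t-SNE step. The base R sketch below uses a toy matrix and does not call Rtsne itself. How the SCENIC wrapper forwards arguments to Rtsne is not covered here.

```r
# Toy matrix: cellA and cellC have identical values, which a duplicate check would flag.
mat <- rbind(cellA = c(0.1, 0.2),
             cellB = c(0.3, 0.4),
             cellC = c(0.1, 0.2))

# duplicated() on a matrix compares whole rows by default.
dup <- duplicated(mat)
print(rownames(mat)[dup])

# Keep only the first occurrence of each distinct row.
unique_mat <- mat[!dup, , drop = FALSE]
print(dim(unique_mat))

# With the Rtsne package installed, either of these would then avoid the duplicate check failing:
#   Rtsne::Rtsne(unique_mat, ...)                     # duplicates removed beforehand
#   Rtsne::Rtsne(mat, check_duplicates = FALSE, ...)  # check disabled explicitly
```

Dropping rows changes which cells appear in the embedding, so disabling the check may be preferable when every cell has to be kept.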

why not make into one single command line? not very user-friendly

hi there,

I ran through the code. I am wondering why we have to copy and paste so much code, when you could hide all of the intermediate steps and package them into one single command?

These days, good bioinformatics software is user-friendly. It should only require defining the input, explain the output very well, and hide all of the controlling details in the command's parameters. I am pretty sure your code can be packaged that way.

The current way is not tidy and not well structured. It also does not output enough running information, even when VERBOSE is set to TRUE.

Chancellor

Error at addSignificantGenes (rn %in% geneSet : object 'rn' not found)

Hi,

I ran into this error when running addSignificantGenes() in Step 2:

Error in rn %in% geneSet : object 'rn' not found
Calls: addSignificantGenes ... getSignificantGenes -> subset -> subset -> subset.default -> %in%

I'm running it on
Linux 64bit (CentOS 6.5)
R version 3.4.1

other attached packages:
RcisTarget.hg19.motifDatabases.20k_0.1.1
reshape2_1.4.3
data.table_1.10.4-3
Biobase_2.36.2
BiocGenerics_0.22.0
doMC_1.3.5
doParallel_1.0.11
iterators_1.0.8
doRNG_1.6.6
rngtools_1.2.4
pkgmaker_0.22
registry_0.3
foreach_1.4.4
SCENIC_0.1.7
AUCell_0.99.5
RcisTarget_0.99.0
GENIE3_1.0.0

Any idea how I should fix this?

Thank you in advance :)

Error when using calcAUC (could not find function "matrixWrapper")

Hi,

I have been following the tutorial on GitHub and am currently stuck at calcAUC() in Step2_Regulons.
I've tried running it a few times using the code below:

motifs_AUC <- lapply(motifRankings, function(ranking) calcAUC(tfModules, ranking, aucMaxRank=0.01*nrow(ranking@rankings), nCores=12, verbose=FALSE))

but I seem to run into the same error

Error in matrixWrapper(matrix = aucMatrix, rowType = "gene-set", colType = "motif", :
could not find function "matrixWrapper"

I'm running it on
- Linux 64bit (CentOS 6.5)
- R version 3.4.1
- RcisTarget_0.99.0
- AUCell_1.0.0

Any idea how I can fix this error?

Thanks,
Justine
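A "could not find function" error raised inside a package function can be a sign that two installed packages are out of step with each other. That reading is an assumption here, not a confirmed diagnosis. A quick check is to print the installed versions and test whether the missing function is defined in either package's namespace. `has_function` below is a hypothetical helper built on base R only, and the loop simply reports "not installed" when a package is absent.

```r
# Hypothetical helper: TRUE/FALSE if the package defines the function, NA if not installed.
has_function <- function(pkg, fn) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(NA)
  }
  exists(fn, envir = asNamespace(pkg), inherits = FALSE)
}

# Sanity check on a package that ships with R.
print(has_function("stats", "median"))

# Report version and presence of the missing function for the two packages in question.
for (pkg in c("AUCell", "RcisTarget")) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    cat(pkg, as.character(utils::packageVersion(pkg)),
        "defines matrixWrapper:", has_function(pkg, "matrixWrapper"), "\n")
  } else {
    cat(pkg, "is not installed\n")
  }
}
```

If neither package defines the function, installing versions of both packages that were released together would be the natural next thing to try.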

Plots with expression of target genes

Hi,
I think I have many transcription factors with repression activity.
Is there a way to generate plots with the expression of target genes of a given TF to show this negative association?
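One simple way to look at such a relationship, outside of any SCENIC function, is to plot the TF's expression against the mean expression of its targets across cells and report a rank correlation. The sketch below uses simulated data in which the targets are constructed to decrease as the TF increases. All object names are illustrative, and real input would be rows of the expression matrix for the TF and for the genes in its regulon.

```r
set.seed(42)
n_cells <- 100

# Simulated TF expression per cell.
tf_expr <- rnorm(n_cells, mean = 5, sd = 1)

# Five simulated targets that are built to go down as the TF goes up.
targets <- sapply(1:5, function(i) 10 - 0.8 * tf_expr + rnorm(n_cells, sd = 0.5))
colnames(targets) <- paste0("target", 1:5)

# Summarise the targets per cell and measure the association with the TF.
mean_target <- rowMeans(targets)
rho <- cor(tf_expr, mean_target, method = "spearman")

plot(tf_expr, mean_target, pch = 16,
     xlab = "TF expression", ylab = "Mean target expression",
     main = sprintf("Spearman rho = %.2f (simulated)", rho))
abline(lm(mean_target ~ tf_expr), lty = 2)  # linear trend line for orientation
```

A negative correlation in such a plot is compatible with repression but does not prove it, since co-varying cell types can produce the same pattern.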
