
sankaranlab / scavenge


SCAVENGE is a method to optimize the inference of functional and genetic associations to specific cells at single-cell resolution.

License: GNU General Public License v3.0

R 100.00%
cell-heterogeneity cell-state cell-trajectory complex-traits disease-mechanisms fine-mapping network-propagation random-walk scatac-seq single-cell-analysis

scavenge's Introduction

R build status License: GPL (>= 2)

SCAVENGE: Identifying genetic trait/phenotype relevant cell type/state at single cell resolution

Last updated: Dec-01-2022

Overview

Co-localization approaches using genetic variants and single-cell epigenomic data are unfortunately uninformative for many cells given the extensive sparsity across single-cell profiles. Therefore, only a few cells from the truly relevant population demonstrate reliable phenotypic relevance. The global high-dimensional features of individual single cells are sufficient to represent the underlying cell identities or states, which enables the relationships among such cells to be readily inferred. By taking advantage of these attributes, SCAVENGE identifies the most phenotypically-enriched cells by co-localization and explores the transitive associations across the cell-to-cell network to assign each cell a probability representing the cell’s relevance to those phenotype-enriched cells via network propagation.
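
The propagation idea described above can be illustrated with a random walk with restart on a small toy network. This is a minimal sketch in base R under stated assumptions, not the package's implementation: the 5-cell path graph, the function name `random_walk_restart`, and the parameters `gamma` (restart probability), `tol`, and `max_iter` are all made up for illustration.

```r
# Toy cell-to-cell network: 5 cells connected in a chain (1-2-3-4-5).
# In practice the network would come from something like a kNN graph.
adj <- matrix(0, 5, 5)
edges <- rbind(c(1, 2), c(2, 3), c(3, 4), c(4, 5))
adj[edges] <- 1
adj[edges[, 2:1]] <- 1  # make the adjacency matrix symmetric

# Column-normalise so each column holds transition probabilities summing to 1.
trans <- sweep(adj, 2, colSums(adj), "/")

# Hypothetical helper: random walk with restart from a set of seed cells.
random_walk_restart <- function(trans, seeds, gamma = 0.05,
                                tol = 1e-5, max_iter = 1000) {
  p0 <- numeric(nrow(trans))
  p0[seeds] <- 1 / length(seeds)  # start with all mass on the seed cells
  p <- p0
  for (i in seq_len(max_iter)) {
    p_next <- (1 - gamma) * as.vector(trans %*% p) + gamma * p0
    delta <- sum(abs(p_next - p))  # change between successive iterations
    p <- p_next
    if (delta < tol) break         # stop once the scores are stationary
  }
  p
}

# Cell 1 plays the role of a phenotype-enriched seed cell.
np_score <- random_walk_restart(trans, seeds = 1)
round(np_score, 3)
```

In this toy chain, cells closer to the seed end up with higher scores than equally connected cells further away, which is the intuition behind assigning each cell a relevance probability through the network.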

We developed a novel enrichment method, SCAVENGE (Single Cell Analysis of Variant Enrichment through Network propagation of GEnomic data), that can discriminate between closely related cell types/states and score single cells for GWAS enrichment.

Schematic view of SCAVENGE

We’ve implemented SCAVENGE as an R package for computing single-cell based GWAS enrichments from fine-mapped posterior probabilities and quantitative epigenomic data (i.e. scATAC-seq and potentially other single-cell epigenome profiling methods).
As single-cell genomic datasets grow in volume, we expect SCAVENGE will have great promise for efficiently uncovering relevant cell populations for more phenotypes or functions in different scenarios, which may expand beyond the complex trait genetic variants we have examined here. We welcome you to use SCAVENGE to discover more phenotype relevant cells!

Installation:

The package can be installed directly from GitHub by typing the following in an R console:

if(!require("remotes")) install.packages("remotes")

remotes::install_github("https://github.com/sankaranlab/SCAVENGE")
library(SCAVENGE)

Documentation

This web resource and vignette compilation shows how to reproduce the results of a SCAVENGE analysis of monocyte count on a 10X PBMC dataset.

Tutorials

See the [Wiki page] for extra information such as preparing your GWAS data for SCAVENGE (finemapping):

  • [SCAVENGE] Preparing your GWAS data for finemapping
  • [SCAVENGE] Preparing your scATAC-seq data
  • [SCAVENGE] Rule of thumb of SCAVENGE analysis and interpretation
  • [SCAVENGE-L] SCAVENGE-L method for single cell (mt)DNA mutation-based lineage tracing analysis

FAQs

  • What input data are accepted for SCAVENGE analysis?
    A: The count matrix of scATAC-seq data and fine-mapped variants from GWAS summary statistics (we provide a tutorial for fine-mapping analysis from GWAS on the [Wiki page]). Theoretically, GWAS summary statistics can be used as input, but we do not recommend it because LD can obscure causal cell type identification.
  • Can I use scRNA-seq instead of scATAC-seq?
    A: SCAVENGE analysis from scRNA-seq is not currently feasible. We are actively developing this tool to be scalable to scRNA-seq; please stay tuned.
  • How can I request a new feature?
    A: We have opened a [Discussions] page; please feel free to discuss and post your ideas there.

Citation

If you used or adapted SCAVENGE in your study, please cite our paper [Nat Biotechnol] || [PubMed].
Variant to function mapping at single-cell resolution through network propagation.

Contact

If you run into issues and would like to report them, you can submit an Issue.
Alternatively, you can contact the authors: fyu{at}broadinstitute.org, lcato{at}broadinstitute.org, cweng{at}wi.mit.edu, and/or sankaran{at}broadinstitute.org.

scavenge's People

Contributors

bschilder, fl-yu, ldcato


scavenge's Issues

Two analyses of the same code yielded different results

Hello,

I ran the same code twice without changing anything, and I set the same seed number as in your examples.
But I got a different result for "true_cell_top_idx" for each cell. Is this a bug? Or is it because of the permutation test method?

Best
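
One general point about seeds in R may be relevant here, although the cause in this particular case is not established in the thread: a seed fixes the random number stream only from the point where it is set, so any stochastic step that runs after other random draws sees a different state. The sketch below uses a hypothetical stand-in function `permute_once` and an arbitrary seed value; it is not SCAVENGE code.

```r
# Hypothetical stand-in for any stochastic step, e.g. one permutation.
permute_once <- function(x) sample(x)

# Setting the same seed immediately before each call gives identical draws.
set.seed(9527)  # arbitrary value chosen for this example
run1 <- permute_once(1:10)
set.seed(9527)
run2 <- permute_once(1:10)
identical(run1, run2)

# Without resetting the seed, the generator state has advanced, so a third
# call is not expected to reproduce run1.
run3 <- permute_once(1:10)
```

If results still differ with the seed reset right before the stochastic call, other sources such as parallel workers with their own random streams would be worth checking.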

TF-IDF issue

Hello Fulong,
SCAVENGE is a good tool for deciphering the function of genetic variants at the single-cell level.
I have 2 questions about the algorithm of SCAVENGE.
1: What is the binarized sparse matrix?
2: You used TF-IDF to calculate the weight for each feature. The IDF in your paper seems to look a little different from log(N/(dfi+1)).
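
To make the two terms in this question concrete, here is a small sketch in base R. It assumes that "binarized" means every nonzero count is replaced by 1, and it uses the IDF form quoted in the question, log(N/(dfi+1)); the term-frequency definition and all variable names are assumptions for illustration, and the paper's exact weighting may differ. A dense matrix is used for readability where a real dataset would use a sparse one.

```r
# Toy peak-by-cell count matrix (rows = peaks/features, columns = cells).
counts <- matrix(c(0, 2, 0, 1,
                   3, 0, 0, 0,
                   1, 1, 4, 2),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(paste0("peak", 1:3), paste0("cell", 1:4)))

# Binarise: 1 if the peak has any reads in the cell, 0 otherwise.
binary <- (counts > 0) * 1

n_cells <- ncol(binary)         # N in the formula
df <- rowSums(binary)           # df_i: number of cells in which peak i is open
idf <- log(n_cells / (df + 1))  # IDF variant quoted in the question

# Assumed term frequency: each cell's entries divided by its number of open peaks.
tf <- sweep(binary, 2, colSums(binary), "/")

# Multiply every row (peak) by its IDF weight; idf recycles down the rows.
tfidf <- tf * idf
round(tfidf, 3)
```

With this IDF form, a peak that is open in every cell (peak3 here) gets a negative weight, which is one reason implementations often use slightly different variants.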


Fine mapping to reproduce COVID19 results

Hi,
I am trying to reproduce the fine-mapping results for your COVID19 analysis, but I am not able to get the same sentinel SNPs as reported. Would it be possible for you to share the reference panel and the script you used?
Thanks

Recommended R Version and chromVAR installation issues

Hi Sankaran Lab,

Is there an R version that you would recommend/is a dependency? I've been trying to install some of the dependencies but I'm running into problems mainly with chromVAR. I've been following instructions on Bioconductor to install chromVAR (https://bioconductor.org/packages/release/bioc/html/chromVAR.html) and it does say that R >=3.4 is ok but I'm wondering if it's better to use an R version closer to R 3.4.0? Currently I don't have the error warnings to share but I'm in the process of re-running the installation of chromVAR with R 3.5.1. I'll update soon with those error messages.

Thanks in advance,

Joaquin

Input issues

Hi Fulong,

Thanks for developing SCAVENGE! I am trying to do cell enrichment with it but ran into an input issue.

I extracted the peak matrix from our ArchR object with
peakmatrix <- getMatrixFromProject(projHeart, useMatrix = "PeakMatrix")
and used it as the input of SCAVENGE following your tutorial.
The projHeart we used was processed similarly to your COVID data in the SCAVENGE paper. The fine-mapping data we are currently using are the tutorial data in the SCAVENGE package.

When I tried to run computeWeightedDeviations(peakmatrix, trait_import), it returned

Error in (function (cond)  : 
  error in evaluating the argument 'object' in selecting a method for function 'getFragmentsPerPeak': "counts" %in% assayNames(object) is not TRUE

I am wondering whether I extracted the peak matrix correctly from the ArchR object. Do you have any suggestions on how to extract the input for SCAVENGE?

Any suggestions would be appreciated!
Thank you!
Xiaotong
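
The error message above says that no assay named "counts" was found in the object. One possible workaround, sketched here under the assumption that the peak matrix is a SummarizedExperiment whose assay simply carries a different name, is to rename that assay to "counts". The helper `with_counts_name` is hypothetical and works on the vector of assay names only, so the sketch runs in plain R; whether this resolves the issue for a given ArchR project has not been verified.

```r
# Hypothetical helper: given the assay names of an object, return names in
# which the first assay is called "counts", unless one already has that name.
with_counts_name <- function(assay_names) {
  if (!"counts" %in% assay_names) {
    assay_names[1] <- "counts"
  }
  assay_names
}

with_counts_name("PeakMatrix")          # first assay renamed
with_counts_name(c("counts", "other"))  # left unchanged

# Applying it to a real object needs the SummarizedExperiment package, so the
# call is left commented here:
# SummarizedExperiment::assayNames(peakmatrix) <-
#   with_counts_name(SummarizedExperiment::assayNames(peakmatrix))
```

Only the label changes; the matrix itself is untouched, so it is still worth checking that the assay really holds raw peak counts.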

error at get_sigcell_simple()

Hi, thanks for this great tool! I am facing an error during permutation analysis:

Stationary step: 74
Stationary Delta: 9.83820972996253e-06
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, :
arguments imply differing number of rows: 0, 199764
Calls: get_sigcell_simple ... as.data.frame -> as.data.frame.list -> do.call ->
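
The message itself is what base R's data.frame() reports when one column has length 0 and another has N entries. Which input is empty in this run cannot be told from the log, so the sketch below only reproduces the same class of error with made-up columns (`cell`, `score`) in plain R.

```r
# Combining a zero-length column with a 3-element column fails in data.frame().
result <- tryCatch(
  data.frame(cell = character(0), score = c(0.1, 0.2, 0.3)),
  error = function(e) conditionMessage(e)
)
result  # the captured error message, reporting row counts 0 and 3

# A simple guard: compare lengths before building the data frame.
cols <- list(cell = character(0), score = c(0.1, 0.2, 0.3))
lens <- lengths(cols)
names(lens)[lens == 0]  # names of the empty inputs
```

If the same pattern applies here, checking the lengths of the vectors passed into the permutation step (for example, whether any seed cells were selected) may narrow down which one is empty.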

How to make trait_file from GWAS summary statistics ?

Hello,
First, thanks for developing a new method for overcoming the data sparsity of scATAC!
We were suffering from the same issue...

My pipeline uses ArchR, and I successfully exported the peak matrix and imported it into SCAVENGE.
I have 2 questions about running SCAVENGE on my sample.

  1. My ArchR object was made using hg38, so my peak matrix is mapped to hg38.
    If I lift over the GWAS SNPs from hg19 to hg38, is it possible to run SCAVENGE on my hg38-mapped peak matrix?

  2. I'm trying to generate a trait_file from the GWAS summary statistics file that I want to use.
    I understand that to run SCAVENGE, I have to calculate each GWAS SNP's posterior probability of causality.

I tried to find out how to generate a trait_file from a GWAS summary statistics file, but I couldn't find it...
So, are there any instructions for it?

Thanks.
