supercell's Introduction

Coarse-graining of large single-cell RNA-seq data into metacells

SuperCell is an R package for coarse-graining large single-cell RNA-seq data into metacells and performing downstream analysis at the metacell level.

The exponential growth in the size of scRNA-seq datasets is a major hurdle for downstream analyses. One way to facilitate the analysis of large and noisy scRNA-seq data is to merge transcriptionally highly similar cells into metacells. This concept was first introduced by Baran et al., 2019 (MetaCell) and by Iacono et al., 2018 (bigSCale). More recent methods for building metacells are described in Ben-Kiki et al., 2022 (MetaCell2), Bilous et al., 2022 (SuperCell) and Persad et al., 2022 (SEACells). Despite some differences in implementation, all of these methods are network-based and can be summarized as follows:

1. A single-cell network is computed based on cell-to-cell similarity (in transcriptomic space).

2. Highly similar cells are identified as those forming dense regions in the single-cell network and are merged into metacells (coarse-graining).

3. Transcriptomic information within each metacell is combined (average or sum).

4. Metacell data are used for the downstream analyses instead of the large-scale single-cell data.
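The four steps above can be sketched on simulated data. This is a minimal illustration under stated assumptions, not SuperCell's own implementation: the expression matrix is random, the choices of 10 principal components, `k = 5` neighbours and walktrap community detection are illustrative, and all variable names are hypothetical. It uses igraph, which is among the dependencies listed in this README.

```r
# Sketch of the four metacell steps on simulated data (illustrative only).
set.seed(1)
n_genes <- 50
n_cells <- 300
gamma   <- 20   # graining level: cells per metacell on average
k       <- 5    # neighbours per cell (assumed value)

# Simulated expression matrix: genes in rows, cells in columns.
expr <- matrix(rnorm(n_genes * n_cells), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)),
                               paste0("cell", seq_len(n_cells))))

# Step 1: k-nearest-neighbour network in a reduced (PCA) space.
pcs <- prcomp(t(expr), center = TRUE)$x[, 1:10]
d   <- as.matrix(dist(pcs))
# For each cell, indices of its k closest cells (rank 1 is the cell itself).
nn  <- apply(d, 1, function(x) order(x)[2:(k + 1)])   # k x n_cells
edges <- cbind(rep(seq_len(n_cells), each = k), as.vector(nn))
g <- igraph::simplify(igraph::graph_from_edgelist(edges, directed = FALSE))

# Step 2: merge densely connected cells; the dendrogram is cut so that
# the number of groups matches the requested graining level.
n_metacells <- round(n_cells / gamma)
wt <- igraph::cluster_walktrap(g)
membership <- igraph::cut_at(wt, no = n_metacells)

# Step 3: combine profiles within each metacell (average here).
sums <- rowsum(t(expr), group = membership)   # metacells x genes
size <- as.vector(table(membership))          # cells per metacell
metacell_expr <- t(sums / size)               # genes x metacells

# Step 4: downstream analyses run on the much smaller matrix.
print(dim(expr))
print(dim(metacell_expr))
```

Keeping `size` alongside the averaged matrix matters in practice, because metacells contain different numbers of cells and downstream statistics may need to weight them accordingly.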

Unlike clustering, metacell construction does not aim to identify large groups of cells that comprehensively capture biological concepts such as cell types, but to merge cells that share highly similar profiles and may carry redundant information. Metacells therefore represent a compromise structure that optimally removes redundant information in scRNA-seq data while preserving the biologically relevant heterogeneity.

An important concept when building metacells is the graining level (γ), which we define as the ratio between the number of single cells in the initial data and the number of metacells. We suggest using γ between 10 and 50, which significantly reduces the computational resources needed for the downstream analyses while preserving most of the results of the initial (i.e., single-cell) analyses.
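As a worked example of this definition, the number of metacells is the number of single cells divided by γ. The snippet below assumes a hypothetical dataset of 100,000 cells; the variable names are illustrative only.

```r
# Number of metacells implied by a graining level, for a hypothetical dataset.
n_cells <- 100000            # assumed dataset size
gammas  <- c(10, 20, 50)     # graining levels in the suggested range

n_metacells <- round(n_cells / gammas)
print(data.frame(gamma = gammas, n_metacells = n_metacells))
```

At γ = 50 the downstream analysis handles 2,000 metacells instead of 100,000 cells, which is where the savings in memory and run time come from.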

Installation

SuperCell requires igraph, RANN, WeightedCluster, corpcor, weights, Hmisc, Matrix, matrixStats, plyr, irlba, grDevices, patchwork, ggplot2. SuperCell uses velocyto.R for RNA velocity.

install.packages("igraph")
install.packages("RANN")
install.packages("WeightedCluster")
install.packages("corpcor")
install.packages("weights")
install.packages("Hmisc")
install.packages("Matrix")
install.packages("matrixStats")
install.packages("patchwork")
install.packages("plyr")
install.packages("irlba")
install.packages("ggplot2")
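To avoid reinstalling packages that are already present, the dependency list can be checked first. This is a minimal sketch: `find_missing` is a hypothetical helper, not part of SuperCell, and the package list is copied from the requirements stated above (grDevices is omitted because it ships with base R).

```r
# Dependencies named in this README (grDevices ships with base R).
deps <- c("igraph", "RANN", "WeightedCluster", "corpcor", "weights", "Hmisc",
          "Matrix", "matrixStats", "plyr", "irlba", "patchwork", "ggplot2")

# Hypothetical helper: return the packages whose namespace cannot be loaded.
find_missing <- function(pkgs) {
  ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!ok]
}

missing_pkgs <- find_missing(deps)
print(missing_pkgs)

# Uncomment to install only what is missing:
# if (length(missing_pkgs) > 0) install.packages(missing_pkgs)
```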

Installing the SuperCell package from GitHub

if (!requireNamespace("remotes")) install.packages("remotes")
remotes::install_github("GfellerLab/SuperCell")

library(SuperCell)

Examples

  1. Building and analyzing metacells with SuperCell
  2. RNA velocity applied to SuperCell object
  3. Building metacells with SuperCell and analyzing them with a standard Seurat pipeline
  4. Data integration of metacells built with SuperCell

SuperCell is developed by the group of David Gfeller at University of Lausanne.

SuperCell can be used freely by academic groups for non-commercial purposes (see license). The product is provided free of charge, and, therefore, on an “as is” basis, without warranty of any kind.

FOR-PROFIT USERS

If you plan to use SuperCell or any data provided with the script in any for-profit application, you are required to obtain a separate license. To do so, please contact [email protected] at the Ludwig Institute for Cancer Research Ltd.

If required, FOR-PROFIT USERS are also expected to have proper licenses for the tools used in SuperCell, including the R packages igraph, RANN, WeightedCluster, corpcor, weights, Hmisc, Matrix, plyr, irlba, grDevices, patchwork, ggplot2 and velocyto.R.

For scientific questions, please contact Mariia Bilous ([email protected]) or David Gfeller ([email protected]).

How to cite

If you use SuperCell in a publication, please cite: Bilous et al. Metacells untangle large and complex single-cell transcriptome networks, BMC Bioinformatics (2022).

supercell's People

Contributors

leonardherault, mariiabilous, shians, xloctran

supercell's Issues

chore: Code of Conduct

Hey, I've seen this research on [BC]2 lately, nice work!
I may suggest some improvements from the software engineering perspective:

It would be nice to include a Code of Conduct for this repository.
I usually use this one, but it's entirely up to you.

Conda package

Hi!
First off, thank you for this nice package and for the active development!

Since I want to use your package in my work, I wanted to isolate it in a conda environment. I therefore forked your code, packaged it and uploaded it to Anaconda. I made sure to reference your work and your licence there. If you have any issue with this, let me know and I will remove the package.

You can find the package here

Best wishes!
Kevin

ci: static code analysis

Hey, I've seen this research on [BC]2 lately, nice work!
I may suggest some improvements from the software engineering perspective:

With a GH Actions mechanism set up for this repo (see #8), you could add another CI workflow for code linting.
That way you would ensure that the codebase remains high quality in terms of R code standards.

Take a look at these packages:

It would be a good idea to have a simple YAML CI workflow which would scan your R scripts on commit push/pull request and fail in case any modifications should be applied.

chore: GitHub templates

Hey, I've seen this research on [BC]2 lately, nice work!
I may suggest some improvements from the software engineering perspective:

To make it easier for others to interact with this repo, you could add GitHub templates for issues (bug report / feature request) and pull requests.
I usually use these ones, but feel free to construct your own:

ci: automated test run on a sampled dataset

Hey, I've seen this research on [BC]2 lately, nice work!
I may suggest some improvements from the software engineering perspective:

Since this is a public repo, unlimited minutes of GitHub Actions for CI/CD are available.

I see that the README already contains some instructions on how to test the functionality of this package on an example dataset. It would be very nice to set up a simple GH Actions workflow that triggers a test run with every commit pushed or every merge request opened.
I would suggest adding a separate top-level directory test with input and output subdirectories inside it. The CI would then run test commands on the data in the former and compare the results against the latter.

More info on the Actions:

chore: bioconductor upload

Hey, I've seen this research on [BC]2 lately, nice work!
I may suggest some improvements from the software engineering perspective:

As far as I can see, one currently needs to manually install all the dependencies and then this package from GitHub.
It would be cool to upload it to Bioconductor and set up automatic installation of all the requirements.

merging independent SuperCell runs

Hi, thanks for developing this very nice package!

I have a question about merging SuperCell objects. Let's say I have three samples and I wish to run SuperCell separately on each sample -- e.g. to ensure that metacells are only composed of cells from the same biological specimen. Is there a way to merge these multiple SuperCell objects so that I can then run the weighted PCA and clustering on the combined data? Thanks for any tips!

docs: R vignette

Hey, I've seen this research on [BC]2 lately, nice work!
I may suggest some improvements from the software engineering perspective:

Currently this R package does not have a vignette, right?
It would be suitable to move all the example analyses from the README.md into a vignette, where such descriptions are usually found. With that set up, the README would be more concise too.
More info: https://r-pkgs.org/vignettes.html
