berenslab / umi-normalization

Companion repository to Lause, Berens & Kobak (2021): "Analytic Pearson residuals for normalization of single-cell RNA-seq UMI data", Genome Biology

Home Page: https://doi.org/10.1186/s13059-021-02451-7

License: GNU Affero General Public License v3.0

Jupyter Notebook 99.88% Python 0.12%
scrna single-cell-rna-seq single-cell-analysis umi-count umi-count-matrix normalization glm-pca negative-binomial-regression negative-binomial-model negative-binomial

umi-normalization's Issues

Integration

Hello,

Thanks for your great work. I've been following the development of your method and its implementation in Scanpy, and I can't wait to start using it in my future analyses.

However, I would like to know whether you have any recommendations for how this method could be used in the integration step of a single-cell RNA-seq analysis. Seurat's SCTransform offers an integration workflow that uses the SCT-normalized dataset.

Thanks.

add normalization method to scanpy?

Hi!
I recently read the paper and found the method really convincing and effective!
Would you be interested in submitting a PR to scanpy? I had a look at the code and it seems pretty straightforward to reimplement there. This is essentially the function needed, right?

def pearson_residuals(counts, theta):

If you are interested and willing, I would suggest loosely following sc.pp.normalize_total for the implementation:
https://github.com/theislab/scanpy/blob/5533b644e796379fd146bf8e659fd49f92f718cd/scanpy/preprocessing/_normalization.py#L28-L202
You can also have a look at the docs.
I would be very happy to assist or help out in case you are interested!

Thank you!
Giovanni
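
For reference, the computation behind a function with the signature quoted above can be sketched in a few lines of NumPy, following the formula in the paper: the expected count is the cell's depth times the gene's share of all counts, and the residual divides the deviation by the negative binomial standard deviation. This is only a minimal illustration, not the repository's or scanpy's actual code. It assumes a dense cells x genes matrix with no all-zero genes and a fixed `theta`; the `clipping` argument and the toy `counts` matrix are made up for the example.

```python
import numpy as np


def pearson_residuals(counts, theta, clipping=True):
    """Analytic Pearson residuals for a dense cells x genes UMI count matrix."""
    counts = np.asarray(counts, dtype=float)
    cell_depth = counts.sum(axis=1, keepdims=True)  # shape (n_cells, 1)
    gene_total = counts.sum(axis=0, keepdims=True)  # shape (1, n_genes)
    total = counts.sum()

    # Expected count under the null model: cell depth times gene fraction.
    mu = cell_depth @ gene_total / total

    # Negative binomial variance with fixed overdispersion: mu + mu^2 / theta.
    z = (counts - mu) / np.sqrt(mu + mu**2 / theta)

    if clipping:
        # Clip to +/- sqrt(n_cells) to limit the influence of extreme values.
        bound = np.sqrt(counts.shape[0])
        z = np.clip(z, -bound, bound)
    return z


# Toy data (assumed for illustration): 4 cells x 3 genes.
counts = np.array([[1, 0, 3], [2, 5, 0], [0, 1, 4], [3, 2, 2]])
residuals = pearson_residuals(counts, theta=100)
```

A gene with zero total counts would give a 0/0 division here, so such genes would need to be filtered out beforehand.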

Handling batch effects

Hey, how would you recommend handling multiple batches using this normalization scheme? In the past, I've used scTransform, which puts the batch variable into the model itself, so I guess it effectively regresses out the batch effect. That has worked very well for me. The normalization scheme here is simpler and doesn't account for batch effects in the model, so I'm wondering how you recommend dealing with them.

In your paper, in the Cao figure, I noticed that you identified batch-specific genes and simply removed them from the dataset. I'm unsure about this: isn't it possible that these genes are biologically relevant? I also noticed that in a comment on your scanpy PR you mention applying this normalization to each batch separately and then concatenating the results. Wouldn't this also be problematic? For instance, suppose I have two batches of different cell populations and one gene is never expressed in the first batch. Its residuals there will always be zero, since the gene never deviates from the model. If the gene is always expressed in the other batch, its residuals there will also be (close to) zero, since the model mean can fit this. After concatenation, the difference between the two batches for this gene would be lost.

I'd love to get your feedback regarding this.
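
The scenario in this question can be reproduced with a small toy example. The sketch below is an illustration under strong assumptions, not the authors' recommended workflow: the counts are noiseless and made up, `theta=100` is fixed, and the helper treats genes with no counts in a batch as having residual zero (real pipelines usually filter such genes instead). It compares per-batch normalization followed by concatenation with normalization of the pooled matrix.

```python
import numpy as np


def pearson_residuals(counts, theta=100.0):
    """Analytic Pearson residuals; all-zero genes get residual 0 (an assumption)."""
    counts = np.asarray(counts, dtype=float)
    cell_depth = counts.sum(axis=1, keepdims=True)
    gene_total = counts.sum(axis=0, keepdims=True)
    mu = cell_depth @ gene_total / counts.sum()
    sd = np.sqrt(mu + mu**2 / theta)
    z = np.zeros_like(counts)
    # Skip entries with mu == 0 (sd == 0) to avoid a 0/0 division.
    np.divide(counts - mu, sd, out=z, where=sd > 0)
    bound = np.sqrt(counts.shape[0])
    return np.clip(z, -bound, bound)


# Columns are genes; the second gene is absent in batch A
# and makes up exactly half of every cell's counts in batch B.
batch_a = np.array([[10, 0], [20, 0], [30, 0]])
batch_b = np.array([[5, 5], [10, 10], [20, 20]])

# Normalize each batch on its own, then concatenate.
per_batch = np.vstack([pearson_residuals(batch_a), pearson_residuals(batch_b)])

# Normalize the pooled matrix in one go.
joint = pearson_residuals(np.vstack([batch_a, batch_b]))

print(per_batch[:, 1])  # second gene, per-batch normalization
print(joint[:, 1])      # second gene, joint normalization
```

In this toy setting the per-batch residuals of the second gene are all zero, so the batches become indistinguishable for that gene, whereas the joint residuals are negative in batch A and positive in batch B. With real, noisy counts the per-batch residuals would only be close to zero rather than exactly zero.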
