berenslab / umi-normalization

Companion repository to Lause, Berens & Kobak (2021): "Analytic Pearson residuals for normalization of single-cell RNA-seq UMI data", Genome Biology

Home Page: https://doi.org/10.1186/s13059-021-02451-7

License: GNU Affero General Public License v3.0

Jupyter Notebook 99.88% Python 0.12%

scrna single-cell-rna-seq single-cell-analysis umi-count umi-count-matrix normalization glm-pca negative-binomial-regression negative-binomial-model negative-binomial

umi-normalization's Issues

Integration

Hello,

Thanks for your great work. I've been following the development of your method and its implementation in Scanpy, and I can't wait to start using it in my future analyses.

I would like to know, however, whether you have any recommendation on how this method could be used in the integration step of single-cell RNA-seq analysis. Seurat's SCTransform offers an integration workflow that uses an SCT-normalized dataset.

Thanks.

add normalization method to scanpy?

hi!
I recently read the paper and found the method really convincing and effective!
would you be interested in submitting a PR to scanpy? I had a look at the code and it seems pretty straightforward to reimplement there. This is essentially the function needed, right?

umi-normalization/tools.py

Line 128 in d15bd59

def pearson_residuals(counts, theta):

If you are interested and willing, I would suggest loosely following sc.pp.normalize_total for the implementation
https://github.com/theislab/scanpy/blob/5533b644e796379fd146bf8e659fd49f92f718cd/scanpy/preprocessing/_normalization.py#L28-L202
you can also have a look at docs
would be very happy to assist/help out in case you are interested!

Thank you!
Giovanni
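For readers following along, here is a minimal sketch of what a function like the one referenced above computes, based on the description in the paper rather than on the repository code. The name `pearson_residuals_sketch`, the default `theta=100`, the clipping to ±sqrt(n_cells), and the handling of all-zero genes are assumptions made for illustration. They are not taken from `tools.py`.

```python
import numpy as np


def pearson_residuals_sketch(counts, theta=100.0):
    """Hypothetical helper: analytic Pearson residuals of a cells x genes count matrix.

    Assumes a negative binomial model with fixed overdispersion `theta` and an
    expected count mu_cg = (row sum of cell c) * (column sum of gene g) / total.
    """
    counts = np.asarray(counts, dtype=float)
    row_sums = counts.sum(axis=1, keepdims=True)   # sequencing depth per cell
    col_sums = counts.sum(axis=0, keepdims=True)   # total count per gene
    mu = row_sums * col_sums / counts.sum()        # expected counts under the null model
    variance = mu + mu**2 / theta                  # negative binomial variance

    # Genes with zero total count have mu = 0 and variance = 0; leave their
    # residuals at 0 instead of dividing by zero (an assumption of this sketch).
    residuals = np.zeros_like(counts)
    np.divide(counts - mu, np.sqrt(variance), out=residuals, where=variance > 0)

    # Clip extreme residuals to +/- sqrt(number of cells).
    clip = np.sqrt(counts.shape[0])
    return np.clip(residuals, -clip, clip)


counts = np.array([[0, 3, 1],
                   [2, 8, 0],
                   [1, 5, 4]])
residuals = pearson_residuals_sketch(counts)
print(residuals.round(2))
```

A scanpy version would additionally need to handle sparse matrices and `AnnData` objects, which this sketch leaves out.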

Handling batch effects

Hey, how would you recommend handling multiple batches using this normalization scheme? In the past, I've used scTransform, which puts the batch variable into the model itself, so I guess it kind of regresses out the batch effect. It's worked very well in the past for me. This normalization scheme here is simpler and doesn't account for batch effects in the model, so I'm wondering how you recommend dealing with them.

In your paper, in the Cao figure, I noticed you identified batch-specific genes and just removed them from the dataset. I'm unsure about this: isn't it possible that these genes are biologically relevant? I also noticed that, in a comment on your PR to scanpy, you mention applying this normalization to each batch separately and then just concatenating the results. Wouldn't this also be problematic? For instance, suppose I have two batches of different cell populations and one gene is never expressed in one batch. Its residuals there will always be zero, since it never deviates. In the other batch, suppose the gene is always expressed. In this case the residuals will also always be zero, since the gene is always expressed and the model mean can fit this.

I'd love to get your feedback regarding this.
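The scenario described in this question can be reproduced with a toy example. The sketch below is an illustration of the concern only, not the authors' recommendation or the repository code. The helper `pearson_residuals_sketch`, the value `theta=100`, and the count matrices are all made up for the example. The counts are deliberately idealized so that, within each batch, every gene is exactly proportional to sequencing depth.

```python
import numpy as np


def pearson_residuals_sketch(counts, theta=100.0):
    """Hypothetical helper: analytic Pearson residuals (cells x genes)."""
    counts = np.asarray(counts, dtype=float)
    mu = counts.sum(axis=1, keepdims=True) * counts.sum(axis=0, keepdims=True) / counts.sum()
    variance = mu + mu**2 / theta
    residuals = np.zeros_like(counts)
    # Genes with zero total count keep a residual of 0 (assumption of this sketch).
    np.divide(counts - mu, np.sqrt(variance), out=residuals, where=variance > 0)
    clip = np.sqrt(counts.shape[0])
    return np.clip(residuals, -clip, clip)


# Gene 0 is never expressed in batch A and always expressed in batch B.
batch_a = np.array([[0, 10], [0, 20], [0, 30]])
batch_b = np.array([[10, 10], [20, 20], [30, 30]])

# Option 1: normalize each batch separately, then concatenate.
per_batch = np.vstack([pearson_residuals_sketch(batch_a),
                       pearson_residuals_sketch(batch_b)])

# Option 2: normalize the concatenated counts jointly.
joint = pearson_residuals_sketch(np.vstack([batch_a, batch_b]))

print("gene 0, per-batch residuals:", per_batch[:, 0].round(2))
print("gene 0, joint residuals:    ", joint[:, 0].round(2))
```

In this idealized setting the per-batch residuals of gene 0 are zero in both batches, so the difference between the batches disappears after concatenation, whereas the joint residuals are negative in batch A and positive in batch B. Real data are noisier, so the per-batch residuals would not be exactly zero, but the gene would still carry no information about the difference between batches.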
