mariiabilous commented on June 22, 2024

Hi @daskelly,

Thanks a lot for your interest in our method!

There are two options available to address your question:

  1. Run SuperCell on your 3 samples specifying that there are 3 different samples and that the metacells (super-cells) should not contain single cells from different samples. This can be done with the parameter cell.split.condition in SCimplify().

  2. Run SuperCell on each sample separately and merge the results using the supercell_merge() function, which I just added thanks to your question. Please see the example in the function description.

Which approach to use depends largely on what you would do at the single-cell level. For instance, if you would analyze your samples together, you can go with the first approach.
If your samples are very large, you can use the second approach, which will save you some time and memory, as each sample is processed individually. Please note that in the first approach the set of features used to build metacells is the same for all samples, while in the second one, if you don't provide your own set of features (the genes.use parameter in SCimplify()), each sample is processed (i.e., its metacells are built) with its own set of highly variable genes.
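To make the point about feature sets concrete, here is a minimal toy sketch in Python. It is not SuperCell's own gene selection: the helper `top_variable_genes`, the gene names, and the expression values are all hypothetical, and "highly variable" is simplified to "highest variance". It only illustrates that genes selected per sample can differ from each other and from genes selected on the pooled data.

```python
from statistics import pvariance

def top_variable_genes(expr, n_top):
    """Return the n_top gene names with the highest variance across cells.

    expr maps gene name -> list of expression values (one per cell).
    This is a simplified stand-in for HVG selection, not SuperCell's method.
    """
    ranked = sorted(expr, key=lambda g: pvariance(expr[g]), reverse=True)
    return set(ranked[:n_top])

# Hypothetical toy data: 3 genes, two samples with 4 cells each.
sample1 = {
    "geneA": [0, 5, 0, 5],  # variable only in sample 1
    "geneB": [2, 2, 2, 2],
    "geneC": [1, 2, 1, 2],
}
sample2 = {
    "geneA": [3, 3, 3, 3],
    "geneB": [0, 6, 0, 6],  # variable only in sample 2
    "geneC": [1, 2, 1, 2],
}
# Pooling concatenates the cells of both samples gene by gene.
pooled = {g: sample1[g] + sample2[g] for g in sample1}

hvg_s1 = top_variable_genes(sample1, n_top=1)
hvg_s2 = top_variable_genes(sample2, n_top=1)
hvg_pooled = top_variable_genes(pooled, n_top=2)

print(hvg_s1, hvg_s2, hvg_pooled)
```

In this toy, each sample picks a different top gene, so metacells built independently would rest on different features unless a common gene list is supplied (the role that genes.use plays in SCimplify()).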

Please, let me know if this answers your question and don't hesitate to contact me if you have any other questions or suggestions!

Best,
Mariia

from supercell.

mariiabilous commented on June 22, 2024

Sure!

> 1. Suppose I build the metacells separately on each sample using the same granularity and same highly variable genes, then merge them with supercell_merge(). Is this result going to be equivalent to running on all samples simultaneously and using the parameter cell.split.condition as you suggest (with the same granularity and HVGs)?

The short answer is "No".
Processing samples separately with the same set of HVGs would still result in different dimensionality-reduction embeddings. Namely, the PCA of each separate sample differs from the PCA of all samples merged together, as the former is driven by the heterogeneity of a particular sample and the latter by the overall heterogeneity (and possibly some technical variability among samples). Since SuperCell performs dimensionality reduction to build metacells, this results in different metacell partitions.
Please see the brief example showing that the metacell partition differs between an independent construction (supercell_merge()) and a combined approach (SCimplify() on all samples together with the cell.split.condition parameter specified).
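Separately from the example referenced above, the PCA argument can be illustrated with a small self-contained Python toy. It does not use SuperCell; the helper `first_pc_angle` and the 2-D coordinates are hypothetical. Two samples each vary along the x axis, but one is shifted along y (standing in for between-sample variability), so the leading principal axis of the pooled data points in a different direction than that of either sample alone.

```python
import math

def first_pc_angle(points):
    """Angle in degrees, in [0, 180), of the first principal axis of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Closed-form orientation of the leading eigenvector of a 2x2 covariance matrix.
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return math.degrees(angle) % 180

# Hypothetical toy samples: same within-sample structure, offset along y.
s1 = [(-2, 0), (-1, 0), (1, 0), (2, 0)]
s2 = [(-2, 10), (-1, 10), (1, 10), (2, 10)]

angle_s1 = first_pc_angle(s1)          # axis of sample 1 alone
angle_s2 = first_pc_angle(s2)          # axis of sample 2 alone
angle_pooled = first_pc_angle(s1 + s2) # axis of both samples together

print(angle_s1, angle_s2, angle_pooled)
```

Here each sample's own first axis follows its internal variation, while the pooled first axis follows the offset between samples, so neighbors found in the two embeddings need not agree.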

I expect that metacells built with the independent approach would be more 'stable', as they are based on intra-sample heterogeneity.

Note that in the combined approach (all samples together, specifying the cell.split.condition parameter), the actual graining level represents the overall granularity and might differ between samples. For instance, if your actual graining level is 20, one sample s1 might have a granularity of 15 (i.e., an average metacell size of 15), another sample s2 a granularity of 22, etc.
The independent construction (using supercell_merge()), in contrast, guarantees the requested graining level for every sample.
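The difference between overall and per-sample granularity can be checked with a short calculation. The Python sketch below is a toy under stated assumptions: the function `granularity_by_sample`, the membership vector, and the sample labels are hypothetical, and it assumes that no metacell mixes cells from different samples.

```python
from collections import defaultdict

def granularity_by_sample(membership, sample_of_cell):
    """Return (overall, per-sample) average metacell size.

    membership[i]     -> metacell id of cell i
    sample_of_cell[i] -> sample label of cell i
    Assumes no metacell contains cells from more than one sample.
    """
    n_cells = defaultdict(int)
    metacell_ids = defaultdict(set)
    for mc, sample in zip(membership, sample_of_cell):
        n_cells[sample] += 1
        metacell_ids[sample].add(mc)
    per_sample = {s: n_cells[s] / len(metacell_ids[s]) for s in n_cells}
    overall = len(membership) / len(set(membership))
    return overall, per_sample

# Hypothetical partition: s1 has 6 cells in 3 metacells, s2 has 12 cells in 3 metacells.
membership = [1, 1, 2, 2, 3, 3] + [4] * 4 + [5] * 4 + [6] * 4
sample_of_cell = ["s1"] * 6 + ["s2"] * 12

overall, per_sample = granularity_by_sample(membership, sample_of_cell)
print(overall, per_sample)
```

In this toy the overall granularity is 3 while the two samples sit at 2 and 4, which mirrors the situation described for the combined approach.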

> 2. Would there be anything strange about using different granularities on different samples, or is this a normal thing to do? I am thinking it might be natural to use a lower granularity on a sample with fewer cells, and a higher granularity on a sample with more cells (assuming the same tissue).

In the combined analysis, different samples can end up with different granularities due to their different complexity and heterogeneity. I think it is acceptable to process samples of different sizes at different graining levels, as long as the size distribution of the metacells you are going to combine in the same analysis is acceptable. I wouldn't use gamma = 10 and gamma = 100 in the same analysis.

Thanks a lot for your interesting questions! If you try different approaches, I would be happy to hear about your experience and your thoughts on which approach was more appropriate for the analyses you performed.

daskelly commented on June 22, 2024

Hi @mariiabilous thank you! This is really helpful and it does answer my question.

Can I ask two follow-up questions?

  1. Suppose I build the metacells separately on each sample using the same granularity and same highly variable genes, then merge them with supercell_merge(). Is this result going to be equivalent to running on all samples simultaneously and using the parameter cell.split.condition as you suggest (with the same granularity and HVGs)?
  2. Would there be anything strange about using different granularities on different samples, or is this a normal thing to do? I am thinking it might be natural to use a lower granularity on a sample with fewer cells, and a higher granularity on a sample with more cells (assuming the same tissue).

Thanks for your responsiveness!

daskelly commented on June 22, 2024

Thank you @mariiabilous this is helpful and makes a lot of sense!
