Comments (3)

csoneson commented on August 16, 2024

Hi Gabby,

the functions in the apply_* scripts are called from the run_diffexpression.R script, which also does all the preprocessing and prepares the list L from the original MultiAssayExperiment object. In the process, it calls the cleaning and subsetting functions in the prepare_mae.R script to remove rows corresponding to ERCC spike-ins, subset to predefined groups, and filter the expression matrix. In principle, I think what you have above should be enough as a minimal L list, except that you may need L$condt to be a named vector, with names matching the column names of L$count. Could you let me know what type of problem you are having? Also note that the code is adapted to, and in some places limited to, two-group comparisons, since that was the focus of our study.
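As a rough illustration of the naming requirement, here is a minimal sketch of such an L list in base R. The gene names, cell names, group labels, and counts are all made up, and checking that the names are in the same order as the columns is a stricter assumption than what is stated above.

```r
# Toy count matrix: 4 genes x 6 cells (made-up values for illustration).
set.seed(1)
count <- matrix(rpois(24, lambda = 5), nrow = 4,
                dimnames = list(paste0("gene", 1:4), paste0("cell", 1:6)))

# Group labels for a two-group comparison, named by cell.
condt <- c("A", "A", "A", "B", "B", "B")
names(condt) <- colnames(count)

# Minimal list with the two elements mentioned in the thread.
L <- list(count = count, condt = condt)

# Sanity check (assumption: names should match the columns in the same order).
stopifnot(identical(names(L$condt), colnames(L$count)))
```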

Charlotte

from conquer_comparison.

gabriellajg commented on August 16, 2024

Hi Charlotte,

Thanks for your response. I also want to clean my data set before feeding it into the functions for differential analysis, and I was having some problems creating objects like args, config_file, and config as used in your run_diffexpression.R file. Could you give me a quick walkthrough of the process? Say I have the object L I created in my first post: how could I clean the data and feed it into the run_SeuratBimod() function?

Thank you,

Gabby

csoneson commented on August 16, 2024

Hi Gabby,

the "args" line is just there because I call the R scripts from the command line via the makefile, and I need a way to provide arguments to the code.
The "config_file" is a configuration file for each data set (located in the "config" folder), which lists which MultiAssayExperiment object to use, which groups to compare, the sample sizes and number of repeated subsamplings, where to write output, etc.

To apply the cleaning functions, you need to have your data in a MultiAssayExperiment object, like the ones that we used in our comparison. The clean_mae() function basically just removes ERCC spike-ins, so if you don't have them you don't need to do that.

The subset_mae() function first extracts a pre-defined collection of samples from the data set and defines a named vector with the grouping information. If you already have that information, you don't need to do that either. Then it does some filtering of genes with low expression (lines 40-61 of prepare_mae.R). This you can just as well do directly on the count matrix.
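Filtering directly on a count matrix could look something like the sketch below. The rule used here (keep genes with a count above min_count in at least min_cells cells) and both threshold values are assumptions for illustration; they are not necessarily the criteria used in prepare_mae.R, and filter_low_expression is a hypothetical helper name.

```r
# Hypothetical filter: keep genes whose count exceeds min_count
# in at least min_cells cells. Thresholds are illustrative only.
filter_low_expression <- function(count, min_count = 0, min_cells = 2) {
  keep <- rowSums(count > min_count) >= min_cells
  count[keep, , drop = FALSE]
}

# Toy matrix: 4 genes x 4 cells (made-up values).
count <- matrix(c(0, 0, 0, 0,
                  5, 0, 3, 1,
                  0, 2, 0, 0,
                  7, 8, 9, 4),
                nrow = 4, byrow = TRUE,
                dimnames = list(paste0("gene", 1:4), paste0("cell", 1:4)))

filtered <- filter_low_expression(count)
rownames(filtered)
```

With these toy values, gene1 (never detected) and gene3 (detected in a single cell) are dropped.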

So to summarise, if you want to apply our functions, you need to have your data in a MultiAssayExperiment object, and you need to tell the function which samples to retain and what group they belong to (for our comparisons, we generated this information with the generate_subsets.R script). However, if you already have the count matrix and the group vector that you want to use, you can just remove the ERCC spike-ins (if applicable) and filter the matrix manually before providing the object to run_SeuratBimod().
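For the manual route, a minimal sketch might be the following. It assumes spike-in rows can be recognised by an "ERCC-" prefix in the row names, which may not hold for every data set, and all gene names, cell names, and group labels are invented. The call to run_SeuratBimod() is left out, since its exact arguments are not given in this thread.

```r
# Toy count matrix with two spike-in rows (made-up values).
count <- matrix(1:12, nrow = 4,
                dimnames = list(c("ERCC-00002", "geneA", "ERCC-00130", "geneB"),
                                paste0("cell", 1:3)))

# Assumption: spike-ins are identified by an "ERCC-" row name prefix.
is_ercc <- grepl("^ERCC-", rownames(count))
count_clean <- count[!is_ercc, , drop = FALSE]

# Named group vector for the cells to retain.
condt <- c(cell1 = "A", cell2 = "A", cell3 = "B")

# Keep only the labelled cells, in the same order as the group vector.
count_clean <- count_clean[, names(condt), drop = FALSE]

# Any low-expression filtering would be applied to count_clean here.
L <- list(count = count_clean, condt = condt)
```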

Charlotte
