
cibersortx's Introduction

Cibersortx pipeline

Step 1 Signature matrix

IF each column header of the single cell reference matrix is already a phenotype label, proceed to Step 2

<From cibersortx website>  
The reference sample file is an input file required for custom signature genes file generation by CIBERSORT and consists of a table of the gene expression profiles of reference sample cell populations that will be compared to each other as defined in the phenotype classes file (next section) to generate the custom signature genes file. Each column corresponds to the gene expression profile of a reference sample. Multiple replicates of a given reference sample cell type (one in each column) may be used.

ELSE convert the following two files with 'convert_reference_file.py': the single cell reference matrix and the class matrix

python convert_reference_file.py MCA_liver_cell_expression.tsv MCA_liver_cell_class.tsv
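
As a rough illustration of what such a conversion involves, the sketch below replaces each cell ID in the header of the expression table with its phenotype label. It is not the code of convert_reference_file.py: the function name relabel_columns and the assumed class file layout (one row per cell, cell ID then label, tab-separated) are hypothetical.

```python
import csv
import io


def relabel_columns(expression_tsv: str, class_tsv: str) -> str:
    """Replace cell IDs in the expression header with phenotype labels.

    Assumed (hypothetical) layouts:
      expression_tsv: first column = gene names, other columns = one cell each
      class_tsv:      cell_id <TAB> phenotype_label, one row per cell
    """
    labels = {}
    for row in csv.reader(io.StringIO(class_tsv), delimiter="\t"):
        if len(row) >= 2:
            labels[row[0]] = row[1]

    rows = list(csv.reader(io.StringIO(expression_tsv), delimiter="\t"))
    header = rows[0]
    # Keep the gene-name column as-is; cells without a label keep their ID.
    new_header = [header[0]] + [labels.get(cell, cell) for cell in header[1:]]

    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(new_header)
    writer.writerows(rows[1:])
    return out.getvalue()
```

Several columns may end up with the same label, which matches the website's note that multiple replicates of a cell type may be used, one per column.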

Step 2 Run Cibersortx to create signature matrix

⭐ parameters (pro tem)
Single cell input options:
Min. Expression 0.5
❗ Download output signature matrix

Step 3 Filter Mixture Matrix

Filter out columns with no expression, using the signature matrix from Step 2

python filter_by_signature_genes.py signature_gene_file.txt mixture_file.tsv

This creates a new mixture file named {mixture_file}_filtered.tsv
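
One plausible reading of this step can be sketched as follows: keep only the genes that appear in the signature matrix, then drop sample columns whose expression is zero for all of those genes. This is an assumption about the behaviour of filter_by_signature_genes.py, not its actual code; the function name filter_mixture and the in-memory row format are hypothetical.

```python
def filter_mixture(signature_rows, mixture_rows):
    """Filter a mixture table by the genes of a signature matrix.

    Both arguments are lists of rows (lists of strings): the first row is
    the header and the first column holds gene names (assumed layout).
    """
    signature_genes = {row[0] for row in signature_rows[1:]}
    header, body = mixture_rows[0], mixture_rows[1:]

    # Keep only genes that are present in the signature matrix.
    kept = [row for row in body if row[0] in signature_genes]

    # Keep the gene column plus every sample that is non-zero for
    # at least one signature gene.
    keep_cols = [0] + [
        j for j in range(1, len(header))
        if any(float(row[j]) != 0 for row in kept)
    ]
    return [[row[j] for j in keep_cols] for row in [header] + kept]
```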

Step 4 Run Cibersortx to impute cell fractions

Upload the Mixture matrix from step 3 to the Cibersortx website, then run fraction imputation
⭐ parameters
default

💫 Memo
For the mouse liver data set, uploading 19133+ samples failed on the Cibersortx website with 'Allowed memory size of 536870912 bytes' exhausted (a 512 MiB limit)
Use something like the following commands to divide the data set by columns

# first 10000 samples; column 1 holds the gene names (index), so take columns 1-10001
cut -f 1-10001 mixture_file_filtered.tsv > mixture_file_filtered_1.tsv
# the rest of the samples (one file is enough if fewer than 10000 remain), again keeping column 1
cut -f 1,10002- mixture_file_filtered.tsv > mixture_file_filtered_2.tsv

To check the number of columns in a file (split on tabs so that names containing spaces are not miscounted)

awk -F'\t' '{print NF}' file.tsv | sort -nu | tail -n 1
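
For data sets that need more than two pieces, the same column-wise split can be sketched in Python. This is a minimal sketch, assuming the table fits in memory as a list of rows; the function name split_columns and the chunk_size parameter are hypothetical.

```python
def split_columns(rows, chunk_size=10000):
    """Yield sub-tables with at most chunk_size sample columns each.

    rows is a list of rows (lists of strings); column 0 holds the gene
    names and is repeated in every chunk.
    """
    n_samples = len(rows[0]) - 1
    for start in range(1, n_samples + 1, chunk_size):
        stop = min(start + chunk_size, n_samples + 1)
        yield [[row[0]] + row[start:stop] for row in rows]
```

The column count of each chunk is len(chunk[0]), which includes the gene-name column.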

Cibersortx validation

To validate Cibersortx's results, mix two cell types from the single cell data and run Cibersortx on the mixture.

The command below generates a TSV file that can be used as Cibersortx input.
Each column is a 50%-50% mix of the expression profiles of two cell types

python validation.py single_cell_reference_made_in_step_1.tsv
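
The mixing idea can be sketched as follows. This is not the code of validation.py: the function name mix_pairs, the 'a+b' column naming, and the use of one expression profile per cell type are assumptions made for illustration.

```python
from itertools import combinations


def mix_pairs(profiles):
    """Build 50%-50% mixes for every pair of cell types.

    profiles maps a cell type name to a list of expression values,
    with all lists in the same gene order (assumed input format).
    Returns a dict mapping 'typeA+typeB' to the averaged profile.
    """
    mixes = {}
    for a, b in combinations(sorted(profiles), 2):
        mixes[f"{a}+{b}"] = [
            0.5 * x + 0.5 * y for x, y in zip(profiles[a], profiles[b])
        ]
    return mixes
```

If Cibersortx is working well, a column such as 'Hepatocyte+Kupffer' should come back with roughly equal fractions for the two cell types and near-zero fractions for the others.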

Sorting bulk samples by specific cell type fractions

Use the following Jupyter Notebook. The input file is a Cibersortx job result file (.txt format by default).

Sort_Cibersortx_output_fractions.ipynb
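
Outside the notebook, the sorting itself can be sketched in a few lines. This is a minimal sketch, not the notebook's code: it assumes the result file is tab-delimited with one row per sample, a 'Mixture' column holding the sample name, and one column per cell type. The function name sort_by_fraction is hypothetical.

```python
import csv
import io


def sort_by_fraction(result_text, cell_type, delimiter="\t"):
    """Return result rows sorted by one cell type's fraction, highest first.

    result_text is the content of a Cibersortx result file (assumed
    layout: header row, one sample per row, one column per cell type).
    """
    rows = list(csv.DictReader(io.StringIO(result_text), delimiter=delimiter))
    return sorted(rows, key=lambda row: float(row[cell_type]), reverse=True)
```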
