IRIS-FGM

---Integrative scRNA-Seq Interpretation System for Functional Gene Module analysis

Abstract

Single-cell RNA-Seq data is useful in discovering cell heterogeneity and signature genes in specific cell populations in cancer and other complex diseases. Specifically, the investigation of functional gene modules (FGM) can help to understand gene interactive networks and complex biological processes. QUBIC2 is recognized as one of the most efficient and effective tools for FGM identification from scRNA-Seq data. However, its availability is limited to a C implementation, and its applicative power is affected by only a few downstream analyses functionalities. We developed an R package named IRIS-FGM (integrative scRNA-Seq interpretation system for functional gene module analysis) to support the investigation of FGMs and cell clustering using scRNA-Seq data. Empowered by QUBIC2, IRIS-FGM can identify co-expressed and co-regulated FGMs, predict types/clusters, identify differentially expressed genes, and perform functional enrichment analysis. It is noteworthy that IRIS-FGM also applies Seurat objects that can be easily used in the Seurat vignettes.

Workflow

Environment

System environment:

We recommend users to install IRIS-FGM on large memory (32GB) based UNIX/Linux operation system if the user aims at analyzing bicluster-based co-expression analysis. If the user seeks to analyze data by quick mode, we recommend installing IRIS-FGM on a small memory (8GB) based Windows or Linux operation system; IRIS-FGM does not support MAC.

R environment: R (equal or greater than 3.5)

Usage

Pre-installation

Install required packages from CRAN:

install.packages(c('BiocManager','devtools', 'AdaptGauss', "pheatmap", 'mixtools','MCL', 'anocva', 'qgraph','Rtools','ggpubr',"ggraph","Seurat"))

Install required packages from Bioconductor:

BiocManager::install(c('org.Mm.eg.db','multtest','org.Hs.eg.db','clusterProfiler','DEsingle', 'DrImpute','scater', 'scran'))

Install IRIS-FGM from github:

devtools::install_github("BMEngineeR/IRIS-FGM", force = T)

Now the MetaQUBIC is successfully installed and ready for use.

Data preparation

Download example data from Yan's RPKM 90 cell embroyonic single cell RNA-Seq data, meta file (cell type information) and paper. The alternative way to download is to use the step 2 method in this section.
Set the working directory where you put data and import the IRIS-FGM library.

setwd("./my_working_dir/")
library(IRISFGM)
# this step is for download Yan's expression data and meta file.
download.file(url = "https://bmbl.bmi.osumc.edu/downloadFiles/Yan_expression.txt",destfile = "./my_working_dir/Yan_expression.txt")
download.file(url = "https://bmbl.bmi.osumc.edu/downloadFiles/Yan_cell_label.txt",destfile = "./my_working_dir/Yan_cell_label.txt")

Read in data.

InputMatrix <- read.table("./my_working_dir/Yan_expression.txt",header = T, row.names = 1,check.names = F)
Meta_info <- read.table("./my_working_dir/Yan_cell_label.txt",header = T,row.names = 1)

IRIS-FGM also provides the function to read in 10X scRNA-Seq data format.

Read HDF5 fileReadFrom10X_h5() or read the folder which contain three files (barcode, sparse matrix, genes)ReadFrom10X_folder()

Analysis data

Data preprocessing and LTMG modeling

Create IRIS-FGM object:

object <- CreateIRISCEMObject(InputMatrix)

Adding meta information for your cell:

This step can add the customized cell label by the user. The format of a file passing to meta.info is a data frame of which row name should be cell ID, and column name should be cell type.

object <- AddMeta(object, meta.info = Meta_info)

Filtering out low quality data:

Use PlotMeta and SubsetData together to filter out low-quality cells.

PlotMeta(object)

object <- SubsetData(object , nFeature.upper=15000,nFeature.lower=8000,
                         Counts.upper=700000,Counts.lower=400000)

After cells get filtered, we can check the distribution again.

PlotMeta(object)

Normalization:

Users can choose to perform normalization based on their needs. The normalization method has two options, one is the simplest CPM normalization (default normalization = 'LibrarySizeNormalization'). The other is from package scran and can be opened by using parameter normalization = 'scran'. Compared to the CPM normalization method, scran will be more accurate but takes more time.

object <- ProcessData(object, normalization = "LibrarySizeNormalization")

LTMG modeling:

Here, we will use Left-truncated Mixture Gaussian distribution to model the regulatory signal of each gene. Parameter, 'Gene_use', decides number of top highly variant gene for LTMG modeling, and here we use top 2000 highly variant genes.

object <- RunLTMG(object, Gene_use = "2000", seed = 123)

Biclustering

IRIS-FGM can provide biclustering function, which is based on our in-house novel algorithm, [QUBIC2] (https://github.com/maqin2001/qubic2) to predict functional gene module.

Discretization & biclustering
If you have more cells to analyze functional gene module you can use LTMG or Quantile based discretization (QUBIC). In this step, IRIS-FGM will generate files in the local working directory and do not remove them before finishing your analysis (files include tmp_expression.txt, tmp_expression.txt.chars, tmp_expression.txt.chars.blocks, tmp_expression.txt.chars.chars, and tmp_expression.txt.rules).

LTMG based discretization:

You need to run LTMG modeling then binarizing them. It might take a lot of time if you have a large size of cells.

object <- RunLTMG(object, Gene_use = "all", seed = 123)
object <- CalBinaryMultiSignal(object)
object <- RunBicluster(object, DiscretizationModel = "LTMG",OpenDual = FALSE,
                          NumBlockOutput = 100, BlockOverlap = 0.7, BlockCellMin = 15)

Quantile based discretization:

It might require users to adjust parameter q for deciding cutoff of binarizing.

object <- RunDiscretization(object, q = 0.06)
object <- RunBicluster(object, DiscretizationModel = "Quantile",OpenDual = FALSE, Extension = 0.90,
                          NumBlockOutput = 1000, BlockOverlap = 0.7, BlockCellMin = 15)

Analyzing functional gene module This section is based on the quantile based discretization and biclustering results.

Visualize the gene module-to-module relationship globally:

The figure via the number of overlap genes (controlled edge.by = "gene") shows that bicluster 20, 23, 24, and 25 have similar gene components. Bicluster 21, 22, 25, and until bicluster 35 have similar gene components. In this figure, we can see gene modules in bicluster 20, 23, 24, and 25 may have a similar component and similar functionality, whereas the gene components from this gene module may differ from the other gene modules from the other biclusters (i.e., bicluster 21, 22, 25, and until bicluster 35)

PlotNetwork(object,N.bicluster =c(20:30), edge.by = "gene")

Visualize bicluster 20 & bicluster 35 on heatmap.

From step 1 in "Analysis functional gene module," we postulate gene components from bicluster 20, 23, 24, and 25 may differ from gene components from bicluster 21, 22, 25, and until bicluster 35. Therefore, in this section, we will focus on how the difference is. Therefore, we use bicluster 20 and bicluster 35 to generate heatmap and show such a difference.

PlotHeatmap(object,N.bicluster = c(20,35),show.annotation = F)

Visualize local co-expression gene module network:

Since we already know the bicluster 20 and bicluster 35 showing the difference at the global level. We then will focus on a local gene module and investigate the co-expression gene network with in the module. Yellow nodes represent the gene module network from bicluster #20. The nodes' size indicates the degree of presence (the number of connected edges for one node). The thickness of edges suggests the value of the correlation coefficient.

From this figure (bicluster 20, click the figure for zooming in), we can tell the EIFAD gene show a negative correlation (red color edge) to GOSR1 & BBS5, and also show a positive correlation (grey edge) to ZNF394 & POTEM in a group of cells from this bicluster.

PlotModuleNetwork(object, N.bicluster = 20, cutoff=0.6, Node.color = "#E8E504")

From this figure (bicluster 35, click the figure for zooming in), we can tell the ROBO1 gene shows a negative correlation (red color edge) to NLRP4 & BPGM. GNPDA2 gene shows a positive correlation to BPGM, KIT, and CCDC25 in a group of cells from this bicluster.

PlotModuleNetwork(object, N.bicluster = 35, cutoff=0.6, Node.color = "#E8E504")

Functional enrichment analysis.

ISIR-FGM provide a functional enrichment analysis for a selected gene module.

For gene module from bicluster 20:

object <- RunPathway(object ,module.number =20, selected.gene.cutoff = 0.05, species = "Human", database = "GO", genes.source = "Bicluster")

	ONTOLOGY	ID	Description	GeneRatio	BgRatio	pvalue	p.adjust	qvalue	geneID	Count
GO:0008278	CC	GO:0008278	cohesin complex	2/34	16/19717	0.000341142521037922	0.0313851119354888	0.0280095964641662	STAG3L3/STAG3L2	2
GO:0005689	CC	GO:0005689	U12-type spliceosomal complex	2/34	27/19717	0.000986050510033922	0.0453583234615604	0.0404799683066557	ZMAT5/SNRNP35	2
GO:0000062	MF	GO:0000062	fatty-acyl-CoA binding	2/32	22/17697	0.000715377512795969	0.0470414266188569	0.038513448693801	ACBD5/SOAT2	2
GO:1901567	MF	GO:1901567	fatty acid derivative binding	2/32	28/17697	0.00116271921939764	0.0470414266188569	0.038513448693801	ACBD5/SOAT2	2
GO:0005484	MF	GO:0005484	SNAP receptor activity	2/32	31/17697	0.001425497776329	0.0470414266188569	0.038513448693801	STX11/GOSR1	2

For gene module from bicluster 35:

object <- RunPathway(object ,module.number =35, selected.gene.cutoff = 0.05, species = "Human", database = "GO", genes.source = "Bicluster")

	ONTOLOGY	ID	Description	GeneRatio	BgRatio	pvalue	p.adjust	qvalue	geneID	Count
GO:0001667	BP	GO:0001667	ameboidal-type cell migration	6/29	461/18670	6.43045336163777e-05	0.0376874533879247	0.0301960094109077	ROBO1/PTPRG/ERBB4/KIT/PTPRR/HDAC9	6
GO:0021889	BP	GO:0021889	olfactory bulb interneuron differentiation	2/29	12/18670	0.000152281291259449	0.0376874533879247	0.0301960094109077	ROBO1/ERBB4	2
GO:0010631	BP	GO:0010631	epithelial cell migration	5/29	351/18670	0.000186973532900314	0.0376874533879247	0.0301960094109077	ROBO1/PTPRG/KIT/PTPRR/HDAC9	5
GO:0090132	BP	GO:0090132	epithelium migration	5/29	354/18670	0.000194519550646714	0.0376874533879247	0.0301960094109077	ROBO1/PTPRG/KIT/PTPRR/HDAC9	5
GO:0090130	BP	GO:0090130	tissue migration	5/29	360/18670	0.000210309449709401	0.0376874533879247	0.0301960094109077	ROBO1/PTPRG/KIT/PTPRR/HDAC9	5
GO:0042692	BP	GO:0042692	muscle cell differentiation	5/29	385/18670	0.000286907770833753	0.0428448937778404	0.0343282631067754	KIT/SYNE1/TMOD2/UCHL1/HDAC9	5
GO:0005001	MF	GO:0005001	transmembrane receptor protein tyrosine phosphatase activity	2/32	17/17697	0.000423558866693072	0.0222368405013863	0.0164965032290986	PTPRG/PTPRR	2
GO:0019198	MF	GO:0019198	transmembrane receptor protein phosphatase activity	2/32	17/17697	0.000423558866693072	0.0222368405013863	0.0164965032290986	PTPRG/PTPRR	2

Cell type identification based on MCL

IRIS-FGM provides the function to use the output biclusters to predict cell type based on the MCL algorithm, which predicted results outperformed the other nine bicluster and cluster methods (Details can be found from figure B in IRIS-FGM paper). In this case, we identify eight cell clusters.

PlotModuleNetwork(object, N.bicluster = 35, cutoff=0.6, Node.color = "#E8E504")
unique(object@MetaInfo$MC_Label)
[1] 1 2 3 4 5 6 7 8

Visualization cell clusters on UMAP

IRIS-FGM integrates the Seurat package for carrying out dimension reduction. Furthermore, IRIS-FGM also integrates the Seurat package cell clustering method as an alternative approach. Before running dimension reduction, you should perform LTMG modeling if you would like to LTMG signal matrix as input rather than the processed expression matrix. The advantage of using the LTMG signal matrix as input is that it can improve dimension reduction and cell clustering. (Detail can be found in this LTMG paper)

Use processed expression matrix as input to generate UMAP and cell clusters based on the Seurat package.

runing program:

object <- IRISFGM::RunDimensionReduction(object,mat.source = "UMImatrix")
object <- IRISFGM::RunClassification(object,resolution = 0.8)

Visualization by using different cell label: "Cluster" is the benchmark label from paper. "MC_Label" is the label based on MCL prediction. "Seurat0.6" is the label based on the Seurat cell clustering method.

# This function will ask you to choose an index (represent a cluster identity).
# Use benchmark cell clusters.
PlotDimension(object)
select condition to present
1 : ncount_RNA
2 : nFeature
3 : Cluster
4 : MC_Label
5 : Seurat0.8
select index of cell condition: 3

# Use MCL predicted cell clusters.
PlotDimension(object)
select condition to present
1 : ncount_RNA
2 : nFeature
3 : Cluster
4 : MC_Label
5 : Seurat0.8
select index of cell condition: 4

# Use Seurat predicted cell clusters.
PlotDimension(object)
select condition to present
1 : ncount_RNA
2 : nFeature
3 : Cluster
4 : MC_Label
5 : Seurat0.8
select index of cell condition: 5

Use LTMG signal matrix as input to generate UMAP and cell clusters based on the Seurat package.

runing program:

object <- IRISFGM::RunDimensionReduction(object,mat.source = "LTMG")
object <- IRISFGM::RunClassification(object,resolution = 0.8)

Visualization by using different cell label: "Cluster" is the benchmark label from paper. "MC_Label" is the label based on MCL prediction. "Seurat0.6" is the label based on the Seurat cell clustering method.

# This function will ask you to choose an index (represent a cluster identity).
# Use benchmark cell clusters.
PlotDimension(object)
select condition to present
1 : ncount_RNA
2 : nFeature
3 : Cluster
4 : MC_Label
5 : Seurat0.8
6 : Seurat0.8
select index of cell condition: 3

# Use MCL predicted cell clusters.
PlotDimension(object)
select condition to present
1 : ncount_RNA
2 : nFeature
3 : Cluster
4 : MC_Label
5 : Seurat0.8
6 : Seurat0.8
select index of cell condition: 4

# Use Seurat predicted cell clusters.
PlotDimension(object)
select condition to present
1 : ncount_RNA
2 : nFeature
3 : Cluster
4 : MC_Label
5 : Seurat0.8
6 : Seurat0.8
select index of cell condition: 6

DEG analysis and functional enrichment analysis.

Cell clusters have been identified by a different method. We can use these cell clusters to analyze DEGs and perform functional enrichment analysis.

Find DEG based on MCL prediction.

Find DEGs.

This function will ask you to provide an index of a cluster label, select a target cluster (group 1), and select a comparing cluster (group 2). Here we select "ML_Label" cell clusters (index is 4), choose the 8th cluster (index is 3, as group 1), and compare to the rest of all clusters (index is 9, as group 2). The DEGs' result can be found in object@BiCluster@MarkerGene .

object <- IRISFGM::FindMarkers(object)
select condition to compare
1 : ncount_RNA
2 : nFeature
3 : Cluster
4 : MC_Label
5 : Seurat0.8
6 : Seurat0.8

select index of cell condition: 4
select index (left) of first group to compare : 
1 : 1
2 : 2
3 : 3
4 : 4
5 : 5
6 : 6
7 : 7
8 : 8
input first group index : 3
select index (left) of second group to compare : 
1 : 1
2 : 2
3 : 3
4 : 4
5 : 5
6 : 6
7 : 7
9 : rest of all

select index of group 2: 9
# DEGs based on MCL predicted label was store in object@BiCluster@MarkerGene
object@BiCluster@MarkerGene[1:5,]

	LFC	pval	pvalue.adj.FDR
GNA14	1.775559	0.000000e+00	0.000000e+00
SYTL3	1.753460	0.000000e+00	0.000000e+00
OSBPL10	1.682444	3.330669e-16	2.220446e-13
MIXL1	1.465656	7.771561e-16	3.885781e-13
PRR15	1.778150	9.992007e-16	3.996803e-13

Functional enrichment analysis.

The result can be found at object@BiCluster@PathwayFromMC.

object <- RunPathway(object, species = "Human", database = "GO", genes.source = "Bicluster")
object@BiCluster@PathwayFromMC[1:5,]

ONTOLOGY	ID	Description	GeneRatio	BgRatio	pvalue	p.adjust	qvalue	geneID	Count
GO:0006402	BP	GO:0006402	mRNA catabolic process	110/1417	364/18670	3.68880405623349e-38	2.00855380861913e-34	1.69646157070359e-34	CNOT7/CSDE1/RPL7A/ALKBH5/PATL2/DDX5/PAIP1/IGF2BP3/LSM1/YTHDF2/PSMC6/HNRNPM/LSM6/PSMB1/PSMD2/PSMD8/PSMA3/RPL17/EIF4G1/LSM2/ETF1/PSMA5/RBM46/PSMD14/EIF4A3/PSMA6/HNRNPC/RPL34/PSMA2/HSPA1A/PSMD11/HNRNPU/YBX1/HSPA8/RPL21/EXOSC3/RPS4Y1/PSMD1/RPL14/EXOSC7/RNPS1/PSMB7/PSMD12/HSPA1B/HNRNPR/HSPB1/PPP2CA/RPL4/RPS24/RBM8A/PABPC4/HNRNPD/MAGOH/NCBP2/RPL19/RPL6/EXOSC5/EXOSC9/YWHAZ/RPL13/PSMC2/PSMB3/PSMA1/POLR2G/SERBP1/PSMD9/PCID2/RPS20/XPO1/PSMB6/RPL7/RPL22/RPS7/RPL9/PSMB4/RPS3/MRTO4/RPS6/PABPC1/RPS8/YWHAB/APEX1/RPS3A/RPL10/PSMC3/RPL36A/UBC/PNRC2/RPL29/PSMD4/RPL12/UBB/PSMD6/RPS21/RPL18A/RPS26/RPL15/RPL36/RPL18/RPL38/LSM4/RPL23A/RPL26/RPL10A/RPL37A/SET/EIF3E/RPL30/EXOSC8/RPS19	110
GO:0006401	BP	GO:0006401	RNA catabolic process	113/1417	397/18670	1.90474189922193e-36	5.18565982063171e-33	4.37990387247401e-33	CNOT7/CSDE1/RPL7A/ALKBH5/PATL2/DDX5/PAIP1/IGF2BP3/LSM1/YTHDF2/PSMC6/HNRNPM/LSM6/PSMB1/PSMD2/PSMD8/PSMA3/RPL17/EIF4G1/LSM2/ETF1/XRN2/PSMA5/RBM46/PSMD14/EIF4A3/PSMA6/HNRNPC/RPL34/PSMA2/HSPA1A/PSMD11/HNRNPU/YBX1/HSPA8/RPL21/DIS3L/EXOSC3/RPS4Y1/PSMD1/RPL14/EXOSC7/RNPS1/PSMB7/PSMD12/HSPA1B/HNRNPR/HSPB1/PPP2CA/RPL4/RPS24/RBM8A/PABPC4/HNRNPD/MAGOH/NCBP2/RPL19/RPL6/EXOSC5/EXOSC9/YWHAZ/RPL13/PSMC2/PSMB3/PSMA1/POLR2G/SERBP1/PSMD9/PCID2/RPS20/XPO1/PSMB6/LIN28A/RPL7/RPL22/RPS7/RPL9/PSMB4/RPS3/MRTO4/RPS6/PABPC1/RPS8/YWHAB/APEX1/RPS3A/RPL10/PSMC3/RPL36A/UBC/PNRC2/RPL29/PSMD4/RPL12/UBB/PSMD6/RPS21/RPL18A/RPS26/RPL15/RPL36/RPL18/RPL38/LSM4/RPL23A/RPL26/RPL10A/RPL37A/SET/EIF3E/RPL30/EXOSC8/RPS19	113
GO:0042254	BP	GO:0042254	ribosome biogenesis	96/1417	297/18670	4.35948140939203e-36	7.91245875804653e-33	6.68300851846799e-33	NOP58/NOL11/RPL7A/SBDS/NOP56/RPF2/DDX47/NOP16/FBL/DDX21/MRPS7/PWP1/ISG20L2/LSM6/RPL7L1/NOLC1/GTPBP4/NHP2/DDX17/C1QBP/DDX10/XRN2/NSA2/EIF4A3/RIOK1/DDX3X/EBNA1BP2/C1D/PAK1IP1/IMP3/RIOK3/UTP18/NIP7/EXOSC3/MRPL20/IMP4/RPL14/EXOSC7/ZNF622/MRPS11/WDR43/EIF6/RPS24/RPP25/RPL6/EXOSC5/CEBPZ/EXOSC9/RPF1/RPL10L/DDX49/MRPS9/TRMT112/RSL1D1/RRN3/XPO1/RPL7/RPS7/MPHOSPH6/POP5/UTP3/NPM3/RSL24D1/ZNHIT3/ZNF593/KRR1/POP7/MRTO4/RPS6/UTP14A/GLUL/PA2G4/GTF3A/RPS8/LTV1/RPL10/WDR18/RPL12/RPS21/EMG1/RPL26L1/RPL38/RPS27L/DCAF13/RPL23A/TFB2M/RPL26/RPL10A/NOP10/MRPL36/MRPL11/MPHOSPH10/MRPL1/EXOSC8/MRPS2/RPS19	96
GO:0008380	BP	GO:0008380	RNA splicing	120/1417	469/18670	1.02798676122236e-33	1.39934697871394e-30	1.18191425257381e-30	ESRP1/TRA2B/IWS1/SRSF2/RBMX2/SRSF3/SF3B5/HNRNPA1L2/BCAS2/DDX5/SF3A3/DDX47/SNW1/CDC5L/LSM1/NONO/BUD31/SRPK1/HNRNPM/HNRNPA2B1/LSM6/SNRPC/SNRNP25/SNRPB/DDX17/C1QBP/SAP18/GEMIN5/LSM2/HNRNPK/EIF4A3/SRSF11/PCBP1/POLR2L/SRSF7/DHX15/HNRNPC/PSIP1/ISY1/BRDT/SCNM1/RBM7/HSPA1A/SFPQ/CWC25/SNRPA1/RBM3/HNRNPU/LSM10/YBX1/HSPA8/HNRNPA3/ZMAT5/THOC2/THOC7/SLU7/IK/FIP1L1/RNPS1/SYF2/PRPF38A/RBMX/HNRNPR/PPP2CA/U2AF1/RBM8A/DBR1/HNRNPD/PLRG1/MAGOH/PPIL1/PRPF40A/NCBP2/CIR1/SRSF10/POLR2C/POLR2F/HNRNPH1/PNN/PUF60/CLP1/POLR2G/RBM4/SNRPD2/ZNF830/POLR2E/SRSF9/CCAR1/PRDX6/CLK1/MFAP1/PAPOLA/RBM25/CWC15/PRPF31/SNRPD3/CTNNBL1/PPIG/PTBP1/CLK4/SNRPF/PABPC1/RNF113A/SF1/C9orf78/ZCRB1/TGS1/C2orf49/POLR2H/SNUPN/SRSF8/RPS26/STRAP/LSM4/PHF5A/GTF2F2/SRSF1/MPHOSPH10/KHDRBS3/ZMAT2	120
GO:0006413	BP	GO:0006413	translational initiation	71/1417	193/18670	5.74054371002534e-31	6.2514521002176e-28	5.28009167770542e-28	RPL7A/EIF4A1/PAIP1/YTHDF2/EIF5/EIF4A2/EIF3I/RPL17/EIF4G1/EIF4E2/DDX3X/RPL34/RPL21/RPS4Y1/EIF3D/RPL14/HSPB1/EIF3B/RPL4/EIF6/RPS24/EIF3J/NCBP2/RPL19/RPL6/EIF1AD/RPL13/ATF4/POLR2G/RBM4/EIF3A/EIF3G/RPS20/RPL7/RPL22/PPP1CA/RPS7/EIF3K/RPL9/CDC123/RPS3/RPS6/PABPC1/EIF2S2/RPS8/RPS3A/EIF3M/RPL10/RPL36A/EIF3F/RPL29/RPL12/RPS21/RPL18A/RPS26/RPL15/RPL36/MTIF2/RPL18/RPL38/RPL23A/RPL26/RPL10A/EIF4G2/RPL37A/EIF3C/EIF1/EIF3E/RPL30/PPP1R15A/RPS19	71

Find DEG based on Seurat prediction (LTMG as input matrix).

Find DEGs.

This function will ask you to provide an index of a cluster label, select a target cluster (group 1), and select a comparing cluster (group 2). Here we select "Seurat0.8" (result based on inputing LTMG signal matrix) cell clusters (index is 6), choose the 1th cluster (index is 2, as group 1), and compare to the rest of all clusters (index is 4, as group 2). The DEGs' result can be found in object@LTMG@MarkerGene .

object <- IRISFGM::FindMarkers(object)
select condition to compare
1 : ncount_RNA
2 : nFeature
3 : Cluster
4 : MC_Label
5 : Seurat0.8
6 : Seurat0.8

select index of cell condition: 6
select index of cell condition: 14
select index (left) of first group to compare : 
1 : 0
2 : 1
3 : 2

input first group index : 2
select index (left) of second group to compare : 
1 : 0
3 : 2
4 : rest of all

select index of group 2: 4

# DEGs based on MCL predicted label was store in object@LTMG@MarkerGene
object@LTMG@MarkerGene[1:5,]

	LFC	pval	pvalue.adj.FDR
PAIP1	2.72862490853613	0	0
ARG2	2.50563425970776	0	0
TPRXL	4.14463501839634	0	0
INTS8	2.15470215492115	0	0
RNF38	2.41515880455291	0	0

Functional enrichment analysis.

The result can be found at object@LTMG@Pathway.

object <- RunPathway(object, species = "Human", database = "GO", genes.source = "CTS")
object@LTMG@Pathway[1:5,]

ONTOLOGY	ID	Description	GeneRatio	BgRatio	pvalue	p.adjust	qvalue	geneID	Count
GO:0006402	BP	GO:0006402	mRNA catabolic process	134/1549	364/18670	3.15782308102538e-53	1.7298554837857e-49	1.44794498325753e-49	PAIP1/CSDE1/ETF1/ALKBH5/PSMA6/PSMD8/PSMB1/SERBP1/EIF4G1/HNRNPM/CNOT7/RPL17/PATL2/RPL7A/PSMA5/PSMD12/HSPA1A/YWHAB/RPL6/RPL34/RPL21/HSPB1/LSM2/MAGOH/RPS7/HSPA1B/PSMD1/LSM1/PSMD2/RPL4/RBM46/EXOSC3/RPL19/PCID2/YBX1/RBM8A/PSMA2/IGF2BP3/EXOSC5/HNRNPC/RPL14/RPS4Y1/PSMC6/PSMA1/EXOSC7/RPL22/HNRNPD/PSMC3/PSMD11/XPO1/PABPC1/HSPA8/HNRNPR/POLR2G/RPL13/HNRNPU/EXOSC9/PSMA3/PSMB7/RPL5/PSMD9/YTHDF2/YWHAZ/RNPS1/EXOSC8/SET/RPL9/PSMA7/RPL7/DDX5/DCP1A/PSMB2/RPS24/MRTO4/RPS3A/RPS3/EIF3E/RPS20/APEX1/EIF4A3/LSM4/RPL18/UBB/RPS8/RPL23/RPS21/NCBP2/LSM3/RPL36A/PSMB5/RPL36/RPL37A/PSMB3/PSMC2/RPL38/PSMD14/PSMB6/RPL15/RPL10/RPL29/RPL30/RPL18A/PPP2CA/PRPF18/LSM6/LSM7/RPS6/UBC/PSMD6/RPL26/RPS28/RPS11/RPL10A/PNRC2/NPM1/RPS9/PSMC4/RPLP1/RPL11/RPLP2/RPS29/RPL23A/RPL24/RPL3/RPLP0/SSB/PSMC1/RPSA/RPS19/PABPC4/PSMD7/RPS27/RPS27A/RPL35	134
GO:0006401	BP	GO:0006401	RNA catabolic process	138/1549	397/18670	2.47116937223922e-51	6.76853291056321e-48	5.66548093972317e-48	PAIP1/CSDE1/ETF1/ALKBH5/PSMA6/PSMD8/PSMB1/SERBP1/XRN2/EIF4G1/HNRNPM/CNOT7/RPL17/PATL2/RPL7A/PSMA5/PSMD12/HSPA1A/YWHAB/RPL6/RPL34/RPL21/HSPB1/LSM2/MAGOH/RPS7/HSPA1B/PSMD1/LSM1/PSMD2/RPL4/RBM46/EXOSC3/RPL19/PCID2/YBX1/RBM8A/PSMA2/DIS3L/IGF2BP3/EXOSC5/HNRNPC/RPL14/LIN28A/RPS4Y1/PSMC6/PSMA1/EXOSC7/RPL22/HNRNPD/PSMC3/PSMD11/XPO1/PABPC1/HSPA8/HNRNPR/POLR2G/RPL13/HNRNPU/EXOSC9/PSMA3/PSMB7/RPL5/PSMD9/YTHDF2/YWHAZ/RNPS1/EXOSC8/SET/RPL9/PSMA7/RPL7/DDX5/DCP1A/PSMB2/RPS24/MRTO4/RPS3A/RPS3/EIF3E/RPS20/APEX1/EIF4A3/LSM4/RPL18/UBB/RPS8/RPL23/RPS21/NCBP2/LSM3/RPL36A/PSMB5/RPL36/RPL37A/PSMB3/PSMC2/RPL38/PSMD14/PSMB6/RPL15/RPL10/RPL29/RPL30/RPL18A/PPP2CA/PRPF18/LSM6/LSM7/RPS6/UBC/PSMD6/RPL26/RPS28/RPS11/RPL10A/PNRC2/NPM1/RPS9/PSMC4/RPLP1/RPL11/RPLP2/RPS29/RPL23A/RPL24/RPL3/RPLP0/SSB/PSMC1/RPSA/RPS19/PABPC4/PSMD7/RPS27/RPS27A/RPL35/DKC1	138
GO:0006413	BP	GO:0006413	translational initiation	94/1549	193/18670	3.81465668554073e-50	6.96556310779738e-47	5.83040158674226e-47	PAIP1/EIF4A1/EIF4G1/EIF3B/RPL17/RPL7A/EIF3D/RPL6/RPL34/RPL21/HSPB1/EIF6/RPS7/EIF4A2/EIF3I/RPL4/RPL19/PPP1CA/RPL14/RPS4Y1/EIF3M/RPL22/EIF3J/PABPC1/ATF4/POLR2G/RPL13/EIF3A/EIF2S2/EIF3G/EIF4E2/EIF5B/RPL5/EIF3K/YTHDF2/EIF4EBP1/PPP1R15A/RPL9/EIF1AD/EIF1/RPL7/RPS24/RPS3A/MTIF2/RPS3/EIF4G2/EIF3E/EIF3F/RPS20/RPL18/RPS8/EIF1B/RPL23/RPS21/NCBP2/RPL36A/EIF4H/EIF5/RPL36/DDX3X/RPL37A/RPL38/EIF3C/RPL15/RPL10/RPL29/RPL30/RPL18A/EIF2S3/RPS6/RPL26/RPS28/EIF3H/RPS11/RPL10A/NPM1/RPS9/RPLP1/RPL11/DDX1/RPLP2/RPS29/RPL23A/RBM4/RPL24/RPL3/RPLP0/EIF1AX/RPSA/RPS19/RPS27/RPS27A/CDC123/RPL35	94
GO:0042254	BP	GO:0042254	ribosome biogenesis	111/1549	297/18670	6.79251406976065e-45	9.3023480185372e-42	7.78636612838878e-42	C1QBP/NSA2/RPL7L1/FBL/EBNA1BP2/PWP1/RIOK1/DDX21/XRN2/RPL7A/RPF2/MRPS7/DDX17/RPL6/UTP18/TRMT112/ISG20L2/NHP2/EIF6/RPS7/WDR43/NOP16/MRPL20/MRPS11/NOLC1/EXOSC3/DDX10/NIP7/SBDS/RIOK3/GTPBP4/EXOSC5/RPL14/NPM3/POP7/GLUL/NOP58/EXOSC7/DDX49/NOL11/GTF3A/DDX47/XPO1/RSL1D1/EMG1/WDR18/ZNF622/KRR1/IMP4/RPP25/EXOSC9/CEBPZ/C1D/NOP56/RPL5/PAK1IP1/IMP3/UTP3/MRPL1/RRS1/EXOSC8/MPHOSPH6/TFB2M/MRPS2/UTP14A/RPL7/RAN/MRPL36/RPS24/MRTO4/RRN3/LTV1/RPL10L/DCAF13/EIF4A3/MRPL10/RPS8/RPS21/ZNHIT3/RPL26L1/POP5/NGDN/MRPL11/DDX3X/RPS27L/LYAR/ZNF593/RPL38/RPF1/PA2G4/RPL10/LSM6/RPS6/RPL26/RPS28/NOB1/WBP11/BYSL/RPL10A/NPM1/RPS9/RPL11/RPL23A/RPL24/RPL3/RPLP0/RPSA/RPS19/RPS27/RPL35/DKC1	111
GO:0008380	BP	GO:0008380	RNA splicing	137/1549	469/18670	4.15036127638313e-41	4.54713581440535e-38	3.8060997305105e-38	GEMIN5/RBMX2/C1QBP/IWS1/SRPK1/ESRP1/TRA2B/PSIP1/HNRNPA3/HNRNPM/POLR2E/SRSF2/SF3A3/DDX17/SF3B5/SRSF3/CDC5L/HSPA1A/SNRPC/SRSF7/SNRPB/NONO/LSM2/MAGOH/THOC2/HNRNPK/LSM1/BUD31/PUF60/CIR1/POLR2L/PCBP1/SNRPA1/PAPOLA/RRAGC/HNRNPA2B1/YBX1/RBM8A/DHX15/SNW1/HNRNPC/CWC25/PTBP1/RBM3/PRPF40A/HNRNPD/SFPQ/DDX47/PABPC1/HSPA8/SLU7/BCAS2/FIP1L1/ZCRB1/HNRNPR/SRSF9/POLR2G/THOC7/HNRNPU/SRSF11/SNRPD2/SNRNP25/PPIL1/SCNM1/HNRNPA1L2/LSM10/POLR2F/RBM7/ISY1/SAP18/TGS1/SRSF10/RNPS1/POLR2H/ZNF830/ZMAT5/PRDX6/U2AF1/C2orf49/PRPF38A/DDX5/PQBP1/HNRNPH1/SYF2/POLR2C/CLP1/MFAP1/BRDT/TMBIM6/PRPF31/CWC15/CTNNBL1/RBM25/PNN/CLNS1A/RNF113A/EIF4A3/C9orf78/LSM4/SNRPF/CLK4/NCBP2/LSM3/CCAR1/UBL5/PLRG1/PPIG/SF1/IK/PHF5A/SNRNP27/PCBP2/GTF2F2/SRSF1/ZMAT2/SNRPE/SRSF8/PPP2CA/SRSF5/PRPF18/DBR1/PPIH/LSM6/LSM7/SNRPD3/WBP11/CLK1/LUC7L3/DDX1/POLR2K/RBM4/STRAP/RBMX/SNRPG/PRMT5/DNAJC8/HNRNPA1	137

carter-allen / irisfgm Goto Github PK

irisfgm's Introduction

IRIS-FGM

Abstract

Workflow

Environment

Usage

Pre-installation

Data preparation

Read in data.

Analysis data

Data preprocessing and LTMG modeling

Biclustering

Cell type identification based on MCL

Visualization cell clusters on UMAP

DEG analysis and functional enrichment analysis.

irisfgm's People

Contributors

Stargazers

Watchers

Forkers

irisfgm's Issues

Recommend Projects

Recommend Topics

Recommend Org