
cefcon's Introduction



CEFCON is a computational tool for deciphering driver regulators of cell fate decisions from single-cell RNA-seq data. It takes a prior gene interaction network and expression profiles from scRNA-seq data associated with a given developmental trajectory as inputs, and consists of three main components, including cell-lineage-specific gene regulatory network (GRN) construction, driver regulator identification and regulon-like gene module (RGM) identification.


About method

CEFCON first employs graph attention neural networks under a contrastive learning framework to construct reliable GRNs for specific developmental cell lineages (Fig. b). It then characterizes gene regulatory dynamics from the perspective of network control theory and identifies the driver regulators that steer cell fate decisions (Fig. c). Finally, CEFCON detects gene regulatory modules (i.e., RGMs) involving the identified driver regulators and measures their activities using AUCell (Fig. d).


CEFCON was originally tested on Ubuntu 20.04 with Python 3.9 to 3.10. We recommend running CEFCON on CUDA if possible. The following packages are required to run this code:


Optional (for performance evaluation, visualization and other analyses)

  • matplotlib(>=3.5.3)
  • matplotlib-venn(>=0.11.7)
  • seaborn(>=0.12.1)
  • palantir(==1.0.1)
  • rpy2(>=3.4.1)
  • R(>=4.0)
    • PRROC (R package)
    • slingshot (R package)
    • MAST (R package)

Setup a conda environment

conda create -y --name CEFCON python=3.10
conda activate CEFCON
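
After activating the environment, it can be worth confirming that the interpreter matches the Python versions listed above as tested. The following is a minimal sketch; the helper name is_tested_python is hypothetical and not part of CEFCON.

```python
import sys

# Versions the README reports as tested (3.9 and 3.10).
TESTED_MINORS = {(3, 9), (3, 10)}


def is_tested_python(version=None):
    """Return True if the (major, minor) pair is one the README lists as tested."""
    if version is None:
        version = sys.version_info[:2]
    return tuple(version[:2]) in TESTED_MINORS


major, minor = sys.version_info[:2]
status = "tested" if is_tested_python() else "untested"
print(f"Python {major}.{minor} is {status} for CEFCON")
```

An "untested" result does not necessarily mean CEFCON will fail, only that the version falls outside the range reported here.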

Install R and the required packages

conda install -y -c conda-forge r
R --no-save -q < ./r_env.R

Install using pip

pip install git+


We recommend using GUROBI to solve the integer linear programming (ILP) problem when identifying driver genes. GUROBI is a commercial solver that requires a license to run. It provides free licenses for academic use, as well as trial licenses outside academia. Once you have a license, install the gurobipy package.

If difficulties arise while using GUROBI, the non-commercial solver SCIP is used as an alternative. However, SCIP is not guaranteed to find a solution.
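
To see which solver packages are importable in the current environment, a small check like the one below may help. It is only a sketch: the helper available_ilp_solvers is hypothetical, and the assumption that SCIP is reached through the pyscipopt package is ours, not something stated above. It also cannot tell whether a GUROBI license is valid.

```python
import importlib.util


def available_ilp_solvers(candidates=("gurobipy", "pyscipopt")):
    """Return the candidate solver packages that can be imported here."""
    return [name for name in candidates
            if importlib.util.find_spec(name) is not None]


found = available_ilp_solvers()
print("Importable solver packages:", found or "none")
```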

Using GPU

We recommend using GPU. If you choose to do so, you will need to install the GPU version of PyTorch.
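
Before starting a long run, you can check whether the installed PyTorch build actually sees a CUDA device. This is a minimal sketch using torch.cuda.is_available(); the wrapper cuda_is_usable is a hypothetical name, and it simply reports False when PyTorch is not installed.

```python
import importlib.util


def cuda_is_usable():
    """Return True only if PyTorch is installed and reports a usable CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch
    return bool(torch.cuda.is_available())


print("CUDA usable:", cuda_is_usable())
```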

Usage example

Command line usage

cefcon [-h] --input_expData PATH --input_priorNet PATH [--input_genesDE PATH] \
           [--additional_edges_pct ADDITIONAL_EDGES_PCT] [--cuda CUDA] [--seed SEED] \
           [--hidden_dim HIDDEN_DIM] [--output_dim OUTPUT_DIM] [--heads HEADS] [--attention {COS,AD,SD}] \
           [--miu MIU] [--epochs EPOCHS] [--repeats REPEATS] [--edge_threshold_param EDGE_THRESHOLD_PARAM] \
           [--remove_self_loops] [--topK_drivers TOPK_DRIVERS] --out_dir OUT_DIR

Use -h to view information on the parameters.
Please run the bash file for a usage example.
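
If you prefer to launch the command from a script, the argument list can be assembled programmatically. The sketch below only builds and prints the command using flags from the usage string above; the helper build_cefcon_command and the file names expr.csv, prior_net.csv and ./output are placeholders chosen for illustration.

```python
import shlex


def build_cefcon_command(exp_data, prior_net, out_dir,
                         cuda=None, seed=None, repeats=None):
    """Assemble a cefcon argument list from flags in the usage string."""
    cmd = ["cefcon",
           "--input_expData", exp_data,
           "--input_priorNet", prior_net,
           "--out_dir", out_dir]
    # Optional flags are appended only when a value is supplied.
    for flag, value in (("--cuda", cuda), ("--seed", seed), ("--repeats", repeats)):
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


cmd = build_cefcon_command("expr.csv", "prior_net.csv", "./output", repeats=3)
print(shlex.join(cmd))
```

The resulting list could then be passed to subprocess.run(cmd, check=True) once the cefcon executable is on the PATH.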

Input data

  • scRNA-seq data: a '.csv' file in which rows represent cells and columns represent genes, or a '.h5ad' formatted file with AnnData objects.
  • Prior gene interaction network: an edgelist formatted network file.
     We provide prior gene interaction networks for human and mouse respectively, located in /prior_data.
  • Gene differential expression level: a '.csv' file containing the log fold change of each gene.

An example of input data (i.e., the hESC dataset with 1,000 highly variable genes) can be found in /example_data. All the input data mentioned in the paper can be downloaded from here.
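
As a quick sanity check on your own inputs, you can verify how many prior-network edges connect genes that are actually present in the expression matrix. The sketch below uses toy data and the standard library only; the gene names, the 'from'/'to' header names of the edgelist, and the idea of filtering edges this way are assumptions for illustration, not a description of what CEFCON does internally.

```python
import csv
import io

# Toy expression matrix: first column is the cell ID, remaining columns are genes.
expr_csv = io.StringIO(
    "cell,GATA1,TAL1,KLF1\n"
    "cell_1,2.0,0.0,1.5\n"
    "cell_2,0.5,3.1,0.0\n"
)
# Toy prior network as an edgelist; the 'from'/'to' header names are an assumption.
edges_csv = io.StringIO(
    "from,to\n"
    "GATA1,KLF1\n"
    "TAL1,GATA1\n"
    "SPI1,GATA1\n"
)

# Gene names come from the header row, skipping the cell ID column.
genes = set(next(csv.reader(expr_csv))[1:])
edges = [(row["from"], row["to"]) for row in csv.DictReader(edges_csv)]

# Keep only edges whose two endpoints are both measured in the expression data.
usable = [(src, dst) for src, dst in edges if src in genes and dst in genes]
print(f"{len(usable)} of {len(edges)} prior edges have both genes in the expression matrix")
```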

The output results can be found in the folder ${OUT_DIR}/:

- "cell_lineage_GRN.csv": the constructed cell-lineage-specific GRN;
- "gene_embs.csv": the gene embeddings;
- "driver_regulators.csv": a list of identified driver regulators with their influence scores;
- "RGMs.csv": a list of obtained RGMs;
- "AUCell_mtx.csv": the AUCell activity matrix of the obtained RGMs.
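
After a run, a short script can confirm that all five result files were written. This sketch assumes the files sit directly under the output folder with exactly the names listed above; the helper missing_outputs is hypothetical, and the demonstration uses a temporary directory rather than a real CEFCON run.

```python
from pathlib import Path
import tempfile

EXPECTED_OUTPUTS = [
    "cell_lineage_GRN.csv",
    "gene_embs.csv",
    "driver_regulators.csv",
    "RGMs.csv",
    "AUCell_mtx.csv",
]


def missing_outputs(out_dir):
    """Return the expected result files that are not present in out_dir."""
    out_dir = Path(out_dir)
    return [name for name in EXPECTED_OUTPUTS if not (out_dir / name).is_file()]


# Demonstration on a temporary directory holding only one of the five files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "gene_embs.csv").write_text("placeholder\n")
    demo_missing = missing_outputs(tmp)

print("Missing in demo directory:", demo_missing)
```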

Package usage

Quick start by an example (Jupyter Notebook).
Please check this Notebook for scRNA-seq preprocessing.

import cefcon as cf

# We assume you have an AnnData object containing scRNA-seq data, cell lineages information,
# and gene differential expression levels (optional).
# We also assume you have a pandas dataframe containing the prior gene interaction network
# in edgelist format.

# Data preparation
data = cf.data_preparation(adata, prior_network)

for lineage, data_li in data.items():
    # Construct cell-lineage-specific GRN
    cefcon_GRN_model = cf.NetModel(epochs=350, repeats=3, cuda='0')
    cefcon_results = cefcon_GRN_model.get_cefcon_results(edge_threshold_avgDegree=8)
    # Identify driver regulators

    # Identify regulon-like gene modules

Please check this Notebook for results visualization and analyses.


Please cite the following paper, if you find the repository or the paper useful.

Peizhuo Wang, Xiao Wen, Han Li, Peng Lang, Shuya Li, Yipin Lei, Hantao Shu, Lin Gao, Dan Zhao and Jianyang Zeng, Deciphering driver regulators of cell fate decisions from single-cell transcriptomics data with CEFCON, Nat Commun, 14, 8459 (2023).

  title={Deciphering driver regulators of cell fate decisions from single-cell transcriptomics data with CEFCON},
  author={Wang, Peizhuo and Wen, Xiao and Li, Han and Lang, Peng and Li, Shuya and Lei, Yipin and Shu, Hantao and Gao, Lin and Zhao, Dan and Zeng, Jianyang},
  journal={Nature Communications},

Bugs & Suggestions

Please contact [email protected] or raise an issue in the github repo with any questions.

cefcon's Issues

CUDA out of memory

Hi, great work!
When I run

prior_network = pd.read_csv('./network_mouse.csv')
data = cf.data_preparation(adata, prior_network)
[0] - Data loading and preprocessing...
Consider the input data with 1 lineages:
  Lineage - all:
    337 extra edges (Spearman correlation > 0.6) are added into the prior gene interaction network.
    Total number of edges: 3537148.
    n_genes × n_cells = 12335 × 1822

CUDA = '0'
cefcon_results_dict = {}
for li, data_li in data.items():
    # We suggest setting up multiple repeats to minimize the randomness of the computation.
    cefcon_GRN_model = cf.NetModel(epochs=350, repeats=3, seed=-1,cuda=CUDA)

    cefcon_results = cefcon_GRN_model.get_cefcon_results(edge_threshold_avgDegree=8)
    cefcon_results_dict[li] = cefcon_results

I get an error like this:

[1] - Constructing cell-lineage-specific GRN...
Lineage - all: 
Warning: Auxiliary gene scores (e.g., differential expression level) are not considered!
0%|                                                                       | 0/350 [00:00<?, ?it/s]
OutOfMemoryError                          Traceback (most recent call last)
File <timed exec>:5

File ~/run/miniconda3/envs/CEFCON/lib/python3.10/site-packages/cefcon/, in, adata, showProgressBar)
 368 with trange(self.epochs, ncols=100) as t:
 369     for epoch in t:
--> 370         loss = self.__train(data, DGI_model, optimizer)
 371         t.set_description('  Iter: {}/{}'.format(rep + 1, self.repeats))
 372         if epoch < self.epochs - 1:

File ~/run/miniconda3/envs/CEFCON/lib/python3.10/site-packages/cefcon/, in NetModel.__train(data, model, optimizer)
 324 model.train()
 325 optimizer.zero_grad()
--> 326 pos_z, neg_z, summary = model(data)
 327 loss = model.loss(pos_z, neg_z, summary)
 328 loss.backward()

File ~/run/miniconda3/envs/CEFCON/lib/python3.10/site-packages/torch/nn/modules/, in Module._call_impl(self, *input, **kwargs)
1190 # If we don't have any hooks, we want to skip the rest of the logic in
1191 # this function, and just call forward.
1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1193         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194     return forward_call(*input, **kwargs)
1195 # Do not call functions when jit is used
1196 full_backward_hooks, non_full_backward_hooks = [], []

File ~/run/miniconda3/envs/CEFCON/lib/python3.10/site-packages/torch_geometric/nn/models/, in DeepGraphInfomax.forward(self, *args, **kwargs)
  49 def forward(self, *args, **kwargs) -> Tuple[Tensor, Tensor, Tensor]:
  50     """Returns the latent space for the input arguments, their
  51     corruptions and their summary representation."""
---> 52     pos_z = self.encoder(*args, **kwargs)
  54     cor = self.corruption(*args, **kwargs)
  55     cor = cor if isinstance(cor, tuple) else (cor, )

File ~/run/miniconda3/envs/CEFCON/lib/python3.10/site-packages/torch/nn/modules/, in Module._call_impl(self, *input, **kwargs)
1190 # If we don't have any hooks, we want to skip the rest of the logic in
1191 # this function, and just call forward.
1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1193         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194     return forward_call(*input, **kwargs)
1195 # Do not call functions when jit is used
1196 full_backward_hooks, non_full_backward_hooks = [], []

File ~/run/miniconda3/envs/CEFCON/lib/python3.10/site-packages/cefcon/, in GRN_Encoder.forward(self, data)
 220 for norm, attn_in, attn_out, ffn in self.layers:
 221     x = norm(x)
--> 222     x_in, att_weights_in_ = attn_in(x, edge_index, x_auxiliary, return_attention_weights=True)
 223     x_out, att_weights_out_ = attn_out(x, edge_index, x_auxiliary, return_attention_weights=True)
 224     x = ffn(, self.act(x_out)), 1))

File ~/run/miniconda3/envs/CEFCON/lib/python3.10/site-packages/torch/nn/modules/, in Module._call_impl(self, *input, **kwargs)
1190 # If we don't have any hooks, we want to skip the rest of the logic in
1191 # this function, and just call forward.
1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1193         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194     return forward_call(*input, **kwargs)
1195 # Do not call functions when jit is used
1196 full_backward_hooks, non_full_backward_hooks = [], []

File ~/run/miniconda3/envs/CEFCON/lib/python3.10/site-packages/cefcon/, in GraphAttention_layer.forward(self, x, edge_index, x_auxiliary, return_attention_weights)
 109     x_norm_l = F.normalize(x_l, p=2., dim=-1)
 110     x_norm_r = F.normalize(x_r, p=2., dim=-1)
--> 111     out = self.propagate(edge_index, x=(x_l, x_r), x_norm=(x_norm_l, x_norm_r),
 112                          x_auxiliary=x_auxiliary, size=None)
 113 else:  # SD
 114     out = self.propagate(edge_index, x=(x_l, x_r), x_norm=None,
 115                          x_auxiliary=x_auxiliary, size=None)

File ~/run/miniconda3/envs/CEFCON/lib/python3.10/site-packages/torch_geometric/nn/conv/, in MessagePassing.propagate(self, edge_index, size, **kwargs)
 452     for arg in decomp_args:
 453         kwargs[arg] = decomp_kwargs[arg][i]
--> 455 coll_dict = self._collect(self._user_args, edge_index, size,
 456                           kwargs)
 458 msg_kwargs = self.inspector.distribute('message', coll_dict)
 459 for hook in self._message_forward_pre_hooks.values():

File ~/run/miniconda3/envs/CEFCON/lib/python3.10/site-packages/torch_geometric/nn/conv/, in MessagePassing._collect(self, args, edge_index, size, kwargs)
 327         if isinstance(data, Tensor):
 328             self._set_size(size, dim, data)
--> 329             data = self._lift(data, edge_index, dim)
 331         out[arg] = data
 333 if is_torch_sparse_tensor(edge_index):

File ~/run/miniconda3/envs/CEFCON/lib/python3.10/site-packages/torch_geometric/nn/conv/, in MessagePassing._lift(self, src, edge_index, dim)
 269     raise IndexError(
 270         f"Encountered an index error. Please ensure that all "
 271         f"indices in 'edge_index' point to valid indices in "
 272         f"the interval [0, {src.size(self.node_dim) - 1}] "
 273         f"(got interval "
 274         f"[{int(index.min())}, {int(index.max())}])")
 275 else:
--> 276     raise e
 278 if index.numel() > 0 and index.min() < 0:
 279     raise ValueError(
 280         f"Found negative indices in 'edge_index' (got "
 281         f"{index.min().item()}). Please ensure that all "
 282         f"indices in 'edge_index' point to valid indices "
 283         f"in the interval [0, {src.size(self.node_dim)}) in "
 284         f"your node feature matrix and try again.")

File ~/run/miniconda3/envs/CEFCON/lib/python3.10/site-packages/torch_geometric/nn/conv/, in MessagePassing._lift(self, src, edge_index, dim)
 264 try:
 265     index = edge_index[dim]
--> 266     return src.index_select(self.node_dim, index)
 267 except (IndexError, RuntimeError) as e:
 268     if index.min() < 0 or index.max() >= src.size(self.node_dim):

OutOfMemoryError: CUDA out of memory. Tried to allocate 6.77 GiB (GPU 0; 23.65 GiB total capacity; 20.62 GiB already allocated; 2.49 GiB free; 20.64 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
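
For a rough sense of scale, the failing call (index_select inside _lift) builds a tensor with one row per edge, so its size grows linearly with the number of edges. The back-of-the-envelope sketch below uses the edge count from the log above; the per-edge width of 512 float32 values is purely an assumption for illustration (for example 4 heads × 128 units), not a value taken from CEFCON.

```python
def lifted_tensor_gib(n_edges, floats_per_edge, bytes_per_float=4):
    """Size in GiB of a dense float tensor with one row per edge."""
    return n_edges * floats_per_edge * bytes_per_float / 1024**3


n_edges = 3_537_148      # "Total number of edges" from the log above
floats_per_edge = 512    # assumption for illustration, e.g. 4 heads x 128 units
estimate = lifted_tensor_gib(n_edges, floats_per_edge)
print(f"~{estimate:.2f} GiB for one per-edge tensor")
```

Under these assumptions a single per-edge tensor is of the same order as the 6.77 GiB allocation in the error message, which suggests that the size of the prior network, rather than the number of cells, dominates memory use here.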

How to save results from jupyter?

Dear author, thanks for your work!
However, I have run into some questions while using it. Would you mind providing some help?

  1. How can I save a result figure as a PDF or SVG file, such as the example figures in the Jupyter notebook?

  2. How can I save detailed results from the Jupyter notebook? For example, what should I do if I want to check the details of every regulon-like gene module?

Thanks a lot for your help!

Best regards,

How to run without GPU

Hi, glad to post first post!

CUDA = '0'
cefcon_results_dict = {}
for li, data_li in data.items():
    # We suggest setting up multiple repeats to minimize the randomness of the computation.
    cefcon_GRN_model = cf.NetModel(epochs=350, repeats=3, cuda=CUDA, seed=-1)

    cefcon_results = cefcon_GRN_model.get_cefcon_results(edge_threshold_avgDegree=8)
    cefcon_results_dict[li] = cefcon_results

I get this error:

RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from

How to run without GPU?

Request to Share Data Used in the Paper

Dear author, I would like to express my appreciation for the remarkable achievement of the CEFCON project.

I have meticulously searched the article, Supplementary Information (SI), source data, GitHub Repository, and Zenodo Repository. However, I have been unable to locate the driver list results for CEFCON or alternative methods of the mESC dataset.

Would you be able to upload the result file to this platform?

Thank you for your attention to this matter.
Best regards

Installing error: no matching torch version

Hi @WPZgithub ,
When I try to install CEFCON locally on my PC (Windows), I encounter the following error:

#install from downloading github files locally
pip install F:\Project\癌细胞可塑性\※※scRNA-seq_analysis\00.1imitation_melanoma_cellline\07.regulatory_network\


ERROR: Could not find a version that satisfies the requirement torch<2.0,>=1.13.0 (from cefcon) (from versions: 2.0.0, 2.0.1, 2.1.0, 2.1.1, 2.1.2, 2.2.0, 2.2.1, 2.2.2)
ERROR: No matching distribution found for torch<2.0,>=1.13.0

How can I fix this? I just plan to try running the example code locally and then move to a large dataset on a distributed server with multiple threads.

Lineage information input in the command line usage of CEFCON

Hi, @WPZgithub
I have a question about how to supply lineage information in the command line usage of CEFCON. As I'm not familiar with Python, I prefer to use the command line tool. However, when I pass an expression matrix as a csv file via input_expData, there is no place to provide the lineage information, and I found no other argument for it.

I guess the information is expected to be included in a single-cell object such as the AnnData object used by the Python package SCANPY. But since CEFCON offers the option to input a csv file, how can I provide the lineage information together with the csv file to construct the lineage-specific GRN?

Many thanks in advance for an early reply!
