perturbnet's Introduction

PerturbNet

PerturbNet is a deep generative model that can predict the distribution of cell states induced by chemical or genetic perturbation. This repository contains the code for the preprint "PerturbNet predicts single-cell responses to unseen chemical and genetic perturbations".

System Requirements and Installation

PerturbNet works on Linux, Mac, or Windows. The key system requirements are Python (>3.7) and PyTorch (>1.7). TensorFlow is required for some functionality. To install the package, simply install PyTorch (and TensorFlow if needed), then clone the repository. Expected installation time is about 10 minutes.

Some related module versions are:

(1) Python: python3.8-anaconda/2020.07
(2) numpy: 1.18.5
(3) pandas: 1.0.5
(4) scanpy: 1.8.1
(5) tensorflow: 1.14.0
(6) matplotlib: 3.2.2
(7) scvi-tools: 0.7.1
(8) torch: 1.10.0
(9) umap-learn: 0.4.6
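
As a quick sanity check before running anything, the stated requirements (Python >3.7 and PyTorch >1.7) can be verified from within Python. The following is a minimal sketch, not part of the PerturbNet repository: the helper names `parse_version`, `is_newer`, and `check_environment` are hypothetical, and it assumes that ">1.7" means strictly newer than 1.7.0.

```python
import re
import sys
from importlib import metadata  # standard library, available from Python 3.8


def parse_version(text, width=3):
    """Turn a string such as '1.10.0+cu113' into (1, 10, 0).

    Only the leading digits of each dot-separated part are used, and the
    result is padded with zeros to `width` entries.
    """
    numbers = []
    for part in text.split("."):
        match = re.match(r"\d+", part)
        if match is None:
            break
        numbers.append(int(match.group()))
    numbers = numbers[:width]
    return tuple(numbers + [0] * (width - len(numbers)))


def is_newer(installed, minimum):
    """Assumption: '>X' in the requirements means strictly newer than X.0."""
    return parse_version(installed) > parse_version(minimum)


def check_environment():
    """Return {name: (installed_version_or_None, requirement_satisfied)}."""
    python_version = ".".join(str(n) for n in sys.version_info[:3])
    report = {"python": (python_version, is_newer(python_version, "3.7"))}
    try:
        torch_version = metadata.version("torch")
        report["torch"] = (torch_version, is_newer(torch_version, "1.7"))
    except metadata.PackageNotFoundError:
        # PyTorch is not installed in this environment.
        report["torch"] = (None, False)
    return report


if __name__ == "__main__":
    for name, (version, ok) in check_environment().items():
        status = "OK" if ok else "does not satisfy the stated requirement"
        print(f"{name}: {version} -> {status}")
```

The check reads installed package metadata rather than importing `torch`, so it also runs in an environment where PyTorch is missing and simply reports that.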

Repository Structure and Usage

./net2net contains the conditional invertible neural network (cINN) modules from the GitHub repository of "Network-to-Network Translation with Conditional Invertible Neural Networks".

./perturbnet contains the code to train the PerturbNet framework. We provide illustrations and guidance on how to use our repository for PerturbNet.

./pytorch_scvi contains our adapted modules to decode latent representations to expression profiles based on scVI version 0.7.1.

Demo and Instructions for Usage

We have provided example datasets on Dropbox and a Jupyter notebook showing how to run PerturbNet on the example dataset in ./examples.

Reference

Please consider citing:

@article {Yu2022.07.20.500854,
	author = {Yu, Hengshi and Welch, Joshua D},
	title = {PerturbNet predicts single-cell responses to unseen chemical and genetic perturbations},
	elocation-id = {2022.07.20.500854},
	year = {2022},
	doi = {10.1101/2022.07.20.500854},
	publisher = {Cold Spring Harbor Laboratory},
	URL = {https://www.biorxiv.org/content/early/2022/07/22/2022.07.20.500854},
	eprint = {https://www.biorxiv.org/content/early/2022/07/22/2022.07.20.500854.full.pdf},
	journal = {bioRxiv}
}

We appreciate your interest in our work.

perturbnet's People

Contributors

hengshiyu, jw156605

perturbnet's Issues

How are the onehot files generated?

Thanks for the nice work! We find this interesting. While trying to follow the models, we ran into some questions about the one-hot files.

One file is loaded as data_gi_onehot_all. Its dimensions are cells x genes. I am not sure how this matrix is generated. Does a 1 mean that the specific gene is perturbed in that cell?

The other file is loaded as data_gi_onehot. Its dimensions are number of perturbed genes x number of all genes. Here too, I find that even when only 2 genes are perturbed, a row can contain more than two 1s, which suggests that this matrix does not simply encode the relationship between the perturbed genes and all the genes used. So, how should I generate this matrix?
Cheers,
Yue
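
The observation in this question (rows with more than two 1s) can be reproduced by counting the nonzero entries per row. The sketch below uses a small made-up matrix as a stand-in for data_gi_onehot; the values and the names `toy_onehot`, `ones_per_row`, and `rows_over_two` are hypothetical and say nothing about how the real file is built.

```python
import numpy as np

# Hypothetical toy stand-in for data_gi_onehot (rows x genes); made-up values.
toy_onehot = np.array([
    [1, 1, 0, 0, 0],
    [1, 0, 1, 1, 0],
    [0, 0, 0, 0, 1],
])

# Number of 1s in each row.
ones_per_row = toy_onehot.sum(axis=1)

# Indices of rows that contain more than two 1s.
rows_over_two = np.flatnonzero(ones_per_row > 2)

print("ones per row:", ones_per_row.tolist())
print("rows with more than two 1s:", rows_over_two.tolist())
```

Running the same two lines on the real matrix would show how many rows are affected, which may help when describing the problem to the maintainers.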

RAM usage exceeds capacity when loading sciplex chemical dataset.

When attempting to load the SCIPLEX chemical dataset using the perturbnet_sciplex_example_notebook.ipynb file, my system's RAM becomes fully utilized and the process is killed. My system has 64 GB of RAM, but it appears that loading this dataset exceeds its capacity.

This issue occurs during the execution of the notebook, specifically when loading the SCIPLEX chemical dataset. The process cannot complete because of excessive memory consumption. It is killed while running the following code:

# (2) load models

# generation scvi
adata_train = adata[idx_to_train, :].copy()
adata_train = adata_train[kept_indices, :].copy()

scvi.data.setup_anndata(adata_train, layer = "counts")
scvi_model_cinn = scvi.model.SCVI.load(path_scvi_model_cinn, adata_train, use_cuda = False)
scvi_model_de = scvi_predictive_z(scvi_model_cinn)

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# ChemicalVAE
model_chemvae = ChemicalVAE(n_char = data_chem_onehot.shape[2], max_len = data_chem_onehot.shape[1]).to(device)
model_chemvae.load_state_dict(torch.load(path_chemvae_model, map_location = device))
model_chemvae.eval()

I would like to request assistance in understanding the system requirements for running PerturbNet and resolving this issue to successfully load the SCIPLEX chemical dataset without exhausting the available RAM.
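
One way to narrow down which object is responsible is to estimate how much memory a dense array needs from its shape and dtype alone. The following is a rough sketch, not a diagnosis of this issue: the helper `dense_nbytes` and the shape used are hypothetical, and the real dimensions of data_chem_onehot are not stated here.

```python
import numpy as np


def dense_nbytes(shape, dtype):
    """Bytes needed to hold a dense array with this shape and dtype."""
    n_elements = int(np.prod(shape, dtype=np.int64))
    return n_elements * np.dtype(dtype).itemsize


# Hypothetical 3-D one-hot shape (items, max_len, n_char); not the real one.
hypothetical_shape = (100_000, 120, 35)

for dtype in (np.float64, np.float32, np.uint8):
    gib = dense_nbytes(hypothetical_shape, dtype) / 2**30
    print(f"{np.dtype(dtype).name}: {gib:.2f} GiB")
```

For an array that is already loaded, its `nbytes` attribute reports the same quantity directly. Each `.copy()` of a dense array holds a second full copy in memory for as long as the original remains referenced.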
