welch-lab / perturbnet

PerturbNet is a deep generative model that can predict the distribution of cell states induced by chemical or genetic perturbation

License: GNU General Public License v3.0

Python 100.00%

perturbnet's Introduction

PerturbNet

PerturbNet is a deep generative model that can predict the distribution of cell states induced by chemical or genetic perturbation. This repository contains the code for the preprint "PerturbNet predicts single-cell responses to unseen chemical and genetic perturbations".

System Requirements and Installation

PerturbNet runs on Linux, macOS, or Windows. The key system requirements are Python (>3.7) and PyTorch (>1.7); TensorFlow is required for some functionality. To install the package, first install PyTorch (and TensorFlow if needed), then clone the repository. Installation is expected to take about 10 minutes.

Some related module versions are:

(1) Python: python3.8-anaconda/2020.07
(2) numpy: 1.18.5
(3) pandas: 1.0.5
(4) scanpy: 1.8.1
(5) tensorflow: 1.14.0
(6) matplotlib: 3.2.2
(7) scvi-tools: 0.7.1
(8) torch: 1.10.0
(9) umap-learn: 0.4.6
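
As a convenience, the listed versions can be compared against the current environment. The following is a minimal sketch, not part of the repository: the helper name `compare_versions` is hypothetical, and the dictionary keys assume the usual PyPI distribution names for the packages listed above.

```python
from importlib import metadata

# Versions copied from the list above. The keys are assumed to be the
# PyPI distribution names; adjust them if your environment differs.
LISTED_VERSIONS = {
    "numpy": "1.18.5",
    "pandas": "1.0.5",
    "scanpy": "1.8.1",
    "tensorflow": "1.14.0",
    "matplotlib": "3.2.2",
    "scvi-tools": "0.7.1",
    "torch": "1.10.0",
    "umap-learn": "0.4.6",
}


def compare_versions(listed, lookup=metadata.version):
    """Return {package: (listed, installed_or_None)} for every mismatch.

    `lookup` maps a distribution name to its installed version string and
    raises PackageNotFoundError when the package is absent.
    """
    mismatches = {}
    for name, wanted in listed.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in compare_versions(LISTED_VERSIONS).items():
        print(f"{name}: listed {wanted}, installed {found or 'not installed'}")
```

A mismatch does not necessarily mean PerturbNet will fail; the list gives the versions the authors report, not strict pins.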

Repository Structure and Usage

./net2net contains the conditional invertible neural network (cINN) modules from the GitHub repository of Network-to-Network Translation with Conditional Invertible Neural Networks.

./perturbnet contains the code to train the PerturbNet framework, along with illustrations and guidance on how to use the repository.

./pytorch_scvi contains our adapted modules to decode latent representations to expression profiles based on scVI version 0.7.1.

Demo and Instructions for Usage

We have provided example datasets on Dropbox and a Jupyter notebook showing how to run PerturbNet on the example dataset in ./examples.

Reference

Please consider citing:

@article {Yu2022.07.20.500854,
	author = {Yu, Hengshi and Welch, Joshua D},
	title = {PerturbNet predicts single-cell responses to unseen chemical and genetic perturbations},
	elocation-id = {2022.07.20.500854},
	year = {2022},
	doi = {10.1101/2022.07.20.500854},
	publisher = {Cold Spring Harbor Laboratory},
	URL = {https://www.biorxiv.org/content/early/2022/07/22/2022.07.20.500854},
	eprint = {https://www.biorxiv.org/content/early/2022/07/22/2022.07.20.500854.full.pdf},
	journal = {bioRxiv}
}

We appreciate your interest in our work.

perturbnet's Issues

How are the onehot files generated?

Thanks for the nice work! We find this interesting, but while trying to follow the models we ran into two questions:

One file is loaded as data_gi_onehot_all. Its dimensions are cells x genes. How is this matrix generated? Does a 1 mean that the specific gene is perturbed in that cell?

The other file is loaded as data_gi_onehot. Its dimensions are number of perturbed genes x number of all genes. Here I find that even when only 2 genes are perturbed, a row can contain more than two 1s, which suggests that this matrix does not simply map perturbed genes to the full gene list. So how should I generate this matrix?
Cheers,
Yue
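
For reference, the encoding this question presupposes (one row per perturbation, with a 1 in the column of each perturbed gene) can be sketched as below. This is only an illustration of that assumption: the function name `multi_hot` and the gene names are hypothetical, and nothing here confirms how data_gi_onehot was actually produced. The observation that rows contain more 1s than perturbed genes suggests the repository's file uses a different encoding, which this sketch does not attempt to reproduce.

```python
def multi_hot(perturbations, gene_names):
    """Build an (n_perturbations x n_genes) 0/1 matrix as nested lists.

    perturbations: list of tuples, each holding the names of the genes
                   perturbed together in one condition.
    gene_names:    ordered list of all gene names (the columns).
    """
    column = {gene: j for j, gene in enumerate(gene_names)}
    rows = []
    for targets in perturbations:
        row = [0] * len(gene_names)
        for gene in targets:
            row[column[gene]] = 1  # mark each perturbed gene
        rows.append(row)
    return rows


# Hypothetical gene names, for illustration only.
genes = ["GENE_A", "GENE_B", "GENE_C"]
matrix = multi_hot([("GENE_A", "GENE_C"), ("GENE_B",)], genes)
```

Under this encoding every row sum equals the number of perturbed genes, so comparing row sums against the perturbation labels is a quick way to check whether a given file follows it.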

a question about the dataset

The dataset in the example folder (https://www.dropbox.com/sh/cl8e4dm6a5peyoi/AAC2Oj200ii34k77Q0XGD2Nia?dl=0) has been deleted. Could you upload a toy dataset somewhere else? Thanks!

RAM usage exceeds capacity when loading sciplex chemical dataset.

When attempting to load the SCIPLEX chemical dataset using the perturbnet_sciplex_example_notebook.ipynb file, my system's RAM becomes fully utilized and the process is killed. My system has 64 GB of RAM, but it appears that loading this dataset exceeds its capacity.

The issue occurs while the notebook is loading the SCIPLEX chemical dataset: memory consumption keeps growing until the process is killed. The failure happens in the following code:

# (2) load models

# generation scvi
adata_train = adata[idx_to_train, :].copy()
adata_train = adata_train[kept_indices, :].copy()

scvi.data.setup_anndata(adata_train, layer = "counts")
scvi_model_cinn = scvi.model.SCVI.load(path_scvi_model_cinn, adata_train, use_cuda = False)
scvi_model_de = scvi_predictive_z(scvi_model_cinn)

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# ChemicalVAE
model_chemvae = ChemicalVAE(n_char = data_chem_onehot.shape[2], max_len = data_chem_onehot.shape[1]).to(device)
model_chemvae.load_state_dict(torch.load(path_chemvae_model, map_location = device))
model_chemvae.eval()

I would like to request assistance in understanding the system requirements for running PerturbNet and resolving this issue to successfully load the SCIPLEX chemical dataset without exhausting the available RAM.
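
When diagnosing a problem like this, it can help to estimate how much memory each dense array needs before loading it. The snippet above indexes data_chem_onehot along three axes and calls .copy() twice on the AnnData object, and each copy allocates a new object in addition to the original. The following is a rough sketch only: the helper `dense_gib` is hypothetical, the shape in the example is made up, and the real array shapes and dtypes in the notebook are not stated in the issue.

```python
import math

# Bytes per element for a few common NumPy dtypes.
ITEMSIZE = {"bool": 1, "uint8": 1, "float32": 4, "float64": 8}


def dense_gib(shape, dtype="float32"):
    """Estimate the size in GiB of a dense array with this shape and dtype."""
    return math.prod(shape) * ITEMSIZE[dtype] / 2**30


# Made-up shape (n_items, max_len, n_char), for illustration only.
example_shape = (100_000, 120, 35)
for dtype in ("float64", "float32", "uint8"):
    print(f"{dtype}: {dense_gib(example_shape, dtype):.2f} GiB")
```

If an estimate like this approaches the available RAM, storing a one-hot array in a one-byte dtype or avoiding intermediate copies are the usual first things to try; whether they apply here depends on how the notebook uses the data.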

GI Application example

Hi,

Thanks for providing the Sciplex example. Can you also please provide an example for the genetic interactions application?

Originally posted by @Naghipourfar in #1 (comment)
