pinellolab / crispr-bean

Activity-normalized variant effect size estimation from pooled CRISPR screens

Home Page: https://pinellolab.github.io/crispr-bean/

License: GNU Affero General Public License v3.0

Python 55.13% Cython 1.62% Jupyter Notebook 43.25%
base-editing crispr screen count-data-modeling ngs-sequencing-data bayesian-network probabilistic-programming-language

crispr-bean's Introduction

crispr-bean


bean improves CRISPR pooled screen analysis by 1) unconfounding variable per-guide editing outcomes, using the genotypic outcome observed in the reporter sequence, and 2) accurately modeling the screen procedure.
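
To make the first point concrete, here is a toy sketch of why variable per-guide editing confounds a naive analysis. It assumes a deliberately simplified linear relationship (observed guide-level shift = editing rate × variant effect); that relationship and the names `activity_normalized_effect`, `observed_shift`, and `editing_rate` are assumptions made for this illustration, not bean's actual model or API.

```python
# Toy illustration only: assumes observed shift = editing rate * true effect.
# This linear relationship is an assumption made for the example,
# not the statistical model that bean implements.


def activity_normalized_effect(observed_shift: float, editing_rate: float) -> float:
    """Rescale a guide-level phenotypic shift by the guide's editing rate."""
    if not 0.0 < editing_rate <= 1.0:
        raise ValueError("editing_rate must be in (0, 1]")
    return observed_shift / editing_rate


# Two hypothetical guides targeting the same variant (true effect 1.0),
# one editing efficiently and one editing poorly.
guides = {
    "guide_A": {"observed_shift": 0.8, "editing_rate": 0.8},
    "guide_B": {"observed_shift": 0.2, "editing_rate": 0.2},
}

normalized = {
    name: activity_normalized_effect(g["observed_shift"], g["editing_rate"])
    for name, g in guides.items()
}

for name, value in normalized.items():
    raw = guides[name]["observed_shift"]
    print(f"{name}: raw={raw:.2f} normalized={value:.2f}")
```

In this sketch the raw shifts differ fourfold between the two guides, but they agree once each is scaled by its editing rate, which in a real screen would be measured from the reporter.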


Overview

bean supports end-to-end analysis of pooled sorting screens, with or without reporter.


bean provides the following subcommands; see the documentation for full details of each.

  1. count, count-samples: Base-editing-aware mapping of guides, optionally with reporter, from .fastq files.
    • create-screen creates a minimal ReporterScreen object from a flat gRNA count file. Note that allele counts are not included this way, so many functionalities involving allele and edit counts are not supported.
  2. profile: Profile the editing preferences of your editor.
  3. qc: Quality control report and filtering out / masking of aberrant samples and guides.
  4. filter: Filter reporter alleles; essential for tiling mode, which allows for all alleles generated from a gRNA.
  5. run: Quantify targeted variants' effect sizes from screen data.
  • Screen data is saved as a ReporterScreen object in the pipeline. BEAN stores mapped gRNA and allele counts in a ReporterScreen object, which is compatible with AnnData.

Installation

First install PyTorch. Then download from PyPI:

pip install crispr-bean

For the latest version of bean (and for the test files in tests/data), install from GitHub:

git clone https://github.com/pinellolab/crispr-bean.git
cd crispr-bean
pip install -e .

Documentation

See the documentation for tutorials and API references.

Tutorials

| Library design | Selection | Reporter | Tutorial link |
| --- | --- | --- | --- |
| GWAS variant library | FACS sorting | Yes/No | GWAS variant screen |
| Coding sequence tiling library | FACS sorting | Yes/No | Coding sequence tiling screen |
| GWAS variant library | Survival / Proliferation | Yes/No | GWAS variant screen |
| Coding sequence tiling library | Survival / Proliferation | Yes/No | Coding sequence tiling screen |
| Perturbation library without reporter | FACS sorting | No | No reporter screen |

Library design: variant or tiling?

The bean filter and bean run steps depend on the type of gRNA library design. BEAN supports two modes:

  1. variant library: Several gRNAs tile each of the targeted variants. Only the editing rate of the target variant is considered, and bystander effects are ignored.

    • ➕ Increases power for your target variant, as the signal is not distributed across likely no-effect bystanders.
    • ➖ Ignores potential bystander effects.
    • ✔️ Suitable for noncoding GWAS variant screens.
  2. tiling library: gRNAs densely tile a long region (e.g., gene(s), exon(s), coding sequence(s)). Bystander edits are considered in order to obtain alleles with significant fractions. Edited alleles can be "translated" to output coding variants.

    • ➕ Considers bystander effects.
    • ➖ If the library results in alleles that are not diverse enough across gRNAs, the signal will likely be diluted across all variants in that allele (e.g., allele "GGGGG" with a single gRNA score will distribute the score across 5 G's).
    • ✔️ Suitable for coding variant screens with tiling design.
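
The dilution caveat for tiling libraries can be illustrated with a toy calculation. The sketch below splits each allele's score equally across the positions edited in that allele; the equal split, the helper name `naive_per_variant_scores`, and the example scores are all assumptions made for illustration, not how bean assigns effect sizes.

```python
# Toy illustration only: splits one allele-level score equally across the
# edited positions it contains. The equal split is an assumption made for
# this example, not how bean assigns effect sizes.
from collections import defaultdict


def naive_per_variant_scores(allele_scores):
    """Average the equal-share contributions for each edited position.

    allele_scores maps a tuple of edited positions to that allele's score.
    """
    shares = defaultdict(list)
    for positions, score in allele_scores.items():
        for pos in positions:
            shares[pos].append(score / len(positions))
    return {pos: sum(vals) / len(vals) for pos, vals in shares.items()}


# One gRNA, one allele with all five G's edited: the positions are
# inseparable, so every position receives the same diluted share.
single_allele = {(1, 2, 3, 4, 5): 5.0}
print(naive_per_variant_scores(single_allele))

# Diverse alleles from overlapping gRNAs let position 3 stand out.
diverse_alleles = {(1, 2): 0.0, (3,): 5.0, (4, 5): 0.0}
print(naive_per_variant_scores(diverse_alleles))
```

With only the single "GGGGG"-style allele, no position can be distinguished from the others; once the alleles are diverse, the causal position separates from its neighbors.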

Using BEAN as Python module

import bean as be
cdata = be.read_h5ad("bean_counts_sample.h5ad")

The bean Python package supports a range of data-wrangling operations on ReporterScreen objects. See the ReporterScreen API tutorial for more detail.

Run time

  • Installation takes 14.4 min (after PyTorch is installed) on a Dell XPS 13 running Ubuntu under WSL.
  • bean run takes 4.6 min with the --scale-by-acc flag on a Dell XPS 13 running Ubuntu under WSL, for a variant screen dataset with 3455 guides and 6 replicates with 4 sorting bins.
  • The full pipeline takes 90.1 s in GitHub Actions for a toy dataset of 2 replicates and 30 guides.

Contributing

See CHANGELOG for recent updates. If you have questions or feature requests, please open an issue. Pull requests are welcome.

Citation

If you have used BEAN for your analysis, please cite:
Ryu, J., Barkal, S., Yu, T. et al. Joint genotypic and phenotypic outcome modeling improves base editing variant effect quantification. Nat Genet (2024). https://doi.org/10.1038/s41588-024-01726-6

crispr-bean's People

Contributors

jykr, lucapinello

Forkers

aquamono

crispr-bean's Issues

Standardize input column names

  • Column names for the input files must be specified and kept consistent (e.g., for allele filtering).
  • Chromosome and multiple exons could be specified
  • Write tests
  • #29
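
One way to approach the first item is to check each input file's header against a fixed list of required names before any processing starts. The sketch below uses only the standard library; the column names in `REQUIRED_COLUMNS` and the helper `missing_columns` are hypothetical and chosen only for this example, not taken from bean's input specification.

```python
import csv
import io

# Hypothetical column names, chosen only for this example.
REQUIRED_COLUMNS = ("name", "sequence", "chrom")


def missing_columns(csv_text, required=REQUIRED_COLUMNS):
    """Return the required column names absent from the CSV header, in order."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    present = {col.strip() for col in header}
    return [col for col in required if col not in present]


# A small in-memory file whose header lacks the "chrom" column.
example = "name,sequence\ng1,ACGTACGT\n"
print(missing_columns(example))
```

Returning the list of missing names, rather than a bare true/false, lets the caller report exactly which columns need to be added or renamed.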

sharedarray installation error

Report by Lukas Edward Dow

Using cached [email protected]_2_17_x86_64.manylinux2014_x86_64.whl (385 kB)
Using cached numba-0.59.1-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (3.7 MB)
Using cached nvidia_nvjitlink_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl (21.1 MB)
Using cached [email protected]-none-any.whl (11 kB)
Using cached [email protected] (38 kB)
Using cached toposort-1.10-py3-none-any.whl (8.5 kB)
Using cached [email protected]-none-any.whl (34 kB)
Building wheels for collected packages: sharedarray
Building wheel for sharedarray (pyproject.toml) ... error
error: subprocess-exited-with-error

Building wheel for sharedarray (pyproject.toml) did not run successfully.
exit code: 1

running bdist_wheel
running build
running build_ext
building 'SharedArray' extension
creating build
creating build/temp.linux-x86_64-cpython-39
creating build/temp.linux-x86_64-cpython-39/src
gcc -pthread -B /home/1ud2005/.conda/envs/crispr-bean/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -O2 -Wall -fPIC -O2 -isystem /home/1ud2005/.conda/envs/crispr-bean/include -I/home/1ud2005/.conda/envs/crispr-bean/include -fPIC -O2 -isystem /home/1ud2005/.conda/envs/crispr-bean/include -fPIC -I/scratch/1ud2005_11425453/pip-build-env-i7eammi4/overlay/lib/python3.9/site-packages/numpy/core/include -I/home/1ud2005/.conda/envs/crispr-bean/include/python3.9 -c ./src/map_owner.c -o build/temp.linux-x86_64-cpython-39/./src/map_owner.o
gcc -pthread -B /home/1ud2005/.conda/envs/crispr-bean/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -O2 -Wall -fPIC -O2 -isystem /home/1ud2005/.conda/envs/crispr-bean/include -I/home/1ud2005/.conda/envs/crispr-bean/include -fPIC -O2 -isystem /home/1ud2005/.conda/envs/crispr-bean/include -fPIC -I/scratch/1ud2005_11425453/pip-build-env-i7eammi4/overlay/lib/python3.9/site-packages/numpy/core/include -I/home/1ud2005/.conda/envs/crispr-bean/include/python3.9 -c ./src/map_owner_mlock.c -o build/temp.linux-x86_64-cpython-39/./src/map_owner_mlock.o
gcc -pthread -B /home/1ud2005/.conda/envs/crispr-bean/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -O2 -Wall -fPIC -O2 -isystem /home/1ud2005/.conda/envs/crispr-bean/include -I/home/1ud2005/.conda/envs/crispr-bean/include -fPIC -O2 -isystem /home/1ud2005/.conda/envs/crispr-bean/include -fPIC -I/scratch/1ud2005_11425453/pip-build-env-i7eammi4/overlay/lib/python3.9/site-packages/numpy/core/include -I/home/1ud2005/.conda/envs/crispr-bean/include/python3.9 -c ./src/map_owner_msync.c -o build/temp.linux-x86_64-cpython-39/./src/map_owner_msync.o
gcc -pthread -B /home/1ud2005/.conda/envs/crispr-bean/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -O2 -Wall -fPIC -O2 -isystem /home/1ud2005/.conda/envs/crispr-bean/include -I/home/1ud2005/.conda/envs/crispr-bean/include -fPIC -O2 -isystem /home/1ud2005/.conda/envs/crispr-bean/include -fPIC -I/scratch/1ud2005_11425453/pip-build-env-i7eammi4/overlay/lib/python3.9/site-packages/numpy/core/include -I/home/1ud2005/.conda/envs/crispr-bean/include/python3.9 -c ./src/map_owner_munlock.c -o build/temp.linux-x86_64-cpython-39/./src/map_owner_munlock.o
gcc -pthread -B /home/1ud2005/.conda/envs/crispr-bean/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -O2 -Wall -fPIC -O2 -isystem /home/1ud2005/.conda/envs/crispr-bean/include -I/home/1ud2005/.conda/envs/crispr-bean/include -fPIC -O2 -isystem /home/1ud2005/.conda/envs/crispr-bean/include -fPIC -I/scratch/1ud2005_11425453/pip-build-env-i7eammi4/overlay/lib/python3.9/site-packages/numpy/core/include -I/home/1ud2005/.conda/envs/crispr-bean/include/python3.9 -c ./src/shared_array.c -o build/temp.linux-x86_64-cpython-39/./src/shared_array.o
gcc -pthread -B /home/1ud2005/.conda/envs/crispr-bean/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -O2 -Wall -fPIC -O2 -isystem /home/1ud2005/.conda/envs/crispr-bean/include -I/home/1ud2005/.conda/envs/crispr-bean/include -fPIC -O2 -isystem /home/1ud2005/.conda/envs/crispr-bean/include -fPIC -I/scratch/1ud2005_11425453/pip-build-env-i7eammi4/overlay/lib/python3.9/site-packages/numpy/core/include -I/home/1ud2005/.conda/envs/crispr-bean/include/python3.9 -c ./src/shared_array_attach.c -o build/temp.linux-x86_64-cpython-39/./src/shared_array_attach.o
./src/shared_array_attach.c: In function ‘do_attach’:
./src/shared_array_attach.c:99:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < meta->ndims; i++)
  ^
./src/shared_array_attach.c:99:2: note: use option -std=c99 or -std=gnu99 to compile your code
error: command '/usr/bin/gcc' failed with exit code 1

note: This error originates from a subprocess, and is likely not a problem with pip.
Failed to build sharedarray

This error comes from the C standard that the compiler uses by default; passing -std=c99 or -std=gnu99 resolves the issue:

CFLAGS='-std=c99' python -m pip install .
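
The same workaround can be applied from a Python script, for example as part of an automated setup step. This is a minimal sketch: the helper names `env_with_c99` and `install_current_directory` are hypothetical, and it assumes that pip is available for the current interpreter and that the script is run from the same directory as the command above.

```python
import os
import subprocess
import sys


def env_with_c99(base_env):
    """Return a copy of base_env with -std=c99 appended to CFLAGS.

    An existing -std=... flag is left untouched so a user's choice
    (for example -std=gnu99) is not overridden.
    """
    env = dict(base_env)
    flags = env.get("CFLAGS", "").split()
    if not any(flag.startswith("-std=") for flag in flags):
        flags.append("-std=c99")
    env["CFLAGS"] = " ".join(flags)
    return env


def install_current_directory():
    """Run `python -m pip install .` with the C99 flag set in CFLAGS."""
    subprocess.run(
        [sys.executable, "-m", "pip", "install", "."],
        env=env_with_c99(os.environ),
        check=True,
    )


# Show the resulting CFLAGS for an environment that already sets -O2.
print(env_with_c99({"CFLAGS": "-O2"})["CFLAGS"])
```

Calling `install_current_directory()` performs the actual installation; it is kept separate from `env_with_c99` so that the environment handling can be checked without running pip.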

Installation error with pyx

Copying error report from Elijah Login Mena @bwh

Just want to let you know I had a minor issue installing bean on my computer yesterday. What worked for me was adding the following two lines to the beginning of the CRISPResso2Align.pyx file:

from numpy cimport import_array
import_array()

Otherwise I got the following error message:

>>> import bean
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/emena/Dropbox .../crispr-bean/bean/__init__.py", line 3, in <module>
    from . import mapping as mp
  File "/Users/emena/Dropbox .../crispr-bean/bean/mapping/__init__.py", line 1, in <module>
    from .GuideEditCounter import GuideEditCounter
  File "/Users/user/Dropbox .../crispr-bean/bean/mapping/GuideEditCounter.py", line 16, in <module>
    from ._supporting_fn import (
  File "/Users/user/Dropbox .../crispr-bean/bean/mapping/_supporting_fn.py", line 9, in <module>
    from bean.mapping.CRISPResso2Align import read_matrix, global_align_base_editor
  File "bean/mapping/CRISPResso2Align.pyx", line 1, in init bean.mapping.CRISPResso2Align
    # Copied & modified from CRISPResso2 https://github.com/pinellolab/CRISPResso2/blob/master/CRISPResso2/CRISPResso2Align.pyx
ImportError: numpy.core.multiarray failed to import (auto-generated because you didn't call 'numpy.import_array()' after cimporting numpy; use '<void>numpy._import_array' to disable if you are certain you don't need it).

Help with running the examples?

Hi,

Congrats on the release of this package. It seems that this tool will be quite useful for the CRISPR community. I especially like the functionality for accounting for heterogeneity in editing efficiency across gRNAs.

I am trying to work through the tutorials but am having a bit of difficulty. In particular, the variant sorting screen tutorial uses an example "test file", but I am unable to locate this test file. Is the idea that I should download the test directory from here and then update the file paths on my machine?

Thanks,
Tim
