Giter VIP home page Giter VIP logo

pyclonal's Introduction

Jupyter Notebooks to analyze T-cell Receptor Sequencing

Goal

  • Provide an interactive set of Jupyter notebooks for easily visualizing and analyzing TCR sequencing data using existing tools and methods.

What is it?

  • A program that inputs diverse file types from alignment softwares (ie. MIXCR, Changeo, and more) along with associated metadata for those files, and allows an analysis workflow.

Use cases (links to Jupyter notebooks)

  • Overlap
  • Diversity
  • [Public Database](In development...)
  • [Single cell](In development...)

Background

  • T cells are immune cells that recognize their targets through the T-cell receptor (TCR) - a complex of highly variable cell-surface proteins. Analyzing the TCR repertoire in humans or mouse models can help us understand the development of the immune system and progression of disease.

  • There are a growing pool of biologists and clinicians that want to be able to analyze the mass amounts of data they are collecting, or that others have collected and published. These notebooks will provide a tool for this community to use and interact with T-cell receptor sequencing data.

  • There has been a lot of development of methods for analyzing T-cell receptor data, a lot of which borrows from the field of ecology and associated diversity analyses. However, the tools developed for these analyses are all in different locations and not easy to access! We are solving that problem here.

How to use

Installation

Dependencies: pandas, jupyter, scipy, seaborn

For now the recommended way to install is using pipenv. If you use Anaconda python distribution make sure to upgrade it to the latest version.

  • install pipenv (installation instructions)

  • clone PyClonal repo:

      $ git clone https://github.com/NCBI-Hackathons/PyClonal.git
    
  • create virtual environment inside PyClonal directory

      $ cd PyClonal
      $ pipenv --three
    
  • activate virtualenv and install PyClonal (this will install all necessary dependencies)

      $ pipenv shell
      $ pip install -e .
    
  • open jupyter notebook within that environment

       $ jupyter notebook
    
  • once the jupyter notebook browser launches as below, notebooks are in the jupyter_notebooks folder

  • to exit the environment type exit

  • to reopen the environment after it has been downloaded once

      $ pipenv shell
    

Alternate installation methods

Older versions of Anaconda have issues running pipenv. There are a few alternatives if you run into installation issues:

  • Update your version of Anaconda, and rerun the commands

  • Use a conda environment. In the PyClonal directory:

      $ conda create --name env python=3
      $ source activate env
      $ pip install -e .
      $ jupyter notebook
    
  • Easiest but least recommended method:

      $ pip install -e . --user
      $ jupyter notebook
    

Usage

To use from the command line, run pcl.py script:

    $./pcl.py -h
    usage: pcl.py [-h] [-p PATTERN] [-f [FORMAT [FORMAT ...]]] [-n FORMAT_NAME]
          [-c [FORMAT_COLS [FORMAT_COLS ...]]] [-o OUTPUT_FILE]
          dir

    A Jupyter notebook based framework to analyze T-cell receptor sequencing data.
    Provide an interactive set of Jupyter notebooks for easily visualizing and
    analyzing TCR sequencing data using existing tools and methods.

    positional arguments:
    dir                   directory with data files

    optional arguments:
    -h, --help            show this help message and exit
    -p PATTERN, --pattern PATTERN
                            filename patterd (*.tsv)
    -f [FORMAT [FORMAT ...]], --format [FORMAT [FORMAT ...]]
                            custom format: names of columns to extract
    -n FORMAT_NAME, --format_name FORMAT_NAME
                            custom format name
    -c [FORMAT_COLS [FORMAT_COLS ...]], --format_cols [FORMAT_COLS [FORMAT_COLS ...]]
                            column to detect format
    -o OUTPUT_FILE, --output_file OUTPUT_FILE
                            output files basename

For usage example in jupyter notebook see example notebook data input.ipynb in jupyter_notebooks directory.

Resources and Existing TCR tools to gather from:

-VDJdb -https://vdjdb.cdr3.net/

-VDJ tools -https://vdjtools-doc.readthedocs.io/en/master/

-VDJviz: a versatile immune repertoire browser -https://vdjviz.cdr3.net/

-tcR -https://cran.r-project.org/web/packages/tcR/vignettes/tcrvignette.html

-miXCR -https://mixcr.readthedocs.io/en/master/

-powerTCR -https://www.biorxiv.org/content/early/2018/04/07/297119

-ImmuneDB -http://immunedb.com/

-TraCeR -https://github.com/teichlab/tracer

pyclonal's People

Contributors

britm92 avatar dcgenomics avatar eco32i avatar fcve avatar kmod avatar michellemiron avatar

Watchers

 avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.