OMrnaseq: an RNA-seq pipeline based on luigi

OMrnaseq, developed at OnMath, is designed for analyzing RNA-seq data. It includes four major modules: qc (fastqc), mapping, quant and enrich.

  • qc: uses FastQC to examine basic information (data size, GC content, Q30) about the fastq files.
  • mapping: uses STAR to map reads to the genome.
  • quant: uses kallisto to quantify abundances and edgeR to perform differential expression analysis.
  • enrich: uses goseq and KOBAS to analyze the enrichment of differentially expressed genes in GO terms and KEGG pathways.

virtualenv

OMrnaseq is under development, so it is better to install it in a virtualenv, where you can easily keep up with updates. virtualenvwrapper is a set of extensions to the virtualenv tool that makes managing virtualenvs convenient.

# install virtualenvwrapper
pip install virtualenvwrapper

# configure your bash profile
# add the two commands below to your ~/.bash_profile
# (adjust the second path if virtualenvwrapper.sh is installed elsewhere)
export WORKON_HOME=/your/virtualenv/path
source /usr/bin/virtualenvwrapper.sh

# create the OMrnaseq virtualenv
mkvirtualenv OMrnaseq

# enter the OMrnaseq virtualenv
workon OMrnaseq

Dependencies

Before installing OMrnaseq, you need to install omplotr and rnaReport. omplotr is an R package that OMrnaseq needs to generate plots during analysis. rnaReport is needed for OMrnaseq to generate a report.

Installation

Now that you are in the OMrnaseq virtualenv, you can download the OMrnaseq source code from GitHub and install it in the environment.

# download
git clone https://github.com/bioShaun/OMrnaseq.git

# install
cd OMrnaseq
pip install -e .

Usage

qc

mrna \
    -p /path/of/analysis \
    -s /path/of/sample_inf \
    -f /path/of/fastqs \
    -w parallels_number \
    fastqc
  • sample_inf: tab-delimited text file indicating biological replicate relationships; see example.
  • fastqs: fastq files named in the format sample_1.clean.fq.gz, sample_2.clean.fq.gz.
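
The fastq naming convention above can be checked before launching the pipeline. The following is a minimal sketch and not part of OMrnaseq: the function name `group_fastqs` is hypothetical, and it assumes that `_1` and `_2` mark the two mates of a paired-end sample.

```python
import re
from collections import defaultdict

# Matches names such as sample_1.clean.fq.gz / sample_2.clean.fq.gz.
FASTQ_PATTERN = re.compile(r"^(?P<sample>.+)_(?P<mate>[12])\.clean\.fq\.gz$")


def group_fastqs(file_names):
    """Group fastq file names by sample and report samples missing a mate.

    Hypothetical helper: assumes _1/_2 are the two mates of a paired-end sample.
    """
    mates = defaultdict(set)
    for name in file_names:
        match = FASTQ_PATTERN.match(name)
        if match:  # files that do not follow the convention are skipped
            mates[match.group("sample")].add(match.group("mate"))
    complete = sorted(s for s, m in mates.items() if m == {"1", "2"})
    incomplete = sorted(s for s, m in mates.items() if m != {"1", "2"})
    return complete, incomplete
```

It could be called as `group_fastqs(os.listdir("/path/of/fastqs"))`; any sample in the second returned list has only one of its two files present.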

mapping

mrna \
    -p /path/of/analysis \
    -s /path/of/sample_inf \
    -f /path/of/fastqs \
    -w parallels_number \
    --star_index /path/to/star/index \
    mapping
	

quant

mrna \
    -p /path/of/analysis \
    -s /path/of/sample_inf \
    -f /path/of/fastqs \
    -w parallels_number \
    --gene2tr /gene/transcript/map/file \
    --kallisto_idx /path/to/kallisto/index \
    quant
  • gene2tr: file containing one 'gene(tab)transcript' identifier pair per line; see example.
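
To illustrate the gene2tr layout, here is a small sketch that reads such a file into a mapping and rejects malformed lines. It is not OMrnaseq code; `read_gene2tr` is a hypothetical name, and the sketch assumes exactly two tab-separated fields per line with no header row.

```python
from collections import defaultdict


def read_gene2tr(lines):
    """Parse 'gene<TAB>transcript' lines into {gene: [transcripts]}.

    Hypothetical helper: assumes two tab-separated fields and no header.
    """
    gene2tr = defaultdict(list)
    for line_number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line:
            continue  # tolerate blank lines
        fields = line.split("\t")
        if len(fields) != 2:
            raise ValueError(
                f"line {line_number}: expected 2 tab-separated fields, "
                f"got {len(fields)}"
            )
        gene, transcript = fields
        gene2tr[gene].append(transcript)
    return dict(gene2tr)
```

Passing an open file handle works as well, for example `read_gene2tr(open("/gene/transcript/map/file"))`.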

enrich

mrna \
    -p /path/of/analysis \
    -w parallels_number \
    -n result_name \
    --go /go/annotation/file \
    --gene_length /gene/length/file \
    --kegg_blast /gene/blast/to/kegg/pep/tab/outfile \
    --kegg_abbr species_kegg_abbr \
    --kegg_background species_kegg_background_abbr \
    --gene_list_file /file/of/gene/list/path \
    enrich
  • go: file containing 'gene(tab)go_ids' per line; go_ids are separated by ","; see example.
  • gene_length: file containing 'gene(tab)gene_length' per line; see example.
  • kegg_blast: BLAST result of the genes against the KOBAS peptide sequences; see example.
  • kegg_background: KEGG abbreviation of the background species; defaults to kegg_abbr.
  • gene_list_file: file containing the gene list file paths; see example.
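
The go and gene_length formats described above can be sanity-checked together, since goseq uses gene lengths when testing for enrichment. The sketch below is illustrative only: all three function names are hypothetical, and it assumes that neither file has a header row and that lengths are numeric.

```python
def read_go_annotation(lines):
    """Parse 'gene<TAB>go_ids' lines; go_ids are separated by ','."""
    annotation = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        gene, go_ids = line.split("\t", 1)
        annotation[gene] = [g.strip() for g in go_ids.split(",") if g.strip()]
    return annotation


def read_gene_length(lines):
    """Parse 'gene<TAB>gene_length' lines into {gene: length}."""
    lengths = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        gene, length = line.split("\t", 1)
        lengths[gene] = float(length)  # assumption: lengths are numeric
    return lengths


def genes_missing_length(annotation, lengths):
    """Return annotated genes that have no entry in the length table."""
    return sorted(set(annotation) - set(lengths))
```

An empty result from `genes_missing_length` means every annotated gene also has a length entry.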

rnaseq

rnaseq is a collection of three modules: qc, quant and enrich, so you can run all three in one command.

# run rnaseq
mrna \
    -p /path/of/analysis \
    -s /path/of/sample_inf \
    -f /path/of/fastqs \
    -w parallels_number \
    --gene2tr /gene/transcript/map/file \
    --kallisto_idx /path/to/kallisto/index \
    --go /go/annotation/file \
    --gene_length /gene/length/file \
    --kegg_blast /gene/blast/to/kegg/pep/tab/outfile \
    --kegg_abbr species_kegg_abbr \
    rnaseq

# you can also combine modules yourself;
# replace ... with the parameters required by each module
mrna \
    ... \
    module1 \
    module2
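
When a combined command is launched from a script, it can help to assemble the argument list programmatically. The sketch below only builds and prints the command; it does not run `mrna`. The helper `build_mrna_command` is hypothetical, the value `4` for `-w` is an arbitrary example, and the sketch assumes that modules share one set of options, as in the rnaseq example above.

```python
import shlex


def build_mrna_command(options, modules):
    """Return the argv list for one `mrna` call that runs several modules.

    options: mapping of flag -> value, e.g. {"-p": "/path/of/analysis"}
    modules: module names appended after the options, in order
    """
    if not modules:
        raise ValueError("at least one module is required")
    argv = ["mrna"]
    for flag, value in options.items():
        argv.extend([flag, str(value)])
    argv.extend(modules)
    return argv


command = build_mrna_command(
    {
        "-p": "/path/of/analysis",
        "-s": "/path/of/sample_inf",
        "-f": "/path/of/fastqs",
        "-w": 4,  # example value for parallels_number
        "--star_index": "/path/to/star/index",
    },
    ["fastqc", "mapping"],
)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` once the placeholder paths are replaced with real ones.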
