Cistrome ChiLin

ChiLin is a Python package that provides an all-in-one solution for processing ChIP-seq and DNase-seq data.

Quick Start

Check that gcc, g++, java, R, and python-dev are installed (see http://cistrome.org/chilin/Installation.html#dependent-software-list).

First, clone:

git clone https://github.com/cfce/chilin && cd chilin

Then install:

python setup.py clean && python setup.py install -f

Activate the virtual environment and run ChiLin:

source chilin_env/bin/activate
chilin -h

Fetch the hg19 reference data and test on the demo data. Run the following from the root of the chilin code:

# change to default directory
mkdir -p db

cd db

# all hg19 reference data
wget -c http://cistrome.org/chilin/_downloads/hg19.tgz
wget -c http://cistrome.org/chilin/_downloads/hg19.tgz.md5 ## check md5
md5sum -c hg19.tgz.md5
tar xvfz hg19.tgz
# download mycoplasma for judgement of contamination in your samples
wget -c http://cistrome.org/chilin/_downloads/mycoplasma.tgz
wget -c http://cistrome.org/chilin/_downloads/mycoplasma.tgz.md5
md5sum -c mycoplasma.tgz.md5
tar xvfz mycoplasma.tgz

# check all installation
cd .. && python setup.py -l
cd demo && bash foxa1
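
The checksum steps above can also be scripted. The following is a minimal sketch, not part of ChiLin: the helper names `md5_of` and `verify_md5` are hypothetical, and it assumes the `.md5` files use the usual `md5sum` layout (hex digest first, then the file name), which is what `md5sum -c` expects.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(archive, checksum_file):
    """Compare an archive against the first field of an md5sum-style file."""
    # Assumption: the first whitespace-separated field is the expected digest.
    expected = Path(checksum_file).read_text().split()[0].lower()
    return md5_of(archive) == expected
```

For example, `verify_md5("hg19.tgz", "hg19.tgz.md5")` would return `True` only if the downloaded archive matches the published digest.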

Usage

Demo data command is as follows:

   chilin  simple -p narrow -t foxa1_t1.fastq  -c foxa1_c1.fastq -i local -o local -s hg19  --skip 10,12 --dont_remove

See the --skip option for details.

The simple mode is the main and easiest way to run ChiLin on single-end data with the default bwa mapper. For single-end data, use commas to separate sample replicates for the IP and input ChIP-seq samples:

  chilin  simple -u your_name -s your_species --threads 8 -i id -o output -t treat1.fastq,treat2.fastq -c control1.fastq,control2.fastq  -p narrow -r tf

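For paired-end data, use semicolons to separate sample replicates and commas to separate the two files of each pair. Do not forget to wrap the sample file paths in quotes ("), since the shell would otherwise treat a semicolon as a command separator: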

  chilin simple --threads 8 -i H3K27me3_PairEnd -o H3K27me3_PairEnd -u you -s mm9 -t "GSM905438.fastq_R1.gz,GSM905438.fastq_R2.gz" -c "GSM905434.fastq_R1.gz,GSM905434.fastq_R2.gz;GSM905436.fastq_R1.gz,GSM905436.fastq_R2.gz" -p both --pe
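
The separator rules above are easy to get wrong by hand, so here is a minimal sketch of building the `-t` and `-c` values programmatically. The helpers `format_single_end` and `format_paired_end` are hypothetical and not part of ChiLin; the flags and file names are copied from the commands above. The command is only assembled and printed, not executed.

```python
import shlex


def format_single_end(replicates):
    """Join single-end replicate files with commas."""
    return ",".join(replicates)


def format_paired_end(replicates):
    """Join each (R1, R2) pair with a comma and separate replicates with semicolons."""
    return ";".join(f"{r1},{r2}" for r1, r2 in replicates)


treat = format_paired_end([
    ("GSM905438.fastq_R1.gz", "GSM905438.fastq_R2.gz"),
])
control = format_paired_end([
    ("GSM905434.fastq_R1.gz", "GSM905434.fastq_R2.gz"),
    ("GSM905436.fastq_R1.gz", "GSM905436.fastq_R2.gz"),
])

# Argument list mirroring the paired-end example; passing it as a list
# (for instance to subprocess.run) avoids shell quoting issues with ';'.
command = [
    "chilin", "simple", "--threads", "8",
    "-i", "H3K27me3_PairEnd", "-o", "H3K27me3_PairEnd",
    "-u", "you", "-s", "mm9",
    "-t", treat, "-c", control,
    "-p", "both", "--pe",
]

# shlex.join adds the quoting needed to paste the command into a shell.
print(shlex.join(command))
```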

Currently, only bwa supports paired-end processing. bwa accepts both fastq.gz and fastq files, while bowtie accepts only fastq files. The pipeline should use the genome index of the corresponding aligner, as configured in the configuration files.
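
The aligner constraints above can be checked before launching a long run. This is a rough sketch under stated assumptions, not ChiLin's own validation: `check_inputs` is a hypothetical helper, and it guesses that a file is gzip-compressed purely from a `.gz` suffix.

```python
def check_inputs(aligner, files, paired_end=False):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    if aligner not in ("bwa", "bowtie"):
        return [f"unknown aligner: {aligner}"]
    # Only bwa supports paired-end processing.
    if paired_end and aligner != "bwa":
        problems.append(f"{aligner} does not support paired-end data; use bwa")
    for name in files:
        # Assumption: compression is inferred from the file name alone.
        if aligner == "bowtie" and name.endswith(".gz"):
            problems.append(f"{name}: bowtie needs an uncompressed fastq file")
    return problems


for issue in check_inputs("bowtie", ["treat1.fastq.gz"], paired_end=True):
    print(issue)
```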

Update the configuration

If you modify the code or update any part of the configuration file chilin.conf.filled, such as a different aligner's genome index or the union DHS BED file, you only need to reinstall the package itself:

source chilin_env/bin/activate && python setup.py install 

Uninstall

python setup.py clean
deactivate

Troubleshooting

If a dependency raises an error, try upgrading that software. Warnings generated in the pdflatex step can be ignored. There is one known issue with the mm9 chrom-info on CentOS, so running ChiLin on Ubuntu is recommended. If a sys_platform error occurs, uninstall the system setuptools and install the latest setuptools manually.

Documentation

Full documentation: http://cistrome.org/chilin

GitHub wiki: https://github.com/cfce/chilin/wiki

Perry changes

The initial version expected each control fq file to be unique. I have instances where the same control file is used for different replicate IPs. This might introduce a problem when the control files are merged, depending on how duplicate reads are handled. In the worst case, more merging is done than required. I've altered the merging steps so that only unique control files are merged.
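
The idea of merging only unique control files can be illustrated with a small sketch. This is not the actual patch: `unique_control_files` is a hypothetical helper, and it assumes two paths refer to the same control when their normalized absolute paths are equal.

```python
import os


def unique_control_files(control_files):
    """Drop repeated control files while keeping the original order."""
    seen = set()
    unique = []
    for path in control_files:
        # Normalize so that "c1.fastq" and "./c1.fastq" count as the same file.
        key = os.path.abspath(path)
        if key not in seen:
            seen.add(key)
            unique.append(path)
    return unique


# The same control reused for two replicate IPs is merged only once.
print(unique_control_files(["control1.fastq", "control1.fastq", "control2.fastq"]))
```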

Contributors

chenfeiwang, qinqian, samesense, taoliu
