scurgen's Introduction

Overview

scurgen (/skur'jən/) is a command line and GUI tool for plotting and exploring genomic data using space-filling curves such as the Hilbert curve.

Dependencies

scurgen depends upon bedtools, pybedtools, numpy, matplotlib and PIL. Except for bedtools, which you must install yourself, all of these dependencies should be installed automatically (if not already present) as part of the installation process (below).
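Since bedtools has to be installed separately, it can be worth checking that it is on your PATH before installing. The following is a minimal sketch using only the Python standard library; the helper name find_missing_tools is hypothetical and is not part of scurgen.

```python
import shutil


def find_missing_tools(tools=("bedtools",)):
    """Return the names of required executables that are not on the PATH."""
    # shutil.which returns None when no matching executable is found.
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing external tools:", ", ".join(missing))
    else:
        print("All external tools found.")
```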

Installation

After cloning or downloading the source code, you will need to run the following:

python setup.py build

python setup.py install

At this point, if there are no errors, you should be able to type scurgen --help on the command line and see the following. If so, you are in business:

scurgen --help
usage: scurgen [-h] {plot,combine} ...

optional arguments:
-h, --help      show this help message and exit

[sub-commands]:
{plot,combine}
plot          plot a single dataset
combine       combine two datasets

The plot module

This module allows you to make a basic plot of genomic data using a Hilbert curve:

scurgen plot  --chrom chr10 \
                        --cmap Greens \
                        --format png \
                        --dim 128 \
                        data/cpg-islands.hg19.chr10.bed

yields a PNG image named cpg-islands.hg19.chr10.bed.png:

https://raw.github.com/arq5x/scurgen/master/cpg-islands.hg19.chr10.bed.png
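To give a feel for what such a plot encodes, here is a minimal sketch of how a position along a chromosome could be mapped to a cell of a dim x dim grid with a Hilbert curve. It uses the textbook distance-to-coordinate conversion and is not necessarily how scurgen implements it; the function names and the linear scaling of base pairs to cells are assumptions made for illustration.

```python
def hilbert_d2xy(dim, d):
    """Map distance d along a Hilbert curve to (x, y) in a dim x dim grid.

    dim must be a power of 2 and 0 <= d < dim * dim.
    """
    x = y = 0
    t = d
    s = 1
    while s < dim:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate/flip the quadrant so consecutive sub-curves join up.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def position_to_cell(pos, chrom_length, dim):
    """Scale a 0-based base-pair position to a curve index, then to a cell.

    Assumption: positions are spread linearly over the dim * dim cells.
    """
    n_cells = dim * dim
    d = min(pos * n_cells // chrom_length, n_cells - 1)
    return hilbert_d2xy(dim, d)


if __name__ == "__main__":
    # Walk a tiny 4 x 4 curve to see the order in which cells are visited.
    print([hilbert_d2xy(4, d) for d in range(16)])
```

Because neighbouring points on the curve land in neighbouring cells, features that are close together on the chromosome tend to appear close together in the image.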

And let's do the same with RefSeq exons:

scurgen plot  --chrom chr10 \
                        --cmap Blues \
                        --format png \
                        --dim 128 \
                        data/refseq.chr10.exons.bed

Voila:

https://raw.github.com/arq5x/scurgen/master/refseq.chr10.exons.bed.png

Now, when one combines the two, one can see the co-localization (or lack thereof) of the two sets in the genome. The following image was merged manually with image software, but I am working on dependable methods for doing this programmatically:

https://raw.github.com/arq5x/scurgen/master/merge.png

By default, we assume the chromosomes are human (build 37 or hg19). However, other genomes are allowed:

scurgen plot  --chrom chr10 \
                        --cmap Greens \
                        --format png \
                        --genome mm9 \
                        data/cpg-islands.hg19.chr10.bed

Support for quantitative data

In the case of quantitative experiments such as ChIP-seq, we often have quantitative values associated with genomic intervals (e.g., "peaks"). The --inc_col option allows one to use a specific column in a text file such as BEDGRAPH to increment the scores associated with each cell in the curve matrix. Here is an example of using the pValue column (column 8) from an ENCODE narrowPeak file (http://genome.ucsc.edu/FAQ/FAQformat#format12):

scurgen plot  --chrom chr10 \
                        --cmap Reds \
                        --format png \
                        --dim 512 \
                        --inc_col 8 \
                        data/wgEncodeSydhTfbsGm12878Nrf1IggmusPk.narrowPeak

yielding:

https://raw.github.com/arq5x/scurgen/master/wgEncodeSydhTfbsGm12878Nrf1IggmusPk.narrowPeak.png
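The idea behind --inc_col can be sketched as follows: each interval adds the value from the chosen column to every cell it overlaps, instead of adding 1. This is an illustrative sketch, not scurgen's actual code. The function name accumulate_scores, the 1-based column numbering, and the linear binning of positions into cells are all assumptions.

```python
def accumulate_scores(lines, chrom, chrom_length, n_cells, inc_col=None):
    """Sum per-interval values into n_cells bins laid out along the curve.

    lines: tab-delimited records whose first three fields are chrom, start, end.
    inc_col: 1-based column holding the value to add (assumption); when it is
             None, every interval simply counts as 1.
    """
    cells = [0.0] * n_cells
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or fields[0] != chrom:
            continue  # skip malformed lines and other chromosomes
        start, end = int(fields[1]), int(fields[2])
        value = float(fields[inc_col - 1]) if inc_col else 1.0
        # Map the half-open interval [start, end) onto cell indices.
        first = start * n_cells // chrom_length
        last = min((end - 1) * n_cells // chrom_length, n_cells - 1)
        for i in range(first, last + 1):
            cells[i] += value
    return cells


if __name__ == "__main__":
    # Two made-up narrowPeak-style records; column 8 is the pValue.
    peaks = [
        "chr10\t0\t50\tpeak1\t0\t.\t5.0\t12.5\t-1\t10",
        "chr10\t75\t100\tpeak2\t0\t.\t3.0\t2.5\t-1\t5",
    ]
    print(accumulate_scores(peaks, "chr10", 100, 4, inc_col=8))
```

The resulting one-dimensional list of cells would then be folded into the two-dimensional image by following the curve.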

Full help for plot module

scurgen plot --help
usage: scurgen plot [-h] [--genome STRING] [--chrom STRING]
                    [--inc_col INC_COL] [--dim INTEGER] [--min_mask INTEGER]
                    [--cmap STRING] [--format STRING] [--dpi INTEGER]
                    file

positional arguments:
  file                The name of the file to be plotted.

optional arguments:
  -h, --help          show this help message and exit
  --genome STRING     The genome the dataset comes from (e.g., hg19)?
  --chrom STRING      The chrom that should be plotted (e.g., chr1)
  --inc_col INC_COL   Use a specific column for incrementing file.
  --dim INTEGER       The dimensions of the curve. A power of 2.
  --min_mask INTEGER  The minimum value allowed in a cell before it is set
                      to the masking color (white).
  --cmap STRING       The name of the matplotlib color map that should be
                      used. See scipy.org/Cookbook/Matplotlib/Show_colormaps
                      for options.
  --format STRING     The type of output figure to create.
  --dpi INTEGER       The resolution (in DPI) of the output.
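Two of these options place constraints on the data that are easy to illustrate: --dim must be a power of 2, and --min_mask hides low-valued cells. The sketch below shows one way these checks could look. The helper names are hypothetical, and treating "below the minimum" as strictly less than min_mask is an assumption, since the help text does not say whether the bound is inclusive.

```python
def is_power_of_two(n):
    """Return True if n is a positive power of 2 (1, 2, 4, 8, ...)."""
    return n > 0 and (n & (n - 1)) == 0


def apply_min_mask(matrix, min_mask):
    """Replace cells below min_mask with None.

    A plotting layer could then draw None cells in the masking color.
    Assumption: cells strictly below min_mask are masked.
    """
    return [[v if v >= min_mask else None for v in row] for row in matrix]


if __name__ == "__main__":
    print(is_power_of_two(128), is_power_of_two(100))
    print(apply_min_mask([[0, 3], [1, 5]], 1))
```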
