compbio

Python libraries and utilities for computational biology.

About

This package contains algorithms related to several areas of genomics, phylogenetics, and population genetics. Some of the highlights include:

  • reading, writing, and manipulating phylogenetic trees
  • reconciling gene-trees with species-trees
  • inferring gene duplications, losses, and horizontal transfers
  • methods for coalescent processes and incomplete lineage sorting
  • methods for ancestral recombination graphs (ARGs)
  • finding syntenic regions (i.e. co-linear orthology) between genomes
  • processing common file formats: FASTA, PHYLIP, newick, nexus, etc.
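
As a rough illustration of the kind of file handling listed above, the following standalone sketch parses FASTA-formatted text into a dictionary. It does not use compbio's own API (which is not shown in this README); the function name `read_fasta` and the sample records are hypothetical and chosen only for this example.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {name: sequence} dict.

    Hypothetical helper for illustration; not part of compbio.
    """
    seqs: Dict[str, str] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: the record name is the text after '>'.
            name = line[1:].strip()
            seqs[name] = ""
        elif name is not None:
            # Sequence line: append to the current record.
            seqs[name] += line
    return seqs


# Made-up sample input with two records, one spanning two lines.
example = """>seq1
ACGT
TTGA
>seq2
GGCC
""".splitlines()

records = read_fasta(example)
print(records)
```

Multi-line sequences are concatenated under their header, so `seq1` above ends up as a single string.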

In addition to computational biology-specific methods, this package also contains general utilities for working with scientific data:

  • sparse matrix file formats
  • reading, writing, and manipulating tables of data
  • working with intervals (e.g. intersection, union)
  • plotting (Gnuplot, Rpy)
  • statistics
  • general data structures and algorithms: quad trees, Union-Find, HMMs, clustering
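
For readers unfamiliar with Union-Find, one of the data structures named above, here is a minimal standalone sketch with path compression and union by size. It is a generic textbook version, not compbio's implementation; the class and method names are assumptions made for this example.

```python
class UnionFind:
    """Disjoint-set structure (illustrative sketch, not compbio's class)."""

    def __init__(self):
        self.parent = {}  # item -> parent item
        self.size = {}    # root item -> number of items in its set

    def find(self, x):
        """Return the representative of the set containing x."""
        if x not in self.parent:
            # First time we see x: it forms its own singleton set.
            self.parent[x] = x
            self.size[x] = 1
            return x
        # Walk up to the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a, b):
        """Merge the sets containing a and b; return the new root."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return ra
        # Union by size: attach the smaller tree under the larger one.
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return ra


uf = UnionFind()
uf.union("a", "b")
uf.union("c", "d")
uf.union("b", "d")
print(uf.find("a") == uf.find("c"))  # same set after the three unions
print(uf.find("a") == uf.find("e"))  # "e" was never merged with anything
```

This kind of structure is commonly used for single-linkage style clustering, where each union merges two clusters.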

Download

The compbio package is available for download from several sources:

Requirements

Most modules in this package can be used without any additional dependencies.

For plotting modules, the dependencies include:

For some scientific methods, the dependencies include:

For development of the compbio package itself, dependencies can be installed with pip:

pip install -r requirements-dev.txt

Install

The compbio package is available on PyPI and can be installed using pip:

pip install compbio

Alternatively, the package can be installed from the source directory using:

python setup.py install

Optionally, the libraries can be used directly from the source directory by setting the following environment variables (assuming a bash shell):

export PATH=$PATH:path/to/compbio/bin
export PYTHONPATH=$PYTHONPATH:path/to/compbio
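
After any of the installation routes above, a quick way to confirm that Python can see the package is to look it up on the import path. The sketch below assumes the top-level module is named `compbio`, matching the package name; that name and the helper `is_importable` are assumptions for this example, not something this README states.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be found on the import path.

    Hypothetical helper; it only locates the module and does not import it.
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        return False


# "compbio" is assumed to be the top-level module name.
if is_importable("compbio"):
    print("compbio found on the import path")
else:
    print("compbio not found; check the pip install or PYTHONPATH")
```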

Author

These libraries were built up over the course of the Ph.D. of the author, Matthew D. Rasmussen (http://mattrasmus.com, [email protected]). Many of the methods here were utilized in several published software projects including:

  • ARGweaver: Rasmussen, Siepel. Genome-wide inference of ancestral recombination graphs. ArXiv. 2013.
  • DLCoal: Rasmussen, Kellis. Unified modeling of gene duplication, loss, and coalescence using a locus tree. Genome Research. 2012.
  • SPIMAP: Rasmussen, Kellis. A Bayesian approach for fast and accurate gene tree reconstruction. Molecular Biology and Evolution. 2010.
  • SPIDIR: Rasmussen, Kellis. Accurate gene-tree reconstruction by learning gene- and species-specific substitution rates across multiple complete genomes. Genome Research. 2007.

Minor note: Although the libraries in this package support each of these software packages, this package is not a required dependency of any of them. Instead, each software package contains its own private copy of the modules taken from this package.
