short-read-tax-assignment's Introduction

TAX CREdiT: TAXonomic ClassifieR Evaluation Tool

A standardized and extensible evaluation framework for taxonomic classifiers

To view static versions of the reports presented in Bokulich, et al., (Microbiome, under review), start here.

Environment

This repository contains Python 3 code and Jupyter notebooks, but some taxonomy assignment methods (e.g., QIIME-1 legacy methods) may require different Python or software versions. Hence, we use parallel conda environments to support comparison of many methods in a single framework.

The first step is to create a conda environment with the necessary dependencies. This requires installing miniconda 3 to manage parallel python environments. After miniconda (or another conda version) is installed, proceed with installing QIIME 2.

An example of how to load different environments to support other methods can be seen in the QIIME-1 taxonomy assignment notebook.
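Before continuing, it can help to confirm that the tools this README relies on are on your PATH. The following is a minimal sketch, not part of the repository: `check_tools` is an illustrative helper name, and the list of tools passed to it is an assumption based on the commands used in this document.

```shell
# check_tools is an illustrative helper, not part of this repository.
# It reports every command that is missing from PATH and returns 0
# only if all of the requested commands were found.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_tools conda git pip` would print one line per missing tool and return a non-zero status if any of them is absent.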

Setup and install

The library code and IPython Notebooks are then installed as follows:

cd $HOME/projects
git clone https://github.com/gregcaporaso/short-read-tax-assignment.git
cd $HOME/projects/short-read-tax-assignment/code
pip install .

To run the unit tests, run:

cd $HOME/projects/short-read-tax-assignment/code
nosetests .

Finally, download and unzip the reference databases:

mkdir -p $HOME/ref_dbs
cd $HOME/ref_dbs/
wget https://unite.ut.ee/sh_files/sh_qiime_release_20.11.2016.zip
wget ftp://greengenes.microbio.me/greengenes_release/gg_13_5/gg_13_8_otus.tar.gz
unzip sh_qiime_release_20.11.2016.zip
tar -xzf gg_13_8_otus.tar.gz
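The unpacking step above can also be wrapped in a small helper that picks the right tool from the file extension. This is only a sketch under stated assumptions: `REF_DIR` and `unpack_archive` are illustrative names that do not appear in the repository, and the `.zip` branch assumes `unzip` is installed.

```shell
# REF_DIR and unpack_archive are illustrative names, not part of this repository.
REF_DIR="${REF_DIR:-$HOME/ref_dbs}"

# Unpack an archive into a destination directory, creating it if needed.
# The extraction tool is chosen from the file extension.
unpack_archive() {
    archive="$1"
    dest="$2"
    mkdir -p "$dest" || return 1
    case "$archive" in
        *.tar.gz|*.tgz) tar -xzf "$archive" -C "$dest" ;;
        *.zip)          unzip -q -o "$archive" -d "$dest" ;;
        *)              echo "unsupported archive: $archive" >&2; return 1 ;;
    esac
}
```

For example, `unpack_archive gg_13_8_otus.tar.gz "$REF_DIR"` would extract the Greengenes archive into the reference database directory.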

Equipment

The analyses included here can all be run on a standard, modern laptop, provided you don't mind waiting a few hours on the most memory-intensive step (taxonomy classification of millions of sequences). All analyses presented in tax-credit were run in a single afternoon using a MacBook Pro with the following specifications:

OS: OS X 10.11.6 "El Capitan"
Processor: 2.3 GHz Intel Core i7
Memory: 8 GB 1600 MHz DDR3

Using the Jupyter Notebooks included in this repository

To view and interact with the notebooks, change into the short-read-tax-assignment/ipynb directory and launch Jupyter Notebook from the terminal with the command:

jupyter notebook index.ipynb

The notebooks menu should open in your browser. From the main index, you can follow the menus to browse different analyses, or use File --> Open from the notebook toolbar to access the full file tree.
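If the launch command fails, the most likely cause is running it from the wrong directory. The sketch below checks that the index notebook exists before launching. It assumes the repository was cloned to `$HOME/projects/short-read-tax-assignment` as in the setup commands. `REPO_DIR` and `find_index` are illustrative names, not part of the repository.

```shell
# REPO_DIR and find_index are illustrative names, not part of this repository.
REPO_DIR="${REPO_DIR:-$HOME/projects/short-read-tax-assignment}"

# Print the path to index.ipynb under the given repository root,
# or fail with a message if the file is not there.
find_index() {
    index="$1/ipynb/index.ipynb"
    if [ ! -f "$index" ]; then
        echo "index notebook not found: $index" >&2
        return 1
    fi
    printf '%s\n' "$index"
}

# Usage (this launches Jupyter, so it is left commented out here):
# index=$(find_index "$REPO_DIR") && cd "$(dirname "$index")" && jupyter notebook index.ipynb
```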

Citing

If you use any of the data or code included in this repository, please cite with:

Bokulich NA, Rideout JR, Kopylova E, Bolyen E, Patnode J, Ellett Z, McDonald D, Wolfe B, Maurice CF, Dutton RJ, Turnbaugh PJ, Knight R, Caporaso JG. (2015) A standardized, extensible framework for optimizing classification improves marker-gene taxonomic assignments. PeerJ PrePrints 3:e1156 https://dx.doi.org/10.7287/peerj.preprints.934v1

short-read-tax-assignment's People

Contributors

gregcaporaso, jairideout, nbokulich, benkaehler, zellett, ebolyen

short-read-tax-assignment's Issues

mock community data generation notebook?

@gregcaporaso @BenKaehler should we add a notebook demonstrating how to generate (from raw mock community fastqs) empty biom tables and seq rep sets ready for taxonomy assignment?

My initial feeling is that this may be unnecessary, given that it is such a basic process and we just supply these materials for multiple datasets ready for analysis.

However, we might want to update these materials anyway, using QIIME2 instead of QIIME1 (though this carries the risk that we may need to change some functions in the existing notebooks).

Thoughts?
