mgs_sop's Introduction

Metagenomic Standard Operating Procedure

🚧 The SOP is not fully implemented here. See the Langille Lab's wiki for the most up-to-date version.

The Langille Lab's metagenomic SOP, implemented as a WDL workflow.

Prerequisites

This workflow is designed to be run on a Unix-based system.

Install dependencies

To ease the installation of dependencies, we use conda to manage the execution environment:

$ conda env create
$ source activate mgs_sop

If you'd like to install the tools manually, see the environment.yml file for the project dependencies.

Set input variables

The workflow relies on several user-supplied inputs. Before running it, copy the input template and fill in the details for your project and environment. For example:

$ cp inputs.template.json inputs.json

Then open the file with your preferred text editor and fill in the values. Some values describe the location of required databases (see the following section).
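
Before launching the workflow, it can help to confirm that no value in inputs.json was left blank. The following is a minimal sketch, not part of the SOP: find_empty_values is a hypothetical helper, and it assumes unfilled values appear as empty strings ("key": "") on their own line, which may not match how inputs.template.json is actually written.

```shell
# Hypothetical helper: print the lines of a JSON file whose value is an
# empty string. Assumption: unfilled inputs look like  "key": ""
find_empty_values() {
    grep -nE ':[[:space:]]*""[[:space:]]*,?[[:space:]]*$' "$1"
}

# Only run the check if inputs.json exists in the current directory.
if [ -f inputs.json ]; then
    if find_empty_values inputs.json; then
        echo "The inputs listed above still appear to be empty." >&2
    else
        echo "No empty string values found in inputs.json."
    fi
fi
```

This only catches blank strings. It does not confirm that the file is valid JSON or that the paths it contains exist.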

Download required databases

A mapping database for screening reads can be downloaded from the Langille Lab (Note: the file is ~3.5GB):

$ curl \
    http://kronos.pharmacology.dal.ca/public_files/GRCh38_PhiX_bowtie2_index.tar.gz \
    > /tmp/GRCh38_PhiX_bowtie2_index.tar.gz
$ mkdir -p databases
$ tar xvzf /tmp/GRCh38_PhiX_bowtie2_index.tar.gz -C databases
$ rm /tmp/GRCh38_PhiX_bowtie2_index.tar.gz

# Copy and paste the absolute path of the database directory into your
# inputs.json. On Linux, you can get the full path with:
$ readlink -f databases/GRCh38_PhiX_bowtie2_index/
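
A quick way to confirm that the archive unpacked as expected is to count the Bowtie2 index files in the database directory. This is a sketch under assumptions: count_bt2_files is a hypothetical helper, and the directory name is taken from the readlink command above. A standard Bowtie2 index consists of six .bt2 files (.bt2l for large indexes).

```shell
# Hypothetical helper: count Bowtie2 index files directly inside a directory.
count_bt2_files() {
    find "$1" -maxdepth 1 -type f \( -name '*.bt2' -o -name '*.bt2l' \) \
        | wc -l | tr -d ' '
}

# Assumed location, matching the extraction step in this section.
db_dir="databases/GRCh38_PhiX_bowtie2_index"
if [ -d "$db_dir" ]; then
    echo "Found $(count_bt2_files "$db_dir") Bowtie2 index files in $db_dir"
fi
```

A count other than six suggests an interrupted download or a different archive layout than assumed here.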

Running the workflow

Run the workflow with Cromwell (assumes the populated inputs template was named inputs.json):

$ cromwell run workflow.wdl --inputs inputs.json

Testing the workflow

Test data can be obtained from NCBI using the SRA Toolkit. The following command downloads a small sample set of FASTQs (from an HMP project, accessible via NCBI with project accession PRJNA46333):

$ fastq-dump \
    --readids \
    --split-files \
    --maxSpotId 10 \
    --outdir tests/fastq/ \
    SRR3644404
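
To check the download, you can count the reads in each FASTQ file. The sketch below assumes the standard four-line FASTQ record layout and that the files land in tests/fastq/ with a .fastq extension. count_fastq_reads is a hypothetical helper, not part of the SRA Toolkit. With --maxSpotId 10, each file would be expected to hold at most 10 reads.

```shell
# Hypothetical helper: number of reads in a FASTQ file, assuming each
# record occupies exactly four lines.
count_fastq_reads() {
    lines=$(wc -l < "$1")
    echo $((lines / 4))
}

# Report the read count for every FASTQ file in the assumed output directory.
for f in tests/fastq/*.fastq; do
    [ -f "$f" ] || continue   # skip if the glob matched nothing
    echo "$f: $(count_fastq_reads "$f") reads"
done
```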

Once the test data has been downloaded, run the workflow with the test input file:

$ cromwell run workflow.wdl --inputs tests/inputs.test.json

Contributors

karlrl
