circleseq

Primary analysis pipeline for ultra-accurate sequencing data

Dependencies

External Software

    bwa v0.7.17
    bedtools v2.26.0
    samtools v1.7
    snakemake v5.7.1
    optional: conda 4.7.12

Python Packages

    scikit-bio v0.5.5
    biopython v1.74
    optional: matplotlib v3.1.0
    optional: seaborn v0.9.0

Installation

Clone the git repository:

git clone https://github.com/jmcbroome/circleseq

Ensure the external software dependencies are installed and on your shell's path.
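
A quick way to confirm that the tools are discoverable is a short check like the one below. This is a minimal sketch and not part of the pipeline: the name `missing_tools` is hypothetical, and it only tests whether an executable with each name is on the PATH, not whether its version matches the list above.

```python
import shutil

# Executables named in the External Software list above.
REQUIRED_TOOLS = ("bwa", "bedtools", "samtools", "snakemake")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

An empty result means every listed executable was found; anything returned still needs to be installed or added to the path.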

Package dependencies can be installed individually, or the circleseq.yml environment can be used via conda. A conda environment can also be created directly:

conda create -c conda-forge -c bioconda -n circleseq snakemake scikit-bio biopython seaborn samtools bedtools bwa
conda activate circleseq

Formatting Files

The Snakefile as it stands expects paired input files named {sample}_R1.fq.gz and {sample}_R2.fq.gz in the "input" directory, and reference data at references/{reference_genome}.fa. Replace the bracketed values with your sample name and your reference genome name.

The file structure should look like this:

Directory with scripts
    input
         {sample}_R1.fq.gz
         {sample}_R2.fq.gz
    references
         {reference_genome}.fa
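
The layout can be checked before running anything. The sketch below assumes the file names shown in the tree above; the function names `expected_files` and `missing_files` are hypothetical and are not provided by circleseq.

```python
from pathlib import Path


def expected_files(sample, reference_genome, root="."):
    """List the paths the layout above implies for one sample and reference."""
    root = Path(root)
    return [
        root / "input" / f"{sample}_R1.fq.gz",
        root / "input" / f"{sample}_R2.fq.gz",
        root / "references" / f"{reference_genome}.fa",
    ]


def missing_files(sample, reference_genome, root="."):
    """Return the expected paths that do not exist as regular files."""
    return [
        path
        for path in expected_files(sample, reference_genome, root)
        if not path.is_file()
    ]
```

Calling `missing_files("mysample", "myreference")` from the directory with the scripts returns an empty list when all three files are in place; both names here are placeholders.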

Usage

First, ensure your reference of choice is indexed with bwa.

bwa index references/{reference_genome}.fa
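
To see whether a reference has already been indexed, one option is to look for the files that bwa index writes next to the FASTA (.amb, .ann, .bwt, .pac and .sa). The sketch below assumes the reference FASTA sits at references/{reference_genome}.fa and was indexed without a custom prefix; `missing_index_files` is a hypothetical helper name.

```python
from pathlib import Path

# Suffixes of the files that `bwa index` writes next to the FASTA it indexes.
BWA_INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def missing_index_files(fasta):
    """Return the bwa index files that are not present for the given FASTA."""
    fasta = Path(fasta)
    candidates = [Path(str(fasta) + suffix) for suffix in BWA_INDEX_SUFFIXES]
    return [path for path in candidates if not path.is_file()]
```

An empty list suggests the index is complete; any path returned means the indexing step still needs to be run.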

Then simply call:

snakemake -j {max_threads} {sample}_{reference_genome}.txt

To include the final, optional graphing step (which requires matplotlib and seaborn), instead call:

snakemake -j {max_threads} {sample}_{reference_genome}.png

Alternatively, run the graph_mutations.py script separately on the error table produced by the pipeline above.

As before, replace the bracketed values with the name of your sample and the name of your reference genome file. {max_threads} is the maximum number of threads available to the pipeline for processing, and defaults to 1. Add the argument "--use-conda circleseq.yml" as an alternative to installing the required packages globally, or activate the environment with conda and call snakemake from within it.
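
When several samples are processed, the target names can be assembled programmatically. The sketch below only builds the argument list for the two calls shown above; `snakemake_command` is a hypothetical helper, and the sample and reference names in the usage note are placeholders.

```python
def snakemake_command(sample, reference_genome, max_threads=1, plot=False):
    """Build the snakemake call for one sample and reference genome.

    With plot=True the .png target is requested, which includes the
    optional graphing step; otherwise the .txt error table is requested.
    """
    suffix = "png" if plot else "txt"
    target = f"{sample}_{reference_genome}.{suffix}"
    return ["snakemake", "-j", str(max_threads), target]
```

For example, `snakemake_command("sampleA", "refB", max_threads=4)` yields the arguments for `snakemake -j 4 sampleA_refB.txt`, which can then be passed to `subprocess.run(..., check=True)`.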

Note that the mutations reported here may include unfiltered artifacts such as mismapping errors. Downstream analysis should generally be performed using the constructed consensus bam or the accompanying pileup variants.txt.

Test Case

Reference and simulated input data are provided so that you can run an example and verify that the dependencies are installed correctly.

To run the test case, enter at the command line:

snakemake -c1 simulated_yeast.txt
