genomicamicrob / nextera_cleaner

Script to clean Illumina paired-end sequences produced with the Nextera kit. Bases below Q30, Ns, and Nextera adapters are removed. Bases can also be removed at the beginning and end of each sequence. At the end, the clean files can be analyzed with FastQC.

License: MIT License

Shell 100.00%
nextera nextera-adapters fastqc paired-end

nextera_cleaner

Bash script to clean Illumina paired-end sequences produced with the Nextera kit.

This script processes both paired-end files. It asks for a common name for the resulting sequences, trims bases below a chosen Phred score, and removes Ns and sequences shorter than 20 bases. It can also delete bases from the beginning of the sequences and trim the sequences to a given length by removing bases from the 3' end. Compressed files (.gz) are decompressed automatically. Afterwards, it asks whether you want to merge the paired-end sequences (with flash), convert them to FASTA, and run FastQC on the resulting files.
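The trimming steps described above map onto standard Cutadapt options. The following is only a rough sketch of such a call, not the script's actual invocation: the helper name build_cutadapt_cmd, the output file names, and the adapter sequence (the commonly used Nextera transposase sequence) are assumptions for illustration. The helper prints the command instead of running it, so it can be inspected first.

```shell
# Sketch only: prints a Cutadapt command that approximates the cleaning steps.
# Assumptions (not taken from the script): helper name, output names, adapter.
NEXTERA_ADAPTER="CTGTCTCTTATACACATCT"   # assumed Nextera transposase adapter
QUALITY=30                              # Phred cutoff (Q30, as in the description)
MIN_LEN=20                              # discard reads shorter than 20 bases

# Usage: build_cutadapt_cmd R1 R2 NAME [BASES_TO_CUT_AT_START] [MAX_LENGTH]
build_cutadapt_cmd() (
    r1=$1; r2=$2; name=$3
    head_trim=${4:-0}; max_len=${5:-0}
    # -q: quality trimming, --trim-n: remove flanking Ns, -m: minimum length,
    # -a/-A: 3' adapter for read 1 / read 2
    cmd="cutadapt -q $QUALITY --trim-n -m $MIN_LEN -a $NEXTERA_ADAPTER -A $NEXTERA_ADAPTER"
    # -u/-U: remove a fixed number of bases from the start of read 1 / read 2
    if [ "$head_trim" -gt 0 ]; then cmd="$cmd -u $head_trim -U $head_trim"; fi
    # -l: shorten reads to a maximum length by removing bases from the 3' end
    if [ "$max_len" -gt 0 ]; then cmd="$cmd -l $max_len"; fi
    printf '%s\n' "$cmd -o ${name}_R1.clean.fastq -p ${name}_R2.clean.fastq $r1 $r2"
)

build_cutadapt_cmd file_R1.fastq file_R2.fastq sample 10 150
```

How some of these options apply to the second read depends on the Cutadapt version, so check cutadapt --help before running the printed command.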

INSTALLATION

  1. Download the latest release to any directory in your system.
  2. Decompress it: tar xzf nextera_cleaner.v0.1.0.tar.gz
  3. Make it executable: chmod +x nextera_cleaner.v0.1.0.sh

Be sure to keep the nextera_adapter.tsv and contaminants.tsv files in the same folder as the script; these files are recommended for the FastQC step.

You can then create a symbolic link to the script so that you can call it from any directory.
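One way to create that link is sketched below. The helper name link_script and the example paths are hypothetical; the only requirement is that the target directory is listed in your PATH. Pass an absolute path to the script so that the link does not dangle.

```shell
# Sketch only: link the script into a directory that is on your PATH.
# Usage: link_script /absolute/path/to/script.sh /directory/on/PATH
link_script() (
    script=$1; bin_dir=$2
    # Refuse to create a link that would point at a missing file
    [ -f "$script" ] || { echo "not found: $script" >&2; exit 1; }
    mkdir -p "$bin_dir" &&
    ln -sf "$script" "$bin_dir/$(basename "$script")"
)

# Example (hypothetical paths, adjust to your system):
# link_script "$HOME/tools/nextera_cleaner/nextera_cleaner.v0.1.0.sh" "$HOME/bin"
```

Because the link points back to the original location, the .tsv files can stay next to the real script.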

USAGE

$ nextera_cleaner.v0.1.0.sh file_R1.fastq file_R2.fastq

Where file_R1.fastq and file_R2.fastq are the files provided by the Illumina sequencer.

The script will ask whether you want to trim bases at the beginning of the sequences and at the end. To choose appropriate numbers for both, it is recommended to first run FastQC on the raw sequences (file_R1.fastq and file_R2.fastq), check the output, and then decide whether you need to trim.
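A preliminary FastQC run on the raw reads could look like the sketch below. The helper name raw_fastqc and the output directory fastqc_raw are assumptions; the -o option tells FastQC where to write its reports, and the directory has to exist beforehand.

```shell
# Sketch only: run FastQC on the raw reads before deciding how much to trim.
# Usage: raw_fastqc OUTPUT_DIR FILE...
raw_fastqc() (
    outdir=$1; shift
    # Stop early if any input file is missing or unreadable
    for f in "$@"; do
        [ -r "$f" ] || { echo "cannot read: $f" >&2; exit 1; }
    done
    # Stop if FastQC is not installed
    command -v fastqc >/dev/null 2>&1 || { echo "fastqc not found in PATH" >&2; exit 127; }
    mkdir -p "$outdir" && fastqc -o "$outdir" "$@"
)

# Example:
# raw_fastqc fastqc_raw file_R1.fastq file_R2.fastq
```

The per-base quality and per-base sequence content plots in the resulting reports are the usual places to look when choosing how many bases to remove from each end.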

DEPENDENCIES

You need the following program in your PATH:

-Cutadapt

If you want to merge the sequences, you also need:

-flash

Finally, FastQC is optional:

-FastQC
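A quick way to check which of these tools are available is sketched below. The helper name missing_tools is hypothetical, and the executable names cutadapt, flash, and fastqc are assumed to match your installation.

```shell
# Sketch only: print the name of every tool that is not found in PATH.
missing_tools() (
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
)

missing=$(missing_tools cutadapt flash fastqc)
if [ -n "$missing" ]; then
    echo "Missing from PATH:"
    echo "$missing"
else
    echo "All tools found"
fi
```

Only Cutadapt is strictly required; a missing flash or fastqc just means that the merging or quality-report step is unavailable.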
