Introduction

Seqtk is a fast and lightweight tool for processing sequences in FASTA or FASTQ format. It parses both FASTA and FASTQ files transparently, and the input may optionally be gzip-compressed. To install seqtk:

git clone https://github.com/lh3/seqtk.git;
cd seqtk; make

The only library dependency is zlib.

Seqtk Examples

  • Convert FASTQ to FASTA:

      seqtk seq -a in.fq.gz > out.fa
    
  • Convert Illumina 1.3+ FASTQ to FASTA and mask bases with quality lower than 20 to lowercase (the 1st command line) or to N (the 2nd):

      seqtk seq -aQ64 -q20 in.fq > out.fa
      seqtk seq -aQ64 -q20 -n N in.fq > out.fa
    
  • Fold long FASTA/Q lines and remove FASTA/Q comments:

      seqtk seq -Cl60 in.fa > out.fa
    
  • Convert multi-line FASTQ to 4-line FASTQ:

      seqtk seq -l0 in.fq > out.fq
    
  • Reverse complement FASTA/Q:

      seqtk seq -r in.fq > out.fq
    
  • Extract sequences with names in file name.lst, one sequence name per line:

      seqtk subseq in.fq name.lst > out.fq
    
  • Extract sequences in regions contained in file reg.bed:

      seqtk subseq in.fa reg.bed > out.fa
    
  • Mask regions in reg.bed to lowercase:

      seqtk seq -M reg.bed in.fa > out.fa
    
  • Subsample 10000 read pairs from two large paired FASTQ files (remember to use the same random seed to keep pairing):

      seqtk sample -s100 read1.fq 10000 > sub1.fq
      seqtk sample -s100 read2.fq 10000 > sub2.fq
    
  • Trim low-quality bases from both ends using the Phred algorithm:

      seqtk trimfq in.fq > out.fq
    
  • Trim 5bp from the left end of each read and 10bp from the right end:

      seqtk trimfq -b 5 -e 10 in.fa > out.fa
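
To make the quality-masking example above (-Q64 -q20, with or without -n N) more concrete, here is a minimal Python sketch of the idea. It is an illustration only, not seqtk's actual code: the function name mask_low_quality and its parameters are hypothetical, and it assumes the quality string stores one ASCII character per base with the Phred score equal to the character code minus the offset (64 for Illumina 1.3+).

```python
def mask_low_quality(seq, qual, threshold=20, offset=64, mask_char=None):
    """Mask bases whose Phred score is below `threshold`.

    offset    -- ASCII offset of the quality encoding (assumed 64 here).
    mask_char -- None lowercases low-quality bases; otherwise they are
                 replaced by this character (e.g. "N").
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    out = []
    for base, q in zip(seq, qual):
        score = ord(q) - offset  # decode the Phred score for this base
        if score < threshold:
            out.append(base.lower() if mask_char is None else mask_char)
        else:
            out.append(base)
    return "".join(out)


# Scores for "hTSB" with offset 64 are 40, 20, 19 and 2.
lowercased = mask_low_quality("ACGT", "hTSB")
replaced = mask_low_quality("ACGT", "hTSB", mask_char="N")
```

In this sketch a base with a score of exactly 20 is left untouched, matching the "lower than 20" wording above.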
    
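
The reminder above about reusing the same random seed when subsampling paired files can be illustrated with a small reservoir-sampling sketch in Python. This is a hedged illustration of the general technique, not a copy of seqtk's implementation; reservoir_sample and the read names are hypothetical. The point is that the sequence of random draws depends only on the seed and on the number of records seen, so two files with the same number of reads in the same order yield the same selected positions.

```python
import random


def reservoir_sample(records, k, seed):
    """Return k records chosen uniformly at random in a single pass."""
    rng = random.Random(seed)  # private generator, fully determined by seed
    reservoir = []
    for i, rec in enumerate(records):
        if i < k:
            reservoir.append(rec)      # fill the reservoir first
        else:
            j = rng.randint(0, i)      # inclusive on both ends
            if j < k:
                reservoir[j] = rec     # replace a previously kept record
    return reservoir


# Two hypothetical paired files: same read order, mates marked /1 and /2.
reads1 = [f"read{i}/1" for i in range(1000)]
reads2 = [f"read{i}/2" for i in range(1000)]

sub1 = reservoir_sample(reads1, 10, seed=100)
sub2 = reservoir_sample(reads2, 10, seed=100)
```

With the same seed, sub1 and sub2 hold mates of the same reads in the same slots. With different seeds the selected positions would generally differ and the pairing would be lost.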
