
awesome-one-liners's Introduction

One line, when multiple are not enough 👍

Give a 2-column, tab-separated list of read number and read length from a fastq file

zcat whatever.fastq.gz | paste - - - - | awk -F'\t' '{print NR "\t" length($2)}'
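
To see what this kind of command produces, here is a minimal sketch on a two-read FASTQ file created on the spot. The file name demo.fastq.gz and the reads are made up for illustration. The sketch splits on tabs so that the sequence is always field 2, even when the header contains spaces, and it uses gzip -dc, which is equivalent to zcat on GNU systems.

```shell
# Build a tiny gzipped FASTQ (hypothetical reads) in a scratch directory.
tmpdir=$(mktemp -d)
printf '@read1 first\nACGTAC\n+\nIIIIII\n@read2 second\nACG\n+\nIII\n' | gzip > "$tmpdir/demo.fastq.gz"

# paste joins each 4-line record into one tab-separated line;
# awk then prints the record number and the length of field 2 (the sequence).
lengths=$(gzip -dc "$tmpdir/demo.fastq.gz" | paste - - - - | awk -F'\t' '{print NR "\t" length($2)}')
printf '%s\n' "$lengths"

rm -r "$tmpdir"
```

For the two demo reads this prints one line per read: the read number, a tab, and the read length (6 and 3).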


Count total reads in a fastq file

zcat whatever.fastq.gz | wc -l | awk '{print $1/4}'


Change extension of multiple files at once.

In the example below, the extension changes from .scafSeq to .fa

for f in *.scafSeq; do mv "$f" "$(basename "$f" .scafSeq).fa"; done

The rename command can also come in handy in such cases. For example, to rename all .fastq files to .fasta (util-linux rename syntax):

rename .fastq .fasta *.fastq


Get A T G C counts for all sequences from a multi fasta file

This assumes each sequence sits on a single line. For RNA, replace T with U.

echo -e "seq_id\tA\tT\tG\tC"; while read -r line; do case $line in ">"*) echo "${line#>}";; *) for i in A T G C; do echo "$line" | grep -o "$i" | wc -l; done;; esac; done < test.fa | paste - - - - -
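
The loop above expects every sequence on one line. For wrapped (multi-line) records, a single awk pass is one possible alternative. The sketch below is an illustration, not the original author's method: the input file and its sequences are made up, and the helper name emit is hypothetical.

```shell
# Hypothetical multi-FASTA input: seq2 is wrapped over two lines,
# contains lowercase bases, and has no T at all.
tmpdir=$(mktemp -d)
printf '>seq1\nAATGC\n>seq2\nGGCC\nAAcc\n' > "$tmpdir/test.fa"

# gsub() returns the number of substitutions made, so replacing a base
# with itself counts its occurrences without changing the line.
counts=$(awk 'BEGIN { OFS = "\t"; print "seq_id", "A", "T", "G", "C" }
  function emit() { if (id != "") print id, a, t, g, c }
  /^>/ { emit(); id = substr($0, 2); a = t = g = c = 0; next }
  { s = toupper($0)
    a += gsub(/A/, "A", s); t += gsub(/T/, "T", s)
    g += gsub(/G/, "G", s); c += gsub(/C/, "C", s) }
  END { emit() }' "$tmpdir/test.fa")
printf '%s\n' "$counts"

rm -r "$tmpdir"
```

Counts accumulate across all lines of a record and are printed when the next header (or the end of the file) is reached, so a base that never occurs is reported as 0 instead of being dropped.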

Counting number of sequences in a fasta file:

grep -c "^>" file.fa


Add something to end of all header lines:

sed 's/>.*/&WHATEVERYOUWANT/' file.fa > outfile.fa


Clean up a fasta file so that only the first column of each header is kept:

awk '{print $1}' file.fa > output.fa
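
The command above prints the first field of every line, which is harmless as long as sequence lines contain no whitespace. A variant that only touches header lines might look like the sketch below. The input file and header annotations are made up for illustration.

```shell
# Hypothetical FASTA with extra annotations after the sequence ID.
tmpdir=$(mktemp -d)
printf '>seq1 sample=A len=8\nACGTACGT\n>seq2 sample=B\nGGCC\n' > "$tmpdir/file.fa"

# Header lines: keep only the first whitespace-delimited field.
# All other lines: print unchanged.
cleaned=$(awk '/^>/ { print $1; next } { print }' "$tmpdir/file.fa")
printf '%s\n' "$cleaned"

rm -r "$tmpdir"
```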


Count the number of sequences in each cluster generated by CD-HIT:

for i in *.clstr; do echo "$i"; awk '/^>Cluster/ {if (name != "") print name " " n; name = $1 "_" $2; n = 0; next} {n++} END {if (name != "") print name " " n}' "$i" > "$i.count.txt"; done
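
As a worked example, the sketch below tallies the member lines under each >Cluster header of a small, made-up .clstr file. The sequence names and lengths are hypothetical. Counting lines per cluster, rather than reading the index of the line before the next header, also gives the right size for the last cluster in the file.

```shell
# Hypothetical CD-HIT cluster file: three clusters with 2, 1 and 2 members.
tmpdir=$(mktemp -d)
printf '>Cluster 0\n0\t120aa, >seqA... *\n1\t118aa, >seqB... at 95.00%%\n>Cluster 1\n0\t90aa, >seqC... *\n>Cluster 2\n0\t80aa, >seqD... *\n1\t79aa, >seqE... at 98.00%%\n' > "$tmpdir/demo.clstr"

# On each cluster header, print the previous cluster's tally and reset;
# every other line is a member and increments the counter.
sizes=$(awk '/^>Cluster/ { if (name != "") print name, n; name = $1 "_" $2; n = 0; next }
  { n++ }
  END { if (name != "") print name, n }' "$tmpdir/demo.clstr")
printf '%s\n' "$sizes"

rm -r "$tmpdir"
```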

Change extension of fastq files in batch (Perl rename syntax)

rename 's/_fastp\.fastq\.gz$/.fq.gz/' *.gz
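
The rename command exists in more than one variant (util-linux and Perl), and they take different arguments. Where it is unclear which one is installed, a plain shell loop with suffix removal is a portable fallback. This is a sketch with made-up file names, run in a scratch directory.

```shell
# Hypothetical input files in a scratch directory.
tmpdir=$(mktemp -d)
touch "$tmpdir/sampleA_fastp.fastq.gz" "$tmpdir/sampleB_fastp.fastq.gz"

# ${f%suffix} strips the suffix from the end of the name; the new one is appended.
for f in "$tmpdir"/*_fastp.fastq.gz; do
  [ -e "$f" ] || continue   # skip the literal pattern if nothing matched
  mv -- "$f" "${f%_fastp.fastq.gz}.fq.gz"
done

renamed=$(ls "$tmpdir")
printf '%s\n' "$renamed"

rm -r "$tmpdir"
```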
