CUT_RUN: Hiebert Lab How to

A vignette on how to analyze CUT&RUN data for the Hiebert Lab.

Trimmomatic

phred33: specifies the base quality encoding

LEADING: Cut bases off the start of a read, if below a threshold quality

TRAILING: Cut bases off the end of a read, if below a threshold quality

MINLEN: Drop the read if it is below a specified length

threads: number of processors or threads to use

ILLUMINACLIP: Cut adapter and other Illumina-specific sequences from the read. The values after the adapter file are the seed mismatches, the palindrome clip threshold, and the simple clip threshold

Option 1:
java -classpath {SOFTWARE_DIR}/trimmomatic-0.39.jar org.usadellab.trimmomatic.TrimmomaticPE -phred33 -threads 8 \
5176-MB-1-TCGGATTC-CGCAACTA_S01_L005_R1_001.fastq.gz 5176-MB-1-TCGGATTC-CGCAACTA_S01_L005_R2_001.fastq.gz \
5176-MB-1_R1.noadap.paired.txt 5176-MB-1_R1.noadap.unpaired.txt 5176-MB-1_R2.noadap.paired.txt 5176-MB-1_R2.noadap.unpaired.txt \
ILLUMINACLIP:TruSeq_CD_adapter.txt:2:30:7 LEADING:15 TRAILING:15 MINLEN:15 
Option 2: run in the background using nohup
nohup java -classpath {SOFTWARE_DIR}/trimmomatic-0.39.jar org.usadellab.trimmomatic.TrimmomaticPE -phred33 -threads 8 \
5176-MB-1-TCGGATTC-CGCAACTA_S01_L005_R1_001.fastq.gz 5176-MB-1-TCGGATTC-CGCAACTA_S01_L005_R2_001.fastq.gz \
5176-MB-1_R1.noadap.paired.txt 5176-MB-1_R1.noadap.unpaired.txt 5176-MB-1_R2.noadap.paired.txt 5176-MB-1_R2.noadap.unpaired.txt \
ILLUMINACLIP:TruSeq_CD_adapter.txt:2:30:7 LEADING:15 TRAILING:15 MINLEN:15 > 5176-trim-1.out &
Option 3: Refer to shell script loop example
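
The loop script itself is not included in this vignette, so the following is only a minimal sketch of what such a loop might look like. The `run` and `trim_sample` helpers, the `DRY_RUN` switch, the default `SOFTWARE_DIR` path, and the `5176-MB-*` file glob are all assumptions made for illustration. The Trimmomatic arguments are copied from Option 1.

```shell
# Sketch only: SOFTWARE_DIR default, the 5176-MB-* glob, and DRY_RUN are assumptions.
SOFTWARE_DIR="${SOFTWARE_DIR:-/path/to/software}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print each command instead of running it

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# $1 = FASTQ prefix (everything before _R1_001.fastq.gz), $2 = short sample name
trim_sample() {
  run java -classpath "${SOFTWARE_DIR}/trimmomatic-0.39.jar" \
    org.usadellab.trimmomatic.TrimmomaticPE -phred33 -threads 8 \
    "${1}_R1_001.fastq.gz" "${1}_R2_001.fastq.gz" \
    "${2}_R1.noadap.paired.txt" "${2}_R1.noadap.unpaired.txt" \
    "${2}_R2.noadap.paired.txt" "${2}_R2.noadap.unpaired.txt" \
    ILLUMINACLIP:TruSeq_CD_adapter.txt:2:30:7 LEADING:15 TRAILING:15 MINLEN:15
}

# Loop over every R1 file in the current directory and trim it with its R2 mate.
for r1 in 5176-MB-*_R1_001.fastq.gz; do
  [ -e "$r1" ] || continue                                  # glob matched nothing
  prefix="${r1%_R1_001.fastq.gz}"                           # strip the R1 suffix
  name="$(printf '%s\n' "$prefix" | cut -d- -f1-3)"         # e.g. 5176-MB-1
  trim_sample "$prefix" "$name"
done
```

With `DRY_RUN=1` the loop only prints the commands so they can be checked first. Saved under a hypothetical name such as trim_loop.sh, it could then be run in the background with `DRY_RUN=0 nohup sh trim_loop.sh > trim_loop.out 2>&1 &`.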

Bowtie2

From the CUT&RUN protocol on protocols.io: "For mapping spike-in fragments, we also use the --no-overlap --no-dovetail options to avoid cross-mapping of the experimental genome to that of the spike-in DNA." (https://www.protocols.io/view/cut-amp-run-targeted-in-situ-genome-wide-profiling-14egnr4ql5dy/v3?step=113)

local: Local alignment searches for the best-scoring alignment of a substring of the read, so bases at either end may be left out (soft-clipped) when excluding them gives a higher score. End-to-end alignment scores the alignment of the entire read against the reference. If the reads may contain adapters, long mismatches, or indels, local mode works best. If you have good reason to believe that the whole read should match the reference, use end-to-end mode instead.

very-sensitive-local: Same as: -D 20 -R 3 -N 0 -L 20 -i S,1,0.50

no-unal: Suppress SAM records for reads that failed to align.

no-mixed: By default, when bowtie2 cannot find a concordant or discordant alignment for a pair, it then tries to find alignments for the individual mates. This option disables that behavior.

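no-discordant: A discordant alignment is one where both mates align uniquely but the pair does not satisfy the paired-end constraints (such as -I and -X). By default, bowtie2 looks for discordant alignments when it cannot find a concordant one. This option disables that behavior.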

no-overlap: If one mate alignment overlaps the other at all, consider that to be non-concordant

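no-dovetail: Mates "dovetail" when one mate's alignment extends past the beginning of the other, so that the wrong mate begins upstream. With this option such pairs are not considered concordant, which is also bowtie2's default behavior.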

I: The minimum fragment length for valid paired-end alignments.

X: The maximum fragment length for valid paired-end alignments.

p: number of threads or processors to use while running
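
One way to sanity-check the -I and -X values is to look at the fragment lengths the aligner reports in column 9 (TLEN) of the SAM output, for example from a trial run with a wider -X. The sketch below is an illustration rather than part of the lab's pipeline: the helper names `frag_lengths` and `frag_range_summary` are hypothetical, and it assumes that TLEN as written by the aligner is an acceptable proxy for fragment length.

```shell
# Print one fragment length per pair: SAM column 9 (TLEN) is positive for the
# leftmost mate and negative for its partner, so keep only the positive values.
# Reads SAM text from the given files, or from stdin when no file is given.
frag_lengths() {
  awk -F'\t' '!/^@/ && $9 > 0 { print $9 }' "$@"
}

# Count fragments inside and outside [min, max], the values passed to -I and -X.
# $1 = min, $2 = max, remaining arguments = SAM files (or stdin if none).
frag_range_summary() {
  min="$1"; max="$2"; shift 2
  frag_lengths "$@" | awk -v min="$min" -v max="$max" \
    '{ if ($1 >= min && $1 <= max) inside++; else outside++ }
     END { printf "inside=%d outside=%d\n", inside, outside }'
}
```

For example, `frag_range_summary 10 700 4617-MB-1.hg19scer.sam` would report how many pairs fall within the 10 to 700 bp window used in this vignette, and `frag_lengths 4617-MB-1.hg19scer.sam | sort -n | uniq -c` gives a simple length histogram.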

Option 1:
bowtie2 -p 8 --local --very-sensitive-local --no-unal --no-mixed --no-discordant --phred33 -I 10 -X 700 --no-overlap --no-dovetail -x {GENOME_DIR}/hg19_ec \
-1 4617-MB-1_R1.noadap.paired.txt -2 4617-MB-1_R2.noadap.paired.txt -S 4617-MB-1.hg19scer.sam 
Option 2: Run with nohup
Option 3: Refer to shell script loop example
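
Options 2 and 3 are not spelled out for Bowtie2, so here is a minimal sketch of how they might look, reusing the exact flags from Option 1. The `run` and `align_sample` helpers, the `DRY_RUN` switch, the default `GENOME_DIR` path, and the sample numbers 1 to 3 are assumptions for illustration only.

```shell
# Sketch only: GENOME_DIR default, DRY_RUN, and the sample numbers are assumptions.
GENOME_DIR="${GENOME_DIR:-/path/to/genomes}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print each command instead of running it

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# $1 = sample name shared by the trimmed paired files, e.g. 4617-MB-1
align_sample() {
  run bowtie2 -p 8 --local --very-sensitive-local --no-unal --no-mixed \
    --no-discordant --phred33 -I 10 -X 700 --no-overlap --no-dovetail \
    -x "${GENOME_DIR}/hg19_ec" \
    -1 "${1}_R1.noadap.paired.txt" -2 "${1}_R2.noadap.paired.txt" \
    -S "${1}.hg19scer.sam"
}

# Align each sample in turn (hypothetical sample numbers).
for i in 1 2 3; do
  align_sample "4617-MB-${i}"
done
```

Saved under a hypothetical name such as align_loop.sh, the whole loop can be sent to the background in the same way as the Trimmomatic nohup example: `DRY_RUN=0 nohup sh align_loop.sh > align_loop.out 2>&1 &`. Bowtie2 writes its alignment summary to standard error, so `2>&1` keeps that summary in the log file.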

Samtools

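Option 1: run each step per sample (${i} is the sample number; ${ALIGN} and ${BAM} are the directories holding the SAM and BAM files)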
##### sam to bam file
samtools view -S -b -@ 14  ${ALIGN}/9631-MB-${i}.hg19ec.sam -o ${BAM}/9631-MB-${i}.hg19ec.bam
##### read quality filter
samtools view -b -F 4 -q 10 -@ 14 ${BAM}/9631-MB-${i}.hg19ec.bam -o ${BAM}/9631-MB-${i}.hg19ec.F4q10.bam
##### sort 
samtools sort -@ 12 ${BAM}/9631-MB-${i}.hg19ec.F4q10.bam -o ${BAM}/9631-MB-${i}.hg19ec.F4q10.sorted.bam
##### index
samtools index ${BAM}/9631-MB-${i}.hg19ec.F4q10.sorted.bam
##### selecting only the human chromosomes minus chrM
samtools view -@ 12 -bh ${BAM}/9631-MB-${i}.hg19ec.F4q10.sorted.bam chr1 chr2 chr3 chr4 chr5 chr6 chr7 chr8 chr9 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 chr18 chr19 chr20 chr21 chr22 chrX chrY > ${BAM}/9631-MB-${i}.hg19.F4q10.sorted.bam
##### index
samtools index ${BAM}/9631-MB-${i}.hg19.F4q10.sorted.bam
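
The commands above rely on `${i}`, `${ALIGN}`, and `${BAM}` being set by a surrounding loop that is not shown. A minimal sketch of such a wrapper follows. The `run`, `human_chroms`, and `process_sample` helpers, the `DRY_RUN` switch, the default directory names, and the sample numbers 1 to 3 are all assumptions. The samtools options are the ones used above, except that the output file is given with -o instead of a shell redirect so that the dry run can print the full command.

```shell
# Sketch only: directory defaults, DRY_RUN, and the sample numbers are assumptions.
ALIGN="${ALIGN:-align}"   # directory holding the SAM files
BAM="${BAM:-bam}"         # directory receiving the BAM files
DRY_RUN="${DRY_RUN:-1}"   # 1 = print each command instead of running it

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Print "chr1 chr2 ... chr22 chrX chrY" (human chromosomes without chrM).
human_chroms() {
  n=1
  while [ "$n" -le 22 ]; do
    printf 'chr%s ' "$n"
    n=$((n + 1))
  done
  printf 'chrX chrY\n'
}

# $1 = sample name, e.g. 9631-MB-1. Steps: SAM to BAM, quality filter, sort,
# index, keep human chromosomes only, index. Stops at the first failing step.
process_sample() {
  run samtools view -S -b -@ 14 -o "${BAM}/${1}.hg19ec.bam" "${ALIGN}/${1}.hg19ec.sam" &&
  run samtools view -b -F 4 -q 10 -@ 14 -o "${BAM}/${1}.hg19ec.F4q10.bam" "${BAM}/${1}.hg19ec.bam" &&
  run samtools sort -@ 12 -o "${BAM}/${1}.hg19ec.F4q10.sorted.bam" "${BAM}/${1}.hg19ec.F4q10.bam" &&
  run samtools index "${BAM}/${1}.hg19ec.F4q10.sorted.bam" &&
  run samtools view -@ 12 -b -o "${BAM}/${1}.hg19.F4q10.sorted.bam" \
    "${BAM}/${1}.hg19ec.F4q10.sorted.bam" $(human_chroms) &&
  run samtools index "${BAM}/${1}.hg19.F4q10.sorted.bam"
}

# Process each sample in turn (hypothetical sample numbers).
for i in 1 2 3; do
  process_sample "9631-MB-${i}"
done
```

The unquoted `$(human_chroms)` is deliberate: word splitting turns the list into one region argument per chromosome. Extracting regions this way needs the index created in the preceding step.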
