intro-to-chipseq's Introduction

NOTE: The materials in this repository are no longer actively maintained. More recent content can be found at: https://hbctraining.github.io/Intro-to-ChIPseq-flipped/

OLD - Introduction to ChIP-seq using high performance computing

| Audience | Computational Skills | Prerequisites | Duration |
| :------- | :------------------- | :------------ | :------- |
| Biologists | Beginner/Intermediate | None | 3-day workshop (~19.5 hours of trainer-led time) |

Description

This repository has teaching materials for a 3-day Introduction to ChIP-sequencing data analysis workshop. The workshop focuses on teaching the basic computational skills needed to use a high-performance computing environment effectively to implement a ChIP-seq data analysis workflow. It includes an introduction to shell (bash) and shell scripting. In addition to running the ChIP-seq workflow from FASTQ files to peak calls and nearest gene annotations, the workshop covers best practice guidelines for ChIP-seq experimental design, data organization/management, and quality control.

These materials were developed for a trainer-led workshop, but are also amenable to self-guided learning.

Learning Objectives

  1. Understand the necessity for, and use of, the command line interface (bash) and HPC for analyzing high-throughput sequencing data.
  2. Understand best practices for designing a ChIP-seq experiment and analyzing the resulting data.

Lessons

Click here for links to lessons and the suggested schedule

Dataset

Installation Requirements

Download the most recent versions of R and RStudio for your laptop:

NOTE: When installing the following packages, if you are asked to select (a/s/n) or (y/n), please select "a" or "y" as applicable.

(1) Install the below packages on your laptop from CRAN. You DO NOT have to go to the CRAN webpage; you can use the following function to install them:

install.packages("BiocManager")
install.packages("tidyverse")

Note that these package names are case sensitive!

(2) Install the below packages from Bioconductor. Load BiocManager, then run BiocManager's install() function once for each of the 8 packages listed below:

library(BiocManager)
install("insert_first_package_name_in_quotations")
install("insert_second_package_name_in_quotations")
# ... and so on for the remaining packages

Note that these package names are case sensitive!

ChIPQC
ChIPseeker
DiffBind
clusterProfiler
AnnotationDbi
TxDb.Hsapiens.UCSC.hg19.knownGene
EnsDb.Hsapiens.v75
org.Hs.eg.db

NOTE: The library used for the annotations associated with genes (here we are using TxDb.Hsapiens.UCSC.hg19.knownGene and EnsDb.Hsapiens.v75) will change based on the organism (e.g. if studying mouse, you would need to install and load TxDb.Mmusculus.UCSC.mm10.knownGene). The list of different organism packages is given here.
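As an alternative to typing install() once per package, the calls in step (2) can be collapsed into a single call, since BiocManager::install() accepts a character vector of package names. The sketch below is only one way to do this: `missing_packages()` is a hypothetical helper (not part of BiocManager), and the `interactive()` guard is an assumption added so that nothing is installed when the script is run non-interactively.

```r
# The Bioconductor packages listed in step (2)
bioc_pkgs <- c("ChIPQC", "ChIPseeker", "DiffBind", "clusterProfiler",
               "AnnotationDbi", "TxDb.Hsapiens.UCSC.hg19.knownGene",
               "EnsDb.Hsapiens.v75", "org.Hs.eg.db")

# Hypothetical helper: return the packages that cannot be found in the
# current library. Note that requireNamespace() loads the namespace of
# any package it does find.
missing_packages <- function(pkgs) {
  found <- vapply(pkgs, requireNamespace, logical(1),
                  quietly = TRUE, USE.NAMES = FALSE)
  pkgs[!found]
}

to_install <- missing_packages(bioc_pkgs)

# Assumption: only install from an interactive session (e.g. RStudio),
# so that the (a/s/n) and (y/n) prompts can be answered.
if (interactive() && length(to_install) > 0) {
  BiocManager::install(to_install)
}
```

Re-running the snippet after installation should leave `to_install` empty if everything installed successfully.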

(3) Finally, please check that all the packages were installed successfully by loading them one at a time using the library() function.

library(tidyverse)
library(ChIPQC)
library(ChIPseeker)
library(DiffBind)
library(clusterProfiler)
library(AnnotationDbi)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
library(EnsDb.Hsapiens.v75)
library(org.Hs.eg.db)
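If loading the packages one at a time is tedious, the check in step (3) can be scripted. The following is a minimal sketch, assuming you only want a TRUE/FALSE report per package; `check_load()` is a hypothetical helper name, not a function from any of the packages above.

```r
# Hypothetical helper: try to attach a package and report success as TRUE/FALSE
check_load <- function(pkg) {
  tryCatch({
    suppressPackageStartupMessages(library(pkg, character.only = TRUE))
    TRUE
  }, error = function(e) FALSE)
}

# The packages from steps (1) and (2)
required_pkgs <- c("tidyverse", "ChIPQC", "ChIPseeker", "DiffBind",
                   "clusterProfiler", "AnnotationDbi",
                   "TxDb.Hsapiens.UCSC.hg19.knownGene",
                   "EnsDb.Hsapiens.v75", "org.Hs.eg.db")

# Named logical vector: TRUE where the package loaded, FALSE otherwise
load_status <- vapply(required_pkgs, check_load, logical(1))

failed <- names(load_status)[!load_status]
if (length(failed) == 0) {
  message("All packages loaded successfully.")
} else {
  message("Could not load: ", paste(failed, collapse = ", "))
}
```

Any package reported as failing can then be reinstalled individually and loaded with library() to see the full error message.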

(4) Once all packages have been loaded, run sessionInfo().

sessionInfo()

These materials have been developed by members of the teaching team at the Harvard Chan Bioinformatics Core (HBC). These are open access materials distributed under the terms of the Creative Commons Attribution license (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

intro-to-chipseq's People

Contributors

cjfields, hackdna, marypiper, mistrm82, rkhetani

intro-to-chipseq's Issues

update unix lessons

  • create a new directory on O2 where raw_fastq contains the chipseq files
  • change mentions of rna-seq
  • specifically modify the data organization markdown (merge with Intro to Unix)

Adding details on alignment

  • Lecture on alignment like we do in RNA-seq? Yes. - A shortened version (i.e. remove the suffix array stuff).
  • Add in BWA and ask Rory about including some benchmark results.

Add in BWA

and ask Rory about including some benchmark results

adding details pulldown to selected lessons

This is an HTML tag that would be useful for "try it on your own" sections, where the code is hidden initially but clicking on it will make it available:

```bash
$ idr --samples Pou5f1-rep1_sorted_peaks.narrowPeak Pou5f1-rep2_sorted_peaks.narrowPeak \
  --input-file-type narrowPeak \
  --rank p.value \
  --output-file Pou5f1-idr \
  --plot \
  --log-output-file pou5f1.idr.log
```

csaw lesson?

We've been using csaw for histone modification studies as an alternative to peak-calling or focusing on specific features like TSS regions. It might be useful as a lesson.

add in greylisting info

From Rory:
"greylist regions are regions where the input exceeds a threshold, where peak-callers sometimes call spurious peaks. threshold is calculated by calculating depth over the input, sampling it repeatedly and estimating negative binomial parameters and then taking the threshold as the .99 quantile of the NB"

https://github.com/roryk/chipseq-greylist i just copied what the chipseqgreylist R package does
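To make the description above concrete, here is a rough base-R sketch of the thresholding idea on simulated data. It is not the implementation used by chipseq-greylist or by the R package it mirrors. The function name `greylist_threshold()`, the method-of-moments fit, the number of resamples, and the averaging of per-sample thresholds are all assumptions made for illustration.

```r
# Hypothetical helper: estimate a depth threshold from per-bin input counts.
# Assumptions (not from the source): bins are already counted, the negative
# binomial (NB) is fitted by method of moments, and the thresholds from the
# repeated samples are averaged.
greylist_threshold <- function(bin_counts, n_reps = 100,
                               sample_size = 1000, prob = 0.99) {
  thresholds <- replicate(n_reps, {
    s <- sample(bin_counts, size = min(sample_size, length(bin_counts)))
    mu <- mean(s)
    v <- var(s)
    if (v > mu) {
      # Overdispersed counts: NB with variance = mu + mu^2 / size
      qnbinom(prob, size = mu^2 / (v - mu), mu = mu)
    } else {
      # No overdispersion detected: fall back to a Poisson quantile
      qpois(prob, lambda = mu)
    }
  })
  mean(thresholds)
}

set.seed(1)
# Simulated input depth per genomic bin (stand-in for real input counts)
input_counts <- rnbinom(50000, size = 5, mu = 20)

threshold <- greylist_threshold(input_counts)
greylisted <- input_counts > threshold  # bins that would be greylisted
```

With real data, `input_counts` would be replaced by read counts from the input sample over fixed-width genomic bins, and the flagged bins would be merged into regions to exclude before peak calling.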

Samtools and sambamba in QC lesson

  • Samtools is being used in QC lesson, needs more detail to introduce it.
  • Add a note on similarities between samtools and sambamba. Only using sambamba where necessary.
