
rheum-plier-data

A data repository for rheumatic disease gene expression data

This repository contains data and processing code for use in a project examining gene expression patterns in autoimmune/rheumatic diseases.

Datasets

A note on processing

Where applicable, the processing code for each dataset (or compendium, in the case of sle-wb) is contained within that dataset's subdirectory. For more information on our data processing strategy, see sle-wb/README.md.

Within this repository, we obtain recount2 data through the recount Bioconductor package, further process it, and apply PLIER.

The recount2 data and results are too large to be stored with Git LFS, so we have placed them on figshare. DOI: 10.6084/m9.figshare.5716033.v4. This version is current as of 978c379.

Citations:

Collado-Torres L, Nellore A, Kammers K, et al. Reproducible RNA-seq analysis using recount2. Nature Biotechnology, 2017. doi: 10.1038/nbt.3838.

Mao W, Chikina M. Pathway-Level Information ExtractoR (PLIER): a generative model for gene expression data. bioRxiv, 2017. doi: 10.1101/116061

Granulomatosis with polyangiitis

Two GPA (Wegener's) datasets are included in this repository:

  • NARES -- a dataset that consists of nasal brushings from patients with GPA with or without a history of nasal disease.
  • GSE18885 -- a blood (fractions) dataset; we use submitter-processed data from GEO.

Citations:

Grayson PC, Steiling K, Platt M, et al. Defining the Nasal Transcriptome in Granulomatosis with Polyangiitis. Arthritis & Rheumatology, 2015. doi: 10.1002/art.39185.

Cheadle C, Berger AE, Andrade F, et al. Transcription of PR3 and Related Myelopoiesis Genes in Peripheral Blood Mononuclear Cells in Active Wegener’s Granulomatosis. Arthritis & Rheumatism, 2010. doi: 10.1002/art.27398.

Systemic lupus erythematosus whole blood

See sle-wb for more information (including citations).

Low density granulocytes

GSE26975 is a dataset that includes the following isolated cell type populations: healthy neutrophils, normal density neutrophils from patients with lupus, and low density granulocytes (LDGs) from patients with lupus.

Citation:

Villanueva E, Yalavarthi S, Berthier CC, Hodgin JB, et al. Netting neutrophils induce endothelial damage, infiltrate tissues, and expose immunostimulatory molecules in systemic lupus erythematosus. J Immunol. 2011. doi: 10.4049/jimmunol.1100450.

Diffuse intrinsic pontine glioma (DIPG)

Two datasets:

Citations:

Paugh BS, Broniscer A, Qu C, et al. Genome-wide analyses identify recurrent amplifications of receptor tyrosine kinases and cell-cycle regulatory genes in diffuse intrinsic pontine glioma. J Clin Oncol. 2011;29(30):3999-4006.

Buczkowicz P, Hoeman C, Rakopoulos P, et al. Genomic analysis of diffuse intrinsic pontine gliomas identifies three molecular subgroups and recurrent activating ACVR1 mutations. Nat Genet. 2014;46(5):451-6.

Medulloblastoma

GSE37382 is medulloblastoma data that was processed via refine.bio (using SCANfast).

Citation:

Northcott PA, Shih DJ, Peacock J, et al. Subgroup-specific structural variation across 1,000 medulloblastoma genomes. Nature. 2012;488(7409):49-56.

Docker

All the dependencies for this processing pipeline are included in Docker images. These can be obtained by installing Docker and pulling the appropriately tagged images from Docker Hub:

Microarray data processing

The Docker image used for microarray data processing is tagged v1.

docker pull jtaroni/multi-plier:v1

For the Dockerfile and a list of user-installed R packages, see docker/v1.

The R scripts in isolated-cell-pop, NARES, and the sle-wb pipeline were run in the jtaroni/multi-plier:v1 container as of 28a1249.
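As a rough sketch of how a script might be run inside this image, the helper below assembles a docker run call. The mount point /work and the script path NARES/example_script.R are hypothetical placeholders, not paths confirmed by this repository, and the sketch assumes Rscript is on the image's PATH. It defaults to a dry run that only prints the command.

```shell
# Image tag taken from this README.
IMAGE="jtaroni/multi-plier:v1"

# Default to a dry run (print only). Invoke with DRY_RUN= (empty) to execute.
DRY_RUN="${DRY_RUN-1}"

# Run an R script in the container with the current directory mounted.
# /work is an assumed mount point, not one defined by this repository.
run_in_container() {
  # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is non-empty.
  ${DRY_RUN:+echo} docker run --rm -v "$PWD":/work -w /work "$IMAGE" Rscript "$1"
}

# Hypothetical script path, for illustration only.
run_in_container "NARES/example_script.R"
```

Mounting the repository root keeps relative paths inside the scripts unchanged, which is one reasonable choice among several.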

recount2 data processing

The Docker image used for recount2 data processing is tagged recount.

docker pull jtaroni/multi-plier:recount

For the Dockerfile and a list of user-installed R packages, see docker/recount.

The R scripts in recount2/ were run in the jtaroni/multi-plier:recount container as of 978c379.

RNA-seq

We use Salmon and tximport for our RNA-seq processing pipeline.

Salmon

We use the following Docker image both to build the Salmon index and to run quantification with Salmon:

docker pull combinelab/salmon:0.9.1
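A minimal sketch of the two Salmon steps run through that image follows. The file names (transcripts.fa, sample_1.fastq.gz, and so on), the /data mount point, and the paired-end layout are assumptions for illustration; the sketch also assumes the salmon binary is on the image's PATH. The exact parameters used in this project are not stated here, so treat the flags as common defaults rather than the project's settings. It defaults to a dry run that only prints the commands.

```shell
# Image tag taken from this README.
SALMON_IMAGE="combinelab/salmon:0.9.1"

# Default to a dry run (print only). Invoke with DRY_RUN= (empty) to execute.
DRY_RUN="${DRY_RUN-1}"

# Run a salmon subcommand in the container with the current directory mounted.
# /data is an assumed mount point.
salmon_in_docker() {
  ${DRY_RUN:+echo} docker run --rm -v "$PWD":/data -w /data "$SALMON_IMAGE" salmon "$@"
}

# Step 1: build an index from a transcriptome FASTA (hypothetical file name).
salmon_in_docker index -t transcripts.fa -i salmon_index

# Step 2: quantify one paired-end sample; -l A asks Salmon to infer the
# library type automatically (hypothetical input and output names).
salmon_in_docker quant -i salmon_index -l A \
  -1 sample_1.fastq.gz -2 sample_2.fastq.gz -o quants/sample
```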

tximport

Following quantification with Salmon, we summarize to the gene level using tximport in the following Docker image (docker/summarize_tx/Dockerfile):

docker pull jtaroni/summarize_tx:3.4.3

License

This repository is dual licensed as BSD 3-Clause (source code) and CC0 1.0 (figures, documentation, and our arrangement of the facts contained in the underlying data).
