Ribosomal dysregulation: A conserved pathophysiological mechanism in human depression and mouse chronic stress.
This repository contains the code utilized for the paper:
Title: Ribosomal dysregulation: A conserved pathophysiological mechanism in human depression and mouse chronic stress
Authors: Xiaolu Zhang, Mahmoud A Eladawi, William George Ryan, Xiaoming Fan, Stephen Prevoznik, Trupti Devale, Barkha Ramnani, Malathi A Krishnamurthy, Etienne Sibille, Robert McCullumsmith, Toshifumi Tomoda, Rammohan Shukla
DOI: TBD
The file Alignment/Hisat_alignment_code.sh contains the Linux shell scripts used to align the FASTQ files of the RNA-Seq data.
The following Linux modules need to be installed to run this script:
It is highly recommended to use a multiprocessing environment for parallel processing. All our alignment processing was performed at the Ohio Supercomputer Center (OSC) with two nodes and 6 CPUs per task.
- Build the HISAT2 index using the reference genome.
- Change idx_dir to the path to your reference genome index directory.
- Run the hisat2 command with the --avoid-pseudogene option to avoid aligning reads to pseudogenes.
- Build the HISAT2 index using the composite genome.
- Change idx_dir to the path to your composite genome index directory.
- OPTIONAL: run the hisat2 command with the --avoid-pseudogene option to avoid aligning reads to pseudogenes.
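The steps above can be sketched as a small shell helper. This is an illustration under stated assumptions, not the contents of Alignment/Hisat_alignment_code.sh: only idx_dir comes from this README, while genome_fa, threads, avoid_pseudogene, the index basename "genome", and the paired-end layout (-1/-2) are placeholders chosen for the example. The functions only print the hisat2-build and hisat2 commands so they can be inspected before anything is run.

```shell
# Sketch only: names and paths are placeholders, not the repository's values.
set -eu

idx_dir="${idx_dir:-/path/to/index_dir}"       # change to your index directory
genome_fa="${genome_fa:-/path/to/genome.fa}"   # reference or composite genome FASTA (assumed name)
threads="${threads:-6}"                        # 6 CPUs per task, as mentioned above
avoid_pseudogene="${avoid_pseudogene:-yes}"    # set to "no" to drop the optional flag

# Print the index-build command: hisat2-build [options] <reference_in> <index_basename>
build_index_cmd() {
  printf 'hisat2-build -p %s %s %s/genome\n' "$threads" "$genome_fa" "$idx_dir"
}

# Print the alignment command for one paired-end sample (assumed layout).
# Arguments: <reads_1.fastq> <reads_2.fastq> <output.sam>
align_cmd() {
  flag=""
  if [ "$avoid_pseudogene" = "yes" ]; then
    flag=" --avoid-pseudogene"
  fi
  printf 'hisat2 -p %s%s -x %s/genome -1 %s -2 %s -S %s\n' \
    "$threads" "$flag" "$idx_dir" "$1" "$2" "$3"
}
```

For example, `align_cmd sample_1.fastq sample_2.fastq sample.sam` prints one hisat2 command line. Once it looks right for your data, it can be pasted into a job script for your cluster.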
The file R/ParentChild.R contains the R code used to:
- Get the frequencies of the ancestors of the GO terms of interest.
- Get the truth tables of the ancestors of interest.
The file R/seedGenes_analysis.R contains the R code utilized to generate the correlation analysis files.
Four input files are needed:
- seed_genes_up.csv, which contains a list of upregulated seed genes.
- seed_genes_down.csv, which contains a list of downregulated seed genes.
- seed_genes_non.csv, which contains a list of seed genes that are not significantly different.
- nData.csv, which contains the normalized gene expression data.
The following output files will be generated for each seed-gene category (up, down, and non):
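A quick pre-flight check for the four input files might look like the sketch below. The function name check_inputs and the idea of passing a directory are assumptions made for illustration; R/seedGenes_analysis.R may expect the files in a different location.

```shell
# Sketch only: verify that the four input files exist in a given directory.
# check_inputs is a hypothetical helper, not part of the repository.
check_inputs() {
  dir="$1"
  missing=0
  for f in seed_genes_up.csv seed_genes_down.csv seed_genes_non.csv nData.csv; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing input file: $dir/$f" >&2
      missing=1
    fi
  done
  # Returns 0 when all four files are present, 1 otherwise.
  return "$missing"
}
```

One way to use it, assuming the script is launched with Rscript from the repository root: `check_inputs . && Rscript R/seedGenes_analysis.R`.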
- r_sig contains the significant Pearson correlation values (p < 0.05).
- r_sig_neg contains only the negative significant Pearson correlation values (p < 0.05).
- r_sig_pos contains only the positive significant Pearson correlation values (p < 0.05).
- r_sig_neg_count contains the count of the genes that are (significantly) negatively correlated with each seed gene.
- r_sig_pos_count contains the count of the genes that are (significantly) positively correlated with each seed gene.