ricpipe's Introduction

RICpipe

Software required for RIC-seq analysis

Download and install STAR (v.020201) for short-read mapping from:

https://github.com/alexdobin/STAR

Download and install SAMtools (v.0.1.19) from:

https://github.com/samtools/samtools

Download and install BEDtools (v2.28.0) from:

https://github.com/arq5x/bedtools2

Download and install FastQC for quality control of sequencing reads from:

https://www.bioinformatics.babraham.ac.uk/projects/fastqc

Download and install Trimmomatic (v.0.36) for adapter trimming from:

http://www.usadellab.org/cms/?page=trimmomatic

Download and install cutadapt (v.1.15) for cropping low-complexity fragments from:

https://cutadapt.readthedocs.io/en/stable

To extract pairwise splice sites from GTF files:

    # step 1
    perl gtf_to_bed.pl gencode.v19.annotation.gtf > gencode.v19.annotation.bed
    # step 2
    perl creat_junction_bed.pl gencode.v19.annotation.bed > gencode.v19.all_exon_junction.bed

This is an example of obtaining pairwise splice sites from the GENCODE v19 annotation file. Both Perl scripts are also uploaded to the scripts folder.

Important:

Add these programs to the PATH environment variable.
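
A quick way to confirm that the tools are reachable is to ask the shell which of them it cannot find. The sketch below is only an illustration: the command names (STAR, samtools, bedtools, fastqc, cutadapt) are assumptions about how the programs are installed, the directory in the commented export line is hypothetical, and Trimmomatic is left out because it is usually run as a Java jar rather than as a command on PATH.

```shell
#!/bin/sh
# Print the name of every given command that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Hypothetical install location; replace with the real directory:
# export PATH="$HOME/tools/STAR/bin:$PATH"

# Example call (command names are assumptions, adjust as needed):
# missing_tools STAR samtools bedtools fastqc cutadapt
```

If the function prints nothing, every listed command was found.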

ricpipe's Issues

Speed of STAR aligner is extremely low

Hi cao,

Thanks for your great work! I have a question: when I map the reads to rRNA using STAR (2.5.3b) with the same configuration as yours, the speed is below 2 M/h, and the job has run for several days without finishing. I tried other machines, but it made no difference. The CPUs are fully loaded during the run. Is this normal for RIC-seq data with your configuration, or should I recheck the rRNA reference, the STAR version, or the machine?

Thanks!

Time             Speed  Read     Read    Mapped  Mapped  Mapped  Mapped  Unmapped  Unmapped  Unmapped  Unmapped
                 M/hr   number   length  unique  length  MMrate  multi   multi+    MM        short     other
Dec 05 05:02:03  0.2    1049165  128     1.7%    134.7   1.3%    21.6%   0.0%      0.0%      58.7%     18.0%
Dec 05 05:04:00  0.4    2092755  128     1.7%    134.6   1.4%    21.2%   0.0%      0.0%      59.1%     18.0%
Dec 05 05:07:15  0.5    3135946  128     1.6%    134.4   1.4%    21.0%   0.0%      0.0%      59.3%     18.1%
Dec 05 05:09:53  0.7    4178318  128     1.6%    134.4   1.4%    20.8%   0.0%      0.0%      59.5%     18.1%
Dec 05 05:11:31  1.0    6260778  128     1.6%    134.2   1.4%    20.5%   0.0%      0.0%      59.8%     18.2%
Dec 05 05:12:41  1.2    7302778  128     1.6%    134.1   1.5%    20.4%   0.0%      0.0%      59.8%     18.2%
Dec 05 05:13:41  1.5    9378538  128     1.6%    134.2   1.5%    20.2%   0.0%      0.0%      60.0%     18.2%
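
When reading a progress log like the one above, it can help to compute the throughput between two rows instead of relying only on the Speed column, which looks consistent with a running average since the job started (an assumption, not something the log itself states). The sketch below does that arithmetic with awk. It assumes the timestamp is the third whitespace-separated field, the read count is the fifth, and all rows fall on the same day; the function name is made up for illustration.

```shell
# Print the throughput, in million reads per hour, between the first and
# last data rows of a STAR progress log read from stdin.
# Assumed row layout: "Dec 05 05:02:03  0.2  1049165 ..." (same day only).
interval_speed() {
    awk '
        # Only rows whose third field looks like HH:MM:SS are data rows.
        $3 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ {
            split($3, t, ":")
            secs = t[1] * 3600 + t[2] * 60 + t[3]
            if (!seen) { first_secs = secs; first_reads = $5; seen = 1 }
            last_secs = secs; last_reads = $5
        }
        END {
            if (seen && last_secs > first_secs)
                printf "%.2f\n", (last_reads - first_reads) / (last_secs - first_secs) * 3600 / 1e6
        }
    '
}

# Example call (the file name is whatever your run produced):
# interval_speed < Log.progress.out
```

For the first and last rows shown above (8,329,373 reads in 698 seconds), this gives about 43 M/hr, so the low cumulative figure may mostly reflect time spent before steady mapping began rather than the mapping rate itself.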

How to combine the alignment result of BWA and STAR?

Hi,
Thanks for your work! I have a question. In your article, you wrote that all the mapped reads from the STAR and BWA alignments were combined for the subsequent chimeric read analysis using the RICpipe software. You filtered out the secondary alignments, but I am not sure how to handle the supplementary alignments in the BWA output. Did you use the -M option to flag shorter split hits as secondary?

Thank you.
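
For context on the question above: in the SAM format, secondary alignments carry flag bit 0x100 (256) and supplementary alignments carry 0x800 (2048), and `bwa mem -M` marks shorter split hits as secondary instead of supplementary. The sketch below only counts how many records of each kind a SAM stream contains. It is a diagnostic aid and not necessarily how the RICpipe authors handled these records. The function name is made up for illustration.

```shell
# Count primary, secondary, and supplementary records in SAM text on stdin.
# Flag bits per the SAM specification: 256 = secondary, 2048 = supplementary.
# Integer arithmetic replaces bitwise operators so any POSIX awk works.
count_alignment_types() {
    awk -F '\t' '
        /^@/ { next }                       # skip header lines
        {
            flag = $2
            if (int(flag / 2048) % 2)       supplementary++
            else if (int(flag / 256) % 2)   secondary++
            else                            primary++
        }
        END {
            printf "primary=%d secondary=%d supplementary=%d\n", primary, secondary, supplementary
        }
    '
}

# Example call (requires SAM text, e.g. from "samtools view -h"):
# samtools view -h aln.bam | count_alignment_types
```

Comparing the counts from runs with and without -M shows directly which records change category.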

issues when running the step5.screen_high-confidence_intermolecular

Hello,
Thanks for the code. When I run the recalibrate_pvalue part, I get an error:

Illegal division by zero at /public/data1/zhoumy/RIC/code/step5.screen_high-confidence_intermolecular/scripts/5.recalibrate_pvalue/2.multiple_testing_correction.pl line 244, line 1281814.

The data were processed as described in your article. Can I delete this line? If not, how should I handle this situation? Should I run the simulation again?

Thank you.

How to prepare the gene annotation files

Hi,
Thank you for providing this nice pipeline for RIC data analysis.
I am having trouble preparing the gene annotation files. I use the hg38 reference. How can I get the following files?

hg19.integrated.NCExon.bed.gz
hg19.integrated.NCIntron.bed.gz
hg19.integrated.PC3UTR.bed.gz
hg19.integrated.PC5UTR.bed.gz
hg19.integrated.PCCDS.bed.gz
hg19.integrated.PCIntron.bed.gz

Also, what does PC mean?
Thank you!

issues while calling intermolecular interaction

Hi,
Thank you for providing this nice pipeline for RIC data analysis.
I ran into an issue when calling intermolecular interactions, at this step:
sh 3.calculate_pvalue/2.base_on_random/run.sh

The error messages are listed below:

Use of uninitialized value in addition (+) at /home/software/RICpipe-master/step5.screen_high-confidence_intermolecular/scripts/5.recalibrate_pvalue/merge_pvalue.pl line 40, line 24.
Illegal division by zero at /home/software/RICpipe-master/step5.screen_high-confidence_intermolecular/scripts/5.recalibrate_pvalue/merge_pvalue.pl line 45, line 24.
1 significant interactions
129079 other types interactions

Could you try to help? Thank you so much.

Y

count_link_for_each_kind.pl cannot get gapped reads number

Hi, Cao:

Thanks for your great work! I ran into a problem when running the script step1.collect_pair_tags/scripts/count_link_for_each_kind.pl: I cannot get the gapped read number. After checking the script, I found that you define a subfunction to calculate it:

    sub part_count{
        my $file=shift;
        my $part_num;
        open(IN,$file) || die;
        while(my $line=<IN>){
            chomp $line;
            my @sub=split/\s+/,$line;
            if($sub[0] =~ /Part_from_Align_Read/){
                $part_num+=$sub[1];
            }
        }
        close IN;
        return $part_num;
    }

This function aims to find read names starting with "Part_from_Align_Read" and sum the second column. But I have no reads whose names start with "Part_from_Align_Read". The STAR version is 2.7.10b. Only "Head" and "Tail" appear at the beginning of the read names. I also don't understand the purpose of this subfunction: can it calculate the number of gapped reads?
Looking forward to your reply.
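
To check what such a file actually contains, the logic described above can be reproduced outside the pipeline. The sketch below is an awk approximation, not the pipeline's own code: `sum_by_label` sums the second column over lines whose first field contains a label (the quoted Perl regex is unanchored, so it matches anywhere in the field), and `list_labels` tallies the leading label of each first field. Both function names are made up for illustration, and splitting on "_" is an assumption about how the read names are built.

```shell
# Sum column 2 over lines whose first field contains the given label
# (default: Part_from_Align_Read). Prints 0 when nothing matches.
sum_by_label() {
    file=$1
    label=${2:-Part_from_Align_Read}
    awk -v label="$label" 'index($1, label) { total += $2 } END { print total + 0 }' "$file"
}

# Tally the text before the first "_" in column 1, to see which
# labels occur in the file. Output order is not defined; pipe to sort.
list_labels() {
    awk '{ split($1, parts, "_"); seen[parts[1]]++ } END { for (k in seen) print k, seen[k] }' "$1"
}

# Example calls (the input file name is hypothetical):
# sum_by_label pair_tags.txt
# list_labels pair_tags.txt | sort
```

If `list_labels` reports only Head and Tail, the sum for Part_from_Align_Read will be 0, which matches the behaviour described in the post.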
