egaffo / circompara2

Improved bioinformatic pipeline to identify and quantify circRNA expression from RNA-seq data by combining multiple circRNA detection methods

License: Other

Python 69.35% R 29.85% Awk 0.24% Shell 0.56%

circompara2's Issues

scons: *** [samples/S4/processings/circRNAs/tophat_out/accepted_hits.bam] Error 1

Hello, I have the following error:
scons: *** [samples/S4/processings/circRNAs/tophat_out/accepted_hits.bam] Error 1
which appears during the tophat process.

Indeed I have the following:

[2023-10-11 15:12:46] Reporting output tracks
[FAILED]
Error running path/softwares/circompara2/tools/tophat-2.1.0.Linux_x86_64/tophat_reports --min-anchor 8 --splice-mismatches 0 --min-report-intron 50 --max-report-intron 500000 --min-isoform-fraction 0.15 --output-dir samples/S4/processings/circRNAs/tophat_out/ --max-multihits 20 --max-seg-multihits 40 --segment-length 25 --segment-mismatches 2 --min-closure-exon 100 --min-closure-intron 50 --max-closure-intron 5000 --min-coverage-intron 50 --max-coverage-intron 20000 --min-segment-intron 50 --max-segment-intron 500000 --read-mismatches 2 --read-gap-length 2 --read-edit-dist 2 --read-realign-edit-dist 3 --max-insertion-length 3 --max-deletion-length 3 --bowtie1 --fusion-search --fusion-anchor-length 20 --fusion-min-dist 10000000 --fusion-read-mismatches 2 --fusion-multireads 2 --fusion-multipairs 2 -z gzip -p4 --gtf-annotations path/softwares/circompara2/test_circompara/annotation/CFLAR_HIPK3.gtf --gtf-juncs samples/S4/processings/circRNAs/tophat_out/tmp/CFLAR_HIPK3.juncs --no-closure-search --no-coverage-search --no-microexon-search --sam-header samples/S4/processings/circRNAs/tophat_out/tmp/CFLAR_HIPK3_genome.bwt.samheader.sam --report-discordant-pair-alignments --report-mixed-alignments --samtools=pathsoftwares/circompara2/tools/tophat-2.1.0.Linux_x86_64/samtools_0.1.18 --bowtie2-max-penalty 6 --bowtie2-min-penalty 2 --bowtie2-penalty-for-N 1 --bowtie2-read-gap-open 5 --bowtie2-read-gap-cont 3 --bowtie2-ref-gap-open 5 --bowtie2-ref-gap-cont 3 path/softwares/circompara2/test_circompara/analysis_fl/dbs/indexes/indexes/bowtie/CFLAR_HIPK3.fa samples/S4/processings/circRNAs/tophat_out/junctions.bed samples/S4/processings/circRNAs/tophat_out/insertions.bed samples/S4/processings/circRNAs/tophat_out/deletions.bed samples/S4/processings/circRNAs/tophat_out/fusions.out samples/S4/processings/circRNAs/tophat_out/tmp/accepted_hits samples/S4/processings/circRNAs/tophat_out/tmp/left_kept_reads.bam
Loaded 74 GFF junctions from samples/S4/processings/circRNAs/tophat_out/tmp/CFLAR_HIPK3.juncs.

It is hard to understand as the logs do not provide much more info. I am trying to run circompara2 on my data (it was working with the test data you provide).
Any idea on what could be the problem?

Thanks a bunch!

Error in strsplit(grep("Sequence length", x = fastqc_data.txt, value = T), : subscript out of bounds

Hi egaffo,
I'm trying to run circompara on my samples, but I got the following error:

echo "No reads in /home/aaa/Desktop/circompara2/test_circompara/reads/SRR12383421_2.fastq.gz" > samples/SRR12383421/read_statistics/fastqc_stats/SRR12383421_2_fastqc.html && echo "No reads in /home/aaa/Desktop/circompara2/test_circompara/reads/SRR12383421_2.fastq.gz" > samples/SRR12383421/read_statistics/fastqc_stats/SRR12383421_2_fastqc/fastqc_data.txt && fastqc /home/aaa/Desktop/circompara2/test_circompara/reads/SRR12383421_2.fastq.gz -o samples/SRR12383421/read_statistics/fastqc_stats --extract > samples/SRR12383421/read_statistics/fastqc_stats/SRR12383421_2.fastq_fastqc.log 2> samples/SRR12383421/read_statistics/fastqc_stats/SRR12383421_2.fastq_fastqc.err
get_stringtie_rawcounts.R -g samples/SRR12383421/processings/stringtie/SRR12383421_transcripts.gtf -f /home/aaa/Desktop/circompara2/test_circompara/analysis/samples/SRR12383421/read_statistics/fastqc_stats/SRR12383421_1_fastqc/fastqc_data.txt,/home/aaa/Desktop/circompara2/test_circompara/analysis/samples/SRR12383421/read_statistics/fastqc_stats/SRR12383421_2_fastqc/fastqc_data.txt -o samples/SRR12383421/processings/stringtie/SRR12383421_
Error in strsplit(grep("Sequence length", x = fastqc_data.txt, value = T), :
subscript out of bounds
Calls: mean -> sapply -> lapply -> FUN -> mean -> strsplit
Execution halted
scons: *** [samples/SRR12383421/processings/stringtie/SRR12383421_gene_expression_rawcounts.csv] Error 1
scons: building terminated because of errors.

What could I do to solve it?

Best
Yaming
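
The command in the log above writes a "No reads in ..." placeholder into fastqc_data.txt before FastQC runs. This suggests, without confirming it, that when FastQC does not overwrite the file there is no "Sequence length" line for get_stringtie_rawcounts.R to parse. A minimal Python sketch for checking the file before rerunning; the helper name `sequence_length_field` is hypothetical and not part of circompara2:

```python
from pathlib import Path
from typing import Optional


def sequence_length_field(fastqc_data: str) -> Optional[str]:
    """Return the value of the 'Sequence length' line, or None if it is absent."""
    for line in Path(fastqc_data).read_text().splitlines():
        if line.startswith("Sequence length"):
            # FastQC separates the field name from its value with a tab.
            parts = line.split("\t")
            return parts[1] if len(parts) > 1 else None
    return None
```

A result of None for the `_2_fastqc/fastqc_data.txt` file would point at the second reads file (or the FastQC run on it) rather than at StringTie.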

resume command ???

Is there any option to resume the circompara2 command from the last completed step?

I am having trouble generating the back_spliced_junction.bed file during the CIRCexplorer2_tophat fusion step while using the circompara2 docker.

@egaffo Hello, I am having trouble generating the back_spliced_junction.bed file during the CIRCexplorer2_tophat fusion step while using the circompara2 Docker image. I have 12 samples, and 11 of them run smoothly; only the sample named H7 fails.

This is a sample that runs normally.

This is the sample H7

I don’t know why only the H7 sample would produce an error.

There seems to be plenty of memory space left.

I split the H7 sample into two parts.

Interestingly, H701 can run normally while H702 encountered an error.

After converting the BAM file of H7 into FASTQ, the entire process completes successfully. But this step loses some information.

I hope to receive your help, thank you!

Update mappers

Some of the software dependencies in the Docker container are several versions behind the latest release.
Would it be possible to update the Dockerfile to work with these updated versions?

At least for some versions of the mappers this would be convenient, so as not to have to rerun the indexing script.

Software	Website	used_version	latest_release
Scons	http://www.scons.org	3.1.2	4.5.2
Trimmomatic	http://www.usadellab.org/cms/?page=trimmomatic	0.39	0.39
FASTQC	http://www.bioinformatics.babraham.ac.uk/projects/fastqc/	0.11.9	0.12.1
HISAT2	http://ccb.jhu.edu/software/hisat2/index.shtml	2.1.0	2.2.1
STAR	http://github.com/alexdobin/STAR	2.6.1e	2.7.11
BWA	http://bio-bwa.sourceforge.net/	0.7.15-r1140	0.7.17
Bowtie2	http://bowtie-bio.sourceforge.net/bowtie2/index.shtml	2.4.1	2.5.1
Bowtie	http://bowtie-bio.sourceforge.net/index.shtml	1.1.2	1.3.1
TopHat	http://ccb.jhu.edu/software/tophat/index.shtml	2.1.0	2.1.1
Segemehl	http://www.bioinf.uni-leipzig.de/Software/segemehl/	0.3.4	0.3.4
CIRI	http://ciri.sourceforge.io/	2.0.6	2.1.1
CIRCexplorer2	http://github.com/YangLab/CIRCexplorer2	2.3.8	2.3.8
find_circ	http://github.com/marvin-jens/find_circ	1.2	1.2
BEDtools	http://bedtools.readthedocs.io	2.29.2	2.31.0
Samtools	http://www.htslib.org/	1.10	1.18

Thanks in advance
Jasper

Error message in docker

Hi Egaffo,
since I had some trouble with the old version of samtools in combination with SUSE, I've switched to the Docker image you provide. This is the first Docker image I'm using. After starting your Docker image using docker run egaffo/circompara2:v0.1.2.1 I get the following message and the process is aborted:

IOError: [Errno 2] No such file or directory: '/data/meta.csv':
  File "/circompara2/src/sconstructs/main.py", line 447:
    with open(env['META']) as csvfile:

Do you know how to continue? Thanks!
Bernd
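
The traceback shows the container looking for /data/meta.csv, and the run commands quoted in other issues on this page mount the working directory with -v $(pwd):/data. A minimal pre-flight sketch in Python, assuming the mounted project directory is expected to hold meta.csv and vars.py; the helper name is hypothetical:

```python
import os
from typing import List, Sequence


def missing_project_files(project_dir: str,
                          required: Sequence[str] = ("meta.csv", "vars.py")) -> List[str]:
    """List the required files that are absent from project_dir."""
    return [name for name in required
            if not os.path.isfile(os.path.join(project_dir, name))]
```

Calling `missing_project_files(os.getcwd())` from the directory that will be mounted returns an empty list when both files are present.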

Running out of RAM when using STAR to generate genome in circompara2 pipeline

Hi there,
I am trying to apply circompara2 to detect circRNA in human RNAseq dataset, but now I ran into a problem as follows:

The step terminated:
cd dbs/indexes/indexes/star/ref-transcripts && STAR --runMode genomeGenerate --runThreadN 1 --genomeFastaFiles /annotation/ref-transcripts.fa --genomeDir . && cd /home
Feb 15 01:28:06 ..... started STAR run
Feb 15 01:28:06 ... starting to generate Genome files
scons: building terminated because of errors.

The error information:
EXITING because of FATAL PARAMETER ERROR: limitGenomeGenerateRAM=31000000000is too small for your genome
SOLUTION: please specify --limitGenomeGenerateRAM not less than 144424593450 and make that much RAM available

Feb 15 01:29:04 ...... FATAL ERROR, exiting
scons: *** [dbs/indexes/indexes/star/ref-transcripts/chrLength.txt] Error 104

I guess this is because STAR needs a lot of RAM when generating the genome files, so I changed var.py to give STAR a larger RAM limit. I still get the same error (the same message saying that limitGenomeGenerateRAM=31000000000is too small for your genome), so the STAR setting I added to var.py does not seem to take effect:
META = 'meta.csv'
GENOME_FASTA = '../annotation/ref-transcripts.fa'
ANNOTATION = '../annotation/ref-transcripts.gtf'
CPUS = '1'
STAR_PARAMS = ['--limitGenomeGenerateRAM', '160424593450']

Could you please help me with this error? Thanks a lot!
Best,
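
STAR reports the requirement in bytes, which makes the size easy to misjudge: 144424593450 bytes is about 134.5 GiB, against the 31000000000 bytes (about 29 GiB) currently in effect. A small Python sketch that extracts the figure from a STAR log and converts it; the function names are hypothetical, and the regular expression assumes the message wording shown above:

```python
import re
from typing import Optional


def required_star_ram_bytes(log_text: str) -> Optional[int]:
    """Extract the byte count from STAR's 'not less than N' suggestion, if present."""
    match = re.search(r"--limitGenomeGenerateRAM not less than (\d+)", log_text)
    return int(match.group(1)) if match else None


def bytes_to_gib(n_bytes: int) -> float:
    """Convert bytes to gibibytes (1 GiB = 2**30 bytes)."""
    return n_bytes / 2 ** 30
```

Raising the limit only helps if the machine (or the container) actually has that much memory available.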

Awesome tool!

I like it

Quick installation fails

Hi @egaffo,
the installation file (install_circompara) is missing from the location given in the quick installation instructions. I found it in the bash folder, but it fails when run.
Thanks,
Santiago

error libncurses.so.5 related to segemehl.x

Hello,
When trying to run the analysis with PE reads, it won't work because of segemehl.x which seems to rely on libncurses.so.5.
I tried to remove segemehl.x from vars.py and main.py to try one run without that mapper, but I must be missing something, because the pipeline still needs it and throws the same error.

But my main problem is that libncurses.so.5 is an old library, is there a way to make it work with ncurses version 6.4 or to force the usage of libncurses.so.6?

Thank you!

scons: *** [samples/sample_A/processings/circRNAs/dcc/CircRNACount] Error 1

When I run the test, whether from test_circompara/analysis or from test_circompara/analysis_se (running ../../circompara2 in each), it reports this error.
I installed circompara2 in a conda python=2.7 environment, as Alipe2021 did in #3. I am new to this, so I don't know whether it is installed properly or whether this is a normal result.
Can anyone give me a hand? Thanks!

samtools view -F 4 samples/sample_A/processings/circRNAs/star_out/Aligned.sortedByCoord.out.bam | cut -f 1 | sort | uniq |
wc -l > samples/sample_A/processings/circRNAs/star_out/STAR_mapped_reads_count.txt
DCC -fg -M -F -Nr 1 1 -N -T 4 -D -O samples/sample_A/processings/circRNAs/dcc -t samples/sample_A/processings/circRNAs/dcc/_tmp_DCC samples/sample_A/processings/circRNAs/star_out/Chimeric.out.junction
Output folder samples/sample_A/processings/circRNAs/dcc already exists, reusing
DCC 0.4.8 started
6 CPU cores available, using 4
WARNING: File samples/sample_A/processings/circRNAs/star_out/Chimeric.out.junction is empty!
Junction files seem empty, skipping circRNA detection module.
circRNA detection skipped due to empty junction files
Filter mode for detected circRNAs enabled without detection module.
Combine with -f or -D.
scons: *** [samples/sample_A/processings/circRNAs/dcc/CircRNACount] Error 1
scons: building terminated because of errors.
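
According to the log, DCC stops because Chimeric.out.junction is empty, so the thing to inspect is the STAR chimeric output rather than DCC itself. A minimal Python sketch for checking the junction files of several samples in one pass; the helper name is hypothetical:

```python
import os


def is_empty_junction_file(path: str) -> bool:
    """True if the file is missing, zero-length, or contains only blank lines."""
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return True
    with open(path) as handle:
        return not any(line.strip() for line in handle)
```

For the run above, the path to test would be samples/sample_A/processings/circRNAs/star_out/Chimeric.out.junction.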

TypeError: cannot concatenate 'str' and 'int' objects:

Hi egaffo, thanks for the prompt reply. I tried the new docker but I keep having a similar error:

user@NGS:~/CirComPara$ sudo docker run -u `id -u` --rm -it -v $(pwd):/data egaffo/circompara2:v0.1.2.1
scons: Reading SConscript files ...
TypeError: cannot concatenate 'str' and 'int' objects:
File "/circompara2/src/sconstructs/main.py", line 489:
exports = '''env_check_indexes''')
File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 660:
return method(*args, **kw)
File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 597:
return _SConscript(self.fs, *files, **subst_kw)
File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 286:
exec(compile(scriptdata, scriptname, 'exec'), call_stack[-1].globals)
File "/circompara2/src/sconstructs/check_indexes.py", line 118:
exports = '''env_build_indexes ''')
File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 660:
return method(*args, **kw)
File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 597:
return _SConscript(self.fs, *files, **subst_kw)
File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 286:
exec(compile(scriptdata, scriptname, 'exec'), call_stack[-1].globals)
File "/circompara2/src/sconstructs/build_indexes.py", line 67:
exports = '''env_index_hisat2 ''')
File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 660:
return method(*args, **kw)
File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 597:
return _SConscript(self.fs, *files, **subst_kw)
File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 286:
exec(compile(scriptdata, scriptname, 'exec'), call_stack[-1].globals)
File "/circompara2/src/sconstructs/index_hisat2.py", line 44:
''' ${TARGETS[0].dir}''' + os.path.sep + target_basename + ''' ''' + EXTRA_PARAMS
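
The traceback ends on a line that joins strings with `+`, and Python 2 raises this TypeError when one operand is an integer. One possible cause, stated here as an assumption and not a confirmed diagnosis, is a numeric setting written without quotes in vars.py; the vars.py files quoted elsewhere on this page use strings such as CPUS = '4'. A Python 3 sketch that lists such assignments (the function name is hypothetical):

```python
import ast
from typing import List, Tuple


def unquoted_numbers(vars_py_source: str) -> List[Tuple[str, object]]:
    """Return (name, value) for top-level assignments whose value is a bare number."""
    findings = []
    for node in ast.parse(vars_py_source).body:
        if not isinstance(node, ast.Assign):
            continue
        value = node.value
        if (isinstance(value, ast.Constant)
                and isinstance(value.value, (int, float))
                and not isinstance(value.value, bool)):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    findings.append((target.id, value.value))
    return findings
```

It can be run on the text of vars.py, for example `unquoted_numbers(open("vars.py").read())`; any name it reports is a candidate for quoting.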

collect_circrnas.py script problem？

I noticed that the command for extracting introns in this script does not seem to generate the intron GTF file correctly.
As shown in the figure below, the intron intervals extracted from the test data you provide are identical to the gene intervals.

scons: Reading SConscript files ... TypeError: Tried to lookup Dir 'dbs/indexes' as a File.:

Is circompara2 allowing for STAR-generated bam file as input?

Hi there,
I'd like to try circompara2 to detect circRNA reads against the human genome for my RNA-seq project. My project has over 1000 samples, and we have already aligned them with STAR for the mRNA analysis, which took a lot of time. Does circompara2 accept STAR-generated BAM files as input? Rerunning the genome alignment from the paired-end FASTQ files would cost a lot of extra time.
Thanks a lot and looking forward to your kind reply!
Best regards,
Weiqian Jiang

scons: building terminated because of errors.

Hi egaffo, I am having the following error message:

[M::mem_process_seqs] Processed 1568628 reads in 49.756 CPU sec, 12.022 real sec
[M::process] read 1568628 sequences (80000028 bp)...
[M::mem_process_seqs] Processed 1568628 reads in 48.726 CPU sec, 8.547 real sec
[M::process] read 1568628 sequences (80000028 bp)...
[M::mem_process_seqs] Processed 1568628 reads in 51.428 CPU sec, 13.630 real sec
[M::process] read 1568628 sequences (80000028 bp)...
[M::mem_process_seqs] Processed 1568628 reads in 50.021 CPU sec, 13.461 real sec
[M::process] read 1215784 sequences (62004984 bp)...
[M::mem_process_seqs] Processed 1215784 reads in 38.014 CPU sec, 7.662 real sec
[main] Version: 0.7.15-r1140
[main] CMD: bwa mem -t 8 -T 19 /data/dbs/indexes/indexes/bwa/Homo_sapiens.hg38.dna_sm.chromosome.Y /data/samples/S1/processings/hisat2_out/S1_unmapped.fastq.gz
[main] Real time: 188.473 sec; CPU: 654.355 sec
[SEGEMEHL] Tue Feb 7 10:37:11 2023: threaded matching w/ suffixarray has taken 1046.000000 seconds.
[SEGEMEHL] Tue Feb 7 10:37:11 2023: Mapping stats:
total mapped (%) unique (%) multi (%) split (%)
all 20039320 957 0.00% 923 0.00% 34 0.00% 861 0.00%
[SEGEMEHL] Tue Feb 7 10:37:11 2023:
Goodbye.
"Hab' ich gerade inner Bild gelesen!" (Bienchen)
scons: building terminated because of errors.

I am running the pipeline with only one chromosome. However, I have already run the pipeline successfully with other samples (same chromosome genome, same chromosome GTF, etc.). I have no idea what is going on; the error is very generic. Could you help me with this?

-ignoreGroupsWithoutExons & singularity

Hi egaffo,
I'm trying to run circompara on my samples, but I got the following error:

finished circRNA detection from file samples/S1/processings/circRNAs/star_out/Chimeric.out.junction
dcc_fix_strand.R -c samples/S1/processings/circRNAs/dcc/CircRNACount -d samples/S1/processings/circRNAs/dcc/CircCoordinates -o samples/S1/processings/circRNAs/dcc/strandedCircRNACount
dcc_compare.R -l S1 -i samples/S1/processings/circRNAs/dcc/strandedCircRNACount -o circular_expression/circrna_collection/merged_samples_circrnas/dcc_compared.csv
gtfToGenePred -infoOut=dbs/indexes/genePred.transcripts.info /home/miguilab/circular/BL/ncbi_dataset/data/GCA_927797965.1/genomic.gtf dbs/indexes/genomic.genePred
no exons defined for group , feature gene (perhaps try -ignoreGroupsWithoutExons)
scons: *** [dbs/indexes/genePred.transcripts.info] Error 255
scons: building terminated because of errors.

What could I do to solve it? I don't understand where I could use the option -ignoreGroupsWithoutExons.
Further, do you think there's any way to run circompara2 through singularity? I've successfully used docker before, but now I don't have sudo privileges on my machine anymore.

Best,

Migui
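
The gtfToGenePred message names an empty group for a gene feature, which suggests (as an assumption, not something confirmed in this thread) that some records in the GTF carry a missing or empty transcript_id. A Python sketch that counts such records per feature type; the function name is hypothetical:

```python
import re
from collections import Counter
from typing import Iterable

_TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]*)"')


def features_without_transcript_id(gtf_lines: Iterable[str]) -> Counter:
    """Count GTF records, by feature type, whose transcript_id is missing or empty."""
    counts = Counter()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GTF record
        match = _TRANSCRIPT_ID.search(fields[8])
        if match is None or match.group(1) == "":
            counts[fields[2]] += 1
    return counts
```

Passing an open file handle for genomic.gtf shows which feature types are affected, which helps decide whether filtering the annotation is a reasonable workaround.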

Calls: mean -> sapply -> lapply -> FUN -> mean -> strsplit Execution halted

echo "No reads in /dell/muscle/sample9_2.fastq.gz" > samples/sample9/read_statistics/fastqc_stats/sample9_2_fastqc.html && echo "No reads in /dell/muscle/sample9_2.fastq.gz" > samples/sample9/read_statistics/fastqc_stats/sample9_2_fastqc/fastqc_data.txt && fastqc /dell/muscle/sample9_2.fastq.gz -o samples/sample9/read_statistics/fastqc_stats --extract > samples/sample9/read_statistics/fastqc_stats/sample9_2.fastq_fastqc.log 2> samples/sample9/read_statistics/fastqc_stats/sample9_2.fastq_fastqc.err
get_stringtie_rawcounts.R -g samples/sample9/processings/stringtie/sample9_transcripts.gtf -f /dell/circompara2/test_circompara/analysis/samples/sample9/read_statistics/fastqc_stats/sample9_1_fastqc/fastqc_data.txt,/dell/circompara2/test_circompara/analysis/samples/sample9/read_statistics/fastqc_stats/sample9_2_fastqc/fastqc_data.txt -o samples/sample9/processings/stringtie/sample9_
Error in strsplit(grep("Sequence length", x = fastqc_data.txt, value = T), :
subscript out of bounds
Calls: mean -> sapply -> lapply -> FUN -> mean -> strsplit
Execution halted
scons: *** [samples/sample9/processings/stringtie/sample9_gene_expression_rawcounts.csv] Error 1
scons: building terminated because of errors.

How can I solve this?

Error: unmapped2anchors.py: error: no such option: -Q

When I run the command circompara2 it reported an error:

BYPASS = ['linear']: skipping linear transcript analysis
scons: done reading SConscript files.
scons: Building targets ...
unmapped2anchors.py -Q <( zcat samples/CK_BML1234_00h1/processings/circRNAs/CK_BML1234_00h1.unmappedSE.fq.gz ) | bowtie2 --seed 123 -p 16 --reorder --score-min=C,-15,0 -q -x /MaizeLab/auhpc1/DataBase/Species/Zea_Mays/B73v4/Bowtie2Index/genome.fa -U - 2> samples/CK_BML1234_00h1/processings/circRNAs/find_circ_out/bt2_secondpass.log | find_circ.py -G /MaizeLab/auhpc1/DataBase/Species/Zea_Mays/B73v4/zma_dna_v4.fa -p CK_BML1234_00h1_ -s samples/CK_BML1234_00h1/processings/circRNAs/find_circ_out/find_circ.log -R samples/CK_BML1234_00h1/processings/circRNAs/find_circ_out/sites.reads > samples/CK_BML1234_00h1/processings/circRNAs/find_circ_out/sites.bed
Usage:

  unmapped2anchors.py <alignments.bam> > unmapped_anchors.qfa

Extract anchor sequences from unmapped reads. Optionally permute.


unmapped2anchors.py: error: no such option: -Q
Traceback (most recent call last):
  File "/opt/biosoft/find_circ-1.2/find_circ.py", line 608, in <module>
    sam = pysam.Samfile('-','r')
  File "pysam/libcalignmentfile.pyx", line 741, in pysam.libcalignmentfile.AlignmentFile.__cinit__
  File "pysam/libcalignmentfile.pyx", line 990, in pysam.libcalignmentfile.AlignmentFile._open
ValueError: file has no sequences defined (mode='r') - is it SAM/BAM format? Consider opening with check_sq=False

gzip: stdout: Broken pipe
scons: *** [samples/CK_BML1234_00h1/processings/circRNAs/find_circ_out/sites.bed] Error 1
scons: building terminated because of errors.
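
The traceback shows find_circ.py being loaded from /opt/biosoft/find_circ-1.2, and the usage text printed by unmapped2anchors.py does not list -Q. This suggests, as an inference from the log only, that a separately installed find_circ is found on PATH ahead of the copy circompara2 expects. A small Python sketch to see which executables are picked up; the helper name is hypothetical:

```python
import shutil
from typing import Dict, Iterable, Optional


def resolve_on_path(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each executable name to the first match on PATH (None if not found)."""
    return {name: shutil.which(name) for name in names}
```

For example, `resolve_on_path(["unmapped2anchors.py", "find_circ.py"])` run in the same shell as circompara2 shows whether the /opt/biosoft copies take precedence.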

[samples/SRR5388395/processings/hisat2_out/SRR5388395_hisat2.bam] Explicit dependency `/home/mli/Desktop/Non_coding_RNA_Prediction/Circle_RNA/circompara2-master/genome_indexes/indexes/hisat2/GRCh38.1.ht2' not found

Dear author:

I was running your pipeline using Docker on CentOS 7. I pre-built the indexes using your scripts, but I still get the error:

[samples/SRR5388395/processings/hisat2_out/SRR5388395_hisat2.bam] Explicit dependency `/home/mli/Desktop/Non_coding_RNA_Prediction/Circle_RNA/circompara2-master/genome_indexes/indexes/hisat2/GRCh38.1.ht2' not found

I do have the file. I am running this command: "docker run -u `id -u` --rm -it -v $(pwd):/data egaffo/circompara2:v0.1.2.1"

I do have meta.csv and vars.py under current directory.

Thank you so much ~

why three different mappers ??

circompara2 uses TopHat, STAR and Bowtie2. Why all three? Is it necessary to run all three, or is one of them enough?

CircRNA junction sequence

Is there a way to export the junction sequence of the identified circRNAs? This would help downstream analysis, such as quantification using Ribo-seq data.
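
As an illustration of what is being asked for (this is not a circompara2 feature), a back-splice junction sequence can be built by joining the last bases of the circRNA interval to its first bases. The Python sketch below makes a simplifying assumption: it treats the circRNA as the contiguous genomic interval and ignores internal introns, so the flank should stay shorter than the terminal exons. Function names and the 0-based, half-open coordinate convention are choices made for this example:

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA sequence (A/C/G/T/N, either case)."""
    return seq.translate(_COMPLEMENT)[::-1]


def backsplice_junction(chrom_seq: str, start: int, end: int,
                        strand: str = "+", flank: int = 20) -> str:
    """Return the sequence spanning the back-splice junction.

    start/end are 0-based, half-open coordinates on chrom_seq.
    """
    circ = chrom_seq[start:end]
    if strand == "-":
        circ = reverse_complement(circ)
    k = min(flank, len(circ))
    if k <= 0:
        return ""
    # In the circle, the 3' end of the interval is joined to its 5' end.
    return circ[-k:] + circ[:k]
```

The chromosome sequence can come from any FASTA reader; reads from Ribo-seq or other data that align across the middle of the returned string support the junction.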

Error with setting HISAT2_EXTRA_PARAMS = "--rna-strandness RF"

Dear egaffo:

I am sorry to bother you again.

I was running circompara2 on CentOS 7 using docker.

Here is my command: "docker run -u `id -u` --rm -it -v $(pwd):/data egaffo/circompara2:v0.1.2.1 -j 2". It worked perfectly. However, when I uncommented HISAT2_EXTRA_PARAMS = "--rna-strandness RF", it gave me the following error:

scons:building terminated because of errors.

Our circRNA-seq library was built using a strand-specific protocol.

Do you have any idea why this would occur? Thank you very much!

Best
Mian

address 0x7f6156ce83d8, cause 'invalid permissions'

Hi Egaffo, I have the following error:
"[SEGEMEHL] Fri Dec 29 08:21:58 2023: starting 65 threads.
[SEGEMEHL] Fri Dec 29 18:04:02 2023: threaded matching w/ suffixarray has taken 34924.000000 seconds.
[SEGEMEHL] Fri Dec 29 18:04:05 2023: Mapping stats:
total mapped (%) unique (%) multi (%) split (%)
all 22946498 12515504 54.54% 11889302 51.81% 626202 2.73% 6312497 27.51%
pair 11473249 2105160 18.35% 1879050 16.38% 226110 1.97% 392683 3.42%
[SEGEMEHL] Fri Dec 29 18:04:06 2023:
Goodbye.
"Ich hol' jetzt die Hilti!" (Ein verzweifelter Bauarbeiter)
filter_segemehl.R -i samples/processings/circRNAs/segemehl/unmapped_1.fastq.sngl.bed -t samples/processings/circRNAs/segemehl/unmapped_1.fastq.trns.txt -q median_1 -o samples/processings/circRNAs/testrealign/splicesites.bed -r samples/processings/circRNAs/testrealign/mpo.circular.reads.bed.gz -l samples/processings/circRNAs/testrealign/mpo.old.segemehl.format.bed

*** caught segfault ***
address 0x7f6156ce83d8, cause 'invalid permissions'
scons: *** [samples/processings/circRNAs/testrealign/splicesites.bed] Error -11
scons: building terminated because of errors." I run this with docker. Could you help me with this?

scons: warning: Two different environments were specified for target SRR5388395_1_fastqc.html

Dear author:

I was running circompara2 on CentOS 7 with Docker.

Here are my vars.py and run command: "docker run -u `id -u` --rm -it -v $(pwd):/data egaffo/circompara2:v0.1.2.1"

$ cat vars.py

META = 'meta.csv'
CPUS = '4'
## pre-computed index and annotation files
GENOME_FASTA="genome_indexes/human_ref/GRCh38.fa"
GENOME_INDEX="genome_indexes/indexes/hisat2/GRCh38"
SEGEMEHL_INDEX="genome_indexes/indexes/segemehl/GRCh38.idx"
BWA_INDEX="genome_indexes/indexes/bwa/GRCh38"
BOWTIE2_INDEX="genome_indexes/indexes/bowtie2/GRCh38"
BOWTIE_INDEX="genome_indexes/indexes/bowtie/GRCh38"
STAR_INDEX="genome_indexes/indexes/star/GRCh38"
GENEPRED="genome_indexes/annotation/GRCh38.genePred.wgn"
ANNOTATION="genome_indexes/human_ref/GRCh38.gtf"

PREPROCESSOR = 'trimmomatic'
PREPROCESSOR_PARAMS = 'MAXINFO:40:0.5 LEADING:20 TRAILING:20 SLIDINGWINDOW:4:30 MINLEN:35 AVGQUAL:30'

I am getting the following warnings:

scons: Reading SConscript files ...

scons: warning: Two different environments were specified for target SRR5388395_1_fastqc.html,
but they appear to have the same action: echo "No reads in ${SOURCES[0]}" > ${TARGETS[0]} && echo "No reads in ${SOURCES[0]}" > ${TARGETS[4]} && fastqc $SOURCE -o ${TARGETS[0].dir} --extract > ${TARGETS[2]} 2> ${TARGETS[3]}
File "/circompara2/src/sconstructs/fastqc.py", line 66, in

scons: warning: Two different environments were specified for target SRR5388395_1_fastqc.zip,
but they appear to have the same action: echo "No reads in ${SOURCES[0]}" > ${TARGETS[0]} && echo "No reads in ${SOURCES[0]}" > ${TARGETS[4]} && fastqc $SOURCE -o ${TARGETS[0].dir} --extract > ${TARGETS[2]} 2> ${TARGETS[3]}
File "/circompara2/src/sconstructs/fastqc.py", line 66, in

scons: warning: Two different environments were specified for target SRR5388395_1.fastq_fastqc.log,
but they appear to have the same action: echo "No reads in ${SOURCES[0]}" > ${TARGETS[0]} && echo "No reads in ${SOURCES[0]}" > ${TARGETS[4]} && fastqc $SOURCE -o ${TARGETS[0].dir} --extract > ${TARGETS[2]} 2> ${TARGETS[3]}
File "/circompara2/src/sconstructs/fastqc.py", line 66, in

scons: warning: Two different environments were specified for target SRR5388395_1.fastq_fastqc.err,
but they appear to have the same action: echo "No reads in ${SOURCES[0]}" > ${TARGETS[0]} && echo "No reads in ${SOURCES[0]}" > ${TARGETS[4]} && fastqc $SOURCE -o ${TARGETS[0].dir} --extract > ${TARGETS[2]} 2> ${TARGETS[3]}
File "/circompara2/src/sconstructs/fastqc.py", line 66, in

scons: warning: Two different environments were specified for target SRR5388395_1_fastqc/fastqc_data.txt,
but they appear to have the same action: echo "No reads in ${SOURCES[0]}" > ${TARGETS[0]} && echo "No reads in ${SOURCES[0]}" > ${TARGETS[4]} && fastqc $SOURCE -o ${TARGETS[0].dir} --extract > ${TARGETS[2]} 2> ${TARGETS[3]}
File "/circompara2/src/sconstructs/fastqc.py", line 66, in

scons: warning: Two different environments were specified for target SRR5388395_1.fq.P.qtrim_fastqc.html,
but they appear to have the same action: echo "No reads in ${SOURCES[0]}" > ${TARGETS[0]} && echo "No reads in ${SOURCES[0]}" > ${TARGETS[4]} && fastqc $SOURCE -o ${TARGETS[0].dir} --extract > ${TARGETS[2]} 2> ${TARGETS[3]}
File "/circompara2/src/sconstructs/fastqc.py", line 66, in

scons: warning: Two different environments were specified for target SRR5388395_1.fq.P.qtrim_fastqc.zip,
but they appear to have the same action: echo "No reads in ${SOURCES[0]}" > ${TARGETS[0]} && echo "No reads in ${SOURCES[0]}" > ${TARGETS[4]} && fastqc $SOURCE -o ${TARGETS[0].dir} --extract > ${TARGETS[2]} 2> ${TARGETS[3]}
File "/circompara2/src/sconstructs/fastqc.py", line 66, in

scons: warning: Two different environments were specified for target SRR5388395_1.fq.P.qtrim_fastqc.log,
but they appear to have the same action: echo "No reads in ${SOURCES[0]}" > ${TARGETS[0]} && echo "No reads in ${SOURCES[0]}" > ${TARGETS[4]} && fastqc $SOURCE -o ${TARGETS[0].dir} --extract > ${TARGETS[2]} 2> ${TARGETS[3]}
File "/circompara2/src/sconstructs/fastqc.py", line 66, in

scons: warning: Two different environments were specified for target SRR5388395_1.fq.P.qtrim_fastqc.err,
but they appear to have the same action: echo "No reads in ${SOURCES[0]}" > ${TARGETS[0]} && echo "No reads in ${SOURCES[0]}" > ${TARGETS[4]} && fastqc $SOURCE -o ${TARGETS[0].dir} --extract > ${TARGETS[2]} 2> ${TARGETS[3]}
File "/circompara2/src/sconstructs/fastqc.py", line 66, in

scons: warning: Two different environments were specified for target SRR5388395_1.fq.P.qtrim_fastqc/fastqc_data.txt,
but they appear to have the same action: echo "No reads in ${SOURCES[0]}" > ${TARGETS[0]} && echo "No reads in ${SOURCES[0]}" > ${TARGETS[4]} && fastqc $SOURCE -o ${TARGETS[0].dir} --extract > ${TARGETS[2]} 2> ${TARGETS[3]}
File "/circompara2/src/sconstructs/fastqc.py", line 66, in

scons: warning: Two different environments were specified for target SRR5388395_1.fq.U.qtrim_fastqc.html,
but they appear to have the same action: echo "No reads in ${SOURCES[0]}" > ${TARGETS[0]} && echo "No reads in ${SOURCES[0]}" > ${TARGETS[4]} && fastqc $SOURCE -o ${TARGETS[0].dir} --extract > ${TARGETS[2]} 2> ${TARGETS[3]}
File "/circompara2/src/sconstructs/fastqc.py", line 66, in

scons: warning: Two different environments were specified for target SRR5388395_1.fq.U.qtrim_fastqc.zip,
but they appear to have the same action: echo "No reads in ${SOURCES[0]}" > ${TARGETS[0]} && echo "No reads in ${SOURCES[0]}" > ${TARGETS[4]} && fastqc $SOURCE -o ${TARGETS[0].dir} --extract > ${TARGETS[2]} 2> ${TARGETS[3]}
File "/circompara2/src/sconstructs/fastqc.py", line 66, in

scons: warning: Two different environments were specified for target SRR5388395_1.fq.U.qtrim_fastqc.log,
but they appear to have the same action: echo "No reads in ${SOURCES[0]}" > ${TARGETS[0]} && echo "No reads in ${SOURCES[0]}" > ${TARGETS[4]} && fastqc $SOURCE -o ${TARGETS[0].dir} --extract > ${TARGETS[2]} 2> ${TARGETS[3]}
File "/circompara2/src/sconstructs/fastqc.py", line 66, in

scons: warning: Two different environments were specified for target SRR5388395_1.fq.U.qtrim_fastqc.err,
but they appear to have the same action: echo "No reads in ${SOURCES[0]}" > ${TARGETS[0]} && echo "No reads in ${SOURCES[0]}" > ${TARGETS[4]} && fastqc $SOURCE -o ${TARGETS[0].dir} --extract > ${TARGETS[2]} 2> ${TARGETS[3]}
File "/circompara2/src/sconstructs/fastqc.py", line 66, in

scons: warning: Two different environments were specified for target SRR5388395_1.fq.U.qtrim_fastqc/fastqc_data.txt,
but they appear to have the same action: echo "No reads in ${SOURCES[0]}" > ${TARGETS[0]} && echo "No reads in ${SOURCES[0]}" > ${TARGETS[4]} && fastqc $SOURCE -o ${TARGETS[0].dir} --extract > ${TARGETS[2]} 2> ${TARGETS[3]}
File "/circompara2/src/sconstructs/fastqc.py", line 66, in
scons: done reading SConscript files.
scons: Building targets ...

Does this matter?

Question 2: How can I specify -j 2 when using the Docker image?

Thank you so much for your time.
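
Another report on this page states that appending -j 2 after the image name worked. A Python sketch that assembles that command as an argument list, for instance to pass to subprocess.run; the function name is hypothetical, the image tag is the one quoted in these issues, and `os.getuid()` assumes a Unix-like host:

```python
import os
from typing import List


def circompara2_docker_argv(project_dir: str, jobs: int = 1,
                            image: str = "egaffo/circompara2:v0.1.2.1") -> List[str]:
    """Build the docker run argument list, mounting project_dir at /data."""
    argv = ["docker", "run", "-u", str(os.getuid()), "--rm", "-it",
            "-v", f"{os.path.abspath(project_dir)}:/data", image]
    if jobs > 1:
        # Arguments placed after the image name are passed on to the pipeline.
        argv += ["-j", str(jobs)]
    return argv
```

Keeping the arguments in a list avoids shell quoting problems with paths that contain spaces.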

an error when running on Docker

Hi, thanks for your work! I moved the reads and annotation directories into the analysis directory, edited meta.csv and var.py as described in the README, and ran the command below:
docker run -u `id -u` --rm -it -v $(pwd):/data egaffo/circompara2:v0.1.2.1
However, I got below error:

TypeError: Tried to lookup Dir 'dbs/indexes' as a File.:
  File "/circompara2/src/sconstructs/main.py", line 489:
    exports = '''env_check_indexes''')
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 660:
    return method(*args, **kw)
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 597:
    return _SConscript(self.fs, *files, **subst_kw)
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 286:
    exec(compile(scriptdata, scriptname, 'exec'), call_stack[-1].globals)
  File "/circompara2/src/sconstructs/check_indexes.py", line 118:
    exports = '''env_build_indexes ''')
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 660:
    return method(*args, **kw)
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 597:
    return _SConscript(self.fs, *files, **subst_kw)
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 286:
    exec(compile(scriptdata, scriptname, 'exec'), call_stack[-1].globals)
  File "/circompara2/src/sconstructs/build_indexes.py", line 60:
    env_index_hisat2['GENOME'] = ','.join([File(f).abspath for f in env['GENOME'].split(',')])
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 660:
    return method(*args, **kw)
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Environment.py", line 2092:
    return self.fs.File(s, *args, **kw)
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Node/FS.py", line 1382:
    return self._lookup(name, directory, File, create)
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Node/FS.py", line 1361:
    return root._lookup_abs(p, fsclass, create)
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Node/FS.py", line 2406:
    result.must_be_same(klass)
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Node/FS.py", line 618:
    (self.__class__.__name__, self.get_internal_path(), klass.__name__))

var.py:
META            = 'meta.csv'
GENOME_FASTA    = '/annotation/CFLAR_HIPK3.fa'
ANNOTATION      = '/annotation/CFLAR_HIPK3.gtf'
CPUS            = '4'

meta.csv:
file,sample,adapter
/reads/readsA_1.fastq.gz,sample_A,/circompara2/tools/Trimmomatic-0.39/adapters/TruSeq3-PE-2.fa  
/reads/readsA_2.fastq.gz,sample_A,/circompara2/tools/Trimmomatic-0.39/adapters/TruSeq3-PE-2.fa  
/reads/readsB_1.fastq.gz,sample_B,/circompara2/tools/Trimmomatic-0.39/adapters/TruSeq3-PE-2.fa  
/reads/readsB_2.fastq.gz,sample_B,/circompara2/tools/Trimmomatic-0.39/adapters/TruSeq3-PE-2.fa

It only generated an empty directory, dbs. My system is macOS. What did I miss? I would appreciate any reply. Thank you.
