Comments (9)
Thanks for the bug report.
I'm afraid it would be very hard to fix the problem without the specific
data. It would be best if you could send me a fraction (e.g. 1K) of the
reads causing the problem; otherwise, you could tell me more about
the reads (e.g. length and number), though this information may not help much.
In addition, you could try changing the distanceLow and distanceHigh
options from 1000 to "insert length minus 1000" and "insert length plus
1000", as shown in the manual. I'm not sure if this is the cause of the
problem, since the 1000 setting only causes bad performance, not a
segmentation fault, on my various test data.
I'm getting a segfault on 1 TB RAM machine comparing two small bacterial
genomes.
Here's the stdout:
cmd-> cat alignGraph.out
AlignGraph: algorithm for secondary de novo genome assembly guided by
closely related references
By Ergude Bao, CS Department, UC-Riverside. All Rights Reserved
(0) Alignment finished
CHROMOSOME 0:
(1) chromosome loaded
(2) contig alignment loaded
Here's error:
[2]+ Segmentation fault
/sgi/asmopt/src/AlignGraph/AlignGraph/AlignGraph --read1 all_R1.fasta
--read2 all_R2.fasta --contig contigs.fasta --genome chromosome.fasta
--distanceLow 1000 --distanceHigh 1000 --extendedContig extendedContigs.fa
--remainingContig remainingContigs.fa > alignGraph.out 2> alignGraph.err
Also, you should not hardcode the number of processors for bowtie2 to 8 -
we have 64, the prog should pick max at runtime.
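As an illustration of the request above, here is a minimal C++ sketch of choosing the thread count at runtime. The helper names `pickThreadCount` and `bowtie2ThreadOption` are hypothetical and not part of AlignGraph; the only external facts relied on are that `std::thread::hardware_concurrency()` may return 0 when the count is unknown and that bowtie2 takes its thread count through `-p`.

```cpp
#include <cassert>
#include <string>
#include <thread>

// Hypothetical helper (not AlignGraph code): use the detected number of
// hardware threads, falling back to a fixed default when detection fails.
// std::thread::hardware_concurrency() is allowed to return 0 in that case.
unsigned pickThreadCount(unsigned detected, unsigned fallback = 8) {
    assert(fallback > 0);  // the fallback itself must be a usable count
    return detected > 0 ? detected : fallback;
}

// Build the thread option for a bowtie2 command line.
// "-p" is bowtie2's option for the number of alignment threads.
std::string bowtie2ThreadOption() {
    unsigned n = pickThreadCount(std::thread::hardware_concurrency());
    return "-p " + std::to_string(n);
}
```

On a 64-core machine this would probably yield `-p 64`; whether using every core is desirable on a shared machine is a separate question, so a command-line override might still be worth having.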
from aligngraph.
Hi Bao,
I think you should provide a couple of command-line examples as I struggled to get it right.
The instructions sound like the command should include signed integers, which is problematic.
I have a mean insert size of 590 with an SD of 200, so here are the params I used:
'--distanceLow -410 --distanceHigh 1590'; this did not work because of the negative sign, so I changed it to '--distanceLow 100 --distanceHigh 1590'.
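The arithmetic above (590 - 1000 = -410, 590 + 1000 = 1590) can be sketched in C++ as follows. `distanceBounds` is a hypothetical helper, not part of AlignGraph. Clamping the lower bound at a non-negative minimum is an assumption made only to avoid a negative command-line value; the thread does not say which lower bound AlignGraph itself expects in this case.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper (not AlignGraph code): derive the two distance bounds
// from the mean insert length using the "insert length minus 1000" and
// "insert length plus 1000" rule quoted from the manual.
// ASSUMPTION: the lower bound is clamped at minLow so it is never negative.
std::pair<long, long> distanceBounds(long insertLength,
                                     long margin = 1000,
                                     long minLow = 0) {
    assert(insertLength >= 0 && margin >= 0 && minLow >= 0);
    long low = insertLength - margin;
    if (low < minLow) {
        low = minLow;  // avoid passing a negative value on the command line
    }
    long high = insertLength + margin;
    return {low, high};
}
```

With a mean insert size of 590, `distanceBounds(590)` gives a lower bound of 0 and an upper bound of 1590, and `distanceBounds(590, 1000, 100)` reproduces the 100 and 1590 values used in this comment.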
The segmentation fault occurs after bowtie2 has completed mapping and blat has completed its alignment, while AlignGraph is processing (as confirmed by '(2) contig alignment loaded' in the log).
There's not much I could send to you as the files in tmp dir are large:
total 15G
0 -rw-rw-r-- 1 dbrami employees 0 Jun 25 10:07 _chaff.fa
4.0M -rw-rw-r-- 1 dbrami employees 4.0M Jun 25 10:07 _contigs.fa
8.0K -rw-rw-r-- 1 dbrami employees 5.9K Jun 25 10:12 _contigs_genome.0.psl
4.9M -rw-rw-r-- 1 dbrami employees 4.8M Jun 25 10:07 _genome.0.fa
5.6M -rw-rw-r-- 1 dbrami employees 5.6M Jun 25 10:07 _genome.1.bt2
1.2M -rw-rw-r-- 1 dbrami employees 1.2M Jun 25 10:07 _genome.2.bt2
0 -rw-rw-r-- 1 dbrami employees 17 Jun 25 10:07 _genome.3.bt2
1.2M -rw-rw-r-- 1 dbrami employees 1.2M Jun 25 10:07 _genome.4.bt2
4.9M -rw-rw-r-- 1 dbrami employees 4.8M Jun 25 10:07 _genome.fa
5.6M -rw-rw-r-- 1 dbrami employees 5.6M Jun 25 10:07 _genome.rev.1.bt2
1.2M -rw-rw-r-- 1 dbrami employees 1.2M Jun 25 10:07 _genome.rev.2.bt2
8.0K -rw-rw-r-- 1 dbrami employees 4.9K Jun 25 10:14 _initial_contigs.0.fa
1.3G -rw-rw-r-- 1 dbrami employees 1.3G Jun 25 10:06 _reads_1.fa
1.1G -rw-rw-r-- 1 dbrami employees 1.1G Jun 25 10:07 _reads_2.fa
2.4G -rw-rw-r-- 1 dbrami employees 2.4G Jun 25 10:04 _reads.fa
3.0G -rw-rw-r-- 1 dbrami employees 3.0G Jun 25 10:14 _reads_genome.0.bowtie
7.4G -rw-rw-r-- 1 dbrami employees 7.4G Jun 25 10:12 _reads_genome.bowtie
from aligngraph.
Hi, Daniel,
I found that your left read file has a different size from the right read file.
I think this could be the cause of the problem, since I didn't consider
the situation where the two reads of a pair have different lengths.
I have updated AlignGraph to handle this situation,
so you could try the current software version and see if the problem
has been solved.
I have also updated the manual to make it clearer.
Thanks,
Bao
from aligngraph.
Yes, trimmed reads often have different lengths. I will try tomorrow.
from aligngraph.
It has still crashed with a 'Segmentation fault'. Here's the stdout:
(0) Alignment finished
CHROMOSOME 0:
(1) chromosome loaded
(2) contig alignment loaded
from aligngraph.
There may be a confounding factor here: most of the reads will not map to my assembled contigs. I selected a couple of contigs from a small metagenomic assembly but supplied the program with all the reads.
from aligngraph.
Could you send me the output of "ll tmp" from this run, as you did yesterday?
from aligngraph.
Hello,
I believe I am having this same issue. I have added several couts and uncommented some of the ones you had in place to try to locate where the issue occurs. It definitely makes it to the updateKMer function and the updateKBases function (around line 1184). I think it is in the "goto cont" statement on line 1187 (I added some more prints, so the line numbers are not exact). The last print statement before the segfault is just before the first if() of the cont: block (line 1284). I did have a print for when it passed the first if() statement here and it did not print, but I did not have prints for the other if()s. I just added them, recompiled, and am re-running. I will update when the results are available, unless you think this is pretty much useless.
from aligngraph.
Ok, here is an update...
Here is a snippet of your code mixed with my prints (starting around line 1275):
cont:
cout << "cont: A" << endl;
k2.traversed = 0;
k2.s = nextS;
k2.chromosomeID0 = nextID0;
k2.chromosomeOffset0 = nextOffset0;
k2.coverage = 0;
k2.A = k2.C = k2.G = k2.T = k2.N = 0;
cout << "cont: B" << endl;
cout << "nextid: " << nextID << ", nextOffset: " << nextOffset << endl;
cout << genome[nextID][nextOffset].contiMer.size() << endl;
cout << "nextid0: " << nextID0 << ", nextOffset0: " << nextOffset0 << endl;
cout << genome[nextID0][nextOffset0].contiMer.size() << endl;
I have additional cout statements at the beginning of each if() statement and just before each if() statement; however, the last few lines of the stdout are:
cont: A
cont: B
nextid: 0, nextOffset: 28150056
0
nextid0: 4294967295, nextOffset0: 4294967295
So, it appears that for some reason the segfault is caused by trying to look up indices that don't exist (genome[nextID0][nextOffset0]). Any thoughts?
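One observation that may help: 4294967295 is the largest 32-bit unsigned value (2^32 - 1), which is also what -1 becomes when converted to a 32-bit unsigned integer. A minimal C++ sketch of a bounds guard for this kind of lookup follows. It assumes, without having checked the AlignGraph source, that nextID0 and nextOffset0 are 32-bit unsigned and that this value acts as a "not set" marker; `kNoPosition`, `validPosition` and `checkedAt` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// ASSUMPTION: the indices are 32-bit unsigned. 4294967295 is the maximum
// such value and equals static_cast<std::uint32_t>(-1).
constexpr std::uint32_t kNoPosition = std::numeric_limits<std::uint32_t>::max();

// Hypothetical guard (not AlignGraph code): true only if genome[id][offset]
// refers to an existing element. The id check runs first, so genome[id] is
// never evaluated for an out-of-range id.
template <typename T>
bool validPosition(const std::vector<std::vector<T>>& genome,
                   std::uint32_t id, std::uint32_t offset) {
    return id < genome.size() && offset < genome[id].size();
}

// Debug-build accessor: aborts with a diagnostic at the bad lookup instead
// of reading out of bounds.
template <typename T>
const T& checkedAt(const std::vector<std::vector<T>>& genome,
                   std::uint32_t id, std::uint32_t offset) {
    assert(validPosition(genome, id, offset));
    return genome[id][offset];
}
```

Calling something like `validPosition(genome, nextID0, nextOffset0)` before the lookup would at least show whether the second pair of indices is ever set on this code path.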
from aligngraph.