
Comments (9)

baoe avatar baoe commented on May 18, 2024

Thanks for the bug report.

I'm afraid it will be very hard to fix the problem without the specific
data. It would be best if you could send me a small subset of the reads
(e.g. 1K) that triggers the problem; otherwise, you could tell me more about
the reads (e.g. length and number), though that information may not help much.

In addition, you could try changing the distanceLow and distanceHigh
options from 1000 to "insert length minus 1000" and "insert length plus
1000", as shown in the manual. I'm not sure this is the cause of the
problem, though: on my various test data, setting both to 1000 only hurts
performance and does not cause a segmentation fault.

I'm getting a segfault on a machine with 1 TB of RAM when comparing two small
bacterial genomes.

Here's the stdout:
cmd-> cat alignGraph.out
AlignGraph: algorithm for secondary de novo genome assembly guided by
closely related references
By Ergude Bao, CS Department, UC-Riverside. All Rights Reserved

(0) Alignment finished

CHROMOSOME 0:
(1) chromosome loaded
(2) contig alignment loaded

Here's error:
[2]+ Segmentation fault
/sgi/asmopt/src/AlignGraph/AlignGraph/AlignGraph --read1 all_R1.fasta
--read2 all_R2.fasta --contig contigs.fasta --genome chromosome.fasta
--distanceLow 1000 --distanceHigh 1000 --extendedContig extendedContigs.fa
--remainingContig remainingContigs.fa > alignGraph.out 2> alignGraph.err

Also, you should not hardcode the number of processors for bowtie2 to 8 -
we have 64, the prog should pick max at runtime.
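
A minimal sketch of picking the thread count at runtime instead of hardcoding 8. It assumes the program builds a bowtie2 command string and runs it; how AlignGraph actually invokes bowtie2 is not shown in this thread, so `pickThreadCount`, `bowtie2Command`, and the file names are hypothetical. The `-p` (threads), `-f` (FASTA input), `-x`, `-U`, and `-S` options are standard bowtie2 flags.

```cpp
#include <cassert>
#include <string>
#include <thread>

// Number of worker threads to request: the value reported by the standard
// library, or `fallback` when the platform cannot report it (returns 0).
unsigned pickThreadCount(unsigned fallback = 1) {
    unsigned n = std::thread::hardware_concurrency();
    return n == 0 ? fallback : n;
}

// Build a bowtie2 command line with an explicit thread count.
// The index, read, and output names are placeholders for illustration only.
std::string bowtie2Command(const std::string& index, const std::string& reads,
                           const std::string& out, unsigned threads) {
    assert(threads >= 1);
    return "bowtie2 -f -p " + std::to_string(threads) +
           " -x " + index + " -U " + reads + " -S " + out;
}
```

A caller could then pass `pickThreadCount()` as the last argument, or let a command-line option override it on shared machines where taking every core is not welcome.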



from aligngraph.

dbrami avatar dbrami commented on May 18, 2024

Hi Bao,
I think you should provide a couple of command-line examples, as I struggled to get it right.
The instructions read as if the command should include signed integers, which is problematic.
I have a mean insert size of 590 with an SD of 200, so here are the params I used:
'--distanceLow -410 --distanceHigh 1590'; this did not work because of the negative sign, so I changed it to '--distanceLow 100 --distanceHigh 1590'.
The segmentation fault occurs after bowtie2 has completed mapping and blat has completed its alignment, while AlignGraph is processing (as confirmed by '(2) contig alignment loaded' in the log).
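
The bounds arithmetic from the manual's rule ("insert length minus 1000" and "insert length plus 1000") can be sketched as below. Clamping the lower bound at 0 is an assumption made here to avoid a negative value; the thread does not say what minimum AlignGraph accepts, and `distanceBounds` is a hypothetical helper, not part of the tool.

```cpp
#include <cassert>
#include <utility>

// Lower and upper bounds for --distanceLow / --distanceHigh, computed as
// insert length minus/plus a margin (1000 in the manual's wording).
// Assumption: a negative lower bound is clamped to 0.
std::pair<long, long> distanceBounds(long insertLength, long margin) {
    assert(insertLength >= 0 && margin >= 0);
    long low = insertLength - margin;
    if (low < 0) low = 0;  // 590 - 1000 would otherwise give -410
    return {low, insertLength + margin};
}
```

For an insert size of 590 and a margin of 1000 this gives a lower bound of 0 and an upper bound of 1590.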

There's not much I could send to you as the files in tmp dir are large:

total 15G
0 -rw-rw-r-- 1 dbrami employees 0 Jun 25 10:07 _chaff.fa
4.0M -rw-rw-r-- 1 dbrami employees 4.0M Jun 25 10:07 _contigs.fa
8.0K -rw-rw-r-- 1 dbrami employees 5.9K Jun 25 10:12 _contigs_genome.0.psl
4.9M -rw-rw-r-- 1 dbrami employees 4.8M Jun 25 10:07 _genome.0.fa
5.6M -rw-rw-r-- 1 dbrami employees 5.6M Jun 25 10:07 _genome.1.bt2
1.2M -rw-rw-r-- 1 dbrami employees 1.2M Jun 25 10:07 _genome.2.bt2
0 -rw-rw-r-- 1 dbrami employees 17 Jun 25 10:07 _genome.3.bt2
1.2M -rw-rw-r-- 1 dbrami employees 1.2M Jun 25 10:07 _genome.4.bt2
4.9M -rw-rw-r-- 1 dbrami employees 4.8M Jun 25 10:07 _genome.fa
5.6M -rw-rw-r-- 1 dbrami employees 5.6M Jun 25 10:07 _genome.rev.1.bt2
1.2M -rw-rw-r-- 1 dbrami employees 1.2M Jun 25 10:07 _genome.rev.2.bt2
8.0K -rw-rw-r-- 1 dbrami employees 4.9K Jun 25 10:14 _initial_contigs.0.fa
1.3G -rw-rw-r-- 1 dbrami employees 1.3G Jun 25 10:06 _reads_1.fa
1.1G -rw-rw-r-- 1 dbrami employees 1.1G Jun 25 10:07 _reads_2.fa
2.4G -rw-rw-r-- 1 dbrami employees 2.4G Jun 25 10:04 _reads.fa
3.0G -rw-rw-r-- 1 dbrami employees 3.0G Jun 25 10:14 _reads_genome.0.bowtie
7.4G -rw-rw-r-- 1 dbrami employees 7.4G Jun 25 10:12 _reads_genome.bowtie

from aligngraph.

baoe avatar baoe commented on May 18, 2024

Hi, Daniel,

I noticed that your left read file has a different size from your right read file.
I think this could be the cause of the problem, since I had not considered
the case where the two reads of a pair have different lengths.
I have updated AlignGraph to handle this case,
so please try the current version and see whether the problem
is solved.
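
One way to check for this situation before running the tool is sketched below: read both FASTA files, compare the number of records, and count the pairs whose two reads differ in length. This is an illustrative pre-check, not the change made in AlignGraph; `fastaLengths`, `comparePairs`, and `PairReport` are hypothetical names, and pairing records by position in the two files is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Sequence length of every record in a FASTA stream (multi-line records allowed).
std::vector<std::size_t> fastaLengths(std::istream& in) {
    std::vector<std::size_t> lengths;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF
        if (line.empty()) continue;
        if (line[0] == '>') lengths.push_back(0);                   // new record
        else if (!lengths.empty()) lengths.back() += line.size();   // sequence line
    }
    return lengths;
}

struct PairReport {
    bool sameCount;            // both files hold the same number of reads
    std::size_t unequalPairs;  // pairs whose left and right reads differ in length
};

// Compare two read files record by record, assuming the i-th records are mates.
PairReport comparePairs(std::istream& left, std::istream& right) {
    const std::vector<std::size_t> a = fastaLengths(left);
    const std::vector<std::size_t> b = fastaLengths(right);
    PairReport report{a.size() == b.size(), 0};
    const std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i) {
        if (a[i] != b[i]) ++report.unequalPairs;
    }
    return report;
}
```

Passing two `std::ifstream` objects opened on the left and right read files would report whether trimming has left mates with unequal lengths.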

I have also updated the manual to make it clearer.

Thanks,
Bao


from aligngraph.

dbrami avatar dbrami commented on May 18, 2024

Yes, trimmed reads often have different lengths. I will try tomorrow.

On Wed, Jun 25, 2014 at 4:47 PM, Bao wrote:

Hi, Daniel,

I find your left read file has a different size from the right read file.
I think this could be the cause of the problem, since I didn't consider
this situation that two pairs have different lengths.
I have made the corresponding updates to AlignGraph to fit this situation,
so you could try with the current software version and see if the problem
has been solved.

I have also updated the manual to make it clearer.

Thanks,
Bao


from aligngraph.

dbrami avatar dbrami commented on May 18, 2024

It still crashed with a 'Segmentation fault'. Here's the stdout:

(0) Alignment finished

CHROMOSOME 0:
(1) chromosome loaded
(2) contig alignment loaded

from aligngraph.

dbrami avatar dbrami commented on May 18, 2024

There may be a confounding factor here: most of the reads will not map to my assembled contigs. I selected a couple of contigs from a small metagenomic assembly but supplied the program with all the reads.

from aligngraph.

baoe avatar baoe commented on May 18, 2024

Can you send me the output of "ll tmp" for this run, just like yesterday?


from aligngraph.

kmhernan avatar kmhernan commented on May 18, 2024

Hello,

I believe I am having the same issue. I have added several couts and uncommented some of the ones you had in place to try to locate where the issue occurs. Execution definitely reaches the updateKMer function and the updateKBases function (around line 1184). I think the problem is reached through the "goto cont" statement on line 1187 (I added some more prints, so the line numbers are not exact). The last print statement before the segfault is just before the first if() of the cont: block (line 1284). I did have a print for the case where it passed the first if() statement, and it did not print, but I did not have prints for the other if()s. I have just added them, recompiled, and am re-running. I will post an update when it's available, unless you think this is not useful.

from aligngraph.

kmhernan avatar kmhernan commented on May 18, 2024

Ok, here is an update...

Here is a snippet of your code mixed with my prints (starting around line 1275):

cont:
    cout << "cont: A" << endl;
    k2.traversed = 0;
    k2.s = nextS;
    k2.chromosomeID0 = nextID0;
    k2.chromosomeOffset0 = nextOffset0;
    k2.coverage = 0;
    k2.A = k2.C = k2.G = k2.T = k2.N = 0;

    cout << "cont: B" << endl;
    cout << "nextid: " << nextID << ", nextOffset: " << nextOffset << endl;
    cout << genome[nextID][nextOffset].contiMer.size() << endl;
    cout << "nextid0: " << nextID0 << ", nextOffset0: " << nextOffset0 << endl;
    cout << genome[nextID0][nextOffset0].contiMer.size() << endl;

I have additional cout statements at the beginning of each if() statement and just before each if() statement; however, the last few lines of stdout are:

cont: A
cont: B
nextid: 0, nextOffset: 28150056
0
nextid0: 4294967295, nextOffset0: 4294967295

So it appears that the segfault is caused by trying to look up indices that don't exist (genome[nextID0][nextOffset0]). Any thoughts?
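
A note on the numbers: 4294967295 is 2^32 - 1, which is the value -1 takes when converted to a 32-bit unsigned integer, so nextID0 and nextOffset0 may be holding an "unset" sentinel rather than a real position. That is a guess from the printout, not something confirmed in the code. A minimal sketch of a guarded lookup follows; `Position`, `Genome`, and `lookup` are hypothetical stand-ins, and the real element type of `contiMer` is not shown in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Stand-in for the per-position record; only the field printed in the
// snippet above is modelled, and its element type is an assumption.
struct Position {
    std::vector<unsigned> contiMer;
};

using Genome = std::vector<std::vector<Position>>;

// Returns a pointer to genome[id][offset], or nullptr when either index is
// out of range, which includes the all-ones value seen in the printout.
const Position* lookup(const Genome& genome, std::uint32_t id, std::uint32_t offset) {
    if (id >= genome.size()) return nullptr;
    if (offset >= genome[id].size()) return nullptr;
    return &genome[id][offset];
}
```

With a check like this in place, the code under `cont:` could skip the `contiMer` access whenever `lookup` returns `nullptr` instead of indexing past the end of the vector.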

from aligngraph.
