marbl / canu
A single molecule sequence assembler for genomes large and small.
Home Page: http://canu.readthedocs.io/
When canu is launched inside an interactive grid login, it configures jobs as if they are to be run on that machine. However, on the first parallel execution, it will submit jobs to the grid. The shell script is written for local execution and doesn't know how to parse out the job id, and all jobs fail.
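Each engine reports the new job in its own submit-output format, which a grid-aware launch script has to parse; the echoed line below mimics SGE's qsub output to show what the locally-written script never learns to do:

```shell
# SGE's qsub reports a submission like this; a grid-aware script must
# extract the numeric job id from it before it can chain dependent jobs.
echo 'Your job 12345 ("meryl_asm") has been submitted' | grep -oE '[0-9]+' | head -n 1
```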
Hi there,
Just wanted to know whether I was the only one bumping into this error when trying to use the binary release for Linux:
Can't locate Filesys/Df.pm in @INC (@INC contains: /root/mydisk/canu-1.0/Linux-amd64/bin/lib/canu/lib64/perl5 /root/mydisk/canu-1.0/Linux-amd64/bin/lib/canu/lib/perl5 /root/mydisk/canu-1.0/Linux-amd64/bin/lib /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at /root/mydisk/canu-1.0/Linux-amd64/bin/lib/canu/Defaults.pm line 45.
BEGIN failed--compilation aborted at /root/mydisk/canu-1.0/Linux-amd64/bin/lib/canu/Defaults.pm line 45.
Compilation failed in require at canu-1.0/Linux-amd64/bin/canu line 48.
BEGIN failed--compilation aborted at canu-1.0/Linux-amd64/bin/canu line 48.
The module does exist in your tarball (in /module/apps/canu/1.0/Linux-amd64/bin/lib/canu/lib/perl5/x86_64-linux-thread-multi-ld/Filesys/Df.pm and /module/apps/canu/1.0/Linux-amd64/bin/lib/canu/lib64/perl5/5.8.8/x86_64-linux-thread-multi/Filesys/Df.pm), but those directories don't seem to be added to @INC upon execution. Now if I modify $PERL5LIB to hold those two dirs, I get the error:
perl: symbol lookup error: /root/mydisk/canu-1.0/Linux-amd64/bin/lib/canu/lib64/perl5/5.8.8/x86_64-linux-thread-multi/auto/Filesys/Df/Df.so: undefined symbol: Perl_Tstack_sp_ptr
Do I really need to install the module via cpan -i? It works, but I'm trying to allow the use of Canu in VMs created on the fly, and having to set up the cpan environment every time (or even once and for all) feels like overkill.
Thanks for your help
Em
On a dataset with 3 reads > 130kbp, Canu gatekeeper crashes with the error:
*** glibc detected *** /home/korens/devel/canu/Linux-amd64/bin/gatekeeperCreate: double free or corruption (out): 0x00002adb4c683010 ***
[0] /home/korens/devel/canu/Linux-amd64/bin/gatekeeperCreate::AS_UTL_catchCrash(int, siginfo*, void*) + 0x2d [0x4060ed]
[1] /lib64/libpthread.so.0 [0x3e1e60eca0]
[2] /lib64/libc.so.6::(null) + 0x35 [0x3e1d62ffc5]
[3] /lib64/libc.so.6::(null) + 0x110 [0x3e1d631a70]
[4] /lib64/libc.so.6 [0x3e1d66994b]
[5] /lib64/libc.so.6 [0x3e1d6714af]
[6] /lib64/libc.so.6::(null) + 0x4b [0x3e1d6757ab]
[7] /home/korens/devel/canu/Linux-amd64/bin/gatekeeperCreate [0x4033cf]
[8] /home/korens/devel/canu/Linux-amd64/bin/gatekeeperCreate [0x404690]
[9] /lib64/libc.so.6::(null) + 0xf4 [0x3e1d61d9f4]
[10] /home/korens/devel/canu/Linux-amd64/bin/gatekeeperCreate::(null) + 0xd1 [0x401b59]
Properly handle long reads, and also increase the maximum read length limit.
When running in local mode, Canu reports:
-- Found 139736676 16-mers; 81516113 distinct and 70555151 unique. Largest count 13740.
--
-- OVERLAPPER (mhap) (correction)
--
-- Given 6 GB, can fit 9000 reads per block.
-- For 4 blocks, set stride to 2 blocks.
-- Logging partitioning to '/ecoli/correction/1-overlapper/partitioning.log'.
-- Computed seed length 500 from desired output coverage 40 and genome size 4800000
-- Configured 3 mhap precompute jobs.
-- Configured 3 mhap overlap jobs.
--
-- 3 mhap precompute jobs failed:
-- job /ecoli/correction/1-overlapper/blocks/000001.dat FAILED.
-- job /ecoli/correction/1-overlapper/blocks/000002.dat FAILED.
-- job /ecoli/correction/1-overlapper/blocks/000003.dat FAILED.
--
-- mhap precompute attempt 2 begins with 0 finished, and 3 to compute.
Somewhere, the iteration counter is incremented and not reset. This does not happen in grid-based runs. To reproduce, run an example dataset as:
curl -L -o oxford.fasta http://nanopore.s3.climb.ac.uk/MAP006-PCR-1_2D_pass.fasta
canu -p asm -d ecoli genomeSize=4.8m -nanopore-raw oxford.fasta
Please provide a make install target to ease installation and separate the compiled binaries from the source code. It would be helpful to install the executables to the directory specified by $(DESTDIR)$(PREFIX)/bin and the documentation to $(DESTDIR)$(PREFIX)/share/doc/canu, to agree with the Filesystem Hierarchy Standard and autotools conventions.
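A minimal sketch of such a target; the relative paths, binary directory, and README globs here are assumptions, not canu's actual tree layout:

```make
# Hypothetical install target (run from src/); adjust paths to the real tree.
PREFIX ?= /usr/local

install:
	install -d $(DESTDIR)$(PREFIX)/bin $(DESTDIR)$(PREFIX)/share/doc/canu
	install -m 755 ../Linux-amd64/bin/* $(DESTDIR)$(PREFIX)/bin/
	install -m 644 ../README* $(DESTDIR)$(PREFIX)/share/doc/canu/
```

Staged installs then work the usual way, e.g. make install DESTDIR=/tmp/stage PREFIX=/usr.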
Hi,
I installed canu with brew but am getting this error:
/usr/local/Cellar/canu/1.0/Darwin-amd64/bin/generateCorrectionLayouts
-G /Users/laurencowley/Ecoli_644/Ecoli644_canu_correct//correction/Ecoli_644_canu.gkpStore
-O /Users/laurencowley/Ecoli_644/Ecoli644_canu_correct//correction/Ecoli_644_canu.ovlStore
-S /Users/laurencowley/Ecoli_644/Ecoli644_canu_correct//correction/2-correction/Ecoli_644_canu.globalScores
-C 80 \
-- Finished on Thu Feb 18 12:07:23 2016 (0 seconds) with 220.2 GB free disk space
gnuplot < /Users/laurencowley/Ecoli_644/Ecoli644_canu_correct//correction/2-correction/Ecoli_644_canu.estimate.original-x-correctedLength.gp \
> /dev/null 2>&1
ERROR: Failed with signal 127
-- Starting concurrent execution on Thu Feb 18 12:07:24 2016 with 220.2 GB free disk space (4 processes; 2 concurrently)
/Users/laurencowley/Ecoli_644/Ecoli644_canu_correct//correction/2-correction/correctReads.sh 1 > /Users/laurencowley/Ecoli_644/Ecoli644_canu_correct//correction/2-correction/correctReads.000001.out 2>&1
/Users/laurencowley/Ecoli_644/Ecoli644_canu_correct//correction/2-correction/correctReads.sh 2 > /Users/laurencowley/Ecoli_644/Ecoli644_canu_correct//correction/2-correction/correctReads.000002.out 2>&1
/Users/laurencowley/Ecoli_644/Ecoli644_canu_correct//correction/2-correction/correctReads.sh 3 > /Users/laurencowley/Ecoli_644/Ecoli644_canu_correct//correction/2-correction/correctReads.000003.out 2>&1
-- Finished on Thu Feb 18 12:07:24 2016 (0 seconds) with 220.2 GB free disk space
-- Starting concurrent execution on Thu Feb 18 12:07:24 2016 with 220.2 GB free disk space (4 processes; 2 concurrently)
/Users/laurencowley/Ecoli_644/Ecoli644_canu_correct//correction/2-correction/correctReads.sh 1 > /Users/laurencowley/Ecoli_644/Ecoli644_canu_correct//correction/2-correction/correctReads.000001.out 2>&1
/Users/laurencowley/Ecoli_644/Ecoli644_canu_correct//correction/2-correction/correctReads.sh 2 > /Users/laurencowley/Ecoli_644/Ecoli644_canu_correct//correction/2-correction/correctReads.000002.out 2>&1
/Users/laurencowley/Ecoli_644/Ecoli644_canu_correct//correction/2-correction/correctReads.sh 3 > /Users/laurencowley/Ecoli_644/Ecoli644_canu_correct//correction/2-correction/correctReads.000003.out 2>&1
Don't panic, but a mostly harmless error occurred and canu failed.
canu failed with 'can't open '/Users/laurencowley/Ecoli_644/Ecoli644_canu_correct//correction/2-correction/corjob.files' for reading: No such file or directory'.
Do you know what could be causing this?
Thanks,
Lauren
Hi!
I am getting an error from Canu for a 1.9 GB FASTQ PacBio bacterial dataset. It is running on an Ubuntu system:
Linux thompson.sgn.cornell.edu 3.19.0-43-generic #49~14.04.1-Ubuntu SMP Thu Dec 31 15:44:49 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
CANU command:
~/tools/canu/Linux-amd64/bin/canu -p PS_14 -d PS_14-def genomeSize=6.5m -useGrid=0 -maxThreads=30 -maxMemory=500 -pacbio-raw Pseudomonas_14_BP_P6_010516_MB_170pM/all.fastq
Error:
Modification of non-creatable array value attempted, subscript -453 at /data/home/surya/tools/canu/Linux-amd64/bin/lib/canu/CorrectReads.pm line 980, <L> line 4.
I have attached the log files to help track it down. Please let me know if you need anything else from me. Thanks much!
-Surya
Since we're now mostly configuring on the fly, the list of options needs to stop reporting the default values. This needs to be documented, along with 'showNext'.
Using nanopore raw data as input and the following command, I receive a "failed with signal HUP (1)" error between correcting and trimming the reads (an illegal division by zero, since the corrected reads are not written to the corrected-reads file). I thought this might be because of low coverage, but it also happens after applying the low-coverage options.
I'm trying to run canu on my test dataset. On one server, it runs without problems. However, on another one, I get an error on the same dataset. Here is the last part of the output:
-- BEGIN CORRECTION
--
--
-- GATEKEEPER (correction)
--
----------------------------------------
-- Starting command on Mon Feb 8 15:25:26 2016 with 222.9 GB free disk space
/home/kresimir/src/canu/Linux-amd64/bin/gatekeeperCreate \
-minlength 1000 \
-o /home/kresimir/benchmark/results/canu/dataset5/correction/d5.gkpStore.BUILDING \
/home/kresimir/benchmark/results/canu/dataset5/correction/d5.gkpStore.gkp \
> /home/kresimir/benchmark/results/canu/dataset5/correction/d5.gkpStore.err 2>&1
-- Finished on Mon Feb 8 15:25:30 2016 (4 seconds) with 222.8 GB free disk space
----------------------------------------
runCommandSilently()
gnuplot < /home/kresimir/benchmark/results/canu/dataset5/correction/d5.gkpStore/readlengths.gp \
> /dev/null 2>&1
ERROR: Failed with signal 127
I'm guessing that I'm missing some required program, but I can't figure out what.
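For what it's worth, "signal 127" here is the shell's exit status 127, which means the command itself could not be found; since the failing step pipes through gnuplot, checking that it is installed is a reasonable first step:

```shell
# Exit status 127 from a shell means "command not found":
sh -c 'some-missing-program' 2>/dev/null
echo "exit status: $?"

# Check whether gnuplot (used by canu to plot read-length histograms) is on PATH:
command -v gnuplot >/dev/null 2>&1 && echo "gnuplot found" || echo "gnuplot missing"
```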
Running a job where the input data has Ns in the sequence causes falcon_sense to hang on that input sequence. Canu needs to filter out non ACGT bases before passing to falcon_sense.
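A sketch of such a pre-filter in shell, replacing every non-ACGT character with an arbitrary valid base; the real fix might instead split or drop the affected reads, and a FASTA-aware version would have to skip header lines:

```shell
# Replace any character that is not A, C, G, or T (N, IUPAC ambiguity codes, ...)
# in a sequence line before it reaches falcon_sense:
printf 'ACGTNNRYACGT\n' | tr -c 'ACGT\n' 'A'
```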
Bogart computes repeat boundaries and possibly splits repeats on those boundaries. If a split occurs, the reads/unitigs that result are not marked as being from a repeat region, nor are the unsplit regions marked as possibly containing a repeat. Both markings should be captured.
Hi,
Thank you for your great work on canu. I am trying to assemble PacBio reads but am running into some trouble and was wondering if you could give any insight as to what could be going wrong. Thanks in advance!
This is the command I used:
export PATH="/path/to/jre1.8.0_66/bin/:$PATH"
export JAVA_HOME="//path/to/jre1.8.0_66"
#correct, trim, assemble with one command
~/path/to/canu/Linux-amd64/bin/canu \
gridOptions='--time=72:00:00 [email protected] --mail-type=BEGIN --mail-type=END --mail-type=FAIL' \
-p human_assembled_pacbio -d human_assembled_pacbio \
genomeSize=3.3g \
-pacbio-raw ~/path/to/reads.fastq
In the "canu-scripts" directory that is created, there are "canu.01.out", "canu.02.out", and "canu.03.out".
canu.01.out shows:
-- MERYL (correction)
-- Meryl attempt 1 begins.
--
-- Starting command on Wed Jan 6 18:45:30 2016 with 583697.3 GB free disk space
--
sbatch \
--time=72:00:00 --mail-user=myemail --mail-type=BEGIN --mail-type=END --mail-type=FAIL --mem=64g --cpus-per-task=24 \
-D `pwd` -J "meryl_human_assembled_pacbio" \
-a 1-1 \
-o /path/to/human_assembled_pacbio/correction/0-mercounts/meryl.%A_%a.out \
path/to/human_assembled_pacbio/correction/0-mercounts/meryl.sh

Submitted batch job 9234744
--
-- Finished on Wed Jan 6 18:45:30 2016 (0 seconds) with 583697.3 GB free disk space
--
-- Starting command on Wed Jan 6 18:45:31 2016 with 583697.3 GB free disk space
--
sbatch \
--mem=12g \
--cpus-per-task=1 \
--time=72:00:00 \
--mail-user=myemail \
--mail-type=BEGIN \
--mail-type=END \
--mail-type=FAIL \
--depend=afterany:9234744 \
-D `pwd` \
-J "canu_human_assembled_pacbio" \
-o /path/to/human_assembled_pacbio/canu-scripts/canu.02.out
Submitted batch job 9234745
--
-- Finished on Wed Jan 6 18:45:31 2016 (0 seconds) with 583697.3 GB free disk space
slurmstepd: Exceeded step memory limit at some point.
slurmstepd: Exceeded job memory limit at some point.
canu.02.out shows:
-- MERYL (correction)
--
-- meryl failed.
--
-- Meryl attempt 2 begins.
--
-- Starting command on Wed Jan 6 20:50:31 2016 with 583655.8 GB free disk space
--
sbatch \
--time=72:00:00 --mail-user=myemail --mail-type=BEGIN --mail-type=END --mail-type=FAIL --mem=64g --cpus-per-task=24 \
-D `pwd` -J "meryl_human_assembled_pacbio" \
-a 1-1 \
-o /path/to/correction/0-mercounts/meryl.%A_%a.out \
/path/to/correction/0-mercounts/meryl.sh

Submitted batch job 9235177
--
-- Finished on Wed Jan 6 20:50:31 2016 (0 seconds) with 583655.8 GB free disk space
--
-- Starting command on Wed Jan 6 20:50:32 2016 with 583655.8 GB free disk space
--
sbatch \
--mem=12g \
--cpus-per-task=1 \
--time=72:00:00 \
--mail-user=myemail \
--mail-type=BEGIN \
--mail-type=END \
--mail-type=FAIL \
--depend=afterany:9235177 \
-D `pwd` \
-J "canu_human_assembled_pacbio" \
-o /path/to/canu-scripts/canu.03.out /path/to
Submitted batch job 9235178
--
-- Finished on Wed Jan 6 20:50:32 2016 (0 seconds) with 583655.8 GB free disk space
and canu.03.out shows:
-- MERYL (correction)
--
-- meryl failed.
--
================================================================================
Don't panic, but a mostly harmless error occurred and canu failed.

canu failed with 'failed to generate mer counts. Made 2 attempts, jobs still failed'.
Thanks so much!
Make parallel ovlStore building the default and increase the number of sorting jobs (limited by the minimum of the open-files limit and the process limit). Auto-detect the memory required for the sort step and request that amount rather than a fixed limit.
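The two per-user limits mentioned can be queried directly; presumably the auto-detection would consult these:

```shell
# Per-user limits that cap how many sort jobs can run at once:
ulimit -n   # maximum open file descriptors
ulimit -u   # maximum user processes
```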
Hi,
a harmless error occurred:
canu failed with 'didn't find '.../unitigging/3-overlapErrorAdjustment/oea.files' to add to store, yet overlapper finished'. Full log was pasted to http://jpst.it/ESKP
I believe the problem is in the subroutine overlapErrorAdjustmentCheck, because there are no "FAILED" jobs or failed attempts to open "oea.files" for writing in the log file. It seems the @success job list was simply empty, and therefore no file was created.
Hello,
I am running Canu on a 16-processor server (181 GB RAM) with a 10 Mb yeast genome (one SMRT cell of data, 420 Mb). It has been stuck at the overlapInCore step for more than two days. A file (trimming/1-overlapper/overlap.000001.out) was written a couple of hours ago, and below is the STDOUT. I am wondering whether this duration is normal for this software with this amount of data on this machine. If so, even assembling a couple of good SMRT cells will be a very long process; with bacterial data I was able to complete assemblies in a few hours.
Thanks,
Dario
-- Meryl attempt 0 begins.
----------------------------------------
-- Starting concurrent execution on Wed Feb 10 10:57:40 2016 with 655.6 GB free disk space (1 processes; 4 concurrently)
/opt/Sma_Processing/Dario_Analysis/yeast_canu/test1/trimming/0-mercounts/meryl.sh 1 > /opt/Sma_Processing/Dario_Analysis/yeast_canu/test1/trimming/0-mercounts/meryl.000001.out 2>&1
-- Finished on Wed Feb 10 10:58:14 2016 (34 seconds) with 655.5 GB free disk space
----------------------------------------
-- Meryl finished successfully.
----------------------------------------
-- Starting command on Wed Feb 10 10:58:14 2016 with 655.5 GB free disk space
/opt/canu/Linux-amd64/bin/meryl \
-Dh \
-s /opt/Sma_Processing/Dario_Analysis/yeast_canu/test1/trimming/0-mercounts/Y11946_1.ms22 \
> /opt/Sma_Processing/Dario_Analysis/yeast_canu/test1/trimming/0-mercounts/Y11946_1.ms22.histogram \
2> /opt/Sma_Processing/Dario_Analysis/yeast_canu/test1/trimming/0-mercounts/Y11946_1.ms22.histogram.info
-- Finished on Wed Feb 10 10:58:14 2016 (lickety-split) with 655.5 GB free disk space
----------------------------------------
----------------------------------------
-- Starting command on Wed Feb 10 10:58:14 2016 with 655.5 GB free disk space
/opt/canu/Linux-amd64/bin/estimate-mer-threshold \
-m /opt/Sma_Processing/Dario_Analysis/yeast_canu/test1/trimming/0-mercounts/Y11946_1.ms22 \
> /opt/Sma_Processing/Dario_Analysis/yeast_canu/test1/trimming/0-mercounts/Y11946_1.ms22.estMerThresh.out \
2> /opt/Sma_Processing/Dario_Analysis/yeast_canu/test1/trimming/0-mercounts/Y11946_1.ms22.estMerThresh.err
-- Finished on Wed Feb 10 10:58:14 2016 (lickety-split) with 655.5 GB free disk space
----------------------------------------
----------------------------------------
-- Starting command on Wed Feb 10 10:58:14 2016 with 655.5 GB free disk space
/opt/canu/Linux-amd64/bin/meryl \
-Dt \
-n 1774 \
-s /opt/Sma_Processing/Dario_Analysis/yeast_canu/test1/trimming/0-mercounts/Y11946_1.ms22 \
> /opt/Sma_Processing/Dario_Analysis/yeast_canu/test1/trimming/0-mercounts/Y11946_1.ms22.frequentMers.fasta \
2> /opt/Sma_Processing/Dario_Analysis/yeast_canu/test1/trimming/0-mercounts/Y11946_1.ms22.frequentMers.fasta.err
-- Finished on Wed Feb 10 10:58:15 2016 (1 second) with 655.5 GB free disk space
----------------------------------------
-- Reset obtOvlMerThreshold from auto to 1774.
--
-- Found 246325943 22-mers; 29779099 distinct and 14354894 unique. Largest count 16314.
--
-- OVERLAPPER (normal) (trimming) erate=0.54
--
----------------------------------------
-- Starting command on Wed Feb 10 10:58:15 2016 with 655.6 GB free disk space
/opt/canu/Linux-amd64/bin/overlapInCorePartition \
-g /opt/Sma_Processing/Dario_Analysis/yeast_canu/test1/trimming/Y11946_1.gkpStore \
-bl 100000000 \
-bs 0 \
-rs 2000000 \
-rl 0 \
-ol 500 \
-o /opt/Sma_Processing/Dario_Analysis/yeast_canu/test1/trimming/1-overlapper/Y11946_1.partition \
> /opt/Sma_Processing/Dario_Analysis/yeast_canu/test1/trimming/1-overlapper/Y11946_1.partition.err 2>&1
-- Finished on Wed Feb 10 10:58:15 2016 (lickety-split) with 655.6 GB free disk space
----------------------------------------
--
-- Configured 3 overlapInCore jobs.
-- overlapInCore attempt 0 begins with 0 finished, and 3 to compute.
----------------------------------------
-- Starting concurrent execution on Wed Feb 10 10:58:15 2016 with 655.6 GB free disk space (3 processes; 2 concurrently)
/opt/Sma_Processing/Dario_Analysis/yeast_canu/test1/trimming/1-overlapper/overlap.sh 1 > /opt/Sma_Processing/Dario_Analysis/yeast_canu/test1/trimming/1-overlapper/overlap.000001.out 2>&1
/opt/Sma_Processing/Dario_Analysis/yeast_canu/test1/trimming/1-overlapper/overlap.sh 2 > /opt/Sma_Processing/Dario_Analysis/yeast_canu/test1/trimming/1-overlapper/overlap.000002.out 2>&1
On RHEL the canu master script fails:
$ ./Linux-amd64/bin/canu
-bash: ./Linux-amd64/bin/canu: perl: bad interpreter: No such file or directory
$ head -n 1 ./Linux-amd64/bin/canu
#!perl
$ perl -v
This is perl 5, version 16, subversion 3 (v5.16.3) built for x86_64-linux-thread-multi
I've never seen a hash-bang line with a bare exe name. Perhaps #!/usr/bin/env perl
would be more compatible?
PS. canu compiled cleanly in 20 seconds on my new server with make -j 72 (2 x 18 hyperthreaded cores)!
The Boost dependency is used only for an adjacency list holding the nodes, edges, and in/out edges of a graph. This could be implemented without Boost, simplifying the code and removing the large list of include files.
Currently, Canu looks for a parallel environment which has
allocation_rule $pe_slots
control_slaves FALSE
However, I think it will work with just $pe_slots and control_slaves set to TRUE. I suggest using the current requirements as a first pass, then relaxing the search to ignore control_slaves if no matching environment is found.
Killing meryl in local mode isn't being handled correctly. After the first kill, the attempt counter is somehow zero. After the second kill, canu progresses on to dumping the (nonexistent) counts.
-- MERYL (correction)
-- Meryl attempt 1 begins.
--
-- Starting concurrent execution on Thu Jan 7 03:03:57 2016 with 4462.4 GB free disk space (1 processes; 3 concurrently)
/work/canuassemblies/test/ecoli/correction/0-mercounts/meryl.sh 1 > /work/canuassemblies/test/ecoli/correction/0-mercounts/meryl.000001.out 2>&1
--
-- Finished on Thu Jan 7 03:03:59 2016 (2 seconds) with 4462.4 GB free disk space
-- Meryl attempt 0 begins.
--
-- Starting concurrent execution on Thu Jan 7 03:03:59 2016 with 4462.4 GB free disk space (1 processes; 3 concurrently)
/work/canuassemblies/test/ecoli/correction/0-mercounts/meryl.sh 1 > /work/canuassemblies/test/ecoli/correction/0-mercounts/meryl.000001.out 2>&1
--
-- Finished on Thu Jan 7 03:04:01 2016 (2 seconds) with 4462.4 GB free disk space
--
-- Starting command on Thu Jan 7 03:04:01 2016 with 4462.4 GB free disk space
--
/work/canu/FreeBSD-amd64/bin/meryl \
-Dh \
-s /work/canuassemblies/test/ecoli/correction/0-mercounts/asm.ms16 \
> /work/canuassemblies/test/ecoli/correction/0-mercounts/asm.ms16.histogram \
2> /work/canuassemblies/test/ecoli/correction/0-mercounts/asm.ms16.histogram.info
--
-- Finished on Thu Jan 7 03:04:01 2016 (0 seconds) with 4462.4 GB free disk space
ERROR: Failed with signal HUP (1)
================================================================================
Please panic. canu failed, and it shouldn't have.
Stack trace:
at /work/canu/FreeBSD-amd64/bin/lib/canu/Defaults.pm line 220.
canu::Defaults::caFailure("meryl histogram failed", "/work/canuassemblies/test/ecoli/correction/0-mercounts/asm.ms"...) called at /work/canu/FreeBSD-amd64/bin/lib/canu/Meryl.pm line 382
canu::Meryl::merylProcess("/work/canuassemblies/test/ecoli", "asm", "cor") called at /work/canu/FreeBSD-amd64/bin/canu line 410
Last few lines of the relevant log file (/work/canuassemblies/test/ecoli/correction/0-mercounts/asm.ms16.histogram.info):
merylStreamReader()-- ERROR: /work/canuassemblies/test/ecoli/correction/0-mercounts/asm.ms16.mcidx is not a merylStream index file!
merylStreamReader()-- ERROR: /work/canuassemblies/test/ecoli/correction/0-mercounts/asm.ms16.mcdat is not a merylStream data file!
Hi,
I haven't seen this error before. Any thoughts?
This error was in canu.03.out in a run that otherwise went smooth up until that point.
-- Starting command on Fri Feb 12 02:34:01 2016 with 134537 GB free disk space
/gpfs_home/jurban/software/canu/canu/Linux-amd64/bin/generateCorrectionLayouts \
-G /users/jurban/scratch/canu/g250-default/correction/g250-default.gkpStore \
-O /users/jurban/scratch/canu/g250-default/correction/g250-default.ovlStore \
-S /users/jurban/scratch/canu/g250-default/correction/2-correction/g250-default.globalScores \
-C 80 \
-p /users/jurban/scratch/canu/g250-default/correction/2-correction/g250-default.estimate
ERROR: bogus overlap ' 1986708 2010308 N 0 0 121096 0 4294805431 0.185900'
generateCorrectionLayouts: correction/generateCorrectionLayouts.C:92: tgTig* generateLayout(gkStore*, uint32*, uint32, double, double, ovOverlap*, uint32, FILE*): Assertion `ovlLength < (((uint32)1 << 21) - 1)' failed.
Failed with 'Aborted'
Backtrace (mangled):
/gpfs_home/jurban/software/canu/canu/Linux-amd64/bin/generateCorrectionLayouts(_Z17AS_UTL_catchCrashiP7siginfoPv+0x27)[0x42c3f7]
/lib64/libpthread.so.0(+0xf710)[0x7f9490818710]
/lib64/libc.so.6(gsignal+0x35)[0x7f94904a7925]
/lib64/libc.so.6(abort+0x175)[0x7f94904a9105]
/lib64/libc.so.6(+0x2ba4e)[0x7f94904a0a4e]
/lib64/libc.so.6(__assert_perror_fail+0x0)[0x7f94904a0b10]
/gpfs_home/jurban/software/canu/canu/Linux-amd64/bin/generateCorrectionLayouts[0x40272f]
/gpfs_home/jurban/software/canu/canu/Linux-amd64/bin/generateCorrectionLayouts[0x403c08]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7f9490493d1d]
/gpfs_home/jurban/software/canu/canu/Linux-amd64/bin/generateCorrectionLayouts[0x402299]
Backtrace (demangled):
[0] /gpfs_home/jurban/software/canu/canu/Linux-amd64/bin/generateCorrectionLayouts::AS_UTL_catchCrash(int, siginfo*, void*) + 0x27 [0x42c3f7]
[1] /lib64/libpthread.so.0::(null) + 0xf710 [0x7f9490818710]
[2] /lib64/libc.so.6::(null) + 0x35 [0x7f94904a7925]
[3] /lib64/libc.so.6::(null) + 0x175 [0x7f94904a9105]
[4] /lib64/libc.so.6::(null) + 0x2ba4e [0x7f94904a0a4e]
[5] /lib64/libc.so.6::(null) + 0 [0x7f94904a0b10]
[6] /gpfs_home/jurban/software/canu/canu/Linux-amd64/bin/generateCorrectionLayouts() [0x40272f]
[7] /gpfs_home/jurban/software/canu/canu/Linux-amd64/bin/generateCorrectionLayouts() [0x403c08]
[8] /lib64/libc.so.6::(null) + 0xfd [0x7f9490493d1d]
[9] /gpfs_home/jurban/software/canu/canu/Linux-amd64/bin/generateCorrectionLayouts() [0x402299]
GDB:
sh: line 5: 31461 Aborted /gpfs_home/jurban/software/canu/canu/Linux-amd64/bin/generateCorrectionLayouts -G /users/jurban/scratch/canu/g250-default/correction/g250-default.gkpStore -O /users/jurban/scratch/canu/g250-default/correction/g250-default.ovlStore -S /users/jurban/scratch/canu/g250-default/correction/2-correction/g250-default.globalScores -C 80 -p /users/jurban/scratch/canu/g250-default/correction/2-correction/g250-default.estimate
ERROR: Failed with signal ABRT (6)
Don't panic, but a mostly harmless error occurred and canu failed.
Disk space available: 134529.7 GB
canu failed with 'failed to generate estimated lengths of corrected reads'.
Just alerting you to some errors that occurred when trying to assemble insect data (~200-300 Mb genome).
Command used:
/path/to/canu \
-p sciara -d sciara \
genomeSize=210m \
errorRate=0.06 \
-pacbio-raw $R \
"gridOptions=--time 48:00:00" \
"minReadLength=500"
Compute grid uses SLURM.
The following files in the unitigging/1-overlapper stage seem to indicate that those jobs died or were not successfully completed (note there were 66 jobs/files for this step; just these 2 failed):
./unitigging/1-overlapper/overlap.10261437_4.out
slurmstepd: get_exit_code task 0 died by signal
./unitigging/1-overlapper/overlap.10261437_5.out
slurmstepd: get_exit_code task 0 died by signal
Later on there were errors in 5-consensus (3 out of ~70 jobs); all 3 of the following files had similar errors. Also note that Canu tried repeating them once or twice and failed with the same errors.
./unitigging/5-consensus/consensus.10298916_67.out
./unitigging/5-consensus/consensus.10298916_68.out
./unitigging/5-consensus/consensus.10298916_69.out
IncBaseCount c=0 '' r=4294967295 out of range.
utgcns: utgcns/libcns/abColumn.H:73: uint32 abBaseCount::IncBaseCount(char): Assertion `r <= 5' failed.
Failed with 'Aborted'
Backtrace (mangled):
/gpfs_home/jurban/software/canu/canu/Linux-amd64/bin/utgcns(_Z17AS_UTL_catchCrashiP7siginfoPv+0x27)[0x42b287]
/lib64/libpthread.so.0(+0xf710)[0x7f9e7685e710]
/lib64/libc.so.6(gsignal+0x35)[0x7f9e764ed925]
/lib64/libc.so.6(abort+0x175)[0x7f9e764ef105]
/lib64/libc.so.6(+0x2ba4e)[0x7f9e764e6a4e]
/lib64/libc.so.6(__assert_perror_fail+0x0)[0x7f9e764e6b10]
/gpfs_home/jurban/software/canu/canu/Linux-amd64/bin/utgcns[0x43a362]
/gpfs_home/jurban/software/canu/canu/Linux-amd64/bin/utgcns[0x43a772]
/gpfs_home/jurban/software/canu/canu/Linux-amd64/bin/utgcns[0x45f6c5]
/gpfs_home/jurban/software/canu/canu/Linux-amd64/bin/utgcns[0x43c949]
/gpfs_home/jurban/software/canu/canu/Linux-amd64/bin/utgcns[0x43fd35]
/gpfs_home/jurban/software/canu/canu/Linux-amd64/bin/utgcns[0x403fd8]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7f9e764d9d1d]
/gpfs_home/jurban/software/canu/canu/Linux-amd64/bin/utgcns[0x4027e9]
Backtrace (demangled):
[0] /gpfs_home/jurban/software/canu/canu/Linux-amd64/bin/utgcns::AS_UTL_catchCrash(int, siginfo*, void*) + 0x27 [0x42b287]
[1] /lib64/libpthread.so.0::(null) + 0xf710 [0x7f9e7685e710]
[2] /lib64/libc.so.6::(null) + 0x35 [0x7f9e764ed925]
[3] /lib64/libc.so.6::(null) + 0x175 [0x7f9e764ef105]
[4] /lib64/libc.so.6::(null) + 0x2ba4e [0x7f9e764e6a4e]
[5] /lib64/libc.so.6::(null) + 0 [0x7f9e764e6b10]
[6] /gpfs_home/jurban/software/canu/canu/Linux-amd64/bin/utgcns() [0x43a362]
[7] /gpfs_home/jurban/software/canu/canu/Linux-amd64/bin/utgcns() [0x43a772]
[8] /gpfs_home/jurban/software/canu/canu/Linux-amd64/bin/utgcns() [0x45f6c5]
[9] /gpfs_home/jurban/software/canu/canu/Linux-amd64/bin/utgcns() [0x43c949]
[10] /gpfs_home/jurban/software/canu/canu/Linux-amd64/bin/utgcns() [0x43fd35]
[11] /gpfs_home/jurban/software/canu/canu/Linux-amd64/bin/utgcns() [0x403fd8]
[12] /lib64/libc.so.6::(null) + 0xfd [0x7f9e764d9d1d]
[13] /gpfs_home/jurban/software/canu/canu/Linux-amd64/bin/utgcns() [0x4027e9]
GDB:
/var/spool/slurmd/job10298916/slurm_script: line 48: 2108 Aborted $bin/utgcns -G /users/jurban/scratch/canu/t1/sciara/unitigging/sciara.gkpStore -T /users/jurban/scratch/canu/t1/sciara/unitigging/sciara.tigStore 1 $jobid -O /users/jurban/scratch/canu/t1/sciara/unitigging/5-consensus/$jobid.cns.WORKING -L /users/jurban/scratch/canu/t1/sciara/unitigging/5-consensus/$jobid.layout.WORKING -F /users/jurban/scratch/canu/t1/sciara/unitigging/5-consensus/$jobid.fastq.WORKING -maxcoverage 2
Finally, another file during the 5-consensus stage took a lot longer than all the other jobs (by >>24 hours) and exceeded the 48-hour time limit (there was one other outlier as well that just made it). Now Canu seems to be resubmitting this job repeatedly; right now there are 2 attempts running, already ~24 hours into the 48 hours. I don't expect they will finish… so perhaps Canu will be stuck here re-generating those jobs until I abort it?
./unitigging/5-consensus/consensus.10290488_57.out
slurmstepd: *** JOB 10290546 CANCELLED AT 2015-12-12T00:51:22 DUE TO TIME LIMIT on node461 ***
Overall, I was wondering if the failures in the earlier 1-overlapping stage affect later stages…? If so, maybe Canu should check that all finished successfully before moving on.
For grids with 'shell_start_mode' and/or 'shell' (qconf -sq all.q) set incorrectly (whatever that means), we need to somehow make our shell scripts use something other than csh (qsub -S).
If unix_behavior, it uses the #! line.
If posix_compliant, it uses the 'qsub -S' or 'qconf -sq' 'shell' setting.
The ideal behavior would be to correct only the raw reads and then combine them with the corrected inputs for trimming/assembly. Alternatively, we can print an error message if users mix these types.
Hi,
I'm having a curious and annoying problem where the .dat files all stay empty while my FASTA files are around 700 MB in size. Despite this, the pipeline continues, as it checks for the existence of the .dat files rather than their size.
For the moment, however, I have no idea why they do not end up with the suffix FAILED; the out log file of precompute.sh is not telling me anything. It just ends with the phrase:
Time (s) to read filter file: 0.06630411900000001
Processing FASTA files for binary compression...
There is nothing after that, and nothing of the rest of the shell script is executed. So the job as a whole fails instantly when the mhap precompute is started (since no reads are read, and the .dat files are size 0). I've tried running the exact commands for a single file by copying them, and they work perfectly, so I'll have to look into why the job fails. I hardly think it's a memory issue, since then I would expect them to die after loading at least some data.
But in any case, the failure check should ideally be extended to consider the size of the .dat files.
Hi,
We use LSF as our grid engine (I realise that support for LSF is untested, of course). Our systems team has set a requirement that, when setting memory limits, in addition to "-M MEMORY" we also need to set a couple of -R options (-R 'select[mem>MEMORY] rusage[mem=MEMORY]'). So, for example, if a job requests 6144 MB of memory, we need to set the options:
-M 6144 -R 'select[mem>6144] rusage[mem=6144]'
If the -R option is not present, or if the memory values for -R differ from those set with -M, a pre-check on all LSF submissions causes the job to fail (actually, it isn't submitted at all).
What changes would we need to make to include this in our checkout of canu? I had, maybe naively, assumed that changing the line "setGlobalIfUndef("gridEngineMemoryOption", "-M MEMORY");" in Grid_LSF.pm might be the only requirement but, unless I'm missing something, other changes are required?
We can obviously set 'useGrid=0' and then bsub the job, thus running outside of canu's grid setup but still under LSF, but this seems like an inefficient use of resources outside of testing. (I did this earlier today and it worked once I'd tweaked the required number of threads.)
Any help would be much appreciated!
Thanks in advance,
Martin
Congrats on Canu V1.0.
I last downloaded Canu a little over a week ago, just before pbdagcon was made the default consensus module. I had never had problems installing Canu prior to today, which in my opinion was a major benefit of using it.
Today I tried to install the latest commit and got the following error message:
$ git clone https://github.com/marbl/canu.git
$ cd canu/src
$ make
In file included from utgcns/libpbutgcns/Alignment.C:8:
utgcns/libpbutgcns/Alignment.H:17: error: ‘uint32_t’ does not name a type
utgcns/libpbutgcns/Alignment.H:19: error: ‘uint32_t’ does not name a type
utgcns/libpbutgcns/Alignment.H:33: error: ‘uint32_t’ does not name a type
utgcns/libpbutgcns/Alignment.C: In constructor ‘dagcon::Alignment::Alignment()’:
utgcns/libpbutgcns/Alignment.C:13: error: class ‘dagcon::Alignment’ does not have any field named ‘start’
utgcns/libpbutgcns/Alignment.C:14: error: class ‘dagcon::Alignment’ does not have any field named ‘end’
utgcns/libpbutgcns/Alignment.C: In function ‘dagcon::Alignment normalizeGaps(dagcon::Alignment&)’:
utgcns/libpbutgcns/Alignment.C:79: error: ‘class dagcon::Alignment’ has no member named ‘start’
utgcns/libpbutgcns/Alignment.C:79: error: ‘class dagcon::Alignment’ has no member named ‘start’
make: *** [../Linux-amd64/obj/libcanu.a/utgcns/libpbutgcns/Alignment.o] Error 1
Any ideas on how to successfully install Canu?
After unitigs are created, the reads in the gkpStore are split into multiple files for direct loading by consensus. If unitigs are regenerated after this occurs, the old gkpStore partitioning is supposed to be removed and rebuilt.
Hi,
I was wondering if you have a recommended way to extract fastq/fasta information from PacBio's bas/bax files, and if you recommend filtering of any sort (e.g. based on quality) or just using all subreads.
The way I have extracted fastq files is using bash5tools from pbh5tools (to get all subreads, no length or quality cutoffs):
$ bash5tools.py --readtype subread file.bas.h5
When I originally obtained the PacBio bas/bax dataset, they also provided "Filtered subreads" already extracted. These were filtered to be >= 500 bp and by some quality threshold. I unfortunately do not know exactly what was done to filter. Nonetheless, it is a subset of all reads.
I have tried Canu with both sets of reads described above (all and filtered), with various Canu parameters and with minReadLength = 500 or 1000. I get mixed results, with a trend where using all reads gives higher maximum contig lengths and the filtered set gives slightly higher N50 values (using the expected genome size to compare all runs). Ultimately, though, they do not differ drastically. So I am inclined to go with "all subreads", just because I know how that set of reads was obtained.
All in all, I am interested in your standard practice for extracting reads from the bas/bax files, i.e. what is your standard set of input reads to Canu? All? Filtered? How?
Many many thanks,
John
I'm assembling a pool of BACs and I encountered the error below. I've tracked the offending line down to stores/ovStoreBuild.C:137, but have yet to fix it. I have pulled the latest commit, which fixes another double-free issue, but this one persists.
overlap counts for 1024 reads from '/net/eichler/vol20/projects/whole_genome_assembly/BACs/assemblies/minRead8kb-minOvl6kb//trimming/1-overlapper/001/000001.counts'.
*** glibc detected *** /net/eichler/vol5/home/chrismh/src/canu-repo/Linux-amd64/bin/ovStoreBuild: double free or corruption (out): 0x0000000001481e70 ***
======= Backtrace: =========
/lib64/libc.so.6[0x327b275f4e]
/lib64/libc.so.6[0x327b278cf0]
/net/eichler/vol5/home/chrismh/src/canu-repo/Linux-amd64/bin/ovStoreBuild[0x4032c1]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x327b21ed5d]
/net/eichler/vol5/home/chrismh/src/canu-repo/Linux-amd64/bin/ovStoreBuild[0x401c99]
======= Memory map: ========
00400000-00419000 r-xp 00000000 00:1c 339428650 /net/eichler/vol5/home/chrismh/src/canu-repo/Linux-amd64/bin/ovStoreBuild
00618000-00619000 rw-p 00018000 00:1c 339428650 /net/eichler/vol5/home/chrismh/src/canu-repo/Linux-amd64/bin/ovStoreBuild
0147a000-0149b000 rw-p 00000000 00:00 0 [heap]
327ae00000-327ae20000 r-xp 00000000 fd:00 786676 /lib64/ld-2.12.so
327b01f000-327b020000 r--p 0001f000 fd:00 786676 /lib64/ld-2.12.so
327b020000-327b021000 rw-p 00020000 fd:00 786676 /lib64/ld-2.12.so
327b021000-327b022000 rw-p 00000000 00:00 0
327b200000-327b38a000 r-xp 00000000 fd:00 786678 /lib64/libc-2.12.so
327b38a000-327b58a000 ---p 0018a000 fd:00 786678 /lib64/libc-2.12.so
...
Hi,
There is no real emergency here - just documenting some areas where Canu fails in case it is of interest to you.
I have been running Canu a lot (with Slurm), and it almost never makes it all the way through by itself, because it asks for too little memory at some stages (i.e. it needs manual intervention). I just thought I'd create a thread to document the stages where this happens.
Recently I cleaned out a lot of runs without noting where this happened, but I think it was mostly or entirely confined to the unitigging stages. For the runs going right now, the 3-overlapErrorAdjustment stage is the offender:
SLURM PROLOG ###############################################################
Job ID : 10517491
Job Name : oea_sciara_g310m-t1[1-7]
Nodelist : node024
CPUs :
Mem/Node : 4096 MB
Directory : /gpfs/scratch/jurban/canu/g310m/t1/sciara/unitigging/3-overlapErrorAdjustment
Started : Mon Dec 21 04:43:44 EST 2015
Initializing.
Opening gkpStore '/users/jurban/scratch/canu/g310m/t1/sciara/unitigging/sciara.gkpStore'.
Correcting reads 1 to 180871.
Reading 7334680 corrections from '/users/jurban/scratch/canu/g310m/t1/sciara/unitigging/3-overlapErrorAdjustment/red.red'.
Correcting 1183528474 bases with 5345712 indel adjustments.
Corrected 1182709611 bases with 124894 substitutions, 818971 deletions and 108 insertions.
Read_Olaps()-- Loading 34932437 overlaps from '/users/jurban/scratch/canu/g310m/t1/sciara/unitigging/sciara.ovlStore' for reads 1 to 180871
Read_Olaps()-- Loaded 34932437 overlaps -- 17613972 normal and 17318465 innie.
slurmstepd: Job 10517491 exceeded memory limit (4194412 > 4194304), being killed
slurmstepd: Exceeded job memory limit
slurmstepd: *** JOB 10517491 CANCELLED AT 2015-12-21T07:00:44 *** on node024
It is now on the second Canu iteration, and the OEA jobs were launched with the same parameters, so they will all undoubtedly fail by exceeding the memory limit again. I will just manually cancel all the jobs and relaunch them with more memory.
Anyway, if this happens at other stages in these or future runs, I will let you know about them here. Thank you for the awesome tool. Enjoy your holidays!
best,
John
Dear Canu developers,
I would like to test various options of Bogart to eventually get higher continuity of sequences.
When looking at bogart as run by default in the canu pipeline, I see the -repeatdetect option with three integers as parameters.
Could you tell me more about this option and how it affects unitig building?
Thank you,
Michel
Hello,
I am trying to assemble a heterozygous plant genome of ~1 Gb, but it fails after the first MHAP step, while building the overlap store. I have around 60x of PacBio coverage with an N50 subread length of around 15 kb.
..................... after bucketizing 265 ovb files:
bucketizing .....asm/correction/1-overlapper/results/000266.ovb.gz
overlap fate:
929294 SAVE - overlaps output (for unitigging)
924704 SAVE - overlaps output (for OBT)
929294 SAVE - overlaps output (for dedupe)
0 ERATE - low quality, more than 0.409 fraction error
0 OBT - not requested
4524 OBT - too similar
66 OBT - too short
0 DUP - dedupe not requested
0 DUP - different library
0 DUP - obviously not duplicates
bucketizing ....asm/correction/1-overlapper/results/000267.ovb.gz
overlap fate:
95720 SAVE - overlaps output (for unitigging)
94966 SAVE - overlaps output (for OBT)
95720 SAVE - overlaps output (for dedupe)
0 ERATE - low quality, more than 0.409 fraction error
0 OBT - not requested
752 OBT - too similar
2 OBT - too short
0 DUP - dedupe not requested
0 DUP - different library
0 DUP - obviously not duplicates
bucketizing DONE!
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Failed with 'Aborted'
Backtrace (mangled):
/home/smrtanalysis/bin/canu/Linux-amd64/bin/ovStoreBuild(_Z17AS_UTL_catchCrashiP7siginfoPv+0x27)[0x405e87]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10340)[0x7fbffefd5340]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x39)[0x7fbffec36cc9]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7fbffec3a0d8]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x155)[0x7fbfff96e535]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x5e6d6)[0x7fbfff96c6d6]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x5e703)[0x7fbfff96c703]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x5e922)[0x7fbfff96c922]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(_Znwm+0x7d)[0x7fbfff96ce0d]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(_Znam+0x9)[0x7fbfff96cea9]
/home/smrtanalysis/bin/canu/Linux-amd64/bin/ovStoreBuild[0x403025]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7fbffec21ec5]
/home/smrtanalysis/bin/canu/Linux-amd64/bin/ovStoreBuild[0x401a39]
Backtrace (demangled):
[0] /home/smrtanalysis/bin/canu/Linux-amd64/bin/ovStoreBuild::AS_UTL_catchCrash(int, siginfo*, void*) + 0x27 [0x405e87]
[1] /lib/x86_64-linux-gnu/libpthread.so.0::(null) + 0x10340 [0x7fbffefd5340]
[2] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x39 [0x7fbffec36cc9]
[3] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x148 [0x7fbffec3a0d8]
[4] /usr/lib/x86_64-linux-gnu/libstdc++.so.6::__gnu_cxx::__verbose_terminate_handler() + 0x155 [0x7fbfff96e535]
[5] /usr/lib/x86_64-linux-gnu/libstdc++.so.6::(null) + 0x5e6d6 [0x7fbfff96c6d6]
[6] /usr/lib/x86_64-linux-gnu/libstdc++.so.6::(null) + 0x5e703 [0x7fbfff96c703]
[7] /usr/lib/x86_64-linux-gnu/libstdc++.so.6::(null) + 0x5e922 [0x7fbfff96c922]
[8] /usr/lib/x86_64-linux-gnu/libstdc++.so.6::operator new(unsigned long) + 0x7d [0x7fbfff96ce0d]
[9] /usr/lib/x86_64-linux-gnu/libstdc++.so.6::operator new[](unsigned long) + 0x9 [0x7fbfff96cea9]
[10] /home/smrtanalysis/bin/canu/Linux-amd64/bin/ovStoreBuild() [0x403025]
[11] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0xf5 [0x7fbffec21ec5]
[12] /home/smrtanalysis/bin/canu/Linux-amd64/bin/ovStoreBuild() [0x401a39]
I tried this on several Linux computers, also varying the amount of memory, but it always crashes at the end.
The command I run:
bin/canu/Linux-amd64/bin/ovStoreBuild -O ....asm/correction/taraxicum.ovlStore.BUILDING -G ......asm/correction/taraxicum.gkpStore -M 128 -L ..../asm/correction/1-overlapper/ovljob.files
The files used in this process (I think):
total 222592898
drwxrwxr-x 2 smrtanalysis smrtanalysis 536 Feb 25 02:13 ./
drwxrwxr-x 5 smrtanalysis smrtanalysis 388 Feb 25 02:13 ../
-rw-rw-r-- 1 smrtanalysis smrtanalysis 2189245844 Feb 24 17:51 000001.mhap
-rw-rw-r-- 1 smrtanalysis smrtanalysis 573999519 Feb 24 17:53 000001.ovb.gz
-rw-rw-r-- 1 smrtanalysis smrtanalysis 1784843387 Feb 24 17:44 000002.mhap
-rw-rw-r-- 1 smrtanalysis smrtanalysis 466818526 Feb 24 17:45 000002.ovb.gz
-rw-rw-r-- 1 smrtanalysis smrtanalysis 1596437326 Feb 24 17:42 000003.mhap
-rw-rw-r-- 1 smrtanalysis smrtanalysis 417161921 Feb 24 17:43 000003.ovb.gz
................
-rw-rw-r-- 1 smrtanalysis smrtanalysis 22624024 Feb 25 01:58 000264.ovb.gz
-rw-rw-r-- 1 smrtanalysis smrtanalysis 65975285 Feb 25 02:01 000265.mhap
-rw-rw-r-- 1 smrtanalysis smrtanalysis 17650554 Feb 25 02:01 000265.ovb.gz
-rw-rw-r-- 1 smrtanalysis smrtanalysis 27865363 Feb 25 02:00 000266.mhap
-rw-rw-r-- 1 smrtanalysis smrtanalysis 7423759 Feb 25 02:00 000266.ovb.gz
-rw-rw-r-- 1 smrtanalysis smrtanalysis 2833069 Feb 25 01:59 000267.mhap
-rw-rw-r-- 1 smrtanalysis smrtanalysis 767456 Feb 25 01:59 000267.ovb.gz
Does somebody have any clue on this?
Hello,
I am trying to run Canu on our PBS system; however, I got the error below.
Do you have any idea how to fix it?
Thanks!
___________________________________________________________________________
-- Detected 20 CPUs and 63 gigabytes of memory.
-- Detected Java(TM) Runtime Environment '1.8.0_60' (from 'java').
-- Detected PBS/Torque with 'pbsnodes' binary in /usr/local/bin/pbsnodes.
socket_connect_unix failed: 15137
pbsnodes: cannot connect to server melon, error=15137 (could not connect to trqauthd)
--
maxMemory 1048576 maxThreads 1024
maxMemory 1048576 maxThreads 1024
maxMemory 1048576 maxThreads 1024
maxMemory 1048576 maxThreads 1024
maxMemory 1048576 maxThreads 1024
maxMemory 1048576 maxThreads 1024
maxMemory 1048576 maxThreads 1024
maxMemory 1048576 maxThreads 1024
maxMemory 1048576 maxThreads 1024
maxMemory 1048576 maxThreads 1024
maxMemory 1048576 maxThreads 1024
maxMemory 1048576 maxThreads 1024
maxMemory 1048576 maxThreads 1024
maxMemory 1048576 maxThreads 1024
maxMemory 1048576 maxThreads 1024
--
-- Allowed to run under grid control, and use up to 4 compute threads and 16 GB memory for stage 'bogart (unitigger)'.
-- Allowed to run under grid control, and use up to 16 compute threads and 6 GB memory for stage 'mhap (overlapper)'.
-- Allowed to run under grid control, and use up to 16 compute threads and 6 GB memory for stage 'mhap (overlapper)'.
-- Allowed to run under grid control, and use up to 16 compute threads and 6 GB memory for stage 'mhap (overlapper)'.
-- Allowed to run under grid control, and use up to 4 compute threads and 8 GB memory for stage 'read error detection (overlap error adjustment)'.
-- Allowed to run under grid control, and use up to 1 compute thread and 2 GB memory for stage 'overlap error adjustment'.
-- Allowed to run under grid control, and use up to 4 compute threads and 32 GB memory for stage 'utgcns (consensus'.
-- Allowed to run under grid control, and use up to 1 compute thread and 4 GB memory for stage 'overlap store sequential building'.
-- Allowed to run under grid control, and use up to 1 compute thread and 4 GB memory for stage 'overlap store parallel bucketizer'.
-- Allowed to run under grid control, and use up to 1 compute thread and 16 GB memory for stage 'overlap store parallel sorting'.
-- Allowed to run under grid control, and use up to 1 compute thread and 6 GB memory for stage 'overlapper'.
-- Allowed to run under grid control, and use up to 8 compute threads and 8 GB memory for stage 'overlapper'.
-- Allowed to run under grid control, and use up to 8 compute threads and 8 GB memory for stage 'overlapper'.
-- Allowed to run under grid control, and use up to 4 compute threads and 8 GB memory for stage 'meryl (k-mer counting)'.
-- Allowed to run under grid control, and use up to 4 compute threads and 16 GB memory for stage 'falcon_sense (read correction)'.
--
-- This is canu parallel iteration #1, out of a maximum of 2 attempts.
--
-- Final error rates before starting pipeline:
--
-- genomeSize -- 4800000
-- errorRate -- 0.025
--
-- corOvlErrorRate -- 0.075
-- obtOvlErrorRate -- 0.075
-- utgOvlErrorRate -- 0.075
--
-- obtErrorRate -- 0.075
--
-- utgGraphErrorRate -- 0.05
-- utgBubbleErrorRate -- 0.0625
-- utgMergeErrorRate -- 0.0375
-- utgRepeatErrorRate -- 0.05
--
-- corErrorRate -- 0.30
-- cnsErrorRate -- 0.0625
--
--
-- BEGIN CORRECTION
--
--
-- GATEKEEPER (correction)
--
--
-- Starting command on Fri Feb 26 14:53:59 2016 with 908.7 GB free disk space
--
/share/workplace/home/zhangxt/software/canu-1.0/Linux-amd64/bin/gatekeeperCreate \
-minlength 1000 \
-o /share/bioinfo/zhangxt/test/Canu_test/ecoli-auto/correction/ecoli.gkpStore.BUILDING \
/share/bioinfo/zhangxt/test/Canu_test/ecoli-auto/correction/ecoli.gkpStore.gkp \
> /share/bioinfo/zhangxt/test/Canu_test/ecoli-auto/correction/ecoli.gkpStore.err 2>&1
--
-- Finished on Fri Feb 26 14:55:30 2016 (91 seconds) with 908.1 GB free disk space
gnuplot < /share/bioinfo/zhangxt/test/Canu_test/ecoli-auto/correction/ecoli.gkpStore/readlengths.gp \
> /dev/null 2>&1
ERROR: Failed with signal 127
--
-- In gatekeeper store '/share/bioinfo/zhangxt/test/Canu_test/ecoli-auto/correction/ecoli.gkpStore':
-- Found 12528 reads.
-- Found 115899341 bases (24.14 times coverage).
--
-- Read length histogram (one '*' equals 20.62 reads):
-- 0 999 0
-- 1000 1999 1444 **********************************************************************
-- 2000 2999 1328 ****************************************************************
-- 3000 3999 1065 ***************************************************
-- 4000 4999 774 *************************************
-- 5000 5999 668 ********************************
-- 6000 6999 619 ******************************
-- 7000 7999 618 *****************************
-- 8000 8999 607 *****************************
-- 9000 9999 560 ***************************
-- 10000 10999 523 *************************
-- 11000 11999 478 ***********************
-- 12000 12999 429 ********************
-- 13000 13999 379 ******************
-- 14000 14999 366 *****************
-- 15000 15999 353 *****************
-- 16000 16999 329 ***************
-- 17000 17999 297 **************
-- 18000 18999 294 **************
-- 19000 19999 283 *************
-- 20000 20999 251 ************
-- 21000 21999 195 *********
-- 22000 22999 152 *******
-- 23000 23999 132 ******
-- 24000 24999 75 ***
-- 25000 25999 66 ***
-- 26000 26999 56 **
-- 27000 27999 44 **
-- 28000 28999 35 *
-- 29000 29999 16
-- 30000 30999 21 *
-- 31000 31999 18
-- 32000 32999 11
-- 33000 33999 8
-- 34000 34999 6
-- 35000 35999 6
-- 36000 36999 10
-- 37000 37999 2
-- 38000 38999 3
-- 39000 39999 2
-- 40000 40999 2
-- 41000 41999 2
-- 42000 42999 1
-- MERYL (correction)
-- Meryl attempt 1 begins.
--
-- Starting command on Fri Feb 26 14:55:44 2016 with 907.9 GB free disk space
--
qsub \
-l mem=8g -l nodes=1:ppn=4 \
-d `pwd` -N "meryl_ecoli" \
-t 1-1 \
-j oe -o /share/bioinfo/zhangxt/test/Canu_test/ecoli-auto/correction/0-mercounts/meryl.\$PBS_ARRAYID.out \
/share/bioinfo/zhangxt/test/Canu_test/ecoli-auto/correction/0-mercounts/meryl.sh
socket_connect_unix failed: 15137
qsub: cannot connect to server (null) (errno=15137) could not connect to trqauthd
--
-- Finished on Fri Feb 26 14:55:46 2016 (2 seconds) with 907.9 GB free disk space
ERROR: Failed with signal NUM33 (33)
================================================================================
Please panic. canu failed, and it shouldn't have.
Stack trace:
at /share/workplace/home/zhangxt/software/canu-1.0/Linux-amd64/bin/lib/canu/Defaults.pm line 220
canu::Defaults::caFailure('Failed to submit batch jobs', undef) called at /share/workplace/home/zhangxt/software/canu-1.0/Linux-amd64/bin/lib/canu/Execution.pm line 1125
canu::Execution::submitOrRunParallelJob('/share/bioinfo/zhangxt/test/Canu_test/ecoli-auto', 'ecoli', 'meryl', '/share/bioinfo/zhangxt/test/Canu_test/ecoli-auto/correction/0...', 'meryl', 1) called at /share/workplace/home/zhangxt/software/canu-1.0/Linux-amd64/bin/lib/canu/Meryl.pm line 333
canu::Meryl::merylCheck('/share/bioinfo/zhangxt/test/Canu_test/ecoli-auto', 'ecoli', 'cor') called at /share/workplace/home/zhangxt/software/canu-1.0/Linux-amd64/bin/canu line 402
canu failed with 'Failed to submit batch jobs'.
When running locally, the best.* files from unitigging are written to the folder where Canu was launched, not to the unitigging/4-unitigger folder as they should be. This does not happen for grid runs.
The program was executed like this:
/home/ubuntu/canu/Linux-amd64/bin/canu -p canu -d canu genomeSize=3.4m errorRate=0.02 -pacbio-raw pacbio.c50.fq
Note the ERROR message after each task ends.
----------------------------------------BEGIN CORRECTION
----------------------------------------SPACE 1860.9 GB
----------------------------------------START Thu Oct 22 10:49:29 2015
/home/ubuntu/canu/Linux-amd64/bin/gatekeeperCreate \
-minlength 1000 \
-o /mnt/wgs/canu/correction/canu.gkpStore.BUILDING \
/mnt/wgs/canu/correction/canu.gkpStore.gkp \
> /mnt/wgs/canu/correction/canu.gkpStore.err 2>&1
----------------------------------------END Thu Oct 22 10:49:32 2015 (3 seconds)
ERROR: Failed with signal 127
-- In gatekeeper store `/mnt/wgs/canu/correction/canu.gkpStore`:
-- Found 15953 reads.
-- Found 189313441 bases (55.68 times coverage).
--
-- Read length histogram (one '*' equals 13.17 reads):
-- 0 999 0
-- 1000 1999 739 ********************************************************
-- 2000 2999 778 ***********************************************************
-- 3000 3999 801 ************************************************************
-- 4000 4999 922 **********************************************************************
-- 5000 5999 851 ****************************************************************
-- 6000 6999 844 ****************************************************************
-- 7000 7999 857 *****************************************************************
-- 8000 8999 821 **************************************************************
-- 9000 9999 859 *****************************************************************
-- 10000 10999 791 ************************************************************
-- 11000 11999 754 *********************************************************
-- 12000 12999 733 *******************************************************
-- 13000 13999 708 *****************************************************
-- 14000 14999 657 *************************************************
-- 15000 15999 571 *******************************************
-- 16000 16999 495 *************************************
-- 17000 17999 520 ***************************************
-- 18000 18999 433 ********************************
-- 19000 19999 435 *********************************
-- 20000 20999 345 **************************
-- 21000 21999 324 ************************
-- 22000 22999 293 **********************
-- 23000 23999 238 ******************
-- 24000 24999 225 *****************
-- 25000 25999 178 *************
-- 26000 26999 171 ************
-- 27000 27999 126 *********
-- 28000 28999 86 ******
-- 29000 29999 90 ******
-- 30000 30999 47 ***
-- 31000 31999 58 ****
-- 32000 32999 51 ***
-- 33000 33999 45 ***
-- 34000 34999 36 **
-- 35000 35999 18 *
-- 36000 36999 13
-- 37000 37999 9
-- 38000 38999 7
-- 39000 39999 3
-- 40000 40999 4
-- 41000 41999 5
-- 42000 42999 2
-- 43000 43999 3
-- 44000 44999 3
-- 45000 45999 1
-- 46000 46999 2
-- 47000 47999 1
----------------------------------------SPACE 1860.6 GB
----------------------------------------START Thu Oct 22 10:49:32 2015
/home/ubuntu/canu/Linux-amd64/bin/meryl \
-B -C -L 2 -v -m 16 -threads 16 -memory 19456 \
-s /mnt/wgs/canu/correction/canu.gkpStore \
-o /mnt/wgs/canu/correction/0-mercounts/canu.ms16 \
> /mnt/wgs/canu/correction/0-mercounts/meryl.err 2>&1
----------------------------------------END Thu Oct 22 10:49:59 2015 (27 seconds)
----------------------------------------SPACE 1860 GB
----------------------------------------START Thu Oct 22 10:49:59 2015
/home/ubuntu/canu/Linux-amd64/bin/meryl -Dh -s /mnt/wgs/canu/correction/0-mercounts/canu.ms16 > /mnt/wgs/canu/correction/0-mercounts/canu.ms16.histogram 2> /mnt/wgs/canu/correction/0-mercounts/canu.ms16.histogram.info
----------------------------------------END Thu Oct 22 10:49:59 2015 (0 seconds)
ERROR: Failed with signal 127
- Found 189074146 16-mers; 135404446 distinct and 111639232 unique. Largest count 22700.
For 2 blocks, set stride to 2 blocks.
JOB 1 BLOCK 1 vs (self)
-- Computed seed length 11381 from desired output coverage 40 and genome size 3400000
-- Configured 1 mhap precompute jobs.
-- Configured 1 mhap overlap jobs.
mhapPrecomputeCheck() -- attempt 1 begins with 0 finished, and 1 to compute.
----------------------------------------SPACE 1860.6 GB
----------------------------------------START CONCURRENT Thu Oct 22 10:50:01 2015 (1 processes; 2 concurrently)
/mnt/wgs/canu/correction/1-overlapper/precompute.sh 1 > /mnt/wgs/canu/correction/1-overlapper/precompute.000001.out 2>&1
When reducing coverage for read correction and unitig consensus, the longer contained reads are discarded before the shorter contained reads. This is backwards.
It seems the HTML report is not completely generated; for instance:
<h2>Trimming</h2>
<h2>Trimmed Reads</h2>
remain empty even after the trimming process has completed.
Especially for things that read the whole overlap store, some indication of percent complete would be nice.
I ran an assembly with some long reads in excess of 65,535 bp. Interestingly, after correction, the read lengths in the trimming and unitigging stages (i.e. in asm.gkpStore/readlengths.txt) reached exactly 65535, and that value was repeated 11 times. What makes this more curious is that the read length histogram shows the >10 bins before "65000 - 65999" all have <= 5 counts, then the count jumps to 11 (all of which are the same exact length of 65535):
33000 33999 50
34000 34999 78
35000 35999 41
36000 36999 37
37000 37999 40
38000 38999 39
39000 39999 23
40000 40999 23
41000 41999 35
42000 42999 20
43000 43999 18
44000 44999 11
45000 45999 11
46000 46999 10
47000 47999 10
48000 48999 4
49000 49999 3
50000 50999 7
51000 51999 9
52000 52999 6
53000 53999 6
54000 54999 3
55000 55999 2
56000 56999 2
57000 57999 1
58000 58999 4
59000 59999 3
60000 60999 5
61000 61999 1
62000 62999 1
63000 63999 3
64000 64999 1
65000 65999 11
All in all, I am just wondering whether there is a cap on read size (even if unintended) imposed during or after correction, perhaps at the beginning of the trimming stages.
Hello,
I am running Canu on a small dataset of bacterial sequences (~22k short reads, from a custom library prep) using this command: canu -d /opt/Sma_Processing/Dario_Analysis/wasp_corinne -p canu_test1 genomeSize=950m minReadLength=200 minOverlapLength=100 errorRate=0.15 -pacbio-raw 017481-data-aligned_reads.fq
It crashes with this error:
Don't panic, but a mostly harmless error occurred and canu failed.
canu failed with 'can't find '/opt/Sma_Processing/Dario_Analysis/wasp_corinne/unitigging/5-consensus/cnsjob.files' for loading tigs into store: No such file or directory'.
Earlier we had issues with Gatekeeper (which now seem to be fixed): could this still be related to it?
Thanks,
Dario
Since we have a C++ driver for falcon_sense, we can have it read binary stores directly rather than dumping an intermediate text format.
Canu currently doesn't have a way to report its version; add a -v option to the main script.
Hello,
We are having issues on running Canu on our cluster, the error says:
canu failed with 'can't configure for SGE'.
The command is: /opt/canu/Linux-amd64/bin/canu -d /home/../250k_assembly -p test250k genomeSize=380m corMinCoverage=2 errorRate=0.18 -pacbio-raw input_subreads.fa
and we have a PSSC cluster with 4 nodes, each with 12 cores and 32 GB RAM, for a total of 48 cores and 128 GB RAM. The SGE version is GE 6.2u5p3, running on CentOS release 6.5 (Final).
Following the documentation, we tested all 6x2 options for the Grid Engine configuration (here is one example):
$global{"gridEngineThreadsOption"} = "-pe make THREADS";
$global{"gridEngineMemoryOption"}  = "-l h_vmem=MEMORY";
but always got the same error.
How do we set up the assembler to run on our cluster?
Will we then be able to modulate the resources (cores, cpus, memory), so that we can have other processes running at the same time?
Thanks
Hello, I was eager to try this fork.
Unfortunately, the dependencies are less clear. For the Java version the message was pretty clear, but now I seem to lack a falcon program.
I currently have the Falcon assembler from Jason here, but it seems not to be sufficient. What exactly is this dependency?
Thanks
With very large overlap stores (>20 TB), overlap store stats and generateCorrectionLayouts crash with an assertion:
generateCorrectionLayouts: stores/ovStore.C:379: uint32 ovStore::readOverlaps(ovOverlap*, uint32, bool): Assertion `_offt._numOlaps <= maxOverlaps' failed.
(From issue #19)
Break the consensus outputs into three categories:
Filtering is done (consensusFilter() in Consensus.pm) and should be recorded in tigStore, but tgStoreDump (outputSequence() in Output.pm) doesn't know how to obey the markings.