Comments (7)
I tried sim with a different set of controls, this time providing the directory that contains the actual fast5 files, but still got the same error:
/var/spool/slurmd/job10576865/slurm_script: line 34: 22175 Segmentation fault python UNCALLED/scripts/uncalled sim $bwa_prefix $path_ctl_fast5s --ctl-seqsum $path_ctl_seqsum --unc-seqsum $path_unc_seqsum --unc-paf $path_unc_paf -t 16 --enrich -c 3 --sim-speed 0.25 > uncalled_out.paf 2> uncalled_err.txt
Below is the script I ran:
bwa_prefix="sim_data/viral_genome_ref"
ref_genome="sim_data/viral.1.1.genomic.fna.gz"
path_ctl_fast5s="/athena/tilgnerlab/scratch/caf4010/3_8_23_UNCALLED/Example_Input/fast5_pass"
path_ctl_seqsum="/athena/tilgnerlab/scratch/caf4010/3_8_23_UNCALLED/Example_Input/sequencing_summary_PAG69730_bbfa25c2.txt"
path_unc_seqsum="sim_data/20191220_GM12878_seqsum.txt"
path_unc_paf="sim_data/20191220_GM12878_uncalled.paf"
python UNCALLED/scripts/uncalled sim $bwa_prefix $path_ctl_fast5s --ctl-seqsum $path_ctl_seqsum --unc-seqsum $path_unc_seqsum --unc-paf $path_unc_paf -t 16 --enrich -c 3 --sim-speed 0.25 > uncalled_out.paf 2> uncalled_err.txt
from uncalled.
This has come up before (#42), and unfortunately I wasn't able to reproduce the error at the time. Was a "core dump" file produced for the segfault? If so, can you share it with me? They are not always produced by default, but you should be able to configure your machine to generate one.
Unfortunately this is a hard problem to debug, since the simulator will produce different results depending on how fast your computer can map reads, so it's fundamentally non-deterministic. Plus it only seems to pop up in large datasets, and running a debugger slows everything down and seems to prevent the error from occurring.
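For the core dump mentioned above, a minimal sketch of how one might be enabled on a typical Linux machine follows. It assumes a POSIX-style shell and that the administrator's hard limit allows core files; because the job here runs under Slurm, the limit would probably need to be raised inside the batch script itself, before the uncalled command. Where the file ends up depends on the system's core pattern, so treat this as a starting point rather than a guaranteed recipe.

```shell
# Try to lift the core file size limit for this shell and its children.
# This fails if the hard limit imposed by the system is lower.
if ulimit -c unlimited 2>/dev/null; then
    echo "core limit: $(ulimit -c)"
else
    echo "could not raise core limit (hard limit: $(ulimit -H -c))" >&2
fi

# On Linux, the kernel's core pattern says where core files are written:
# either a file name pattern or a pipe to a handler program.
if [ -r /proc/sys/kernel/core_pattern ]; then
    echo "core pattern: $(cat /proc/sys/kernel/core_pattern)"
fi
```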
from uncalled.
Could you provide the control fast5 files and control sequencing summary that have been used in successful simulations before? I will test them on my end to rule out my input files as the cause.
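As a first, coarse check on input files, something like the sketch below could confirm that every path passed to the simulator exists and is non-empty before a multi-hour run starts. The helper name check_inputs is hypothetical and not part of UNCALLED, and passing this check says nothing about whether the file contents are valid.

```shell
# Hypothetical helper: report missing or empty inputs.
# Returns non-zero if any path fails the check.
check_inputs() {
    rc=0
    for p in "$@"; do
        if [ -d "$p" ]; then
            # A directory counts as usable only if it has at least one entry.
            if [ -z "$(ls -A "$p" 2>/dev/null)" ]; then
                echo "empty directory: $p" >&2
                rc=1
            fi
        elif [ ! -s "$p" ]; then
            # -s is true only for an existing file with size greater than zero.
            echo "missing or empty file: $p" >&2
            rc=1
        fi
    done
    return $rc
}
```

In a script like the one in this thread, it could be called as check_inputs "$path_ctl_fast5s" "$path_ctl_seqsum" "$path_unc_seqsum" "$path_unc_paf" || exit 1, placed just before the uncalled sim command.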
from uncalled.
Yes, the sequencing summary files for the two samples we've tested can be found here: https://labshare.cshl.edu/shares/schatzlab/www-data/UNCALLED/simulator_files/
And the raw signal files are here: https://www.ncbi.nlm.nih.gov/sra/SRX9270076[accn] https://www.ncbi.nlm.nih.gov/sra/SRX9568954[accn]
from uncalled.
Thank you very much. Just to confirm, is the reference genome the E.coli.fasta? Also, is the sequencing summary for ctl and unc the same txt file?
from uncalled.
For the Zymo mock microbial simulation in the paper it was actually a reference containing all bacteria, which we "depleted" to increase the yeast yield using the parameters --deplete -C 10. Here is the reference: zymo_bacteria.fa.gz (originally obtained from here)
And actually the control sequencing summary is different, sorry I forgot about that! Here it is:
zymo_control_sequencing_summary.txt.gz
from uncalled.
There was a segmentation fault again after 2 hours.
err.txt: uncalled_err.txt
My script:
# MAIN SCRIPT
bwa_prefix="sim_data/E.coli"
ref_genome="sim_data/E.coli.fa"
path_ctl_fast5s="sim_data/20190809_zymo_control/fast5"
path_ctl_seqsum="sim_data/zymo_control_sequencing_summary.txt"
path_unc_seqsum="sim_data/20190809_zymo_seqsum.txt"
path_unc_paf="sim_data/20190809_zymo_uncalled.paf"
# python UNCALLED/scripts/uncalled index -o sim_data/E.coli sim_data/E.coli.fa
python UNCALLED/scripts/uncalled sim $bwa_prefix $path_ctl_fast5s \
--ctl-seqsum $path_ctl_seqsum \
--unc-seqsum $path_unc_seqsum \
--unc-paf $path_unc_paf \
-t 16 --enrich -c 3 --sim-speed 0.25 > uncalled_out.paf 2> uncalled_err.txt
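When a run like the one above dies after hours, the exit status recorded right after the command can confirm whether it was killed by a signal. The sketch below relies on the common shell convention that a status above 128 means "128 plus the signal number", so a segmentation fault (SIGSEGV, signal 11) shows up as 139. The helper name describe_status is hypothetical.

```shell
# Hypothetical helper: turn a shell exit status into a short description.
describe_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "ok"
    elif [ "$status" -gt 128 ]; then
        # Shells report death by signal as 128 + signal number.
        sig=$((status - 128))
        echo "killed by signal $sig ($(kill -l "$sig" 2>/dev/null || echo unknown))"
    else
        echo "exited with status $status"
    fi
}

# Intended use, directly after the long-running command:
#   python UNCALLED/scripts/uncalled sim ... > uncalled_out.paf 2> uncalled_err.txt
#   describe_status $? >> uncalled_err.txt
```

Logging the status alongside stderr makes it easier to tell a segfault apart from an out-of-memory kill or a scheduler timeout, which usually arrive as different signals.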
from uncalled.