Comments (3)
Hi,
It looks like your CPU has only 4 cores, which isn't ideal since UNCALLED works better with more cores, although you could still get good results. Did you run with more than one thread? I'd recommend using 8 threads for your CPU (add the "-t 8" option).
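As a rough illustration of the thread-count advice above, this Python sketch reads the number of logical CPUs and formats a matching "-t" option. The helper name `thread_option` is hypothetical and not part of UNCALLED; only the "-t" flag itself comes from this thread.

```python
import os

def thread_option(logical_cpus=None):
    """Return a '-t N' string sized to the number of logical CPUs (hypothetical helper)."""
    # os.cpu_count() reports logical CPUs (8 on a 4-core/8-thread chip) and may return None.
    n = logical_cpus if logical_cpus is not None else (os.cpu_count() or 1)
    # Never suggest fewer than one thread.
    return f"-t {max(1, n)}"

print(thread_option(8))  # -t 8
```

Calling `thread_option()` with no argument uses whatever the current machine reports.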
Do you expect all of your 40k reads to align? If so, then the 12.77% alignment rate is concerning. Otherwise I recommend separating your reads into those you expect to align (TPs) and those you don't (FPs). TPs should be much quicker to align because the algorithm stops as soon as it finds a location. When running in realtime you set a cutoff for when to give up on a read, so you don't spend too much time on those that don't align.
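One way to look at the two groups separately is to compute an alignment rate for each. The sketch below is only an illustration: the function name, the read IDs, and the idea of passing plain lists of IDs are assumptions, and it does not parse any UNCALLED output format.

```python
def alignment_rates(aligned_ids, expected_ids, all_ids):
    """Return (rate among reads expected to align, rate among reads not expected to align)."""
    aligned = set(aligned_ids)
    expected = set(expected_ids)      # assumed to be a subset of all_ids
    unexpected = set(all_ids) - expected
    # Guard against empty groups to avoid dividing by zero.
    tp_rate = len(aligned & expected) / len(expected) if expected else 0.0
    fp_rate = len(aligned & unexpected) / len(unexpected) if unexpected else 0.0
    return tp_rate, fp_rate

# Toy data: 4 reads, 2 of which are expected to align to the reference.
all_reads = ["r1", "r2", "r3", "r4"]
expected = ["r1", "r2"]
aligned = ["r1", "r3"]
print(alignment_rates(aligned, expected, all_reads))  # (0.5, 0.5)
```

A low rate in the first group points at a mapping problem, while a low overall rate may simply reflect how many reads were never expected to align.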
Finally, the bp/sec speed is a bit low, so I'd recommend doing a few rounds of repeat masking (see the "masking/" subdirectory). I suspect there are some low-complexity regions in your chromosome that are slowing you down. This could also help with your alignment rate.
Hope that helps!
Sam
from uncalled.
Thank you for the quick reply!
> It looks like your CPU has only 4 cores, which isn't ideal since UNCALLED works better with more cores, although you could still get good results. Did you run with more than one thread? I'd recommend using 8 threads for your CPU (add the "-t 8" option).
Yes, since it is a 4c8t CPU, I'm using the -t 8 option. Would you please share your ideal spec for running UNCALLED in realtime? I see you used a 24c48t machine (2 Intel Xeon Gold 6136?) in the paper; is that a minimum requirement?
> Do you expect all of your 40k reads to align? If so, then the 12.77% alignment rate is concerning. Otherwise I recommend separating your reads into those you expect to align (TPs) and those you don't (FPs). TPs should be much quicker to align because the algorithm stops as soon as it finds a location. When running in realtime you set a cutoff for when to give up on a read, so you don't spend too much time on those that don't align.
I used only one chromosome instead of the whole genome as the reference, so I don't expect every read to align, and the alignment rate seems reasonable to me.
> Finally, the bp/sec speed is a bit low, so I'd recommend doing a few rounds of repeat masking (see the "masking/" subdirectory). I suspect there are some low-complexity regions in your chromosome that are slowing you down. This could also help with your alignment rate.
I will try masking and run again. What minimum bp/sec do you suggest?
While doing so, I found that the suggested k-mer length of 10 is sometimes problematic; an odd number like 11 would be good.
Best,
Joseph
Hi,
Sorry for the delay. Unfortunately we don't know what the minimum system requirements are. The Xeon Gold 6136 is certainly not required, but in general more cores are better. We've done some testing using only 8 cores and got good results, but it very much depends on your sample and flowcell conditions. Hopefully we can give better guidelines in the future, and I would be very interested to hear how your run goes if you do it.
For masking I would recommend doing as much as possible up until your mapping rate (true positive) starts to drop. For our Xeon Gold, ~6 kbp/sec is more than fast enough, but we need to do more testing to get a better idea of the limits. Again, I'll never be able to give an absolute minimum bp/sec because the requirements change with your number of cores, sample, and flowcell conditions, but I'm very interested to hear about results running on lower-end CPUs than ours.
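The "mask until the true-positive mapping rate starts to drop" rule can be sketched as a small selection loop. This is an illustrative sketch, not UNCALLED's masking code: the function name, the `tolerance` parameter, and the example rates are all made up, and the rates would have to be measured by re-mapping known true-positive reads after each masking round.

```python
def pick_masking_round(tp_rates, tolerance=0.0):
    """Return the index of the last masking round before the TP mapping rate drops.

    tp_rates[i] is the true-positive mapping rate measured after i rounds of
    masking (index 0 is the unmasked reference). A drop of up to `tolerance`
    relative to the previous round is still accepted.
    """
    if not tp_rates:
        raise ValueError("need at least one measurement")
    best = 0
    for i in range(1, len(tp_rates)):
        if tp_rates[i] < tp_rates[i - 1] - tolerance:
            break  # the mapping rate started to fall, so keep the previous round
        best = i
    return best

rates = [0.90, 0.91, 0.91, 0.85]  # made-up numbers for illustration
print(pick_masking_round(rates))  # 2
```

With these example numbers the third round of masking costs true positives, so the reference from round 2 would be the one to keep.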
Best of luck,
Sam