Comments (3)

skovaka commented on July 17, 2024

Hi,

It looks like your CPU has only 4 cores, which isn't ideal since UNCALLED works better with more cores, although you could still get good results. Did you run with more than one thread? I'd recommend using 8 threads for your CPU (add the "-t 8" option).

Do you expect all of your 40k reads to align? If so then the 12.77% alignment rate is concerning. Otherwise I recommend separating your reads into those you expect to align (TPs) and those you don't (FPs). TPs should be much quicker to align because the algorithm stops as soon as it finds a location. When running in realtime you set a cutoff of when to give up on a read so you don't spend too much time on those that don't align.
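As a rough illustration of the split suggested above, the following sketch computes the alignment rate separately for reads that are expected to align and reads that are not. It assumes PAF-style tab-separated output in which column 1 is the read name and column 6 is the target name, with "*" marking an unaligned read; that layout is an assumption and should be checked against your actual output. The function name and the `expected_ids` set are hypothetical.

```python
def alignment_rates(paf_lines, expected_ids):
    """Return (expected_rate, unexpected_rate): the fraction of reads that
    aligned among reads expected to align and among all other reads.

    Assumes PAF-style lines: column 1 = read name, column 6 = target name,
    with "*" as the target of an unaligned read (an assumption).
    """
    all_reads, aligned = set(), set()
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # skip malformed or empty lines
        all_reads.add(fields[0])
        if fields[5] != "*":
            aligned.add(fields[0])

    def rate(group):
        # Fraction of the group that aligned; 0.0 for an empty group.
        return len(group & aligned) / len(group) if group else 0.0

    expected = all_reads & set(expected_ids)
    unexpected = all_reads - expected
    return rate(expected), rate(unexpected)
```

A high rate in the first group together with a low rate in the second would suggest that the overall percentage mostly reflects the makeup of the sample rather than a mapping problem.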

Finally, the bp/sec speed is a bit low, so I'd recommend doing a few rounds of repeat masking (see the "masking/" subdirectory). I suspect there are some low-complexity regions in your chromosome that are slowing you down. This could also help with your alignment rate.

Hope that helps!
Sam

from uncalled.

cgjosephlee commented on July 17, 2024

Thank you for the quick reply!

> It looks like your CPU has only 4 cores, which isn't ideal since UNCALLED works better with more cores, although you could still get good results. Did you run with more than one thread? I'd recommend using 8 threads for your CPU (add the "-t 8" option).

Yes, since it is a 4-core/8-thread (4c8t) CPU, I'm using the -t 8 option. Could you share the ideal specs for running UNCALLED in real time? I see you used a 24c48t machine (2 Intel Xeon Gold 6136?) in the paper. Is that a minimum requirement?

> Do you expect all of your 40k reads to align? If so then the 12.77% alignment rate is concerning. Otherwise I recommend separating your reads into those you expect to align (TPs) and those you don't (FPs). TPs should be much quicker to align because the algorithm stops as soon as it finds a location. When running in realtime you set a cutoff of when to give up on a read so you don't spend too much time on those that don't align.

No, I don't. I used only one chromosome instead of the whole genome as the reference, so the alignment rate seems reasonable to me.

> Finally, the bp/sec speed is a bit low, so I'd recommend doing a few rounds of repeat masking (see the "masking/" subdirectory). I suspect there are some low-complexity regions in your chromosome that are slowing you down. This could also help with your alignment rate.

I will try masking and run again. What minimum bp/sec do you suggest?

While doing so, I found that the suggested k-mer length of 10 is sometimes problematic; an odd number like 11 would be better.

Best,
Joseph

skovaka commented on July 17, 2024

Hi,

Sorry for the delay. Unfortunately we don't know what the minimum system requirements are. The Xeon Gold 6136 is certainly not required, but in general more cores are better. We've done some testing using only 8 cores and got good results, but it very much depends on your sample and flowcell conditions. Hopefully we can give better guidelines in the future, and I would be very interested to hear how your run goes if you do it.

For masking I would recommend doing as much as possible up until your mapping rate (true positive) starts to drop. For our Xeon Gold, ~6 kbp/sec is more than fast enough, but we need to do more testing to get a better idea of the limits. Again, I'll never be able to give an absolute minimum bp/sec because the requirements change with your number of cores, sample, and flowcell conditions, but I'm very interested to hear about results running on lower-end CPUs than ours.
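The stopping rule described above (keep masking until the true-positive mapping rate starts to fall) could be sketched as follows. This is only an illustration: the function name, the `tolerance` parameter, and the example rates are hypothetical, and the rates themselves would have to come from re-mapping a set of known true-positive reads after each masking round.

```python
def pick_masking_round(tp_rates, tolerance=0.01):
    """Return the index of the last masking round before the TP mapping
    rate drops by more than `tolerance` relative to the previous round.

    tp_rates[i] is the TP mapping rate measured after i rounds of masking
    (index 0 = unmasked reference). `tolerance` is an assumed allowance
    for run-to-run noise, not a recommended value.
    """
    best = 0
    for i in range(1, len(tp_rates)):
        if tp_rates[i] < tp_rates[i - 1] - tolerance:
            break  # mapping rate started to fall; stop at the previous round
        best = i
    return best


# Made-up rates for illustration only: the rate holds through round 2,
# then falls at round 3, so round 2 is selected.
print(pick_masking_round([0.80, 0.82, 0.82, 0.75, 0.76]))
```

Comparing each round against the previous one, rather than against the unmasked baseline, matches the "until it starts to drop" wording; a stricter rule could compare against the best rate seen so far.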

Best of luck,
Sam
