
Comments (3)

nuin commented on August 30, 2024

It seems related to the data: it slows down in the chr13 range when running without the -t parameter.

from octopus.

dancooke commented on August 30, 2024

Hi, thanks for your question. This is likely because this region is 'complex' in some way: there may be substantial variation, either true or false (e.g. due to mis-mapped reads or sequencing errors). It's also possible that the depth is very high in this region. The runtime of Octopus is mostly influenced by the number of candidate haplotypes that are considered, and therefore by the number of proximate candidate variants. This particularly affects the cancer calling model, as it is considerably more complex than the other models, even more so without a normal sample, since there is generally greater uncertainty in sample genotypes.

I wouldn't take any further action unless you have a strong reason to believe that these regions do not contain true variation, in which case you could skip them (e.g. using the --skip-regions or --skip-regions-file option). If the runtime is prohibitively long, you could consider ways to reduce it (e.g. reducing --max-haplotypes, --max-somatic-haplotypes, or --max-genotypes), although this will likely come at a cost in accuracy.
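As a rough illustration of the options mentioned above, here is a minimal Python sketch that assembles an Octopus command line with reduced limits. The file names, the region, and the numeric values are hypothetical placeholders, not recommendations, and the helper `build_octopus_command` is made up for this example. The `-R`, `-I`, `-C cancer`, and `-o` arguments are assumed from typical Octopus usage, so check `octopus --help` for your version before relying on them.

```python
import shlex


def build_octopus_command(reference, reads, output, skip_regions=(),
                          max_haplotypes=None, max_somatic_haplotypes=None,
                          max_genotypes=None):
    """Assemble (but do not run) an Octopus cancer-calling command.

    Hypothetical helper: only the --skip-regions and --max-* options come
    from the discussion above; the remaining arguments are assumptions.
    """
    cmd = ["octopus", "-R", reference, "-I", reads, "-C", "cancer", "-o", output]
    if skip_regions:
        # Only sensible if the regions are believed to hold no true variation.
        cmd += ["--skip-regions", *skip_regions]
    # Lower limits shrink the search space, likely at some cost in accuracy.
    limits = [
        ("--max-haplotypes", max_haplotypes),
        ("--max-somatic-haplotypes", max_somatic_haplotypes),
        ("--max-genotypes", max_genotypes),
    ]
    for flag, value in limits:
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


# Placeholder inputs; the values below are illustrative only.
cmd = build_octopus_command(
    reference="reference.fa",
    reads="tumour.bam",
    output="calls.vcf",
    max_haplotypes=100,
    max_genotypes=1000,
)
print(shlex.join(cmd))
```

The printed string can be pasted into a shell or the list passed to `subprocess.run`. Comparing runtime and calls with and without the reduced limits on a small region is one way to judge the accuracy trade-off.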


nuin commented on August 30, 2024

Thanks, @dancooke. We are actually looking at variants at these locations (BRCA1 and 2, mostly), and from a visual inspection there seem to be quite a lot of insertions in this area, which might be increasing complexity. This is IonTorrent data, which is known to have lower quality than Illumina, which we will be testing soon.

Thanks again.

