Comments (8)

mlin avatar mlin commented on July 21, 2024

Here's another problematic unitig similar to #44 (the original example in #44 does work now, after your previous patch):

$ time vg/vg map -f <(curl -s https://gist.githubusercontent.com/mlin/6e5b4002ef6abf3a41bc/raw/cc6be2f513fac0dad7155829ca53c998ad78e61e/1138319529_1774999222.mag | gzip -c) -d wg.index.k27e11 -t 1 -FX 1.9 > unitigs.gam
vg: path.cpp:514: vg::Path vg::merge_paths(const vg::Path&, const vg::Path&, int&, int&): Assertion `p1mp->position().node_id() == p2mp->position().node_id()' failed.
Aborted

real    0m11.512s
user    0m9.704s
sys     0m1.810s

from vg.

mlin avatar mlin commented on July 21, 2024

Erik, here are fermikit unitigs from a complete personal genome to use for testing. These are from the Illumina data in English 2015.

https://dl.dnanex.us/F/D/fb2K7XYzb2BF4f0b2PfGvY25jj35b6Z07G3f7zbg/HS1011.mag.gz (<3 GiB)

Without setting -B to a very high value, I can provoke the same error as above:

$ vg map -f HS1011.mag.gz -d vg/index -t 16 -FX 1.8
vg: path.cpp:561: vg::Path vg::merge_paths(const vg::Path&, const vg::Path&, int&, int&): Assertion `p1mp->position().node_id() == p2mp->position().node_id()' failed.

Presumably the banded alignment will be in pretty good shape if you can get this whole genome to run through!


mlin avatar mlin commented on July 21, 2024

By the way, what's the downside to setting -B greater than the max unitig length? Sensitivity loss?


ekg avatar ekg commented on July 21, 2024

The local alignment is O(unitig length × reference length) in memory and compute for each chunk of the reference we try. So if you are looking at 10-20 kb unitigs, you should be fine setting the limit greater than the longest unitig. Memory blows up for unitigs much larger than that. What is the size distribution?
(In reply to Mike Lin's message of Aug 18, 2015, 7:56 PM.)
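
To put rough numbers on the O(unitig length × reference length) cost mentioned above, here is a back-of-the-envelope sketch. It is not vg's implementation: the function name `dp_matrix_bytes`, the 4 bytes per cell, and the choice of a reference chunk as long as the unitig are all assumptions made for illustration.

```python
def dp_matrix_bytes(query_len: int, ref_len: int, bytes_per_cell: int = 4) -> int:
    """Bytes needed for a full (query_len + 1) x (ref_len + 1) DP score matrix.

    bytes_per_cell=4 is an assumed cell size, not a value taken from vg.
    """
    return (query_len + 1) * (ref_len + 1) * bytes_per_cell


if __name__ == "__main__":
    # Assumption: the reference chunk is as long as the unitig itself.
    for query_len in (1_000, 20_000, 45_649, 1_000_000):
        ref_len = query_len
        gib = dp_matrix_bytes(query_len, ref_len) / 2**30
        print(f"{query_len:>9,} bp unitig: {gib:,.3f} GiB")
```

Under these assumptions the cost grows quadratically with unitig length, which is why a limit that is comfortable for 10-20 kb unitigs becomes impractical for megabase-scale contigs.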


mlin avatar mlin commented on July 21, 2024

Nice- so there's no correctness tradeoff?
The N50 of fermikit unitigs is usually slightly below 1kbp, and the longest one for HS1011 is 45,649bp. With vg map -f HS1011.mag.gz -d k27e7/index -t 16 -FX 1.8 -B 100000 they all mapped in about 48 CPU-hours. 😋
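
For readers unfamiliar with the N50 statistic quoted above, a minimal sketch of how it can be computed from a list of unitig lengths follows. The function name `n50` and the example lengths are illustrative only; parsing the .mag file itself is left out.

```python
def n50(lengths):
    """Return the N50: the largest length L such that sequences of
    length >= L together cover at least half of the total bases."""
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Made-up lengths for illustration, not data from HS1011.
    lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
    print("N50:", n50(lengths))
    print("longest:", max(lengths))
```

The longest length is the value that matters when choosing a -B larger than every unitig, as discussed in this thread; the N50 only describes the typical size.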


ekg avatar ekg commented on July 21, 2024

Yes! So I should move the default way up. I think I was worried about
handling 1m contigs. Thanks Mike!
(In reply to Mike Lin's message of Aug 19, 2015, 2:04 AM.)


ekg avatar ekg commented on July 21, 2024

I just realized that by removing the assertion that you've been hitting, we naturally get a split-read mapper. There may be some problems with the way that the alignment is represented, which I'm looking into. Otherwise this will resolve the vast majority of problems that we're seeing and allow for native detection of long deletions, translocations (!) and inversions (!!). Mind blown.

However, it won't fix what happens for insertions. This requires a few more changes in path.cpp. The merge will need to look through the whole series of sub-alignments that are being combined in the banding process and detect where whole series of components should be represented as insertions. (I remember making it do that at least as I'm writing... to check).
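
The idea of scanning the series of sub-alignments and representing whole runs of them as insertions can be illustrated with a toy model. This is only a sketch of the concept, not the code in path.cpp: the `Band` type, the use of `node_id=None` to mean "this band did not align", and the tuple output format are all hypothetical.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List, Optional, Tuple


@dataclass
class Band:
    """Toy stand-in for one band's sub-alignment (not vg's Path/Mapping types)."""
    length: int             # query bases covered by this band
    node_id: Optional[int]  # node the band aligned to, or None if unaligned


def collapse_unaligned(bands: List[Band]) -> List[Tuple[str, Optional[int], int]]:
    """Merge each run of consecutive unaligned bands into one insertion."""
    merged = []
    for aligned, run in groupby(bands, key=lambda b: b.node_id is not None):
        run = list(run)
        if aligned:
            # Aligned bands are kept one by one, in order.
            merged.extend(("match", b.node_id, b.length) for b in run)
        else:
            # A whole run of unaligned bands becomes a single insertion.
            merged.append(("insertion", None, sum(b.length for b in run)))
    return merged


if __name__ == "__main__":
    bands = [Band(100, 1), Band(100, None), Band(50, None), Band(100, 2)]
    for piece in collapse_unaligned(bands):
        print(piece)
```

In this toy model the two unaligned bands in the middle are reported as one 150-base insertion flanked by the aligned bands, rather than as two separate failed pieces.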


ekg avatar ekg commented on July 21, 2024

I think this is pretty much resolved due to some crazy restructuring of the mapper. Please keep trying to break it :), but I'll mark as closed again.

