Comments (2)

Rohit-Satyam commented on August 24, 2024

I ran the scrubbing and chimera detection commands separately. Scrubbing also flags some reads as chimeric (in the .yacrd file), so I checked whether these are the same reads flagged when running the chimera detection code on its own. However, there is only a small overlap.

Chimera_code: reads tagged as chimeric when running the chimera code only
ChimeraFrmScrub: reads tagged as chimeric when running the scrubb code only
ScrubChimNotCov: reads tagged as chimeric plus reads tagged as NotCovered when running the scrubb code only

The numbers in the Venn diagram are read counts.
[image: Venn diagram of the three read sets]
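
A comparison like this can be reproduced with plain set operations on read IDs. The sketch below is a minimal illustration, assuming each line of a .yacrd file is tab-separated with the read type in the first column and the read ID in the second; the helper names and the example lines are hypothetical and not part of yacrd.

```python
def read_ids_by_type(yacrd_lines, wanted_types):
    """Collect IDs of reads whose type (first column) is in wanted_types.

    Assumes tab-separated lines of the form: type, read ID, further columns.
    """
    ids = set()
    for line in yacrd_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2 and fields[0] in wanted_types:
            ids.add(fields[1])
    return ids


def venn_counts(a, b):
    """Return the sizes of the three regions of a two-set Venn diagram."""
    a, b = set(a), set(b)
    return {"only_a": len(a - b), "both": len(a & b), "only_b": len(b - a)}


# Hypothetical example lines, for illustration only.
chimera_run = [
    "Chimeric\tread1\t1000\t50,400,450",
    "Chimeric\tread4\t900\t30,300,330",
]
scrubb_run = [
    "Chimeric\tread1\t1000\t50,400,450",
    "NotCovered\tread2\t500\t500,0,500",
    "NotBad\tread3\t800\t10,0,10",
]

chimera_code = read_ids_by_type(chimera_run, {"Chimeric"})
scrub_chim_not_cov = read_ids_by_type(scrubb_run, {"Chimeric", "NotCovered"})
print(venn_counts(chimera_code, scrub_chim_not_cov))
```

In practice the lines would come from the two .yacrd files, for example via `open(path)`, instead of the inline lists.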

From the Venn diagram, can I conclude that the scrubbing operation, though lossy, is sufficient on its own to take care of both chimeric reads and poor-quality reads? If so, running chimera removal first and then scrubbing would be both time-consuming and redundant, wouldn't it?

natir commented on August 24, 2024

Hello,

First of all, thank you for using yacrd.

In fact, the yacrd algorithm is divided into three steps: detection of poor-quality zones (that is, zones with poor coverage), characterization of these zones, and finally processing of these zones.

The characterization of poor-quality zones is based above all on their position: a zone in the middle of a read is considered chimeric, and so is the read.
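
The position-based rule can be sketched as follows. This is a simplified illustration, not yacrd's actual implementation: the function name, the half-open `(begin, end)` interval convention, and the `not_covered_fraction` threshold used to label a read NotCovered are all assumptions made for the example.

```python
def classify_read(read_length, bad_regions, not_covered_fraction=0.8):
    """Label a read from the positions of its poor-coverage zones.

    bad_regions: list of (begin, end) half-open intervals on the read.
    not_covered_fraction: assumed threshold above which the read counts
    as mostly uncovered (hypothetical parameter for this sketch).
    """
    bad_bases = sum(end - begin for begin, end in bad_regions)
    if read_length > 0 and bad_bases / read_length > not_covered_fraction:
        return "NotCovered"
    for begin, end in bad_regions:
        # A zone touching neither end of the read lies in the middle.
        if begin > 0 and end < read_length:
            return "Chimeric"
    return "NotBad"


print(classify_read(1000, [(400, 450)]))  # zone in the middle of the read
print(classify_read(1000, [(0, 50)]))     # zone at the start of the read
print(classify_read(100, [(0, 90)]))      # most of the read is poorly covered
```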

Between scrubbing and chimera management, only the last step differs. The .yacrd file is produced by the second step.

If you manage chimeras by splitting, yacrd cuts these reads at the chimeric zones. If you use scrubbing, the same reads are cut in the same way, and their poor-quality ends are removed as well, because scrubbing removes all poor-quality regions.
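
To make the difference concrete, here is a small sketch of what each mode would keep from one read. It only illustrates the description above under stated assumptions: the function names and the half-open `(begin, end)` interval convention are hypothetical, and it works on coordinates only, not on sequences.

```python
def keep_fragments(read_length, regions_to_remove):
    """Return the (begin, end) fragments left after removing the regions."""
    fragments = []
    cursor = 0
    for begin, end in sorted(regions_to_remove):
        if begin > cursor:
            fragments.append((cursor, begin))
        cursor = max(cursor, end)
    if cursor < read_length:
        fragments.append((cursor, read_length))
    return fragments


def split_chimera(read_length, bad_regions):
    """Splitting: only zones in the middle of the read are cut out."""
    middle = [(b, e) for b, e in bad_regions if b > 0 and e < read_length]
    return keep_fragments(read_length, middle)


def scrub(read_length, bad_regions):
    """Scrubbing: every poor-quality zone is removed, including the ends."""
    return keep_fragments(read_length, bad_regions)


# One read with a poor-quality zone at each end and one in the middle.
bad = [(0, 50), (400, 450), (980, 1000)]
print(split_chimera(1000, bad))  # [(0, 400), (450, 1000)]
print(scrub(1000, bad))          # [(50, 400), (450, 980)]
```

Both modes cut the read at the same middle zone; scrubbing additionally trims the two poor-quality ends.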

To answer your question, which, if I've understood correctly, is whether to run chimera splitting followed by scrubbing or whether scrubbing alone is enough: scrubbing cuts out the chimeras, so you can dispense with chimera splitting. However, remapping after chimera removal could result in better-quality scrubbing. It's up to you to decide whether this potential increase in quality justifies the extra mapping and analysis time.

If you have any interesting results, I'd be happy to add your recommendations for use to the yacrd readme, indicating that it's your contribution and citing any publications.

Thanks again for your interest and contribution.
